Monday, 17 August 2020

5-hydroxytryptophan biosynthesis


I was intrigued by a recent article in the journal Chem (link)  entitled "Creation of Bacterial Cells with 5-hydroxytryptophan as a 21st Amino Acid Building Block" by Chen et al. in the group of Han Xiao at Rice University, wherein they make a strain that metabolically produces 5-hydroxytryptophan for genetic code expansion. It is an interesting example of why metabolic engineering is non-trivial and how scientific research does not progress in a logical fashion.

Saturday, 8 August 2020

Stay hydrated

Waters can be an integral part of a protein structure, in fact, it is common to find water crystallised tightly in an X-ray structure. These waters can change the calculated Gibbs free energy of a protein and give better experimental results. Explicit waters can be added in Rosetta/Pyrosetta thanks to the SPaDES algorithm described in Lai et al. 2017. Here is a guide to using it in Pyrosetta.

Tuesday, 21 July 2020

Switching ligand in a PDB with Fragmenstein

For the Covid Moonshot project, one question by Prof. Frank von Delft of Diamond XChem led to a series of events that culminated in Fragmenstein, a module to do fragment mergers when the followup is as faithful to the starting crystal hits as possible. Even if it's intended use is the hit-to-lead process, there is a nice use that make it rather handy for computational biochemistry in general: switching the ligand in a PDB to another in an energy minimised fashion that obeys the original ligand.

Saturday, 4 July 2020

Filling missing loops —the proper way

Previously, I posted about how to join proteins and add missing loops the shoddy way. Now I'll address how to do it correctly, using Rosetta or Pyrosetta —I am sorry this has been so long overdue.

Monday, 8 June 2020

Love thy neighbours, but select them with caution

In Rosetta NeighborhoodResidueSelector behaves differently than PyMOL's expand selector and it is good to be aware of it. Namely, in PyMOL the distance is from any atom, while in Rosetta it is from the center of mass atom, unless specified differently. Consequently some workarounds are required.

Sunday, 19 April 2020

How to set up an electron density scorefunction in Pyrosetta

Energy minimising structures in Rosetta/Pyrosetta is essential to avoid artifactual results. Say a mutation is introduced and in the protocol the neighbourhood is repacked: if the structure is not energy minimised properly the neighbourhood repacking step will spuriously reward the mutation a very negative ∆∆G. One worry is that the energy minimisation is not faithful to the crystal structure. This argument has two sides, on one the fudgey force fields in Rosetta do not truly model the chemical interactions while on the other crystal packing may be unnatural. Both points have merit. After all Rosetta does use implicit water, which do not behave like the stripped crystallographic waters and some residues may have non-standard protonations etc. But if one wants one can use a scorefunction that is weighted by the electron density map and here is how.

Wednesday, 18 March 2020

Atom names purely in RDKit

For some applications, such as PyMOL scripts or Rosetta, atom names are really important, say CA is the standard name for the α-carbon. Example uses of atom names in Rosetta/pyrosetta include setting constraints, using a params file for a custom ligand and so forth. However, RDKit is a bit of a nuisance with atom names as it is not a central feature, but a feature added for PDB files that is not too well documented.

Friday, 21 February 2020

Working around segmentation faults of pyrosetta: threads & processes

Rosetta often does not die gracefully. Pyrosetta is the same. If the starting template is not great segmentation faults will result in the kernel issuing signal 11 to kill the process. The way around it is to spin it up as its own process via the multiprocessing module and not the threading module, because child threads use the same process.

Wednesday, 12 February 2020

Guess bond order in Rdkit by number of bound atoms

Some compChem/Biochem programs do not care about bond order and strip them, which is rather frustrating. Ligands in random PDB files without any name, smiles are a classic example.
There is no single magic mol.CorrectBondOrder() command in Rdkit, but luckily there are some tricks that can be done. Here I will discuss finding out using the number of bound atoms.