A segfault- and NaN-driven series of disconnected ideas, analyses and just plain silly posts about computational biochemistry, synthetic biology and microbiology.
Tuesday 29 December 2020
From cartoon to interactive infographic –the sane way
Saturday 21 November 2020
Shake it like a polaroid picture: MD in pyrosetta
> This blog post has been unfinished for two years. So I am posting in the hopes it will spur me to finish it.
The score of a pose reflects how good its interactions are in that static arrangement, a static snapshot. However, given some energy, several of these interactions may break and a different conformation is seen. The best way to describe what 1 kcal/mol means is that it is the typical strength of a hydrogen bond, but this is rather weak... in fact it is of the same order as the thermal energy of water molecules at 37°C, which is set by the molar Boltzmann constant times temperature (NA·kBT, i.e. RT, about 0.6 kcal/mol). (At that point in the explanation it is paramount to resist the urge to explain that kBT coincides with the mean of the Boltzmann distribution describing the energy of collisions as per Maxwell–Boltzmann statistics, or else you get that glazed look thermodynamics seems to elicit even in folk that aren't hungover students.)
Therefore, hydrogen bonds do come apart and together rather frequently, and in some cases these dynamic properties result in large-scale switching. This cannot really be determined from a static score; even the per-residue scores are not an indication of dynamic properties. So how does one do an MD run in Pyrosetta?
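As an aside on the energy scale above, here is a minimal sketch in plain Python (no Pyrosetta involved) that computes RT at body temperature and the Boltzmann factor for a 1 kcal/mol interaction. The function names are made up for illustration, and treating a hydrogen bond as a simple two-state system with a 1 kcal/mol gap is a deliberate oversimplification.

```python
import math

R_JOULE = 8.314462618   # gas constant, J/(mol K)
KCAL_IN_JOULE = 4184.0  # thermochemical kilocalorie, J


def thermal_energy_kcal(celsius: float) -> float:
    """Return RT in kcal/mol at the given temperature in Celsius."""
    kelvin = celsius + 273.15
    return R_JOULE * kelvin / KCAL_IN_JOULE


def boltzmann_factor(delta_e_kcal: float, celsius: float) -> float:
    """Relative population of a state lying delta_e_kcal above the ground state."""
    return math.exp(-delta_e_kcal / thermal_energy_kcal(celsius))


rt_37 = thermal_energy_kcal(37.0)
factor = boltzmann_factor(1.0, 37.0)
print(f"RT at 37 C: {rt_37:.2f} kcal/mol")
print(f"Boltzmann factor for 1 kcal/mol at 37 C: {factor:.2f}")
```

Under these assumptions the broken state is populated a sizeable fraction of the time, which is the point being made: an interaction worth 1 kcal/mol is not far above thermal noise.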
Sunday 1 November 2020
Remote notebooks and Jupyter themes
Saturday 31 October 2020
XML to Pyrosetta: EvolutionaryDynamicsMover as an example
Tuesday 27 October 2020
Pyrosetta scripting without a manual
I was recently asked how to work out how to write a Pyrosetta script when there is no example. This is definitely the biggest weakness of Pyrosetta and RosettaScripts, but it is not insurmountable. In fact, there is a wealth of hidden information that can be mined. Here is how, and in the next post I give an example.
Friday 9 October 2020
The Freedom unit for molar energy: the foot-pound-force per pound-mole
In computational biochemistry the most commonly used unit is molar energy. The SI unit is kJ/mol (kilojoule per mole), but kcal/mol is used almost as frequently: Google enumerates 5.3e6 and 3.8e6 pages for them respectively. Different programs use one or the other: GROMACS uses kJ/mol, while Rosetta uses kcal/mol. They differ by a factor of about 4. The latter has the advantage that 1 kcal/mol is roughly the strength of a hydrogen bond and RT (i.e. NA·kBT) is about 0.6 kcal/mol at both 25°C and 37°C, while the former, being SI, sounds more sciency (and not in the overly obnoxious way of folk who use Kelvin for enzymology).
However, although it is not an SI unit, kcal/mol is still very metric and European; after all, the unit calorie was introduced by a Frenchman. Therefore, a more American unit is clearly required. Hence the need for the foot–pound-force per pound-mole.
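For the curious, the conversion can be sketched in a few lines of Python. This assumes the thermochemical calorie (4.184 J), the international foot and avoirdupois pound, and standard gravity for the pound-force; the function names are made up for illustration.

```python
# Exact definitions (assumptions: thermochemical calorie, international foot/pound)
JOULE_PER_KCAL = 4184.0
METRE_PER_FOOT = 0.3048
NEWTON_PER_LBF = 4.4482216152605  # pound-force under standard gravity
MOL_PER_LBMOL = 453.59237         # a pound-mole is the number of grams in a pound, in moles

JOULE_PER_FT_LBF = METRE_PER_FOOT * NEWTON_PER_LBF


def kcal_to_kj_per_mol(kcal_per_mol: float) -> float:
    """Convert kcal/mol to kJ/mol."""
    return kcal_per_mol * JOULE_PER_KCAL / 1000.0


def kcal_per_mol_to_ft_lbf_per_lbmol(kcal_per_mol: float) -> float:
    """Convert kcal/mol to foot-pound-force per pound-mole."""
    joule_per_mol = kcal_per_mol * JOULE_PER_KCAL
    joule_per_lbmol = joule_per_mol * MOL_PER_LBMOL
    return joule_per_lbmol / JOULE_PER_FT_LBF


hbond_kj = kcal_to_kj_per_mol(1.0)
hbond_freedom = kcal_per_mol_to_ft_lbf_per_lbmol(1.0)
print(f"1 kcal/mol = {hbond_kj:.3f} kJ/mol")
print(f"1 kcal/mol = {hbond_freedom:.3e} ft.lbf/lb-mol")
```

So a hydrogen bond comes out in the region of a million of the new unit, which is arguably fitting.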
Wednesday 7 October 2020
Rosetta/Pyrosetta on a cluster or in the cloud
Due to licensing, Rosetta and Pyrosetta cannot be installed via apt-get/pip but have to be downloaded from the Rosetta Commons website. This makes things harder if you are in a Colab notebook, ssh'ed into a machine or running off a remote Jupyter notebook. Luckily, it is actually straightforward.
Monday 17 August 2020
5-hydroxytryptophan biosynthesis
I was intrigued by a recent article in the journal Chem (link) entitled "Creation of Bacterial Cells with 5-hydroxytryptophan as a 21st Amino Acid Building Block" by Chen et al. in the group of Han Xiao at Rice University, wherein they make a strain that metabolically produces 5-hydroxytryptophan for genetic code expansion. It is an interesting example of why metabolic engineering is non-trivial and how scientific research does not progress in a logical fashion.
Saturday 8 August 2020
Stay hydrated
Tuesday 21 July 2020
Switching ligand in a PDB with Fragmenstein
Saturday 4 July 2020
Filling missing loops —the proper way
Since posting this, I realised one can do it even faster by hijacking the threading algorithm, which, although that is not its intended purpose, works fine for fixing a structure without supervision, something the methods discussed below do require.
Monday 8 June 2020
Love thy neighbours, but select them with caution
NeighborhoodResidueSelector behaves differently than PyMOL's expand selector and it is good to be aware of it. Namely, in PyMOL the distance is from any atom, while in Rosetta it is from the center of mass atom, unless specified differently. In reality CloseContactResidueSelector works like PyMOL's expand selector.
Sunday 19 April 2020
How to set up an electron density scorefunction in Pyrosetta
Wednesday 18 March 2020
Atom names purely in RDKit
CA is the standard name for the α-carbon. Example uses of atom names in Rosetta/pyrosetta include setting constraints, using a params file for a custom ligand and so forth. However, RDKit is a bit of a nuisance with atom names as it is not a central feature, but a feature added for PDB files that is not too well documented.
Friday 21 February 2020
Working around segmentation faults of pyrosetta: threads & processes
Wednesday 12 February 2020
Guess bond order in RDKit by number of bound atoms
There is no single magic mol.CorrectBondOrder() command in RDKit, but luckily there are some tricks that can be used. Here I will discuss inferring bond order from the number of bound atoms.
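The underlying idea can be sketched without RDKit at all. The following is a toy illustration in plain Python, not the RDKit API and not necessarily the approach taken in the post: it assumes explicit hydrogens, neutral atoms with typical valences, and a hypothetical guess_bond_orders helper that greedily hands out the "missing" bonds. It will not cope with charges, aromatic rings or conjugated systems where the greedy order matters.

```python
from collections import defaultdict

# Typical neutral valences (an assumption: charged and hypervalent atoms are ignored)
TYPICAL_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


def guess_bond_orders(elements, bonds):
    """Guess bond orders from the number of bound atoms.

    elements: list of element symbols, hydrogens included.
    bonds: list of (i, j) index pairs, all initially treated as single bonds.
    Returns a dict mapping each (i, j) pair to its guessed order.
    """
    neighbour_count = defaultdict(int)
    for i, j in bonds:
        neighbour_count[i] += 1
        neighbour_count[j] += 1
    # Deficit: how many bonds an atom still lacks if every bond is single
    deficit = {
        i: TYPICAL_VALENCE[element] - neighbour_count[i]
        for i, element in enumerate(elements)
    }
    orders = {}
    for i, j in bonds:
        # Raise the order only as far as both partners can accommodate
        extra = max(0, min(deficit[i], deficit[j]))
        orders[(i, j)] = 1 + extra
        deficit[i] -= extra
        deficit[j] -= extra
    return orders


# Ethene: two carbons with three bound atoms each
ethene = guess_bond_orders(
    ["C", "C", "H", "H", "H", "H"],
    [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5)],
)
# Formaldehyde: a carbon with three bound atoms and an oxygen with one
formaldehyde = guess_bond_orders(["C", "O", "H", "H"], [(0, 1), (0, 2), (0, 3)])
print(ethene[(0, 1)], formaldehyde[(0, 1)])
```

A carbon with three bound atoms must carry one double bond, and one with two bound atoms either a triple bond or two doubles; the sketch simply propagates that bookkeeping bond by bond.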