Recently a new batch of AlphaFold2 models was released, this time covering all of the TrEMBL sequences in UniProt. The resulting huge number of models got hashtag-academic-twitter and some news editors very excited about the stamp-collecting feat. Personally, I find it annoying, not because it is pointless, but because, as of writing, any search for a target by name is swamped by irrelevant sequences.
However, AlphaFold is great for other feats.
I have blogged about it a few times (e.g. link), which gives away my positive view of it! It can predict oligomers, with a lot more precision and confidence than docking. It does not always work, either technically or in confirming the hypothesis. I did a long series of experiments with a hypothesis in mind that turned out not to be valid (here), but they revealed novel science and took a few minutes to set up and a few hours to run, whereas the same would have taken years by Western blot of a co-immunoprecipitation or by cross-linking mass-spec.
A segfault and NaN driven series of disconnected ideas, analyses and just plain silly posts about computational biochemistry, synthetic biology and microbiology.
Saturday, 1 October 2022
Move aside coIP Westerns, ColabFold has got this!
Tuesday, 10 May 2022
Show neighbours in nglview
NGLView is a really nice Python library that provides a widget showing an NGL viewport, a JS 3D protein viewer used until recently by the PDB. One annoying feature is that one cannot select neighbours as easily as with, say, PyMOL's "select byres HEM around 3". But it is possible and here is how.
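As a rough illustration of what "byres HEM around 3" means, here is a minimal pure-Python sketch that finds the neighbouring residues from a plain list of atoms and builds an NGL selection string out of them. It is not necessarily the approach the post takes: the atom tuple layout and the helper names `neighbours_byres` and `to_ngl_selection` are assumptions made for the example, and the coordinates are made up.

```python
import math
from typing import List, Tuple

# Hypothetical atom record: (chain, residue number, residue name, x, y, z)
Atom = Tuple[str, int, str, float, float, float]


def neighbours_byres(atoms: List[Atom], target_resn: str, radius: float) -> List[Tuple[str, int]]:
    """Residues with any atom within `radius` of any atom of `target_resn`.

    Mimics PyMOL's "byres <target> around <radius>": the target itself is excluded.
    """
    targets = [a for a in atoms if a[2] == target_resn]
    hits = set()
    for chain, resi, resn, x, y, z in atoms:
        if resn == target_resn:
            continue
        if any(math.dist((x, y, z), t[3:]) <= radius for t in targets):
            hits.add((chain, resi))
    return sorted(hits)


def to_ngl_selection(residues: List[Tuple[str, int]]) -> str:
    """Join residues into an NGL-style selection such as '10:A or 25:A'."""
    return ' or '.join(f'{resi}:{chain}' for chain, resi in residues)


# Toy data: a heme atom, a histidine with one atom nearby, and a distant glycine
atoms = [
    ('A', 1, 'HEM', 0.0, 0.0, 0.0),
    ('A', 10, 'HIS', 2.0, 0.0, 0.0),
    ('A', 10, 'HIS', 5.0, 0.0, 0.0),
    ('A', 20, 'GLY', 10.0, 0.0, 0.0),
]
selection = to_ngl_selection(neighbours_byres(atoms, 'HEM', 3.0))
# In a notebook the string could then be passed to NGLView, e.g.
# view.add_representation('licorice', selection=selection)
```

The brute-force distance check is quadratic, which is fine for a ligand and its surroundings but would want a spatial index for anything larger.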
Sunday, 17 October 2021
Filling missing loops by cannibalising AlphaFold2
I could not resist this Photoshop. But the process is not as dramatic and the results not as bad as Temple of Doom... If done right.
Monday, 23 August 2021
Tweaking AlphaFold2 models with PyRosetta
In a previous post I explored the pitfalls of an AlphaFold2 model from EBI. Here I thought I'd share some PyRosetta methods that may be handy to use with AlphaFold2 models.
Tuesday, 27 July 2021
What to look out for with an AlphaFold2 model
There is nothing more disheartening than telling someone "Sorry, I cannot help you with your protein, because no structures of homologues of your protein are solved and any model will be rubbish". Now, with the AlphaFold2 proteome release, this is no longer the case. Or mostly: in fact there are several pitfalls and issues that need to be looked at, because the algorithm does not account for three things: binding partners and ligands, oligomerisation and alternate conformations.
Wednesday, 7 July 2021
Per residue RMSD
Recently I calculated the local RMSD caused by each residue, and I thought I'd share the PyRosetta methods I used. It is nothing at all novel, but I could not find a suitable implementation. The task is simple: given two poses, find out which residue's backbone changes the most by scanning along the sequence and comparing a short peptide window from each pose.
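The windowed scan can be sketched without PyRosetta. The following is a minimal pure-Python version under loud assumptions: each pose is reduced to one Cα coordinate per residue, the two poses have the same length and are already superposed, and the function names are made up for illustration. A fuller implementation would probably superpose each window before measuring it.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def window_rmsd(ref: Sequence[Coord], mob: Sequence[Coord]) -> float:
    """RMSD between two equally long coordinate lists, without superposition."""
    squared = sum((a - b) ** 2 for p, q in zip(ref, mob) for a, b in zip(p, q))
    return math.sqrt(squared / len(ref))


def per_residue_rmsd(ref: Sequence[Coord], mob: Sequence[Coord], half_window: int = 2) -> List[float]:
    """RMSD of the window centred on each residue (truncated at the termini)."""
    if len(ref) != len(mob):
        raise ValueError('Both poses need the same number of residues')
    n = len(ref)
    scores = []
    for i in range(n):
        start = max(0, i - half_window)
        stop = min(n, i + half_window + 1)
        scores.append(window_rmsd(ref[start:stop], mob[start:stop]))
    return scores


# Toy example: a straight 10-residue chain where residue index 5 moves by 2 Å
reference = [(i * 3.8, 0.0, 0.0) for i in range(10)]
moved = list(reference)
moved[5] = (5 * 3.8, 2.0, 0.0)
scores = per_residue_rmsd(reference, moved, half_window=1)
```

Windows that contain the displaced residue score above zero and the rest score zero, so the peak in `scores` points at the region that changed.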
Monday, 26 April 2021
Remodel in Pyrosetta
The Rosetta binary Remodel is a great tool as it allows interesting designs to be made. However, it is rather incompatible with Rosetta Scripts and PyRosetta, as it depends heavily on command line options for customisation and repeats some of the processes internally. Despite this, it can be coerced rather effectively to work in PyRosetta with some convenience, and this is how.
Monday, 22 February 2021
Multiple poses in NGLView
As mentioned previously, most of my PyRosetta operations are done in a Jupyter notebook run in a cluster node. As a result, I am heavily dependent on NGLView, an IPython widget that uses NGL.js. This is nice for some quick tasks, although admittedly more limited than the PyMOL mover, which, however, requires another ssh to forward another port. My Michelanglo webapp uses NGL.js, so I cannot but say good things about NGL.js. However, one or two things in the Python module NGLView are not immediately clear, so I'll quickly cover dealing with multiple poses here.
Sunday, 1 November 2020
Remote notebooks and Jupyter themes
Saturday, 31 October 2020
XML to Pyrosetta: EvolutionaryDynamicsMover as an example
Tuesday, 27 October 2020
Pyrosetta scripting without a manual
I was recently asked how to work out how to write a PyRosetta script when there is no example. This is definitely the biggest weakness of PyRosetta and Rosetta Scripts, but it is not insurmountable. In fact, there is a wealth of hidden information that can be mined. Here is how, and in the next post I give an example.
Friday, 9 October 2020
The Freedom unit for molar energy: the foot-pound-force per pound-mole
In computational biochemistry the most commonly used quantity is molar energy. The SI unit is kJ/mol (kilojoule per mole), but kcal/mol is used about as frequently: Google enumerates 5.3e6 and 3.8e6 pages for them respectively. Different programs use one or the other: GROMACS uses kJ/mol, while Rosetta uses kcal/mol. They differ by a factor of about 4 (1 kcal/mol = 4.184 kJ/mol). The latter has the advantage that 1 kcal/mol is roughly the strength of a hydrogen bond and that kBT·NA (i.e. RT) is about 0.6 kcal/mol at both 25°C and 37°C, while the former, being SI, sounds more sciency (and not in the overly obnoxious way of folk who use Kelvin for enzymology).
However, although it is not an SI unit, kcal/mol is still very metric and European; after all, the unit calorie was introduced by a Frenchman. Therefore, a more American unit is clearly required. Hence the need for the foot-pound-force per pound-mole.
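For the curious, the conversion is plain arithmetic. Below is a minimal sketch that builds the factors from the exact definitions of the foot, the pound, standard gravity and the thermochemical calorie; the function names are made up for illustration.

```python
# Exact defining constants
J_PER_KCAL = 4184.0         # thermochemical kilocalorie in joules
M_PER_FT = 0.3048           # international foot in metres
KG_PER_LB = 0.45359237      # avoirdupois pound in kilograms
STANDARD_GRAVITY = 9.80665  # m/s^2, turns a pound-mass into a pound-force

# Derived factors
J_PER_FT_LBF = M_PER_FT * KG_PER_LB * STANDARD_GRAVITY  # about 1.3558 J
MOL_PER_LBMOL = KG_PER_LB * 1000                        # 453.59237 mol


def kj_per_mol_to_kcal_per_mol(energy: float) -> float:
    """kJ/mol (GROMACS) to kcal/mol (Rosetta)."""
    return energy * 1000 / J_PER_KCAL


def kcal_per_mol_to_ftlbf_per_lbmol(energy: float) -> float:
    """kcal/mol to foot-pound-force per pound-mole."""
    joules_per_lbmol = energy * J_PER_KCAL * MOL_PER_LBMOL
    return joules_per_lbmol / J_PER_FT_LBF


one_kcal_in_freedom_units = kcal_per_mol_to_ftlbf_per_lbmol(1.0)
```

So 1 kcal/mol comes out at roughly 1.4 million ft·lbf/lb-mol, which at least makes every hydrogen bond sound impressive.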
Wednesday, 7 October 2020
Rosetta/Pyrosetta on a cluster or in the cloud
Due to licensing, Rosetta and PyRosetta cannot be installed via apt-get/pip but have to be downloaded from the Rosetta Commons website. This makes things harder if you are in a Colab notebook, ssh'ed into a machine or running off a remote Jupyter notebook. Luckily, it actually is straightforward.
Saturday, 8 August 2020
Stay hydrated
Tuesday, 21 July 2020
Switching ligand in a PDB with Fragmenstein
Saturday, 4 July 2020
Filling missing loops —the proper way
Since posting this, I realised one can do it even faster by hijacking the threading algorithm, which, albeit not its intended purpose, works fine for fixing a structure without supervision, something the methods discussed below do need.
Monday, 8 June 2020
Love thy neighbours, but select them with caution
NeighborhoodResidueSelector behaves differently than PyMOL's expand selector and it is good to be aware of it. Namely, in PyMOL the distance is from any atom, while in Rosetta it is from the center of mass atom, unless specified differently. In reality CloseContactResidueSelector works like PyMOL's expand selector.
Sunday, 19 April 2020
How to set up an electron density scorefunction in Pyrosetta
Wednesday, 18 March 2020
Atom names purely in RDKit
CA is the standard name for the α-carbon. Example uses of atom names in Rosetta/pyrosetta include setting constraints, using a params file for a custom ligand and so forth. However, RDKit is a bit of a nuisance with atom names as it is not a central feature, but a feature added for PDB files that is not too well documented.
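A minimal sketch of attaching a PDB-style atom name to an RDKit atom follows. It relies on `Chem.AtomPDBResidueInfo`, `Atom.GetPDBResidueInfo` and `Atom.SetMonomerInfo`; the helper names `pdb_atom_name` and `set_atom_name` are made up for illustration, and RDKit is imported lazily so that the padding helper runs without it. The padding follows the PDB convention that the name field is four characters wide and names of one-letter elements start in the second column, which is why the α-carbon is " CA " while calcium is "CA  ".

```python
def pdb_atom_name(name: str, element: str) -> str:
    """Pad an atom name to the four-character PDB field."""
    name = name.strip()
    if not 1 <= len(name) <= 4:
        raise ValueError(f'Atom name {name!r} does not fit in four characters')
    if len(element) == 1 and len(name) < 4:
        # One-letter elements leave the first column blank
        return (' ' + name).ljust(4)
    return name.ljust(4)


def set_atom_name(atom, name: str) -> None:
    """Set the PDB atom name of an rdkit.Chem.Atom (hypothetical helper)."""
    from rdkit import Chem  # lazy import: only needed when actually naming atoms

    padded = pdb_atom_name(name, atom.GetSymbol())
    info = atom.GetPDBResidueInfo()
    if info is None:
        # Atoms not read from a PDB file carry no residue info yet
        info = Chem.AtomPDBResidueInfo()
        info.SetName(padded)
        atom.SetMonomerInfo(info)
    else:
        info.SetName(padded)


# Example usage, assuming RDKit is installed:
# mol = Chem.MolFromSmiles('NCC(=O)O')            # glycine
# set_atom_name(mol.GetAtomWithIdx(1), 'CA')      # name the α-carbon
# mol.GetAtomWithIdx(1).GetPDBResidueInfo().GetName()
```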