Sunday, 31 October 2021

Multiple sequence alignments

A sequence alignment is a rather important tool.
  • Sequence conservation is a key ingredient in most nucleotide mutation severity predictors.
  • The covariance within it powers the AlphaFold2 Evoformer and other de novo structure predictors.
  • The phylogeny extracted from it tells the evolutionary tale of the protein
However, on the very basic level, i.e. getting a nice figure, far from the world of covariance matrices, it is a slight nuisance.
Therefore I would like share some pointers on choosing species and two python operation, namely getting the equivalent residue in a homologue and making a figure in Plotly. Just like with docking, where careful and diligent human choices make all the difference, rational choices help greatly with clarity for sequence alignments.

Sunday, 17 October 2021

Filling missing loops by cannibalising AlphaFold2

I could not resist this Photoshop.
But the process is not as dramatic
and the results not as bad as Temple of Doom...
If done right.
AlphaFold2 models have a complete sequence, but for innumerable reasons the crystal structure of the protein is better, but may have missing spans. As a result one may want, for illustrative purposes only, to rip out the required parts from the AlphaFold2 models (as fragments) and have them built into the target structure. Here is how to do it by threading.

Monday, 23 August 2021

Tweaking AlphaFold2 models with PyRosetta

In a previous post I explored the pitfalls of an AlphaFold2 model from EBI. Here I thought I'd share some PyRosetta methods that may be handy to use with AlphaFold2 models.

Tuesday, 27 July 2021

What to look out for with an AlphaFold2 model

There is nothing more disheartening than telling someone "Sorry, I cannot help you with your protein, because no homologue structures of your protein are solved and any model will be rubbish". Now, with AlphaFold2 proteome release this is no longer the case. Or mostly: in fact there are several pitfalls and issues that need to be looked at, because the algorithm does not account for three things: binding partners and ligands, oligomerisation and alternate conformations.

Wednesday, 7 July 2021

Per residue RMSD

Recently I calculated the local RMSD caused by each residue and I thought I'd share the methods I used using PyRosetta —it is nothing at all novel, but I could not find a suitable implementation. The task is simple given two poses, find out what residue's backbone is changing the most by scanning along comparing each a short peptide window from each.

Monday, 26 April 2021

Remodel in Pyrosetta


The Rosetta binary Remodel is a great tool as it allows interesting designs to be made. However, it is rather incompatible with Rosetta Scripts and Pyrosetta as it is heavily dependent on command line options for customisation and repeats some of the processes internally. Despite this, it can be cohersed rather effectively to work in Pyrosetta with some convenience and this is how.

Monday, 22 February 2021

Multiple poses in NGLView

As mentioned previously, most of my Pyrosetta operations are done in a Jupyter notebook run in a cluster node. As a result, I am heavily dependent on NGLView, an IPython widget that uses NGL.js. This is nice for some quick tasks, although admitted more limited than the PyMOL mover, which however requires another ssh to forward another port. My Michelanglo webapp uses NGL.js, so I cannot but say good things of NGL.js. However, one or two things in the Python module NGLView are not immediately clear, so I'll quickly cover dealing with multiple poses here.

Sunday, 3 January 2021

Things I wish I had known about Raspberry Pis

Raspberry Pis are very fun to play with and are relatively cheap —I believe I have 6 or 7 Pis running across the house. However, very often I learnt a thing or two that I wish I had known beforehand —sometimes rather fortuitously. So here is my list of top "in-hindsight" tips from the perspective of a seasoned coder with no prior electronics experience.