Sunday, 20 November 2022

glibc 2.36 vs. CentOS 7: a tale of failure

Cluster

My favourite part of coding is planning and implementing some cool idea for doing something, especially if it involves some fun maths I read up on Wikipedia a minute beforehand. In reality polishing dirty data, refactoring someone-else's bad code, reverse engineering the use of a module and trying to get stuff to work is what take up most of my time.

Having got cocky I thought I could get the latest GNU library for C (glibc) working on CentOS 7. I failed miserably, here is my sorry tale down the rabbit hole.

Friday, 4 November 2022

A module is not a namespace but a base class, because... ?

Deep learning is changing the world and fast. The list of achievements is impressive, however, why focus on the positive, when we can moan about the negative? In this blog post I will discuss three minor details that I find annoying about deep learning, namely the key word Module, the limited use of Google/Coral Edge TPUs and the coding quality of the field.

Saturday, 8 October 2022

Star imports trick

Star-imports (from typing import *) in Python are a handy, but dangerous. They are meant for quick coding, i.e. like on a jupyterlab notebook. However they are bad as they can mask other variables and cause issues down the line. They are ubiquitous online as are guides explaining why they are bad, here I just want to share a handy snippet to iron out star-imports.

Saturday, 1 October 2022

Move aside coIP Westerns, ColabFold has got this!

Recently AlphaFold2 released a new batch of models, this time covering all of the Trembl sequences in Uniprot, resulting in a huge number, which got hashtag-academic-twitter and some news editors very excited for the stamp-collecting feat. Personally, I find it annoying, not because it's pointless, but as of writing this, it has made any search for a target by name swamped by irrelevant sequences.
However, AlphaFold is great for other feats.
I have blogged about it a few times (e.g. link), which gives away my positive view of it! It can predict oligomers, with a lot more precision and confidence than docking. It does not always work either technically or meet the hypothesis. I did a long series of experiments with a hypothesis in mind which wasn't valid in the end (here), but revealed novel science and took a few minutes to set up and a few hours to run, which would have taken years if done by Western blot of a co-immunoprecipitation or cross-linking mass-spec.

Sunday, 19 June 2022

Top 10 silliest PDB residue names for ligands!

UPDATE: The PDB will finish 3 letter chemical component IDs sometime before 2024 at which point they will switch to 5 letter codes, which will be usable solely in CIF format: https://www.wwpdb.org/news/news?year=2022#630fee4cebdf34532a949c34

In some situations it is handy to use in an in silico experiment a 3-letter residue name that is not taken in the PDB. For example, PyRosetta has a system of pregenerated topologies for PDB components, which can cause issues when a ligand is loaded and the movers may use that over an incorrectly provided residue type / param file, resulting in a blown up mishapen ligand —an overly common incident*. As a result, having a list handy of what is taken is helpful. Herein are some silly observations about what the taken and untaken names are —but not ranked as a top 10, because this is not a science blog, not my local newspaper.

Saturday, 4 June 2022

Annotate as you go

There's a counter-constructive saying: a project is dead as soon as you add documentation (Aeschylus, I believe).

This could not be more incorrect. Whereas it is true that writing documentation on an evolving project will quickly result in the fresh documentation becoming quickly invalid, it is a planning truth that writing documentation once a project is finishing is impossible because there are a hundread and one more pressing issues. Therefore, adding docstrings to each function, method and class in Python as one goes along is by far more advantageous. Once this is done, however this information needs to be transmuted into documentation. Here is how once can set up ReadTheDocs without falling into a few traps, as the documentation generator Sphinx is ironically weirdly documented and should be done ideally early on, so one knows what mistakes one's making.

Tuesday, 10 May 2022

Show neighbours in nglview

Nglview is a really nice Python library which encodes a widget to show a NGL viewport, a JS 3D protein viewer used until recently by the PDB. One annoying feature is that one cannot select neighbours as easily as say PyMOL's "select byres HEM around 3".  But it is possible and here is how.