Saturday, 12 October 2019

Pictograms with Plotly and FontAwesome

Plotly is one of the most powerful graphing packages for Python, JS and Julia. Its coolest feature is that the graphs are HTML-based and interactive, as opposed to a static JPG. There are several graph types that are missing, one of which is the pictogram. It's not a very silly graph. Luckily, a pictogram is easy-ish to make.

Wednesday, 4 September 2019

PDB numbering rollercoaster

The residue positions in a crystal structure and in the protein sequence rarely match. In fact, there are four different start-end ranges to keep track of:
  • position in whole protein,
  • position in extracted sequence,
  • position in residues stated in the PDB/mmCIF structure and
  • position which actually has coordinates.

Thursday, 8 August 2019

Jupyter notebook progressbar

I have this rather handy wee piece of code I'd like to share: a Jupyter notebook Progress bar.

Saturday, 3 August 2019

When will the PDB run out of 4-letter codes?

The PDB ids are really nice and short: 4 letter codes. But when will all the combinations run out? Actually, not for a long long time.
The current total is 155,618 structures and new ones are added at a rate of about 12,000 structures per year. Assuming constant growth, the PDB will run out of codes to allocate in roughly 127 years: (36^4 - 155,618) / 12,000.
That is around 2146, a few years after the setting of Kim Stanley Robinson's New York 2140, where New York is a flooded super-Venice, so I am guessing the RCSB PDB, in San Diego, will have long been flooded and a lack of 4-letter codes will not be top of their concerns.
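
The back-of-the-envelope estimate can be reproduced with a few lines of Python. This is only a sketch of the arithmetic quoted above: it assumes every one of the four positions can hold any of 36 characters and that the deposition rate stays constant, and the variable names are mine.

```python
import string

# Assumption (from the text): each of the 4 positions can be any of 36 characters.
ALPHABET = string.digits + string.ascii_uppercase
total_codes = len(ALPHABET) ** 4  # 36 ** 4

current_total = 155_618  # structures in August 2019, as quoted above
rate_per_year = 12_000   # assumed constant deposition rate

# Years until the remaining codes are used up at that rate.
years_left = (total_codes - current_total) / rate_per_year
exhaustion_year = 2019 + round(years_left)

print(f"{total_codes:,} possible codes")
print(f"about {years_left:.0f} years left, so around {exhaustion_year}")
```

Changing `rate_per_year` shows how sensitive the date is to the growth assumption.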

Tuesday, 2 July 2019

Wikipedia datamining

There are several online sites that can be data-mined to reveal really nice trends, top-10s and top-down summaries. Twitter is the archetypal site for this, thanks to hashtags making it an easy job for anyone wanting to investigate trends. I prefer Reddit for datamining specific trends, as it is powered by folk having arguments on topics they are passionate about, as opposed to the ideas of celebrities, corporate spokespeople and ФСБ agents. eBay is also fun, as it reveals what people are willing to pay for things. But the best source of data, even for other datasets, is Wikipedia: not only to read up on things, but also to get data for things within a given "category".
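
As a rough illustration of pulling the members of a category, here is a minimal sketch using the public MediaWiki API (`action=query` with `list=categorymembers`) and only the standard library. The function names, the User-Agent string and the example category are my own placeholders, and it only fetches the first batch of results rather than following continuation tokens.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def category_query_url(category, limit=50):
    """Build the API URL listing pages in 'Category:<category>'."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def fetch_category_members(category, limit=50):
    """Return the page titles in the first batch of category members."""
    request = Request(
        category_query_url(category, limit),
        # Placeholder User-Agent; replace with something identifying your script.
        headers={"User-Agent": "category-sketch/0.1 (example)"},
    )
    with urlopen(request) as response:
        data = json.load(response)
    return [member["title"] for member in data["query"]["categorymembers"]]
```

Calling `fetch_category_members("Physics")` needs network access and returns a list of page titles, which can then be looked up one by one.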

Friday, 28 June 2019

Exporting Jupyter notebooks with Plotly graphs

If it is a small project or analysis, I opt for a Jupyter notebook rather than an IDE such as PyCharm, which is great for large projects, but not so much for a small analyse-as-you-go project. Plotly is my go-to for graphs (I proselytise about it). The advantage is that it is a wrapper for a JS library, which allows interactivity. However, on my system at least, when I use the plotly.offline.iplot plotter and export the notebook as HTML, an error is thrown due to require not being set up correctly. This is easily fixed.

Friday, 31 May 2019

A note on the PyMOL 1.8 C01 atom oddity

This weird bug has been haunting me for ages. The PyMOL 1.8 (not 2) builder creates residues with a Cα called C01 as opposed to CA. If any operation is done on these (e.g. Rosetta Relax), they will be discarded when the file is read. That is, they will not be fixed and, worse, if Rosetta Remodel is used, it will assume that the residue never existed, because, annoyingly, Remodel does not understand PDB numbering. Simply substituting all 'C01' with 'CA' fixes the problem.
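
One way to do the substitution is sketched below. It is a slightly more cautious variant than a blanket find-and-replace: it assumes a standard fixed-width PDB file and only touches the atom-name field (columns 13-16) of ATOM/HETATM records, so a ligand or residue that happens to contain the text "C01" elsewhere on the line is left alone. The function name is my own.

```python
def fix_c01_atoms(pdb_text):
    """Rename atoms called C01 to CA in ATOM/HETATM records of a PDB block."""
    fixed_lines = []
    for line in pdb_text.splitlines(keepends=True):
        is_atom = line.startswith(("ATOM", "HETATM"))
        # The atom name occupies columns 13-16, i.e. the slice 12:16.
        if is_atom and line[12:16].strip() == "C01":
            line = line[:12] + " CA " + line[16:]
        fixed_lines.append(line)
    return "".join(fixed_lines)
```

Reading a file, passing its contents through `fix_c01_atoms` and writing the result back out is enough to use it. Note that this renames every atom called C01, so it is only appropriate when C01 really is the misnamed Cα.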