I have this rather handy wee piece of code I'd like to share: a Jupyter notebook Progress bar.
A segfault and NaN driven series of disconnected ideas, analyses and just plain silly posts about computational biochemistry, synthetic biology and microbiology.
Thursday, 8 August 2019
Saturday, 3 August 2019
When will the PDB run out of 4-letter codes?
The PDB IDs are really nice and short: 4-letter codes. But when will all the combinations run out? Actually, not for a long, long time.
The current total is 155,618 structures and new ones are added at a rate of about 12,000 structures per year, which means that, assuming constant growth, the PDB will run out of codes to allocate in roughly 127 years: (36 ^ 4 - 155,618) / 12,000.
2146. That is a few years after the setting of Kim Stanley Robinson's New York 2140, where New York is a flooded super-Venice, so I am guessing the RCSB PDB, in San Diego, will have long been flooded and a lack of 4-letter codes will not be top of their concerns.
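The estimate is easy to check in a few lines of Python. This is only a sketch using the figures quoted in the post; it takes at face value the assumption that all 36 ^ 4 combinations of digits and letters are usable and that the growth rate stays constant. The variable names are my own.

```python
# Figures quoted in the post (2019 snapshot).
ALPHABET_SIZE = 36          # 26 letters + 10 digits
CODE_LENGTH = 4
current_structures = 155_618
structures_per_year = 12_000  # assumed constant
start_year = 2019

# Assumption: every 4-character combination is a valid, unused code.
total_codes = ALPHABET_SIZE ** CODE_LENGTH
years_left = (total_codes - current_structures) / structures_per_year
final_year = start_year + round(years_left)

print(f"{total_codes:,} possible codes")
print(f"about {years_left:.0f} years left, i.e. around {final_year}")
```

With these inputs the division comes out at roughly 127 years, so any change in the deposition rate moves the date considerably.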
Tuesday, 2 July 2019
Wikipedia datamining
There are several online sites that can be data-mined to reveal really nice trends, top-10s and top-down summaries. Twitter is the archetypal site for this, thanks to hashtags making it an easy job for anyone wanting to investigate trends. I prefer Reddit for datamining specific trends as it is powered by folk having arguments on topics they are passionate about, as opposed to the ideas of celebrities, corporate spokespeople and ФСБ agents.
eBay is also fun as it reveals what people are willing to pay for things. But the best source of data, even for other datasets, is Wikipedia. Not only to read up on things, but also to get data for things within a given "category".
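As a rough illustration of pulling the pages within a given category, the sketch below builds a query for the MediaWiki API's `categorymembers` list and extracts the page titles from the JSON reply. The helper names and the example category are hypothetical, and the post's own approach may differ.

```python
import json
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def category_query_url(category: str, limit: int = 50) -> str:
    """Build a MediaWiki API URL listing the members of a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def member_titles(response_text: str) -> list[str]:
    """Extract page titles from a categorymembers JSON reply."""
    data = json.loads(response_text)
    return [member["title"] for member in data["query"]["categorymembers"]]


# Hypothetical example category; fetching the URL is left to the reader.
print(category_query_url("Protein families", limit=10))
```

The reply can be fetched with `urllib.request.urlopen` (or any HTTP client) and passed to `member_titles`; large categories are paginated, which this sketch does not handle.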
Friday, 28 June 2019
Exporting Jupyter notebooks with Plotly graphs
If it is a small project or analysis, I opt for a Jupyter notebook rather than an IDE such as PyCharm, which is great for large projects, but not so much for a small analyse-as-you-go project. Plotly is my go-to for graphs (I proselytise about it). The advantage is that it is a wrapper for a JS library, which allows interactivity. However, on my system at least, when I use the plotly.offline.iplot plotter and export the notebook as HTML, an error is thrown due to require not being set up correctly. This is easily fixed.
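The fix itself is not shown in this excerpt. One possible workaround, assuming the error is simply that `require` is undefined in the exported page, is to post-process the HTML so that RequireJS is loaded in the head. The function name and the CDN URL below are my own assumptions, not taken from the post.

```python
# Assumption: loading RequireJS from a CDN is acceptable for the exported page.
REQUIREJS_TAG = (
    '<script src="https://cdnjs.cloudflare.com/ajax/libs/'
    'require.js/2.3.6/require.min.js"></script>'
)


def add_requirejs(html: str) -> str:
    """Insert a RequireJS script tag right after <head>, unless one is present."""
    if "require.js" in html or "require.min.js" in html:
        return html  # already loads RequireJS, leave untouched
    before, head_tag, after = html.partition("<head>")
    if not head_tag:
        # No plain <head> tag found: fall back to prepending the script.
        return REQUIREJS_TAG + html
    return before + head_tag + REQUIREJS_TAG + after


exported = "<html><head><title>analysis</title></head><body></body></html>"
print(add_requirejs(exported))
```

In practice the exported file would be read from disk, passed through `add_requirejs` and written back.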
Friday, 31 May 2019
A note on the Linux PyMOL C01 atom oddity
This weird bug has been haunting me for ages. The builder in PyMOL 1.8 and in Linux PyMOL 2 (but not PyMOL 2 on Windows or Mac) creates residues with a Cα called C01 as opposed to CA. If any operation is done to these (e.g. Rosetta Relax), they will be discarded during the reading of the file. That is, they will not be fixed and, worse, if Rosetta Remodel is used, it will assume that the residue never existed because, annoyingly, Remodel does not understand PDB numbering. Simply substituting every 'C01' with 'CA' fixes the problem.
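A minimal sketch of that substitution is below. Rather than a blind find-and-replace, it only touches the atom-name field (columns 13 to 16) of ATOM records, so that a ligand atom legitimately named C01 in a HETATM record is left alone; that restriction and the function name are my own choices.

```python
def fix_c01(pdb_text: str) -> str:
    """Rename atoms called C01 to CA in the ATOM records of a PDB block."""
    fixed = []
    for line in pdb_text.splitlines(keepends=True):
        # Atom name occupies columns 13-16 (0-based slice 12:16).
        if line.startswith("ATOM") and line[12:16].strip() == "C01":
            line = line[:12] + " CA " + line[16:]
        fixed.append(line)
    return "".join(fixed)


# Tiny example built field by field to keep the fixed-width columns right.
example = "ATOM  " + "    2" + " " + " C01" + " ALA A   1\n"
print(fix_c01(example), end="")
```

The result can be written back to a file before handing the structure to Rosetta.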
Thursday, 16 May 2019
The secondary metabolism of pineberry strawberries
For an upcoming open day we will extract DNA from strawberries. For this I made a slide that explains how DNA mutations lead to protein variants, which in turn lead to different phenotypes (redness in the strawberry's case). In doing this, I got fascinated by a strawberry cultivar called "Pineberry". Not because it is unpigmented, but because the reviews online say it is bland, which means that a rather early enzyme is missing, resulting in both an unpigmented phenotype and a bland phenotype.
Sunday, 24 March 2019
An arrow between Bootstrap cards
Recently I wanted to add an arrow (as in the triangle at the side of a tooltip or popover) pointing from one card to its neighbour. It is only a few lines of code, but oddly the solutions available online are overly complex and wasteful. So this is my bare-bones solution.