Pages

Monday, 22 February 2021

Multiple poses in NGLView

As mentioned previously, most of my Pyrosetta operations are done in a Jupyter notebook run in a cluster node. As a result, I am heavily dependent on NGLView, an IPython widget that uses NGL.js. This is nice for some quick tasks, although admitted more limited than the PyMOL mover, which however requires another ssh to forward another port. My Michelanglo webapp uses NGL.js, so I cannot but say good things of NGL.js. However, one or two things in the Python module NGLView are not immediately clear, so I'll quickly cover dealing with multiple poses here.

Components

A key concept to understand are components. In PyMOL, in the right hand panel are the "objects" in the scene, models, maps, GCOs, distance lines, etc. Each model loaded is in its own object. Alignment happens between objects and different objects can have the same chain id and residue id —hence why when they are merged with the create command, sometimes things can go wrong.

In NGL these are called components and work the same way —mostly. The two major differences are the names and implementation. The names of the representations does not fully match, eg. spheres in PyMOL is spacefill in NGL. It's fairly intuitive and for Michelanglo, I made a conversion table. The two store how to implement representations differently. Every atom in an object in PyMOL has a representation binary value and a colour, which determines how it is displayed. As a result an atom can have two or more representation, but will have the same colour. In NGL, each component has an array representations, each potentially assigning a different colour to an atom.

Conversion to PDB file

There are several really handy built-ins. This will show a pyrosetta pose:

import nglview as nv
nv.show_rosetta(pose)

While this will show an RDKit molecule

import nglview as nv
nv.show_rdkit(mol)

Behind the scenes the RDKit::Chem::Mol object and the Pyrosetta::Pose object are converted to a PDB file.

As far as I know, to add extra components this wrapper cannot be used as is and a filehandle of a PDB file only can be passed to the method view.add_component(fh, ext='pdb'). Manually filling the component list from another view is fruitless as this will not alter the array of the scene object in JS. Consequently, Chem.MolToMolFile(mol, filename) and pose.dump_pdb(filename) operations with a tempfile.NamedTemporaryFile can be used or Chem.MolToMolBlock(mol, filename) and pose.dump_pdb(buffer) and io.StringIO.

I prefer the latter, so here is how to use it:

view = nv.NGLWidget()
fh = StringIO(Chem.MolToPDBBlock(mol))
view.add_component(fh, ext='pdb')

Parenthetically, as mentioned in another post, the output of the last line gets shows with display from IPython.display. So to show the view, having it as the last line is or display(view) or view.display() will show the ngl widget —do not that subsequent displayed nglview widgets do not change.

To get a the PDB block a Pyrosetta pose as a string, one has to go through a stream object.

buffer = pyrosetta.rosetta.std.stringbuf()
ostream = pyrosetta.rosetta.std.ostream(buffer)
pose.dump_pdb(ostream)
pdbblock = buffer.str()

So in the context of nglview:

def output_pdbblock(pose: pyrosetta.Pose) -> str:
    buffer = pyrosetta.rosetta.std.stringbuf()
    pose.dump_pdb(pyrosetta.rosetta.std.ostream(buffer))
    return buffer.str()
view = nv.NGLWidget()
fh = StringIO(output_pdbblock(pose))
view.add_component(fh, ext='pdb')

Representations

Multiple components need to be coloured differently. Let's saw we have wild type and mutant and we want to show them in green and red. We can do:

import nglview as nv
from io import StringIO

class ModNGLWidget(nv.NGLWidget):
    def add_rosetta(self, pose: pyrosetta.Pose):
        buffer = pyrosetta.rosetta.std.stringbuf()
        pose.dump_pdb(pyrosetta.rosetta.std.ostream(buffer))
        fh = StringIO(buffer.str())
        view.add_component(fh, ext='pdb')

view = nv.ModNGLWidget()

view.add_rosetta(ref_pose)
view.clear_representations(component=0)
view.add_cartoon(color='green', component=0) 
view.add_rosetta(mut_pose)
view.clear_representations(component=1)
view.add_cartoon(color='red', component=1)
view


The colours can be CSS colour names, "red" or "green" or hex codes. In this case, the colours look awful because they are primary. Using the default ggplot colors, which for a pair are salmon (#F8766D) and a medium-dark cyan (#00B4C4) one gets a nicer figure.


In fact one could use a generator to make the colours. Say

def get_ggplot_colour_scale(self, n:int=7):
    ggplot_color_scales = {1: ['#F8766D'],
                           2: ['#F8766D', '#00B4C4'],
                           3: ['#F8766D', '#00BA38', '#619CFF'],
                           4: ['#F8766D', '#7CAE00', '#00BFC4', '#C77CFF']
                           7: ['#F8766D', '#C49A00','#53B400','#00C094','#00B6EB','#A58AFF','#FB61D7']
                           }
    if n in ggplot_color_scales:
        return iter(ggplot_color_scales[n])
    else:
        return iter(ggplot_color_scales[7])

colors = get_ggplot_colour_scale(3)
view.add_cartoon(color=next(colors), component=n)

Do note that the colours in ggplot are generated by spacing equally in hue along the colour wheel (using the HCL model), so change depending on the size, the above is for seven colours. The starting default colour is salmon: but the more colours are added the more they get pushed aside, hence why it starts with a cyan, then green and blue, then a dirty green, a similar turqoise-blue colour to the one in the bichromatic one and a purple. For the sake of sanity, just go with 'salmon', 'turquoise', 'emerald', 'teal', 'lilac' and 'gold' (mustard).

ColorValue


In the case of atomistic represenatations, such as licorice or hyperball it is important to colour only the carbon. In PyMOL, element C is a staple in the selection string (unless you horridly simply colour by the colour atomic afterwards, in which case you should be ashamed). In NGL, the attribute is not color, but colorValue. Note that colorValue does not work in cartoon (as this would make no sense). Combining together:

from rdkit import Chem
from rdkit.Chem import AllChem 
import nglview as nv
from io import StringIO
from typing import *
from warnings import warn
   
def get_ggplot_colour_scale(n:int=7):
    ggplot_color_scales = {1: ['#F8766D'],
                           2: ['#F8766D', '#00B4C4'],
                           3: ['#F8766D', '#00BA38', '#619CFF'],
                           4: ['#F8766D', '#7CAE00', '#00BFC4', '#C77CFF'],
                           7: ['#F8766D', '#C49A00','#53B400','#00C094','#00B6EB','#A58AFF','#FB61D7']
                           }
    if n in ggplot_color_scales:
        return iter(ggplot_color_scales[n])
    else:
        return iter(ggplot_color_scales[7])


def get_ggplot_colour_scale(n:int=7) -> Iterable[NewType('ColorHex', str)]:
    ggplot_color_scales = {1: ['#F8766D'],
                           2: ['#F8766D', '#00B4C4'],
                           3: ['#F8766D', '#00BA38', '#619CFF'],
                           4: ['#F8766D', '#7CAE00', '#00BFC4', '#C77CFF'],
                           7: ['#F8766D', '#C49A00','#53B400','#00C094','#00B6EB','#A58AFF','#FB61D7']
                           }
    if n in ggplot_color_scales:
        return iter(ggplot_color_scales[n])
    else:
        return iter(ggplot_color_scales[7])


def crete_multiple_rdkit_view(*mols: Chem.Mol) -> nv.NGLWidget:
    if len(mols) == 1 and isinstance(mols[0], Sequence):
        warn('Expected each `Chem.Mol` as an argument, got  `Sequence[Chem.Mol]`')
        mols = mols[0]
    colors = get_ggplot_colour_scale(len(mols))
    view = nv.NGLWidget()
    for m, mol in enumerate(mols): #: Tuple[int, Chem.Mol]
        fh = StringIO(Chem.MolToMolBlock(mol))
        view.add_component(fh, ext='mol')
        view.clear_representations(component=m)
        view.add_licorice(colorValue=next(colors), component=m, multipleBond='symmetric')
    return view


def show_colors(*hit_names: str) -> None:
    if len(hit_names) == 1 and isinstance(hit_names[0], Sequence):
        warn('Expected each `str` as an argument, got  `Sequence[str]`')
        mols = mols[0]
    from IPython.display import display, HTML
    template = '{0}'
    zipped = zip(hit_names, get_ggplot_colour_scale(len(hit_names)))
    spans = [template.format(name, color) for name, color in zipped]
    display(HTML('
'.join(spans))) # ---------------------------------------------------------- ortho = Chem.MolFromSmiles('Cc1cc(O)ccc1') meta = Chem.MolFromSmiles('Cc1ccc(O)cc1') AllChem.EmbedMolecule(ortho) AllChem.EmbedMolecule(meta) Chem.rdMolAlign.AlignMol(prbMol=ortho, refMol=meta, atomMap=[(0,0), (1,1)], weights=[1,1], ) show_colors('ortho', 'meta') view = crete_multiple_rdkit_view(ortho, meta) view.download_image() view

No comments:

Post a Comment