A note on PLIP interactions

Sunday, 2 July 2023

A note on PLIP interactions

PLIP is a handy tool to enumerate the interactions of a given ligand. However, a few of tripping point I keep having is related to the fact the interactions are namedtuples. Here are some notes to circumvent the traps.

No SMILES and no connectivity?

One thing to note is that PLIP does not accept a SMILES. It runs OBMol.PerceiveBondOrder to determine bond order (OEDetermineConnectivity and OEPerceiveBondOrders are the OpenEye equivalent) prior to that for the connectivity if absent.
This is far from ideal for compounds that are far from ideal in the first place. As a result generating a protonated structure with correct bond order by repeated CONECT lines is preferable in my opinion (example script).
To tell PLIP you have protons already, use:

from plip.basic import config

config.NOHYDRO = True

EDIT. Something funky goes on behind the scenes, so it might be best to remove the hydrogens and accept any errors from incorrectly percieved bond orders.


from rdkit import Chem
from rdkit.Chem import AllChem
from openbabel import pybel
from plip.basic import config
from plip.structure.preparation import PDBComplex


config.NOHYDRO = True
mol = AllChem.AddHs(Chem.MolFromSmiles('c1ccccc1'))
AllChem.EmbedMolecule(mol)

block = Chem.MolToPDBBlock(mol)  # rubbish CONECT is flavor=8, but it's still okay
pb_mol = pybel.readstring('pdb', block, {'s': None})
print([a.type for a in pb_mol.atoms])
pb_mol.OBMol.PerceiveBondOrders()
print([a.type for a in pb_mol.atoms])
p = PDBComplex()
p.load_pdb(block, as_string=True)
print('this is wrong!')
print([a.type for a in p.ligands[0].mol.atoms])

These are not the interactions you are looking for

It is easy to trip up with namedtuples. The collections.namedtuple is a class factory, a function that returns a class. This returned class is a subclass of tuple (a class), not of collections.namedtuple, which is a function, so would be impossible. It is far from the labrythm of madness of Boost types, but I do trip up on this.

from collections import namedtuple
from types import FunctionType

Foo = namedtuple('Foo', ['a', 'b', 'c'])
foo = Foo(1,2,3)

assert isinstance(foo, Foo)
assert isinstance(namedtuple, FunctionType)
assert isinstance(Foo, type)  # The type of a class is type
print(Foo.__mro__)  # ('__main__.Foo','tuple','object')
assert isinstance(tuple, type)

The name given to namedtuple becomes __name__, which is not used for isinstance checking.

Foo = namedtuple('Foo', ['a', 'b', 'c'])
FauxFoo = namedtuple('Foo', ['a', 'b', 'c'])

isinstance(Foo(1,2,3), FauxFoo), Foo.__name__, FauxFoo.__name__  # (False, 'Foo', 'Foo')

This is obvious, but is a problem as the interaction classes are not exposed, cf. code.
For the sake of sanity here are the relevant lines.


from collections import namedtuple

hydroph_interaction = namedtuple('hydroph_interaction', 'bsatom bsatom_orig_idx ligatom ligatom_orig_idx '
                                             'distance restype resnr reschain restype_l, resnr_l, reschain_l')
hbond = namedtuple('hbond', 'a a_orig_idx d d_orig_idx h distance_ah distance_ad angle type protisdon resnr '
                               'restype reschain resnr_l restype_l reschain_l sidechain atype dtype')
pistack = namedtuple(
        'pistack',
        'proteinring ligandring distance angle offset type restype resnr reschain restype_l resnr_l reschain_l')
pication = namedtuple(
        'pication', 'ring charge distance offset type restype resnr reschain restype_l resnr_l reschain_l protcharged')
saltbridge = namedtuple(
        'saltbridge', 'positive negative distance protispos resnr restype reschain resnr_l restype_l reschain_l')
halogenbond = namedtuple('halogenbond', 'acc acc_orig_idx don don_orig_idx distance don_angle acc_angle restype '
                                 'resnr reschain restype_l resnr_l reschain_l donortype acctype sidechain')
waterbridge = namedtuple('waterbridge', 'a a_orig_idx atype d d_orig_idx dtype h water water_orig_idx distance_aw '
                                     'distance_dw d_angle w_angle type resnr restype reschain resnr_l restype_l reschain_l protisdon')
metal_complex = namedtuple('metal_complex', 'metal metal_orig_idx metal_type target target_orig_idx target_type '
                                       'coordination_num distance resnr restype '
                                       'reschain  restype_l reschain_l resnr_l location rms, geometry num_partners complexnum')

itxn_names = ('hydroph_interaction', 'hbond', 'pistack', 'pication', 'saltbridge', 'halogenbond', 'waterbridge', 'metal_complex')
itxn_classes = (hydroph_interaction, hbond, pistack, pication, saltbridge, halogenbond, waterbridge, metal_complex)

However, I cannot check if something is an instance of any of these as the classes are redeclared on each call (i.e. the class of one Interaction of one call of plip.structure.preparation.PDBComplex.analyze will be different from another, but will be identical within). There are three options as a result: (a) override the builtin isinstance to check by name (a cosmic no-no) (b) rummage through the garbage collector (a big no-no), (c) subclass the namedtuple-generated class and give it a custom metaclass with a custom __instancecheck__ dunder method that compares the __name__ not the memory address (so faffy), or (d) don't touch isinstance and have a function that does the check, without summoning Cthulhu in the process:

def isintxn(instance, cls_or_name: Union[type, str]):
    """
    Is ``interaction`` an instance of a class with the same name as ``cls_or_name``?
    """
    if isinstance(cls_or_name, str):
        name: str = cls_or_name
    else:
        name: str = cls_or_name.__name__
    return instance.__class__.__name__ == name

Is it dist or distance or distance_l?

Each interaction seems to have slightly different fields. Here are the fields:

import pandas as pd

key_tally = pd.DataFrame([{'name': cls.__name__, **{f: True for f in cls._fields}} for cls in itxn_names]).fillna(False).set_index('name')
prefered_order = key_tally.sum().sort_values(ascending=False).index.to_list()
key_tally = key_tally[prefered_order]
key_tally.transpose()
name hydroph_interaction hbond pistack pication saltbridge halogenbond waterbridge metal_complex
restype True True True True True True True True
resnr True True True True True True True True
reschain True True True True True True True True
restype_l True True True True True True True True
resnr_l True True True True True True True True
reschain_l True True True True True True True True
distance True False True True True True False True
type False True True True False False True False
h False True False False False False True False
offset False False True True False False False False
dtype False True False False False False True False
atype False True False False False False True False
sidechain False True False False False True False False
protisdon False True False False False False True False
angle False True True False False False False False
d_orig_idx False True False False False False True False
d False True False False False False True False
a_orig_idx False True False False False False True False
a False True False False False False True False
distance_aw False False False False False False True False
distance_dw False False False False False False True False
metal_type False False False False False False False True
d_angle False False False False False False True False
water_orig_idx False False False False False False True False
w_angle False False False False False False True False
metal False False False False False False False True
water False False False False False False True False
metal_orig_idx False False False False False False False True
bsatom True False False False False False False False
target False False False False False False False True
target_orig_idx False False False False False False False True
target_type False False False False False False False True
coordination_num False False False False False False False True
donortype False False False False False True False False
location False False False False False False False True
rms False False False False False False False True
geometry False False False False False False False True
num_partners False False False False False False False True
acctype False False False False False True False False
protcharged False False False True False False False False
acc_angle False False False False False True False False
charge False False False True False False False False
ligatom True False False False False False False False
ligatom_orig_idx True False False False False False False False
distance_ah False True False False False False False False
distance_ad False True False False False False False False
proteinring False False True False False False False False
ligandring False False True False False False False False
ring False False False True False False False False
bsatom_orig_idx True False False False False False False False
don_angle False False False False False True False False
positive False False False False True False False False
negative False False False False True False False False
protispos False False False False True False False False
acc False False False False False True False False
acc_orig_idx False False False False False True False False
don False False False False False True False False
don_orig_idx False False False False False True False False
complexnum False False False False False False False True

Get atom name

Atom names are a problem throughout compchem (cf. past post) and bar for saving atom names in the molFileAlias atom property in a mol file they are easily lost. PLIP does not care for them, but they can be rescued (code from my PLIPspots repo):

from typing import Union
from openbabel.pybel import Atom, Residue
from openbabel.pybel import ob

def get_atomname(atom: Union[Atom, ob.OBAtom]) -> str:
    """
    Given an atom, return its name.
    """
    if isinstance(atom, Atom):
        res: ob.OBResidue = atom.residue.OBResidue
        obatom = atom.OBAtom
    elif isinstance(atom, ob.OBAtom):
        obatom: ob.OBAtom = atom
        res: ob.OBResidue = obatom.GetResidue()
    else:
        raise TypeError
    return res.GetAtomID(obatom)

def get_atom_by_atomname(residue: Union[ob.OBResidue, Residue], atomname: str) -> ob.OBAtom:
    """
    Get an atom by its name in a residue.
    """
    if isinstance(residue, Residue):
        residue = residue.OBResidue
    obatom: ob.OBAtom
    for obatom in ob.OBResidueAtomIter(residue):
        if residue.GetAtomID(obatom).strip() == atomname:
            return obatom
    else:
        raise ValueError(f'No atom with name {atomname} in residue {residue.GetName()}')

I am not going to into the rabbit hole of Atom vs. ob.OBAtom] as it's not a rabbit hole, but an eldritch gate directly to R'lyeh...

No comments:

Post a Comment