pyrosetta.rosetta.protocols.hydrate
).
Hydrate
is supported by various functions
and can be run in a few different ways.
pyrosetta.rosetta.protocols.hydrate.Hydrate().apply(pose)
The arguments of the class are either nothing, a ScoreFunction or (ScoreFunction
and 'protein_flexibility', a string with value either 'all', 'none',
'near_water' or 'resfile') —more below.
Hyfile loading
The hyfile is read via the read_hyfile
, wherein it fills two
boolean vectors.
enforced_V = pyrosetta.rosetta.utility.vector1_bool(pose.total_residue())
hydrate_V = pyrosetta.rosetta.utility.vector1_bool(pose.total_residue())
pyrosetta.rosetta.protocols.hydrate.read_hyfile(hyfile='hydrate_test/hydrate.hyfile',
enforced_V=enforced_V,
hydrate_V=hydrate_V)
Now we can make pyrosetta add a water (anchored to the residue):
pyrosetta.rosetta.protocols.hydrate.hydrate_hyfile(pose=pose,
hydrate_V=hydrate_V,
resfile='hydrate_test/pack.resfile')
The boolean vectors are the same as residue selection vectors as one gets from
selector.apply(pose)
, so a very obvious alternative would have
been to provide an interface selection vector:
intersele = pyrosetta.rosetta.core.select.residue_selector.InterGroupInterfaceByVectorSelector(pyrosetta.rosetta.core.select.residue_selector.ChainSelector('A'),
pyrosetta.rosetta.core.select.residue_selector.ChainSelector('B'))
hydrate_V = intersele.apply(pose)
The hydration does not really require a resfile (does nothing) and setting it to "default" works fine.
In terms of water enforcement, the command
pyrosetta.rosetta.protocols.hydrate.enforce_all_waters(pose)
does
what you'd expect and one cab make a vector of pre-existing waters the normal
way:
water_selector = pyrosetta.rosetta.core.select.residue_selector.ResidueNameSelector()
water_selector.set_residue_name3('WAT')
wv = water_selector.apply(pose)
Do note that water residue is WAT
and while HOH
is
converted to WAT
during import, others are not, e.g.
OH2
, OD2
, DOD
, SOL
, TIP
,
TIP2
, TIP3
or TIP4
.
Water!
If we inspect the pose with the pymolmover or with nglview we see that a water was indeed added:
import nglview
view = nglview.show_rosetta(pose)
view.add_representation('hyperball', selection='[WAT]')
view
pyrosetta.rosetta.protocols.hydrate.Hydrate().apply(pose)
The waters are now in good positions, but the residues nearby were not repacked as no resfile was provided and no scorefunction given to Hydrate. Oddly this latter step is when stuff gets problematic and segmentation faults happen.
Cavities
pre = pose.total_residue()
pyrosetta.rosetta.protocols.hydrate.hydrate_cavities(pose)
post = pose.total_residue()
print(f'{post - pre} waters added.')
Stragglers
pre = pose.total_residue()
for i in range(pose.total_residue(),0, -1):
residue = pose.residue(i)
if residue.name3() == 'WAT' and round(residue.xyz(1)[0]) == i * 10_000:
# i=197, .xyz(1) =[1970000.000000001,1969999.999999999,1970000.000000004]
pose.delete_residue_slow(i)
post = pose.total_residue()
print(f'{post - pre} waters removed.')
Scorefunction
The Hydrate class can optionally accept a scorefunction and protein flexibility. The latter controls how the protein is repacked and is a string that can be "not" (default), "all", "near_water" or "resfile".
A regular full atom score function is fine —in fact, when omitted the code seems to use the generic system scorefunction. The scorefunction can also be an electron density weighted one (for more see past post about ED scoring).
For best results, one probably should use the SPaDES weights. To use the weights from the test folder one can do:
weight_filename = 'hydrate_test/spades.wts'
scorefxn = pyrosetta.get_score_function()
scorefxn.add_weights_from_file(weight_filename)
Alternatively, there is a hydrate scorefunction scorefxn = pyrosetta.create_score_function('hydrate_score12')
.
Using a Resfile
tests/integration/hydrate
folder of Rosetta there is a test example, with these files made and barstar-barnase (1BRS).123.resfile
is not meant for temp_2-retry_finalB.pdb
for some feat of genius logistics...packtask = pyrosetta.standard_packer_task(pose)
pyrosetta.rosetta.core.pack.task.parse_resfile(pose, packtask, 'hydrate_test/pack.resfile')
As expected it works. However, if we wanted to make our own resfile manually —as opposed to defining a packer for some obscure reason— there is the command:
pyrosetta.toolbox.generate_resfile.generate_resfile_from_pose(pose, 'myfile.resfile', pack=False, design=False, input_sc=True, freeze=[], specific={})
Where the dictionary specific, you can specify the residues to be repacked (NATAA
), kept untouched (NATRO
) or designed (PIKAA X
).
To repack we can calling it like:
pyrosetta.rosetta.protocols.hydrate.Hydrate(scorefxn, 'resfile').apply(pose)
Assing that the -packing:resfile command line option was passed either with pyrosetta.rosetta.basic.options.set_file_option('packing:resfile', 'xxxx')
or pyrosetta.init(extra_options='-packing:resfile hydrate_test/pack.resfile')
.
Spying
The packer task and movemap are generated behind the scenes, but we can spy on what a given command does by calling the same function:
packer_task = pyrosetta.rosetta.core.pack.task.TaskFactory.create_packer_task(pose)
movemap = pyrosetta.MoveMap()
pyrosetta.rosetta.protocols.hydrate.set_task_and_movemap(pose=pose,
protein_flexibility='all',
task=packer_task,
mm=movemap,
minimize_bb_where_packing=True)
This function fills the packer_task and movemap objects according to some specified logic set in minimise backbone and protein flexibility. Which can be printed for inspection.
Unfortunately, there is no way of setting a movemap and packertask.
PDBInfo
A minor annoyance is that the PDBInfo gets reset/rebuilt oddly under some combinations of commands (uncommon) —generally it is okay in pyrosetta but the indices are pose indices in the dumped PDB. But this can be easily circumvented by keeping a copy of it before the operations and reapplying it —don't bother at first, do it only if you hit said corner case. Although you may want to rename the waters to a given chain.pdb_info = pose.pdb_info()
water_selector = pyrosetta.rosetta.core.select.residue_selector.ResidueNameSelector()
water_selector.set_residue_name3('WAT')
wv = water_selector.apply(pose)
RV = pyrosetta.rosetta.core.select.residue_selector.ResidueVector
for i,j in enumerate(RV(wv)):
pdb_info.set_resinfo(j, 'X', i)
pose.pdb_info(pdb_info)
Other observations
-
Non-peptide ligands (without CA atom) cause problems —not sure what happens
with nucleic acid polymers— I think it is a packer dependent problem, but
deleting the residues before hydration fixes it swiftly
(
pyrosetta.rosetta.core.pose.remove_nonprotein_residues(pose)
- Deleting any residue after hydration will cause an error.
- NNCA are treated normally
- Water can dissociate —this, like the charged amino acids, is not accounted for. Neutron diffraction structures feature water (resn HOH), hydroxydes (resn OH) and hydroniums (H3O). These play a role in some coordinations. However, there exists no rule that defines that WAT can be any of the HOH, OH, H3O and this would be a can of worms so makes sense it is not addressed in the hydrate protocol or actually any hydration method.
- Water is polarisable —Drude oscillators are not a feature in Rosetta —yes, OpenMM does it, but really few tools do.
No comments:
Post a Comment