Overview

Basics

The pyxmolpp2 library implements a Frame/Molecule/Residue/Atom hierarchy to represent a molecular frame.

Unlike in many other molecular libraries, Atom, Residue, and Molecule instances do not exist in isolation; they are always part of a Frame. Every Atom, Residue, or Molecule is therefore guaranteed to have a parent. This makes the expression atom.residue.molecule.frame.index always valid and eliminates "is not None" checks from user and library code.

To make groups of elements easier to manipulate, pyxmolpp2 provides a number of selection classes: CoordSelection, AtomSelection, ResidueSelection and MoleculeSelection. Selections can be converted into one another and support generic set operations (union, intersection, difference), slicing, iteration, and a number of other handy methods.

Selections are ordered sets of elements; the order of elements in a selection matches their order in the parent frame. Note that mixing elements from two different frames raises an exception.
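
The ordering and same-frame rules can be illustrated with a small pure-Python sketch. It does not use pyxmolpp2 and is not how the library is implemented: ToySelection, its frame_id field, the operator spelling (|, &, -) and the ValueError are all assumptions made for illustration.

```python
class ToySelection:
    """Toy ordered set of element indices that belong to one parent frame."""

    def __init__(self, frame_id, indices):
        self.frame_id = frame_id
        # Deduplicate and sort so the order always matches the parent frame.
        self.indices = sorted(set(indices))

    def _check_same_frame(self, other):
        # Hypothetical error type; the real library raises its own exception.
        if self.frame_id != other.frame_id:
            raise ValueError("cannot mix elements from two different frames")

    def __or__(self, other):  # union
        self._check_same_frame(other)
        return ToySelection(self.frame_id, set(self.indices) | set(other.indices))

    def __and__(self, other):  # intersection
        self._check_same_frame(other)
        return ToySelection(self.frame_id, set(self.indices) & set(other.indices))

    def __sub__(self, other):  # difference
        self._check_same_frame(other)
        return ToySelection(self.frame_id, set(self.indices) - set(other.indices))


a = ToySelection("frame-1", [5, 1, 3])
b = ToySelection("frame-1", [3, 4])
print((a | b).indices)  # [1, 3, 4, 5]
print((a & b).indices)  # [3]
print((a - b).indices)  # [1, 5]

try:
    a | ToySelection("frame-2", [0])
    mixing_failed = False
except ValueError:
    mixing_failed = True
print(mixing_failed)  # True
```

Whatever order the elements are supplied in, the result is reported in frame order, which is the property the paragraph above describes.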

Predicates

The library provides predicate generators to simplify selection filtering. When compared with a value, a generator produces a predicate, and predicates can be combined to produce new ones. Parent predicates can be applied naturally to selections of child elements; for example, a ResiduePredicate can be used to filter an AtomSelection.

from pyxmolpp2 import aName, rId, rName, aId, mName

# `frame` is assumed to be a Frame that has already been loaded
for predicate in [
    aName == "CA",
    aId == 3,
    # combine AtomPredicate and ResiduePredicate:
    aName.is_in({"N", "CA", "C", "O"}) & rId.is_in({1, 2, 3}),
    rName == "GLY",
]:
    asel = frame.atoms.filter(predicate)
    print(f"Selected {asel.size:2d} atoms from "
          f"{asel.residues.size:2d} residues"
          f" by {type(predicate).__name__}")
Selected 76 atoms from 76 residues by AtomPredicate
Selected  1 atoms from  1 residues by AtomPredicate
Selected 12 atoms from  3 residues by AtomPredicate
Selected 25 atoms from  6 residues by ResiduePredicate
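
To show how a comparison can produce a predicate rather than a boolean, here is a minimal pure-Python sketch of the idea. It is an illustration only, not the library's implementation: ToyGenerator, ToyPredicate, the toy_* names and the dictionary-based atoms are all hypothetical.

```python
class ToyPredicate:
    """Wraps a function item -> bool and supports & and | combination."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, item):
        return self.fn(item)

    def __and__(self, other):
        return ToyPredicate(lambda item: self(item) and other(item))

    def __or__(self, other):
        return ToyPredicate(lambda item: self(item) or other(item))


class ToyGenerator:
    """Comparing a generator with a value yields a ToyPredicate."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, value):
        return ToyPredicate(lambda item: item[self.key] == value)

    def is_in(self, values):
        values = frozenset(values)
        return ToyPredicate(lambda item: item[self.key] in values)


# Toy atoms carry their residue fields, so "parent" predicates apply to them too.
atoms = [
    {"name": "N", "res_id": 1, "res_name": "MET"},
    {"name": "CA", "res_id": 1, "res_name": "MET"},
    {"name": "CB", "res_id": 1, "res_name": "MET"},
    {"name": "N", "res_id": 2, "res_name": "GLY"},
    {"name": "CA", "res_id": 2, "res_name": "GLY"},
]

toy_aName = ToyGenerator("name")
toy_rId = ToyGenerator("res_id")
toy_rName = ToyGenerator("res_name")

backbone_of_first = toy_aName.is_in({"N", "CA"}) & (toy_rId == 1)
selected = [a["name"] for a in atoms if backbone_of_first(a)]
gly_count = sum(1 for a in atoms if (toy_rName == "GLY")(a))
print(selected)   # ['N', 'CA']
print(gly_count)  # 2
```

Note the parentheses around toy_rId == 1: in Python, & binds more tightly than ==, so a comparison combined with & or | must be parenthesized.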

Span vs Selection

There are two slightly different types of "selections" in the library. A Span is a contiguous range of elements, while a Selection is an arbitrary set of elements. The two forms are functionally almost the same, so most of the time you won't notice any difference. For more details, check the API reference.

On this page I make no distinction between the two.

print(frame.atoms[10:20])   # AtomSpan
print(frame.atoms[10:20:2]) # AtomSelection
AtomSpan<size=10, atoms=[A.GLN-2.C, A.GLN-2.O, ... , A.ILE-3.C]>
AtomSelection<size=5, atoms=[A.GLN-2.C, A.GLN-2.CB, ... , A.ILE-3.CA]>
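
The difference can be pictured with a toy model in plain Python: a contiguous range needs only its two end points, while an arbitrary set needs an explicit list of indices. The class names and the "step of 1 gives a span" rule below are assumptions inferred from the output above, not the library's actual logic.

```python
class ToyIndexSpan:
    """Contiguous range of indices, stored as a (start, stop) pair."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop

    def __len__(self):
        return self.stop - self.start

    def __iter__(self):
        return iter(range(self.start, self.stop))


class ToyIndexSelection:
    """Arbitrary set of indices, stored explicitly."""

    def __init__(self, indices):
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __iter__(self):
        return iter(self.indices)


def take(n_elements, key):
    """Resolve a slice against n_elements (toy rule: step 1 stays a span)."""
    start, stop, step = key.indices(n_elements)
    if step == 1:
        return ToyIndexSpan(start, max(start, stop))
    return ToyIndexSelection(range(start, stop, step))


span = take(100, slice(10, 20))
sel = take(100, slice(10, 20, 2))
print(type(span).__name__, len(span))  # ToyIndexSpan 10
print(type(sel).__name__, list(sel))   # ToyIndexSelection [10, 12, 14, 16, 18]
```

Both toy classes expose the same len() and iteration interface, which is why calling code rarely needs to care which one it received.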

Trajectory

A Trajectory represents the evolution of a Frame in time. It needs a reference topology, provided by an initial Frame, and a number of input coordinate files.

Let's construct our trajectory from trjtool .dat files (TrjtoolDatFile):

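from pyxmolpp2 import Trajectory, TrjtoolDatFile

# `frame` supplies the reference topology; `path_to_traj` is the directory with the .dat files
traj = Trajectory(frame)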
for i in range(1, 3):
    traj.extend(TrjtoolDatFile(f"{path_to_traj}/run{i:05d}.dat"))
print(traj.size)
2000

Trajectory supports index access and slices:

frame_10 = traj[10]  # returns a copy of the frame
print(len(traj[:100]))
100

When iterating over a trajectory (or a slice of it), a copy of the frame is created at the beginning, and that same copy is updated on every step.

for f in traj[::250]:
    print(f"{f.index:4d}", f.coords.mean())
   0 [8.422286, 0.967190, -13.856332]
 250 [8.376922, 3.669990, -14.561570]
 500 [8.682104, 0.146340, -14.947389]
 750 [6.279336, -0.708019, -14.059577]
1000 [4.111807, -3.405752, -12.400884]
1250 [5.216313, 0.451141, -11.900974]
1500 [5.455517, 0.589096, -12.383871]
1750 [3.487516, 2.592378, -11.418070]

A Trajectory does not support simultaneous iterations and keeps track of the iterators it has created. To iterate over the trajectory again, you must first release all references to the iteration variable from the previous run.

del f  # release trajectory iterator reference
for f in traj[::500]:
    print(f"{f.index:4d}", f.coords.mean())
   0 [8.422286, 0.967190, -13.856332]
 500 [8.682104, 0.146340, -14.947389]
1000 [4.111807, -3.405752, -12.400884]
1500 [5.455517, 0.589096, -12.383871]

If you forget to do so, an exception is raised:

for f in traj[::500]:
    print(f.index, f.coords.mean())
Traceback (most recent call last):
  File "<string>", line 1, in <module>
pyxmolpp2._core.TrajectoryDoubleTraverseError
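
One way such tracking could work is sketched below in plain Python, using a weak reference to the frame copy handed out by the previous traversal. This is a guess at the mechanism for illustration, not the library's implementation; ToyTrajectory, ToyFrame and ToyDoubleTraverseError are hypothetical names, and the immediate release after del relies on CPython reference counting.

```python
import weakref


class ToyDoubleTraverseError(RuntimeError):
    """Hypothetical stand-in for TrajectoryDoubleTraverseError."""


class ToyFrame:
    def __init__(self):
        self.index = -1


class ToyTrajectory:
    def __init__(self, size):
        self.size = size
        self._live = None  # weak reference to the frame copy of the last traversal

    def __iter__(self):
        # Refuse to start while the previous frame copy is still referenced.
        if self._live is not None and self._live() is not None:
            raise ToyDoubleTraverseError("previous frame copy is still referenced")
        frame = ToyFrame()  # one copy per traversal
        self._live = weakref.ref(frame)
        return self._walk(frame)

    def _walk(self, frame):
        for i in range(self.size):
            frame.index = i  # the same copy is updated on every step
            yield frame


traj = ToyTrajectory(3)
for f in traj:
    pass  # `f` still references the frame copy after the loop ends

try:
    iter(traj)
    second_run_failed = False
except ToyDoubleTraverseError:
    second_run_failed = True

del f  # release the reference, as described in the text above
indices = [g.index for g in traj]
print(second_run_failed, indices)  # True [0, 1, 2]
```

The list comprehension does not leak its loop variable, so no del is needed after it; a plain for loop does leak it, which is why the del f step matters.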

Pipe-processing

Common pre-processing operations are available in pyxmolpp2.pipe. For example, we often need all frames in a trajectory to be aligned by a subset of atoms. This is conveniently done by pre-processing the trajectory with pipe.Align:

from pyxmolpp2.pipe import Align

for f in traj[::500] | Align(by=aName=="CA"):
    print(f"{f.index:4d}", f.coords.mean())
   0 [8.422286, 0.967190, -13.856332]
 500 [8.485478, 0.972509, -13.941023]
1000 [8.460522, 1.033323, -13.832233]
1500 [8.423310, 1.009532, -13.917922]

Such "pipe" processors can be chained together, which makes this scheme very flexible.
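
The chaining idea can be sketched in plain Python with the reflected-or operator. The sketch below is not taken from pyxmolpp2: ToyProcessor, ToyPipeline, Shift and Scale are hypothetical, and whether the library uses the same mechanism is an assumption.

```python
class ToyPipeline:
    """Lazily applies one processor to every item of an upstream iterable."""

    def __init__(self, upstream, processor):
        self.upstream = upstream
        self.processor = processor

    def __iter__(self):
        for item in self.upstream:
            yield self.processor.apply(item)


class ToyProcessor:
    """Base class: `iterable | processor` builds a ToyPipeline."""

    def __ror__(self, upstream):
        return ToyPipeline(upstream, self)

    def apply(self, item):
        raise NotImplementedError


class Shift(ToyProcessor):
    def __init__(self, delta):
        self.delta = delta

    def apply(self, item):
        return item + self.delta


class Scale(ToyProcessor):
    def __init__(self, factor):
        self.factor = factor

    def apply(self, item):
        return item * self.factor


# Processors are applied left to right: shift first, then scale.
result = list([1.0, 2.0, 3.0] | Shift(-2.0) | Scale(10.0))
print(result)  # [-10.0, 0.0, 10.0]
```

Because each stage only wraps the previous one, nothing is computed until the chain is iterated, and adding another step is just one more | in the expression.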