-
Notifications
You must be signed in to change notification settings - Fork 667
Frequently (or at least sometimes) asked questions about MDAnalysis.
Feel free to add to this file.
Follow the Installation Quick Start. We recommend the conda installation.
If you want to run the tests and do the examples in the docs and in the tutorial then also install MDAnalysisTests, as described in the Quick Start.
Sometimes on the mailing list we tell you that a certain feature or fix is "in the development version". You can install the development version by following the instructions under DevelopmentBranch.
Based on user list: Add chain IDs to protein files.
For MDAnalysis 0.19.2:
Load the PDB file:
U = mda.Universe("protein.pdb")
Then create a new segment ('T' in this example) and assign the segment to the atoms of interest:
newseg = U.add_Segment(segid='T')
U.select_atoms('bynum 1201:2400').residues.segments = newseg
When you write out the atoms as a PDB file, MDAnalysis will set the ChainID to the segment identifier (or the first letter if you used more than one letter)
U.atoms.write("relabeled.pdb")
A fundamental concept in MDAnalysis is that at any one time, only one time frame of the trajectory is being accessed. Think of the trajectory of a tape and MDAnalysis has a read-head that sits at a specific frame. The data from this specific frame are currently available.
The Timestep data structure holds this information: u.trajectory.ts
. You also see the Timestep when you iterat
for ts in u.trajectory:
print(ts.frame, ts.time)
prints the frame index (starting with 0) and the time for all frames.
Many properties of atoms change with time, namely positions (and velocities and forces if they were recorded). Thus u.atoms.positions
changes with time: it corresponds to the quantity X(t) (all coordinates as a function of time, at time step t).
If you try to add or change coordinates or velocities by assigning ts.positions = new_x
then you will find that your changes are not persistent: when you read the same frame again, the coordinates are again the ones from the trajectory.
This is because of how MDAnalysis only ever loads a single frame of data into memory at any one time (see Why do the positions change?). Loading the next frame of data overwrites the old frame, so whenever you load a new timestep, any changes are discarded. In the case of you adding velocities however, because the trajectory has no velocities, these don't get discarded/overwritten, but instead the stale velocities from the final frame persist.
There are different approaches to solve this problem:
If your trajectory is small enough that you could fit it into memory you could use the MemoryReader. This stores the trajectory in memory, so any changes you make are persistent. Ie if you modify the coordinates (or velocities) then move between frames and return your changes will still be present. To do this:
# make sure that the Universe thinks that there are velocities
u.trajectory.ts.has_velocities = True
# this sucks all data into memory, so the trajectory file isn't read after this command
u.transfer_to_memory()
# you can then set all velocities at once, ie give it a [nsteps, natoms, 3] array
u.trajectory.velocity_array = something
Alternatively if your data is too large to fit into memory, you would have to save it to file then reload this data, so
with mda.Writer('with_vels.trr', n_atoms=len(u.atoms)) as w:
for its,ts in enumerate(u.trajectory):
ts.velocities = velocities[its,:,:]
w.write(ts)
u.load_new('with_vels.trr')
This has created a TRR file with the existing position data and your velocity data