mdciao.contacts.ContactPair¶
-
class
mdciao.contacts.
ContactPair
(res_idxs_pair, ctc_trajs, time_trajs, top=None, trajs=None, atom_pair_trajs=None, fragment_idxs=None, fragment_names=None, fragment_colors=None, anchor_residue_idx=None, consensus_labels=None, consensus_fragnames=None)¶ Container for a contact between two residues
This is the first level of abstraction of mdciao. It is the “closest” to the actual data, and its methods carry out most of the low-level operations on the data, e.g., the frequency calculations or the basic plotting. Other classes like
ContactGroup
usually just wrap around a collection of ContactPair objects and use their methods.
This class just needs the pair of residue (serial) indices, the time-traces of the distances between the residues (for all input trajectories), and the time-traces of the timestamps in those trajectories.
Many other pieces of complementary information can be provided as optional parameters, allowing the class to produce better plots, labels, and tables.
Some sanity checks are carried out upon instantiation to ensure things like the same number of steps in the distance and timestamp time-traces.
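The kind of check described above can be pictured with a small plain-Python sketch. It is not mdciao's actual implementation: the helper name check_traces is hypothetical and only illustrates that every distance trace needs a timestamp trace of the same length.

```python
def check_traces(ctc_trajs, time_trajs):
    """Hypothetical helper: raise if distance and time traces do not match."""
    if len(ctc_trajs) != len(time_trajs):
        raise ValueError("Different number of distance and time trajectories")
    for ii, (dists, times) in enumerate(zip(ctc_trajs, time_trajs)):
        if len(dists) != len(times):
            raise ValueError(
                f"Trajectory {ii}: {len(dists)} distances vs {len(times)} timestamps")
    return True


# Trajectories may differ in length, as long as each distance trace (in nm)
# has a time trace (in ps) with the same number of frames
ctc_trajs = [[0.30, 0.35, 0.50], [0.42, 0.28]]
time_trajs = [[0.0, 100.0, 200.0], [0.0, 100.0]]
ok = check_traces(ctc_trajs, time_trajs)
```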
Note
Higher-level methods in the API, like those exposed by
mdciao.cli
will return ContactPair or ContactGroup objects already instantiated and ready to use. It is recommended to use those instead of instantiating ContactPair or ContactGroup individually.
-
__init__
(res_idxs_pair, ctc_trajs, time_trajs, top=None, trajs=None, atom_pair_trajs=None, fragment_idxs=None, fragment_names=None, fragment_colors=None, anchor_residue_idx=None, consensus_labels=None, consensus_fragnames=None)¶
- Parameters
res_idxs_pair (iterable of two ints) – pair of residue indices, corresponding to the zero-indexed, serial number of the residues
ctc_trajs (list of iterables of floats) – time traces of the contact in nm. len(ctc_trajs) is N_trajs. Each traj can have a different length. Will be cast into arrays.
time_trajs (list of iterables of floats) – time traces of the time-values, in ps. Not having the same shape as ctc_trajs will raise an error
top (mdtraj.Topology, default is None) – topology associated with the contact
trajs (list of mdtraj.Trajectory objects, default is None) – The molecular trajectories for which the contact has been evaluated. Not having the same shape as ctc_trajs will raise an error
atom_pair_trajs (list of iterables of integers, default is None) – Time traces of the pair of atom indices responsible for the distance in ctc_trajs. Has to be of len(ctc_trajs) and each iterable of shape (Nframes, 2)
fragment_idxs (iterable of two ints, default is None) – Indices of the fragments of the residues of res_idxs_pair
fragment_names (iterable of two strings, default is None) – Names of the fragments of the residues of res_idxs_pair
fragment_colors (iterable of len 2, default is None) – Colors associated with the fragments of the residues of res_idxs_pair. A color is anything that matplotlib.colors recognizes
anchor_residue_idx (int, default is None) –
Label this residue as the anchor of the contact, i.e. the residue that’s shared across a number of contacts. Has to be in res_idxs_pair.
Note
- Using this argument will automatically populate other properties, like (this is not a complete list):
anchor_index will contain the [0,1] index of the anchor residue in res_idxs_pair
partner_index will contain the [0,1] index of the partner residue in res_idxs_pair
partner_residue_index will contain the other index of res_idxs_pair
and other properties which depend on having defined an anchor and a partner
- Furthermore, if a topology is passed as an argument:
anchor_residue_name will contain the anchor residue as an mdtraj.core.Topology.Residue object
partner_residue_name will contain the partner residue as an mdtraj.core.Topology.Residue object
consensus_labels (iterable of strings, default is None) – Consensus nomenclature of the residues of res_idxs_pair
consensus_fragnames (iterable of strings, default is None) – Consensus fragment names of the residues of res_idxs_pair
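How the anchor-related properties in the note on anchor_residue_idx relate to each other can be sketched in a few lines. This is an illustration only: the function anchor_and_partner is hypothetical and is not part of mdciao.

```python
def anchor_and_partner(res_idxs_pair, anchor_residue_idx):
    """Hypothetical helper mirroring the properties listed in the note."""
    pair = list(res_idxs_pair)
    if anchor_residue_idx not in pair:
        raise ValueError(f"{anchor_residue_idx} is not in {pair}")
    anchor_index = pair.index(anchor_residue_idx)  # position 0 or 1 in the pair
    partner_index = 1 - anchor_index               # the other position
    return {"anchor_index": anchor_index,
            "partner_index": partner_index,
            "partner_residue_index": pair[partner_index]}


# Residue 125 is the anchor, so residue 30 is its partner
props = anchor_and_partner([30, 125], anchor_residue_idx=125)
```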
Methods
__init__
(res_idxs_pair, ctc_trajs, time_trajs) Create a ContactPair; res_idxs_pair is the pair of residue indices, corresponding to the zero-indexed, serial number of the residues
binarize_trajs
(ctc_cutoff_Ang[, switch_off_Ang]) Turn each distance-trajectory into a boolean using a cutoff.
copy
() Copy this object by re-instantiating another ContactPair object with the same attributes.
count_formed_atom_pairs
(ctc_cutoff_Ang[, sort]) Count how many times each atom-pair is considered in contact in the trajectories
distro_overall_trajs
([bins]) Wrapper around numpy.histogram to produce a distribution of the distance values (not the contact frequencies) of this contact over all trajectories
frequency_dict
(ctc_cutoff_Ang[, …]) Returns the frequency_overall_trajs as a more informative dictionary with keys “freq”, “residue idxs”, “label”
frequency_overall_trajs
(ctc_cutoff_Ang[, …]) How often this contact is formed over all frames.
frequency_per_traj
(ctc_cutoff_Ang[, …]) Contact frequencies for each trajectory
gen_label
([AA_format, fragments, delete_anchor]) Generate a label with different parameters
label_flex
([AA_format, split_label, defrag]) A more flexible method to produce the label of this ContactPair
partial_counts_formed_atom_pairs
(ctc_cutoff_Ang) Count how many times each atom-pair is considered in contact in the trajectories
relative_frequency_of_formed_atom_pairs_overall_trajs
(ctc_cutoff_Ang[, …]) For those frames in which the contact is formed, group them by relative frequencies of individual atom pairs
retop
(top, mapping[, deepcopy]) Return a copy of this object with a different topology.
save
(filename) Save this ContactPair as a pickle
Attributes
fragments
label
labels
n
neighborhood
residues
time_max
Returns the maximum time (int or float) from the list of lists of times
time_min
Returns the minimum time (int or float) from the list of lists of times
time_traces
Contains time-traces stored as a _TimeTraces object
-
binarize_trajs
(ctc_cutoff_Ang, switch_off_Ang=None)¶ Turn each distance-trajectory into a boolean using a cutoff. The comparison is done using “<=”, s.t. d=ctc_cutoff yields True
Whereas ctc_cutoff_Ang is in Angstrom, the trajectories are in nm, as produced by mdtraj.compute_contacts
Note
The method creates a dictionary in self._binarized_trajs keyed with the ctc_cutoff_Ang, to avoid re-computing already binarized trajs
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
- Returns
bintrajs
- Return type
list of boolean arrays with the same shape as the trajectories
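A minimal sketch of the behaviour described for binarize_trajs, written in plain Python rather than with mdciao's own code. The function name binarize and the explicit cache argument are assumptions made for illustration; the unit conversion (1 nm = 10 Angstrom), the “<=” comparison and the cutoff-keyed cache follow the description above.

```python
def binarize(ctc_trajs_nm, ctc_cutoff_Ang, cache=None):
    """Sketch: True where the distance is <= the cutoff.

    The traces are in nm, the cutoff is in Angstrom.
    """
    cache = {} if cache is None else cache
    if ctc_cutoff_Ang not in cache:
        # Convert nm to Angstrom before comparing; "<=" means that
        # d == cutoff counts as a formed contact
        cache[ctc_cutoff_Ang] = [[d_nm * 10 <= ctc_cutoff_Ang for d_nm in traj]
                                 for traj in ctc_trajs_nm]
    return cache[ctc_cutoff_Ang]


ctc_trajs = [[0.25, 0.5, 0.75], [0.625, 0.375]]  # distances in nm
cache = {}                                       # keyed by cutoff, as in the note
bintrajs = binarize(ctc_trajs, 5.0, cache)
bintrajs_again = binarize(ctc_trajs, 5.0, cache)  # served from the cache
```

The second call does not recompute anything, which is the purpose of keying the stored result by the cutoff value.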
-
copy
()¶ Copy this object by re-instantiating another ContactPair object with the same attributes. In theory self == self.copy() should hold (but not self is self.copy())
- Returns
CP
- Return type
ContactPair
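The difference between equality and identity mentioned for copy can be shown with a stand-in class. TinyPair is hypothetical and far simpler than ContactPair; it only illustrates what "equal but not the same object" means after re-instantiating with the same attributes.

```python
from dataclasses import dataclass, field


@dataclass
class TinyPair:
    """Hypothetical stand-in for ContactPair, for illustration only."""
    res_idxs_pair: list
    ctc_trajs: list = field(default_factory=list)

    def copy(self):
        # Re-instantiate with copies of the same attributes
        return TinyPair(list(self.res_idxs_pair),
                        [list(traj) for traj in self.ctc_trajs])


original = TinyPair([30, 125], [[0.25, 0.5]])
duplicate = original.copy()
same_content = duplicate == original   # dataclasses compare field by field
same_object = duplicate is original
```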
-
count_formed_atom_pairs
(ctc_cutoff_Ang, sort=True)¶ Count how many times each atom-pair is considered in contact in the trajectories
Ideally we would return a dictionary but atom pairs are not hashable
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
sort (boolean, default is True) – Return the counts in descending order
- Returns
atom_pairs (list of atom pairs)
counts (list of ints)
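The counting described for count_formed_atom_pairs can be sketched as follows. This is an assumption-laden illustration, not mdciao's code: the function name count_pairs is hypothetical, the input is a list of already binarized trajectories plus the matching atom-pair traces, and the pairs are turned into tuples internally so they can be counted before being returned as two parallel lists.

```python
from collections import Counter


def count_pairs(bintrajs, atom_pair_trajs, sort=True):
    """Sketch: count atom pairs only in frames where the contact is formed."""
    counter = Counter()
    for formed, pairs in zip(bintrajs, atom_pair_trajs):
        for is_formed, pair in zip(formed, pairs):
            if is_formed:
                counter[tuple(pair)] += 1  # tuples are hashable, lists are not
    items = counter.most_common() if sort else list(counter.items())
    atom_pairs = [list(pair) for pair, _ in items]
    counts = [count for _, count in items]
    return atom_pairs, counts


bintrajs = [[True, True, False], [True, True]]
atom_pair_trajs = [[[10, 200], [10, 201], [11, 200]],
                   [[10, 201], [10, 201]]]
atom_pairs, counts = count_pairs(bintrajs, atom_pair_trajs)
```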
-
distro_overall_trajs
(bins=10)¶ Wrapper around numpy.histogram to produce a distribution of the distance values (not the contact frequencies) of this contact over all trajectories
- Parameters
bins (int or anything numpy.histogram accepts)
- Returns
h (_np.ndarray) – The counts (integer valued)
x (_np.ndarray) – The bin edges (length(h)+1).
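As a rough illustration of what such a wrapper does, the distances of all trajectories can be pooled and passed to numpy.histogram directly. This sketch assumes the pooling is a plain concatenation; the variable names are illustrative and the distances are made up.

```python
import numpy as np

# Distance traces (nm) of two trajectories with different lengths
ctc_trajs = [np.array([0.25, 0.5, 0.75]), np.array([0.625, 0.375])]

# Pool all frames, then histogram the distance values
h, x = np.histogram(np.hstack(ctc_trajs), bins=5)
```

Here h holds one count per bin and x holds the bin edges, so x has one more element than h.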
-
frequency_dict
(ctc_cutoff_Ang, switch_off_Ang=None, AA_format='short', split_label=True, atom_types=False, defrag=None)¶ Returns the frequency_overall_trajs as a more informative dictionary with keys “freq”, “residue idxs”, “label”
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
AA_format (str, default is "short") – Amino-acid format for the value fdict[“label”]: “short” (“E35”), “long” (“GLU35”), or “just_consensus”
split_label (bool, default is True) – Split the labels so that stacked contact labels become easier to read in plain ASCII formats
atom_types (bool, default is False) – Include the relative frequency of atom-type-pairs involved in the contact
defrag (string, default is None) – The character to use for deleting (defragmenting) the fragment info, e.g. “@” for turning “R30@3.51” into “R30”
- Returns
fdict
- Return type
dictionary
-
frequency_overall_trajs
(ctc_cutoff_Ang, switch_off_Ang=None)¶ How often this contact is formed over all frames. Frequencies have values between 0 and 1
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
- Returns
freq – Frequency of the contact over all trajectories
- Return type
float
-
frequency_per_traj
(ctc_cutoff_Ang, switch_off_Ang=None)¶ Contact frequencies for each trajectory
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
- Returns
freqs
- Return type
array of len self.n.n_trajs with floats between [0,1]
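The two frequency methods differ in how frames are pooled. The sketch below assumes, as “over all frames” suggests, that the overall frequency weights every frame equally, so it is not simply the mean of the per-trajectory values. The function names per_traj_freqs and overall_freq are illustrative and operate on already binarized trajectories.

```python
def per_traj_freqs(bintrajs):
    """Fraction of formed frames, separately for each trajectory."""
    return [sum(traj) / len(traj) for traj in bintrajs]


def overall_freq(bintrajs):
    """Fraction of formed frames over the pooled frames of all trajectories."""
    n_formed = sum(sum(traj) for traj in bintrajs)
    n_frames = sum(len(traj) for traj in bintrajs)
    return n_formed / n_frames


bintrajs = [[True, True, False, False], [True]]
per_traj = per_traj_freqs(bintrajs)
overall = overall_freq(bintrajs)
```

With these inputs the per-trajectory values are 0.5 and 1.0, while the pooled value is 3 formed frames out of 5.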
-
gen_label
(AA_format='short', fragments=False, delete_anchor=False)¶ Generate a label with different parameters
- Parameters
AA_format (str, default is "short") – Alternative is “long” (“E30” vs “GLU30”)
fragments (bool, default is False) – Include fragment information. Will get the “best” information available, i.e. consensus > fragname > fragindex
delete_anchor (bool, default is False) – Delete the anchor residue from the label
-
label_flex
(AA_format='short', split_label=True, defrag=None)¶ A more flexible method to produce the label of this ContactPair
- Parameters
AA_format (str, default is "short") – Amino-acid format for the label, can be “short” (A35@4.50), “long” (ALA35@4.50), or “just_consensus” (4.50)
split_label (bool, default is True) – Split the labels so that stacked contact labels become easier to read in plain ASCII formats
defrag (char, default is None) – Character to use when defragging the contact label, e.g. “@”. Default is to leave the labels as they are
- Returns
label
- Return type
str
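What "defragging" a label means can be shown on a single residue label. The helper defrag_label is hypothetical and only reproduces the documented example of turning “R30@3.51” into “R30”; how mdciao treats full contact labels is not covered here.

```python
def defrag_label(label, defrag=None):
    """Hypothetical helper: drop the fragment info after the defrag character."""
    if defrag is None:
        return label  # default: leave the label as it is
    return label.split(defrag)[0]


with_fragment = defrag_label("R30@3.51")
without_fragment = defrag_label("R30@3.51", defrag="@")
```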
-
partial_counts_formed_atom_pairs
(ctc_cutoff_Ang, switch_off_Ang=None, sort=True)¶ Count how many times each atom-pair is considered in contact in the trajectories
Since the switch_off_Ang parameter introduces partial counts, the return value need not be integer counts.
Ideally we would return a dictionary but atom pairs are not hashable
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
sort (boolean, default is True) – Return the counts in descending order
- Returns
atom_pairs (list of atom pairs)
counts (list of ints)
-
relative_frequency_of_formed_atom_pairs_overall_trajs
(ctc_cutoff_Ang, switch_off_Ang=None, keep_resname=False, aggregate_by_atomtype=True, min_freq=0.05)¶ For those frames in which the contact is formed, group them by relative frequencies of individual atom pairs
- Parameters
ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”
keep_resname (bool, default is False) – Keep the atom’s residue name in its descriptor. Only makes sense if aggregate_by_atomtype is False
aggregate_by_atomtype (bool, default is True) – Aggregate the frequencies of the contact by the atom types involved. Atom types are backbone, sidechain or other (BB, SC, X)
min_freq (float, default is .05) – Do not report relative frequencies below this cutoff, e.g. “BB-BB”:.9, “BB-SC”:0.03, “SC-SC”:0.03, “SC-BB”:0.03 gets reported as “BB-BB”:.9
- Returns
out_dict – Relative freqs, keyed by the atom-types (or atoms) involved in the contact. The order is the same as in self.ctc_labels
- Return type
dictionary
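The effect of min_freq can be sketched on a list of atom-type pairs, one entry per frame in which the contact is formed. The function relative_frequencies and its input format are assumptions for illustration; only the filtering rule (drop entries below the cutoff) comes from the description above.

```python
from collections import Counter


def relative_frequencies(atom_type_pairs, min_freq=0.05):
    """Sketch: relative frequency of each atom-type pair, filtered by min_freq."""
    counts = Counter(atom_type_pairs)
    n_formed = sum(counts.values())
    # most_common() yields the pairs by descending count
    freqs = {key: count / n_formed for key, count in counts.most_common()}
    return {key: freq for key, freq in freqs.items() if freq >= min_freq}


# 100 formed frames: 90 backbone-backbone, the rest spread over other types
formed = ["BB-BB"] * 90 + ["BB-SC"] * 4 + ["SC-SC"] * 3 + ["SC-BB"] * 3
out_dict = relative_frequencies(formed)
```

Only the dominant “BB-BB” entry survives the default cutoff of 0.05 in this example.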
-
retop
(top, mapping, deepcopy=False, **CP_kwargs)¶ Return a copy of this object with a different topology.
Uses the mapping to generate new residue- and atom-indices where necessary, using the rest of the object’s attributes (time-traces, labels, colors, fragments…) as they were.
Note
- This method will (rightly) fail if:
the mapping doesn’t contain the needed residues
the individual atoms of those residues cannot be uniquely mapped between topologies
- Parameters
top (Topology) – The new topology
mapping (indexable (array, dict, list)) – A mapping of old residue indices to new residue indices. Usually, comes from aligning the old and the new topology using mdciao.utils.sequence.maptops. These maps only contain (key,value) pairs whenever there’s been a “match”, s.t. this method will fail if mapping doesn’t contain all the residues in this ContactPair.
deepcopy (bool, default is False) – Use copy.deepcopy on the attributes when creating the new ContactPair. If False, the identity holds:
>>> self.residues.consensus_labels is CP.residues.consensus_labels
If True, only the equality holds:
>>> self.residues.consensus_labels == CP.residues.consensus_labels
Note that time_traces are always created new no matter what.
CP_kwargs – Optional keyword arguments to instantiate the new ContactPair. Any key-value pairs passed here will update the internal dictionary being used.
- Returns
CP – A new CP with updated top and indices
- Return type
ContactPair
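Two ideas from retop can be sketched without mdciao: translating residue indices through a mapping that may be incomplete, and the identity-versus-equality behaviour of the deepcopy option. The helper remap_pair is hypothetical, the mapping is assumed to be a dict, and the index values are made up.

```python
import copy


def remap_pair(res_idxs_pair, mapping):
    """Hypothetical helper: translate old residue indices into new ones."""
    missing = [idx for idx in res_idxs_pair if idx not in mapping]
    if missing:
        # Mirrors the documented failure for mappings lacking needed residues
        raise KeyError(f"mapping lacks residues {missing}")
    return [mapping[idx] for idx in res_idxs_pair]


mapping = {30: 28, 125: 123}  # old index -> new index
new_pair = remap_pair([30, 125], mapping)

consensus_labels = ["3.50", "6.30"]
reused = consensus_labels                 # like deepcopy=False: same object
copied = copy.deepcopy(consensus_labels)  # like deepcopy=True: equal, but new
```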
-
save
(filename)¶ Save this ContactPair as a pickle
- Parameters
filename (str) – filename
-
property
time_max
¶ Returns the maximum time (int or float) from the list of lists of times
-
property
time_min
¶ Returns the minimum time (int or float) from the list of lists of times
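For time_max and time_min, the extremes are taken over the time values of all trajectories. A plain-Python sketch with made-up timestamps in ps, not mdciao's implementation:

```python
# One list of timestamps (ps) per trajectory
time_trajs = [[0.0, 100.0, 200.0], [50.0, 150.0]]

# Extremes over the list of lists of times
time_max = max(max(times) for times in time_trajs)
time_min = min(min(times) for times in time_trajs)
```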
-
property
time_traces
¶ Contains time-traces stored as a _TimeTraces object
-
property
top
¶
-
property
topology
¶
-