mdciao.contacts.ContactPair

class mdciao.contacts.ContactPair(res_idxs_pair, ctc_trajs, time_trajs, top=None, trajs=None, atom_pair_trajs=None, fragment_idxs=None, fragment_names=None, fragment_colors=None, anchor_residue_idx=None, consensus_labels=None, consensus_fragnames=None)

Container for a contact between two residues

This is the first level of abstraction of mdciao. It is the “closest” to the actual data, and its methods carry out most of the low-level operations on the data, e.g., the frequency calculations or the basic plotting. Other classes like ContactGroup usually just wrap around a collection of ContactPair-objects and use their methods.

This class just needs the pair of residue (serial) indices, the time-traces of the distances between the residues (for all input trajectories), and the time-traces of the timestamps in those trajectories.

Many other pieces of complementary information can be provided as optional parameters, allowing the class to produce better plots, labels, and tables.

Some sanity checks are carried out upon instantiation, e.g. to ensure that the distance and timestamp time-traces have the same number of steps.
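
The kind of check described above can be sketched in plain Python. This is only an illustration, not mdciao's own code: the helper name check_trajs and the sample values are hypothetical, and the real class may check more than this.

```python
def check_trajs(ctc_trajs, time_trajs):
    """Hypothetical helper: verify that distance and time traces match in shape."""
    if len(ctc_trajs) != len(time_trajs):
        raise ValueError("ctc_trajs and time_trajs differ in number of trajectories")
    for ii, (ctc, time) in enumerate(zip(ctc_trajs, time_trajs)):
        if len(ctc) != len(time):
            raise ValueError(
                f"trajectory {ii}: {len(ctc)} distances vs {len(time)} timestamps"
            )
    return True


# Two trajectories of different lengths (allowed); distances in nm, times in ps
ctc_trajs = [[0.30, 0.45, 0.50], [0.28, 0.33]]
time_trajs = [[0.0, 10.0, 20.0], [0.0, 10.0]]
ok = check_trajs(ctc_trajs, time_trajs)
```

Trajectories may differ in length from each other; only each distance trace and its own time trace have to agree.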

Note

Higher-level methods in the API, like those exposed by mdciao.cli, will return ContactPair or ContactGroup objects already instantiated and ready to use. It is recommended to use those instead of individually calling ContactPair or ContactGroup.

__init__(res_idxs_pair, ctc_trajs, time_trajs, top=None, trajs=None, atom_pair_trajs=None, fragment_idxs=None, fragment_names=None, fragment_colors=None, anchor_residue_idx=None, consensus_labels=None, consensus_fragnames=None)
Parameters
  • res_idxs_pair (iterable of two ints) – pair of residue indices, corresponding to the zero-indexed, serial number of the residues

  • ctc_trajs (list of iterables of floats) – time traces of the contact in nm. len(ctc_trajs) is N_trajs. Each traj can have a different length. Will be cast into arrays.

  • time_trajs (list of iterables of floats) – time traces of the time-values, in ps. Not having the same shape as ctc_trajs will raise an error

  • top (mdtraj.Topology, default is None) – topology associated with the contact

  • trajs (list of mdtraj.Trajectory objects, default is None) – The molecular trajectories for which the contact has been evaluated. Not having the same shape as ctc_trajs will raise an error

  • atom_pair_trajs (list of iterables of integers, default is None) – Time traces of the pair of atom indices responsible for the distance in ctc_trajs. Has to be of len(ctc_trajs), and each iterable has to be of shape (Nframes, 2)

  • fragment_idxs (iterable of two ints, default is None) – Indices of the fragments to which the residues of res_idxs_pair belong

  • fragment_names (iterable of two strings, default is None) – Names of the fragments to which the residues of res_idxs_pair belong

  • fragment_colors (iterable of len 2, default is None) – Colors associated to the fragments of the residues of res_idxs_pair. A color is anything that matplotlib.colors recognizes

  • anchor_residue_idx (int, default is None) –

    Label this residue as the anchor of the contact, i.e. the residue that’s shared across a number of contacts. Has to be in res_idxs_pair.

    Note

    Using this argument will automatically populate other properties, like (this is not a complete list)
    • anchor_index will contain the [0,1] index of the anchor residue in res_idxs_pair

    • partner_index will contain the [0,1] index of the partner residue in res_idxs_pair

    • partner_residue_index will contain the other index of res_idxs_pair

    and other properties which depend on having defined an anchor and a partner

    Furthermore, if a topology is passed as an argument:
    • anchor_residue_name will contain the anchor residue as an mdtraj.core.Topology.Residue object

    • partner_residue_name will contain the partner residue as an mdtraj.core.Topology.Residue object

  • consensus_labels (iterable of strings, default is None) – Consensus nomenclature of the residues of res_idxs_pair

  • consensus_fragnames (iterable of strings, default is None) – Consensus fragments names of the residues of res_idxs_pair
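
How anchor_residue_idx relates to the derived anchor and partner properties listed above can be sketched in plain Python. The variable names mirror the property names in the note, but the code and the residue indices are illustrative only, not mdciao's implementation.

```python
res_idxs_pair = (30, 131)
anchor_residue_idx = 131

# The anchor has to be one of the two residues of the pair
if anchor_residue_idx not in res_idxs_pair:
    raise ValueError("anchor_residue_idx has to be in res_idxs_pair")

# Position (0 or 1) of the anchor inside the pair
anchor_index = list(res_idxs_pair).index(anchor_residue_idx)
# The partner occupies the other position
partner_index = 1 - anchor_index
# Residue index of the partner
partner_residue_index = res_idxs_pair[partner_index]
```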

Methods

__init__(res_idxs_pair, ctc_trajs, time_trajs)

res_idxs_pair: pair of residue indices, corresponding to the zero-indexed, serial number of the residues

binarize_trajs(ctc_cutoff_Ang[, switch_off_Ang])

Turn each distance-trajectory into a boolean using a cutoff.

copy()

Copy this object by re-instantiating another ContactPair object with the same attributes.

count_formed_atom_pairs(ctc_cutoff_Ang[, sort])

Count how many times each atom-pair is considered in contact in the trajectories

distro_overall_trajs([bins])

Wrapper around numpy.histogram to produce a distribution of the distance values (not the contact frequencies) of this contact over all trajectories

frequency_dict(ctc_cutoff_Ang[, …])

Returns the frequency_overall_trajs as a more informative dictionary with keys “freq”, “residue idxs”, “label”

frequency_overall_trajs(ctc_cutoff_Ang[, …])

How many times this contact is formed over all frames.

frequency_per_traj(ctc_cutoff_Ang[, …])

Contact frequencies for each trajectory

gen_label([AA_format, fragments, delete_anchor])

Generate a label with different parameters

label_flex([AA_format, split_label, defrag])

A more flexible method to produce the label of this ContactPair

partial_counts_formed_atom_pairs(ctc_cutoff_Ang)

Count how many times each atom-pair is considered in contact in the trajectories

relative_frequency_of_formed_atom_pairs_overall_trajs(…)

For those frames in which the contact is formed, group them by relative frequencies of individual atom pairs

retop(top, mapping[, deepcopy])

Return a copy of this object with a different topology.

save(filename)

Save this ContactPair as a pickle

Attributes

fragments

label

labels

n

neighborhood

residues

time_max

Maximum time from the list of lists of times (int or float)

time_min

Minimum time from the list of lists of times (int or float)

time_traces

Contains time-traces stored as a _TimeTraces object

top

topology

binarize_trajs(ctc_cutoff_Ang, switch_off_Ang=None)

Turn each distance-trajectory into a boolean using a cutoff. The comparison is done using “<=”, s.t. d=ctc_cutoff yields True

Whereas ctc_cutoff_Ang is in Angstrom, the trajectories are in nm, as produced by mdtraj.compute_contacts

Note

The method creates a dictionary in self._binarized_trajs keyed with the ctc_cutoff_Ang, to avoid re-computing already binarized trajs

Parameters

ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

Returns

bintrajs

Return type

list of boolean arrays with the same shape as the trajectories
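
A minimal sketch of the binarization described above, in plain Python rather than mdciao's own code. The function name binarize and the sample distances are illustrative; the sketch assumes the Angstrom cutoff is converted to nm (division by 10) before the “<=” comparison.

```python
def binarize(ctc_trajs_nm, ctc_cutoff_Ang):
    """Illustrative stand-in: one list of booleans per distance trajectory."""
    cutoff_nm = ctc_cutoff_Ang / 10  # 1 nm = 10 Angstrom
    return [[d <= cutoff_nm for d in traj] for traj in ctc_trajs_nm]


# Distances in nm, as in ctc_trajs; cutoff in Angstrom
ctc_trajs = [[0.30, 0.40, 0.50], [0.28, 0.45]]
bintrajs = binarize(ctc_trajs, 4.0)
```

A distance exactly at the cutoff (0.40 nm for a 4.0 Angstrom cutoff) counts as formed, because the comparison is “<=”.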

copy()

Copy this object by re-instantiating another ContactPair object with the same attributes. In theory self == self.copy() should hold (but not self is self.copy())

Returns

CP

Return type

ContactPair

count_formed_atom_pairs(ctc_cutoff_Ang, sort=True)

Count how many times each atom-pair is considered in contact in the trajectories

Ideally we would return a dictionary, but atom pairs are not hashable

Parameters
  • ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

  • sort (boolean, default is True) – Return the counts in descending order

Returns

  • atom_pairs (list of atom pairs)

  • counts (list of ints)
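
The counting can be sketched in plain Python as follows. This is an illustration under stated assumptions, not mdciao's implementation: the function signature, the sample atom indices, and the use of tuples as temporary dictionary keys are all choices made for this sketch. The two parallel lists mirror the documented return value.

```python
def count_formed_atom_pairs(ctc_trajs_nm, atom_pair_trajs, ctc_cutoff_Ang, sort=True):
    """Illustrative stand-in: count the atom pairs of the frames where the contact is formed."""
    cutoff_nm = ctc_cutoff_Ang / 10  # 1 nm = 10 Angstrom
    counts = {}
    for dists, pairs in zip(ctc_trajs_nm, atom_pair_trajs):
        for d, pair in zip(dists, pairs):
            if d <= cutoff_nm:
                key = tuple(pair)  # tuples are hashable, lists are not
                counts[key] = counts.get(key, 0) + 1
    items = list(counts.items())
    if sort:
        items.sort(key=lambda kv: kv[1], reverse=True)
    atom_pairs = [list(key) for key, _ in items]
    return atom_pairs, [n for _, n in items]


ctc_trajs = [[0.30, 0.35, 0.60], [0.32, 0.38]]
atom_pair_trajs = [[[10, 52], [10, 53], [11, 52]], [[10, 53], [10, 53]]]
atom_pairs, counts = count_formed_atom_pairs(ctc_trajs, atom_pair_trajs, 4.0)
```

With a 4.0 Angstrom cutoff, the frame at 0.60 nm is not formed, so the pair [11, 52] never appears in the result.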

distro_overall_trajs(bins=10)

Wrapper around numpy.histogram to produce a distribution of the distance values (not the contact frequencies) of this contact over all trajectories

Parameters

bins (int or anything numpy.histogram accepts) –

Returns

  • h (_np.ndarray) – The counts (integer valued)

  • x (_np.ndarray) – The bin edges (length(hist)+1).

frequency_dict(ctc_cutoff_Ang, switch_off_Ang=None, AA_format='short', split_label=True, atom_types=False, defrag=None)

Returns the frequency_overall_trajs as a more informative dictionary with keys “freq”, “residue idxs”, “label”

Parameters
  • ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

  • AA_format (str, default is "short") – Amino-acid format (“E35” or “GLU35”) for the value fdict[“label”]. Can also be “long” or “just_consensus”

  • split_label (bool, default is True) –

    Split the labels so that stacked contact labels become easier-to-read in plain ascii formats

  • atom_types (bool, default is False) – Include the relative frequency of atom-type-pairs involved in the contact

  • defrag (string, default is None) – The character to use for deleting (defragmenting) the fragment info, e.g. “@” for turning “R30@3.51” into “R30”

Returns

fdict

Return type

dictionary

frequency_overall_trajs(ctc_cutoff_Ang, switch_off_Ang=None)

How many times this contact is formed over all frames. Frequencies have values between 0 and 1

Parameters

ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

Returns

freq – Frequency of the contact over all trajectories

Return type

float

frequency_per_traj(ctc_cutoff_Ang, switch_off_Ang=None)

Contact frequencies for each trajectory

Parameters

ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

Returns

freqs

Return type

array of len self.n.n_trajs with floats between [0,1]
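
The difference between the per-trajectory and the overall frequency can be sketched in plain Python. The two function names match the methods documented here, but the bodies are only an illustration operating on already binarized trajectories; the sketch assumes that the overall frequency pools the frames of all trajectories, so that longer trajectories weigh more.

```python
def frequency_per_traj(bintrajs):
    """Fraction of formed frames, one value per trajectory."""
    return [sum(traj) / len(traj) for traj in bintrajs]


def frequency_overall_trajs(bintrajs):
    """Fraction of formed frames over the pooled frames of all trajectories."""
    n_formed = sum(sum(traj) for traj in bintrajs)
    n_frames = sum(len(traj) for traj in bintrajs)
    return n_formed / n_frames


# One trajectory of 4 frames and one of 2 frames
bintrajs = [[True, True, True, False], [False, False]]
per_traj = frequency_per_traj(bintrajs)
overall = frequency_overall_trajs(bintrajs)
```

Under this assumption the overall value (3 formed frames out of 6) differs from the plain average of the per-trajectory values whenever the trajectories have different lengths.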

gen_label(AA_format='short', fragments=False, delete_anchor=False)

Generate a label with different parameters

Parameters
  • AA_format (str, default is "short") – Alternative is “long” (“E30” vs “GLU30”)

  • fragments (bool, default is False) – Include fragment information. Will get the “best” information available, i.e. consensus>fragname>fragindex

  • delete_anchor (bool, default is False) – Delete the anchor from the label

label_flex(AA_format='short', split_label=True, defrag=None)

A more flexible method to produce the label of this ContactPair

Parameters
  • AA_format (str, default is "short") – Amino-acid format for the label, can be “short” (A35@4.50), “long” (ALA35@4.50), or “just_consensus” (4.50)

  • split_label (bool, default is True) –

    Split the labels so that stacked contact labels become easier-to-read in plain ascii formats

  • defrag (char, default is None) – Character to use when defragging the contact label, e.g. “@”. Default is to leave the labels as they are

Returns

label

Return type

str
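
What defragging means for a single residue label can be sketched in plain Python. The helper name defrag_label is hypothetical and the code is not mdciao's; it only reproduces the documented example of turning “R30@3.51” into “R30”.

```python
def defrag_label(label, defrag=None):
    """Hypothetical helper: drop the fragment info that follows the defrag character."""
    if defrag is None:
        return label  # default: leave the label as it is
    return label.split(defrag)[0]


kept = defrag_label("R30@3.51")
stripped = defrag_label("R30@3.51", defrag="@")
```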

partial_counts_formed_atom_pairs(ctc_cutoff_Ang, switch_off_Ang=None, sort=True)

Count how many times each atom-pair is considered in contact in the trajectories

Since the switch_off_Ang parameter introduces partial counts, the return value need not be integer counts

Ideally we would return a dictionary, but atom pairs are not hashable

Parameters
  • ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

  • sort (boolean, default is True) – Return the counts in descending order

Returns

  • atom_pairs (list of atom pairs)

  • counts (list of numbers, not necessarily integers when switch_off_Ang is used)
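
Why the counts need not be integers can be illustrated with a plain-Python sketch. The shape of the switching function is not described in this section, so the linear ramp used below is an assumption made for illustration, and the helper name switched_weight is hypothetical.

```python
def switched_weight(d_nm, ctc_cutoff_Ang, switch_off_Ang):
    """Hypothetical helper: weight of one frame, ASSUMING a linear switch-off."""
    cutoff_nm = ctc_cutoff_Ang / 10  # 1 nm = 10 Angstrom
    switch_nm = switch_off_Ang / 10
    if d_nm <= cutoff_nm:
        return 1.0  # fully formed
    if d_nm >= cutoff_nm + switch_nm:
        return 0.0  # fully broken
    # In between: the weight decays linearly from 1 to 0
    return 1.0 - (d_nm - cutoff_nm) / switch_nm


# Distances in nm; 3.0 Angstrom cutoff with a 2.0 Angstrom switch-off window
weights = [switched_weight(d, 3.0, 2.0) for d in (0.25, 0.40, 0.60)]
partial_count = sum(weights)
```

Frames inside the switch-off window contribute a fraction of a count, so the sum over frames is in general not an integer.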

relative_frequency_of_formed_atom_pairs_overall_trajs(ctc_cutoff_Ang, switch_off_Ang=None, keep_resname=False, aggregate_by_atomtype=True, min_freq=0.05)

For those frames in which the contact is formed, group them by relative frequencies of individual atom pairs

Parameters
  • ctc_cutoff_Ang (float) – Cutoff in Angstrom. The comparison operator is “<=”

  • keep_resname (bool, default is False) – Keep the atom’s residue name in its descriptor. Only makes sense if aggregate_by_atomtype is False

  • aggregate_by_atomtype (bool, default is True) – Aggregate the frequencies of the contact by the atom types involved. Atom types are backbone, sidechain or other (BB, SC, X)

  • min_freq (float, default is .05) – Do not report relative frequencies below this cutoff, e.g. “BB-BB”:.9, “BB-SC”:0.03, “SC-SC”:0.03, “SC-BB”:0.03 gets reported as “BB-BB”:.9

Returns

out_dict – Relative freqs, keyed by atom-type (atoms) involved in the contact. The order is the same as in self.ctc_labels

Return type

dictionary
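
The effect of min_freq can be sketched in plain Python. The function name relative_frequencies and the input (one atom-type-pair string per formed frame) are assumptions of this sketch, not mdciao's data structures.

```python
from collections import Counter


def relative_frequencies(type_pairs, min_freq=0.05):
    """Illustrative stand-in: relative frequency per atom-type pair, dropping rare ones."""
    counts = Counter(type_pairs)
    total = sum(counts.values())
    return {
        key: n / total
        for key, n in counts.most_common()
        if n / total >= min_freq
    }


# 100 formed frames, labelled by the atom types of the closest atom pair
type_pairs = ["BB-BB"] * 90 + ["BB-SC"] * 4 + ["SC-SC"] * 3 + ["SC-BB"] * 3
out_dict = relative_frequencies(type_pairs)
```

Only “BB-BB” survives the default cutoff here; the other three pairs fall below 0.05 and are not reported.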

retop(top, mapping, deepcopy=False, **CP_kwargs)

Return a copy of this object with a different topology.

Uses the mapping to generate new residue- and atom-indices where necessary, using the rest of the object’s attributes (time-traces, labels, colors, fragments…) as they were.

Note

This method will (rightly) fail if:
  • the mapping doesn’t contain the needed residues

  • the individual atoms of those residues cannot be uniquely mapped between topologies

Parameters
  • top (Topology) – The new topology

  • mapping (indexable (array, dict, list)) – A mapping of old residue indices to new residue indices. Usually, comes from aligning the old and the new topology using mdciao.utils.sequence.maptops. These maps only contain (key,value) pairs whenever there’s been a “match”, s.t. this method will fail if mapping doesn’t contain all the residues in this ContactPair.

  • deepcopy (bool, default is False) –

    Use copy.deepcopy on the attributes when creating the new ContactPair. If False, the identity holds:

    >>> self.residues.consensus_labels is CP.residues.consensus_labels
    

    If True, only the equality holds:

    >>> self.residues.consensus_labels == CP.residues.consensus_labels
    

    Note that time_traces are always created new no matter what.

  • CP_kwargs

    Optional keyword arguments to instantiate the new ContactPair. Any key-value pairs passed here will update the internal dictionary being used, which is:

Returns

CP – A new CP with updated top and indices

Return type

ContactPair
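
The residue-index part of the remapping, including the failure when a residue is missing from the mapping, can be sketched in plain Python. The helper name remap_pair and the indices are hypothetical, and the sketch ignores the atom-index remapping that the method also performs.

```python
def remap_pair(res_idxs_pair, mapping):
    """Hypothetical helper: translate old residue indices into new ones."""
    try:
        return [mapping[ii] for ii in res_idxs_pair]
    except (KeyError, IndexError) as exc:
        raise ValueError(f"mapping lacks a residue of {res_idxs_pair}") from exc


# Old-to-new residue indices, containing only the residues that matched
mapping = {30: 28, 131: 129}
new_pair = remap_pair((30, 131), mapping)
```

A pair such as (30, 500) would raise the ValueError, since residue 500 has no counterpart in the mapping.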

save(filename)

Save this ContactPair as a pickle

Parameters

filename (str) – filename

property time_max

Maximum time from the list of lists of times (int or float)

property time_min

Minimum time from the list of lists of times (int or float)

property time_traces

Contains time-traces stored as a _TimeTraces object

property top
property topology