mdciao.cli.sites
mdciao.cli.sites(site_inputs, trajectories, topology=None, ctc_cutoff_Ang=3.5, stride=1, scheme='closest-heavy', chunksize_in_frames=10000, n_smooth_hw=0, pbc=True, GPCR_uniprot='None', CGN_PDB='None', KLIFS_uniprotAC=None, fragments=['lig_resSeq+'], default_fragment_index=None, fragment_names='', output_dir='.', graphic_ext='.pdf', t_unit='ns', curve_color='auto', background=True, graphic_dpi=150, short_AA_names=False, save_nomenclature_files=False, ylim_Ang=10, n_jobs=1, accept_guess=False, table_ext='dat', output_desc='sites', plot_atomtypes=False, distro=False, no_disk=False, savefigs=True, savetabs=True, savetrajs=False, figures=True, plot_timedep=True)
Compute distances between groups of contact-pairs that are already pre-defined as sites
- Parameters
site_inputs (list, default is None) –
List of sites to compute. Sites can be either paths to site file(s) in JSON format or site dictionaries passed directly. A site dictionary looks like:
{"name":"site", "pairs":{"AAresSeq":["GLU30-ARG40", "LYS31-W70"]}}
Any site containing a residue that can't be found in the topology will be discarded. See mdciao.sites for more info on the site format.
trajectories –
The MD trajectories to calculate the frequencies from. This input is flexible; for more info check mdciao.utils.str_and_dict.get_sorted_trajectories. Accepted values are:
* a pattern, e.g. "*.ext"
* one string containing a filename
* a list of filenames
* one Trajectory object
* a list of Trajectory objects
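As a minimal sketch of the site format described for site_inputs, a site dictionary can be built in Python and also written to a JSON file, so that either form can be placed in the list. The site name, the residue pairs, and the file name site.json are illustrative choices, not values prescribed by mdciao; mdciao.sites remains the reference for the format.

```python
import json
import os
import tempfile

# A site dictionary in the format shown for site_inputs
# (the name and the residue pairs are purely illustrative).
site = {
    "name": "site",
    "pairs": {"AAresSeq": ["GLU30-ARG40", "LYS31-W70"]},
}

# Write it to a JSON file; the file name is a hypothetical choice.
site_file = os.path.join(tempfile.mkdtemp(), "site.json")
with open(site_file, "w") as f:
    json.dump(site, f, indent=2)

# Reading the file back yields the same dictionary.
with open(site_file) as f:
    reloaded = json.load(f)

# site_inputs may mix dictionaries and paths to JSON files.
site_inputs = [site, site_file]
```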
topology (str or Trajectory, default is None) – The topology associated with the trajectories. If None, the topology of the first trajectory will be used, i.e. when no topology is passed, the first trajectory has to be either a .gro or .pdb file, or a Trajectory object.
ctc_cutoff_Ang (float, default is 3.5) – Any residue-residue distance is considered a contact if d<=ctc_cutoff_Ang
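The cutoff criterion can be illustrated with a small sketch. It assumes that a contact frequency is the fraction of frames in which the residue-residue distance satisfies d<=ctc_cutoff_Ang; the helper contact_frequency and the distance values are hypothetical and not part of mdciao.

```python
def contact_frequency(distances_ang, ctc_cutoff_ang=3.5):
    """Fraction of frames in which d <= ctc_cutoff_ang (hypothetical helper)."""
    if not distances_ang:
        raise ValueError("need at least one frame")
    n_contacts = sum(1 for d in distances_ang if d <= ctc_cutoff_ang)
    return n_contacts / len(distances_ang)


# Made-up distances (in Angstrom) for one residue pair over five frames.
distances = [3.1, 3.5, 4.2, 2.9, 6.0]
freq = contact_frequency(distances)
```

Note that a distance exactly equal to the cutoff counts as a contact, because the comparison is inclusive.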
stride (int, default is 1) – Stride the input data by this number of frames
scheme (str, default is 'closest-heavy') – Type of scheme for computing distance between residues. Choices are {'ca', 'closest', 'closest-heavy', 'sidechain', 'sidechain-heavy'}. See the mdtraj documentation for more info
chunksize_in_frames (int, default is 10000) – Stream through the trajectory data in chunks of this many frames. Large values can lead to memory errors if n_jobs makes it so that, e.g., 4 trajectories of 10000 frames each are loaded into memory at once and their residue-residue distances computed.
n_smooth_hw (int, default is 0) – Plots of the time-traces will be smoothed using a window of 2*n_smooth_hw
pbc (bool, default is True) – Use periodic boundary conditions
GPCR_uniprot (str or mdciao.nomenclature.LabelerGPCR, default is None) – For GPCR nomenclature. If str, e.g. "adrb2_human", try to locate a local filename or do a web lookup in the GPCRdb. If mdciao.nomenclature.LabelerGPCR, use this object directly (allows for object re-use when in API mode). See mdciao.nomenclature for more info and references. Please note the difference between UniProt Accession Code and UniProt entry name.
CGN_PDB (str or mdciao.nomenclature.LabelerCGN, default is None) – For CGN (G-alpha Numbering definitions) nomenclature. If str, e.g. "3SN6", try to locate local filenames ("3SN6.pdb", "CGN_3SN6.txt") or do web lookups in https://www.mrc-lmb.cam.ac.uk/CGN/ and http://www.rcsb.org/. If mdciao.nomenclature.LabelerCGN, use this object directly (allows for object re-use when in API mode). See mdciao.nomenclature for more info and references.
KLIFS_uniprotAC (str or mdciao.nomenclature.LabelerKLIFS, default is None) – UniProt Accession Code for kinase KLIFS nomenclature. If str, e.g. "P31751", try to locate a local filename or do a web lookup in the KLIFS database. If mdciao.nomenclature.LabelerKLIFS, use this object directly (allows for object re-use when in API mode). See mdciao.nomenclature for more info and references. Please note the difference between UniProt Accession Code and UniProt entry name.
fragments (list, default is ['lig_resSeq+']) –
Fragment control. For compatibility reasons, it has to be a list, even if it only has one element. There exist several input modes:
* ["consensus"]: use things like "TM*" or "G.H*", i.e. GPCR or CGN-sub-subunit labels.
* List of len 1 with some fragmentation heuristic, e.g. ["lig_resSeq+"]. Will use the default of mdciao.fragments.get_fragments. See there for info on defaults and other heuristics.
* List of len N that can mix different possibilities:
  * iterable of integers (lists or np.arrays, e.g. np.arange(20,30))
  * ranges expressed as integer strings, "20-30"
  * ranges expressed as residue descriptors ["GLU30-LEU40"]
Numeric expressions are interpreted as zero-indexed and unique residue serial indices, i.e. 30-40 does not necessarily equate to "GLU30-LEU40" unless serial and sequence index coincide. If there's more than one "GLU30", the user gets asked to disambiguate. The resulting fragments need not cover all of the topology, they only need to not overlap.
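The numeric input mode and the no-overlap requirement can be sketched as follows. This is not mdciao's own parser: the helpers expand_range and overlapping are hypothetical, and treating both ends of "20-30" as inclusive is an assumption made for the illustration.

```python
def expand_range(expr):
    """Expand an integer-range string such as "20-30" into a list of ints.

    Assumption: both ends are inclusive.
    """
    start, end = (int(s) for s in expr.split("-"))
    return list(range(start, end + 1))


def overlapping(fragments):
    """Return True if any residue index appears in more than one fragment."""
    seen = set()
    for frag in fragments:
        frag = set(frag)
        if seen & frag:
            return True
        seen |= frag
    return False


frag_a = expand_range("20-30")   # zero-indexed residue serial indices
frag_b = list(range(31, 40))     # an iterable of integers also works
ok = not overlapping([frag_a, frag_b])
```

The two fragments here leave most of a topology uncovered, which is allowed; only shared indices would be a problem.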
default_fragment_index (NoneType, default is None) – In case a residue identified as, e.g, “GLU30”, appears more than one time in the topology, e.g. in case of a dimer, pass which fragment/monomer should be chosen by default. The default behaviour (None) will prompt the user when necessary
fragment_names (str or list, default is '') – If string, it has to be a list of comma-separated values. If you want unnamed fragments, use None, "None", or "". Has to contain names for all fragments that result from fragments, or more. mdciao will try to use replace4latex to generate LaTeX expressions from stuff like "Galpha".
output_dir (str, default is '.') – Directory to which the results are written
graphic_ext (str, default is '.pdf') – Extension of the output graphics, default is .pdf
t_unit (str, default is 'ns') – Unit used for the temporal axis.
curve_color (str, default is 'auto') – Type of color used for the curves. Alternatives are “P” or “H”
background (bool or color-like (str, hex, rgb), default is True) – When smoothing, the original curve can appear in the background in different colors:
* True: use a faint version of color
* False: don't plot any background
* color-like: use this color for the background; can be str, hex, rgba, anything matplotlib.pyplot.colors understands
graphic_dpi (int, default is 150) – Dots per Inch (DPI) of the graphic output. Only has an effect for bitmap outputs.
short_AA_names (bool, default is False) – Use one-letter amino-acid names when possible, e.g. K145 instead of Lys145.
save_nomenclature_files (bool, default is False) – Save available nomenclature definitions to disk so that they can be accessed locally in later uses.
ylim_Ang (int, default is 10) – Limit in Angstrom of the y-axis of the time-traces. Switch to any other float or ‘auto’ for automatic scaling
n_jobs (int, default is 1) – Number of processors to use. The parallelization is done over trajectories and not over contacts, so setting n_jobs higher than the number of trajectories will not have any further effect
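The interplay between n_jobs and chunksize_in_frames can be estimated with a rough sketch. It assumes that each worker holds one chunk of one trajectory at a time, which is a simplification and not a statement about mdciao's internals; both helper functions are hypothetical.

```python
def effective_workers(n_jobs, n_trajs):
    """Workers that can actually be busy when parallelizing over trajectories."""
    return max(1, min(n_jobs, n_trajs))


def frames_in_memory(n_jobs, n_trajs, chunksize_in_frames=10000):
    """Rough upper bound on frames loaded at once (one chunk per worker)."""
    return effective_workers(n_jobs, n_trajs) * chunksize_in_frames


# Four trajectories processed by four workers, default chunk size.
peak_a = frames_in_memory(n_jobs=4, n_trajs=4)

# Eight requested workers but only three trajectories.
peak_b = frames_in_memory(n_jobs=8, n_trajs=3)
```

If such an estimate exceeds the available memory, lowering chunksize_in_frames or n_jobs are the two knobs this signature offers.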
accept_guess (bool, default is False) – Accept mdciao’s guesses regarding fragment identification using nomenclature labels
table_ext (str, default is dat) – Extension for tabled files (.dat, .txt, .xlsx).
output_desc – Descriptor for output files.
plot_atomtypes (bool, default is False) – Add the atom-types to the frequency bars by 'hatching' them: '--' is sidechain-sidechain, '|' is backbone-backbone, '\' is backbone-sidechain, '/' is sidechain-backbone. See Fig XX for an example
distro (bool, default is False) – Plot distance distributions instead of contact bar plots
savefigs (bool, default is True) – Save the figures
savetabs (bool, default is True) – Save the frequency tables
savetrajs (bool, default is False) – Save the timetraces
no_disk (bool, default is False) – If True, don’t save any files at all: figs, tables, trajs, nomenclature
figures (bool, default is True) – Draw figures
plot_timedep (bool, default is True) – Plot time-traces of the contacts
- Returns
CG_site – Keyed with the site name, its values are the mdciao.contacts.ContactGroup objects that make up each site
- Return type
dictionary
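A hedged usage sketch of the call and its return value follows. It uses only parameters that appear in the signature above; the wrapper function run_sites_example, the site contents, and whatever trajectory and topology files are passed in are assumptions of this example. mdciao is imported inside the function so that the snippet can be loaded even where mdciao is not installed.

```python
def run_sites_example(trajectories, topology, output_dir="."):
    """Call mdciao.cli.sites on one illustrative site and list the result.

    trajectories and topology are supplied by the caller; the site
    below is made up and may be discarded if its residues are not
    present in the topology.
    """
    import mdciao  # lazy import: only needed when the function is called

    site = {"name": "site",
            "pairs": {"AAresSeq": ["GLU30-ARG40", "LYS31-W70"]}}

    CG_site = mdciao.cli.sites(
        [site],
        trajectories,
        topology=topology,
        ctc_cutoff_Ang=3.5,
        output_dir=output_dir,
        no_disk=True,    # don't save figures, tables or trajectories
        figures=False,   # don't draw figures either
    )

    # The returned dictionary is keyed with the site name.
    for name, contact_group in CG_site.items():
        print(name, type(contact_group).__name__)
    return CG_site
```

Calling run_sites_example with, for instance, a file pattern and a .pdb file would be one way to try it; those file names are left to the reader because none are given here.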