mdc_neighborhoods.py
Analyse residue neighborhoods using a distance cutoff. Residue-residue contacts are reported in terms of their overall frequencies and time-traces. A number of files containing plots, tables and data will be generated.
usage: mdc_neighborhoods.py [-h] [-r RESIDUES] [--ctc_cutoff_Ang CTC_CUTOFF_ANG] [--serial_idxs] [--stride STRIDE]
[-cc CTC_CONTROL] [--n_nearest N_NEAREST] [--chunksize_in_frames CHUNKSIZE_IN_FRAMES]
[--n_smooth_hw N_SMOOTH_HW] [-fr FRAGMENTS [FRAGMENTS ...]]
[--fragment_names FRAGMENT_NAMES] [-nf] [--fragment_colors FRAGMENT_COLORS] [--no-sort]
[--no-pbc] [-tx TABLE_EXT] [-gx GRAPHIC_EXT] [-GPCR GPCR_UNIPROT] [-CGN CGN_UNIPROT]
[-KLIFS KLIFS_STRING] [--save_nomenclature] [-od OUTPUT_DIR] [-o OUTPUT_DESC]
[--t_unit T_UNIT] [--curve_color CURVE_COLOR] [--background BACKGROUND]
[--graphic_dpi GRAPHIC_DPI] [-sa] [-nsf] [-nt] [-st] [-d] [--n_cols N_COLS]
[--n_jobs N_JOBS] [--pop_N_ctcs] [--ylim_Ang YLIM_ANG] [-ni] [-s SWITCH_OFF_ANG] [-at]
[--naive_bonds] [--scheme SCHEME]
topology trajectories [trajectories ...]
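As an illustration of how the options in the usage line combine, the sketch below assembles one possible command line. The file names (`prot.pdb`, `run1.xtc`, `run2.xtc`) and the output directory `results` are hypothetical placeholders; only flags listed in the usage above are used. The command is built and printed, not executed.

```python
import shlex

# Hypothetical input files; substitute your own topology and trajectories.
topology = "prot.pdb"
trajectories = ["run1.xtc", "run2.xtc"]

cmd = [
    "mdc_neighborhoods.py",
    topology,
    *trajectories,
    "-r", "GLU*,LEU394,380-385",  # residues of interest, no spaces
    "--ctc_cutoff_Ang", "4",      # contact cutoff in Angstrom
    "--stride", "10",             # stride the trajectories by a factor of 10
    "-od", "results",             # output directory (hypothetical name)
    "-ni",                        # try not to prompt the user
]

# shlex.join quotes the wildcard so that the shell does not expand it.
command_line = shlex.join(cmd)
print(command_line)
```

The positional arguments are placed first on purpose: `-fr` accepts several values, so file names written directly after it would be read as fragment definitions.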
Positional Arguments
- topology
Topology file
- trajectories
trajectory file(s)
Named Arguments
- -r, --residues
The residues of interest, as comma-separated values without spaces. The input is very flexible and accepts mixed descriptors and wildcards, e.g. “GLU*,ARG*,GDP*,LEU394,380-385” is a valid input. Numbers are interpreted as a residue’s sequence number
(394 in LEU394), unless --serial_idxs is passed as an option.
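To make the descriptor syntax concrete, here is a minimal sketch of how such a string could be expanded against a topology. It is not mdciao's own parser: the function `expand_residues` and the toy residue list are hypothetical, wildcards are matched with Python's `fnmatch`, and ranges are assumed to be inclusive.

```python
from fnmatch import fnmatchcase


def expand_residues(descriptor, residues):
    """Return the sorted serial indices of the residues matched by `descriptor`.

    `residues` is a list of (name, sequence_number) tuples standing in
    for a topology; the position in the list is the serial index.
    """
    selected = set()
    for token in descriptor.split(","):
        parts = token.split("-")
        for serial, (name, seq) in enumerate(residues):
            if token.isdigit():
                # Bare number: compare against the sequence number
                hit = int(token) == seq
            elif len(parts) == 2 and all(p.isdigit() for p in parts):
                # Range such as 380-385, assumed inclusive on both ends
                hit = int(parts[0]) <= seq <= int(parts[1])
            else:
                # Name with optional wildcards, e.g. GLU* or LEU394
                hit = fnmatchcase(f"{name}{seq}", token)
            if hit:
                selected.add(serial)
    return sorted(selected)


toy = [("GLU", 30), ("ARG", 31), ("LEU", 394),
       ("ALA", 380), ("GLY", 385), ("GDP", 396)]
picked = expand_residues("GLU*,LEU394,380-385", toy)
print([f"{toy[i][0]}{toy[i][1]}" for i in picked])
```

With this toy topology, the sequence number 394 corresponds to serial index 2, which is the distinction that --serial_idxs switches between.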
- --ctc_cutoff_Ang, -co
The cutoff distance between two residues for them to be considered in contact. Default is 4.5 Angstrom.
Default: 4.5
- --serial_idxs
Interpret the indices of --residues not as their sequence idxs (e.g. 30 for GLU30), but as their serial order in the topology (e.g. 0 for GLU30 if GLU30 is the first residue in the topology). Default is False.
Default: False
- --stride
Stride down the input trajectory files by this factor. Default is 1.
Default: 1
- -cc, --ctc_control
Control the number of reported contacts. Can be an integer (keep the first n contacts) or a float representing a fraction [0,1] of the total number of contacts. Default is 6.
Default: 6
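The two modes of this option can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: contacts are assumed to be ranked by descending frequency, and the float mode is assumed to keep the fewest top-ranked contacts whose frequencies add up to the requested fraction of the summed frequencies. The function name `select_contacts` is hypothetical.

```python
def select_contacts(freqs, ctc_control):
    """Return indices of the contacts to report, most frequent first."""
    order = sorted(range(len(freqs)), key=lambda i: freqs[i], reverse=True)
    if isinstance(ctc_control, int):
        # Integer: keep the first n contacts
        return order[:ctc_control]
    # Float (assumed meaning): keep contacts until their summed
    # frequency reaches the requested fraction of the total
    target = ctc_control * sum(freqs)
    kept, running = [], 0.0
    for i in order:
        if running >= target:
            break
        kept.append(i)
        running += freqs[i]
    return kept


freqs = [0.10, 0.90, 0.50, 0.30, 0.20]
top_three = select_contacts(freqs, 3)
top_fraction = select_contacts(freqs, 0.6)
print(top_three, top_fraction)
```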
- --n_nearest, -nn
Ignore this many nearest neighbors when computing neighbor lists. ‘Near’ means ‘connected by this many bonds’. Default is 4.
Default: 4
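A short sketch of what ‘connected by this many bonds’ means in practice: a breadth-first search over residue-residue bonds collects every residue within n bonds of a given one, and those are the partners that would be excluded. The function `bonded_neighbors` and the linear toy chain are hypothetical and only illustrate the idea.

```python
from collections import deque


def bonded_neighbors(bonds, start, n_bonds):
    """Residues reachable from `start` via at most `n_bonds` bonds."""
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == n_bonds:
            continue  # do not walk further than n_bonds
        for nxt in adjacency.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return sorted(r for r in depth if r != start)


# Toy linear chain of ten residues: 0-1-2-...-9
chain = [(i, i + 1) for i in range(9)]
excluded = bonded_neighbors(chain, 5, 2)
print(excluded)
```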
- --chunksize_in_frames
Trajectories are read in chunks of this size. Helps with big files and/or large number of contacts when you run into memory problems. Default is 2000
Default: 2000
- --n_smooth_hw, -ns
Number of frames in one half of the averaging window for the time-traces. Default is 0, which means no averaging.
Default: 0
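The half-width convention can be illustrated with a plain moving average: a half-width of n gives a window of 2n+1 frames centered on each frame. The sketch below assumes that frames without a full window are dropped, so the smoothed trace is 2n frames shorter; how the tool treats the edges is not stated here. The function name `window_average` is hypothetical.

```python
def window_average(values, half_width):
    """Moving average with a window of 2 * half_width + 1 frames."""
    if half_width == 0:
        return list(values)  # no averaging
    width = 2 * half_width + 1
    return [
        sum(values[i:i + width]) / width
        for i in range(len(values) - width + 1)
    ]


trace = [3.0, 6.0, 9.0, 6.0, 3.0]
smoothed = window_average(trace, 1)
print(smoothed)
```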
- -fr, --fragments
How to sub-divide the topology into fragments. Several options possible. Taking the example sequence: …-A27,Lig28,K29-…-W40,D45-…-W50,CYSP51,GDP52
- ‘resSeq’
breaks at jumps in resSeq entry: […A27,Lig28,K29,…,W40],[D45,…,W50,CYSP51,GDP52]
- ‘resSeq+’
breaks only at negative jumps in resSeq: […A27,Lig28,K29,…,W40,D45,…,W50,CYSP51,GDP52]
- ‘bonds’
breaks when AAs are not connected by bonds, ignores resSeq: […A27],[Lig28],[K29,…,W40],[D45,…,W50],[CYSP51],[GDP52]. Note that because phosphorylated CYSP51 didn’t get a bond in the topology, it’s considered a ligand.
- ‘resSeq_bonds’
breaks at both resSeq jumps and missing bonds
- ‘lig_resSeq+’
Like resSeq+ but puts any non-AA residue into its own fragment: […A27],[Lig28],[K29,…,W40],[D45,…,W50,CYSP51],[GDP52]
- ‘chains’
breaks into chains of the PDB file/entry
- None or ‘None’
all residues are in one fragment, fragment 0
- ‘consensus’
If any consensus nomenclature is provided, ask the user for definitions using consensus labels
- 0-10,15,14 20,21,30-50 51 (example, advanced users only)
Input arbitrary fragments via their residue serial indices (zero-indexed) using space as separator. Not recommended.
- ‘None’
All residues are in one fragment (fragment 0). Can be harmless or potentially dangerous if residue labels are repeated.
If you are unsure of any of these options, use the command line tool mdc_fragments.py on your topology file.
Default: “lig_resSeq+”
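The difference between the ‘resSeq’ and ‘resSeq+’ heuristics can be sketched on a plain list of sequence numbers. This is a simplified illustration, not the tool's code: the function `split_by_resseq` is hypothetical, and a ‘jump’ is assumed to be any step other than +1.

```python
def split_by_resseq(resseqs, only_negative_jumps=False):
    """Split a list of residue sequence numbers into fragments.

    only_negative_jumps=False mimics 'resSeq' (break at any jump),
    only_negative_jumps=True mimics 'resSeq+' (break at negative jumps).
    """
    if not resseqs:
        return []
    fragments = [[resseqs[0]]]
    for prev, curr in zip(resseqs, resseqs[1:]):
        jump = curr - prev
        is_break = jump < 0 if only_negative_jumps else jump != 1
        if is_break:
            fragments.append([])  # start a new fragment
        fragments[-1].append(curr)
    return fragments


seq = [27, 28, 29, 30, 45, 46, 47, 1, 2]
by_resseq = split_by_resseq(seq)
by_resseq_plus = split_by_resseq(seq, only_negative_jumps=True)
print(by_resseq)
print(by_resseq_plus)
```

The gap between 30 and 45 splits the chain only under ‘resSeq’, while the restart at 1 splits it under both heuristics.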
- --fragment_names
Names of the fragments. Default is to name them automatically. Otherwise, give a quoted list of strings separated by commas, e.g. ‘TM1, TM2, TM3’. Use ‘None’ to avoid naming them altogether.
Default: “auto”
- -nf, --no-fragments
Do not use fragments. Default is to use them
Default: True
- --fragment_colors
Comma-separated values of the fragment colors.
If only one value is given, that color is used for all fragments. Any matplotlib color can be used. Default is ‘tab:blue’. Why ‘tab’? See https://matplotlib.org/3.1.1/tutorials/colors/colors.html
Default: “tab:blue”
- --no-sort
Don’t sort the residues by their index. Default is to sort them.
Default: True
- --no-pbc
Do not consider periodic boundary conditions when computing distances. Default is to consider them
Default: True
- -tx, --table_ext
Extension for tabled files (.dat, .txt, .xlsx, .ods). Default is ‘.dat’
Default: “dat”
- -gx, --graphic_ext
Extension of the output graphics, default is .pdf
Default: “.pdf”
- -GPCR, --GPCR_UniProt
Look for GPCR consensus nomenclature, e.g. Ballesteros-Weinstein, using this UniProt name, e.g. adrb2_human. First, try locally with ‘adrb2_human.xlsx’ (or a full path to the file), then do a web-lookup on the fly on the GPCRdb. See https://gpcrdb.org/services/ for more details. Default is None.
Default: “None”
- -CGN, --CGN_UniProt
Look for Common-G-protein-Nomenclature, CGN, using this UniProt name, e.g. gnas2_human. First, try locally with ‘gnas2_human.xlsx’ (or a full path to the file), then do a web-lookup on the fly on the GPCRdb. See https://gpcrdb.org/services/ for more details. Default is None.
Default: “None”
- -KLIFS, --KLIFS_string
Look for Kinase consensus nomenclature, KLIFS, using this string, e.g. P31751. First, try locally with ‘KLIFS_P31751.xlsx’ (or a full path to the file), then do a web-lookup on the fly on KLIFS. For web-lookups, the string has to be formatted as key:value, e.g. ‘UniProtAC:P31751’. See the online documentation on mdciao’s LabelerKLIFS object and also https://klifs.net/ for more details.
Default: “None”
- --save_nomenclature
Save available nomenclature definitions to disk so that they can be accessed locally in later uses. Default is False
Default: False
- -od, --output_dir
Directory to which the results are written. Default is ‘.’
Default: “.”
- -o, --output_desc
Descriptor for output files. Default is neighborhood
Default: “neighborhood”
- --t_unit
Unit used for the temporal axis, default is ns.
Default: “ns”
- --curve_color
Type of color used for the curves. Default is auto. Alternatives are ‘P’ or ‘H’
Default: “auto”
- --background, -bg
Type of background when using smoothing windows. Default (True) is to use the unsmoothed curve’s color. A color string e.g. ‘g’ or ‘red’ or ‘gray’ also works, as does an RGB string ‘0.5, 1., 0.5’. Use False for no color.
Default: True
- --graphic_dpi
Dots per Inch (DPI) of the graphic output. Only has an effect for bitmap outputs. Default is 150.
Default: 150
- -sa, --short_AAs
Use one-letter amino acid names when possible, e.g. K145 instead of Lys145. Default is False.
Default: False
- -nsf, --no-same_fragment
Don’t allow contact partners in the same fragment. Default is to allow it.
Default: True
- -nt, --no-time-trace
Don’t plot the time-traces of the contacts. Default is to plot them.
Default: True
- -st, --save-trajs
Save trajectory data, default is not to save it.
Default: False
- -d, --distribution
Plot distance distributions instead of contact bar plots. Default is False.
Default: False
- --n_cols
Number of columns of the overall plot. Default is 4.
Default: 4
- --n_jobs
Number of processors to use. The parallelization is done over trajectories and not over contacts, so setting n_jobs > n_trajs will not have any additional effect.
Default: 1
- --pop_N_ctcs
Separate the plot with the total number of contacts from the time-trace plot. Default is False.
Default: False
- --ylim_Ang
Limit in Angstrom of the y-axis of the time-traces. Default is 10. Switch to any other float or ‘auto’ for automatic scaling
Default: “10”
- -ni, -no-interactive
Try not to be interactive. This can make wrong choices for the user; for advanced users only.
Default: False
- -s, --switch_off_Ang
Use a linear switch-off instead of a crisp one. Default is None.
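The contrast between a crisp cutoff and a linear switch-off can be sketched as below. The functional form is an assumption made for illustration: a contact is taken to count fully up to the cutoff and to decay linearly to zero over the next `switch_off` Angstrom. The function `contact_weight` is hypothetical and not part of the tool.

```python
def contact_weight(distance, cutoff, switch_off=None):
    """Weight of a contact at `distance` (all values in Angstrom)."""
    if switch_off is None or switch_off <= 0:
        # Crisp cutoff: a contact either counts fully or not at all
        return 1.0 if distance <= cutoff else 0.0
    if distance <= cutoff:
        return 1.0
    if distance >= cutoff + switch_off:
        return 0.0
    # Assumed linear decay from 1 at the cutoff to 0 at cutoff + switch_off
    return 1.0 - (distance - cutoff) / switch_off


crisp = contact_weight(5.0, 4.5)
switched = contact_weight(5.0, 4.5, switch_off=1.0)
print(crisp, switched)
```

With a switch-off, distances that fluctuate around the cutoff contribute partially instead of flipping between 0 and 1.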
- -at, --atomtypes
Add the atom-types to the frequency bars by ‘hatching’ them.
‘--’ is sidechain-sidechain, ‘|’ is backbone-backbone, ‘\’ is backbone-sidechain, ‘/’ is sidechain-backbone.
Default is False.
Default: False
- --naive_bonds, -nb
Build naive, linear bonds between protein residues of the same fragment if mdtraj can’t build them automatically. For more info check mdciao.utils.bonds.top2residue_bond_matrix_naive
Default: False
- --scheme
Type of scheme for computing distance between residues. Choices are {‘ca’, ‘closest’, ‘closest-heavy’, ‘sidechain’, ‘sidechain-heavy’}. See mdtraj documentation for more info
Default: “closest-heavy”