mdc_interface.py

Analyse interfaces between any two groups of residues using a distance cutoff. To help identify these two groups, the peptide chain in the input topology can be automatically broken down into fragments, which can then be used as input. The number of contacts shown depends on the parameters “ctc_control” and “min_freq”.

usage: mdc_interface.py [-h] [-fr FRAGMENTS [FRAGMENTS ...]] [-fg1 FRAG_IDXS_GROUP_1] [-fg2 FRAG_IDXS_GROUP_2] [--ctc_cutoff_Ang CTC_CUTOFF_ANG] [-cc CTC_CONTROL] [-mf MIN_FREQ]
                        [-ic INTERFACE_CUTOFF_ANG] [--cmap CMAP] [--n_nearest N_NEAREST] [--stride STRIDE] [--n_smooth_hw N_SMOOTH_HW] [-nt] [-st] [--n_jobs N_JOBS]
                        [--fragment_names FRAGMENT_NAMES] [-nf] [-GPCR GPCR_UNIPROT] [--save_nomenclature] [-GGN CGN_PDB] [--chunksize_in_frames CHUNKSIZE_IN_FRAMES] [-o OUTPUT_DESC]
                        [-od OUTPUT_DIR] [-gx GRAPHIC_EXT] [--graphic_dpi GRAPHIC_DPI] [--curve_color CURVE_COLOR] [--t_unit T_UNIT] [--background BACKGROUND] [-sa]
                        [--no-sort_by_av_ctcs] [--scheme SCHEME] [--no-flare] [--no-matrix] [--pop_N_ctcs] [-ni] [-t TITLE]
                        topology trajectories [trajectories ...]
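
As an illustration of the usage line above, the following sketch assembles one possible invocation and prints it instead of running it. The file names (topology.pdb, traj_1.xtc, traj_2.xtc) are placeholders, and the choice of fragments 0 and 1 is an assumption about how the topology happens to be fragmented; only flags listed on this page are used.

```python
import shlex

# Placeholder file names; replace them with your own topology and trajectories.
command = [
    "mdc_interface.py",
    "topology.pdb",              # positional: topology
    "traj_1.xtc", "traj_2.xtc",  # positional: one or more trajectories
    "-fg1", "0",                 # fragment indices of group_1 (assumed)
    "-fg2", "1",                 # fragment indices of group_2 (assumed)
    "--ctc_cutoff_Ang", "3.5",   # contact cutoff, the documented default
    "-ic", "35",                 # interface cutoff, the documented default
    "-o", "interface",           # descriptor for output files
]

# Print a copy-pasteable shell command rather than executing it.
print(shlex.join(command))
```

Passing -fg1 and -fg2 explicitly avoids the interactive prompt that is described for these options when more than two fragments are present.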

Positional Arguments

topology

Topology file

trajectories

trajectory file(s)

Named Arguments

-fr, --fragments

How to sub-divide the topology into fragments. Several options are possible, illustrated here with the example sequence: …-A27,Lig28,K29-…-W40,D45-…-W50,CYSP51,GDP52

  • ‘resSeq’

    breaks at jumps in resSeq entry: […A27,Lig28,K29,…,W40],[D45,…,W50,CYSP51,GDP52]

  • ‘resSeq+’

    breaks only at negative jumps in resSeq: […A27,Lig28,K29,…,W40,D45,…,W50,CYSP51,GDP52]

  • ‘bonds’

    breaks when AAs are not connected by bonds, ignores resSeq: […A27][Lig28],[K29,…,W40],[D45,…,W50],[CYSP51],[GDP52]. Notice that because the phosphorylated CYSP51 didn’t get a bond in the topology, it’s considered a ligand.

  • ‘resSeq_bonds’

    breaks at both resSeq jumps and missing bonds

  • ‘lig_resSeq+’

    Like resSeq+ but puts any non-AA residue into its own fragment: […A27][Lig28],[K29,…,W40],[D45,…,W50,CYSP51],[GDP52]

  • ‘chains’

    breaks into chains of the PDB file/entry

  • None or ‘None’

    all residues are in one fragment, fragment 0

  • ‘consensus’

    If any consensus nomenclature is provided, ask the user for definitions using consensus labels

  • 0-10,15,14 20,21,30-50 51 (example, advanced users only)

    Input arbitrary fragments via their residue serial indices (zero-indexed) using spaces as separators. Not recommended.

Note on ‘None’: placing all residues in one fragment (fragment 0) can be harmless, or potentially dangerous if residue labels are repeated.

If you are unsure about any of these options, run the command-line tool mdc_fragments.py on your topology file.

Default: [‘lig_resSeq+’]
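
The difference between ‘resSeq’ and ‘resSeq+’ can be sketched in a few lines. This is only an illustration of the heuristics as described above, not the tool's implementation: the helper split_by_resseq is hypothetical, and it assumes that a "jump" means any difference other than +1 between consecutive resSeq values.

```python
def split_by_resseq(resseqs, only_negative_jumps=False):
    """Hypothetical helper: group residue serial indices into fragments.

    only_negative_jumps=False mimics 'resSeq'  (break at any jump).
    only_negative_jumps=True  mimics 'resSeq+' (break only at negative jumps).
    Assumption: a jump is any resSeq difference other than +1.
    """
    fragments = []
    current = []
    for idx, resseq in enumerate(resseqs):
        if current:
            jump = resseq - resseqs[idx - 1]
            is_break = jump < 0 if only_negative_jumps else jump != 1
            if is_break:
                fragments.append(current)
                current = []
        current.append(idx)
    if current:
        fragments.append(current)
    return fragments


# resSeq values with a positive jump (30 -> 45) and a negative jump (46 -> 1)
resseqs = [27, 28, 29, 30, 45, 46, 1, 2]
print(split_by_resseq(resseqs))                            # 'resSeq'-like
print(split_by_resseq(resseqs, only_negative_jumps=True))  # 'resSeq+'-like
```

With these values the first call breaks at both jumps, while the second breaks only where the numbering restarts.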

-fg1, --frag_idxs_group_1

Indices of the fragments that belong to group_1, as comma-separated values or ranges, e.g. ‘1,3-4’. Defaults to None, which prompts the user for information, except when only two fragments are present, in which case it defaults to [0].

-fg2, --frag_idxs_group_2

Indices of the fragments that belong to group_2, as comma-separated values or ranges, e.g. ‘1,3-4’. Defaults to None, which prompts the user for information, except when only two fragments are present, in which case it defaults to [1].
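
To make the ‘1,3-4’ notation concrete, here is a minimal sketch of how such a string could be expanded into fragment indices. The function parse_index_ranges is hypothetical (it is not the tool's own parser), and it assumes that ranges are inclusive on both ends.

```python
def parse_index_ranges(text):
    """Hypothetical helper: expand '1,3-4' into [1, 3, 4].

    Assumption: ranges such as '3-4' include both end points.
    """
    indices = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if "-" in token:
            first, last = token.split("-")
            indices.extend(range(int(first), int(last) + 1))
        else:
            indices.append(int(token))
    return indices


print(parse_index_ranges("1,3-4"))
```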

--ctc_cutoff_Ang, -co

The cutoff distance between two residues for them to be considered in contact. Default is 3.5 Angstrom.

Default: 3.5

-cc, --ctc_control

Control the number of reported contacts. Can be an integer (keep the first n contacts) or a float representing a fraction [0,1] of the total number of contacts. Default is 50.

Default: 50

-mf, --min_freq

Do not show frequencies smaller than this. Default is 0.05. If you notice the output being truncated at values much larger than this, and suspect that some contacts are not being reported, increase the ‘ctc_control’ parameter.

Default: 0.05
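
The interplay between ‘ctc_control’ and ‘min_freq’ can be sketched as two successive filters over the contact frequencies. The function select_contacts below is hypothetical and only one possible reading of the descriptions above. In particular, it assumes that a float ‘ctc_control’ means "keep the most frequent contacts until their summed frequency reaches that fraction of the total summed frequency".

```python
def select_contacts(freqs, ctc_control=50, min_freq=0.05):
    """Hypothetical helper: return indices of the contacts that would be shown.

    Contacts are ranked by decreasing frequency, truncated according to
    ctc_control, and then filtered by min_freq.
    """
    order = sorted(range(len(freqs)), key=lambda i: freqs[i], reverse=True)
    if isinstance(ctc_control, int):
        # Integer: keep the first n contacts.
        kept = order[:ctc_control]
    else:
        # Float (assumed meaning): keep contacts until their cumulative
        # frequency reaches this fraction of the total.
        target = ctc_control * sum(freqs)
        kept, running = [], 0.0
        for i in order:
            if running >= target:
                break
            kept.append(i)
            running += freqs[i]
    # Finally, hide anything rarer than min_freq.
    return [i for i in kept if freqs[i] >= min_freq]


freqs = [0.25, 1.0, 0.01, 0.5, 0.25]
print(select_contacts(freqs, ctc_control=3))  # truncated by ctc_control
print(select_contacts(freqs, ctc_control=5))  # truncated by min_freq
```

In the first call the list stops at a frequency well above ‘min_freq’, which is the situation where increasing ‘ctc_control’ reveals more contacts.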

-ic, --interface_cutoff_Ang

The interface between both groups is defined as the set of group_1-group_2 distances that are within this cutoff in the reference topology. This avoids computing a large number of unnecessary distances (e.g. between N-terminus and G-protein). Default is 35. You can pass ‘0’ to have no cutoff at all (include all possible interface contacts).

Default: 35
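
The pre-selection described above can be sketched as follows. This is a simplified illustration, not the tool's code: interface_pairs is a hypothetical helper, and each residue is reduced to a single representative coordinate (in Angstrom), whereas the tool computes residue-residue distances according to ‘--scheme’.

```python
import math
from itertools import product


def interface_pairs(coords, group_1, group_2, cutoff_ang=35.0):
    """Hypothetical helper: pick the group_1/group_2 residue pairs to monitor.

    coords maps a residue index to one (x, y, z) point of the reference
    structure. A cutoff of 0 means "no cutoff": keep every possible pair.
    """
    pairs = []
    for i, j in product(group_1, group_2):
        if cutoff_ang == 0 or math.dist(coords[i], coords[j]) <= cutoff_ang:
            pairs.append((i, j))
    return pairs


coords = {0: (0, 0, 0), 1: (10, 0, 0), 2: (50, 0, 0), 3: (0, 30, 0)}
print(interface_pairs(coords, [0, 1], [2, 3]))                # default cutoff
print(interface_pairs(coords, [0, 1], [2, 3], cutoff_ang=0))  # no cutoff
```

Only the pairs returned here would then be tracked along the trajectories, which is what keeps the number of computed distances manageable.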

--cmap

The colormap for the contact matrix. Default is ‘binary’, which is black and white, but you can choose anything from here: https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html

Default: “binary”

--n_nearest, -nn

Ignore this many nearest neighbors when computing neighbor lists. ‘Near’ means ‘connected by this many bonds’. Default is 0.

Default: 0
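
One way to picture "connected by this many bonds" is a breadth-first search over the bond graph. The sketch below is an assumption-laden illustration: within_n_bonds is a hypothetical helper, and bonds are given between residues rather than atoms.

```python
from collections import deque


def within_n_bonds(bonds, start, n_nearest):
    """Hypothetical helper: residues reachable from `start` in at most
    n_nearest bonds, i.e. the neighbors that would be ignored."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    seen = {start: 0}  # residue -> number of bonds away from start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == n_nearest:
            continue  # do not expand beyond the requested bond count
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return sorted(r for r in seen if r != start)


# A linear chain of five residues: 0-1-2-3-4
bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(within_n_bonds(bonds, start=2, n_nearest=1))
print(within_n_bonds(bonds, start=2, n_nearest=0))
```

With the default of 0 nothing is excluded, so even directly bonded residues can be reported as contacts.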

--stride

Stride down the input trajectory files by this factor. Default is 1.

Default: 1

--n_smooth_hw, -ns

Number of frames in one half of the averaging window for the time-traces. Default is 0, which means no averaging.

Default: 0
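
A half-window of n frames suggests a full window of 2n+1 frames centered on each frame. The sketch below illustrates that reading with a plain moving average; window_average is a hypothetical helper, and the choice to return only fully covered frames (so the smoothed trace is shorter than the input) is an assumption.

```python
def window_average(values, half_window):
    """Hypothetical helper: moving average over 2 * half_window + 1 frames.

    Assumption: only frames with a complete window are returned, so the
    output has len(values) - 2 * half_window entries.
    """
    if half_window == 0:
        return list(values)  # no averaging
    width = 2 * half_window + 1
    return [
        sum(values[i:i + width]) / width
        for i in range(len(values) - width + 1)
    ]


trace = [1, 2, 3, 4, 5]
print(window_average(trace, half_window=1))
print(window_average(trace, half_window=0))
```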

-nt, --no-time-trace

Don’t plot the time-traces of the contacts. Default is to plot them.

Default: True

-st, --save-trajs

Save trajectory data, default is not to save it.

Default: False

--n_jobs

Number of processors to use. The parallelization is done over trajectories and not over contacts, so setting n_jobs larger than the number of trajectories has no additional effect.

Default: 1

--fragment_names

Name of the fragments. Leave empty if you want them automatically named. Otherwise, give a quoted list of strings separated by commas, e.g. ‘TM1, TM2, TM3,’

Default: “”

-nf, --no-fragments

Do not use fragments. Default is to use them

Default: True

-GPCR, --GPCR_uniprot

Look for Ballesteros-Weinstein definitions in the GPCRdb using a uniprot code, e.g. adrb2_human. See https://gpcrdb.org/services/ for more details. Default is None.

Default: “None”

--save_nomenclature

Save available nomenclature definitions to disk so that they can be accessed locally in later uses. Default is False

Default: False

-GGN, --CGN_PDB

PDB code for a consensus G-protein nomenclature

Default: “None”

--chunksize_in_frames

Trajectories are read in chunks of this size. Helps with big files and/or large number of contacts when you run into memory problems. Default is 10000

Default: 10000

-o, --output_desc

Descriptor for output files. Default is interface

Default: “interface”

-od, --output_dir

directory to which the results are written. Default is ‘.’

Default: “.”

-gx, --graphic_ext

Extension of the output graphics, default is .pdf

Default: “.pdf”

--graphic_dpi

Dots per Inch (DPI) of the graphic output. Only has an effect for bitmap outputs. Default is 150.

Default: 150

--curve_color

Type of color used for the curves. Default is auto. Alternatives are ‘P’ or ‘H’

Default: “auto”

--t_unit

Unit used for the temporal axis, default is ns.

Default: “ns”

--background, -bg

Type of background when using smoothing windows. Default (True) is to use the unsmoothed curve’s color. A color string e.g. ‘g’ or ‘red’ or ‘gray’ also works, as does an RGB string ‘0.5, 1., 0.5’. Use False for no color.

Default: True

-sa, --short_AAs

Use one-letter amino acid names when possible, e.g. K145 instead of Lys145. Default is False

Default: False

--no-sort_by_av_ctcs

When presenting the results summarized by residue, don’t sort by sum of frequencies (~average number of contacts), but by ascending order within each interface member. Default is to sort them by frequencies.

Default: True

--scheme

Type of scheme for computing distance between residues. Choices are {‘ca’, ‘closest’, ‘closest-heavy’, ‘sidechain’, ‘sidechain-heavy’}. See mdtraj documentation for more info

Default: “closest-heavy”

--no-flare

Do not produce a flare plot of the interface contact matrix. If produced, the flare plot is always written in .pdf format regardless of ‘--graphic_ext’, unless ‘--graphic_ext’ is ‘svg’.

Default: True

--no-matrix

Do not produce a plot of the interface contact matrix

Default: True

--pop_N_ctcs

Separate the plot with the total number of contacts from the time-trace plot. Default is False

Default: False

-ni, -no-interactive

Try not to be interactive. This can make wrong choices on the user’s behalf; for advanced users only.

Default: False

-t, --title

Name of the system. Used for figure titles (not filenames). Defaults to --output_desc if None is given.