mdciao.cli.residue_selection

mdciao.cli.residue_selection(expression, top, GPCR_uniprot=None, CGN_PDB=None, save_nomenclature_files=False, accept_guess=False, fragments=None)

Find residues in an input topology using Unix filename pattern matching, as in the Unix ‘ls’ command.
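
To make the idea of Unix filename pattern matching on residues concrete, here is a minimal sketch that uses only Python's standard fnmatch module. It is not mdciao's implementation: the residue labels and the helper match_expression are hypothetical, and the sketch handles only comma-separated wildcard patterns, not numeric ranges or consensus descriptors.

```python
from fnmatch import fnmatchcase

# Hypothetical residue labels (name + sequence number); not read from a real topology
residues = ["GLU30", "PHE31", "PHE32", "LEU40", "GLU122", "ALA380", "TRP394"]


def match_expression(expression, labels):
    """Return the zero-based indices of the labels matched by any
    comma-separated Unix-style pattern in `expression`."""
    patterns = [p.strip() for p in expression.split(",") if p.strip()]
    return [
        idx
        for idx, label in enumerate(labels)
        if any(fnmatchcase(label, pattern) for pattern in patterns)
    ]


hits = match_expression("GLU*,PH*", residues)
print(hits)  # [0, 1, 2, 4]
```

The returned values are positions in the list, which mirrors how the function reports residue indices rather than residue names.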

Parameters
  • expression (str) – Unix-like expressions and ranges are allowed, e.g. ‘GLU,PH*,380-394,3.50,GH.5*.’, as are consensus descriptors if consensus labels are provided

  • top (str, Trajectory, or Topology) – The topology to use.

  • GPCR_uniprot (str or mdciao.nomenclature.LabelerGPCR, default is None) – For GPCR nomenclature. If str, e.g. “adrb2_human”, try to locate a local file or do a web lookup in the GPCRdb. If mdciao.nomenclature.LabelerGPCR, use this object directly (allows for object re-use when in API mode). See mdciao.nomenclature for more info and references. Please note the difference between UniProt Accession Code and UniProt entry name as explained here.

  • CGN_PDB (str or mdciao.nomenclature.LabelerCGN, default is None) – For CGN (G-alpha Numbering definitions) nomenclature. If str, e.g. “3SN6”, try to locate local filenames (“3SN6.pdb”, “CGN_3SN6.txt”) or do web lookups in https://www.mrc-lmb.cam.ac.uk/CGN/ and http://www.rcsb.org/. If mdciao.nomenclature.LabelerCGN, use this object directly (allows for object re-use when in API mode). See mdciao.nomenclature for more info and references.

  • save_nomenclature_files (bool, default is False) – Save available nomenclature definitions to disk so that they can be accessed locally in later uses.

  • accept_guess (bool, default is False) – Accept mdciao’s guesses regarding fragment identification using nomenclature labels

  • fragments (list, default is None) –

    Fragment control.

    • None: use the default of get_fragments, currently ‘lig_resSeq+’.

    • [“consensus”]: use things like “TM*” or “G.H*”, i.e. GPCR or CGN-sub-subunit labels.

    • List of len 1 with some fragmentation heuristic, e.g. [“lig_resSeq+”] will use the default of mdciao.fragments.get_fragments. See there for info on defaults and other heuristics.

    • List of len N that can mix different possibilities:

      • iterables of integers (lists or np.arrays, e.g. np.arange(20,30))

      • ranges expressed as integer strings, e.g. “20-30”

      • ranges expressed as residue descriptors, e.g. [“GLU30-LEU40”]

    Numeric expressions are interpreted as zero-indexed, unique residue serial indices, i.e. “30-40” does not necessarily equate to “GLU30-LEU40” unless serial and sequence indices coincide. If there is more than one “GLU30”, the user is asked to disambiguate. The resulting fragments need not cover the whole topology; they only need to not overlap.
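
The difference between serial indices and residue descriptors, and the no-overlap requirement on fragments, can be sketched with plain Python. This is an illustration only, not mdciao code: the toy topology and the helpers numeric_range, descriptor_range and fragments_overlap are hypothetical. It also assumes that ranges include both endpoints and that every label is unique, so no disambiguation is needed.

```python
# Hypothetical 20-residue topology whose sequence numbering starts at 25,
# so serial index 0 carries the sequence number 25.
names = ["ALA", "GLY", "SER", "THR", "VAL", "GLU", "PHE", "TRP", "TYR", "HIS",
         "ILE", "MET", "ASN", "PRO", "GLN", "LEU", "ARG", "LYS", "ASP", "CYS"]
first_seq = 25
labels = [f"{name}{first_seq + i}" for i, name in enumerate(names)]


def numeric_range(expr):
    """'5-15' -> zero-based serial indices 5..15 (endpoints assumed inclusive)."""
    start, stop = (int(x) for x in expr.split("-"))
    return list(range(start, stop + 1))


def descriptor_range(expr, labels):
    """'GLU30-LEU40' -> serial indices spanned by the two (unique) labels."""
    first, last = expr.split("-")
    return list(range(labels.index(first), labels.index(last) + 1))


def fragments_overlap(fragments):
    """True if any serial index appears in more than one fragment."""
    seen = set()
    for frag in fragments:
        if seen.intersection(frag):
            return True
        seen.update(frag)
    return False


by_descriptor = descriptor_range("GLU30-LEU40", labels)
print(by_descriptor == numeric_range("5-15"))   # True: serial indices 5..15
print(by_descriptor == numeric_range("30-40"))  # False: "30-40" means serial 30..40
print(fragments_overlap([numeric_range("0-4"), by_descriptor]))  # False
```

In this toy topology “GLU30-LEU40” corresponds to the numeric expression “5-15”, because the sequence numbering is offset by 25 from the serial indices.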

Returns

  • res_idxs_list (np.ndarray) – The residue indices of the residues that match the expression

  • frags (list of integers) – Whatever fragments the user chose

  • consensus_maps (dict) – Keys are currently just ‘GPCR’ and ‘CGN’. Values are lists of len topology.n_residues with the consensus labels. All labels will be None if no consensus info was provided.