mdciao.fragments.get_fragments

mdciao.fragments.get_fragments(top, method='lig_resSeq+', fragment_breaker_fullresname=None, atoms=False, verbose=True, join_fragments=None, maxjump=500, salt=['Na+', 'Cl-', 'Na', 'Cl'], water=True, **kwargs_residues_from_descriptors)

Group residues of a molecular topology into fragments using different methods.

Water and ions get their own fragments by default, except with the methods None, chains, and any method involving bonds.

Parameters
  • top (Topology or str) – When str, path to filename

  • method (str, default is 'lig_resSeq+') –

    The method used as the basis for creating fragments. The options below are illustrated with the example sequence:

    ”…-A27,Lig28,K29-…-W40,D45-…-W50,CYSP51,GDP52”

    • ’resSeq’

      breaks at jumps in resSeq entry:

      […A27,Lig28,K29,…,W40],[D45,…,W50,CYSP51,GDP52]

    • ’resSeq+’

      breaks only at negative jumps in resSeq:

      […A27,Lig28,K29,…,W40,D45,…,W50,CYSP51,GDP52]

    • ’bonds’

      breaks when residues are not connected by bonds, ignores resSeq:

      […A27],[Lig28],[K29,…,W40],[D45,…,W50],[CYSP51],[GDP52]

      Note that because the phosphorylated CYSP51 didn’t get a bond in the topology, it is considered a ligand.

    • ’resSeq_bonds’

      breaks at resSeq jumps and at missing bonds

    • ’lig_resSeq+’

      Like resSeq+, but puts any non-AA residue into its own fragment:

      […A27],[Lig28],[K29,…,W40],[D45,…,W50,CYSP51],[GDP52]

      Also check maxjump.

    • ’chains’

      breaks into chains of the PDB file/entry

    • None or ‘None’

      all residues are in one fragment, fragment 0

  • fragment_breaker_fullresname (list) – List of full residue names, e.g. [GLU30], at which fragments will be broken, so that [R1, R2, … GLU30,…R10, R11] becomes [R1, R2, …], [GLU30,…,R10,R11]

  • atoms (boolean, optional) – Instead of returning residue indices, return atom indices

  • join_fragments (list of lists) – After getting the fragments with method, join these fragments again. The use case is hard cases where no method gets it right and some post-processing is needed. Duplicate entries in any inner list will be removed. A fragment idx cannot appear in more than one inner list; otherwise an exception is thrown

  • verbose (boolean, optional) – Be verbose

  • salt (list, default is ['Na+', 'Cl-', 'Na', 'Cl']) – Residues that match these residue names and have only one atom will be put together in the last fragment. Use salt = [] to deactivate. Doesn’t apply to the methods None, chains, or any method involving bonds

  • water (bool, default is True) – Put water in its own fragment. Doesn’t apply to the methods None, chains, or any method involving bonds

  • maxjump (int or None, default is 500) – The maximum allowed positive sequence-jump in the ‘resSeq+’ methods, e.g. don’t join ALA500 with GLU551 even though the jump in sequence is positive. None means no limit for positive jumps

  • kwargs_residues_from_descriptors (optional) – additional arguments, see residues_from_descriptors
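
The difference between the ’resSeq’ and ’resSeq+’ breaking rules, and the effect of maxjump, can be illustrated with a small pure-Python sketch. This is not mdciao's implementation: the function split_by_resSeq and its argument only_negative_jumps are hypothetical names introduced here, and the handling of edge cases (repeated resSeq values, insertion codes) is an assumption. It only operates on a plain list of resSeq numbers, not on a Topology.

```python
def split_by_resSeq(resSeqs, only_negative_jumps=False, maxjump=500):
    """Split residue indices into fragments at jumps in the resSeq numbering.

    Hypothetical helper for illustration only, not part of mdciao.
    """
    fragments = [[0]] if resSeqs else []
    for idx in range(1, len(resSeqs)):
        jump = resSeqs[idx] - resSeqs[idx - 1]
        if only_negative_jumps:
            # 'resSeq+'-like rule: break at negative jumps, and at
            # positive jumps larger than maxjump (None means no limit)
            is_break = jump < 0 or (maxjump is not None and jump > maxjump)
        else:
            # 'resSeq'-like rule (assumed here): break whenever the
            # numbering does not continue with the next integer
            is_break = jump != 1
        if is_break:
            fragments.append([])
        fragments[-1].append(idx)
    return fragments


# resSeq numbers of the example sequence: 27..40, then 45..52
example = list(range(27, 41)) + list(range(45, 53))

by_resSeq = split_by_resSeq(example)                                 # two fragments
by_resSeq_plus = split_by_resSeq(example, only_negative_jumps=True)  # one fragment
```

With these rules, the gap between W40 and D45 splits the sequence under the ’resSeq’-like rule but not under the ’resSeq+’-like rule, where only a negative jump or a positive jump above maxjump starts a new fragment.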

Returns

Each array in the list holds the residue indices (or atom indices, if atoms is True) of one fragment. The fragments do not overlap, and their union contains all indices.

Return type

List of integer arrays
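
The post-processing described for join_fragments, together with the properties stated for the return value, can be sketched as follows. This is an illustration under stated assumptions, not mdciao's code: join_fragment_lists is a hypothetical name, plain lists stand in for the integer arrays, the choice of ValueError as the exception type is an assumption, and so is placing the merged fragment at the position of its lowest-numbered member.

```python
def join_fragment_lists(fragments, join_fragments):
    """Merge the fragments listed together in each inner list of join_fragments.

    Hypothetical helper for illustration only, not part of mdciao.
    """
    # Duplicate entries within an inner list are removed
    groups = [sorted(set(group)) for group in join_fragments]

    # A fragment index may belong to at most one inner list
    owner = {}
    for group in groups:
        for frag_idx in group:
            if frag_idx in owner:
                raise ValueError(
                    f"fragment {frag_idx} appears in more than one inner list"
                )
            owner[frag_idx] = group

    joined = []
    for frag_idx, fragment in enumerate(fragments):
        group = owner.get(frag_idx)
        if group is None:
            # Fragment not mentioned in join_fragments: keep as is
            joined.append(list(fragment))
        elif frag_idx == group[0]:
            # Assumed placement: the merged fragment takes the position
            # of its lowest-numbered member
            merged = [res for member in group for res in fragments[member]]
            joined.append(sorted(merged))
    return joined


fragments = [[0, 1], [2, 3], [4], [5, 6]]
joined = join_fragment_lists(fragments, [[0, 2, 2]])

# The stated properties of the return value still hold after joining:
# no index appears twice, and the union covers all indices
flat = sorted(idx for fragment in joined for idx in fragment)
```

In this sketch, joining fragments 0 and 2 leaves three fragments, and the flattened result still contains every residue index exactly once.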