mdciao.utils.residue_and_atom
Deal with residues, atoms, and their names, mostly.
The function residues_from_descriptors is probably the most elaborate and highest-level one.
Functions
AAtype | Residue types, optionally color coded
atom_type | Return a string BB or SC for backbone or sidechain atom.
find_AA | Residue matching with UNIX-shell patterns
find_CA | Return the CA atom (or something equivalent) for this residue
get_SS | Try to guess what type of input for secondary-structure computation the user wants, and compute it
int_from_AA_code | Returns the integer part from a residue name, None if there isn’t one
name_from_AA | Return the residue name from a string
parse_and_list_AAs_input | Helper method to print information regarding AA descriptors
rangeexpand_residues2residxs | Generalized range-expander from residue descriptors.
residue_line | Return a string that describes the residue
residues_from_descriptors | Returns residue idxs based on a list of residue descriptors.
shorten_AA | Return the short name of an AA, e.g. TRP30 to W
top2lsd | Return a list of per-residue attributes as dictionaries
-
mdciao.utils.residue_and_atom.AAtype(res, return_color=False, typecolors={'NA': 'purple', 'hydrophobic': 'gray', 'negative': 'red', 'polar': 'green', 'positive': 'blue', 'special': 'gray'})
Residue types, optionally color coded
The types are:
* “positive”: “ARG HIS LYS”
* “negative”: “ASP GLU”
* “polar”: “SER THR ASN GLN”
* “special”: “CYS GLY PRO”
* “hydrophobic”: “ALA ILE LEU MET PHE TRP TYR VAL”
- Parameters
res (str or Residue) –
return_color (bool, default is False) – Return the color associated with the type (positive: blue, negative: red, etc.) rather than the type itself
typecolors (dict) – The map of types to colors
- Returns
rtype – Either the type or the color
- Return type
str
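The classification described above can be sketched in plain Python. This is an illustrative re-implementation, not mdciao's source: the helper name aa_type_sketch is hypothetical, it accepts only three-letter residue names (no Residue objects), and labelling unlisted residues as "NA" is an assumption suggested by the 'NA' key in typecolors.

```python
# Illustrative sketch only, NOT mdciao's implementation.
# The type table and default colors are copied from the documentation above.
AA_TYPES = {
    "positive": "ARG HIS LYS",
    "negative": "ASP GLU",
    "polar": "SER THR ASN GLN",
    "special": "CYS GLY PRO",
    "hydrophobic": "ALA ILE LEU MET PHE TRP TYR VAL",
}
TYPE_COLORS = {"NA": "purple", "hydrophobic": "gray", "negative": "red",
               "polar": "green", "positive": "blue", "special": "gray"}


def aa_type_sketch(resname, return_color=False, typecolors=TYPE_COLORS):
    """Classify a three-letter residue name (hypothetical helper)."""
    # Assumption: anything not listed in AA_TYPES is labelled "NA".
    rtype = "NA"
    for key, names in AA_TYPES.items():
        if resname in names.split():
            rtype = key
            break
    # Return either the type itself or the color mapped to it.
    return typecolors[rtype] if return_color else rtype
```

With this sketch, aa_type_sketch("ARG") gives "positive" and aa_type_sketch("ARG", return_color=True) gives "blue".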
-
mdciao.utils.residue_and_atom.atom_type(aa, no_BB_no_SC='X')
Return a string BB or SC for backbone or sidechain atom.
- Parameters
aa (mdtraj.core.topology.Atom object) –
no_BB_no_SC (str, default is X) – Return this string if aa is neither BB nor SC
- Returns
aatype
- Return type
str
-
mdciao.utils.residue_and_atom.find_AA(AA_pattern, top, extra_columns=None, return_df=False)
Residue matching with UNIX-shell patterns
Similar to the shell command “ls”, using POSIX-style wildcards as shown in the examples or here: https://docs.python.org/3/library/fnmatch.html
Any other attribute that’s passed as extra_columns will be matched as explained below, e.g. “3.50” to get one residue in the GPCR-nomenclature or “3.*” to get the whole TM-helix 3.
The examples use ‘*’ as wildcard, but ‘?’ (as in ‘ls’) also works.
Examples
‘PRO’ : returns all PROs, matching via the attribute “name”
‘P’ : returns all PROs, matching via the attribute “code”
‘P*’ : returns all PROs,PHEs and any other residue that starts with “P”, either in “name” or in “code”
‘PRO39’ : returns PRO39, matching via full residue name (long)
‘P39’ : returns PRO39, matching via full residue name (short)
‘PRO3*’ : returns all PROs with sequence indices that start with 3, e.g. ‘PRO39, PRO323, PRO330’ etc
‘3’ : returns all residues with sequence index 3
‘3*’ : returns all residues with sequence indices that start with 3
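The matching rules in these examples can be reproduced with the standard-library fnmatch module. The sketch below is a simplified illustration, not mdciao's code: find_aa_sketch and the toy RESIDUES list are hypothetical, and it is an assumption that a pattern is tested against the name, the code, both full names, and the sequence index in turn.

```python
from fnmatch import fnmatchcase

# Toy residues as (name, code, resSeq); a real topology would supply these.
RESIDUES = [
    ("GLU", "E", 3),
    ("ALA", "A", 30),
    ("PRO", "P", 39),
    ("PHE", "F", 40),
    ("PRO", "P", 323),
]


def find_aa_sketch(pattern, residues=RESIDUES):
    """Return the positions in `residues` whose attributes match `pattern`."""
    hits = []
    for idx, (name, code, res_seq) in enumerate(residues):
        candidates = (
            name,                # "PRO"
            code,                # "P"
            f"{name}{res_seq}",  # "PRO39", full residue name (long)
            f"{code}{res_seq}",  # "P39", full residue name (short)
            str(res_seq),        # "39", sequence index alone
        )
        # A residue is a hit if any of its attributes matches the pattern.
        if any(fnmatchcase(cand, pattern) for cand in candidates):
            hits.append(idx)
    return hits
```

On the toy list, find_aa_sketch("P*") returns [2, 3, 4] (both PROs and the PHE), while find_aa_sketch("3") returns only [0], the residue with sequence index 3.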
-
mdciao.utils.residue_and_atom.find_CA(res, CA_name='CA', CA_dict=None)
Return the CA atom (or something equivalent) for this residue
- Parameters
res (mdtraj.Residue object) –
CA_name (str, default is "CA") – The name by which you identify the CA. This overrules anything that’s passed in the CA_dict, i.e. if the residue you are passing has both an atom “CA” and an entry in the CA_dict, the “CA” atom will be returned.
CA_dict (dict, default is None) – You can provide a dictionary keyed with residue names and valued with strings that identify a “CA”-equivalent atom (e.g. in ligands). If None, the default _CA_rules are used: _CA_rules = {"GDP": "C1", "P0G": "C12"}
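The precedence rule (CA_name wins over CA_dict) can be illustrated without mdtraj. In this sketch the residue is reduced to a name plus a list of atom names and the function returns an atom name rather than an atom object; find_ca_sketch is a hypothetical helper, and raising a ValueError when nothing matches is an assumption, not documented behavior.

```python
# Default rules quoted from the documentation above.
_CA_RULES = {"GDP": "C1", "P0G": "C12"}


def find_ca_sketch(resname, atom_names, CA_name="CA", CA_dict=None):
    """Pick the name of the CA (or CA-equivalent) atom from `atom_names`."""
    if CA_dict is None:
        CA_dict = _CA_RULES
    # CA_name takes precedence over any entry in CA_dict.
    if CA_name in atom_names:
        return CA_name
    # Otherwise fall back to the per-residue rule, if one exists.
    equivalent = CA_dict.get(resname)
    if equivalent in atom_names:
        return equivalent
    # Assumption: the sketch raises when neither lookup succeeds.
    raise ValueError(f"No CA or CA-equivalent atom found for {resname}")
```

For a ligand such as GDP with atoms ["C1", "N9"], the sketch returns "C1"; if the same residue also carried an atom named "CA", that atom would be returned instead.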
-
mdciao.utils.residue_and_atom.get_SS(SS, top=None)
Try to guess what type of input for secondary-structure computation the user wants, and compute it
- Parameters
SS (secondary structure information) –
Can be many things:
* triple of ints (CP_idx, traj_idx, frame_idx): Nothing happens, the tuple is returned as is and handled externally by the ContactGroup that called this method. The tuple identifies a ContactPair, a trajectory, and a frame. See the docs there for more info.
* True: same as [0,0,0]
* None or False: Do nothing
* mdtraj.Trajectory: Use this geometry to compute the SS
* string: Path to a file, of which only the first frame will be read. The SS will be computed from there. Reading is first attempted without topology information (e.g. .pdb, .gro, .h5 will work), and when this fails, the top will be passed (e.g. .xtc, .dcd)
* array_like: Use the SS from here, s.t. ss_inf[idx] gives the SS-info for the residue with that idx
top (Topology, default is None) –
- Returns
from_tuple (bool) – Whether the information should be taken from a tuple or not
ss_array (np.ndarray or None)
-
mdciao.utils.residue_and_atom.int_from_AA_code(key)
Returns the integer part from a residue name, None if there isn’t one
- Parameters
key (string) – Residue name passed as a string, example “GLU30”
- Returns
Integer part of the residue id, e.g. 30 if the input is “GLU30”
- Return type
int
-
mdciao.utils.residue_and_atom.name_from_AA(key)
Return the residue name from a string
- Parameters
key (string or mdtraj.Topology.Residue object) – Residue name passed as a string, example “GLU30”, or as a residue object
- Returns
name – Name of the residue, like “GLU” for “GLU30” or “E” for “E30”
- Return type
str
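Splitting a residue string into its name and its integer part, as int_from_AA_code and name_from_AA do, can be sketched with regular expressions. The two helpers below are hypothetical and handle strings only; treating the integer part as the trailing run of digits is an assumption that may differ from mdciao's behavior for names with embedded digits.

```python
import re


def int_from_code_sketch(key):
    """Return the trailing integer of a residue string, or None if absent."""
    match = re.search(r"(\d+)$", key)
    return int(match.group(1)) if match else None


def name_from_code_sketch(key):
    """Return the residue string with its trailing integer removed."""
    return re.sub(r"\d+$", "", key)
```

For "GLU30" these return 30 and "GLU"; for "E30" the name is "E", and for "GLU" the integer part is None.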
-
mdciao.utils.residue_and_atom.parse_and_list_AAs_input(AAs, top, map_conlab=None)
Helper method to print information regarding AA descriptors
-
mdciao.utils.residue_and_atom.rangeexpand_residues2residxs(range_as_str, fragments, top, interpret_as_res_idxs=False, sort=False, **residues_from_descriptors_kwargs)
Generalized range-expander from residue descriptors.
Residue descriptors can be anything that find_AA understands. Expanding a range means getting “2-5,7” as input and returning “2,3,4,5,7”.
To disambiguate descriptors, a fragment definition and a topology are needed.
Note
The input (= compressed range) is very flexible and accepts mixed descriptors and wildcards, e.g. GLU*,ARG*,GDP*,LEU394,380-385 is a valid range.
Wildcards use the full resnames, i.e. E* is NOT equivalent to GLU*
Be aware, though, that wildcards are very powerful and easily “grab” a lot of residues, leading to long calculations and large outputs.
See find_AA for more on residue descriptors.
- Parameters
range_as_str (string, int or iterable of ints) –
fragments (list of iterable of residue indices) –
top (Topology object) –
interpret_as_res_idxs (bool, default is False) – If True, indices without residue names (“380-385”) will be interpreted as residue indices, not residue sequential indices
sort (bool) – sort the expanded range on return
residues_from_descriptors_kwargs – Optional parameters for residues_from_descriptors
- Returns
residxs_out – list of unique residue indices
- Return type
list
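The purely numeric part of the range expansion ("2-5,7" becomes 2,3,4,5,7) can be sketched as follows. This covers integers only: residue names, wildcards, fragments, and the topology lookup that the real function performs are left out, and expand_int_ranges is a hypothetical helper that does not handle negative numbers.

```python
def expand_int_ranges(range_as_str):
    """Expand e.g. "2-5,7" into [2, 3, 4, 5, 7] (non-negative integers only)."""
    expanded = []
    for chunk in range_as_str.split(","):
        chunk = chunk.strip()
        if "-" in chunk:
            # "2-5" is an inclusive range on both ends.
            start, stop = (int(x) for x in chunk.split("-"))
            expanded.extend(range(start, stop + 1))
        else:
            expanded.append(int(chunk))
    # Keep first occurrences only, so the output has unique entries.
    return list(dict.fromkeys(expanded))
```

Here expand_int_ranges("2-5,7") returns [2, 3, 4, 5, 7], and overlapping input such as "1-3,2" still yields unique indices, [1, 2, 3].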
-
mdciao.utils.residue_and_atom.residue_line(item_desc, residue, frag_idx, consensus_maps=None, fragment_names=None, table=False)
Return a string that describes the residue
Can be used just to inform or to help disambiguate:
0.0) GLU10 in fragment 0 with residue index 6 (CGN: G.HN.27)
…
1.0) GLU10 in fragment 1 with residue index 363
- Parameters
item_desc (str) – Description for the item of the list, “1.0” or “3.2”
residue (Residue) –
frag_idx (int) – Fragment index
fragment_names (list, default is None) – Fragment names
consensus_maps (dict of indexables, default is None) – Dictionary of dictionaries. Lower-level dicts are keyed with residue indices and valued with additional residue names. Higher-level keys can be whatever. Use case is e.g. if “R131” needs to be disambiguated because it pops up in many fragments. You can pass {"GPCR": {895: "3.50", …}} here and that label will be displayed next to the residue.
table (bool, default is False) – Assume a header has already been printed out and print the line with the inline tags
- Returns
istr – An informative string about this residue, that can be used to dis-ambiguate via the unique item descriptor, e.g: 3.1) GLU122 in fragment 3 with residue index 852 (: 3.41)
- Return type
str
-
mdciao.utils.residue_and_atom.residues_from_descriptors(residue_descriptors, fragments, top, pick_this_fragment_by_default=None, fragment_names=None, additional_resnaming_dicts=None, extra_string_info='', just_inform=False)
Returns residue idxs based on a list of residue descriptors.
Fragments are needed to better identify residues. If a residue is present in multiple fragments, the user can disambiguate or pick all residue idxs matching the residue_descriptor.
Because of this (one descriptor can match more than one residue) the return values are not necessarily of the same length as residue_descriptors.
- Parameters
residue_descriptors (string or list of strings) – AAs of the form of “GLU30” or “E30” or 30, can be mixed
fragments (iterable of iterables of integers) – The integers in the iterables of ‘fragments’ represent residue indices of that fragment
top (Topology) –
pick_this_fragment_by_default (None or integer) – Pick this fragment without asking in case of ambiguity. If None, the user will be prompted
fragment_names – list of strings providing informative names for the input fragments
additional_resnaming_dicts (dict of dicts, default is None) – Dictionary of dictionaries. Lower-level dicts are keyed with residue indices and valued with additional residue names. Higher-level keys can be whatever. Use case is e.g. if “R131” needs to be disambiguated because it pops up in many fragments. You can pass {"GPCR": {895: "3.50", …}} here and that label will be displayed next to the residue. mdciao.cli methods use this.
just_inform (bool, default is False) – Just inform about the AAs, don’t ask for a selection
extra_string_info (string) – Any additional info to be printed in case of ambiguity
- Returns
residxs (list) – lists of integers that have been selected
fragidxs (list) – The list of fragments where the residues are
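Why the return values can be longer than the input is easiest to see on a toy system. The sketch below is a strongly simplified stand-in, not mdciao's algorithm: residues_from_descriptors_sketch, NAMES, and FRAGMENTS are hypothetical, only exact full names are matched, and where the real function would prompt the user the sketch simply keeps all matches.

```python
# Toy system: residue index -> full residue name. "GLU10" appears twice,
# once per fragment, as in a dimer where both chains share numbering.
NAMES = {0: "GLU10", 1: "ALA11", 2: "GLU10", 3: "LYS12"}
FRAGMENTS = [[0, 1], [2, 3]]


def residues_from_descriptors_sketch(descriptors, fragments, names,
                                     pick_this_fragment_by_default=None):
    """Return (residxs, fragidxs) for exact full-name descriptors."""
    residxs, fragidxs = [], []
    for desc in descriptors:
        # Collect every (residue index, fragment index) matching the descriptor.
        hits = [(ridx, fidx)
                for fidx, frag in enumerate(fragments)
                for ridx in frag if names[ridx] == desc]
        # Ambiguity: resolve it with the default fragment, if one was given.
        if len(hits) > 1 and pick_this_fragment_by_default is not None:
            hits = [h for h in hits if h[1] == pick_this_fragment_by_default]
        # Without a default, the sketch keeps all matches instead of prompting.
        for ridx, fidx in hits:
            residxs.append(ridx)
            fragidxs.append(fidx)
    return residxs, fragidxs
```

Two descriptors, ["GLU10", "LYS12"], yield three residue indices ([0, 2, 3]) when no default fragment is set, and two ([2, 3]) when pick_this_fragment_by_default=1.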
-
mdciao.utils.residue_and_atom.shorten_AA(AA, substitute_fail=None, keep_index=False)
Return the short name of an AA, e.g. TRP30 to W, by trying to use either the mdtraj.Topology.Residue.code attribute or mdtraj’s internal AA dictionary
- Parameters
AA (Residue or a str) – The residue in question
substitute_fail (str, default is None) – If there is no .code attribute, different options are there depending on the value of this parameter
* None : throw an exception when no short code is found (default)
* ‘long’ : keep the residue’s long name, i.e. do nothing
* ‘c’: any alphabetic character, as long as it is of len=1
* 0 : the first alphabetic character in the residue’s name
keep_index (bool, default is False) – If True return “W30” for “TRP30”, instead of returning just “W”
- Returns
code – A string representing this AA using the short code
- Return type
str
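The substitute_fail options can be illustrated with a small string-only sketch. It is not mdciao's implementation: shorten_aa_sketch is hypothetical, it uses its own lookup table instead of the Residue.code attribute or mdtraj's dictionary, and the exception types it raises are assumptions.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def shorten_aa_sketch(AA, substitute_fail=None, keep_index=False):
    """Shorten e.g. "TRP30" to "W" (or "W30" with keep_index=True)."""
    name = re.sub(r"\d+$", "", AA)  # "TRP30" -> "TRP"
    index = AA[len(name):]          # "TRP30" -> "30"
    if name in THREE_TO_ONE:
        code = THREE_TO_ONE[name]
    elif substitute_fail is None:
        # Default: fail loudly when no short code is known.
        raise KeyError(f"No short code found for {name}")
    elif substitute_fail == "long":
        code = name
    elif substitute_fail == 0:
        # First alphabetic character of the residue name.
        code = next(ch for ch in name if ch.isalpha())
    elif (isinstance(substitute_fail, str) and len(substitute_fail) == 1
          and substitute_fail.isalpha()):
        code = substitute_fail
    else:
        raise ValueError(f"Unsupported substitute_fail: {substitute_fail!r}")
    return code + index if keep_index else code
```

For a non-standard residue such as "GDP396", the sketch returns "GDP" with substitute_fail='long', "X" with substitute_fail='X', and "G" with substitute_fail=0.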
-
mdciao.utils.residue_and_atom.top2lsd(top, substitute_fail='X', extra_columns=None)
Return a list of per-residue attributes as dictionaries
Use DataFrame on the return value for a nice table
- Parameters
top (Topology) –
substitute_fail (str, None, int, default is "X") –
If there is no .code attribute, different options are there depending on the value of this parameter
None : throw an exception when no short code is found
’long’ : keep the residue’s long name, i.e. do nothing
’c’: any alphabetic character, as long as it is of len=1
0 : the first alphabetic character in the residue’s name
extra_columns (dictionary of indexables) – Any other column you want to include in the DataFrame
- Returns
df
- Return type