mdciao.fragments.match_fragments
- mdciao.fragments.match_fragments(seq0, seq1, frags0=None, frags1=None, probe=None, verbose=False, shortest=3)
Align fragments of seq0 and seq1 pairwise and return a matrix of scores.
The score is the absolute number of matches between two fragments. Depending on how much the user knows about the topologies, their fragments, and their similarities, and on what the user is trying to do, this absolute measure can be just right or highly misleading, e.g.:
two fragments of ~500 AAs each can score 20 matches “easily”, without this being meaningful
two fragments of 11 AAs each having 10 matches between them are almost identical
however, in absolute terms, the first case has a higher score
If you know what you’re doing, you can specify which one of the sequences is the probe, s.t. the score is divided by the length of the probe’s fragment. E.g., if probe=1, it means that you are interested in finding out if fragments of seq1 appear in fragments of seq0 (the ‘target’ sequence), regardless of how long the target fragments are. The score is then normalized to 1, where 1 means you found the entire probe fragment in the target fragment, no matter how long the probe or the target were.
- Parameters:
seq0 (str or Topology)
seq1 (str or Topology)
frags0 (list or None, default is None) – If None, get_fragments will be called with the default options to generate a fragment list.
frags1 (list or None, default is None) – If None, get_fragments will be called with the default options to generate a fragment list.
probe (int, default is None) – If None, scores are absolute numbers. If 0, the scores are divided by the length of seq0’s fragment. If 1, by the length of seq1’s fragment. In these cases, the score is always between 0 and 1, regardless of how long the probe and the target fragments are.
shortest (int, default is 3) – Fragments of len < shortest won’t produce a score but a np.NaN, s.t. the score doesn’t get hijacked by very small probe fragments, which will always yield relatively good scores. Absolute scores (probe = None) are not affected by this.
verbose (bool, default is False) – Be verbose; this affects all methods called by this method as well.
- Returns:
score (2D np.ndarray of shape (len(frags0), len(frags1))) – Will be between 0 and 1 if a probe is specified.
frags0 (list) – The fragments that were either provided or generated on the fly. Their indices are the row-indices of score.
frags1 (list) – The fragments that were either provided or generated on the fly. Their indices are the column-indices of score.
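The effect of probe and shortest on the returned matrix can be illustrated with a small NumPy sketch. This is not mdciao's implementation: normalize_scores is a hypothetical helper, the match counts are made up for illustration, and the sketch assumes that the shortest cutoff is applied to the length of the probe's fragments.

```python
import numpy as np


def normalize_scores(abs_scores, frags0, frags1, probe=None, shortest=3):
    """Hypothetical helper (not part of mdciao) that mimics the documented
    normalization: divide absolute match counts by the probe fragment's length."""
    scores = np.asarray(abs_scores, dtype=float)
    if probe is None:
        # Absolute scores are returned untouched; `shortest` plays no role here.
        return scores
    if probe not in (0, 1):
        raise ValueError("probe must be None, 0 or 1")
    probe_frags = frags0 if probe == 0 else frags1
    lengths = np.array([len(frag) for frag in probe_frags], dtype=float)
    # Assumption: probe fragments shorter than `shortest` yield NaN scores.
    lengths[lengths < shortest] = np.nan
    if probe == 0:
        # frags0 indexes the rows, so divide each row by its fragment length.
        return scores / lengths[:, np.newaxis]
    # frags1 indexes the columns, so divide each column by its fragment length.
    return scores / lengths[np.newaxis, :]


# Made-up fragments: 500 and 11 residues in seq0; 500, 11 and 2 residues in seq1.
frags0 = [list(range(500)), list(range(500, 511))]
frags1 = [list(range(500)), list(range(500, 511)), [511, 512]]

# Made-up absolute match counts, shape (len(frags0), len(frags1)).
abs_scores = [[20, 3, 1],
              [2, 10, 1]]

rel_scores = normalize_scores(abs_scores, frags0, frags1, probe=1)
print(rel_scores)
```

With probe=1, the 20 matches of the two 500-residue fragments shrink to 20/500 = 0.04, the 10 matches of the two 11-residue fragments become 10/11, and the column belonging to the 2-residue probe fragment is NaN because it is shorter than shortest.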