mdciao.plots.plot_unified_freq_dicts
- mdciao.plots.plot_unified_freq_dicts(freqs, colordict=None, width=0.2, ax=None, figsize=(10, 5), panelheight_inches=5, inch_per_contacts=1, fontsize=16, sort_by='mean', lower_cutoff_val=0, remove_identities=False, vertical_plot=False, identity_cutoff=1, assign_w_color=False, title=None, legend_rows=4, verbose_legend=True, half_sigma=False)
Plot unified (= with identical keys) frequency dictionaries for different systems
- Parameters:
freqs (dictionary of dictionaries) – The first-level dict is keyed by system names, e.g. freqs.keys() = [“WT”, “D10A”, “D10R”]. The second-level dict is keyed by contact names.
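As an illustration of the expected input layout, the sketch below builds a small two-level dictionary and checks that all systems share the same contact labels. The contact labels, the values, and the helper `is_unified` are made up for this example and are not part of mdciao.

```python
# Hypothetical frequencies for three systems; labels and values are invented.
freqs = {
    "WT":   {"GLU30@3.50-GDP": 0.9, "ARG131@3.50-TYR391": 0.4},
    "D10A": {"GLU30@3.50-GDP": 0.2, "ARG131@3.50-TYR391": 0.5},
    "D10R": {"GLU30@3.50-GDP": 0.1, "ARG131@3.50-TYR391": 0.6},
}


def is_unified(freqs):
    """Return True if every system's dict has the same set of contact labels."""
    key_sets = [set(system_freqs) for system_freqs in freqs.values()]
    return all(keys == key_sets[0] for keys in key_sets)


print(list(freqs.keys()))   # first level: system names
print(is_unified(freqs))    # second level: identical contact labels everywhere
```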
colordict (dict, default is None) – What color each system gets. By default, sensible matplotlib colors are used.
width (None or float, default is .2) – Width of each bar in the plot. If None, .8/len(freqs) will be used, leaving a .1 gap of free space between contacts.
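The arithmetic behind the width=None case can be sketched as follows. The helper `bar_width` is hypothetical and only mirrors the rule stated above; it is not the library's code.

```python
def bar_width(n_systems, width=None):
    """Use the given width, or split 0.8 units evenly among the systems."""
    return 0.8 / n_systems if width is None else width


n_systems = 3
w = bar_width(n_systems)       # width=None: 0.8 / 3 per bar
group_span = w * n_systems     # the bars of one contact occupy 0.8 units in total
print(w, group_span)
```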
ax (Axes, default is None) – Plot into this axis, else create one using figsize.
figsize (iterable of len 2) – Figure size (x,y), in inches. If None, one will be created using panelheight_inches and inch_per_contacts. If you are transposing the figure using vertical_plot, you do not have to invert (y,x) this parameter here, it is done automatically.
panelheight_inches (int, default is 5) – The height of the panel, in inches. Determines the figure size if figsize is None, else has no effect.
inch_per_contacts (int, default is 1) – How many inches each contact-pair is given in the panel. Determines the figure size if figsize is None, else has no effect.
fontsize (int, default is 16) – Will be used in matplotlib._rcParams["font.size"] # TODO be less invasive
sort_by (str or list of strings, default is “mean”) – If str, the property by which to sort the contacts. If list, the list of contact labels in the order in which they will be shown. If str, the possibilities are
“mean” sort (descending) by mean frequency over all systems, making most frequent contacts appear on the left/top of the plot.
“std” sort (descending) by per-contact standard deviation over all systems, making the contacts with most different values appear on top. This highlights more “deviant” contacts and might hence be more informative than “mean” in cases where a lot of contacts have similar frequencies (high or low). If this option is activated, a faint dotted line is incorporated into the plot that marks the std for each contact group
“keep” keep the contacts in whatever order they have in the first dictionary
“numeric” sort (ascending) the contacts by the first number that appears in the contact labels, e.g. “30” if the label is “GLU30@3.50-GDP”. You can use this to order by resSeq if the AA to sort by is the first one of the pair. Contact labels without numbers in them will be sorted alphabetically at the end of the labels with numbers.
“residue” alias for “numeric”
list of contact-labels : sort in the order established by this list. What will actually be plotted is the intersection of this list and the available contact labels of freqs after other parameters like lower_cutoff_val or identity_cutoff have taken effect, e.g. if a contact-label is discarded because of lower_cutoff_val, adding the label to this list won’t have any effect.
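The sorting strategies listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not mdciao's implementation: the helper `order_contacts` and the example labels are hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed for “std”.

```python
import re
from statistics import mean, pstdev

# Invented example data: two systems, three contacts.
freqs = {
    "WT":   {"GLU30-GDP": 0.9, "ARG131-TYR391": 0.4, "LYS5-ASP10": 0.5},
    "D10A": {"GLU30-GDP": 0.2, "ARG131-TYR391": 0.5, "LYS5-ASP10": 0.5},
}


def order_contacts(freqs, sort_by="mean"):
    """Return the contact labels in the order implied by sort_by."""
    labels = list(next(iter(freqs.values())))   # order of the first dictionary
    per_contact = {lab: [d[lab] for d in freqs.values()] for lab in labels}
    if isinstance(sort_by, list):
        # Only labels that are actually available survive.
        return [lab for lab in sort_by if lab in per_contact]
    if sort_by == "mean":
        return sorted(labels, key=lambda lab: mean(per_contact[lab]), reverse=True)
    if sort_by == "std":
        return sorted(labels, key=lambda lab: pstdev(per_contact[lab]), reverse=True)
    if sort_by == "keep":
        return labels
    if sort_by in ("numeric", "residue"):
        def key(lab):
            match = re.search(r"\d+", lab)
            # Numbered labels first (by first number), then the rest alphabetically.
            return (0, int(match.group()), "") if match else (1, 0, lab)
        return sorted(labels, key=key)
    raise ValueError(f"unknown sort_by: {sort_by!r}")


for strategy in ("mean", "std", "keep", "numeric"):
    print(strategy, order_contacts(freqs, strategy))
```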
lower_cutoff_val (float, default is 0) – Hide contacts with small values. “values” changes meaning depending on sort_by. If sort_by is any of
“mean”, “keep”, “numeric”, “residue” or a list, then the contacts where all systems have frequencies lower than this value are hidden.
“std”, then the contacts where the standard deviation across systems itself is lower than this value are hidden. This hides contacts where all systems are similar, regardless of whether they’re all around 1, around .5 or around 0
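The two meanings of lower_cutoff_val can be made concrete with a short sketch. The helper `visible_contacts` and the data are hypothetical, and the population standard deviation is an assumption of this example.

```python
from statistics import pstdev

# Invented example data: two systems, three contacts.
freqs = {
    "WT":  {"A": 0.95, "B": 0.05, "C": 0.1},
    "MUT": {"A": 0.90, "B": 0.02, "C": 0.8},
}


def visible_contacts(freqs, lower_cutoff_val=0.0, sort_by="mean"):
    """Return the contact labels that are not hidden by lower_cutoff_val."""
    labels = list(next(iter(freqs.values())))
    kept = []
    for lab in labels:
        values = [d[lab] for d in freqs.values()]
        if sort_by == "std":
            # Hide contacts whose spread across systems is small.
            hidden = pstdev(values) < lower_cutoff_val
        else:
            # Hide contacts where every system is below the cutoff.
            hidden = all(v < lower_cutoff_val for v in values)
        if not hidden:
            kept.append(lab)
    return kept


print(visible_contacts(freqs, 0.1, sort_by="mean"))  # "B" is low everywhere
print(visible_contacts(freqs, 0.1, sort_by="std"))   # "A" and "B" barely differ
```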
remove_identities (bool, default is False) – If True, the contacts where freq[sys][ctc] >= identity_cutoff across all systems will not be plotted nor considered in the sum over contacts. TODO: the word identity might be confusing
vertical_plot (bool, default is False) – Plot the bars vertically in descending sort_by instead of horizontally (better for a large number of frequencies)
identity_cutoff (float, default is 1) – If remove_identities, use this value to define what is considered an identity, s.t. contacts with values e.g. .95 can also be removed. TODO consider merging both identity parameters into one that is None or float
assign_w_color (boolean, default is False) – Color the text of the contact-labels according to the following criterion.
If all frequencies are below the lower_cutoff_val except for one system, then the label adopts the color of this system and gets prepended with a “+” sign.
If all frequencies are above the lower_cutoff_val except for one system, then the label adopts the color of this system and gets prepended with a “-” sign
For more details see the paragraph “Visual Aides” of this notebook
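The labeling criterion above can be sketched as a small function. The helper `label_marker` is hypothetical, and treating a frequency equal to the cutoff as “above” is an assumption of this sketch, not something the text specifies.

```python
def label_marker(values_by_system, lower_cutoff_val):
    """Return (sign, system) if exactly one system stands out, else None."""
    above = [s for s, v in values_by_system.items() if v >= lower_cutoff_val]
    below = [s for s, v in values_by_system.items() if v < lower_cutoff_val]
    if len(above) == 1 and below:
        return "+", above[0]   # only this system forms the contact
    if len(below) == 1 and above:
        return "-", below[0]   # only this system lacks the contact
    return None                # no single system stands out


# With exactly two systems both criteria hold at once; this sketch then returns "+".
print(label_marker({"WT": 0.05, "D10A": 0.9, "D10R": 0.02}, 0.1))
print(label_marker({"WT": 0.8, "D10A": 0.9, "D10R": 0.02}, 0.1))
```

The returned system name could then be looked up in colordict to color the label text.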
title (str, default is None) – The title of the plot, if any
legend_rows (int, default is 4) – The maximum number of rows per column of the legend. If you have 10 systems, legend_rows=5 means you’ll get two columns, legend_rows=2 means you’ll get five.
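The relation between the number of systems and the number of legend columns follows from a ceiling division. The helper `legend_columns` is a hypothetical illustration of that arithmetic.

```python
import math


def legend_columns(n_systems, legend_rows=4):
    """Number of legend columns needed so that no column exceeds legend_rows."""
    return math.ceil(n_systems / legend_rows)


print(legend_columns(10, legend_rows=5))  # two columns
print(legend_columns(10, legend_rows=2))  # five columns
```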
verbose_legend (bool, default is True) – Verbose legends inform about contacts that were in the input but have been left out of the plot. Contacts are left out if they are:
above the identity_cutoff, or
below the lower_cutoff_val.
They will appear in the verbose legend as “+ A.a + B.b”, respectively denoting the missing contacts that are “a(bove)” and “b(elow)” with their respective sums “A” and “B”.
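How the sums “A” and “B” might be obtained is sketched below. The helper `left_out_sums` is hypothetical; it assumes remove_identities=True and a sort_by other than “std”, and it does not reproduce the exact legend text produced by mdciao.

```python
# Invented example data: two systems, three contacts.
freqs = {
    "WT":  {"A": 1.0, "B": 0.05, "C": 0.5},
    "MUT": {"A": 1.0, "B": 0.02, "C": 0.7},
}


def left_out_sums(freqs, identity_cutoff=1.0, lower_cutoff_val=0.0):
    """Per system, sum the frequencies of contacts left out above and below."""
    labels = list(next(iter(freqs.values())))
    systems = list(freqs.values())
    above = [lab for lab in labels if all(d[lab] >= identity_cutoff for d in systems)]
    below = [lab for lab in labels if all(d[lab] < lower_cutoff_val for d in systems)]
    return {
        name: {"a": sum(d[lab] for lab in above), "b": sum(d[lab] for lab in below)}
        for name, d in freqs.items()
    }


print(left_out_sums(freqs, identity_cutoff=1.0, lower_cutoff_val=0.1))
```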
half_sigma (bool, default is False) – When True, instead of showing Sigma=20, Sigma = 2x10 will be shown. If a ContactGroup normally has Sigma=10, that number doubles when showing per-residue values, because each contact is shown twice. Hence, showing half-sigma makes it possible to “keep” the number 10 in the legend, even though the shown Sigma is 20.
- Returns:
fig (Figure)
ax (Axes)
freqs (dict) – Dictionary of dictionaries with the plotted frequencies in the plotted order. It’s keyed with system-names first and contact-names second, like the input. It has the sort_by strategy as an extra key containing the value that resulted from that strategy for each contact-name.