mdciao.plots.plot_unified_freq_dicts

mdciao.plots.plot_unified_freq_dicts(freqs, colordict=None, width=0.2, ax=None, figsize=(10, 5), panelheight_inches=5, inch_per_contacts=1, fontsize=16, sort_by='mean', lower_cutoff_val=0, remove_identities=False, vertical_plot=False, identity_cutoff=1, ylim=1, assign_w_color=False, title=None, legend_rows=4, verbose_legend=True, half_sigma=False)

Plot unified (= with identical keys) frequency dictionaries for different systems
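As a minimal illustration of what "unified" means here, the sketch below checks that every system's dictionary carries exactly the same contact keys. The helper name is_unified and the example labels are hypothetical and are not part of mdciao.

```python
def is_unified(freqs):
    """Return True if all second-level dicts share exactly the same keys."""
    key_sets = [set(d) for d in freqs.values()]
    return all(ks == key_sets[0] for ks in key_sets)


# Made-up example data: two systems, two contacts
unified = {"WT": {"A-B": 0.9, "C-D": 0.1}, "D10A": {"A-B": 0.5, "C-D": 0.0}}
# "D10A" lacks the "C-D" contact, so the dicts are not unified
not_unified = {"WT": {"A-B": 0.9, "C-D": 0.1}, "D10A": {"A-B": 0.5}}

print(is_unified(unified), is_unified(not_unified))
```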

Parameters
  • freqs (dictionary of dictionaries) – The first-level dict is keyed by system names, e.g. freqs.keys() = ["WT", "D10A", "D10R"]. The second-level dict is keyed by contact names.

  • colordict (dict, default is None) – The color each system gets. If None, sane matplotlib default colors are used.

  • width (None or float, default is .2) – Width of each bar in the plot. If None, .8/len(freqs) will be used, leaving a .1 gap of free space between contacts.

  • ax (Axes, default is None) – Plot into this axis, else create one using figsize.

  • figsize (iterable of len 2) – Figure size (x,y), in inches. If None, one will be created using panelheight_inches and inch_per_contacts. If you are transposing the figure using vertical_plot, you do not need to swap this parameter to (y,x); that is done automatically.

  • panelheight_inches (int, default is 5) – The height of the panel, in inches. Determines the figure size if figsize is None, else has no effect

  • inch_per_contacts (int, default is 1) – How many inches each contact-pair is given in the panel. Determines the figure size if figsize is None, else has no effect

  • fontsize (int, default is 16) – Will be used to set matplotlib.rcParams["font.size"]. TODO: be less invasive.

  • sort_by (str, default is "mean") –

    The property by which to sort the contacts. It is always descending and the property can be:

    • "mean" sort by mean frequency over all systems, making the most frequent contacts appear on the left/top of the plot.

    • "std" sort by per-contact standard deviation over all systems, making the contacts with the most different values appear on the left/top of the plot. This highlights more "deviant" contacts and might hence be more informative than "mean" in cases where a lot of contacts have similar frequencies (high or low). If this option is activated, a faint dotted line marking the std of each contact group is added to the plot.

    • "keep" keep the contacts in whatever order they have in the first dictionary.

    • "numeric" sort the contacts by the first number that appears in the contact labels, e.g. "30" if the label is "GLU30@3.50-GDP". You can use this to order by resSeq if the AA to sort by is the first one of the pair.

  • lower_cutoff_val (float, default is 0) –

    Hide contacts with small values. “values” changes meaning depending on sort_by. If sort_by is:

    • "mean", "keep" or "numeric", then hide contacts where all systems have frequencies lower than this value.

    • "std", then hide contacts where the standard deviation across systems is itself lower than this value. This hides contacts where all systems are similar, regardless of whether they’re all around 1, around .5 or around 0.

  • remove_identities (bool, default is False) – If True, the contacts where freq[sys][ctc] >= identity_cutoff across all systems will be neither plotted nor considered in the sum over contacts. TODO: the word identity might be confusing.

  • vertical_plot (bool, default is False) – Plot the bars vertically, in descending sort_by order, instead of horizontally (better for a large number of frequencies).

  • identity_cutoff (float, default is 1) – If remove_identities is True, use this value to define what is considered an identity, so that contacts with values of e.g. .95 can also be removed. TODO: consider merging both identity parameters into one that is None or float.

  • ylim (float, default is 1) – The limit on the y-axis

  • assign_w_color (bool, default is False) – If there are contacts in which only one system (i.e. one of the keys of freqs) appears, color the text label of that contact with the color of that system.

  • title (str, default is None) – The title of the plot, if any

  • legend_rows (int, default is 4) – The maximum number of rows per column of the legend. If you have 10 systems, legend_rows=5 means you’ll get two columns, legend_rows=2 means you’ll get five.

  • verbose_legend (bool, default is True) –

    Verbose legends inform about contacts that were in the input but have been left out of the plot. Contacts are left out if they are:

    • above the identity_cutoff or

    • below the lower_cutoff_val

    They will appear in the verbose legend as "+ A.a + B.b", respectively denoting the missing contacts that are "a(bove)" and "b(elow)" with their respective sums "A" and "B".

  • half_sigma (bool, default is False) – When True, instead of showing Sigma=20, Sigma = 2x10 will be shown. Normally, if a ContactGroup has Sigma=10, that number doubles when showing per-residue values, because each contact is shown twice. Hence, showing half-sigma allows you to "keep" the number 10 in the legend, even though the shown Sigma is 20.
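The ordering and filtering rules described for sort_by, lower_cutoff_val and remove_identities can be illustrated without plotting anything. The following is a minimal sketch in plain Python, not mdciao's implementation. The contact labels and frequencies are made up, the helper names per_contact and first_number are hypothetical, and the use of the population standard deviation is an assumption about how "std" is computed.

```python
import re
from statistics import mean, pstdev

# Made-up unified input: every system has the same contact keys
freqs = {
    "WT":   {"GLU30@3.50-GDP": 1.0, "ARG131@3.50-TYR391": 0.9, "LYS270@6.32-ASP381": 0.10},
    "D10A": {"GLU30@3.50-GDP": 1.0, "ARG131@3.50-TYR391": 0.2, "LYS270@6.32-ASP381": 0.00},
    "D10R": {"GLU30@3.50-GDP": 1.0, "ARG131@3.50-TYR391": 0.4, "LYS270@6.32-ASP381": 0.05},
}
contacts = list(next(iter(freqs.values())))


def per_contact(fn):
    """Apply fn to the list of per-system frequencies of each contact."""
    return {c: fn([freqs[s][c] for s in freqs]) for c in contacts}


def first_number(label):
    """First integer in a contact label, i.e. the key for sort_by='numeric'."""
    return int(re.search(r"\d+", label).group())


means = per_contact(mean)
stds = per_contact(pstdev)  # assumption: population standard deviation

# sort_by="mean" and sort_by="std" are both descending
by_mean = sorted(contacts, key=means.get, reverse=True)
by_std = sorted(contacts, key=stds.get, reverse=True)

# lower_cutoff_val hides different contacts depending on sort_by
lower_cutoff_val = 0.15
kept_mean = [c for c in by_mean
             if any(freqs[s][c] >= lower_cutoff_val for s in freqs)]
kept_std = [c for c in by_std if stds[c] >= lower_cutoff_val]

# remove_identities=True would drop contacts at or above identity_cutoff in all systems
identity_cutoff = 1
identities = [c for c in contacts
              if all(freqs[s][c] >= identity_cutoff for s in freqs)]

print(by_mean, by_std, kept_mean, kept_std, identities, sep="\n")
```

In this toy data the contact that is fully formed in all systems sorts first by "mean" but last by "std", which is the situation in which the "std" ordering is described as more informative.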

Returns

  • fig (Figure)

  • ax (Axes)

  • freqs (dict) – Dictionary of dictionaries with the plotted frequencies in the plotted order. It is keyed first by system names and then by contact names, like the input. It has the sort_by strategy as an extra key containing the value that resulted from that strategy for each contact name.
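The layout of the returned freqs dictionary can be pictured with the sketch below. It only mirrors the description above for sort_by="mean" and is built by hand from made-up data, so the exact contents returned by the function may differ. The names strategy, order and plotted are hypothetical.

```python
from statistics import mean

# Made-up unified input with two systems and two contacts
freqs = {
    "WT":   {"A-B": 0.2, "C-D": 0.9},
    "D10A": {"A-B": 0.4, "C-D": 0.7},
}
sort_by = "mean"

# Per-contact value of the sorting strategy (here: mean over systems)
strategy = {c: mean(f[c] for f in freqs.values()) for c in freqs["WT"]}
order = sorted(strategy, key=strategy.get, reverse=True)

# Same two-level layout as the input, in plotted order ...
plotted = {name: {c: f[c] for c in order} for name, f in freqs.items()}
# ... plus one extra first-level key named after the sort_by strategy
plotted[sort_by] = {c: strategy[c] for c in order}

print(list(plotted))
print(list(plotted["WT"]))
```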