mdciao.utils.lists

Miscellaneous operations on lists or list-like objects

Functions

assert_min_len(input_iterable[, min_len])

Checks if an iterable satisfies the criteria of minimum length.

assert_no_intersection(list_of_lists_of_integers)

Checks if two or more lists contain the same integer

contiguous_ranges(list_in)

For every unique entry in list_in return the contiguous ranges in list

does_not_contain_strings(iterable)

Checks if iterable has any string element, returns False if it contains at least one string

exclude_same_fragments_from_residx_pairlist(…)

If the members of the pair belong to the same fragment, exclude them from pairlist.

find_parent_list(sublists, parent_lists)

For each sublist, return the index of the parent list

force_iterable(var)

Forces var to be iterable, if not already

hash_list(ilist)

Try to hash all the objects of a list (regardless of type) into one hash

idx_at_fraction(val_desc_order, frac)

Index of val_desc_order where np.cumsum(val)/np.sum(val)>= frac for the first time

in_what_N_fragments(idxs, fragment_list)

For each element of idxs, return the index of “fragments” in which it appears

in_what_fragment(residx, …[, fragment_names])

For the residue id, returns the name (if provided) or the index of the “fragment” in which it appears

is_iterable(var)

Checks if the input is an iterable or not

join_lists(lists, idxs_of_lists_to_join)

Provided a list of lists, join them following idxs_of_lists_to_join

put_this_idx_first_in_pair(idx, pair)

Returns the original pair if the value already appears first, else returns reversed pair

rangeexpand(txt)

For a given integer range or multiple integer ranges, returns a list of individual integers.

re_warp(array_in, lengths)

Return iterable array_in as a list of arrays, each one with the length specified in lengths

remove_from_lists(list_of_lists, remove_these)

Wraps safely around numpy.setdiff1d, not returning empty lists

unique_list_of_iterables_by_tuple_hashing(ilist)

Returns the unique entries (if there are duplicates) from a list of iterables.

unique_product_w_intersection(a1, a2)

Fast way to create the product of two intersecting sets without repeated/unwanted pairs

window_average_fast(input_array_y[, …])

Returns the moving average using np.convolve

mdciao.utils.lists.assert_min_len(input_iterable, min_len=2)

Checks if an iterable satisfies the criteria of minimum length. (Default minimum length is 2).

Parameters
  • input_iterable (numpy array, list of lists) – example: np.zeros((2,1,1)) or [[1,2],[3,4]] when min_len = 2

  • min_len – minimum length which the iterable should satisfy (Default is 2)

Returns

Prints an error if an item within the iterable has fewer elements than min_len

mdciao.utils.lists.assert_no_intersection(list_of_lists_of_integers, word='iterables')

Checks if two or more lists contain the same integer

Parameters

list_of_lists_of_integers (list of lists) –

Returns

Prints assertion message if inner lists have the same integer, else no output

mdciao.utils.lists.contiguous_ranges(list_in)

For every unique entry in list_in return the contiguous ranges in list

Parameters

list_in (list) –

Returns

ranges – The keys are the unique entries of list_in, the values are the ranges in which each entry appears

Return type

dict
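The grouping described above can be sketched in plain Python. This is an illustration only, not mdciao's implementation: the name contiguous_ranges_sketch is hypothetical, and it assumes that a "range" is the list of consecutive positions at which an entry appears.

```python
def contiguous_ranges_sketch(list_in):
    """Map each unique entry to the runs of consecutive positions where it appears."""
    ranges = {}
    previous = object()  # sentinel that never equals a real entry
    for idx, entry in enumerate(list_in):
        if entry == previous:
            # Same entry as the previous position: extend the current run
            ranges[entry][-1].append(idx)
        else:
            # Entry changed: open a new run for this entry
            ranges.setdefault(entry, []).append([idx])
        previous = entry
    return ranges


example_ranges = contiguous_ranges_sketch(["A", "A", "B", "A", "A", "A", "B"])
print(example_ranges)  # {'A': [[0, 1], [3, 4, 5]], 'B': [[2], [6]]}
```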

mdciao.utils.lists.does_not_contain_strings(iterable)

Checks if iterable has any string element, returns False if it contains at least one string

Parameters

iterable (integer, float, string or any combination thereof) –

Returns

True if iterable does not contain any string, else False

Return type

boolean

mdciao.utils.lists.exclude_same_fragments_from_residx_pairlist(pairlist, fragments, return_excluded_idxs=False)

If the members of the pair belong to the same fragment, exclude them from pairlist.

Parameters
  • pairlist (list of iterables) – each iterable within the list should be a pair.

  • fragments (list of iterables) – each inner list should have residue indexes that form a fragment

  • return_excluded_idxs (boolean) – True if index of excluded pair is needed as an output. (Default is False).

Returns

pairs that don’t belong to the same fragment, or index of the excluded pairs if return_excluded_idxs is True

Return type

list
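A minimal sketch of this filtering, written without mdciao. The function name is hypothetical, and keeping pairs whose residues belong to no fragment at all is an assumption made here, not something the documentation above states.

```python
def exclude_same_fragment_pairs_sketch(pairlist, fragments, return_excluded_idxs=False):
    """Drop pairs whose two members sit in the same fragment."""

    def fragment_of(residx):
        # Index of the first fragment containing residx, or None
        for frag_idx, frag in enumerate(fragments):
            if residx in frag:
                return frag_idx
        return None

    kept, excluded_idxs = [], []
    for pair_idx, (res_a, res_b) in enumerate(pairlist):
        frag_a, frag_b = fragment_of(res_a), fragment_of(res_b)
        if frag_a is not None and frag_a == frag_b:
            excluded_idxs.append(pair_idx)
        else:
            kept.append([res_a, res_b])
    return excluded_idxs if return_excluded_idxs else kept


example_fragments = [[0, 1, 2], [3, 4]]
example_pairs = [[0, 1], [0, 3], [3, 4], [2, 4]]
kept_pairs = exclude_same_fragment_pairs_sketch(example_pairs, example_fragments)
dropped_idxs = exclude_same_fragment_pairs_sketch(
    example_pairs, example_fragments, return_excluded_idxs=True
)
print(kept_pairs)    # [[0, 3], [2, 4]]
print(dropped_idxs)  # [0, 2]
```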

mdciao.utils.lists.find_parent_list(sublists, parent_lists)

For each sublist, return the index of the parent list

Parameters
  • sublists (list of iterables) –

  • parent_lists (list of iterables) –

Returns

  • parents_by_child (list) – A list of len(sublists) with indices indicating which element of parent_lists each sublist is a subset of. If a sublist doesn’t have a parent, its parent is None

  • child_by_parent (dict) – A dictionary keyed by parent idx and valued with idxs of their children
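The two return values can be illustrated with a short sketch. It is not the library's code: the function name is hypothetical, the first matching parent wins, and parents without children are simply left out of the dictionary (both are assumptions).

```python
def find_parent_list_sketch(sublists, parent_lists):
    """Return (parents_by_child, child_by_parent) for subset relationships."""
    parents_by_child = []
    for child in sublists:
        parent_idx = None
        for idx, parent in enumerate(parent_lists):
            if set(child).issubset(parent):
                parent_idx = idx
                break  # assumption: the first matching parent is reported
        parents_by_child.append(parent_idx)

    child_by_parent = {}
    for child_idx, parent_idx in enumerate(parents_by_child):
        if parent_idx is not None:
            child_by_parent.setdefault(parent_idx, []).append(child_idx)
    return parents_by_child, child_by_parent


example_parents, example_children = find_parent_list_sketch(
    [[1, 2], [5], [9], [0, 3]], [[0, 1, 2, 3], [4, 5, 6]]
)
print(example_parents)   # [0, 1, None, 0]
print(example_children)  # {0: [0, 3], 1: [1]}
```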

mdciao.utils.lists.force_iterable(var)

Forces var to be iterable, if not already

Parameters

var (integer, float, string, list) –

Returns

var as iterable

Return type

iterable

mdciao.utils.lists.hash_list(ilist)

Try to hash all the objects of a list (regardless of type) into one hash

Parameters

ilist (anything) –

Returns

hashed object

mdciao.utils.lists.idx_at_fraction(val_desc_order, frac)

Index of val_desc_order where np.cumsum(val)/np.sum(val)>= frac for the first time

Parameters
  • val_desc_order (array like of floats) – The values that determine the sum of which a fraction will be taken. They have to be in descending order

  • frac (float) – The target fraction of sum(val) that is needed

Returns

n – Index of val where the fraction is attained for the first time. For the number of entries of val, just use n+1

Return type

int
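The criterion cumsum(val)/sum(val) >= frac can be written out with the standard library alone. The sketch below is only an illustration (the name idx_at_fraction_sketch and the ValueError for an unreachable fraction are assumptions), using itertools.accumulate in place of np.cumsum.

```python
from itertools import accumulate


def idx_at_fraction_sketch(val_desc_order, frac):
    """First index n at which the running sum reaches frac of the total."""
    total = sum(val_desc_order)
    for n, running_sum in enumerate(accumulate(val_desc_order)):
        if running_sum / total >= frac:
            return n
    # Assumption: signal fractions that can never be reached (e.g. frac > 1)
    raise ValueError("frac=%s is never attained" % frac)


example_values = [4, 3, 2, 1]  # descending, total is 10
print(idx_at_fraction_sketch(example_values, 0.5))  # 1, since (4+3)/10 = 0.7 >= 0.5
print(idx_at_fraction_sketch(example_values, 0.8))  # 2, since (4+3+2)/10 = 0.9 >= 0.8
```

In the first call n is 1, so n+1 = 2 entries are needed to cover half of the total.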

mdciao.utils.lists.in_what_N_fragments(idxs, fragment_list)

For each element of idxs, return the index of “fragments” in which it appears

Parameters
  • idxs (integer, float, or iterable thereof) –

  • fragment_list (iterable of iterables) – iterable of iterables containing integers or floats

Returns

list of length len(idxs) containing an iterable with the indices of ‘fragments’ in which that index appears

Return type

list
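As an illustration of the return shape, here is a small sketch with a hypothetical name. It assumes a scalar input is treated as a one-element iterable, and it is not taken from mdciao's source.

```python
def in_what_n_fragments_sketch(idxs, fragment_list):
    """For each idx, list the indices of all fragments that contain it."""
    if not hasattr(idxs, "__iter__"):
        idxs = [idxs]  # assumption: a scalar becomes a one-element list
    return [
        [frag_idx for frag_idx, frag in enumerate(fragment_list) if idx in frag]
        for idx in idxs
    ]


example_fragment_list = [[0, 1, 2], [2, 3], [4]]
print(in_what_n_fragments_sketch([2, 4, 7], example_fragment_list))  # [[0, 1], [2], []]
print(in_what_n_fragments_sketch(3, example_fragment_list))          # [[1]]
```

Index 2 appears in two fragments and index 7 in none, which is why the inner results are themselves lists.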

mdciao.utils.lists.in_what_fragment(residx, list_of_nonoverlapping_lists_of_residxs, fragment_names=None)

For the residue id, returns the name (if provided) or the index of the “fragment” in which it appears

Parameters
  • residx (int) – residue index

  • list_of_nonoverlapping_lists_of_residxs (list) – list of lists of integers with non-overlapping ids

  • fragment_names ((optional) list of strings) – fragment names for each list in list_of_nonoverlapping_lists_of_residxs

Returns

The name (if fragment_names is provided), otherwise the index of the “fragment” in which the residue index appears

Return type

integer or string

mdciao.utils.lists.is_iterable(var)

Checks if the input is an iterable or not

Parameters

var (integer, float, string, list) –

Returns

True if var is iterable, else False

Return type

boolean

mdciao.utils.lists.join_lists(lists, idxs_of_lists_to_join)

Provided a list of lists, join them following idxs_of_lists_to_join

Parameters
  • lists (iterable of iterables) – The lists to be joined

  • idxs_of_lists_to_join (iterable of iterables containing integers) –

    The lists to join. These 3 things will be done before using this array
    • remove duplicate entries in each iterable

    • sort the entries in each iterable by ascending order

    • assert there is no overlap between iterables

Returns

joined_lists – lists joined following the criterion of idxs_of_lists_to_join. Once the new iterables have been created by joining the initial iterables, they will be re-ordered by ascending first element

Return type

iterable of iterables
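The three preparation steps and the final re-ordering can be sketched as follows. The function name is hypothetical, and two details are assumptions rather than documented behavior: lists not mentioned in idxs_of_lists_to_join are returned unchanged, and no list is empty (the sort key reads the first element).

```python
def join_lists_sketch(lists, idxs_of_lists_to_join):
    """Join lists by groups of indices, then order the result by first element."""
    # Remove duplicate indices and sort each group in ascending order
    groups = [sorted(set(group)) for group in idxs_of_lists_to_join]
    # Assert there is no overlap between groups
    flat = [idx for group in groups for idx in group]
    assert len(flat) == len(set(flat)), "groups of indices must not overlap"

    joined = [[item for idx in group for item in lists[idx]] for group in groups]
    # Assumption: lists that belong to no group are passed through as they are
    untouched = [list(lst) for idx, lst in enumerate(lists) if idx not in flat]
    return sorted(joined + untouched, key=lambda lst: lst[0])


example_lists = [[0, 1], [2, 3], [4, 5], [6, 7]]
print(join_lists_sketch(example_lists, [[2, 0]]))          # [[0, 1, 4, 5], [2, 3], [6, 7]]
print(join_lists_sketch(example_lists, [[1, 3], [0, 2]]))  # [[0, 1, 4, 5], [2, 3, 6, 7]]
```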

mdciao.utils.lists.put_this_idx_first_in_pair(idx, pair)

Returns the original pair if the value already appears first, else returns reversed pair

Parameters
  • idx – value which needs to be brought in the first place (not the index but value itself)

  • pair (list) – pair of values as a list

Returns

pair

mdciao.utils.lists.rangeexpand(txt)

For a given integer range or multiple integer ranges, returns a list of individual integers. Example: “1-2,3-4” will return [1,2,3,4]

Parameters

txt (string) – string of integers or integer range separated by “,”

Returns

list of integers

Return type

list
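A minimal sketch of such an expansion, assuming non-negative integers and inclusive ranges as in the “1-2,3-4” example. The name rangeexpand_sketch is hypothetical and the code is not mdciao's own parser.

```python
def rangeexpand_sketch(txt):
    """Expand a string such as '1-2,3-4' into a list of integers."""
    expanded = []
    for chunk in txt.split(","):
        chunk = chunk.strip()
        if "-" in chunk:
            # 'a-b' is read as the inclusive range a..b
            start, stop = chunk.split("-")
            expanded.extend(range(int(start), int(stop) + 1))
        else:
            expanded.append(int(chunk))
    return expanded


print(rangeexpand_sketch("1-2,3-4"))    # [1, 2, 3, 4]
print(rangeexpand_sketch("1,3-5, 10"))  # [1, 3, 4, 5, 10]
```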

mdciao.utils.lists.re_warp(array_in, lengths)

Return iterable array_in as a list of arrays, each one with the length specified in lengths

Parameters
  • array_in (any iterable) – Iterable to be re_warped

  • lengths (int or iterable of integers) –

    Lengths of the individual elements of the returned array. If only one int is passed, all lengths will be that int. Special cases:

    • more lengths than needed are passed: the last elements of the returned value are empty until all lengths have been used

    • fewer lengths than array_in could take: only the lengths specified are returned in the warped list, the rest is not returned

Returns

warped

Return type

list
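The two special cases are easiest to see in a sketch. The one below is illustrative only: the name re_warp_sketch is hypothetical, it returns plain lists rather than arrays, and repeating a single int until array_in is used up is an assumption about how "all lengths will be that int" plays out.

```python
def re_warp_sketch(array_in, lengths):
    """Cut array_in into consecutive chunks with the given lengths."""
    items = list(array_in)
    if isinstance(lengths, int):
        # Assumption: repeat the single length until all items are covered
        n_chunks = -(-len(items) // lengths)  # ceiling division
        lengths = [lengths] * n_chunks
    warped, start = [], 0
    for length in lengths:
        # Slicing past the end yields shorter or empty chunks
        warped.append(items[start:start + length])
        start += length
    return warped


print(re_warp_sketch(range(6), [2, 4]))     # [[0, 1], [2, 3, 4, 5]]
print(re_warp_sketch(range(5), 2))          # [[0, 1], [2, 3], [4]]
print(re_warp_sketch(range(3), [2, 2, 2]))  # more lengths than needed: [[0, 1], [2], []]
print(re_warp_sketch(range(6), [2, 2]))     # fewer lengths: [[0, 1], [2, 3]]
```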

mdciao.utils.lists.remove_from_lists(list_of_lists, remove_these)

Wraps safely around numpy.setdiff1d, not returning empty lists

Parameters
  • list_of_lists (iterable of iterables) –

  • remove_these (iterable) –

Returns

clean_list

Return type

list

mdciao.utils.lists.unique_list_of_iterables_by_tuple_hashing(ilist, return_idxs=False, ignore_order=False)

Returns the unique entries (if there are duplicates) from a list of iterables.

Default is to take order into account, i.e. [[0,1],[1,0]] are considered different iterables

If ilist contains non-iterables, they will be turned into iterables, s.t. 1==[1]==np.array(1) and ‘A’==[‘A’]. They will also be returned as iterables

Parameters
  • ilist (list of iterables) – list of iterables with redundant entries (redundant in the list, not in entries)

  • return_idxs (boolean) – ‘True’ if required to return indices instead of unique list. (Default is False).

  • ignore_order (bool, default is False) – ignore order, s.t. [0,1] and [1,0] are considered equal. Only the first instance ([0,1]) is kept

Returns

result – list of unique iterables or indices of ‘ilist’ where the unique entries are

Return type

list
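The effect of return_idxs and ignore_order can be illustrated with a set of tuple keys. This is a sketch under stated assumptions, not the library's code: the function name is hypothetical, and sorting the entries to build an order-independent key is one possible way to implement ignore_order.

```python
def unique_by_tuple_hashing_sketch(ilist, return_idxs=False, ignore_order=False):
    """Keep the first occurrence of each entry, comparing entries as tuples."""
    seen = set()
    unique, idxs = [], []
    for idx, entry in enumerate(ilist):
        # Non-iterables (and strings) are wrapped so that 1 == [1] and 'A' == ['A']
        if isinstance(entry, str) or not hasattr(entry, "__iter__"):
            entry = [entry]
        # Assumption: sorting gives an order-independent key when ignore_order is True
        key = tuple(sorted(entry)) if ignore_order else tuple(entry)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
            idxs.append(idx)
    return idxs if return_idxs else unique


example_ilist = [[0, 1], [1, 0], [0, 1], 1, [1]]
print(unique_by_tuple_hashing_sketch(example_ilist))                     # [[0, 1], [1, 0], [1]]
print(unique_by_tuple_hashing_sketch(example_ilist, ignore_order=True))  # [[0, 1], [1]]
print(unique_by_tuple_hashing_sketch(example_ilist, return_idxs=True))   # [0, 1, 3]
```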

mdciao.utils.lists.unique_product_w_intersection(a1, a2)

Fast way to create the product of two intersecting sets without repeated/unwanted pairs

Consider that

>>> list(itertools.product([0,1,2,3],[2,3,4,5]))
[(0, 2), (0, 3), (0, 4), (0, 5),
 (1, 2), (1, 3), (1, 4), (1, 5),
 (2, 2), (2, 3), (2, 4), (2, 5),
 (3, 2), (3, 3), (3, 4), (3, 5)]

has the repeated/unwanted pairs (2,2), (3,3), (3,2), which need to be taken out a posteriori by comparing pairs.

The unique_list_of_iterables_by_tuple_hashing method also accepts arrays (since pairlists may not necessarily have been generated as tuples, but also as np.arrays), s.t. the arrays need to be cast into tuples before hashing, with one comparison per pair (which grows quadratically)

>>> a1 = np.arange(200)
>>> a2 = np.arange(195,300)
>>> pairs = np.array(list(itertools.product(a1,a2)))
>>> %timeit mdciao.utils.lists.unique_list_of_iterables_by_tuple_hashing(pairs)
2.83 s ± 170 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Whereas

>>> %timeit mdciao.utils.lists.unique_product_w_intersection(a1,a2)
47 ms ± 394 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

For reference

>>> %timeit list(itertools.product(a1,a2))
783 µs ± 5.37 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

I.e. clearly, for non-intersecting sets a1 and a2 without unwanted/repeated pairs, it’s always better to use itertools.product directly

Parameters
  • a1 (iterable) – The integers of the set1

  • a2 (iterable) – The integers of the set2

Returns

pairlist – The pairlist product of a1 and a2 without self-pairs (ii,ii) and with only one of (ii,jj) and (jj,ii) kept

Return type

np.ndarray
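To make the expected output concrete, the sketch below filters itertools.product the slow, a posteriori way on the small example from above. It is a reference for what the result should contain, not the fast algorithm this function uses; the name is hypothetical and it returns a list of tuples rather than an np.ndarray.

```python
from itertools import product


def unique_product_reference_sketch(a1, a2):
    """Product of a1 and a2 without self-pairs and without mirrored repeats."""
    seen = set()
    pairs = []
    for ii, jj in product(a1, a2):
        if ii == jj:
            continue  # drop self-pairs such as (2, 2)
        key = frozenset((ii, jj))  # (2, 3) and (3, 2) share the same key
        if key not in seen:
            seen.add(key)
            pairs.append((ii, jj))
    return pairs


example_pairlist = unique_product_reference_sketch([0, 1, 2, 3], [2, 3, 4, 5])
print(len(example_pairlist))       # 13: the 16 raw pairs minus (2,2), (3,3), (3,2)
print((2, 3) in example_pairlist)  # True
print((3, 2) in example_pairlist)  # False
```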

mdciao.utils.lists.window_average_fast(input_array_y, half_window_size=2)

Returns the moving average using np.convolve

Parameters
  • input_array_y (array) – numpy array for which moving average should be calculated

  • half_window_size – the actual window size will be 2 * half_window_size + 1. Example: when half window size = 2, moving average calculation will use window=5

Returns

array
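The window arithmetic can be checked with a pure-Python moving average. This is a sketch, not the np.convolve call itself: the name is hypothetical, and averaging only over fully covered positions (the equivalent of np.convolve's 'valid' mode, so the output is 2 * half_window_size shorter than the input) is an assumption.

```python
def window_average_sketch(input_array_y, half_window_size=2):
    """Moving average over windows of size 2 * half_window_size + 1."""
    values = list(input_array_y)
    window = 2 * half_window_size + 1
    # Assumption: only positions where the window fits entirely are returned
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


example_y = [1, 2, 3, 4, 5, 6, 7]
print(window_average_sketch(example_y, half_window_size=2))  # [3.0, 4.0, 5.0]
print(window_average_sketch(example_y, half_window_size=1))  # [2.0, 3.0, 4.0, 5.0, 6.0]
```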