mdciao.utils.lists
Miscellaneous operations on list or list-like objects
Functions
assert_min_len | Checks if an iterable satisfies the criteria of minimum length. (Default minimum length is 2).
assert_no_intersection | Assert if two or more lists contain the same integer(s)
contiguous_ranges | For every unique entry in list_in, return the contiguous ranges in the list
does_not_contain_strings | Checks if iterable has any string element, returns False if it contains at least one string
exclude_same_fragments_from_residx_pairlist | If the members of the pair belong to the same fragment, exclude them from pairlist.
find_parent_list | For each sublist, return the index of the parent list
force_iterable | Forces var to be iterable, if not already
hash_list | Try to hash all the objects of a list (regardless of type) into one hash
idx_at_fraction | Index of val_desc_order where np.cumsum(val)/np.sum(val) >= frac for the first time
in_what_N_fragments | For each element of idxs, return the index of "fragments" in which it appears
in_what_fragment | For the residue id, returns the name (if provided) or the index of the "fragment" in which it appears
is_iterable | Checks if the input is an iterable or not
join_lists | Provided a list of lists, join them following idxs_of_lists_to_join
put_this_idx_first_in_pair | Returns the original pair if the value already appears first, else returns reversed pair
rangeexpand | For a given integer range or multiple integer ranges, returns a list of individual integers.
re_warp | Return iterable array_in as a list of arrays, each one with the length specified in lengths
remove_from_lists | Wraps safely around numpy.setdiff1d, not returning empty lists
unique_list_of_iterables_by_tuple_hashing | Returns the unique entries (if there are duplicates) from a list of iterables.
unique_product_w_intersection | Fast way to create the product of two intersecting sets without repeated/unwanted pairs
window_average_fast | Returns the moving average using numpy.convolve
- mdciao.utils.lists.assert_min_len(input_iterable, min_len=2)
Checks if an iterable satisfies the criteria of minimum length. (Default minimum length is 2).
- Parameters:
input_iterable (numpy array, list of lists) – example np.zeros((2,1,1)) or [[1,2],[3,4]] when min_len = 2
min_len – minimum length which the iterable should satisfy (Default is 2)
- Return type:
Prints error if each item within the iterable has fewer elements than min_len
- mdciao.utils.lists.assert_no_intersection(list_of_lists_of_integers, word='iterables')
Assert if two or more lists contain the same integer(s)
- Parameters:
list_of_lists_of_integers (list of lists) – Empty lists are considered not intersecting and won’t raise AssertionError, though this is an interesting read: https://www.coopertoons.com/education/emptyclass_intersection/emptyclass_union_intersection.html
- Return type:
Raises AssertionError if inner lists have the same integer, else no output
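A minimal sketch of the check described above, written without mdciao so that it runs on its own. The helper name `check_no_intersection` is hypothetical, and the wording of the error message is an assumption, not the library's.

```python
from itertools import combinations

def check_no_intersection(list_of_lists):
    """Raise AssertionError if any two inner lists share an element."""
    for (i, first), (j, second) in combinations(enumerate(list_of_lists), 2):
        shared = set(first) & set(second)
        # Empty lists never share anything, so they pass silently
        assert not shared, f"lists {i} and {j} share {sorted(shared)}"

check_no_intersection([[1, 2], [3, 4], []])  # no output, no error
```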
- mdciao.utils.lists.contiguous_ranges(list_in)
For every unique entry in list_in, return the contiguous ranges in the list
- Parameters:
list_in (list)
- Returns:
ranges – The keys are the unique entries of list_in, the values are the ranges in which each entry appears
- Return type:
dict
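The idea can be illustrated with a short, standalone sketch. The name `contiguous_ranges_sketch` is hypothetical, and reporting each range as a list of consecutive positions is an assumption; the exact value format returned by mdciao is not stated here.

```python
from itertools import groupby

def contiguous_ranges_sketch(list_in):
    """Map each unique entry to the runs of positions where it appears."""
    ranges = {}
    position = 0
    for entry, run in groupby(list_in):
        run_length = len(list(run))
        # Assumption: a range is stored as the list of its consecutive positions
        ranges.setdefault(entry, []).append(list(range(position, position + run_length)))
        position += run_length
    return ranges

print(contiguous_ranges_sketch(["A", "A", "B", "A"]))
# {'A': [[0, 1], [3]], 'B': [[2]]}
```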
- mdciao.utils.lists.does_not_contain_strings(iterable)
Checks if iterable has any string element, returns False if it contains at least one string
- Parameters:
iterable (integer, float, string or any combination thereof)
- Returns:
True if iterable does not contain any string, else False
- Return type:
boolean
- mdciao.utils.lists.exclude_same_fragments_from_residx_pairlist(pairlist, fragments, return_excluded_idxs=False)
If the members of the pair belong to the same fragment, exclude them from pairlist.
- Parameters:
pairlist (list of iterables) – each iterable within the list should be a pair.
fragments (list of iterables) – each inner list should have residue indexes that form a fragment
return_excluded_idxs (boolean) – True if index of excluded pair is needed as an output. (Default is False).
- Returns:
pairs that don’t belong to the same fragment, or index of the excluded pairs if return_excluded_idxs is True
- Return type:
list
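A rough standalone sketch of the filtering described above. The function name `exclude_same_fragment_pairs` and the helper `fragment_of` are hypothetical, and keeping pairs whose residues belong to no fragment is an assumption.

```python
def exclude_same_fragment_pairs(pairlist, fragments, return_excluded_idxs=False):
    """Drop pairs whose two members sit in the same fragment."""
    def fragment_of(residx):
        for idx, fragment in enumerate(fragments):
            if residx in fragment:
                return idx
        return None  # assumption: residue not found in any fragment

    kept, excluded_idxs = [], []
    for ii, (r1, r2) in enumerate(pairlist):
        f1, f2 = fragment_of(r1), fragment_of(r2)
        if f1 is not None and f1 == f2:
            excluded_idxs.append(ii)
        else:
            kept.append([r1, r2])
    return excluded_idxs if return_excluded_idxs else kept

pairs = [[0, 1], [0, 5], [5, 6]]
fragments = [[0, 1, 2], [5, 6]]
print(exclude_same_fragment_pairs(pairs, fragments))        # [[0, 5]]
print(exclude_same_fragment_pairs(pairs, fragments, True))  # [0, 2]
```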
- mdciao.utils.lists.find_parent_list(sublists, parent_lists)
For each sublist, return the index of the parent list
- Parameters:
sublists (list of iterables)
parent_lists (list of iterables)
- Returns:
parents_by_child (list) – A list of len(sublists) with indices indicating which element of parent_lists each sublist is a subset of. If a sublist doesn’t have a parent, its parent is None
child_by_parent (dict) – A dictionary keyed by parent idx and valued with idxs of their children
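The two return values can be illustrated with a small sketch that does not depend on mdciao. The name `find_parent_list_sketch` is hypothetical, and picking the first matching parent when several would qualify is an assumption.

```python
def find_parent_list_sketch(sublists, parent_lists):
    """Return (parents_by_child, child_by_parent) for the given lists."""
    parents_by_child = []
    for sub in sublists:
        parent = None
        for pp, candidate in enumerate(parent_lists):
            if set(sub).issubset(candidate):
                parent = pp  # assumption: the first matching parent wins
                break
        parents_by_child.append(parent)

    child_by_parent = {}
    for cc, pp in enumerate(parents_by_child):
        if pp is not None:
            child_by_parent.setdefault(pp, []).append(cc)
    return parents_by_child, child_by_parent

print(find_parent_list_sketch([[0, 1], [4], [9]], [[0, 1, 2], [3, 4, 5]]))
# ([0, 1, None], {0: [0], 1: [1]})
```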
- mdciao.utils.lists.force_iterable(var)
Forces var to be iterable, if not already
- Parameters:
var (integer, float, string, list)
- Returns:
var as iterable
- Return type:
iterable
- mdciao.utils.lists.hash_list(ilist)
Try to hash all the objects of a list (regardless of type) into one hash
- Parameters:
ilist (anything)
- Returns:
hashed object
- mdciao.utils.lists.idx_at_fraction(val_desc_order, frac)
Index of val_desc_order where np.cumsum(val)/np.sum(val) >= frac for the first time
- Parameters:
val_desc_order (array like of floats) – The values that determine the sum of which a fraction will be taken. They have to be in descending order
frac (float) – The target fraction of sum(val) that is needed
- Returns:
n – Index of val where the fraction is attained for the first time. For the number of entries of val, just use n+1
- Return type:
int
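The criterion above can be sketched in pure Python, replacing np.cumsum with itertools.accumulate. The name `idx_at_fraction_sketch` is hypothetical, and raising ValueError when the fraction is never reached is an assumption.

```python
from itertools import accumulate

def idx_at_fraction_sketch(val_desc_order, frac):
    """First index n where cumsum(val)[n] / sum(val) >= frac."""
    total = sum(val_desc_order)
    for n, running_sum in enumerate(accumulate(val_desc_order)):
        if running_sum / total >= frac:
            return n
    raise ValueError("frac is never attained")  # assumption, e.g. frac > 1

# Cumulative fractions of [5, 3, 1, 1] are 0.5, 0.8, 0.9, 1.0
print(idx_at_fraction_sketch([5, 3, 1, 1], 0.75))  # 1
```

Here two entries (n+1 = 2) are needed to cover at least 75% of the total.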
- mdciao.utils.lists.in_what_N_fragments(idxs, fragment_list)
For each element of idxs, return the index of “fragments” in which it appears
- Parameters:
idxs (integer, float, or iterable thereof)
fragment_list (iterable of iterables) – iterable of iterables containing integers or floats
- Returns:
list of length len(idxs) containing an iterable with the indices of ‘fragments’ in which that index appears
- Return type:
list
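A short sketch of this lookup for fragments that may overlap. The name `in_what_n_fragments_sketch` is hypothetical, and returning an empty list for an index found in no fragment is an assumption.

```python
def in_what_n_fragments_sketch(idxs, fragment_list):
    """For each idx, list the positions of all fragments that contain it."""
    if isinstance(idxs, (int, float)):
        idxs = [idxs]  # accept a single number as well as an iterable
    return [
        [ii for ii, fragment in enumerate(fragment_list) if idx in fragment]
        for idx in idxs
    ]

print(in_what_n_fragments_sketch([1, 5, 9], [[0, 1], [1, 5], [7]]))
# [[0, 1], [1], []]
```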
- mdciao.utils.lists.in_what_fragment(residx, list_of_nonoverlapping_lists_of_residxs, fragment_names=None)
For the residue id, returns the name (if provided) or the index of the “fragment” in which it appears
- Parameters:
residx (int) – residue index
list_of_nonoverlapping_lists_of_residxs (list) – list of lists of integers with non-overlapping ids
fragment_names ((optional) list of strings) – fragment names for each list in list_of_nonoverlapping_lists_of_residxs
- Returns:
returns the name (if fragment_names is provided), otherwise returns the index of the “fragment” in which the residue index appears
- Return type:
integer or string
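A minimal sketch of the lookup, assuming non-overlapping fragments as the parameter name suggests. The name `in_what_fragment_sketch` is hypothetical, and returning None for a residue found in no fragment is an assumption.

```python
def in_what_fragment_sketch(residx, fragments, fragment_names=None):
    """Return the index (or the name, if names are given) of residx's fragment."""
    for idx, fragment in enumerate(fragments):
        if residx in fragment:
            return idx if fragment_names is None else fragment_names[idx]
    return None  # assumption: behavior for a missing residue is not documented here

fragments = [[0, 1, 2], [3, 4], [5]]
print(in_what_fragment_sketch(4, fragments))                   # 1
print(in_what_fragment_sketch(4, fragments, ["A", "B", "C"]))  # B
```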
- mdciao.utils.lists.is_iterable(var)
Checks if the input is an iterable or not
- Parameters:
var (integer, float, string, list)
- Returns:
Returns ‘True’ if var is iterable else False
- Return type:
boolean
- mdciao.utils.lists.join_lists(lists, idxs_of_lists_to_join)
Provided a list of lists, join them following idxs_of_lists_to_join
- Parameters:
lists (iterable of iterables) – The lists to be joined
idxs_of_lists_to_join (iterable of iterables containing integers) – The lists to join. These 3 things will be done before using this array:
* remove duplicate entries in each iterable
* sort the entries in each iterable by ascending order
* assert there is no overlap between iterables
- Returns:
joined_lists – lists joined following the criterion of idxs_of_lists_to_join. Once the new iterables have been created by joining the initial iterables, they will be re-ordered by ascending first element
- Return type:
iterable of iterables
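The joining procedure can be sketched as follows. The name `join_lists_sketch` is hypothetical, and two points are assumptions rather than documented behavior: lists not mentioned in any group are kept unchanged, and no list is empty (needed for the final sort by first element).

```python
def join_lists_sketch(lists, idxs_of_lists_to_join):
    """Join the lists named in each index group, then sort by first element."""
    # Pre-processing: de-duplicate and sort each group, then check for overlap
    groups = [sorted(set(group)) for group in idxs_of_lists_to_join]
    flat = [idx for group in groups for idx in group]
    assert len(flat) == len(set(flat)), "index groups overlap"

    joined = []
    for group in groups:
        merged = []
        for idx in group:
            merged.extend(lists[idx])
        joined.append(merged)

    # Assumption: lists that belong to no group are passed through as they are
    used = set(flat)
    joined.extend(list(item) for ii, item in enumerate(lists) if ii not in used)
    return sorted(joined, key=lambda item: item[0])

print(join_lists_sketch([[0, 1], [2, 3], [4, 5], [6, 7]], [[2, 0]]))
# [[0, 1, 4, 5], [2, 3], [6, 7]]
```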
- mdciao.utils.lists.put_this_idx_first_in_pair(idx, pair)
Returns the original pair if the value already appears first, else returns reversed pair
- Parameters:
idx – value which needs to be brought in the first place (not the index but value itself)
pair (list) – pair of values as a list
- Return type:
pair
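A tiny sketch of the reordering. The name `put_first_sketch` is hypothetical, and raising ValueError when the value is not in the pair is an assumption; the text above only covers the case where the value is present.

```python
def put_first_sketch(idx, pair):
    """Return pair with the value idx in first position."""
    if pair[0] == idx:
        return list(pair)
    if pair[1] == idx:
        return [pair[1], pair[0]]
    raise ValueError(f"{idx} is not in {pair}")  # assumption, not documented

print(put_first_sketch(3, [1, 3]))  # [3, 1]
print(put_first_sketch(1, [1, 3]))  # [1, 3]
```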
- mdciao.utils.lists.rangeexpand(txt)
For a given integer range or multiple integer ranges, returns a list of individual integers. Example- “1-2,3-4” will return [1,2,3,4]
- Parameters:
txt (string) – string of integers or integer range separated by “,”
- Returns:
list of integers
- Return type:
list
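The expansion can be sketched in a few lines of plain Python. The name `rangeexpand_sketch` is hypothetical, and this version assumes non-negative integers, since a leading minus sign would clash with the range separator.

```python
def rangeexpand_sketch(txt):
    """Expand a string like "1-2,3-4" into a list of individual integers."""
    expanded = []
    for token in txt.split(","):
        token = token.strip()
        if "-" in token:
            start, stop = token.split("-")
            expanded.extend(range(int(start), int(stop) + 1))  # stop is inclusive
        else:
            expanded.append(int(token))
    return expanded

print(rangeexpand_sketch("1-2,3-4"))  # [1, 2, 3, 4]
print(rangeexpand_sketch("1,3-5"))    # [1, 3, 4, 5]
```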
- mdciao.utils.lists.re_warp(array_in, lengths)
Return iterable array_in as a list of arrays, each one with the length specified in lengths
- Parameters:
array_in (any iterable) – Iterable to be re_warped
lengths (int or iterable of integers) – Lengths of the individual elements of the returned array. If only one int is passed, all lengths will be that int. Special cases:
* more lengths than needed are passed: the last elements of the returned value are empty until all lengths have been used
* fewer lengths than array_in could take: only the lengths specified are returned in the warped list, the rest is not returned
- Returns:
warped
- Return type:
list
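The chunking and its two special cases can be sketched with plain lists. The name `re_warp_sketch` is hypothetical, it returns lists rather than arrays, and the way a single int is expanded (as many chunks as needed to cover the input) is an assumption.

```python
def re_warp_sketch(array_in, lengths):
    """Split array_in into consecutive chunks of the given lengths."""
    items = list(array_in)
    if isinstance(lengths, int):
        # Assumption: use as many equal-sized chunks as needed to cover the input
        n_chunks = -(-len(items) // lengths)  # ceiling division
        lengths = [lengths] * n_chunks
    warped, start = [], 0
    for length in lengths:
        warped.append(items[start:start + length])
        start += length
    return warped

print(re_warp_sketch(range(6), [2, 3]))     # [[0, 1], [2, 3, 4]]  (5 is not returned)
print(re_warp_sketch(range(3), [2, 2, 2]))  # [[0, 1], [2], []]    (last chunk is empty)
print(re_warp_sketch(range(5), 2))          # [[0, 1], [2, 3], [4]]
```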
- mdciao.utils.lists.remove_from_lists(list_of_lists, remove_these)
Wraps safely around numpy.setdiff1d, not returning empty lists
- Parameters:
list_of_lists (iterable of iterables)
remove_these (iterable)
- Returns:
clean_list
- Return type:
list
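A pure-Python sketch of the same idea, using set difference in place of numpy.setdiff1d (which returns the sorted, unique values of its first argument that are not in the second). The name `remove_from_lists_sketch` is hypothetical.

```python
def remove_from_lists_sketch(list_of_lists, remove_these):
    """Remove unwanted values from every inner list and drop lists left empty."""
    unwanted = set(remove_these)
    clean_list = []
    for inner in list_of_lists:
        remaining = sorted(set(inner) - unwanted)  # sorted and unique, like setdiff1d
        if remaining:
            clean_list.append(remaining)
    return clean_list

print(remove_from_lists_sketch([[1, 2, 3], [4], [5, 6]], [4, 5]))
# [[1, 2, 3], [6]]
```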
- mdciao.utils.lists.unique_list_of_iterables_by_tuple_hashing(ilist, return_idxs=False, ignore_order=False)
Returns the unique entries (if there are duplicates) from a list of iterables.
Default is to take order into account, i.e. [[0,1],[1,0]] are considered different iterables
If ilist contains non-iterables, they will be turned into iterables, s.t. 1==[1]==np.array(1) and ‘A’==[‘A’]. They will also be returned as iterables
- Parameters:
ilist (list of iterables) – list of iterables with redundant entries (redundant in the list, not in entries)
return_idxs (boolean) – ‘True’ if required to return indices instead of unique list. (Default is False).
ignore_order (bool, default is False) – ignore order, s.t. [0,1] and [1,0] are considered equal. Only the first instance ([0,1]) is kept
- Returns:
result – list of unique iterables or indices of ‘ilist’ where the unique entries are
- Return type:
list
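The tuple-hashing idea can be sketched without numpy. The names `unique_by_tuple_hashing_sketch` and `as_list` are hypothetical; treating strings as single items follows the ‘A’==[‘A’] rule stated above, and the rest is an illustration rather than the library's code.

```python
def unique_by_tuple_hashing_sketch(ilist, return_idxs=False, ignore_order=False):
    """Keep the first occurrence of each entry, comparing entries as tuples."""
    def as_list(item):
        # Non-iterables (and strings) are wrapped, so that 1 == [1] and 'A' == ['A']
        if isinstance(item, str) or not hasattr(item, "__iter__"):
            return [item]
        return list(item)

    seen, unique, idxs = set(), [], []
    for ii, item in enumerate(ilist):
        entry = as_list(item)
        key = tuple(sorted(entry)) if ignore_order else tuple(entry)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
            idxs.append(ii)
    return idxs if return_idxs else unique

data = [[0, 1], [1, 0], [0, 1], 1, [1]]
print(unique_by_tuple_hashing_sketch(data))                     # [[0, 1], [1, 0], [1]]
print(unique_by_tuple_hashing_sketch(data, ignore_order=True))  # [[0, 1], [1]]
print(unique_by_tuple_hashing_sketch(data, return_idxs=True))   # [0, 1, 3]
```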
- mdciao.utils.lists.unique_product_w_intersection(a1, a2)
Fast way to create the product of two intersecting sets without repeated/unwanted pairs
Consider that
>>> list(itertools.product([0,1,2,3],[2,3,4,5]))
[(0, 2), (0, 3), (0, 4), (0, 5),
 (1, 2), (1, 3), (1, 4), (1, 5),
 (2, 2), (2, 3), (2, 4), (2, 5),
 (3, 2), (3, 3), (3, 4), (3, 5)]
Has the repeated/unwanted pairs (2,2),(3,3),(3,2) which need to be taken out a posteriori by comparing pairs.
The unique_list_of_iterables_by_tuple_hashing method accepts also arrays (since pairlists may not necessarily have been generated as tuples, but also as np.arrays), s.t. the arrays need to be cast into tuples before hashing and one comparison per pair (grows quadratically)
>>> a1 = np.arange(200)
>>> a2 = np.arange(195,300)
>>> pairs = np.array(list(itertools.product(a1,a2)))
>>> %timeit mdciao.utils.lists.unique_list_of_iterables_by_tuple_hashing(pairs)
2.83 s ± 170 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Whereas
>>> %timeit mdciao.utils.lists.unique_product_w_intersection(a1,a2)
47 ms ± 394 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
For reference
>>> %timeit list(itertools.product(a1,a2))
783 µs ± 5.37 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
I.e. clearly, for non-intersecting sets a1 and a2 without unwanted/repeated pairs, it’s always better to use itertools.product directly
- Parameters:
a1 (iterable) – The integers of the set1
a2 (iterable) – The integers of the set2
- Returns:
pairlist – The pairlist product of a1 and a2 without self-pairs (ii,ii) and with only (ii,jj) (not (jj,ii))
- Return type:
np.ndarray
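To make the expected result concrete, here is a straightforward (not fast) sketch that produces the same kind of pairlist for the small example above. The name `unique_product_sketch` is hypothetical; it returns a list of tuples rather than an np.ndarray and says nothing about how mdciao achieves its speed.

```python
from itertools import product

def unique_product_sketch(a1, a2):
    """Product of a1 and a2 without self-pairs and without reversed duplicates."""
    seen = set()
    pairlist = []
    for ii, jj in product(a1, a2):
        if ii == jj:
            continue  # drop self-pairs such as (2, 2)
        key = frozenset((ii, jj))
        if key not in seen:  # drop (3, 2) once (2, 3) has been kept
            seen.add(key)
            pairlist.append((ii, jj))
    return pairlist

pairs = unique_product_sketch([0, 1, 2, 3], [2, 3, 4, 5])
print(len(pairs))  # 13, i.e. the 16 raw pairs minus (2,2), (3,3) and (3,2)
```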
- mdciao.utils.lists.window_average_fast(input_array_y, half_window_size=2)
Returns the moving average using numpy.convolve
- Parameters:
input_array_y (array) – numpy array for which moving average should be calculated
half_window_size (int) – the actual window size will be 2 * half_window_size + 1. Example- when half window size = 2, moving average calculation will use window=5
- Return type:
array
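The window arithmetic can be illustrated with a pure-Python moving average. The name `window_average_sketch` is hypothetical, and averaging only over full windows (so the output is shorter than the input by 2 * half_window_size) is an assumption; how the library treats the edges is not stated here.

```python
def window_average_sketch(values, half_window_size=2):
    """Moving average over windows of size 2 * half_window_size + 1."""
    window = 2 * half_window_size + 1
    # Assumption: only positions with a complete window produce a value
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]

print(window_average_sketch([1, 2, 3, 4, 5, 6, 7], half_window_size=1))
# [2.0, 3.0, 4.0, 5.0, 6.0]
```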