mdciao.utils.lists
Miscellaneous operations on list or list-like objects
Functions
assert_min_len | Checks if an iterable satisfies the criteria of minimum length.
assert_no_intersection | Checks if two or more lists contain the same integer.
contiguous_ranges | For every unique entry in list_in, return the contiguous ranges in the list.
does_not_contain_strings | Checks if iterable has any string element, returns False if it contains at least one string.
exclude_same_fragments_from_residx_pairlist | If the members of the pair belong to the same fragment, exclude them from pairlist.
find_parent_list | For each sublist, return the index of the parent list.
force_iterable | Forces var to be iterable, if not already.
hash_list | Try to hash all the objects of a list (regardless of type) into one hash.
idx_at_fraction | Index of val_desc_order where np.cumsum(val)/np.sum(val) >= frac for the first time.
in_what_N_fragments | For each element of idxs, return the index of “fragments” in which it appears.
in_what_fragment | For the residue id, returns the name (if provided) or the index of the “fragment” in which it appears.
is_iterable | Checks if the input is an iterable or not.
join_lists | Provided a list of lists, join them following idxs_of_lists_to_join.
put_this_idx_first_in_pair | Returns the original pair if the value already appears first, else returns the reversed pair.
rangeexpand | For a given integer range or multiple integer ranges, returns a list of individual integers.
re_warp | Return iterable array_in as a list of arrays, each one with the length specified in lengths.
remove_from_lists | Wraps safely around numpy.setdiff1d, not returning empty lists.
unique_list_of_iterables_by_tuple_hashing | Returns the unique entries (if there are duplicates) from a list of iterables.
unique_product_w_intersection | Fast way to create the product of two intersecting sets without repeated/unwanted pairs.
window_average_fast | Returns the moving average using np.convolve.
mdciao.utils.lists.assert_min_len(input_iterable, min_len=2)
Checks if an iterable satisfies the criteria of minimum length (default minimum length is 2).
- Parameters
input_iterable (numpy array, list of lists) – for example np.zeros((2,1,1)) or [[1,2],[3,4]] when min_len = 2
min_len – minimum length which the iterable should satisfy (default is 2)
- Returns
Prints an error if any item within the iterable has fewer elements than min_len
mdciao.utils.lists.assert_no_intersection(list_of_lists_of_integers, word='iterables')
Checks if two or more lists contain the same integer.
- Parameters
list_of_lists_of_integers (list of lists) –
- Returns
Prints an assertion message if inner lists have the same integer, else no output
mdciao.utils.lists.contiguous_ranges(list_in)
For every unique entry in list_in, return the contiguous ranges in the list.
- Parameters
list_in (list) –
- Returns
ranges – The keys are the unique entries of list_in, the values are the ranges in which each entry appears
- Return type
dict
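The grouping described above can be sketched in pure Python. This is an illustration of the documented behavior, not mdciao's implementation: the name contiguous_ranges_sketch is hypothetical, and representing each range as a list of consecutive positions is an assumption.

```python
from itertools import groupby


def contiguous_ranges_sketch(list_in):
    """Map each unique entry to the runs of consecutive positions holding it."""
    ranges = {}
    position = 0
    # groupby yields one group per run of equal, adjacent entries
    for entry, run in groupby(list_in):
        run_length = len(list(run))
        # Assumption: a "range" is stored as the list of positions in the run
        ranges.setdefault(entry, []).append(
            list(range(position, position + run_length))
        )
        position += run_length
    return ranges


print(contiguous_ranges_sketch(["A", "A", "B", "A", "B", "B"]))
# {'A': [[0, 1], [3]], 'B': [[2], [4, 5]]}
```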
mdciao.utils.lists.does_not_contain_strings(iterable)
Checks if iterable has any string element, returns False if it contains at least one string.
- Parameters
iterable (integer, float, string or any combination thereof) –
- Returns
True if iterable does not contain any string, else False
- Return type
boolean
mdciao.utils.lists.exclude_same_fragments_from_residx_pairlist(pairlist, fragments, return_excluded_idxs=False)
If the members of the pair belong to the same fragment, exclude them from pairlist.
- Parameters
pairlist (list of iterables) – each iterable within the list should be a pair.
fragments (list of iterables) – each inner list should have residue indexes that form a fragment
return_excluded_idxs (boolean) – True if index of excluded pair is needed as an output. (Default is False).
- Returns
pairs that don’t belong to the same fragment, or index of the excluded pairs if return_excluded_idxs is True
- Return type
list
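A minimal sketch of this filtering, assuming non-overlapping fragments and assuming that residues absent from every fragment are never excluded. The function name is hypothetical and the code does not reproduce mdciao's implementation.

```python
def exclude_same_fragment_pairs_sketch(pairlist, fragments, return_excluded_idxs=False):
    """Drop pairs whose two members sit in the same fragment."""
    # Lookup table: residue index -> index of the fragment containing it
    fragment_of = {res: ii for ii, frag in enumerate(fragments) for res in frag}
    kept, excluded_idxs = [], []
    for ii, (res1, res2) in enumerate(pairlist):
        frag1 = fragment_of.get(res1)
        if frag1 is not None and frag1 == fragment_of.get(res2):
            excluded_idxs.append(ii)
        else:
            kept.append([res1, res2])
    return excluded_idxs if return_excluded_idxs else kept


pairs = [[0, 1], [0, 5], [4, 5], [1, 9]]
fragments = [[0, 1, 2], [3, 4, 5]]
print(exclude_same_fragment_pairs_sketch(pairs, fragments))
# [[0, 5], [1, 9]]
print(exclude_same_fragment_pairs_sketch(pairs, fragments, return_excluded_idxs=True))
# [0, 2]
```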
mdciao.utils.lists.find_parent_list(sublists, parent_lists)
For each sublist, return the index of the parent list.
- Parameters
sublists (list of iterables) –
parent_lists (list of iterables) –
- Returns
parents_by_child (list) – A list of len(sublists) with indices indicating which element of parent_lists each sublist is a subset of. If a sublist doesn’t have a parent, its parent is None
child_by_parent (dict) – A dictionary keyed by parent idx and valued with the idxs of their children
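The two return values can be illustrated with a small subset test. This is a sketch under assumptions: the name find_parent_list_sketch is hypothetical, the first matching parent wins, and parents without children are left out of the dictionary.

```python
def find_parent_list_sketch(sublists, parent_lists):
    """Return (parents_by_child, child_by_parent) using plain subset checks."""
    parents_by_child = []
    for sub in sublists:
        parent = None
        for jj, candidate in enumerate(parent_lists):
            if set(sub).issubset(candidate):
                parent = jj  # assumption: keep the first parent that matches
                break
        parents_by_child.append(parent)

    child_by_parent = {}
    for ii, parent in enumerate(parents_by_child):
        if parent is not None:
            child_by_parent.setdefault(parent, []).append(ii)
    return parents_by_child, child_by_parent


parents, children = find_parent_list_sketch(
    [[1, 2], [7], [5, 6]], [[0, 1, 2, 3], [4, 5, 6]]
)
print(parents)   # [0, None, 1]
print(children)  # {0: [0], 1: [2]}
```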
mdciao.utils.lists.force_iterable(var)
Forces var to be iterable, if not already.
- Parameters
var (integer, float, string, list) –
- Returns
var as iterable
- Return type
iterable
mdciao.utils.lists.hash_list(ilist)
Try to hash all the objects of a list (regardless of type) into one hash.
- Parameters
ilist (anything) –
- Returns
hashed object
mdciao.utils.lists.idx_at_fraction(val_desc_order, frac)
Index of val_desc_order where np.cumsum(val)/np.sum(val) >= frac for the first time.
- Parameters
val_desc_order (array-like of floats) – The values that determine the sum of which a fraction will be taken. They have to be in descending order
frac (float) – The target fraction of sum(val) that is needed
- Returns
n – Index of val where the fraction is attained for the first time. For the number of entries of val, just use n+1
- Return type
int
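The cumulative-sum criterion can be worked through without numpy. The sketch below swaps np.cumsum for itertools.accumulate; the name idx_at_fraction_sketch and the ValueError for an unreachable fraction are assumptions, not mdciao behavior.

```python
from itertools import accumulate


def idx_at_fraction_sketch(val_desc_order, frac):
    """First index n where cumsum(val)[n] / sum(val) >= frac."""
    total = sum(val_desc_order)
    for n, running_sum in enumerate(accumulate(val_desc_order)):
        if running_sum / total >= frac:
            return n
    # Assumption: signal an unreachable fraction (e.g. frac > 1) with an error
    raise ValueError("frac is never attained")


values = [5, 3, 1, 1]  # cumulative fractions: 0.5, 0.8, 0.9, 1.0
print(idx_at_fraction_sketch(values, 0.75))  # 1, i.e. the first 2 entries suffice
print(idx_at_fraction_sketch(values, 0.85))  # 2
```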
mdciao.utils.lists.in_what_N_fragments(idxs, fragment_list)
For each element of idxs, return the index of “fragments” in which it appears.
- Parameters
idxs (integer, float, or iterable thereof) –
fragment_list (iterable of iterables) – iterable of iterables containing integers or floats
- Returns
list of length len(idxs) containing an iterable with the indices of ‘fragments’ in which that index appears
- Return type
list
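Since fragments may overlap here, each index can map to several fragments. A minimal sketch, with a hypothetical function name and the assumption that a scalar input is wrapped into a one-element list:

```python
def in_what_n_fragments_sketch(idxs, fragment_list):
    """For each idx, list the indices of all fragments that contain it."""
    # Assumption: a single number is treated as a one-element iterable
    if not isinstance(idxs, (list, tuple, range)):
        idxs = [idxs]
    return [
        [ii for ii, fragment in enumerate(fragment_list) if idx in fragment]
        for idx in idxs
    ]


fragments = [[0, 1], [1, 2, 4], [4]]
print(in_what_n_fragments_sketch([1, 4, 9], fragments))
# [[0, 1], [1, 2], []]
```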
mdciao.utils.lists.in_what_fragment(residx, list_of_nonoverlapping_lists_of_residxs, fragment_names=None)
For the residue id, returns the name (if provided) or the index of the “fragment” in which it appears.
- Parameters
residx (int) – residue index
list_of_nonoverlapping_lists_of_residxs (list) – list of lists of integers with non-overlapping ids
fragment_names ((optional) list of strings) – fragment names for each list in list_of_nonoverlapping_lists_of_residxs
- Returns
The name (if fragment_names is provided), otherwise the index of the “fragment” in which the residue index appears
- Return type
integer or string
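A short sketch of the lookup, showing why the return type is either an integer or a string. The function name is hypothetical, and returning None for a residue that is in no fragment is an assumption.

```python
def in_what_fragment_sketch(residx, fragments, fragment_names=None):
    """Return the fragment index, or its name when names are given."""
    for ii, fragment in enumerate(fragments):
        if residx in fragment:
            return ii if fragment_names is None else fragment_names[ii]
    return None  # assumption: residue not found in any fragment


fragments = [[0, 1, 2], [3, 4, 5]]
print(in_what_fragment_sketch(4, fragments))                    # 1
print(in_what_fragment_sketch(4, fragments, ["helix", "loop"]))  # loop
```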
mdciao.utils.lists.is_iterable(var)
Checks if the input is an iterable or not.
- Parameters
var (integer, float, string, list) –
- Returns
Returns ‘True’ if var is iterable else False
- Return type
boolean
mdciao.utils.lists.join_lists(lists, idxs_of_lists_to_join)
Provided a list of lists, join them following idxs_of_lists_to_join.
- Parameters
lists (iterable of iterables) – The lists to be joined
idxs_of_lists_to_join (iterable of iterables containing integers) – The lists to join. These three things will be done before using this array:
* remove duplicate entries in each iterable
* sort the entries in each iterable in ascending order
* assert there is no overlap between iterables
- Returns
joined_lists – lists joined following the criterion of idxs_of_lists_to_join. Once the new iterables have been created by joining the initial iterables, they will be re-ordered by ascending first element
- Return type
iterable of iterables
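The three pre-processing steps and the final re-ordering can be sketched as follows. The name join_lists_sketch is hypothetical, and two points are assumptions rather than documented behavior: lists not named in any group are kept unchanged, and all lists are non-empty.

```python
def join_lists_sketch(lists, idxs_of_lists_to_join):
    """Concatenate the lists named in each group, then sort by first element."""
    # Steps 1 and 2: remove duplicates and sort inside each group
    groups = [sorted(set(group)) for group in idxs_of_lists_to_join]
    # Step 3: no list index may appear in two groups
    flat = [idx for group in groups for idx in group]
    assert len(flat) == len(set(flat)), "groups to join overlap"

    joined = []
    for group in groups:
        merged = []
        for idx in group:
            merged.extend(lists[idx])
        joined.append(merged)

    # Assumption: lists that belong to no group are passed through untouched
    untouched = [list(lists[ii]) for ii in range(len(lists)) if ii not in flat]
    # Re-order the result by ascending first element
    return sorted(joined + untouched, key=lambda item: item[0])


print(join_lists_sketch([[0, 1], [2, 3], [4, 5], [6, 7]], [[2, 0], [3, 3]]))
# [[0, 1, 4, 5], [2, 3], [6, 7]]
```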
mdciao.utils.lists.put_this_idx_first_in_pair(idx, pair)
Returns the original pair if the value already appears first, else returns the reversed pair.
- Parameters
idx – value which needs to be brought to the first place (not the index but the value itself)
pair (list) – pair of values as a list
- Returns
pair
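As a brief illustration, the reordering amounts to a conditional reversal. The function name is hypothetical, and returning the pair unchanged when idx is not its second member is an assumption about an edge case the description does not cover.

```python
def put_this_idx_first_in_pair_sketch(idx, pair):
    """Reverse the pair only when idx currently sits in second place."""
    if pair[0] != idx and pair[1] == idx:
        return pair[::-1]
    return pair  # idx is already first (or, by assumption, not in the pair)


print(put_this_idx_first_in_pair_sketch(20, [10, 20]))  # [20, 10]
print(put_this_idx_first_in_pair_sketch(10, [10, 20]))  # [10, 20]
```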
mdciao.utils.lists.rangeexpand(txt)
For a given integer range or multiple integer ranges, returns a list of individual integers. Example: “1-2,3-4” will return [1,2,3,4]
- Parameters
txt (string) – string of integers or integer range separated by “,”
- Returns
list of integers
- Return type
list
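The expansion can be sketched with plain string splitting. This is an illustration only: rangeexpand_sketch is a hypothetical name, and the sketch assumes non-negative integers and inclusive range ends, as in the example above.

```python
def rangeexpand_sketch(txt):
    """Expand a string such as "1-3,7" into [1, 2, 3, 7]."""
    expanded = []
    for token in txt.split(","):
        if "-" in token:
            # Assumption: "a-b" denotes the inclusive range a, a+1, ..., b
            start, stop = token.split("-")
            expanded.extend(range(int(start), int(stop) + 1))
        else:
            expanded.append(int(token))
    return expanded


print(rangeexpand_sketch("1-2,3-4"))  # [1, 2, 3, 4]
print(rangeexpand_sketch("1-3,7"))    # [1, 2, 3, 7]
```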
mdciao.utils.lists.re_warp(array_in, lengths)
Return iterable array_in as a list of arrays, each one with the length specified in lengths.
- Parameters
array_in (any iterable) – Iterable to be re_warped
lengths (int or iterable of integers) – Lengths of the individual elements of the returned array. If only one int is passed, all lengths will be that int. Special cases:
* more lengths than needed are passed: the last elements of the returned value are empty until all lengths have been used
* fewer lengths than array_in could take: only the lengths specified are returned in the warped list, the rest is not returned
- Returns
warped
- Return type
list
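Both special cases follow naturally from slicing. The sketch below is hypothetical (name included) and returns plain lists instead of arrays; for a single int it assumes that enough chunks are made to cover array_in, so the last chunk may be shorter.

```python
def re_warp_sketch(array_in, lengths):
    """Cut array_in into consecutive chunks of the given lengths."""
    items = list(array_in)
    if isinstance(lengths, int):
        # Assumption: use as many chunks as needed to cover all items
        n_chunks = -(-len(items) // lengths)  # ceiling division
        lengths = [lengths] * n_chunks
    warped, start = [], 0
    for length in lengths:
        # Slicing past the end yields an empty list, never an error
        warped.append(items[start:start + length])
        start += length
    return warped


print(re_warp_sketch(range(6), [2, 3]))     # [[0, 1], [2, 3, 4]]       (5 is dropped)
print(re_warp_sketch(range(4), [2, 2, 2]))  # [[0, 1], [2, 3], []]      (empty tail)
print(re_warp_sketch(range(5), 2))          # [[0, 1], [2, 3], [4]]
```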
mdciao.utils.lists.remove_from_lists(list_of_lists, remove_these)
Wraps safely around numpy.setdiff1d, not returning empty lists.
- Parameters
list_of_lists (iterable of iterables) –
remove_these (iterable) –
- Returns
clean_list
- Return type
list
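The idea can be shown without numpy. The sketch below replaces numpy.setdiff1d with a set difference followed by sorting (setdiff1d also returns sorted unique values); the function name is hypothetical.

```python
def remove_from_lists_sketch(list_of_lists, remove_these):
    """Remove unwanted values from every sublist and drop emptied sublists."""
    unwanted = set(remove_these)
    clean_list = []
    for sublist in list_of_lists:
        kept = sorted(set(sublist) - unwanted)  # stand-in for numpy.setdiff1d
        if kept:  # sublists that became empty are not returned
            clean_list.append(kept)
    return clean_list


print(remove_from_lists_sketch([[0, 1, 2], [3], [5, 4]], [3, 4]))
# [[0, 1, 2], [5]]
```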
mdciao.utils.lists.unique_list_of_iterables_by_tuple_hashing(ilist, return_idxs=False, ignore_order=False)
Returns the unique entries (if there are duplicates) from a list of iterables.
Default is to take order into account, i.e. [[0,1],[1,0]] are considered different iterables.
If ilist contains non-iterables, they will be turned into iterables, s.t. 1==[1]==np.array(1) and ‘A’==[‘A’]. They will also be returned as iterables.
- Parameters
ilist (list of iterables) – list of iterables with redundant entries (redundant in the list, not in entries)
return_idxs (boolean) – ‘True’ if required to return indices instead of unique list. (Default is False).
ignore_order (bool, default is False) – ignore order, s.t. [0,1] and [1,0] are considered equal. Only the first instance ([0,1]) is kept
- Returns
result – list of unique iterables or indices of ‘ilist’ where the unique entries are
- Return type
list
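The effect of return_idxs and ignore_order can be illustrated with a set of tuples. This is a sketch, not mdciao's implementation: the function name is hypothetical, and tracking seen tuples in a set is a choice made here for brevity.

```python
def unique_by_tuple_hashing_sketch(ilist, return_idxs=False, ignore_order=False):
    """Keep the first occurrence of each entry, comparing entries as tuples."""
    seen = set()
    unique, idxs = [], []
    for ii, item in enumerate(ilist):
        # Non-iterables (and strings) are wrapped, so 1 and [1] compare equal
        if isinstance(item, str) or not hasattr(item, "__iter__"):
            item = [item]
        key = tuple(sorted(item)) if ignore_order else tuple(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
            idxs.append(ii)
    return idxs if return_idxs else unique


data = [[0, 1], [1, 0], [0, 1], 2]
print(unique_by_tuple_hashing_sketch(data))                     # [[0, 1], [1, 0], [2]]
print(unique_by_tuple_hashing_sketch(data, ignore_order=True))  # [[0, 1], [2]]
print(unique_by_tuple_hashing_sketch(data, return_idxs=True))   # [0, 1, 3]
```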
mdciao.utils.lists.unique_product_w_intersection(a1, a2)
Fast way to create the product of two intersecting sets without repeated/unwanted pairs.
Consider that
>>> list(itertools.product([0,1,2,3],[2,3,4,5]))
[(0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (1, 3), (1, 4), (1, 5), (2, 2), (2, 3), (2, 4), (2, 5), (3, 2), (3, 3), (3, 4), (3, 5)]
has the repeated/unwanted pairs (2,2), (3,3), (3,2), which need to be taken out a posteriori by comparing pairs.
The unique_list_of_iterables_by_tuple_hashing method also accepts arrays (since pairlists may not necessarily have been generated as tuples, but also as np.arrays), s.t. the arrays need to be cast into tuples before hashing, with one comparison per pair (grows quadratically):
>>> a1 = np.arange(200)
>>> a2 = np.arange(195,300)
>>> pairs = np.array(list(itertools.product(a1,a2)))
>>> %timeit mdciao.utils.lists.unique_list_of_iterables_by_tuple_hashing(pairs)
2.83 s ± 170 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Whereas
>>> %timeit mdciao.utils.lists.unique_product_w_intersection(a1,a2)
47 ms ± 394 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
For reference
>>> %timeit list(itertools.product(a1,a2))
783 µs ± 5.37 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
I.e., for non-intersecting sets a1 and a2 without unwanted/repeated pairs, it is always better to use itertools.product directly.
- Parameters
a1 (iterable) – The integers of the set1
a2 (iterable) – The integers of the set2
- Returns
pairlist – The pairlist product of a1 and a2 without self-pairs (ii,ii) and with only (ii,jj) (not (jj,ii))
- Return type
np.ndarray
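To make the expected result concrete, here is a naive reference that filters itertools.product pair by pair. It is deliberately not the fast algorithm this function implements, it returns a list of tuples rather than an np.ndarray, and the name unique_product_reference is hypothetical.

```python
from itertools import product


def unique_product_reference(a1, a2):
    """Product of a1 and a2 without (ii, ii) and without mirrored repeats."""
    seen = set()
    pairs = []
    for ii, jj in product(a1, a2):
        if ii == jj:
            continue  # drop self-pairs such as (2, 2)
        key = frozenset((ii, jj))  # (2, 3) and (3, 2) share one key
        if key not in seen:
            seen.add(key)
            pairs.append((ii, jj))
    return pairs


pairs = unique_product_reference([0, 1, 2, 3], [2, 3, 4, 5])
print(len(pairs))       # 13 of the 16 raw pairs survive
print((3, 2) in pairs)  # False, because (2, 3) was kept first
```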
mdciao.utils.lists.window_average_fast(input_array_y, half_window_size=2)
Returns the moving average using np.convolve.
- Parameters
input_array_y (array) – numpy array for which the moving average should be calculated
half_window_size – the actual window size will be 2 * half_window_size + 1. Example: when half_window_size = 2, the moving average calculation will use window = 5
- Return type
array
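The window arithmetic can be checked with a pure-Python moving average. This sketch does not use np.convolve, carries a hypothetical name, and assumes that only full windows contribute (comparable to a "valid" convolution), which may differ from how mdciao handles the edges.

```python
def window_average_sketch(values, half_window_size=2):
    """Moving average over windows of size 2 * half_window_size + 1."""
    window = 2 * half_window_size + 1
    # Assumption: only positions where the full window fits are returned
    return [
        sum(values[ii:ii + window]) / window
        for ii in range(len(values) - window + 1)
    ]


data = [1, 2, 3, 4, 5, 6, 7]
print(window_average_sketch(data, half_window_size=1))  # [2.0, 3.0, 4.0, 5.0, 6.0]
print(window_average_sketch(data, half_window_size=2))  # [3.0, 4.0, 5.0]
```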