gnomad.utils.release

`gnomad.utils.release.make_faf_index_dict`(...)	Create a look-up Dictionary for entries contained in the filter allele frequency annotation array.
`gnomad.utils.release.make_freq_index_dict`(...)	Create a look-up Dictionary for entries contained in the frequency annotation array.
`gnomad.utils.release.make_freq_index_dict_from_meta`(...)	Create a dictionary for accessing the frequency array.

gnomad.utils.release.make_faf_index_dict(faf_meta, groups=['adj'], gen_anc_groups=['afr', 'amr', 'asj', 'eas', 'fin', 'mid', 'nfe', 'remaining', 'sas'], sexes=['XX', 'XY'], label_delimiter='_')

Create a look-up Dictionary for entries contained in the filter allele frequency annotation array.

Parameters:

faf_meta (List[Dict[str, str]]) – Global annotation containing the set of groupings for each element of the faf array (e.g., [{'group': 'adj'}, {'group': 'adj', 'gen_anc': 'nfe'}])
groups (List[str]) – List of sample groups (e.g., adj, raw). Default is ['adj'], as shown in the signature
gen_anc_groups (List[str]) – List of sample genetic ancestry group names for gnomAD data type. Default is GEN_ANC_GROUPS[CURRENT_MAJOR_RELEASE]["exomes"].
sexes (List[str]) – List of sample sexes used in VCF export. Default is SEXES
label_delimiter (str) – String used as delimiter when making group label combinations

Return type:

Dict[str, int]

Returns:

Dictionary keyed by the grouping labels of the faf annotation (group, genetic ancestry group, and sex combinations), where values are the corresponding 0-based indices for the groupings in the faf_meta array
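The lookup described above can be illustrated in plain Python. This is a minimal sketch, not the library's implementation: the helper names `label_combos` and `sketch_faf_index` are hypothetical, and the order in which label parts are joined (ancestry group, then sex, then group) is an assumption made for the example.

```python
from itertools import product
from typing import Dict, List, Tuple


def label_combos(label_groups: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """Return every combination that takes one value from each grouping."""
    # Assumed part order for this sketch: ancestry group, sex, then group.
    order = ["gen_anc", "sex", "group"]
    keys = [k for k in order if k in label_groups]
    return list(product(*(label_groups[k] for k in keys)))


def sketch_faf_index(
    faf_meta: List[Dict[str, str]],
    groups: List[str],
    gen_anc_groups: List[str],
    sexes: List[str],
    label_delimiter: str = "_",
) -> Dict[str, int]:
    """Map each label found in faf_meta to its 0-based position."""
    groupings = [
        {"group": groups},
        {"group": groups, "gen_anc": gen_anc_groups},
        {"group": groups, "sex": sexes},
        {"group": groups, "gen_anc": gen_anc_groups, "sex": sexes},
    ]
    index: Dict[str, int] = {}
    for label_groups in groupings:
        for combo in label_combos(label_groups):
            for i, entry in enumerate(faf_meta):
                # An entry matches when its values are exactly the combo's parts.
                if set(entry.values()) == set(combo):
                    index[label_delimiter.join(combo)] = i
    return index


faf_meta = [{"group": "adj"}, {"group": "adj", "gen_anc": "nfe"}]
faf_index = sketch_faf_index(faf_meta, ["adj"], ["afr", "nfe"], ["XX", "XY"])
print(faf_index)
```

Only labels whose grouping actually appears in `faf_meta` end up in the result, so combinations such as afr with no matching metadata entry are absent from the dictionary.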

gnomad.utils.release.make_freq_index_dict(freq_meta, groups=['adj', 'raw'], gen_anc_groups=['afr', 'amr', 'asj', 'eas', 'fin', 'mid', 'nfe', 'remaining', 'sas'], sexes=['XX', 'XY'], subsets=['non_ukb'], downsamplings=None, label_delimiter='_')

Create a look-up Dictionary for entries contained in the frequency annotation array.

Parameters:

freq_meta (List[Dict[str, str]]) – List containing the set of groupings for each element of the freq array (e.g., [{'group': 'adj'}, {'group': 'adj', 'gen_anc': 'nfe'}])
groups (List[str]) – List of sample groups [adj, raw]. Default is GROUPS
gen_anc_groups (List[str]) – List of sample global genetic ancestry group names for gnomAD data type. Default is GEN_ANC_GROUPS[CURRENT_MAJOR_RELEASE]["exomes"].
sexes (List[str]) – List of sample sexes used in VCF export. Default is SEXES
subsets (List[str]) – List of sample subsets in dataset. Default is SUBSETS[CURRENT_MAJOR_RELEASE]
downsamplings (Optional[List[int]]) – List of downsampling cohort sizes present in global frequency array
label_delimiter (str) – String used as delimiter when making group label combinations

Return type:

Dict[str, int]

Returns:

Dictionary keyed by the grouping combinations found in the frequency array, where values are the corresponding 0-based indices for the groupings in the freq_meta array
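To show how such a dictionary is typically used, the sketch below looks up one grouping in a frequency array by label. Everything here is illustrative: the `freq` values are made up, the index dictionary is written by hand rather than produced by the library, and the label format (`nfe_adj`) is an assumption.

```python
# Metadata and frequency arrays are parallel: entry i of freq_meta
# describes entry i of freq. All values are invented for illustration.
freq_meta = [
    {"group": "adj"},
    {"group": "raw"},
    {"group": "adj", "gen_anc": "nfe"},
]
freq = [
    {"AC": 10, "AN": 1000},
    {"AC": 12, "AN": 1100},
    {"AC": 4, "AN": 400},
]

# Hand-written stand-in for the returned Dict[str, int] (assumed label format).
freq_index_dict = {"adj": 0, "raw": 1, "nfe_adj": 2}


def lookup(label: str) -> dict:
    """Return the frequency entry for a grouping label."""
    return freq[freq_index_dict[label]]


nfe = lookup("nfe_adj")
nfe_af = nfe["AC"] / nfe["AN"]  # allele frequency for the nfe/adj grouping
print(nfe_af)
```

Because the values are positions, the same dictionary can index any array that shares the ordering of `freq_meta`.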

gnomad.utils.release.make_freq_index_dict_from_meta(freq_meta, label_delimiter='_', sort_order=['subset', 'downsampling', 'grpmax', 'popmax', 'gen_anc', 'pop', 'subgrp', 'subpop', 'sex', 'group'])

Create a dictionary for accessing the frequency array.

The dictionary is keyed by the grouping combinations found in the frequency metadata array, where values are the corresponding 0-based indices for the groupings in the frequency array. For example, if the freq_meta entry {'gen_anc': 'nfe', 'sex': 'XX'} corresponds to the 5th entry in the frequency array, the returned dictionary would contain the entry {'nfe_XX': 4}.

Parameters:

freq_meta (List[Dict[str, str]]) – List of dictionaries containing frequency metadata.
label_delimiter (str) – Delimiter to use when joining frequency metadata labels.
sort_order (Optional[List[str]]) – List of frequency metadata labels to use when sorting the dictionary.

Return type:

Dict[str, int]

Returns:

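Dictionary mapping each grouping label (the metadata values joined with label_delimiter) to the 0-based index of that grouping in the frequency array.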
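The behavior described for this function can be approximated in a few lines of plain Python. This is a sketch under stated assumptions, not the library code: `sketch_index_from_meta` is a hypothetical name, and placing keys that are missing from `sort_order` at the end of the label is a choice made for this example; the library may treat such entries differently.

```python
from typing import Dict, List, Optional

# Default order copied from the signature above.
SORT_ORDER = [
    "subset", "downsampling", "grpmax", "popmax", "gen_anc",
    "pop", "subgrp", "subpop", "sex", "group",
]


def sketch_index_from_meta(
    freq_meta: List[Dict[str, str]],
    label_delimiter: str = "_",
    sort_order: Optional[List[str]] = SORT_ORDER,
) -> Dict[str, int]:
    """Build {label: index} from a list of frequency metadata dictionaries."""
    index: Dict[str, int] = {}
    for i, entry in enumerate(freq_meta):
        keys = list(entry)
        if sort_order is not None:
            # Assumption: keys absent from sort_order are placed last.
            keys.sort(
                key=lambda k: sort_order.index(k) if k in sort_order else len(sort_order)
            )
        index[label_delimiter.join(entry[k] for k in keys)] = i
    return index


freq_meta = [
    {"group": "adj"},
    {"group": "raw"},
    {"group": "adj", "gen_anc": "nfe"},
    {"group": "adj", "sex": "XX"},
    {"sex": "XX", "gen_anc": "nfe"},  # 5th entry, keys deliberately out of order
]
index_dict = sketch_index_from_meta(freq_meta)
print(index_dict["nfe_XX"])
```

With `sort_order=None` the sketch joins values in each dictionary's own key order, so the 5th entry above would be labeled `XX_nfe` instead.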