gnomad_qc.v5.data_ingestion.federated_validity_checks
Script to generate annotations for variant QC on gnomAD v4.
Module Functions
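validate_config(config, schema) | Validate JSON config inputs. |
validate_ht_fields(ht, config) | Check that necessary fields defined in the JSON config are present in the Hail Table. |
check_missingness(ht[, ...]) | Check for and report the fraction of missing data in the Table. |
validate_federated_data(ht, freq_meta_expr[, ...]) | Perform validity checks on federated data. |
| Perform validity checks for federated data. |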
- gnomad_qc.v5.data_ingestion.federated_validity_checks.validate_config(config, schema)[source]
Validate JSON config inputs.
- Parameters:
config (Dict[str, Any]) – JSON configuration for parameter inputs.
schema (Dict[str, Any]) – JSON schema to use for validation.
- Return type:
None
- Returns:
None.
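The idea of validating a config against a schema can be sketched as follows. This is not the module's implementation: it is a simplified, hand-rolled stand-in that understands only the `required` and `type` keywords, and the helper name `check_config_against_schema` and the example key are hypothetical. A real script would more likely delegate to a full JSON Schema validator.

```python
from typing import Any, Dict, List

# Map a small subset of JSON Schema type names to Python types.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_config_against_schema(
    config: Dict[str, Any], schema: Dict[str, Any]
) -> List[str]:
    """Return a list of problems; an empty list means the config passed."""
    problems = []
    # Every key listed under "required" must be present in the config.
    for key in schema.get("required", []):
        if key not in config:
            problems.append(f"missing required key: {key}")
    # Keys that are present must match the declared type, when one is given.
    for key, rule in schema.get("properties", {}).items():
        expected = _JSON_TYPES.get(rule.get("type"))
        if key not in config or expected is None:
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        wrong_type = not isinstance(value, expected) or (
            isinstance(value, bool) and rule["type"] != "boolean"
        )
        if wrong_type:
            problems.append(f"{key}: expected {rule['type']}")
    return problems


# Hypothetical schema and configs, for illustration only.
example_schema = {
    "required": ["missingness_threshold"],
    "properties": {"missingness_threshold": {"type": "number"}},
}
good_config = {"missingness_threshold": 0.5}
bad_config = {"missingness_threshold": "high"}
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in the config at once.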
- gnomad_qc.v5.data_ingestion.federated_validity_checks.validate_ht_fields(ht, config)[source]
Check that necessary fields defined in the JSON config are present in the Hail Table.
- Parameters:
ht (Table) – Hail Table.
config (Dict[str, Any]) – JSON configuration for parameter inputs.
- Return type:
None
- Returns:
None.
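The field-presence check can be illustrated without Hail. The sketch below assumes the config has already been reduced to a flat list of required field names and that the Table's field names are available as plain strings; the helper `find_missing_fields` and the field names used are hypothetical, and the real config layout is not shown in this page.

```python
from typing import Iterable, List


def find_missing_fields(
    table_fields: Iterable[str], required_fields: Iterable[str]
) -> List[str]:
    """Return required field names that are absent from the table, in input order."""
    present = set(table_fields)
    return [field for field in required_fields if field not in present]


# Illustrative field names only.
table_fields = ["locus", "alleles", "freq", "fafmax"]
required_fields = ["freq", "grpmax", "fafmax", "histograms"]
missing = find_missing_fields(table_fields, required_fields)
```

Collecting all missing names before reporting avoids a fix-one-rerun cycle when several fields are absent.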
- gnomad_qc.v5.data_ingestion.federated_validity_checks.check_missingness(ht, missingness_threshold=0.5, struct_annotations=['grpmax', 'fafmax', 'histograms'])[source]
Check for and report the fraction of missing data in the Table.
- Parameters:
ht (Table) – Input Table.
missingness_threshold (float) – Upper cutoff for allowed amount of missingness. Default is 0.50.
struct_annotations (List[str]) – List of struct annotations to check for missingness. Default is [‘grpmax’, ‘fafmax’, ‘histograms’].
- Return type:
None
- Returns:
None
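A rough sketch of what a missingness check computes, using plain Python rows instead of a Hail Table. The helper names are hypothetical, the data is made up for illustration, and treating a fraction exactly equal to the threshold as allowed is an assumption; the module's own comparison may differ.

```python
from typing import Any, Dict, List, Optional


def missing_fractions(
    rows: List[Dict[str, Optional[Any]]], fields: List[str]
) -> Dict[str, float]:
    """Compute, per field, the fraction of rows where the field is None or absent."""
    total = len(rows)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if row.get(field) is None) / total
        for field in fields
    }


def fields_over_threshold(
    fractions: Dict[str, float], threshold: float = 0.5
) -> List[str]:
    """Return fields whose missing fraction is strictly above the threshold (assumed)."""
    return sorted(field for field, frac in fractions.items() if frac > threshold)


# Made-up rows: grpmax is missing in 3 of 4 rows, fafmax in 1 of 4.
rows = [
    {"grpmax": None, "fafmax": 1},
    {"grpmax": None, "fafmax": None},
    {"grpmax": 3, "fafmax": 2},
    {"grpmax": None, "fafmax": 4},
]
fractions = missing_fractions(rows, ["grpmax", "fafmax"])
flagged = fields_over_threshold(fractions, threshold=0.5)
```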
- gnomad_qc.v5.data_ingestion.federated_validity_checks.validate_federated_data(ht, freq_meta_expr, missingness_threshold=0.5, struct_annotations_for_missingness=['grpmax', 'fafmax', 'histograms'], freq_annotations_to_sum=['AC', 'AN', 'homozygote_count'], freq_sort_order=['gen_anc', 'sex', 'group'], nhomalt_metric='nhomalt', verbose=False)[source]
Perform validity checks on federated data.
- Parameters:
ht (Table) – Input Table.
freq_meta_expr (ArrayExpression) – Metadata expression that contains the values of the elements in meta_indexed_expr. The most often used expression is freq_meta to index into a ‘freq’ array (example: ht.freq_meta).
freq_annotations_to_sum (List[str]) – List of annotation fields within meta_expr to sum. Default is [‘AC’, ‘AN’, ‘homozygote_count’].
freq_sort_order (List[str]) – Order in which groupings are unfurled into flattened annotations. Default is [“gen_anc”, “sex”, “group”].
nhomalt_metric (str) – Name of metric denoting homozygous alternate count. Default is “nhomalt”.
verbose (bool) – If True, show top values of annotations being checked, including checks that pass; if False, show only top values of annotations that fail checks. Default is False.
missingness_threshold (float) – Default is 0.5.
struct_annotations_for_missingness (List[str]) – Default is [‘grpmax’, ‘fafmax’, ‘histograms’].
- Return type:
None
- Returns:
None
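The relationship between freq_meta and the ‘freq’ array described above can be illustrated in plain Python: each metadata entry describes the grouping of the frequency struct at the same position, so finding a grouping in the metadata gives the index to use in the array. The helper `find_freq_index` is hypothetical and the values are made up; in the real Table both are Hail expressions rather than Python lists.

```python
from typing import Dict, List


def find_freq_index(freq_meta: List[Dict[str, str]], target: Dict[str, str]) -> int:
    """Return the position of the metadata entry that exactly matches target."""
    for index, meta in enumerate(freq_meta):
        if meta == target:
            return index
    raise KeyError(f"no freq_meta entry matches {target}")


# Made-up metadata and parallel frequency entries, for illustration only.
freq_meta = [
    {"group": "adj"},
    {"group": "raw"},
    {"group": "adj", "gen_anc": "afr"},
]
freq = [
    {"AC": 10, "AN": 100},
    {"AC": 12, "AN": 110},
    {"AC": 4, "AN": 40},
]

afr_index = find_freq_index(freq_meta, {"group": "adj", "gen_anc": "afr"})
afr_ac = freq[afr_index]["AC"]
```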