gnomad_qc.v5.resources.annotations
Script containing annotation related resources.
Module Functions
get_trio_stats | Get gnomAD v5 (AoU genomes only) trio stats VersionedTableResource.
get_sib_stats | Get the gnomAD v5 (AoU genomes only) sibling stats VersionedTableResource.
get_aou_downsampling | Get the downsampling annotation table.
group_membership | Get the group membership Table for coverage, AN, quality histograms, and frequency calculations.
qual_hists | Get the quality histograms annotation table.
coverage_and_an_path | Fetch filepath for all sites coverage or allele number Table.
get_freq | Get the frequency annotation Table for v5.
get_info_ht | Get the gnomAD v5 (AoU genomes only) info VersionedTableResource.
info_vcf_path | Path to sites VCF (input information for running VQSR).
get_aou_vcf_header | Get path to AoU annotation sites-only VCF header.
get_aou_annotated_sites_only_vcf | Get path to AoU sites-only VCF with annotations needed for variant QC.
get_vep | Get the gnomAD v5 VEP annotation VersionedTableResource.
validate_vep_path | Get the gnomAD v5 VEP annotation VersionedTableResource for validation counts.
- gnomad_qc.v5.resources.annotations.get_trio_stats(test=False, environment='rwb')
Get gnomAD v5 (AoU genomes only) trio stats VersionedTableResource.
- Parameters:
test (bool) – Whether to use a temporary path for testing.
environment (str) – Environment to use. Default is “rwb”. Must be one of “rwb” or “batch”.
- Return type:
VersionedTableResource
- Returns:
AoU trio stats VersionedTableResource.
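
The `environment` argument recurs throughout this module with a fixed set of accepted values. The following is a minimal sketch of how such a check might look. `check_environment` is a hypothetical helper written for illustration only; it is not part of gnomad_qc, and the module's actual validation logic is not shown in this reference.

```python
from typing import Iterable


# Hypothetical helper for illustration; NOT part of gnomad_qc.
def check_environment(
    environment: str, allowed: Iterable[str] = ("rwb", "batch")
) -> str:
    """Return `environment` unchanged if it is in `allowed`, else raise ValueError."""
    allowed = tuple(allowed)
    if environment not in allowed:
        raise ValueError(
            f"environment must be one of {allowed}, got {environment!r}"
        )
    return environment
```

Functions that also accept “dataproc” would pass a wider `allowed` tuple, for example `check_environment("dataproc", allowed=("rwb", "batch", "dataproc"))`.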
- gnomad_qc.v5.resources.annotations.get_sib_stats(test=False, environment='rwb')
Get the gnomAD v5 (AoU genomes only) sibling stats VersionedTableResource.
- Parameters:
test (bool) – Whether to use a tmp path for testing.
environment (str) – Environment to use. Default is “rwb”. Must be one of “rwb” or “batch”.
- Return type:
VersionedTableResource
- Returns:
AoU sibling stats VersionedTableResource.
- gnomad_qc.v5.resources.annotations.get_aou_downsampling(test=False, environment='rwb')
Get the downsampling annotation table.
v5 downsampling only applies to the AoU dataset.
- Parameters:
test (bool) – Whether to use a tmp path for tests. Default is False.
environment (str) – Environment to use. Default is “rwb”. Must be one of “rwb” or “batch”.
- Return type:
VersionedTableResource
- Returns:
Hail Table containing downsampling annotations.
- gnomad_qc.v5.resources.annotations.group_membership(test=False, data_set='aou', environment='rwb')
Get the group membership Table for coverage, AN, quality histograms, and frequency calculations.
- Parameters:
test (bool) – Whether to use a tmp path for tests. Default is False.
data_set (str) – Data set of annotation resource. Default is “aou”.
environment (str) – Environment to use. Default is “rwb”. Must be one of “rwb” or “batch”.
- Return type:
VersionedTableResource
- Returns:
Hail Table containing group membership annotations.
- gnomad_qc.v5.resources.annotations.qual_hists(test=False, environment='rwb')
Get the quality histograms annotation table.
- Parameters:
test (bool) – Whether to use a tmp path for tests. Default is False.
environment (str) – Environment to use for quality histograms. Default is “rwb”. Must be one of “rwb” or “batch”.
- Return type:
VersionedTableResource
- Returns:
Hail Table containing quality histogram annotations.
- gnomad_qc.v5.resources.annotations.coverage_and_an_path(test=False, data_set='aou', environment='rwb')
Fetch filepath for all sites coverage or allele number Table.
Note
If data_set is “gnomad”, the returned table only contains coverage and AN for consent drop samples.
- Parameters:
test (bool) – Whether to use a tmp path for testing. Default is False.
data_set (str) – Dataset identifier. Must be one of “aou” or “gnomad”. Default is “aou”.
environment (str) – Environment to use. Default is “rwb”. Must be one of “rwb”, “batch”, or “dataproc”.
- Return type:
VersionedTableResource
- Returns:
Coverage and allele number Hail Table.
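
To illustrate how the three arguments above might combine into a single location, here is a rough sketch. Everything in it is an assumption made for illustration: the bucket roots, the directory layout, the file name, and the function `sketch_coverage_and_an_path` itself are hypothetical and do not reflect the real paths or implementation in gnomad_qc.

```python
# Hypothetical storage roots; the real buckets used by gnomad_qc are not
# given in this reference.
HYPOTHETICAL_ROOTS = {
    "rwb": "gs://example-rwb-bucket",
    "batch": "gs://example-batch-bucket",
    "dataproc": "gs://example-dataproc-bucket",
}
ALLOWED_DATA_SETS = ("aou", "gnomad")


def sketch_coverage_and_an_path(
    test: bool = False, data_set: str = "aou", environment: str = "rwb"
) -> str:
    """Build an illustrative path from the test flag, data set, and environment."""
    if data_set not in ALLOWED_DATA_SETS:
        raise ValueError(f"data_set must be one of {ALLOWED_DATA_SETS}")
    if environment not in HYPOTHETICAL_ROOTS:
        raise ValueError(f"environment must be one of {tuple(HYPOTHETICAL_ROOTS)}")
    root = HYPOTHETICAL_ROOTS[environment]
    # A test run writes under a temporary prefix so real outputs are untouched.
    prefix = f"{root}/tmp" if test else root
    return f"{prefix}/annotations/{data_set}.coverage_and_an.ht"
```

The point of the sketch is only that `test` switches to a temporary location while `data_set` and `environment` select which resource and which storage root are used.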
- gnomad_qc.v5.resources.annotations.get_freq(version='5.0', data_type='genomes', test=False, data_set='aou', environment='rwb')
Get the frequency annotation Table for v5.
- Parameters:
version (str) – Version of annotation path to return.
data_type (str) – Data type of annotation resource (“genomes” or “exomes”).
test (bool) – Whether to use a tmp path for testing.
data_set (str) – Data set of annotation resource. Default is “aou”.
environment (str) – Environment to use. Default is “rwb”. Must be one of “rwb”, “batch”, or “dataproc”.
- Return type:
TableResource
- Returns:
Hail Table containing frequency annotations.
- gnomad_qc.v5.resources.annotations.get_info_ht(test=False, environment='batch')
Get the gnomAD v5 (AoU genomes only) info VersionedTableResource.
- Parameters:
test (bool) – Whether to use a tmp path for testing.
environment (str) – Environment to use. Default is “batch”. Must be one of “rwb” or “batch”.
- Return type:
VersionedTableResource
- Returns:
Info VersionedTableResource.
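
Many functions in this module return a VersionedTableResource. As a rough mental model, such a resource can be thought of as a mapping from version strings to table locations, with one version marked as the default. The class below is a simplified, hypothetical stand-in written for this page; it is not the real VersionedTableResource, and the path shown in the usage note is invented.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# Simplified stand-in for illustration; NOT the real VersionedTableResource.
@dataclass
class SketchVersionedResource:
    """Map version strings to table paths, with one default version."""

    default_version: str
    versions: Dict[str, str]

    def path(self, version: Optional[str] = None) -> str:
        """Return the path for `version`, or for the default version if None."""
        key = self.default_version if version is None else version
        if key not in self.versions:
            raise KeyError(
                f"unknown version {key!r}; available: {sorted(self.versions)}"
            )
        return self.versions[key]
```

For example, `SketchVersionedResource("5.0", {"5.0": "gs://example-bucket/info/5.0/info.ht"}).path()` returns the hypothetical 5.0 path, and asking for a version that was never registered raises a `KeyError`.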
- gnomad_qc.v5.resources.annotations.info_vcf_path(version='5.0', test=False, environment='batch')
Path to sites VCF (input information for running VQSR).
- Parameters:
version (str) – Version of annotation path to return.
test (bool) – Whether to use a tmp path for testing.
environment (str) – Environment to use. Must be one of “rwb” or “batch”. Default is “batch”.
- Return type:
str
- Returns:
String for the path to the info VCF.
- gnomad_qc.v5.resources.annotations.get_aou_vcf_header(environment='batch')
Get path to AoU annotation sites-only VCF header.
This header is needed to import the sites-only VCF correctly: the previous header declares the QUALapprox annotation as an int, but it is actually a float.
- Parameters:
environment (str) – Environment to use. Default is “batch”. Must be one of “rwb” or “batch”.
- Return type:
str
- Returns:
Path to the VCF header file.
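
The type mismatch described above can be pictured with a small, self-contained sketch that rewrites the declared type of one INFO field in a list of VCF header lines. `patch_info_type` is a hypothetical helper, and the header line used in the usage note is an assumed example rather than the real AoU header; how the corrected header file was actually produced is not described in this reference.

```python
import re
from typing import List


# Hypothetical helper for illustration; NOT part of gnomad_qc.
def patch_info_type(
    header_lines: List[str], info_id: str = "QUALapprox", new_type: str = "Float"
) -> List[str]:
    """Return header lines with the declared Type of one INFO field replaced."""
    # Capture everything up to and including "Type=" for the requested INFO ID,
    # then swap only the type name that follows.
    pattern = re.compile(rf"^(##INFO=<ID={re.escape(info_id)},.*?Type=)[A-Za-z]+")
    return [pattern.sub(rf"\g<1>{new_type}", line) for line in header_lines]
```

Given an assumed line such as `##INFO=<ID=QUALapprox,Number=1,Type=Integer,Description="...">`, the helper changes `Type=Integer` to `Type=Float` and leaves all other lines unchanged. When importing with Hail, a corrected header file like the one this function points to would typically be supplied through the `header_file` argument of `hl.import_vcf`.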
- gnomad_qc.v5.resources.annotations.get_aou_annotated_sites_only_vcf(environment='batch')
Get path to AoU sites-only VCF with annotations needed for variant QC.
- Parameters:
environment (str) – Environment to use. Default is “batch”. Must be one of “rwb” or “batch”.
- Return type:
str
- Returns:
Path to the annotated sites-only VCF.
- gnomad_qc.v5.resources.annotations.get_vep(test=False, vep_version='105', environment='batch')
Get the gnomAD v5 VEP annotation VersionedTableResource.
- Parameters:
test (bool) – Whether to use a tmp path for analysis of the test Table instead of the full v5 Table.
vep_version (str) – VEP version to use (e.g., “105”, “115”). Default is “105”.
environment (str) – Environment to use. Default is “batch”. Must be one of “rwb”, “batch”, or “dataproc”.
- Return type:
VersionedTableResource
- Returns:
gnomAD v5 VEP VersionedTableResource.
- gnomad_qc.v5.resources.annotations.validate_vep_path(test=False, vep_version='105', environment='batch')
Get the gnomAD v5 VEP annotation VersionedTableResource for validation counts.
- Parameters:
test (bool) – Whether to use a tmp path for analysis of the test VDS instead of the full v5 VDS.
vep_version (str) – VEP version to use (e.g., “105”, “115”). Default is “105”.
environment (str) – Environment to use. Default is “batch”. Must be one of “rwb”, “batch”, or “dataproc”.
- Return type:
VersionedTableResource
- Returns:
gnomAD v5 VEP VersionedTableResource containing validity check.