gnomad_qc.v5.annotations.generate_variant_qc_annotations
Script to generate annotations for variant QC on gnomAD v5.
usage: gnomad_qc.v5.annotations.generate_variant_qc_annotations.py
[-h] [--rwb] [--overwrite] [--test]
[--test-n-partitions [TEST_N_PARTITIONS]] [--generate-trio-stats]
[--generate-sibling-stats] [--create-info-ht]
[--lowqual-indel-phred-het-prior LOWQUAL_INDEL_PHRED_HET_PRIOR]
Named Arguments
- --rwb
Run the script in RWB environment.
Default: False
- --overwrite
Overwrite output files.
Default: False
- --test
Write to test path.
Default: False
- --test-n-partitions
Use only n partitions of the VDS as input for testing purposes (default: 2).
- --generate-trio-stats
Calculates trio stats.
Default: False
- --generate-sibling-stats
Calculates sibling stats.
Default: False
- --create-info-ht
Create the info ht containing annotations needed for variant QC.
Default: False
- --lowqual-indel-phred-het-prior
Phred-scaled prior for a het genotype at a site with a low quality indel. Default is 40. We use 1/10k bases (phred=40) to be more consistent with the filtering used by Broad’s Data Sciences Platform for VQSR.
Default: 40
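The flags above map naturally onto a standard `argparse` parser. The following is a minimal sketch of such a parser, not the project's own code: `build_parser` is a hypothetical name, and the handling of `--test-n-partitions` (a value of 2 when the flag is given without a number, `None` when the flag is absent) is an assumption inferred from the usage string and help text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Script to generate annotations for variant QC on gnomAD v5."
    )
    # Boolean switches documented with "Default: False".
    for flag, help_text in [
        ("--rwb", "Run the script in RWB environment."),
        ("--overwrite", "Overwrite output files."),
        ("--test", "Write to test path."),
        ("--generate-trio-stats", "Calculates trio stats."),
        ("--generate-sibling-stats", "Calculates sibling stats."),
        ("--create-info-ht", "Create the info ht for variant QC."),
    ]:
        parser.add_argument(flag, help=help_text, action="store_true")
    # Assumption: the optional value falls back to 2 when the flag is given bare.
    parser.add_argument(
        "--test-n-partitions",
        nargs="?",
        const=2,
        default=None,
        type=int,
        help="Use only n partitions of the VDS as input for testing purposes.",
    )
    parser.add_argument(
        "--lowqual-indel-phred-het-prior",
        type=int,
        default=40,
        help="Phred-scaled prior for a het genotype at a low quality indel site.",
    )
    return parser


# Example: a test run that builds the info HT on a reduced number of partitions.
args = build_parser().parse_args(
    ["--test", "--test-n-partitions", "--create-info-ht"]
)
print(args.test, args.test_n_partitions, args.create_info_ht)
```

Flags that are not passed keep their documented defaults, so `args.overwrite` stays `False` and `args.lowqual_indel_phred_het_prior` stays 40 in this example.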
Module Functions
| generate_ac_info_ht(vds) | Compute AC and AC_raw annotations for each allele count filter group. |
| create_info_ht(vcf_path, header_path[, ...]) | Import a VCF of AoU annotated sites, reformat annotations, and add AS_lowqual. |
| run_generate_trio_stats(mt, fam_ped) | Generate trio transmission stats from a VariantDataset and pedigree info. |
| run_generate_sib_stats(mt, relatedness_ht) | Generate sibling stats from a VariantDataset and relatedness info. |
| | Generate all variant annotations needed for variant QC. |
| | Get script argument parser. |
- gnomad_qc.v5.annotations.generate_variant_qc_annotations.generate_ac_info_ht(vds)
Compute AC and AC_raw annotations for each allele count filter group.
- Parameters:
vds (VariantDataset) – VariantDataset to use for computing AC and AC_raw annotations.
- Return type:
Table
- Returns:
Table with AC and AC_raw annotations split by high quality, release, and unrelated.
- gnomad_qc.v5.annotations.generate_variant_qc_annotations.create_info_ht(vcf_path, header_path, lowqual_indel_phred_het_prior=40, vds=None, test=False)
Import a VCF of AoU annotated sites, reformat annotations, and add AS_lowqual.
- Parameters:
vcf_path (str) – Path to the annotated sites-only VCF.
header_path (str) – Path to the header file for the VCF.
lowqual_indel_phred_het_prior (int) – Phred-scaled prior for a het genotype at a site with a low quality indel. Default is 40. We use 1/10k bases (phred=40) to be more consistent with the filtering used by Broad’s Data Sciences Platform for VQSR.
vds (VariantDataset) – VariantDataset to use for computing AC and AC_raw annotations.
test (bool) – Whether to run a test using just the first two partitions of the loaded VCF.
- Return type:
Table
- Returns:
Hail Table with reformatted annotations.
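The correspondence between "1/10k bases" and "phred=40" follows from the standard Phred definition, Q = -10 · log10(P). The short sketch below only illustrates that arithmetic; the helper names are hypothetical, and it does not show how the module applies the prior internally.

```python
import math


def phred_to_prob(phred: float) -> float:
    """Convert a Phred-scaled value Q to a probability P = 10^(-Q/10)."""
    return 10 ** (-phred / 10)


def prob_to_phred(prob: float) -> float:
    """Convert a probability P to its Phred-scaled value Q = -10 * log10(P)."""
    return -10 * math.log10(prob)


# The documented default of 40 corresponds to one het indel per 10,000 bases.
default_prior = phred_to_prob(40)
print(default_prior, prob_to_phred(1 / 10_000))
```

A larger Phred value means a smaller prior probability: a prior of 30, for instance, would correspond to 1 in 1,000.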
- gnomad_qc.v5.annotations.generate_variant_qc_annotations.run_generate_trio_stats(mt, fam_ped)
Generate trio transmission stats from a VariantDataset and pedigree info.
- Parameters:
mt (MatrixTable) – Dense trio MatrixTable.
fam_ped (Pedigree) – Pedigree containing trio info.
- Return type:
Table
- Returns:
Table containing trio stats.
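To make the idea of a transmission stat concrete, here is a toy, pure-Python sketch for a single biallelic autosomal site. It is an illustration under simplifying assumptions, not the Hail-based computation this function performs: `TrioGenotypes` and `count_transmissions` are hypothetical names, genotypes are encoded as alt-allele counts, and only trios with exactly one heterozygous parent and one homozygous-reference parent are treated as informative.

```python
from typing import Iterable, NamedTuple, Tuple


class TrioGenotypes(NamedTuple):
    """Alt-allele counts (0, 1, or 2) for one trio at one biallelic site."""

    father: int
    mother: int
    child: int


def count_transmissions(trios: Iterable[TrioGenotypes]) -> Tuple[int, int]:
    """Return (transmitted, untransmitted) counts over informative trios."""
    transmitted = 0
    untransmitted = 0
    for trio in trios:
        # Informative here means one het parent and one hom-ref parent.
        if sorted((trio.father, trio.mother)) != [0, 1]:
            continue
        if trio.child == 1:
            transmitted += 1  # the het parent passed on the alt allele
        elif trio.child == 0:
            untransmitted += 1  # the het parent passed on the ref allele
        # A hom-alt child would be a Mendelian violation and is ignored.
    return transmitted, untransmitted


example_trios = [
    TrioGenotypes(father=1, mother=0, child=1),  # transmitted
    TrioGenotypes(father=0, mother=1, child=0),  # untransmitted
    TrioGenotypes(father=1, mother=1, child=1),  # both parents het: skipped
    TrioGenotypes(father=0, mother=0, child=1),  # neither parent carries alt: skipped
]
print(count_transmissions(example_trios))
```

For true variants, transmitted and untransmitted counts are expected to be roughly balanced across many trios, which is why such counts are useful as a variant QC signal.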
- gnomad_qc.v5.annotations.generate_variant_qc_annotations.run_generate_sib_stats(mt, relatedness_ht)
Generate sibling stats from a VariantDataset and relatedness info.
- Parameters:
mt (MatrixTable) – Input MatrixTable.
relatedness_ht (Table) – Table containing relatedness info.
- Return type:
Table
- Returns:
Table containing sibling stats.