gnomad_qc.v5.annotations.generate_variant_qc_annotations

Script to generate annotations for variant QC on gnomAD v5.

usage: gnomad_qc.v5.annotations.generate_variant_qc_annotations.py
       [-h] [--rwb] [--overwrite] [--test]
       [--test-n-partitions [TEST_N_PARTITIONS]] [--generate-trio-stats]
       [--generate-sibling-stats] [--create-info-ht]
       [--lowqual-indel-phred-het-prior LOWQUAL_INDEL_PHRED_HET_PRIOR]

Named Arguments

--rwb

Run the script in the RWB environment.

Default: False

--overwrite

Overwrite output files.

Default: False

--test

Write to test path.

Default: False

--test-n-partitions

Use only n partitions of the VDS as input for testing purposes (default: 2).

--generate-trio-stats

Calculate trio stats.

Default: False

--generate-sibling-stats

Calculate sibling stats.

Default: False

--create-info-ht

Create the info ht containing annotations needed for variant QC.

Default: False

--lowqual-indel-phred-het-prior

Phred-scaled prior for a het genotype at a site with a low-quality indel. The default of 40 corresponds to a prior of 1 in 10,000 bases, chosen to be more consistent with the filtering used by Broad’s Data Sciences Platform for VQSR.

Default: 40
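The options above can be mirrored in a small argparse sketch. This is a hypothetical reconstruction from the documented flags, not the parser returned by get_script_argument_parser(); in particular, the handling of a bare --test-n-partitions (falling back to 2) is an assumption based on the usage string and help text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real one)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented variant QC annotation options."
    )
    parser.add_argument("--rwb", action="store_true")
    parser.add_argument("--overwrite", action="store_true")
    parser.add_argument("--test", action="store_true")
    # Assumption: a bare --test-n-partitions falls back to 2 partitions.
    parser.add_argument(
        "--test-n-partitions", nargs="?", const=2, default=None, type=int
    )
    parser.add_argument("--generate-trio-stats", action="store_true")
    parser.add_argument("--generate-sibling-stats", action="store_true")
    parser.add_argument("--create-info-ht", action="store_true")
    parser.add_argument("--lowqual-indel-phred-het-prior", type=int, default=40)
    return parser


# Example: a test run that only creates the info HT.
args = build_parser().parse_args(
    ["--test", "--test-n-partitions", "--create-info-ht"]
)
print(args)
```

Boolean flags default to False, matching the "Default: False" entries above, and the indel prior keeps its documented default of 40 unless overridden.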

Module Functions

gnomad_qc.v5.annotations.generate_variant_qc_annotations.generate_ac_info_ht(vds)

Compute AC and AC_raw annotations for each allele count filter group.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.create_info_ht(...)

Import a VCF of AoU annotated sites, reformat annotations, and add AS_lowqual.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.run_generate_trio_stats(mt, ...)

Generate trio transmission stats from a dense trio MatrixTable and pedigree info.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.run_generate_sib_stats(mt, ...)

Generate sibling stats from a MatrixTable and relatedness info.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.main(args)

Generate all variant annotations needed for variant QC.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.get_script_argument_parser()

Get script argument parser.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.generate_ac_info_ht(vds)

Compute AC and AC_raw annotations for each allele count filter group.

Parameters:

vds (VariantDataset) – VariantDataset to use for computing AC and AC_raw annotations.

Return type:

Table

Returns:

Table with AC and AC_raw annotations split by high quality, release, and unrelated.
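As a rough illustration of what per-group AC and AC_raw counts look like, the toy sketch below tallies alternate alleles at a single site in plain Python. It does not use Hail and is not the module's implementation. The sample data, the count_ac helper, and the assumption that AC counts only genotypes passing a quality ("adj") filter while AC_raw counts all genotypes are illustrative.

```python
from collections import defaultdict

# Hypothetical sample metadata: the filter groups each sample belongs to.
sample_groups = {
    "s1": {"high_quality", "release", "unrelated"},
    "s2": {"high_quality", "unrelated"},
    "s3": {"high_quality"},
}

# Hypothetical calls at one site: (sample, alt-allele count, passes adj filter).
calls = [("s1", 1, True), ("s2", 2, False), ("s3", 1, True)]


def count_ac(calls, sample_groups):
    """Return {group: {"AC": ..., "AC_raw": ...}} for one site."""
    counts = defaultdict(lambda: {"AC": 0, "AC_raw": 0})
    for sample, n_alt, adj in calls:
        for group in sample_groups[sample]:
            # AC_raw counts every alt allele, regardless of genotype quality.
            counts[group]["AC_raw"] += n_alt
            # Assumption: AC only counts genotypes that pass the adj filter.
            if adj:
                counts[group]["AC"] += n_alt
    return dict(counts)


ac_info = count_ac(calls, sample_groups)
print(ac_info)
```

In the real pipeline the equivalent aggregation would run over the VariantDataset, with the high quality, release, and unrelated groups taken from sample metadata.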

gnomad_qc.v5.annotations.generate_variant_qc_annotations.create_info_ht(vcf_path, header_path, lowqual_indel_phred_het_prior=40, vds=None, test=False)

Import a VCF of AoU annotated sites, reformat annotations, and add AS_lowqual.

Parameters:
  • vcf_path (str) – Path to the annotated sites-only VCF.

  • header_path (str) – Path to the header file for the VCF.

  • lowqual_indel_phred_het_prior (int) – Phred-scaled prior for a het genotype at a site with a low-quality indel. The default of 40 corresponds to a prior of 1 in 10,000 bases, chosen to be more consistent with the filtering used by Broad’s Data Sciences Platform for VQSR.

  • vds (VariantDataset) – VariantDataset to use for computing AC and AC_raw annotations.

  • test (bool) – Whether to run a test using just the first two partitions of the loaded VCF.

Return type:

Table

Returns:

Hail Table with reformatted annotations.
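The relationship between the Phred-scaled prior of 40 and the stated rate of 1 in 10,000 bases follows from the standard Phred definition, P = 10^(-Q/10). The helper names below are illustrative and are not part of the module.

```python
import math


def phred_to_probability(phred: float) -> float:
    """Convert a Phred-scaled value Q to a probability, P = 10 ** (-Q / 10)."""
    return 10 ** (-phred / 10)


def probability_to_phred(prob: float) -> float:
    """Convert a probability P to a Phred-scaled value, Q = -10 * log10(P)."""
    return -10 * math.log10(prob)


het_prior = phred_to_probability(40)            # roughly 1e-4, i.e. 1 in 10,000
round_trip = probability_to_phred(het_prior)    # back to roughly 40
print(het_prior, round_trip)
```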

gnomad_qc.v5.annotations.generate_variant_qc_annotations.run_generate_trio_stats(mt, fam_ped)

Generate trio transmission stats from a dense trio MatrixTable and pedigree info.

Parameters:
  • mt (MatrixTable) – Dense trio MatrixTable.

  • fam_ped (Pedigree) – Pedigree containing trio info.

Return type:

Table

Returns:

Table containing trio stats.
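To give a sense of what trio transmission stats capture, the toy sketch below counts transmitted, untransmitted, and de novo alleles at one site in plain Python. It is a simplified illustration under stated assumptions, not the module's logic. It only considers trios where at most one parent is heterozygous and the other is homozygous reference, and the field names n_transmitted, n_untransmitted, and n_de_novo are hypothetical.

```python
def trio_transmission_counts(trios):
    """Count transmission outcomes at one site.

    Each trio is (father, mother, child), given as alt-allele counts 0, 1, or 2.
    Simplifying assumption: only trios with at most one het parent (the other
    parent hom-ref) contribute; all other configurations are ignored.
    """
    stats = {"n_transmitted": 0, "n_untransmitted": 0, "n_de_novo": 0}
    for father, mother, child in trios:
        parent_alt = father + mother
        if parent_alt == 0 and child == 1:
            # Neither parent carries the allele, but the child does.
            stats["n_de_novo"] += 1
        elif parent_alt == 1 and child == 1:
            # The single parental alt allele was passed to the child.
            stats["n_transmitted"] += 1
        elif parent_alt == 1 and child == 0:
            # The single parental alt allele was not passed on.
            stats["n_untransmitted"] += 1
    return stats


# Hypothetical genotypes for four trios at one site.
trios = [(1, 0, 1), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
trio_stats = trio_transmission_counts(trios)
print(trio_stats)
```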

gnomad_qc.v5.annotations.generate_variant_qc_annotations.run_generate_sib_stats(mt, relatedness_ht)

Generate sibling stats from a MatrixTable and relatedness info.

Parameters:
  • mt (MatrixTable) – Input MatrixTable.

  • relatedness_ht (Table) – Table containing relatedness info.

Return type:

Table

Returns:

Table containing sibling stats.

gnomad_qc.v5.annotations.generate_variant_qc_annotations.main(args)

Generate all variant annotations needed for variant QC.