gnomad_qc.v4.variant_qc.import_variant_qc_vcf

Script to load variant QC result VCF into a Hail Table.

usage: gnomad_qc.v4.variant_qc.import_variant_qc_vcf.py [-h] [-o]
                                                        [--slack-channel SLACK_CHANNEL]
                                                        --vcf-path VCF_PATH
                                                        --model-id MODEL_ID
                                                        --compute-info-method
                                                        {AS,quasi,set_long_AS_missing_info}
                                                        --transmitted-singletons
                                                        TRANSMITTED_SINGLETONS
                                                        --sibling-singletons
                                                        SIBLING_SINGLETONS
                                                        --adj ADJ
                                                        --interval-qc-filter
                                                        INTERVAL_QC_FILTER
                                                        --calling-interval-filter
                                                        CALLING_INTERVAL_FILTER
                                                        [--n-partitions N_PARTITIONS]
                                                        [--header-path HEADER_PATH]
                                                        [--array-elements-required]
                                                        [--is-split]
                                                        [--deduplication-check]
                                                        [--snp-features SNP_FEATURES [SNP_FEATURES ...]]
                                                        [--indel-features INDEL_FEATURES [INDEL_FEATURES ...]]

Named Arguments

-o, --overwrite

Whether to overwrite data already present in the output Table.

Default: False

--slack-channel

Slack channel to post results and notifications to.

--vcf-path

Path to the variant QC result VCF. May be specified as a Hadoop glob pattern.

--model-id

Model ID for the variant QC result HT.

--compute-info-method

Possible choices: AS, quasi, set_long_AS_missing_info

Compute info method used to generate the variant QC results.

--transmitted-singletons

Whether transmitted singletons were used in training the model.

--sibling-singletons

Whether sibling singletons were used in training the model.

--adj

Whether adj-filtered singletons were used in training the model.

--interval-qc-filter

Whether only variants in intervals passing interval QC were used in training the model.

--calling-interval-filter

Whether only variants in the intersection of Broad/DSP calling intervals with 50 bp of padding were used for training.

--n-partitions

Desired number of partitions for the output Table.

Default: 5000

--header-path

Optional path to a header file to use for importing the variant QC result VCF.

--array-elements-required

Pass this flag to set array_elements_required to True in hl.import_vcf.

Default: False

--is-split

Whether the VCF is already split.

Default: False

--deduplication-check

Remove duplicate variants. Useful for v4 MVP when reading from potentially overlapping shards.

Default: False

--snp-features

Features used in the SNP VQSR model.

Default: [‘AS_QD’, ‘AS_MQRankSum’, ‘AS_ReadPosRankSum’, ‘AS_FS’, ‘AS_MQ’]

--indel-features

Features used in the indel VQSR model.

Default: [‘AS_QD’, ‘AS_MQRankSum’, ‘AS_ReadPosRankSum’, ‘AS_FS’]
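The options above can be sketched as a plain argparse parser. This is a hypothetical reconstruction from the usage text, not the parser returned by get_script_argument_parser(): the function name build_parser_sketch, the example bucket path, and the example model ID are made up, and treating the five required training options as plain string values is an assumption based only on the metavars shown in the usage line.

```python
import argparse


def build_parser_sketch() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real one)."""
    parser = argparse.ArgumentParser(
        description="Load variant QC result VCF into a Hail Table."
    )
    parser.add_argument("-o", "--overwrite", action="store_true")
    parser.add_argument("--slack-channel")
    parser.add_argument("--vcf-path", required=True)
    parser.add_argument("--model-id", required=True)
    parser.add_argument(
        "--compute-info-method",
        required=True,
        choices=["AS", "quasi", "set_long_AS_missing_info"],
    )
    # Assumption: the usage line shows each of these taking one value; the
    # actual value type is not stated on this page, so they stay strings here.
    for flag in (
        "--transmitted-singletons",
        "--sibling-singletons",
        "--adj",
        "--interval-qc-filter",
        "--calling-interval-filter",
    ):
        parser.add_argument(flag, required=True)
    parser.add_argument("--n-partitions", type=int, default=5000)
    parser.add_argument("--header-path")
    parser.add_argument("--array-elements-required", action="store_true")
    parser.add_argument("--is-split", action="store_true")
    parser.add_argument("--deduplication-check", action="store_true")
    parser.add_argument(
        "--snp-features",
        nargs="+",
        default=["AS_QD", "AS_MQRankSum", "AS_ReadPosRankSum", "AS_FS", "AS_MQ"],
    )
    parser.add_argument(
        "--indel-features",
        nargs="+",
        default=["AS_QD", "AS_MQRankSum", "AS_ReadPosRankSum", "AS_FS"],
    )
    return parser


# Example argument list; the path, model ID, and "true"/"false" values are
# placeholders chosen for illustration.
example_argv = [
    "--vcf-path", "gs://example-bucket/variant_qc/*.vcf.gz",
    "--model-id", "vqsr_example",
    "--compute-info-method", "AS",
    "--transmitted-singletons", "true",
    "--sibling-singletons", "false",
    "--adj", "true",
    "--interval-qc-filter", "false",
    "--calling-interval-filter", "true",
]

args = build_parser_sketch().parse_args(example_argv)
print(args.model_id, args.n_partitions, args.snp_features)
```

Omitting any of the required options makes argparse exit with a usage error, which matches the unbracketed options in the usage text.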

Module Functions

gnomad_qc.v4.variant_qc.import_variant_qc_vcf.import_variant_qc_vcf(...)

Import variant QC result site VCF into a HT.

gnomad_qc.v4.variant_qc.import_variant_qc_vcf.main(args)

Load variant QC result VCF into a Hail Table.

gnomad_qc.v4.variant_qc.import_variant_qc_vcf.get_script_argument_parser()

Get script argument parser.


gnomad_qc.v4.variant_qc.import_variant_qc_vcf.import_variant_qc_vcf(vcf_path, model_id, num_partitions=5000, import_header_path=None, array_elements_required=False, is_split=False, deduplicate_check=False)

Import variant QC result site VCF into a HT.

Parameters:
  • vcf_path (str) – Path to the input variant QC result site VCF. May be specified as a Hadoop glob pattern.

  • model_id (str) – Model ID for the variant QC results. Must start with ‘rf_’, ‘vqsr_’, or ‘if_’.

  • num_partitions (int) – Number of partitions to use for the output HT.

  • import_header_path (Optional[str]) – Optional path to a header file to use for import.

  • array_elements_required (bool) – Value of array_elements_required to pass to hl.import_vcf.

  • is_split (bool) – Whether the VCF is already split.

  • deduplicate_check (bool) – Check for and remove duplicate variants.

Return type:

Union[Table, Tuple[Table, Table]]

Returns:

HT containing variant QC results.
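Two of the parameters above, model_id and deduplicate_check, can be illustrated without Hail. The following is a minimal sketch under stated assumptions: check_model_id and drop_duplicate_variants are hypothetical helpers, not functions from gnomad_qc, and variants are represented as plain (contig, position, alleles) tuples instead of rows of a Hail Table. How the real function validates the model ID or removes duplicates is not described on this page.

```python
from typing import Iterable, List, Tuple

# Prefixes listed in the model_id parameter description.
VALID_MODEL_ID_PREFIXES = ("rf_", "vqsr_", "if_")

# Assumed stand-in for a variant key: (contig, position, alleles).
Variant = Tuple[str, int, Tuple[str, ...]]


def check_model_id(model_id: str) -> str:
    """Return model_id unchanged, or raise if it lacks a documented prefix."""
    if not model_id.startswith(VALID_MODEL_ID_PREFIXES):
        raise ValueError(
            f"Model ID {model_id!r} must start with one of {VALID_MODEL_ID_PREFIXES}"
        )
    return model_id


def drop_duplicate_variants(variants: Iterable[Variant]) -> List[Variant]:
    """Keep the first occurrence of each variant key, preserving input order."""
    seen = set()
    unique: List[Variant] = []
    for variant in variants:
        if variant not in seen:
            seen.add(variant)
            unique.append(variant)
    return unique


# Example rows as they might appear when overlapping shards repeat a site.
rows: List[Variant] = [
    ("chr1", 100, ("A", "T")),
    ("chr1", 100, ("A", "T")),  # duplicate of the first row
    ("chr1", 200, ("G", "C")),
]
unique_rows = drop_duplicate_variants(rows)
print(check_model_id("vqsr_example"), len(rows), len(unique_rows))
```

On a keyed Hail Table, Table.distinct() provides comparable per-key deduplication; whether import_variant_qc_vcf uses it is not stated here.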

gnomad_qc.v4.variant_qc.import_variant_qc_vcf.main(args)

Load variant QC result VCF into a Hail Table.