FilterGenotypes

WDL source code

Filters genotypes using the GQ model with recalibrated quality scores. The output VCF contains the HIGH_NCR filter status, which is assigned to variants whose proportion of no-call genotypes exceeds a threshold (see no_call_rate_cutoff). This status is also applied to variants with genotypes that were already filtered in the input VCF.

The following diagram illustrates the recommended invocation order:

QC recommendations

We strongly recommend performing call set QC after this module. By default, QC plotting is enabled with the run_qc argument. Users should carefully inspect the main plots from the main_vcf_qc_tarball. Please see the MainVcfQc module documentation for more information on interpreting these plots and recommended QC criteria.

Inputs

vcf

Input VCF with recalibrated scores generated from ScoreGenotypes.

Optional output_prefix

Default: the input VCF filename. Prefix for the output VCF, such as the cohort name. May contain only alphanumeric characters and underscores.

ploidy_table

Table of sample ploidies generated in JoinRawCalls.

sl_cutoff_table

Argument for the SL filtering script that sets the scaled logit (SL) cutoffs used for filtering. Overridden by optimized_sl_cutoff_table.

Optional optimized_sl_cutoff_table

Output of the SL optimization script. It can be used to set SL cutoffs for filtering in a more truth-aware manner. Overrides sl_cutoff_table if provided.

Optional no_call_rate_cutoff

Default: 0.05. Fraction of samples with no-call genotypes above which a variant is filtered as HIGH_NCR. Set to 1 to disable.
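
To make the cutoff concrete, the following is a minimal Python sketch of how a no-call rate could be compared against no_call_rate_cutoff. It is an illustration only, not the module's implementation: the function names are hypothetical, genotypes are assumed to be VCF-style GT strings, and a genotype is assumed to count as a no-call when any allele is missing. The strict comparison follows the wording above (variants "exceeding" the threshold), which is also why a cutoff of 1 disables the filter.

```python
def no_call_rate(genotypes):
    """Return the fraction of genotypes that are no-calls.

    Assumption: a genotype is a VCF-style GT string such as '0/1' or './.',
    and it counts as a no-call if any allele is missing ('.').
    """
    if not genotypes:
        return 0.0
    no_calls = sum(
        1 for gt in genotypes if "." in gt.replace("|", "/").split("/")
    )
    return no_calls / len(genotypes)


def is_high_ncr(genotypes, no_call_rate_cutoff=0.05):
    """Hypothetical check: flag a variant whose no-call rate exceeds the cutoff."""
    return no_call_rate(genotypes) > no_call_rate_cutoff


# 2 of 20 samples are no-calls, so the rate is 0.1
example_genotypes = ["0/1"] * 18 + ["./."] * 2
flagged_default = is_high_ncr(example_genotypes)         # rate above 0.05
flagged_disabled = is_high_ncr(example_genotypes, 1.0)   # rate can never exceed 1
```

With the default cutoff the example variant would be flagged, while a cutoff of 1 leaves it unflagged regardless of how many genotypes are missing.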

Optional sl_filter_args

Arguments for the SL filtering script.

Optional run_qc

Default: true. Enable running MainVcfQc automatically. By default, filtered variants will be excluded from the plots.

Optional filter_vcf_records_per_shard

Default: 20000. Shard size for scattered GQ recalibration tasks. Decrease this if those steps are running slowly.
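
As a rough illustration of how the inputs above fit together, the Python sketch below assembles an inputs JSON for the workflow. Several things here are assumptions rather than documented facts: the `FilterGenotypes.` key prefix assumes the usual WDL convention of prefixing inputs with the workflow name, all file paths are hypothetical placeholders, and the optional values shown simply repeat the documented defaults. Consult the WDL source code for the authoritative input names and types.

```python
import json

# All paths below are hypothetical placeholders, not real files.
inputs = {
    # Required inputs
    "FilterGenotypes.vcf": "cohort.scored.vcf.gz",
    "FilterGenotypes.ploidy_table": "cohort.ploidy.tsv",
    "FilterGenotypes.sl_cutoff_table": "sl_cutoffs.tsv",
    # Optional inputs, set here to their documented defaults
    "FilterGenotypes.output_prefix": "my_cohort",
    "FilterGenotypes.no_call_rate_cutoff": 0.05,
    "FilterGenotypes.run_qc": True,
    "FilterGenotypes.filter_vcf_records_per_shard": 20000,
}

# Serialize to the JSON text that a WDL runner would typically consume.
inputs_json = json.dumps(inputs, indent=2)
print(inputs_json)
```

Only sl_cutoff_table is set in this sketch. If optimized_sl_cutoff_table were also supplied, it would take precedence, as described above.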

Outputs

filtered_vcf

Filtered VCF.

Optional main_vcf_qc_tarball

QC plots generated with MainVcfQc. Only generated if run_qc is enabled.