GenotypeComplexVariants
Genotypes, filters, and classifies putative complex variants using depth evidence.
The following diagram illustrates the recommended invocation order:
Inputs
Some batch-level inputs must be provided in matching order. Specifically, the order of the batches array should match that of depth_vcfs, bincov_files, depth_gt_rd_sep_files, and median_coverage_files.
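The ordering requirement above can be checked before launching the workflow. The following is a minimal sketch, not part of the pipeline: the function name, the example paths, and the heuristic that each file name contains its batch identifier are all assumptions made for illustration.

```python
from typing import Dict, List

# Inputs that must follow the order of `batches`, per the note above.
BATCH_ALIGNED_INPUTS = [
    "depth_vcfs",
    "bincov_files",
    "depth_gt_rd_sep_files",
    "median_coverage_files",
]


def check_batch_alignment(inputs: Dict[str, List[str]]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    batches = inputs["batches"]
    for key in BATCH_ALIGNED_INPUTS:
        files = inputs.get(key, [])
        if len(files) != len(batches):
            problems.append(
                f"{key}: expected {len(batches)} entries, found {len(files)}"
            )
            continue
        # Heuristic (assumption): each file name contains its batch identifier.
        for position, (batch, path) in enumerate(zip(batches, files)):
            file_name = path.rsplit("/", 1)[-1]
            if batch not in file_name:
                problems.append(
                    f"{key}[{position}]: '{path}' does not mention batch '{batch}'"
                )
    return problems


# Hypothetical example values, for illustration only.
example_inputs = {
    "batches": ["batch_A", "batch_B"],
    "depth_vcfs": [
        "gs://bucket/batch_A.depth.vcf.gz",
        "gs://bucket/batch_B.depth.vcf.gz",
    ],
    # Deliberately swapped to show what the check reports.
    "bincov_files": [
        "gs://bucket/batch_B.rd.txt.gz",
        "gs://bucket/batch_A.rd.txt.gz",
    ],
    "depth_gt_rd_sep_files": [
        "gs://bucket/batch_A.depth_depth.txt",
        "gs://bucket/batch_B.depth_depth.txt",
    ],
    "median_coverage_files": [
        "gs://bucket/batch_A.medianCov.txt",
        "gs://bucket/batch_B.medianCov.txt",
    ],
}

for problem in check_batch_alignment(example_inputs):
    print(problem)
```

If your file names do not embed the batch identifier, the name-based heuristic will not apply, and only the length comparison is meaningful.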
cohort_name
Cohort name. The guidelines outlined in the sample ID requirements section apply here.
batches
Array of batch identifiers. Each should match the batch name used in GatherBatchEvidence.
ped_file
Family structures and sex assignments determined in EvidenceQC. See PED file format.
depth_vcfs
Array of re-genotyped depth caller variants for all batches, generated in RegenotypeCNVs. Must match order of batches.
Optional merge_vcfs
Default: false. If true, merge contig-sharded VCFs into one genome-wide VCF. This may be used for convenience, but cannot be used with downstream workflows.
Optional localize_shard_size
Default: 50000. Shard size for parallel computations. Decreasing this parameter may help reduce run time.
complex_resolve_vcfs
Array of contig-sharded VCFs containing putative complex variants, generated in ResolveComplexVariants.
bincov_files
Array of RD evidence files for all batches from GatherBatchEvidence. Must match order of batches.
depth_gt_rd_sep_files
Array of "depth_depth" genotype cutoff files (depth evidence for depth-based calls) generated in GenotypeBatch. Order must match that of batches.
median_coverage_files
Array of median coverage tables for all batches from GatherBatchEvidence. Order must match that of batches.
Optional use_hail
Default: false. Use Hail for VCF concatenation. This should only be used for projects with over 50k samples. If enabled, gcs_project must also be provided. Does not work on Terra.
Optional gcs_project
Google Cloud project ID. Required only if enabling use_hail.
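The dependency between use_hail and gcs_project can be expressed as a simple pre-flight check. This is a hedged sketch only: the function name is hypothetical, and it assumes the inputs have been loaded into a plain dictionary keyed by the parameter names listed above (without any workflow-name prefix).

```python
from typing import Any, Dict, List


def check_hail_settings(inputs: Dict[str, Any]) -> List[str]:
    """Return problems with the Hail-related optional inputs, if any."""
    problems = []
    use_hail = inputs.get("use_hail", False)  # documented default: false
    gcs_project = inputs.get("gcs_project")
    # gcs_project is required only when use_hail is enabled.
    if use_hail and not gcs_project:
        problems.append("use_hail is true but gcs_project is not set")
    return problems


# Hypothetical example values, for illustration only.
print(check_hail_settings({"use_hail": True}))
print(check_hail_settings({"use_hail": True, "gcs_project": "my-project"}))
```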
Outputs
complex_genotype_vcfs
Array of contig-sharded VCFs containing fully resolved and genotyped complex variants.
complex_genotype_merged_vcf
Genome-wide output VCF. Only generated if merge_vcfs is true.