ResolveComplexVariants
Identifies multi-breakpoint complex variants, which are annotated with the CPX value in the SVTYPE field. These variants are putative, as read depth evidence is not assessed at this stage.
The following diagram illustrates the recommended invocation order:
Inputs
Some per-batch inputs must be supplied in matching order. Specifically, the order of the disc_files array should match that of rf_cutoff_files.
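One way to satisfy the ordering requirement above is to derive both arrays from a single list of per-batch records, so that index i always refers to the same batch. The following is a minimal sketch, not part of the workflow itself. The `BatchEvidence` record, the `build_ordered_inputs` helper, and the `gs://example-bucket/...` paths are hypothetical. The key for the cutoff array is a parameter because this page refers to that input as both rf_cutoff_files and rf_cutoffs; check the workflow definition for the exact name.

```python
from typing import Dict, List, NamedTuple


class BatchEvidence(NamedTuple):
    """Hypothetical per-batch record; the field names are illustrative only."""
    batch: str
    disc_file: str  # PE evidence file from GatherBatchEvidence
    rf_cutoff: str  # genotyping cutoff file from FilterBatch


def build_ordered_inputs(
    batches: List[BatchEvidence],
    cutoff_key: str = "rf_cutoff_files",  # assumption: confirm against the WDL
    workflow: str = "ResolveComplexVariants",
) -> Dict[str, List[str]]:
    """Build both arrays from one list so their order cannot drift apart."""
    names = [b.batch for b in batches]
    if len(set(names)) != len(names):
        raise ValueError("duplicate batch name in input records")
    return {
        f"{workflow}.disc_files": [b.disc_file for b in batches],
        f"{workflow}.{cutoff_key}": [b.rf_cutoff for b in batches],
    }


# Placeholder paths; substitute the real per-batch outputs.
example = build_ordered_inputs([
    BatchEvidence("batch1", "gs://example-bucket/batch1.pe.txt.gz",
                  "gs://example-bucket/batch1.cutoffs"),
    BatchEvidence("batch2", "gs://example-bucket/batch2.pe.txt.gz",
                  "gs://example-bucket/batch2.cutoffs"),
])
```

The resulting dictionary can be merged into the rest of the inputs and serialized with `json.dumps`.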
cohort_name
Cohort name. The guidelines outlined in the sample ID requirements section apply here.
Optional merge_vcfs
Default: false. If true, merge contig-sharded VCFs into one genome-wide VCF. This may be used for convenience, but the merged VCF cannot be used with downstream workflows.
cluster_vcfs
Array of contig-sharded VCFs, generated in CombineBatches.
cluster_bothside_pass_lists
Array of variant lists with bothside SR support for all batches, generated in CombineBatches.
cluster_background_fail_lists
Array of variant lists with low SR signal-to-noise ratio for all batches, generated in CombineBatches.
disc_files
Array of PE evidence files for all batches from GatherBatchEvidence.
rf_cutoffs
Array of batch genotyping cutoff files trained with the random forest filtering model from FilterBatch. Must match the order of disc_files.
Optional use_hail
Default: false. Use Hail for VCF concatenation. This should only be used for projects with over 50k samples. If enabled, gcs_project must also be provided. Does not work on Terra.
Optional gcs_project
Google Cloud project ID. Required only if enabling use_hail.
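The dependency between use_hail and gcs_project can be checked before submitting a run. The sketch below assumes the inputs are held in a plain dictionary keyed by the unprefixed input names used on this page; `check_hail_settings` is a hypothetical helper, not something the workflow provides.

```python
from typing import Any, Dict, List


def check_hail_settings(inputs: Dict[str, Any]) -> List[str]:
    """Return a list of problems with the use_hail / gcs_project pair."""
    problems: List[str] = []
    use_hail = inputs.get("use_hail", False)  # documented default is false
    if use_hail and not inputs.get("gcs_project"):
        problems.append("use_hail is true but gcs_project is not set")
    return problems
```

An empty list means the pair is consistent; leaving gcs_project unset is only a problem when use_hail is enabled.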
Outputs
complex_resolve_vcfs
Array of contig-sharded VCFs containing putative complex variants.
complex_resolve_bothside_pass_list
Array of contig-sharded bothside SR support variant lists.
complex_resolve_background_fail_list
Array of contig-sharded high SR background variant lists.
breakpoint_overlap_dropped_record_vcfs
Variants dropped due to exact overlap with another variant's breakpoint.
complex_resolve_merged_vcf
Genome-wide output VCF. Only generated if merge_vcfs is true.