RefineComplexVariants
Refines complex SVs and translocations, filtering them based on a reassessment of discordant read pair and read depth evidence.
The following diagram illustrates the recommended invocation order:
Inputs
All array inputs of batch data must match in order. For example, the order of the batch_name_list array should match that of batch_sample_lists, PE_metrics, etc.
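As a rough pre-flight check for the ordering requirement above, the following sketch verifies that the per-batch arrays have the same length as batch_name_list. It assumes the inputs have already been loaded into a plain dictionary keyed by the bare input names used in this section; the function name check_batch_arrays is hypothetical and not part of the workflow. A length check cannot prove that the entries are in the same batch order, so it only catches the most obvious mismatches.

```python
from typing import Any, Dict, List

# Array inputs named in this section that must be parallel to batch_name_list.
BATCH_ARRAYS = ["batch_sample_lists", "PE_metrics", "Depth_DEL_beds", "Depth_DUP_beds"]


def check_batch_arrays(inputs: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the lengths are consistent.

    Hypothetical helper: it checks lengths only and cannot confirm that
    entry i of each array really belongs to batch i.
    """
    problems: List[str] = []
    names = inputs.get("batch_name_list", [])
    if len(set(names)) != len(names):
        problems.append("batch_name_list contains duplicate batch names")
    for key in BATCH_ARRAYS:
        values = inputs.get(key)
        if values is None:
            problems.append(f"{key} is missing")
        elif len(values) != len(names):
            problems.append(f"{key} has {len(values)} entries, expected {len(names)}")
    return problems
```

An empty return value means every array has one entry per batch; confirming that the entries follow the same batch order still has to be done by inspection.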
vcf
Input VCF, generated in CleanVcf.
prefix
Prefix for output VCF, such as the cohort name. May be alphanumeric with underscores.
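The naming rule for prefix can be checked before launching a run. This is a minimal sketch that reads "alphanumeric with underscores" as ASCII letters, digits, and underscores only; that reading and the helper name is_valid_prefix are assumptions, not something the workflow itself enforces in this form.

```python
import re

# Assumed interpretation: one or more ASCII letters, digits, or underscores.
PREFIX_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if the prefix contains only letters, digits, and underscores."""
    return PREFIX_PATTERN.fullmatch(prefix) is not None
```

Under this interpretation, a value such as cohort_2024 passes, while one containing a hyphen or a space does not.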
batch_name_list
Array of batch names. These should be the same batch names used in GatherBatchEvidence.
batch_sample_lists
Array of sample ID lists for all batches, generated in FilterBatch. Order must match batch_name_list.
PE_metrics
Array of PE metrics files for all batches, generated in GatherBatchEvidence. Order must match batch_name_list.
Depth_DEL_beds, Depth_DUP_beds
Arrays of raw DEL and DUP depth calls for all batches, generated in GatherBatchEvidence. Order must match batch_name_list.
n_per_split
Shard size for parallel computations. Decreasing this parameter may help reduce run time.
Optional min_pe_cpx
Default: 3. Minimum PE read count for complex variants (CPX).
Optional min_pe_ctx
Default: 3. Minimum PE read count for translocations (CTX).
Optional use_hail
Default: false. Use Hail for VCF concatenation. This should only be used for projects with over 50k samples. If enabled, gcs_project must also be provided. Does not work on Terra.
Optional gcs_project
Google Cloud project ID. Required only if enabling use_hail.
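The dependency between use_hail and gcs_project can be caught before submission. The sketch below assumes the inputs are available as a dictionary keyed by the bare input names from this section; check_hail_settings is a hypothetical helper, not part of the workflow.

```python
from typing import Any, Dict, Optional


def check_hail_settings(inputs: Dict[str, Any]) -> Optional[str]:
    """Return an error message if use_hail is enabled without gcs_project, else None."""
    use_hail = inputs.get("use_hail", False)  # documented default is false
    if use_hail and not inputs.get("gcs_project"):
        return "use_hail is enabled but gcs_project is not set"
    return None
```

When use_hail is left at its default, gcs_project is not needed and the check returns None.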
Outputs
cpx_refined_vcf
Output VCF.
cpx_evidences
Supplementary output table of complex variant evidence.