
GatherBatchEvidence

WDL source code

Runs CNV callers (cn.MOPS, GATK-gCNV) and combines single-sample raw evidence into batched files.

The following diagram illustrates the recommended invocation order:

Inputs

info

All array inputs containing per-sample data must be in the same sample order. For example, the order of the samples array must match that of counts, PE_files, etc.
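The ordering requirement can be checked before submitting the workflow. The following is a minimal sketch and not part of the workflow itself: check_array_alignment is a hypothetical helper, the bucket paths are placeholders, and the rule that each file name begins with its sample ID followed by a dot is an assumption made for illustration. Adapt the matching rule to your own file naming.

```python
def check_array_alignment(samples, **arrays):
    """Return a list of problems; an empty list means no mismatch was detected.

    Assumption (not from the workflow documentation): each file's base name
    starts with "<sample_id>.", so ordering can be verified from file names.
    """
    problems = []
    for name, paths in arrays.items():
        # Every per-sample array must have exactly one entry per sample.
        if len(paths) != len(samples):
            problems.append(f"{name}: expected {len(samples)} entries, got {len(paths)}")
            continue
        for index, (sample, path) in enumerate(zip(samples, paths)):
            base = path.rsplit("/", 1)[-1]
            if not base.startswith(sample + "."):
                problems.append(f"{name}[{index}]: {base} does not match sample {sample}")
    return problems


# Hypothetical sample IDs and paths, for illustration only.
samples = ["sample_A", "sample_B"]
counts = ["gs://bucket/sample_A.rd.txt.gz", "gs://bucket/sample_B.rd.txt.gz"]
pe_files = ["gs://bucket/sample_B.pe.txt.gz", "gs://bucket/sample_A.pe.txt.gz"]  # swapped

aligned = check_array_alignment(samples, counts=counts)
misaligned = check_array_alignment(samples, PE_files=pe_files)
```

Here aligned is empty, while misaligned reports both swapped PE_files entries.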

batch

An identifier for the batch; may contain only alphanumeric characters and underscores.
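A batch identifier can be validated with a simple pattern check. This is a sketch only: is_valid_batch_id is a hypothetical helper, and it assumes that "alphanumeric" means ASCII letters and digits and that an empty identifier is not allowed.

```python
import re

# Assumption: ASCII letters, digits, and underscores only; at least one character.
BATCH_ID_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_batch_id(batch):
    """Return True if the whole string consists of allowed characters."""
    return BATCH_ID_PATTERN.fullmatch(batch) is not None
```

For example, is_valid_batch_id("batch_01") passes, while "batch-01" is rejected because of the hyphen.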

samples

Sample IDs. Must match the sample IDs used in GatherSampleEvidence unless rename_samples is enabled, in which case sample IDs will be overwritten. See sample ID requirements for specifications of allowable sample IDs.

ped_file

Family structures and sex assignments determined in EvidenceQC. See PED file format.

counts

Binned read count files (*.rd.txt.gz) generated in GatherSampleEvidence.

PE_files

Discordant pair evidence files (*.pe.txt.gz) generated in GatherSampleEvidence.

SR_files

Split read evidence files (*.sr.txt.gz) generated in GatherSampleEvidence.

SD_files

Site depth files (*.sd.txt.gz) generated in GatherSampleEvidence.

*_vcfs

Raw caller VCFs generated in GatherSampleEvidence. Callers may be omitted if they were not run.

run_matrix_qc

Enables running QC tasks.

contig_ploidy_model_tar

Contig ploidy model tarball generated in TrainGCNV.

gcnv_model_tars

CNV model tarballs generated in TrainGCNV.

Optional rename_samples

Default: false. Overwrites sample IDs with the values in the samples input.

Optional run_ploidy

Default: false. Runs ploidy estimation. Note that this calls the same method used in EvidenceQC.
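The inputs above can be gathered into a single inputs file. The sketch below assumes the Cromwell-style convention of keys written as the workflow name, a dot, and the input name. build_inputs is a hypothetical helper, every path and sample ID is a placeholder, and manta is used only as an example caller name for the *_vcfs pattern. Check the WDL source for the exact input names and types before relying on it.

```python
import json

WORKFLOW = "GatherBatchEvidence"


def build_inputs(required, caller_vcfs=None, optional=None):
    """Prefix each input name with the workflow name (assumed Cromwell-style keys)."""
    inputs = {f"{WORKFLOW}.{name}": value for name, value in required.items()}
    # Callers that were not run are simply left out of caller_vcfs.
    for caller, vcfs in (caller_vcfs or {}).items():
        inputs[f"{WORKFLOW}.{caller}_vcfs"] = vcfs
    for name, value in (optional or {}).items():
        inputs[f"{WORKFLOW}.{name}"] = value
    return inputs


# All values below are placeholders for illustration.
inputs = build_inputs(
    required={
        "batch": "batch_01",
        "samples": ["sample_A", "sample_B"],
        "ped_file": "gs://bucket/batch_01.ped",
        "counts": ["gs://bucket/sample_A.rd.txt.gz", "gs://bucket/sample_B.rd.txt.gz"],
        "PE_files": ["gs://bucket/sample_A.pe.txt.gz", "gs://bucket/sample_B.pe.txt.gz"],
        "SR_files": ["gs://bucket/sample_A.sr.txt.gz", "gs://bucket/sample_B.sr.txt.gz"],
        "SD_files": ["gs://bucket/sample_A.sd.txt.gz", "gs://bucket/sample_B.sd.txt.gz"],
        "run_matrix_qc": True,
        "contig_ploidy_model_tar": "gs://bucket/contig_ploidy_model.tar.gz",
        "gcnv_model_tars": ["gs://bucket/gcnv_model_shard_0.tar.gz"],
    },
    caller_vcfs={"manta": ["gs://bucket/sample_A.manta.vcf.gz", "gs://bucket/sample_B.manta.vcf.gz"]},
    optional={"run_ploidy": True},
)

inputs_json = json.dumps(inputs, indent=2)
```

Keeping the caller VCFs in a separate mapping makes it easy to omit a caller that was not run without touching the other inputs.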

Outputs

merged_BAF

Batch B-allele frequencies file (.baf.txt.gz) derived from site depth evidence.

merged_SR

Batch split read evidence file (.sr.txt.gz).

merged_PE

Batch paired-end evidence file (.pe.txt.gz).

merged_bincov

Batch binned read counts file (.rd.txt.gz).

merged_dels, merged_dups

Batch CNV calls (.bed.gz).

median_cov

Median coverage table.

std_*_vcf_tar

Tarballs containing per-sample raw caller VCFs in standardized formats. A tarball is omitted for any caller not provided in the inputs.

Optional batch_ploidy_*

Ploidy analysis files. Enabled with run_ploidy.

Optional *_stats, Matrix_QC_plot

QC files. Enabled with run_matrix_qc.

Optional manta_tloc

Supplemental evidence for translocation variants. These records are hard-filtered from the main call set but may be of interest to users investigating reciprocal translocations and other complex events.