SRWholeGenome

author
Jonn Smith
description
This workflow performs single-sample variant calling on Illumina reads from one or more flow cells containing replicates of the same sample. The workflow merges the aligned BAMs from all replicates into a single BAM prior to variant calling.

Inputs

Required

  • aligned_bais (Array[File], required): Array of aligned bam indices to process. Order must correspond to aligned_bams.
  • aligned_bams (Array[File], required): Array of aligned bam files to process.
  • indel_is_calibration (Array[Boolean], required): Array of booleans indicating which files in indel_known_reference_variants should be used as calibration sets. True -> calibration set. False -> NOT a calibration set.
  • indel_is_training (Array[Boolean], required): Array of booleans indicating which files in indel_known_reference_variants should be used as training sets. True -> training set. False -> NOT a training set.
  • indel_known_reference_variants (Array[File], required): Array of VCF files to use as input reference variants for INDELs. Each can be designated as either calibration or training using indel_is_training and indel_is_calibration.
  • indel_known_reference_variants_identifier (Array[File], required): Array of names to give to the VCF files given in indel_known_reference_variants. Order should correspond to that in indel_known_reference_variants.
  • indel_known_reference_variants_index (Array[File], required): Array of VCF index files for indel_known_reference_variants. Order should correspond to that in indel_known_reference_variants.
  • participant_name (String, required): The unique identifier of this sample being processed.
  • ref_map_file (File, required): Table indicating reference sequence, auxiliary file locations, and metadata.
  • snp_is_calibration (Array[Boolean], required): Array of booleans indicating which files in snp_known_reference_variants should be used as calibration sets. True -> calibration set. False -> NOT a calibration set.
  • snp_is_training (Array[Boolean], required): Array of booleans indicating which files in snp_known_reference_variants should be used as training sets. True -> training set. False -> NOT a training set.
  • snp_known_reference_variants (Array[File], required): Array of VCF files to use as input reference variants for SNPs. Each can be designated as either calibration or training using snp_is_training and snp_is_calibration.
  • snp_known_reference_variants_identifier (Array[File], required): Array of names to give to the VCF files given in snp_known_reference_variants. Order should correspond to that in snp_known_reference_variants.
  • snp_known_reference_variants_index (Array[File], required): Array of VCF index files for snp_known_reference_variants. Order should correspond to that in snp_known_reference_variants.
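
Several of the required inputs are parallel arrays whose order must correspond (for example aligned_bams with aligned_bais, and each *_known_reference_variants array with its index, identifier, and flag arrays). The following is a minimal sketch, not part of the workflow itself, of checking those lengths before writing an inputs JSON. It assumes the common convention of prefixing each input name with the workflow name; the helper names, the dict layout for `snp` and `indel`, and every path shown are hypothetical.

```python
import json

WORKFLOW = "SRWholeGenome"  # prefix for fully qualified input keys (assumed convention)


def check_same_length(arrays):
    """Raise ValueError unless all named arrays have the same length."""
    lengths = {name: len(values) for name, values in arrays.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"parallel arrays differ in length: {lengths}")


def build_required_inputs(participant_name, ref_map_file, bams, bais, snp, indel):
    """Build the required inputs as a dict keyed by '<workflow>.<input>'.

    `snp` and `indel` are dicts with keys 'vcfs', 'indices', 'identifiers',
    'is_training' and 'is_calibration' (a layout invented for this sketch).
    """
    check_same_length({"aligned_bams": bams, "aligned_bais": bais})
    inputs = {
        f"{WORKFLOW}.participant_name": participant_name,
        f"{WORKFLOW}.ref_map_file": ref_map_file,
        f"{WORKFLOW}.aligned_bams": list(bams),
        f"{WORKFLOW}.aligned_bais": list(bais),
    }
    for kind, group in (("snp", snp), ("indel", indel)):
        check_same_length(group)
        base = f"{WORKFLOW}.{kind}_known_reference_variants"
        inputs[base] = list(group["vcfs"])
        inputs[f"{base}_index"] = list(group["indices"])
        inputs[f"{base}_identifier"] = list(group["identifiers"])
        inputs[f"{WORKFLOW}.{kind}_is_training"] = list(group["is_training"])
        inputs[f"{WORKFLOW}.{kind}_is_calibration"] = list(group["is_calibration"])
    return inputs


# Hypothetical example values; none of these paths come from the workflow.
known = {
    "vcfs": ["gs://example-bucket/known_a.vcf.gz", "gs://example-bucket/known_b.vcf.gz"],
    "indices": ["gs://example-bucket/known_a.vcf.gz.tbi", "gs://example-bucket/known_b.vcf.gz.tbi"],
    "identifiers": ["known_a", "known_b"],
    "is_training": [True, False],      # first file is a training set
    "is_calibration": [False, True],   # second file is a calibration set
}
example_inputs = build_required_inputs(
    participant_name="sample01",
    ref_map_file="gs://example-bucket/ref_map.tsv",
    bams=["gs://example-bucket/rep1.bam", "gs://example-bucket/rep2.bam"],
    bais=["gs://example-bucket/rep1.bam.bai", "gs://example-bucket/rep2.bam.bai"],
    snp=known,
    indel=known,
)
print(json.dumps(example_inputs, indent=2, sort_keys=True))
```

The same reference files are reused for SNPs and INDELs here only to keep the example short; in practice the two groups may differ.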

Optional

  • bed_to_compute_coverage (File?): Bed file to use as regions over which to measure coverage.
  • fingerprint_haploytpe_db_file (File?): Haplotype DB file from which to fingerprint the input data.
  • gcs_out_root_dir (String?): GCS Bucket into which to finalize outputs. If no bucket is given, outputs will not be finalized and instead will remain in their native execution location.
  • interval_list (File?)
  • CallVariantsWithHaplotypeCaller.haplotype_caller_runtime_attr_override (RuntimeAttr?)
  • ComputeBamStats.qual_threshold (Int?)
  • ComputeBamStats.runtime_attr_override (RuntimeAttr?)
  • ComputeGenomeLength.runtime_attr_override (RuntimeAttr?)
  • ExtractIndelVariantAnnotations.runtime_attr_override (RuntimeAttr?)
  • ExtractSnpVariantAnnotations.runtime_attr_override (RuntimeAttr?)
  • FastQC.runtime_attr_override (RuntimeAttr?)
  • FinalizeBai.runtime_attr_override (RuntimeAttr?)
  • FinalizeBam.runtime_attr_override (RuntimeAttr?)
  • FinalizeFastQCReport.keyfile (File?)
  • FinalizeFastQCReport.name (String?)
  • FinalizeFastQCReport.runtime_attr_override (RuntimeAttr?)
  • FinalizeFingerprintVcf.name (String?)
  • FinalizeFingerprintVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeHCBaiOut.name (String?)
  • FinalizeHCBaiOut.runtime_attr_override (RuntimeAttr?)
  • FinalizeHCBamOut.name (String?)
  • FinalizeHCBamOut.runtime_attr_override (RuntimeAttr?)
  • FinalizeHCGTbi.name (String?)
  • FinalizeHCGTbi.runtime_attr_override (RuntimeAttr?)
  • FinalizeHCGVcf.name (String?)
  • FinalizeHCGVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeHCRescoredTbi.name (String?)
  • FinalizeHCRescoredTbi.runtime_attr_override (RuntimeAttr?)
  • FinalizeHCRescoredVcf.name (String?)
  • FinalizeHCRescoredVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelExtractedAnnotations.name (String?)
  • FinalizeIndelExtractedAnnotations.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelExtractedSitesOnlyVcf.name (String?)
  • FinalizeIndelExtractedSitesOnlyVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelExtractedSitesOnlyVcfIndex.name (String?)
  • FinalizeIndelExtractedSitesOnlyVcfIndex.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelExtractedUnlabeledAnnotations.name (String?)
  • FinalizeIndelExtractedUnlabeledAnnotations.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelTrainVariantAnnotationsCalibrationSetScores.name (String?)
  • FinalizeIndelTrainVariantAnnotationsCalibrationSetScores.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelTrainVariantAnnotationsNegativeModelScorer.name (String?)
  • FinalizeIndelTrainVariantAnnotationsNegativeModelScorer.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelTrainVariantAnnotationsPositiveModelScorer.name (String?)
  • FinalizeIndelTrainVariantAnnotationsPositiveModelScorer.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelTrainVariantAnnotationsTrainingScores.name (String?)
  • FinalizeIndelTrainVariantAnnotationsTrainingScores.runtime_attr_override (RuntimeAttr?)
  • FinalizeIndelTrainVariantAnnotationsUnlabeledPositiveModelScores.name (String?)
  • FinalizeIndelTrainVariantAnnotationsUnlabeledPositiveModelScores.runtime_attr_override (RuntimeAttr?)
  • FinalizeRegionalCoverage.keyfile (File?)
  • FinalizeRegionalCoverage.name (String?)
  • FinalizeRegionalCoverage.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreIndelVariantAnnotationsAnnotationsHdf5.name (String?)
  • FinalizeScoreIndelVariantAnnotationsAnnotationsHdf5.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreIndelVariantAnnotationsScoredVcf.name (String?)
  • FinalizeScoreIndelVariantAnnotationsScoredVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreIndelVariantAnnotationsScoredVcfIndex.name (String?)
  • FinalizeScoreIndelVariantAnnotationsScoredVcfIndex.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreIndelVariantAnnotationsScoresHdf5.name (String?)
  • FinalizeScoreIndelVariantAnnotationsScoresHdf5.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreSnpVariantAnnotationsAnnotationsHdf5.name (String?)
  • FinalizeScoreSnpVariantAnnotationsAnnotationsHdf5.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreSnpVariantAnnotationsScoredVcf.name (String?)
  • FinalizeScoreSnpVariantAnnotationsScoredVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreSnpVariantAnnotationsScoredVcfIndex.name (String?)
  • FinalizeScoreSnpVariantAnnotationsScoredVcfIndex.runtime_attr_override (RuntimeAttr?)
  • FinalizeScoreSnpVariantAnnotationsScoresHdf5.name (String?)
  • FinalizeScoreSnpVariantAnnotationsScoresHdf5.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpExtractedAnnotations.name (String?)
  • FinalizeSnpExtractedAnnotations.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpExtractedSitesOnlyVcf.name (String?)
  • FinalizeSnpExtractedSitesOnlyVcf.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpExtractedSitesOnlyVcfIndex.name (String?)
  • FinalizeSnpExtractedSitesOnlyVcfIndex.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpExtractedUnlabeledAnnotations.name (String?)
  • FinalizeSnpExtractedUnlabeledAnnotations.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpTrainVariantAnnotationsCalibrationSetScores.name (String?)
  • FinalizeSnpTrainVariantAnnotationsCalibrationSetScores.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpTrainVariantAnnotationsNegativeModelScorer.name (String?)
  • FinalizeSnpTrainVariantAnnotationsNegativeModelScorer.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpTrainVariantAnnotationsPositiveModelScorer.name (String?)
  • FinalizeSnpTrainVariantAnnotationsPositiveModelScorer.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpTrainVariantAnnotationsTrainingScores.name (String?)
  • FinalizeSnpTrainVariantAnnotationsTrainingScores.runtime_attr_override (RuntimeAttr?)
  • FinalizeSnpTrainVariantAnnotationsUnlabeledPositiveModelScores.name (String?)
  • FinalizeSnpTrainVariantAnnotationsUnlabeledPositiveModelScores.runtime_attr_override (RuntimeAttr?)
  • FingerprintAndBarcodeVcf.runtime_attr_override (RuntimeAttr?)
  • MergeAllReads.runtime_attr_override (RuntimeAttr?)
  • MosDepth.runtime_attr_override (RuntimeAttr?)
  • RegionalCoverage.runtime_attr_override (RuntimeAttr?)
  • RenameRawHcGvcf.runtime_attr_override (RuntimeAttr?)
  • RenameRawHcVcf.runtime_attr_override (RuntimeAttr?)
  • SamStats.runtime_attr_override (RuntimeAttr?)
  • ScoreIndelVariantAnnotations.runtime_attr_override (RuntimeAttr?)
  • ScoreSnpVariantAnnotations.runtime_attr_override (RuntimeAttr?)
  • TrainIndelVariantAnnotationsModel.runtime_attr_override (RuntimeAttr?)
  • TrainIndelVariantAnnotationsModel.unlabeled_annotation_hdf5 (File?)
  • TrainSnpVariantAnnotationsModel.runtime_attr_override (RuntimeAttr?)
  • TrainSnpVariantAnnotationsModel.unlabeled_annotation_hdf5 (File?)
  • CallVariantsWithHaplotypeCaller.CallVariantsWithHC.single_interval (String?)
  • CallVariantsWithHaplotypeCaller.CollapseGVCFtoVCF.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.CreateIntervalListFileFromIntervalInfo.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.ExtractIntervalNamesFromIntervalOrBamFile.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.IndexBamout.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.MergeGVCFs.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.MergeVariantCalledBamOuts.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.ReblockHcGVCF.annotations_to_keep (Array[String]?)
  • CallVariantsWithHaplotypeCaller.ReblockHcGVCF.runtime_attr_override (RuntimeAttr?)
  • CallVariantsWithHaplotypeCaller.ReblockHcGVCF.tree_score_cutoff (Float?)
  • CallVariantsWithHaplotypeCaller.SmallVariantsScatterPrep.runtime_attr_override (RuntimeAttr?)

Defaults

  • contigs_names_to_ignore (Array[String], default=["RANDOM_PLACEHOLDER_VALUE"]): Array of names of contigs to ignore for the purposes of reporting variants.
  • enable_hc_pileup_mode (Boolean, default=true): If true, will enable pileup mode in HaplotypeCaller.
  • heterozygosity (Float, default=0.001): HaplotypeCaller Parameter - Heterozygosity value used to compute prior likelihoods for any locus. See the GATKDocs for full details on the meaning of this population genetics concept.
  • heterozygosity_stdev (Float, default=0.01): HaplotypeCaller Parameter - Standard deviation of heterozygosity for SNP and indel calling.
  • indel_calibration_sensitivity (Float, default=0.99): VETS (ScoreVariantAnnotations) parameter - score below which INDEL variants will be filtered.
  • indel_heterozygosity (Float, default=0.000125): HaplotypeCaller Parameter - Heterozygosity for indel calling. See the GATKDocs for heterozygosity for full details on the meaning of this population genetics concept.
  • indel_max_unlabeled_variants (Int, default=0): VETS (ExtractVariantAnnotations) parameter - maximum number of unlabeled INDEL variants/alleles to randomly sample with reservoir sampling. If nonzero, annotations will also be extracted from unlabeled sites.
  • indel_recalibration_annotation_values (Array[String], default=["BaseQRankSum", "ExcessHet", "FS", "HAPCOMP", "HAPDOM", "HEC", "MQ", "MQRankSum", "QD", "ReadPosRankSum", "SOR", "DP"]): VETS (ScoreSnpVariantAnnotations/ScoreVariantAnnotations) parameter - Array of annotation names to use to create the INDEL variant scoring model and over which to score INDEL variants.
  • ploidy (Int, default=2): Ploidy of the species being variant called.
  • snp_calibration_sensitivity (Float, default=0.99): VETS (ScoreVariantAnnotations) parameter - score below which SNP variants will be filtered.
  • snp_max_unlabeled_variants (Int, default=0): VETS (ExtractVariantAnnotations) parameter - maximum number of unlabeled SNP variants/alleles to randomly sample with reservoir sampling. If nonzero, annotations will also be extracted from unlabeled sites.
  • snp_recalibration_annotation_values (Array[String], default=["BaseQRankSum", "ExcessHet", "FS", "HAPCOMP", "HAPDOM", "HEC", "MQ", "MQRankSum", "QD", "ReadPosRankSum", "SOR", "DP"]): VETS (ScoreSnpVariantAnnotations/ScoreVariantAnnotations) parameter - Array of annotation names to use to create the SNP variant scoring model and over which to score SNP variants.
  • CallVariantsWithHaplotypeCaller.call_vars_on_mitochondria (Boolean, default=true)
  • FastQC.num_cpus (Int, default=4)
  • RenameRawHcVcf.is_gvcf (Boolean, default=false)
  • TrainIndelVariantAnnotationsModel.calibration_sensitivity_threshold (Float, default=0.95)
  • TrainSnpVariantAnnotationsModel.calibration_sensitivity_threshold (Float, default=0.95)
  • CallVariantsWithHaplotypeCaller.CallVariantsWithHC.enable_dangling_branch_recovery (Boolean, default=false)
  • CallVariantsWithHaplotypeCaller.CollapseGVCFtoVCF.heterozygosity (Float, default=0.001): HaplotypeCaller Parameter - Heterozygosity value used to compute prior likelihoods for any locus. See the GATKDocs for full details on the meaning of this population genetics concept.
  • CallVariantsWithHaplotypeCaller.CollapseGVCFtoVCF.heterozygosity_stdev (Float, default=0.01): HaplotypeCaller Parameter - Standard deviation of heterozygosity for SNP and indel calling.
  • CallVariantsWithHaplotypeCaller.CollapseGVCFtoVCF.indel_heterozygosity (Float, default=0.000125): HaplotypeCaller Parameter - Heterozygosity for indel calling. See the GATKDocs for heterozygosity for full details on the meaning of this population genetics concept.
  • CallVariantsWithHaplotypeCaller.CollapseGVCFtoVCF.keep_combined_raw_annotations (Boolean, default=false)
  • CallVariantsWithHaplotypeCaller.ReblockHcGVCF.gq_blocks (Array[Int], default=[20, 30, 40])
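
Inputs with defaults only need to appear in an inputs file when a different value is wanted. The sketch below, which is illustrative and not part of the workflow, keeps a few of the defaults listed above in a table and emits only the overrides that actually differ. The `SRWholeGenome.` key prefix is an assumed convention, the function name is hypothetical, and whether nested call inputs such as the gq_blocks entry can be set from the top level depends on the execution engine.

```python
WORKFLOW = "SRWholeGenome"  # prefix for fully qualified input keys (assumed convention)

# A few defaults copied from the list above; extend as needed.
DEFAULTS = {
    "ploidy": 2,
    "heterozygosity": 0.001,
    "snp_calibration_sensitivity": 0.99,
    "indel_calibration_sensitivity": 0.99,
    "CallVariantsWithHaplotypeCaller.ReblockHcGVCF.gq_blocks": [20, 30, 40],
}


def qualify_overrides(overrides, known=DEFAULTS):
    """Return only the values that differ from the defaults, fully qualified."""
    unknown = sorted(set(overrides) - set(known))
    if unknown:
        raise KeyError(f"not in the defaults table: {unknown}")
    return {
        f"{WORKFLOW}.{name}": value
        for name, value in overrides.items()
        if value != known[name]
    }


# ploidy differs from its default; the sensitivity matches it and is dropped.
changed = qualify_overrides({"ploidy": 1, "snp_calibration_sensitivity": 0.99})
print(changed)  # {'SRWholeGenome.ploidy': 1}
```

Rejecting names that are not in the table catches misspelled input names early instead of leaving them to be reported, or silently ignored, at submission time.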

Outputs

  • vcf (File)
  • tbi (File)
  • g_vcf (File)
  • g_tbi (File)
  • bamout (File)
  • baiout (File)
  • fingerprint_vcf (File?)
  • barcode (String?)
  • successfully_processed (Boolean)
  • aligned_bam (File?)
  • aligned_bai (File?)
  • average_identity (Float?)
  • aligned_num_reads (Float?)
  • aligned_num_bases (Float?)
  • aligned_frac_bases (Float?)
  • aligned_est_fold_cov (Float?)
  • aligned_read_length_mean (Float?)
  • insert_size_average (Float?)
  • insert_size_standard_deviation (Float?)
  • pct_properly_paired_reads (Float?)
  • fastqc_report (File?)
  • bed_cov_summary (File?)
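
Outputs marked with `?` are optional and may be absent from a finished run, so downstream code should not assume they exist. The following is a minimal sketch of reading a dict of outputs under that assumption. The `SRWholeGenome.` key prefix, the function names, and the example values are hypothetical and not taken from the workflow.

```python
WORKFLOW = "SRWholeGenome"  # prefix for fully qualified output keys (assumed convention)

# Outputs listed above without a trailing '?'.
REQUIRED_OUTPUTS = ["vcf", "tbi", "g_vcf", "g_tbi", "bamout", "baiout", "successfully_processed"]


def check_outputs(outputs):
    """Return (ok, missing) for a dict of '<workflow>.<output>' values."""
    missing = [
        name for name in REQUIRED_OUTPUTS
        if outputs.get(f"{WORKFLOW}.{name}") is None
    ]
    ok = not missing and outputs[f"{WORKFLOW}.successfully_processed"] is True
    return ok, missing


def optional_output(outputs, name, default=None):
    """Return an optional output, or `default` when it is absent or null."""
    value = outputs.get(f"{WORKFLOW}.{name}")
    return default if value is None else value


# Hypothetical outputs of a run that did not produce the optional metrics.
example_outputs = {
    "SRWholeGenome.vcf": "gs://example-bucket/out/sample01.vcf.gz",
    "SRWholeGenome.tbi": "gs://example-bucket/out/sample01.vcf.gz.tbi",
    "SRWholeGenome.g_vcf": "gs://example-bucket/out/sample01.g.vcf.gz",
    "SRWholeGenome.g_tbi": "gs://example-bucket/out/sample01.g.vcf.gz.tbi",
    "SRWholeGenome.bamout": "gs://example-bucket/out/sample01.bamout.bam",
    "SRWholeGenome.baiout": "gs://example-bucket/out/sample01.bamout.bai",
    "SRWholeGenome.successfully_processed": True,
    "SRWholeGenome.aligned_est_fold_cov": None,
}
ok, missing = check_outputs(example_outputs)
coverage = optional_output(example_outputs, "aligned_est_fold_cov", default=float("nan"))
```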

Dot Diagram

[Figure: dot diagram of the SRWholeGenome workflow]