VariantUtils

MergePerChrCalls

description
Merge per-chromosome calls into a single VCF

Inputs

Required

  • prefix (String, required): Prefix for output VCF
  • ref_dict (File, required): Reference dictionary
  • vcfs (Array[File], required): List of per-chromosome VCFs to merge

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • vcf (File)
  • tbi (File)

MergeAndSortVCFs

description
Quickly merge and sort VCFs when the default sorting is expected to be slow

Inputs

Required

  • prefix (String, required)
  • ref_fasta_fai (File, required)
  • vcfs (Array[File], required)

Optional

  • header_definitions_file (File?): a union of definition header lines for input VCFs (related to https://github.com/samtools/bcftools/issues/1629)
  • runtime_attr_override (RuntimeAttr?)

Outputs

  • vcf (File)
  • tbi (File)
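
The presence of ref_fasta_fai suggests that the merged records are ordered by the contig order of the reference index. The sketch below illustrates that idea only; it is not the task's actual command. The helper names and the toy .fai and record lines are hypothetical.

```python
def contig_order_from_fai(fai_text):
    """Map contig name -> rank, using the first column of a .fai index."""
    order = {}
    for line in fai_text.splitlines():
        if line.strip():
            order[line.split("\t")[0]] = len(order)
    return order


def sort_vcf_records(records, order):
    """Sort VCF data lines by (contig rank in the .fai, POS)."""
    def key(line):
        chrom, pos = line.split("\t")[:2]
        return (order[chrom], int(pos))
    return sorted(records, key=key)


# Toy inputs (hypothetical): two contigs and three unsorted records.
fai = "chr1\t248956422\t6\t60\t61\nchr2\t242193529\t253105708\t60\t61\n"
records = ["chr2\t500\t.\tA\tT", "chr1\t900\t.\tG\tC", "chr1\t100\t.\tC\tG"]
sorted_records = sort_vcf_records(records, contig_order_from_fai(fai))
```

Sorting by contig rank rather than by contig name keeps chr2 ahead of chr10, which a plain lexical sort would not.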

CollectDefinitions

description
Collect the union of the various definitions in VCF files, addressing a bcftools bug: https://github.com/samtools/bcftools/issues/1629

Inputs

Required

  • vcfs (Array[File], required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • union_definitions (File)
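
A minimal sketch of what taking the union of header definitions could look like. It assumes that "definitions" means the ##INFO, ##FORMAT and ##FILTER header lines; the task itself may collect a different set. The function name and the toy headers are hypothetical.

```python
DEFINITION_PREFIXES = ("##INFO", "##FORMAT", "##FILTER")  # assumed set of definitions


def union_definitions(headers):
    """Return de-duplicated definition lines across VCF headers, in first-seen order."""
    seen = set()
    union = []
    for header in headers:
        for line in header.splitlines():
            if line.startswith(DEFINITION_PREFIXES) and line not in seen:
                seen.add(line)
                union.append(line)
    return union


# Toy headers (hypothetical): the DP definition is shared, the others are not.
header_a = (
    "##fileformat=VCFv4.2\n"
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">\n'
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
)
header_b = (
    "##fileformat=VCFv4.2\n"
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">\n'
    '##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">\n'
)
definitions = union_definitions([header_a, header_b])
```

The resulting lines are the kind of content the header_definitions_file input of MergeAndSortVCFs is described as expecting.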

GetVCFSampleName

description
Currently used mostly to extract the sample name from a genotyped fingerprinting VCF

Inputs

Required

  • fingerprint_vcf (File, required): Assumed to be genotyped and to hold only one sample (other samples will be ignored).

Optional

  • runtime_attr_override (RuntimeAttr?): Override default runtime attributes

Outputs

  • sample_name (String)
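
As an illustration of where a sample name lives in a VCF, the sketch below reads it from the #CHROM header line, where sample columns start at the tenth field. It assumes that "other samples will be ignored" means the first sample column is used; the function name and the toy header are hypothetical.

```python
def first_sample_name(vcf_lines):
    """Return the first sample column name from the #CHROM header line."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            samples = fields[9:]  # the first nine columns are fixed: CHROM..FORMAT
            if not samples:
                raise ValueError("VCF has no sample columns")
            return samples[0]
    raise ValueError("no #CHROM header line found")


# Toy header (hypothetical) with two samples; only the first is returned.
vcf_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12878\tNA12891\n",
]
sample_name = first_sample_name(vcf_lines)
```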

SubsetVCF

description
Subset a VCF file to a given locus

Inputs

Required

  • locus (String, required): Locus to be subsetted
  • vcf_gz (File, required): VCF file to be subsetted
  • vcf_tbi (File, required): Tabix index for the VCF file

Optional

  • runtime_attr_override (RuntimeAttr?): Override default runtime attributes

Defaults

  • prefix (String, default="subset"): Prefix for the output file

Outputs

  • subset_vcf (File)
  • subset_tbi (File)
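
The format of locus is not spelled out here. The sketch below assumes the common contig:start-end region syntax (with the coordinates optional) and shows how such a string could be validated before launching the task. The regular expression and function name are hypothetical, and contig names that themselves contain a colon are not handled.

```python
import re

# Assumed syntax: "chr1", "chr1:1000" or "chr1:1,000-2,000".
LOCUS_RE = re.compile(r"^(?P<chrom>[^:]+)(?::(?P<start>[\d,]+)(?:-(?P<end>[\d,]+))?)?$")


def parse_locus(locus):
    """Split a locus string into (contig, start, end); missing parts are None."""
    match = LOCUS_RE.match(locus)
    if match is None:
        raise ValueError(f"unrecognized locus: {locus!r}")
    start = int(match["start"].replace(",", "")) if match["start"] else None
    end = int(match["end"].replace(",", "")) if match["end"] else None
    return match["chrom"], start, end


whole_contig = parse_locus("chrM")
interval = parse_locus("chr1:1,000-2000")
```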

ZipAndIndexVCF

description
Gzip a plain-text VCF and index it.

Inputs

Required

  • vcf (File, required): VCF file to be zipped and indexed

Optional

  • runtime_attr_override (RuntimeAttr?): Override default runtime attributes

Outputs

  • vcfgz (File)
  • tbi (File)

IndexVCF

description
Index a vcf.gz file. Note: do NOT use a remote index, as that is buggy.

Inputs

Required

  • vcf (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • tbi (File)

FixSnifflesVCF

Inputs

Required

  • sample_name (String, required): Sniffles infers sample name from the BAM file name, so we fix it here
  • vcf (File, required)

Optional

  • ref_fasta_fai (File?): provide only when the contig section of the input vcf is suspected to be corrupted
  • runtime_attr_override (RuntimeAttr?)

Outputs

  • sortedVCF (File)
  • tbi (File)

HardFilterVcf

Inputs

Required

  • prefix (String, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • excess_het_threshold (Float, default=54.69)

Outputs

  • variant_filtered_vcf (File)
  • variant_filtered_vcf_index (File)
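
The default of 54.69 reads more naturally once converted. Assuming excess_het_threshold is compared against GATK's phred-scaled ExcessHet annotation (an assumption; this page does not say so), the arithmetic below converts the phred value back to a p-value and to the matching z-score of a standard normal distribution. The function name is hypothetical.

```python
from statistics import NormalDist


def phred_to_p(phred):
    """Convert a phred-scaled value to a probability: p = 10^(-phred / 10)."""
    return 10 ** (-phred / 10)


p_value = phred_to_p(54.69)
# Lower-tail quantile of the standard normal for that p-value.
z_score = NormalDist().inv_cdf(p_value)
print(f"p = {p_value:.2e}, z = {z_score:.2f}")
```

Under that assumption the default corresponds to a p-value of about 3.4e-06, which is a z-score of about -4.5.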

MakeSitesOnlyVcf

Inputs

Required

  • prefix (String, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • sites_only_vcf (File)
  • sites_only_vcf_index (File)

AnnotateVcfWithBedRegions

Inputs

Required

  • bed_file_annotation_names (Array[String], required)
  • bed_file_indexes (Array[File], required)
  • bed_files (Array[File], required)
  • prefix (String, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • annotated_vcf (File)
  • annotated_vcf_index (File)

IndelsVariantRecalibrator

Inputs

Required

  • is_known (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position contains known variants. Must be the same length as known_reference_variants.
  • is_training (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position contains training data. Must be the same length as known_reference_variants.
  • is_truth (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position contains truth data. Must be the same length as known_reference_variants.
  • known_reference_variants (Array[File], required): Array of known reference VCF files. For humans, dbSNP is one example.
  • known_reference_variants_identifier (Array[String], required): Array of strings giving the identifier / name for the known_reference_variant file at the same array position. Must be the same length as known_reference_variants.
  • known_reference_variants_index (Array[File], required): Array of index files for known reference VCF files.
  • prefix (String, required)
  • prior (Array[Float], required): Array of float values indicating the priors for the known_reference_variant file at the same array position. Must be the same length as known_reference_variants.
  • recalibration_annotation_values (Array[String], required)
  • recalibration_tranche_values (Array[String], required)
  • use_allele_specific_annotations (Boolean, required)
  • vcf_indices (Array[File], required): Tribble indexes for the sites-only VCFs.
  • vcfs (Array[File], required): Sites only VCFs. Can be pre-filtered using hard-filters.

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • max_gaussians (Int, default=4)

Outputs

  • recalibration (File)
  • recalibration_index (File)
  • tranches (File)
  • model_report (File)
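
Several inputs above are parallel arrays that must line up position by position with known_reference_variants. The sketch below checks the lengths and combines each position into a GATK VariantRecalibrator style --resource argument. Whether the task assembles its command this way is an assumption; the function name, resource names, file names and priors are hypothetical.

```python
def build_resource_args(identifiers, files, is_known, is_training, is_truth, priors):
    """Zip the per-resource arrays into '--resource:...' argument pairs."""
    arrays = [identifiers, files, is_known, is_training, is_truth, priors]
    if len({len(a) for a in arrays}) != 1:
        raise ValueError("all per-resource arrays must have the same length")
    args = []
    for name, path, known, training, truth, prior in zip(*arrays):
        flags = (
            f"known={str(known).lower()},training={str(training).lower()},"
            f"truth={str(truth).lower()},prior={prior}"
        )
        args += [f"--resource:{name},{flags}", path]
    return args


# Toy values (hypothetical), one entry per known reference VCF.
resource_args = build_resource_args(
    identifiers=["mills", "dbsnp"],
    files=["mills.vcf.gz", "dbsnp.vcf.gz"],
    is_known=[False, True],
    is_training=[True, False],
    is_truth=[True, False],
    priors=[12.0, 2.0],
)
```

Running a check like this before submitting the workflow surfaces a length mismatch early instead of inside the task.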

SNPsVariantRecalibratorCreateModel

Inputs

Required

  • is_known (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position contains known variants. Must be the same length as known_reference_variants.
  • is_training (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position contains training data. Must be the same length as known_reference_variants.
  • is_truth (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position contains truth data. Must be the same length as known_reference_variants.
  • known_reference_variants (Array[File], required): Array of known reference VCF files. For humans, dbSNP is one example.
  • known_reference_variants_identifier (Array[String], required): Array of strings giving the identifier / name for the known_reference_variant file at the same array position. Must be the same length as known_reference_variants.
  • known_reference_variants_index (Array[File], required): Array of index files for known reference VCF files.
  • prefix (String, required)
  • prior (Array[Float], required): Array of float values indicating the priors for the known_reference_variant file at the same array position. Must be the same length as known_reference_variants.
  • recalibration_annotation_values (Array[String], required)
  • recalibration_tranche_values (Array[String], required)
  • use_allele_specific_annotations (Boolean, required)
  • vcf_indices (Array[File], required): Tribble indexes for the sites-only VCFs.
  • vcfs (Array[File], required): Sites only VCFs. Can be pre-filtered using hard-filters.

Optional

  • downsampleFactor (Int?)
  • runtime_attr_override (RuntimeAttr?)

Defaults

  • max_gaussians (Int, default=6)

Outputs

  • recalibration (File)
  • recalibration_index (File)
  • tranches (File)
  • model_report (File)

ApplyVqsr

Inputs

Required

  • indel_filter_level (Float, required)
  • indels_recalibration (File, required)
  • indels_recalibration_index (File, required)
  • indels_tranches (File, required)
  • prefix (String, required)
  • snp_filter_level (Float, required)
  • snps_recalibration (File, required)
  • snps_recalibration_index (File, required)
  • snps_tranches (File, required)
  • use_allele_specific_annotations (Boolean, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • recalibrated_vcf (File)
  • recalibrated_vcf_index (File)

SelectVariants

Inputs

Required

  • prefix (String, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • vcf_out (File)
  • vcf_out_index (File)

RenameSingleSampleVcf

Inputs

Required

  • new_sample_name (String, required)
  • prefix (String, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • is_gvcf (Boolean, default=false)

Outputs

  • new_sample_name_vcf (File)
  • new_sample_name_vcf_index (File)

GatherVcfs

Inputs

Required

  • input_vcf_indices (Array[File], required)
  • input_vcfs (Array[File], required); localization_optional: true
  • prefix (String, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Outputs

  • output_vcf (File)
  • output_vcf_index (File)

ExtractFingerprint

Inputs

Required

  • bai (File, required)
  • bam (File, required)
  • haplotype_database_file (File, required)
  • ref_dict (File, required)
  • ref_fasta (File, required)
  • ref_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • prefix (String, default="fingerprint")

Outputs

  • output_vcf (File)
  • fingerprint_string (File)

ExtractFingerprintAndBarcode

Inputs

Required

  • haplotype_database_file (File, required)
  • ref_dict (File, required)
  • ref_fasta (File, required)
  • ref_fasta_fai (File, required)
  • vcf (File, required)
  • vcf_index (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • prefix (String, default="fingerprint")

Outputs

  • output_vcf (File)
  • barcode (String)
  • barcode_file (File)

ExtractVariantAnnotations

Inputs

Required

  • is_calibration (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position should be used for 'calibration' data. Must be the same length as known_reference_variants.
  • is_training (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position should be used for 'training' data. Must be the same length as known_reference_variants.
  • known_reference_variants (Array[File], required): Array of known reference VCF files. For humans, dbSNP is one example.
  • known_reference_variants_identifier (Array[String], required): Array of strings giving the identifier / name for the known_reference_variant file at the same array position. Must be the same length as known_reference_variants.
  • known_reference_variants_index (Array[File], required): Array of index files for known reference VCF files.
  • mode (String, required): SNP or INDEL
  • prefix (String, required): Prefix of the output files.
  • recalibration_annotation_values (Array[String], required)
  • vcf (File, required): VCF File from which to extract annotations.
  • vcf_index (File, required): Index for the given VCF file.

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • max_unlabeled_variants (Int, default=0): How many sites should be used for unlabeled training data. Setting this to a value > 0 enables a positive-negative training model.

Outputs

  • annotation_hdf5 (File)
  • sites_only_vcf (File)
  • sites_only_vcf_index (File)
  • unlabeled_annotation_hdf5 (File?)

TrainVariantAnnotationsModel

Inputs

Required

  • annotation_hdf5 (File, required): Labeled-annotations HDF5 file.
  • mode (String, required): SNP or INDEL
  • prefix (String, required): Prefix of the output files.

Optional

  • runtime_attr_override (RuntimeAttr?)
  • unlabeled_annotation_hdf5 (File?): Unlabeled-annotations HDF5 file (optional)

Defaults

  • calibration_sensitivity_threshold (Float, default=0.95): Calibration-set sensitivity threshold. (optional)

Outputs

  • training_scores (File)
  • positive_model_scorer_pickle (File)
  • unlabeled_positive_model_scores (File?)
  • calibration_set_scores (File?)
  • negative_model_scorer_pickle (File?)

ScoreVariantAnnotations

Inputs

Required

  • is_calibration (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position should be used for 'calibration' data. Must be the same length as known_reference_variants.
  • is_training (Array[Boolean], required): Array of boolean values indicating if the known_reference_variant file at the same array position should be used for 'training' data. Must be the same length as known_reference_variants.
  • known_reference_variants (Array[File], required): Array of known reference VCF files. For humans, dbSNP is one example.
  • known_reference_variants_identifier (Array[String], required): Array of strings giving the identifier / name for the known_reference_variant file at the same array position. Must be the same length as known_reference_variants.
  • known_reference_variants_index (Array[File], required): Array of index files for known reference VCF files.
  • mode (String, required): SNP or INDEL
  • model_files (Array[File], required)
  • model_prefix (String, required)
  • prefix (String, required): Prefix of the output files.
  • recalibration_annotation_values (Array[String], required)
  • sites_only_extracted_vcf (File, required)
  • sites_only_extracted_vcf_index (File, required)
  • vcf (File, required): VCF File from which to extract annotations.
  • vcf_index (File, required): Index for the given VCF file.

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • calibration_sensitivity_threshold (Float, default=0.99)

Outputs

  • scored_vcf (File)
  • scored_vcf_index (File)
  • annotations_hdf5 (File?)
  • scores_hdf5 (File?)

CompressAndIndex

description
Convert a BCF file to a vcf.bgz file and index it.

Inputs

Required

  • joint_bcf (File, required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • num_cpus (Int, default=8)
  • prefix (String, default="out")

Outputs

  • joint_gvcf (File)
  • joint_gvcf_tbi (File)

ConcatBCFs

description
Concatenate BCFs into a single .vcf.bgz file and index it.

Inputs

Required

  • bcfs (Array[File], required)

Optional

  • runtime_attr_override (RuntimeAttr?)

Defaults

  • num_cpus (Int, default=4)
  • prefix (String, default="out")

Outputs

  • joint_gvcf (File)
  • joint_gvcf_tbi (File)