StripyWorkflow

WDL source code

StripyWorkflow is an optional standalone workflow that runs STRipy on a single sample to genotype a curated set of known pathogenic short tandem repeat (STR) expansions. It is intended for targeted follow-up rather than genome-wide STR discovery.

In joint-calling workflows, this module is typically run after EvidenceQC and sample QC once the cohort PED file has been finalized. The resulting single-sample STRipy VCFs can then be merged in ClusterBatch and appended to the final cohort VCF in AnnotateVcf.

[Diagram: recommended invocation order for StripyWorkflow]

Note: This workflow is optional. Run it only for samples where targeted analysis of known pathogenic STR expansions is desired.

Inputs

bam_or_cram_file

Sample alignment file in BAM or CRAM format.

Optional bam_or_cram_index

Index for the input BAM or CRAM file. If omitted, the workflow expects to find the index beside the input file, with the standard .bai (BAM) or .crai (CRAM) extension.
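To check ahead of time where the index would need to sit when bam_or_cram_index is omitted, a small helper along these lines may be useful. It is only a sketch: the function name default_index_path is hypothetical, and it assumes the index name is the alignment file name with .bai or .crai appended (for example sample.cram.crai). The workflow itself may accept other naming conventions.

```python
def default_index_path(alignment_path: str) -> str:
    """Guess the index path expected beside a BAM or CRAM file.

    Assumption: the index name is the alignment name plus ".bai" or ".crai".
    """
    lowered = alignment_path.lower()
    if lowered.endswith(".bam"):
        return alignment_path + ".bai"
    if lowered.endswith(".cram"):
        return alignment_path + ".crai"
    raise ValueError(f"Not a BAM or CRAM path: {alignment_path}")


if __name__ == "__main__":
    # Hypothetical bucket and sample name, for illustration only.
    print(default_index_path("gs://my-bucket/NA12878.cram"))
```

If the index lives elsewhere or uses a different name, pass it explicitly through bam_or_cram_index instead of relying on the default lookup.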

ped_file

PED file used to look up the sample sex for STRipy. See PED file format.

reference_fasta, reference_fasta_fai

Reference FASTA and FASTA index matching the aligned sample.

sample_name

Sample identifier. This must match the sample ID in the PED file.
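Because sample_name must match an ID in the PED file, it can be worth confirming the match before launching the workflow. The sketch below is not the workflow's own lookup code. It assumes the conventional six-column PED layout (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) with sex coded as 1 for male and 2 for female. The function name lookup_sex and the example record are hypothetical.

```python
SEX_CODES = {"1": "male", "2": "female"}  # any other code is treated as unknown


def lookup_sex(ped_lines, sample_name):
    """Return 'male', 'female', or 'unknown' for sample_name.

    ped_lines is any iterable of PED records (for example an open file).
    Raises KeyError if the sample ID is not present.
    """
    for line in ped_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(f"Expected 6 PED columns, got {len(fields)}: {line}")
        if fields[1] == sample_name:  # column 2 holds the individual ID
            return SEX_CODES.get(fields[4], "unknown")
    raise KeyError(f"Sample {sample_name} not found in PED file")


if __name__ == "__main__":
    example_ped = ["FAM1\tNA12878\t0\t0\t2\t1"]  # made-up record for illustration
    print(lookup_sex(example_ped, "NA12878"))
```

A KeyError from this check indicates that sample_name and the PED file disagree, which should be resolved before running the workflow.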

Optional genome_build

Reference build name passed to STRipy. Default: hg38.

Optional locus

Comma-separated list of loci to analyze. By default, the workflow runs STRipy on its built-in panel of known pathogenic repeat-expansion loci.

Optional custom_catalog

Custom STRipy catalog file. Use this to add or override loci beyond the default pathogenic panel.

Optional analysis

STRipy analysis mode. Default: standard.

Optional config

Base STRipy configuration file.

Optional verbose

Enable verbose STRipy logging. Default: false.
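The inputs above are typically supplied to a WDL runner as a JSON file. The following sketch assembles such a file in Python. It assumes the runner expects keys qualified with the workflow name (as Cromwell does) and that the workflow is named StripyWorkflow in the WDL. The helper build_inputs, all file paths, and the locus names are illustrative placeholders. Optional inputs that are left out are simply omitted so the workflow defaults apply.

```python
import json

WORKFLOW_NAME = "StripyWorkflow"  # assumed to match the workflow name in the WDL
OPTIONAL_INPUTS = {
    "bam_or_cram_index", "genome_build", "locus",
    "custom_catalog", "analysis", "config", "verbose",
}


def build_inputs(sample_name, bam_or_cram_file, ped_file,
                 reference_fasta, reference_fasta_fai, **optional):
    """Build a workflow-qualified inputs dictionary for one sample."""
    unknown = set(optional) - OPTIONAL_INPUTS
    if unknown:
        raise ValueError(f"Unknown optional inputs: {sorted(unknown)}")
    inputs = {
        "sample_name": sample_name,
        "bam_or_cram_file": bam_or_cram_file,
        "ped_file": ped_file,
        "reference_fasta": reference_fasta,
        "reference_fasta_fai": reference_fasta_fai,
    }
    for name, value in optional.items():
        if value is None:
            continue  # omit unset options so workflow defaults are used
        if name == "locus" and not isinstance(value, str):
            value = ",".join(value)  # locus is a comma-separated list
        inputs[name] = value
    return {f"{WORKFLOW_NAME}.{name}": value for name, value in inputs.items()}


if __name__ == "__main__":
    # All paths and locus names below are placeholders.
    example = build_inputs(
        sample_name="NA12878",
        bam_or_cram_file="gs://my-bucket/NA12878.cram",
        ped_file="gs://my-bucket/cohort.ped",
        reference_fasta="gs://my-bucket/ref/genome.fasta",
        reference_fasta_fai="gs://my-bucket/ref/genome.fasta.fai",
        locus=["HTT", "FMR1"],
    )
    print(json.dumps(example, indent=2))
```

Rejecting unrecognized option names catches typos early, since a misspelled input key would otherwise only surface when the runner validates the inputs file.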

Outputs

stripy_json

Per-sample STRipy JSON output.

stripy_tsv

Per-sample STRipy tabular summary.

stripy_html

Per-sample STRipy HTML report.

Optional stripy_vcf

Single-sample STRipy VCF for downstream merging in ClusterBatch and optional inclusion in the final cohort VCF.