StripyWorkflow
StripyWorkflow is an optional standalone workflow that runs
STRipy on a single sample to genotype a curated set of known pathogenic
short tandem repeat (STR) expansions. It is intended for targeted follow-up rather than
genome-wide STR discovery.
In joint-calling workflows, this module is typically run after EvidenceQC and sample QC, once the cohort PED file has been finalized. The resulting single-sample STRipy VCFs can then be merged in ClusterBatch and appended to the final cohort VCF in AnnotateVcf.
The following diagram illustrates the recommended invocation order:
This workflow is optional. Run it only for samples where targeted analysis of known pathogenic STR expansions is desired.
Inputs
bam_or_cram_file
Sample alignment file in BAM or CRAM format.
Optional bam_or_cram_index
Index for the input BAM or CRAM. If omitted, the workflow expects the index to sit alongside
the input file with the standard .bai or .crai extension.
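When the index is omitted, it can help to check ahead of time that an index file actually sits next to the alignment. The sketch below lists the two naming styles in common use for sibling indexes. The helper name `candidate_index_paths` is hypothetical, and which of the two styles the workflow resolves is an assumption to verify against the WDL.

```python
def candidate_index_paths(alignment_path: str) -> list:
    """Return index paths commonly found beside a BAM or CRAM file.

    Works on plain strings so that cloud URLs (e.g. gs://...) are preserved.
    """
    lower = alignment_path.lower()
    if lower.endswith(".bam"):
        index_ext = ".bai"
    elif lower.endswith(".cram"):
        index_ext = ".crai"
    else:
        raise ValueError(f"Expected a .bam or .cram file, got: {alignment_path}")

    # Strip the final extension to build the "sample.bai" style name.
    stem = alignment_path[: alignment_path.rfind(".")]

    # Assumption: both "sample.bam.bai" and "sample.bai" styles are plausible;
    # the style the workflow actually expects is not stated in this document.
    return [alignment_path + index_ext, stem + index_ext]
```

If neither candidate exists, passing bam_or_cram_index explicitly avoids relying on the default lookup.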
ped_file
PED file used to look up the sample sex for STRipy. See PED file format.
reference_fasta, reference_fasta_fai
Reference FASTA and its FASTA index. These must be the same reference the sample was aligned to.
sample_name
Sample identifier. This must match the sample ID in the PED file.
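Because sample_name must match an ID in the PED file, a quick pre-flight check can catch mismatches before the workflow is launched. The following is a minimal sketch, assuming the standard six-column PED layout (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) with sex coded as 1 for male and 2 for female. The function name `lookup_sex` is hypothetical, and how the workflow itself passes sex to STRipy is not shown here.

```python
def lookup_sex(ped_path: str, sample_name: str) -> str:
    """Return 'male', 'female', or 'unknown' for a sample listed in a PED file."""
    sex_codes = {"1": "male", "2": "female"}
    with open(ped_path) as ped:
        for line in ped:
            # Skip blank lines and comment/header lines.
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.split()
            if len(fields) < 5:
                raise ValueError(f"Malformed PED line: {line!r}")
            # Column 2 is the individual ID, column 5 is the sex code.
            if fields[1] == sample_name:
                return sex_codes.get(fields[4], "unknown")
    raise KeyError(f"Sample {sample_name!r} not found in {ped_path}")
```

A `KeyError` here indicates that the sample ID and the PED file disagree, which is worth resolving before submitting the run.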
Optional genome_build
Reference build name passed to STRipy. Default: hg38.
Optional locus
Comma-separated list of loci to analyze. By default, the workflow runs STRipy on its built-in panel of known pathogenic repeat-expansion loci.
Optional custom_catalog
Custom STRipy catalog file. Use this to add loci to the default pathogenic panel or to override its locus definitions.
Optional analysis
STRipy analysis mode. Default: standard.
Optional config
Base STRipy configuration file.
Optional verbose
Enable verbose STRipy logging. Default: false.
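The inputs above are typically supplied as a JSON file. The sketch below builds one in Python, assuming the common WDL convention of prefixing each key with the workflow name (`StripyWorkflow.` here). The helper `build_inputs`, the file paths, the `.fai` naming of the FASTA index, and the example locus names are all illustrative assumptions rather than values taken from the workflow.

```python
import json


def build_inputs(sample_name, alignment, ped_file, reference_fasta,
                 loci=None, workflow="StripyWorkflow"):
    """Assemble a WDL-style inputs dictionary for a single sample."""
    inputs = {
        f"{workflow}.sample_name": sample_name,
        f"{workflow}.bam_or_cram_file": alignment,
        f"{workflow}.ped_file": ped_file,
        f"{workflow}.reference_fasta": reference_fasta,
        # Assumption: the FASTA index sits next to the FASTA as "<fasta>.fai".
        f"{workflow}.reference_fasta_fai": reference_fasta + ".fai",
    }
    # Optional inputs are left out so that the workflow defaults apply.
    if loci:
        # The locus input is a single comma-separated string.
        inputs[f"{workflow}.locus"] = ",".join(loci)
    return inputs


# Placeholder paths and locus names, for illustration only.
example = build_inputs(
    sample_name="NA12878",
    alignment="gs://my-bucket/NA12878.cram",
    ped_file="gs://my-bucket/cohort.ped",
    reference_fasta="gs://my-bucket/ref/genome.fasta",
    loci=["HTT", "FMR1"],
)
print(json.dumps(example, indent=2))
```

Omitting optional keys such as genome_build or analysis keeps the file short and lets the documented defaults (hg38, standard) take effect.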
Outputs
stripy_json
Per-sample STRipy JSON output.
stripy_tsv
Per-sample STRipy tabular summary.
stripy_html
Per-sample STRipy HTML report.
Optional stripy_vcf
Single-sample STRipy VCF for downstream merging in ClusterBatch and optional inclusion in the final cohort VCF.