Leafcutter Clustering

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| --- | --- | --- | --- |
| aou_9.0.0 | July, 2025 | WARP Pipelines | File an issue |

Introduction to the Leafcutter Clustering workflow

leafcutter_cluster.wdl defines workflow leafcutter_cluster_workflow, which clusters junction data and produces sQTL phenotype artifacts including bed/parquet matrices, phenotype groups, and PCs.

The workflow localizes the junction files named in a file-of-paths list, runs cluster preparation, and writes outputs that are ready to use as inputs to downstream leafcutter-based analysis.
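As an illustration of the file-of-paths input, the sketch below writes one junction file path per line. The helper name `write_junc_files_list`, the bucket, and the file names are hypothetical and are not part of the workflow.

```python
from pathlib import Path


def write_junc_files_list(junc_paths, list_path):
    """Write one junction file path per line and return the list's path."""
    cleaned = [str(p).strip() for p in junc_paths if str(p).strip()]
    if not cleaned:
        raise ValueError("no junction file paths given")
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("duplicate junction file paths")
    out = Path(list_path)
    out.write_text("\n".join(cleaned) + "\n")
    return out


# Hypothetical paths, for illustration only.
example_paths = [
    "gs://example-bucket/juncs/sample_A.junc.gz",
    "gs://example-bucket/juncs/sample_B.junc.gz",
]
```

Calling `write_junc_files_list(example_paths, "junc_files.txt")` would produce a two-line text file that could then be supplied as `junc_files_list`.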

Inputs

Input descriptions

| Input variable name | Description | Type |
| --- | --- | --- |
| junc_files_list | File containing one junction file path per line. | File |
| exon_list | Exon annotation list for clustering script. | File |
| genes_gtf | Gene GTF used for clustering and feature annotation. | File |
| prefix | Prefix used for all output filenames. | String |
| sample_participant_lookup | Sample-to-participant mapping file. | File |
| min_clu_reads | (Optional) minimum cluster reads threshold. | Int? |
| min_clu_ratio | (Optional) minimum cluster ratio threshold. | Float? |
| max_intron_len | (Optional) maximum intron length. | Int? |
| num_pcs | (Optional) number of PCs to compute. | Int? |
| memory | Runtime memory in GB. | Int |
| disk_space | Runtime disk size. | Int |
| num_threads | Runtime CPU threads. | Int |
| num_preempt | Runtime preemptible count. | Int |
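A minimal sketch of assembling an inputs JSON from the table above. It assumes every input is declared at the workflow level, so keys take the form `leafcutter_cluster_workflow.<input name>`, and that inputs without a `?` type must be supplied; check both assumptions against the WDL. All paths and values are hypothetical.

```python
import json

WORKFLOW = "leafcutter_cluster_workflow"

# Input names taken from the table; treating the first group as
# required is an assumption (the WDL may define defaults).
REQUIRED = (
    "junc_files_list", "exon_list", "genes_gtf", "prefix",
    "sample_participant_lookup", "memory", "disk_space",
    "num_threads", "num_preempt",
)
OPTIONAL = ("min_clu_reads", "min_clu_ratio", "max_intron_len", "num_pcs")


def build_inputs(values):
    """Return a dict keyed by '<workflow>.<input>', skipping unset optionals."""
    missing = [name for name in REQUIRED if values.get(name) is None]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")
    unknown = [name for name in values if name not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError(f"unknown inputs: {unknown}")
    return {
        f"{WORKFLOW}.{name}": value
        for name, value in values.items()
        if value is not None
    }


# Hypothetical values, for illustration only.
example_values = {
    "junc_files_list": "gs://example-bucket/junc_files.txt",
    "exon_list": "gs://example-bucket/exons.txt.gz",
    "genes_gtf": "gs://example-bucket/genes.gtf",
    "prefix": "example_cohort",
    "sample_participant_lookup": "gs://example-bucket/lookup.txt",
    "memory": 16,
    "disk_space": 100,
    "num_threads": 4,
    "num_preempt": 1,
    "num_pcs": 10,
}

if __name__ == "__main__":
    print(json.dumps(build_inputs(example_values), indent=2))
```

Optional inputs that are left out are simply omitted from the JSON, which lets the workflow fall back to whatever behavior it defines for unset values.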

Outputs

| Output variable name | Filename, if applicable | Output format and description |
| --- | --- | --- |
| counts_out | <prefix>_perind.counts.gz | Per-individual intron cluster counts. |
| counts_numers_out | <prefix>_perind_numers.counts.gz | Per-individual numerator counts. |
| clusters_pooled_out | <prefix>_pooled.gz | Pooled cluster definitions. |
| clusters_refined_out | <prefix>_refined.gz | Refined cluster definitions. |
| phenotype_groups_out | <prefix>.leafcutter.phenotype_groups.txt | Phenotype group file for TensorQTL-style sQTL analysis. |
| bed_parquet_out | <prefix>.leafcutter.bed.parquet | Leafcutter phenotype matrix in parquet format. |
| bed_out | <prefix>.leafcutter.bed.gz | Leafcutter phenotype matrix in BED-like gzip format. |
| bed_index_out | <prefix>.leafcutter.bed.gz.tbi | Tabix index for bed_out. |
| pcs_out | <prefix>.leafcutter.PCs.txt | Phenotype PCs from clustering pipeline. |
| leafcutter_pipeline_version | aou_9.0.0 | Workflow version string output. |
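The filename patterns in the table can be expanded for a given `prefix`, for example to check that a finished run produced every expected file. This is a sketch built only from the table; the helper name `expected_outputs` and the example prefix are hypothetical.

```python
# Suffix appended to <prefix> for each file output listed in the table.
# leafcutter_pipeline_version is a string output, so it has no filename.
OUTPUT_SUFFIXES = {
    "counts_out": "_perind.counts.gz",
    "counts_numers_out": "_perind_numers.counts.gz",
    "clusters_pooled_out": "_pooled.gz",
    "clusters_refined_out": "_refined.gz",
    "phenotype_groups_out": ".leafcutter.phenotype_groups.txt",
    "bed_parquet_out": ".leafcutter.bed.parquet",
    "bed_out": ".leafcutter.bed.gz",
    "bed_index_out": ".leafcutter.bed.gz.tbi",
    "pcs_out": ".leafcutter.PCs.txt",
}


def expected_outputs(prefix):
    """Map each output variable name to its expected filename."""
    return {name: prefix + suffix for name, suffix in OUTPUT_SUFFIXES.items()}


if __name__ == "__main__":
    # "example_cohort" is a hypothetical prefix.
    for name, filename in expected_outputs("example_cohort").items():
        print(f"{name}\t{filename}")
```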

Workflow and WDL

Versioning

All leafcutter_cluster releases are documented in the changelog.

Feedback

Please help us make our tools better by filing an issue in WARP; we welcome pipeline-related suggestions or questions.