
Run Sample Outlier QC Plotting

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| --- | --- | --- | --- |
| aou_9.0.0 | September, 2025 | WARP Pipelines | File an issue |

Introduction to the Run Sample Outlier QC Plotting workflow

run_sample_outlier_qc_plotting is a WDL workflow that joins sample-level QC/ancestry annotations with demographics and generates interactive visualization outputs.

The workflow reads Hail table artifacts from run_sample_outlier_qc, enriches records with demographic race/ethnicity labels, then produces two HTML reports: a principal component scatter plot and a multi-tab QC metric/fitting visualization.

Quickstart table

| Pipeline Feature | Description | Source |
| --- | --- | --- |
| Analysis type | Demographic join + ancestry/QC visualization | |
| Workflow language | WDL 1.0 | openWDL |
| Data input file format | Demographics TSV + tarred Hail table from outlier QC | |
| Data output file format | Interactive HTML plots | |
| Primary software | Hail + Bokeh | Hail, Bokeh |

Set-up

Run Sample Outlier QC Plotting installation and requirements

The workflow code can be downloaded by cloning the WARP GitHub repository. For the latest release, please see the run_sample_outlier_qc_plotting changelog.

The pipeline can be deployed using Cromwell, a GA4GH-compliant workflow management system.

Inputs

Input descriptions

| Input variable name | Description | Type |
| --- | --- | --- |
| aou_demographics_tsv | Demographics TSV containing research_id and race/ethnicity metadata fields. | File |
| output_prefix | Prefix applied to visualization outputs. | String |
| ancestry_with_flagged_samples_tar_gz | Tarred Hail table output from run_sample_outlier_qc (<input_prefix>.full.ht.tar.gz). | File |
| input_prefix | Prefix used for input Hail table names generated upstream. | String |
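As an illustration of how the four inputs fit together, the following minimal sketch builds a Cromwell-style inputs file. It assumes that inputs are keyed as `<workflow_name>.<input_name>` and that the workflow name matches the documented name; the bucket paths and prefixes are placeholders, and `build_inputs` is a hypothetical helper, not part of the pipeline.

```python
import json

# Assumption: Cromwell inputs are keyed as "<workflow_name>.<input_name>" and the
# workflow name matches the name used in this documentation.
WORKFLOW = "run_sample_outlier_qc_plotting"


def build_inputs(demographics_tsv, ancestry_tar_gz, input_prefix, output_prefix):
    """Return a Cromwell-style inputs dict for the four documented inputs."""
    values = {
        "aou_demographics_tsv": demographics_tsv,
        "ancestry_with_flagged_samples_tar_gz": ancestry_tar_gz,
        "input_prefix": input_prefix,
        "output_prefix": output_prefix,
    }
    return {f"{WORKFLOW}.{name}": value for name, value in values.items()}


# Placeholder paths and prefixes, for illustration only.
inputs = build_inputs(
    demographics_tsv="gs://example-bucket/demographics.tsv",
    ancestry_tar_gz="gs://example-bucket/cohort.full.ht.tar.gz",
    input_prefix="cohort",
    output_prefix="cohort_plots",
)
print(json.dumps(inputs, indent=2))
```

Note that the tarball name in this example follows the documented `<input_prefix>.full.ht.tar.gz` pattern, so `input_prefix` and the tarball file name stay consistent.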

Run Sample Outlier QC Plotting tasks and tools

The workflow runs three tasks: one joins demographics to the ancestry/QC table, and the other two each generate an interactive report.

  1. Join demographics to ancestry/QC table
  2. Generate PC scatter plot
  3. Generate QC metric fitting report

To see specific tool parameters, select the task WDL link in the table; then view the command {} section of the task in the WDL script.

| Task name and WDL link | Tool | Software | Description |
| --- | --- | --- | --- |
| join_ancestry_to_demographics | Hail table join | hailgenetics/hail:0.2.67 | Joins demographics data into ancestry/QC table and writes tarred Hail artifact. |
| plot_first_pcs | Hail plotting | hailgenetics/hail:0.2.67 | Generates interactive PC1-vs-PC2 ancestry plot. |
| plot_metrics_and_fitting | Hail + Bokeh tabs | hailgenetics/hail:0.2.67 | Generates interactive multi-metric QC fitting visualization. |

1. Join demographics to ancestry/QC table

join_ancestry_to_demographics extracts the upstream Hail table, joins records by sample ID to demographics metadata, and writes <output_prefix>.ancestry_with_flagged_samples_demographics.ht.tar.gz.
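The join logic can be illustrated with a small standard-library sketch. This is not the task's Hail code: it assumes a left join keyed on `research_id` on both sides, and the `race` and `ethnicity` column names, the sample values, and the `join_demographics` helper are all hypothetical.

```python
import csv
import io


def join_demographics(qc_rows, demographics_tsv_text, key="research_id"):
    """Left-join demographics columns onto QC rows by sample ID.

    Assumption: both sides share the same key column name. The real task
    performs this join on Hail tables and may use a different key or join type.
    """
    reader = csv.DictReader(io.StringIO(demographics_tsv_text), delimiter="\t")
    demographics = {row[key]: row for row in reader}

    joined = []
    for qc in qc_rows:
        demo = demographics.get(qc[key], {})
        merged = dict(qc)
        # Copy every demographics column except the join key itself.
        merged.update({k: v for k, v in demo.items() if k != key})
        joined.append(merged)
    return joined


# Hypothetical demographics TSV and QC rows, for illustration only.
demo_text = "research_id\trace\tethnicity\n1001\tA\tX\n1002\tB\tY\n"
qc_rows = [
    {"research_id": "1001", "pc1": 0.12, "pc2": -0.03},
    {"research_id": "1003", "pc1": -0.40, "pc2": 0.21},
]
joined = join_demographics(qc_rows, demo_text)
```

In this sketch, sample 1003 has no demographics record, so it keeps only its QC fields; whether the actual task retains or drops such samples is not stated here.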

2. Generate PC scatter plot

plot_first_pcs creates <output_prefix>.pc1vspc2.html, visualizing ancestry clusters and overlaying self-reported race/ethnicity metadata in hover fields.

3. Generate QC metric fitting report

plot_metrics_and_fitting creates <output_prefix>.metrics.html, an interactive tabbed report showing ancestry-stratified metric distributions and fitted trends.

Outputs

| Output variable name | Filename, if applicable | Output format and description |
| --- | --- | --- |
| pcs_plot | <output_prefix>.pc1vspc2.html | Interactive PC1-vs-PC2 ancestry scatter plot with demographic hover metadata. |
| metrics_plot | <output_prefix>.metrics.html | Interactive tabbed QC metric and fitting visualization across selected QC metrics. |
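For downstream scripts that need to locate the reports, the documented naming pattern can be sketched as a small helper. `expected_outputs` is a hypothetical function and the prefix is a placeholder; only the filename suffixes come from the table above.

```python
def expected_outputs(output_prefix):
    """Map each documented output variable to its expected filename."""
    return {
        "pcs_plot": f"{output_prefix}.pc1vspc2.html",
        "metrics_plot": f"{output_prefix}.metrics.html",
    }


# Placeholder prefix, for illustration only.
outputs = expected_outputs("cohort_plots")
for name, filename in outputs.items():
    print(f"{name}: {filename}")
```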

Versioning

All run_sample_outlier_qc_plotting releases are documented in the changelog.

Feedback

Please help us make our tools better by filing an issue in WARP; we welcome pipeline-related suggestions or questions.