
Single Nucleus Methyl-Seq and Chromatin Capture (snM3C) Overview

| Pipeline Version | Date Updated | Documentation Authors | Questions or Feedback |
| --- | --- | --- | --- |
| snM3C_v1.0.0 | August, 2023 | Kaylee Mathews | Please file GitHub issues in the WARP repository |

Introduction to snM3C

The Single Nucleus Methyl-Seq and Chromatin Capture (snM3C) workflow is a cloud-based computational workflow for processing single-nucleus methylome and chromatin contact (snM3C) sequencing data. The workflow is designed to demultiplex raw sequencing reads, align them, call chromatin contacts, and generate summary metrics. It is developed in collaboration with Hanqing Liu and the laboratory of Joseph Ecker. For more information about the snM3C tools and analysis, please see the YAP documentation or the cemba_data GitHub repository created by Hanqing Liu.

Set-up

Installation

To use the latest release of the snM3C pipeline, visit the WARP releases page and download the desired version.

Running the Workflow

To download the latest release of the snM3C pipeline, see the release tags prefixed with "snM3C" on the WARP releases page. All releases of the snM3C pipeline are documented in the snM3C changelog.

To search releases of this and other pipelines, use the WARP command-line tool Wreleaser.

The snM3C pipeline can be deployed using Cromwell, a GA4GH-compliant, flexible workflow management system that supports multiple computing platforms. The workflow can also be run in Terra, a cloud-based analysis platform.
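As a rough illustration of a local Cromwell run, the sketch below assembles the `java -jar cromwell.jar run <workflow> --inputs <json>` command line. The file names `cromwell.jar`, `snM3C.wdl`, and `snM3C_inputs.json` are placeholders assumed for this example, not paths defined by the pipeline, and the helper `build_cromwell_command` is hypothetical.

```python
import shlex
from pathlib import Path


def build_cromwell_command(cromwell_jar, workflow_wdl, inputs_json):
    """Return the argument list for a local `cromwell run` invocation."""
    return [
        "java", "-jar", str(Path(cromwell_jar)),
        "run", str(Path(workflow_wdl)),
        "--inputs", str(Path(inputs_json)),
    ]


if __name__ == "__main__":
    # All three paths are placeholders; substitute the downloaded Cromwell JAR,
    # the workflow WDL from the chosen release, and your own inputs JSON.
    cmd = build_cromwell_command("cromwell.jar", "snM3C.wdl", "snM3C_inputs.json")
    print(shlex.join(cmd))
    # To actually launch the run: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.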

Inputs

The snM3C workflow requires a JSON configuration file specifying the input files and parameters for the analysis. Example configuration files can be found in the snM3C test_inputs directory in the WARP repository.

The main input files and parameters include:

| Parameter | Description |
| --- | --- |
| fastq_input_read1 | Array of multiplexed FASTQ files for read 1 |
| fastq_input_read2 | Array of multiplexed FASTQ files for read 2 |
| random_primer_indexes | File containing random primer indexes |
| plate_id | String specifying the plate ID |
| output_basename | String specifying a basename to be used for naming files |
| tarred_index_files | File containing tarred index files for hisat-3 mapping |
| mapping_yaml | File containing YAML configuration for mapping steps with snakemake |
| snakefile | File containing the snakefile for mapping |
| chromosome_sizes | File containing chromosome sizes information |
| genome_fa | File containing the reference genome in FASTA format |
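The parameters above could be assembled into a JSON configuration along the following lines. This is a minimal sketch under stated assumptions: the `snM3C.` key prefix follows the usual WDL convention of `<workflow name>.<input name>` and is assumed rather than taken from this page, the check that both read arrays have equal length is an assumption about paired-end input, and every path in the usage section is a made-up placeholder. The example configurations in the test_inputs directory remain the authoritative reference.

```python
import json

WORKFLOW_NAME = "snM3C"  # assumed workflow name used as the JSON key prefix

# File-type parameters listed in the inputs table.
REFERENCE_KEYS = (
    "random_primer_indexes",
    "tarred_index_files",
    "mapping_yaml",
    "snakefile",
    "chromosome_sizes",
    "genome_fa",
)


def build_inputs(read1, read2, plate_id, output_basename, reference_files):
    """Return a dict of workflow inputs keyed as '<workflow>.<parameter>'."""
    # Assumption: read 1 and read 2 FASTQ files are supplied as matching pairs.
    if len(read1) != len(read2):
        raise ValueError("read 1 and read 2 arrays must have the same length")
    missing = [key for key in REFERENCE_KEYS if key not in reference_files]
    if missing:
        raise ValueError(f"missing file parameters: {', '.join(missing)}")
    inputs = {
        "fastq_input_read1": list(read1),
        "fastq_input_read2": list(read2),
        "plate_id": plate_id,
        "output_basename": output_basename,
    }
    inputs.update({key: reference_files[key] for key in REFERENCE_KEYS})
    return {f"{WORKFLOW_NAME}.{key}": value for key, value in inputs.items()}


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    references = {key: f"/path/to/{key}" for key in REFERENCE_KEYS}
    config = build_inputs(
        ["/path/to/plate_R1.fastq.gz"],
        ["/path/to/plate_R2.fastq.gz"],
        plate_id="example_plate",
        output_basename="example_plate",
        reference_files=references,
    )
    print(json.dumps(config, indent=2))
```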

Tasks and Tools

The workflow contains two tasks described below. The parameters and more details about these tools can be found in the YAP documentation.

| Task name | Tool | Software | Description |
| --- | --- | --- | --- |
| Demultiplexing | cutadapt | cutadapt | Performs demultiplexing to cell-level FASTQ files |
| Mapping | hisat-3 | hisat-3 | Performs trimming, alignment, and chromatin contact calling using a custom snakemake file developed by Hanqing Liu |

Outputs

The snM3C workflow produces the following main outputs:

| Output | Description |
| --- | --- |
| mappingSummary | Mapping summary file in CSV format |
| allcFiles | Tarred file containing allc files |
| allc_CGNFiles | Tarred file containing CGN context-specific allc files |
| bamFiles | Tarred file containing cell-level aligned BAM files |
| detail_statsFiles | Tarred file containing detail stats files |
| hicFiles | Tarred file containing Hi-C files |
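For a quick look at these outputs after a run, something like the following may be useful. It is only a sketch: it loads the mapping summary CSV without assuming any particular column names, and lists the files inside a tarred output without extracting them. The helper names and the file names in the usage section are hypothetical.

```python
import csv
import tarfile


def read_mapping_summary(csv_path):
    """Load the mapping summary CSV as a list of dicts, one per row."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


def list_tar_members(tar_path):
    """Return the names of regular files stored in a tarred output."""
    # The default mode transparently handles plain and compressed archives.
    with tarfile.open(tar_path) as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


if __name__ == "__main__":
    # Placeholder file names; use the paths reported by your workflow run.
    rows = read_mapping_summary("mapping_summary.csv")
    print(f"{len(rows)} rows in mapping summary")
    for name in list_tar_members("bam_files.tar"):
        print(name)
```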

Versioning

All snM3C pipeline releases are documented in the pipeline changelog.

Feedback

For questions, suggestions, or feedback related to the snM3C pipeline, please contact the WARP team. Your feedback is valuable for improving the pipeline and addressing any issues that may arise during its usage.