
Demultiplex

Description

If multiple MAS-seq libraries with different array designs are pooled into a single sequencing run, Longbow can demultiplex the reads into separate BAM files. The demultiplex command does this by testing multiple MAS-seq models against each read (see annotate) and assigning the read to the model with the highest overall likelihood.

Note that in Longbow parlance, the act of splitting a MAS-seq read array into its constituent elements is referred to as “segmentation”, not “demultiplexing”. If you’re trying to split the reads into individual transcripts, see the segment command.
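
The selection rule described above (score each read under every candidate model, keep the best) can be sketched in a few lines of Python. This is an illustrative sketch only, not Longbow's implementation: the function name pick_best_model and the example scores are hypothetical, and it assumes each model yields one log-likelihood per read.

```python
from typing import Dict, Tuple


def pick_best_model(log_likelihoods: Dict[str, float]) -> Tuple[str, float]:
    """Return the (model_name, score) pair with the highest log-likelihood.

    Hypothetical helper: the scores are assumed to come from annotating
    the same read once under each candidate model.
    """
    if len(log_likelihoods) < 2:
        # Demultiplexing only makes sense with two or more candidate models.
        raise ValueError("need at least two candidate models to demultiplex")
    best = max(log_likelihoods, key=log_likelihoods.get)
    return best, log_likelihoods[best]


# Made-up scores for a single read, for illustration only.
scores = {"mas10": -1520.4, "mas15": -1310.9}
best_model, best_score = pick_best_model(scores)
print(best_model, best_score)
```

A higher (less negative) log-likelihood means the model explains the read better, so in this made-up case the read would be assigned to mas15.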

Command help

$ longbow demultiplex --help
Usage: longbow demultiplex [OPTIONS] INPUT_BAM

  Separate reads into files based on which model they fit best.

  Resulting reads will be annotated with the model they best fit as well as
  the score and segments for that model.

Options:
  -v, --verbosity LVL       Either CRITICAL, ERROR, WARNING, INFO or DEBUG
  -p, --pbi PATH            BAM .pbi index file
  -t, --threads INTEGER     number of threads to use (0 for all)  [default: 7]
  -o, --out-base-name TEXT  base name for output files  [default:
                            longbow_demultiplexed]
  -m, --model TEXT          Models to use to demultiplex the input bam file.
                            Given model must either be a Longbow built-in
                            model, or a valid Longbow model json file.  If
                            specified, this option must be specified at least
                            twice.  [default: mas10, mas15]
  --max-length INTEGER      Maximum length of a read to process.  Reads beyond
                            this length will not be annotated.  [default:
                            60000]
  --min-rq FLOAT            Minimum ccs-determined read quality for a read to
                            be annotated.  CCS read quality range is [-1,1].
                            [default: -2.0]
  --help                    Show this message and exit.
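
The --max-length and --min-rq options together decide which reads are annotated at all. The sketch below restates that rule in Python using the defaults from the help text. It is an assumption-laden illustration: should_annotate is a hypothetical name, and treating both thresholds as inclusive is a guess at the boundary behaviour, not something the help text confirms.

```python
DEFAULT_MAX_LENGTH = 60000  # default of --max-length in the help text
DEFAULT_MIN_RQ = -2.0       # default of --min-rq in the help text


def should_annotate(read_length: int,
                    read_quality: float,
                    max_length: int = DEFAULT_MAX_LENGTH,
                    min_rq: float = DEFAULT_MIN_RQ) -> bool:
    """Hypothetical restatement of the read filter described in the help.

    Assumes inclusive bounds: a read of exactly max_length bases, or with
    quality exactly min_rq, is still annotated.
    """
    return read_length <= max_length and read_quality >= min_rq


print(should_annotate(15000, 0.99))               # within both limits
print(should_annotate(75000, 0.99))               # longer than --max-length
print(should_annotate(15000, 0.50, min_rq=0.90))  # below a stricter --min-rq
```

Because the help gives the CCS read quality range as [-1,1], the default --min-rq of -2.0 lies below every possible value, so by default no read is excluded on quality alone.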

Examples

Default demultiplexing

$ longbow demultiplex -t4 -v INFO  -o MAS-seq_demuxed multiplexed_data_mas10_mas15.bam
[INFO 2021-08-20 17:59:16 demultiplex] Invoked via: longbow demultiplex -t4 -v INFO -o MAS-seq_demuxed multiplexed_data_mas10_mas15.bam
[INFO 2021-08-20 17:59:16 demultiplex] Running with 4 worker subprocess(es)
[INFO 2021-08-20 17:59:16 demultiplex] Annotating 996 reads
[INFO 2021-08-20 17:59:16 demultiplex] Demultiplexing with models: mas10, mas15
Progress: 100%|████████████████████████████| 996/996 [10:24<00:00,  1.59 read/s]
[INFO 2021-08-20 18:09:45 demultiplex] Annotated 996 reads with 46004 total sections.
[INFO 2021-08-20 18:09:45 demultiplex] Model mas15 annotated 818 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Model mas10 annotated 178 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Done. Elapsed time: 629.44s. Overall processing rate: 1.58 reads/s.

Demultiplexing with two models

$ longbow demultiplex -t4 -v INFO  -o MAS-seq_demuxed -m mas10 -m mas15 multiplexed_data_mas10_mas15.bam
[INFO 2021-08-20 17:59:16 demultiplex] Invoked via: longbow demultiplex -t4 -v INFO -o MAS-seq_demuxed -m mas10 -m mas15 multiplexed_data_mas10_mas15.bam
[INFO 2021-08-20 17:59:16 demultiplex] Running with 4 worker subprocess(es)
[INFO 2021-08-20 17:59:16 demultiplex] Annotating 996 reads
[INFO 2021-08-20 17:59:16 demultiplex] Demultiplexing with models: mas10, mas15
Progress: 100%|████████████████████████████| 996/996 [10:24<00:00,  1.59 read/s]
[INFO 2021-08-20 18:09:45 demultiplex] Annotated 996 reads with 46004 total sections.
[INFO 2021-08-20 18:09:45 demultiplex] Model mas15 annotated 818 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Model mas10 annotated 178 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Done. Elapsed time: 629.44s. Overall processing rate: 1.58 reads/s.

Demultiplexing with three models

$ longbow demultiplex -t4 -v INFO  -o MAS-seq_demuxed -m mas10 -m mas15 -m slide-seq multiplexed_data_mas10_mas15.bam
[INFO 2021-08-20 17:59:16 demultiplex] Invoked via: longbow demultiplex -t4 -v INFO -o MAS-seq_demuxed -m mas10 -m mas15 -m slide-seq multiplexed_data_mas10_mas15.bam
[INFO 2021-08-20 17:59:16 demultiplex] Running with 4 worker subprocess(es)
[INFO 2021-08-20 17:59:16 demultiplex] Annotating 996 reads
[INFO 2021-08-20 17:59:16 demultiplex] Demultiplexing with models: mas10, mas15, slide-seq
Progress: 100%|████████████████████████████| 996/996 [10:24<00:00,  1.59 read/s]
[INFO 2021-08-20 18:09:45 demultiplex] Annotated 996 reads with 46004 total sections.
[INFO 2021-08-20 18:09:45 demultiplex] Model mas15 annotated 818 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Model mas10 annotated 178 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Model slide-seq annotated 0 reads.
[INFO 2021-08-20 18:09:45 demultiplex] Done. Elapsed time: 629.44s. Overall processing rate: 1.58 reads/s.

© 2021: Jonn Smith, Kiran V Garimella, Broad Institute of MIT and Harvard.