Commands (high-level overview)
Listing available commands
Longbow implements a number of commands useful for working with MAS-seq data. A listing of all available commands can be obtained by running Longbow with the --help option:
$ longbow --help
Usage: longbow [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
annotate Annotate reads in a BAM file with segments from the model.
convert Convert reads from fastq{,.gz} files for use with `annotate`.
correct Correct tag to values provided in barcode allowlist.
correct_umi Correct UMIs with Set Cover algorithm.
demultiplex Separate reads into files based on which model they fit best.
extract Extract coding segments from the reads in the given bam.
filter Filter reads by conformation to expected segment order.
inspect Inspect the classification results on specified reads.
models Get information about built-in Longbow models.
pad Pad tag by specified number of adjacent bases from the read.
peek Guess the best pre-built array model to use for annotation.
segment Segment pre-annotated reads from an input BAM file.
sift Filter segmented reads by conformation to expected cDNA.
stats Calculate and produce stats on the given input bam file.
tagfix Update longbow read tags after alignment.
train Train transition and emission probabilities on real data.
version Print the version of longbow.
Help for individual commands
Help for an individual command can be obtained by running longbow <command> --help. For example:
$ longbow version --help
Usage: longbow version [OPTIONS]
Print the version of longbow.
Options:
-v, --verbosity LVL Either CRITICAL, ERROR, WARNING, INFO or DEBUG
--help Show this message and exit.
Verbosity
By default, Longbow’s verbosity is set to INFO, which ensures that useful processing information is captured in log files. If this output is too noisy, a verbosity level of WARNING will quiet Longbow’s reporting.
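As a small illustration of the verbosity option, the sketch below wraps Longbow so that a subcommand always runs at WARNING verbosity. The helper name quiet_longbow is hypothetical, and the sketch assumes that every subcommand accepts -v in the position shown (this page shows it only for version, annotate, segment and extract).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Longbow): run a subcommand at WARNING verbosity.
# Assumption: the subcommand accepts -v directly after its name.
quiet_longbow() {
    local cmd=$1
    shift
    longbow "$cmd" -v WARNING "$@"
}

# Only attempt a real call when longbow is installed.
if command -v longbow >/dev/null 2>&1; then
    quiet_longbow version
fi
```

Any further arguments are passed through unchanged, so quiet_longbow annotate -o annotated.bam input.bam would expand to longbow annotate -v WARNING -o annotated.bam input.bam.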
Pipelining
Longbow makes use of Unix pipes, allowing output from one command to be easily streamed into the next without the need to store intermediate files. For example:
$ longbow annotate -v WARN tests/test_data/mas15_test_input.bam | \
longbow segment -v WARN | \
longbow extract -v WARN -o extracted.bam
Progress: 100%|██████████████████████████████████████████████████████████████████████████████| 8/8 [00:01<00:00, 5.23 read/s]
$
If you wish to capture the output of each step, supply an output filename with the -o <output_path>.bam option:
$ longbow annotate -v WARN -o annotated.bam tests/test_data/mas15_test_input.bam
Progress: 100%|██████████████████████████████████████████████████████████████████████████████| 8/8 [00:01<00:00, 4.58 read/s]
$ longbow segment -v WARN -o segmented.bam annotated.bam
Progress: 108 read [00:00, 11177.91 read/s]
$ longbow extract -v WARN -o extracted.bam segmented.bam
Progress: 108 read [00:00, 11176.81 read/s]
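When the three commands are chained with pipes, the shell by default reports only the exit status of the last stage, so a failure in annotate or segment can go unnoticed in a script. The sketch below shows one way to guard against that with bash's pipefail option. The wrapper name run_pipeline is hypothetical; the Longbow invocations are the same as in the piped example above.

```shell
#!/usr/bin/env bash
# Make a pipeline fail if any stage fails, not only the last one (bash option).
set -o pipefail

# Hypothetical wrapper: annotate, segment and extract in a single stream.
run_pipeline() {
    local input_bam=$1
    local output_bam=$2
    longbow annotate -v WARN "$input_bam" |
        longbow segment -v WARN |
        longbow extract -v WARN -o "$output_bam"
}

# Only attempt a real run when longbow is installed.
if command -v longbow >/dev/null 2>&1; then
    run_pipeline tests/test_data/mas15_test_input.bam extracted.bam ||
        echo "pipeline failed" >&2
fi
```

With pipefail set, run_pipeline returns a non-zero status if any of the three stages exits with an error, which a calling script can check before using extracted.bam.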
Optional .pbi index file
Longbow can make use of a .pbi index file, which records how much data a PacBio BAM file contains and enables random access to its contents. This allows accurate progress reporting and lets certain commands fetch and process specific reads without processing the entire file (e.g. longbow inspect, useful for inspecting model annotations). The .pbi file is optional; if it is not found at the guessed path and is not supplied explicitly on the command line, Longbow will simply process records linearly and will not report the expected time to completion. For example:
# with .pbi file:
$ ls tests/test_data/mas15_test_input.bam*
tests/test_data/mas15_test_input.bam tests/test_data/mas15_test_input.bam.pbi
$ longbow annotate -v WARN -o /dev/null tests/test_data/mas15_test_input.bam
Progress: 100%|██████████████████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 8.08 read/s]
# without .pbi file:
$ cp tests/test_data/mas15_test_input.bam .
$ ls mas15_test_input.bam*
mas15_test_input.bam
$ longbow annotate -v WARN -o /dev/null mas15_test_input.bam
Progress: 8 read [00:01, 4.70 read/s]
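To know in advance which style of progress report to expect, a script can check for the index before calling Longbow. The following is a minimal sketch under one assumption: that the index lives next to the BAM file as <bam>.pbi, as in the listing above. The helper name has_pbi is hypothetical and is not a Longbow command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a sibling .pbi index exists for the given BAM.
# Assumption: the index path is the BAM path with ".pbi" appended.
has_pbi() {
    [ -f "$1.pbi" ]
}

bam=tests/test_data/mas15_test_input.bam
if has_pbi "$bam"; then
    echo "$bam: index found; expect a progress bar with a read total"
else
    echo "$bam: no index found; expect a running read count only"
fi
```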
Table of contents
- annotate
- convert
- correct
- correct_umi
- demultiplex
- extract
- filter
- inspect
- models
- pad
- peek
- segment
- sift
- stats
- tagfix
- train
- version