
Smart-seq2 Single Nucleus Multi-Sample Count Matrix Overview

The Smart-seq2 Single Nucleus Multi-Sample (Multi-snSS2) pipeline's default count matrix output is a Loom file, an HDF5 file generated using Loompy v.3.0.6. It contains the raw cell-by-gene intron and exon counts.

The matrix also contains multiple metrics for both individual cells (the columns of the matrix; Table 2) and individual genes (the rows of the matrix; Table 3).

Additional details for each metric are provided in the Java source code for Picard's AlignmentSummaryMetrics, GcBiasSummaryMetrics, and DuplicationMetrics.
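
The layout described above (genes as rows, cells as columns, plus global, column, and row attributes) can be inspected with a few lines of Python. The following is a minimal sketch, not part of the pipeline: `describe_loom` and `describe_loom_file` are hypothetical helper names, the `demo` object is a made-up stand-in for an open connection, and the sketch assumes the third-party `loompy` package is installed when a real file is opened.

```python
from types import SimpleNamespace


def describe_loom(ds):
    """Summarize an open Loom connection; rows are genes and columns are cells."""
    n_genes, n_cells = ds.shape
    return {
        "n_genes": n_genes,
        "n_cells": n_cells,
        "global_attributes": sorted(ds.attrs.keys()),
        "cell_metrics": sorted(ds.ca.keys()),
        "gene_metrics": sorted(ds.ra.keys()),
    }


def describe_loom_file(path):
    """Open a Loom file read-only with loompy and summarize it."""
    import loompy  # third-party; imported lazily so describe_loom has no dependencies

    with loompy.connect(path, mode="r") as ds:
        return describe_loom(ds)


# Stand-in for an open connection, with made-up values (not real pipeline output).
demo = SimpleNamespace(
    shape=(3, 2),
    attrs={"batch_id": "demo_batch", "pipeline_version": "demo_version"},
    ca={"CellID": ["cell_a", "cell_b"], "input_id": ["cell_a", "cell_b"]},
    ra={"Gene": ["g1", "g2", "g3"], "gene_names": ["A", "B", "C"]},
)
summary = describe_loom(demo)
print(summary["n_genes"], summary["n_cells"])
```

With a real output file, `describe_loom_file` would be called with the path to the Loom file; the names it returns should correspond to the attributes listed in Tables 1 to 3.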

Table 1. Global attributes

The global attributes in the Loom file apply to the whole file rather than to any specific row or column.

| Attribute | Details |
| --- | --- |
| CreationDate | Date the Loom file was created. |
| LOOM_SPEC_VERSION | Loom file spec version used during creation of the Loom file. |
| batch_id | The batch_id provided to the pipeline as input. |
| pipeline_version | Version of the Multi-snSS2 pipeline used to generate the Loom file. |
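
As a rough illustration of how the global attributes might be checked, the sketch below looks up the four documented names in a mapping-like object. It assumes only that the object supports `keys()` and item access, as a plain dictionary does; the helper names and the demo values are hypothetical and not taken from a real file.

```python
# Attribute names as listed in Table 1.
DOCUMENTED_GLOBAL_ATTRIBUTES = (
    "CreationDate",
    "LOOM_SPEC_VERSION",
    "batch_id",
    "pipeline_version",
)


def read_global_attributes(attrs):
    """Return the documented global attributes, using None for any that are absent."""
    present = set(attrs.keys())
    return {
        name: (attrs[name] if name in present else None)
        for name in DOCUMENTED_GLOBAL_ATTRIBUTES
    }


def missing_global_attributes(attrs):
    """List documented global attribute names that are not present."""
    present = set(attrs.keys())
    return [name for name in DOCUMENTED_GLOBAL_ATTRIBUTES if name not in present]


# Made-up values for demonstration only.
demo_attrs = {
    "CreationDate": "demo_date",
    "LOOM_SPEC_VERSION": "demo_spec",
    "batch_id": "demo_batch",
}
found = read_global_attributes(demo_attrs)
missing = missing_global_attributes(demo_attrs)
print(missing)
```

When working with an open loompy connection `ds`, the object passed in would be `ds.attrs`.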

Table 2. Column attributes (cell metrics)

The cell metrics below are computed using Picard, with the exception of CellID, cell_names, and input_id, which are provided to the pipeline as input.

| Cell Metrics | Tool | Details |
| --- | --- | --- |
| ACCUMULATION_LEVEL | CollectMultipleMetrics | Level of metric accumulation; set to ALL_READS using the --METRIC_ACCUMULATION_LEVEL argument. |
| ALIGNED_READS | CollectGcBiasMetrics | Total number of aligned reads produced in a run. |
| AT_DROPOUT | CollectGcBiasMetrics | Percentage of misaligned reads with GC content below 50%. |
| AVG_POS_3PRIME_SOFTCLIP_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Average length of soft-clipped bases at the 3' end of the first reads. |
| AVG_POS_3PRIME_SOFTCLIP_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Average length of soft-clipped bases at the 3' end of all reads. |
| AVG_POS_3PRIME_SOFTCLIP_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Average length of soft-clipped bases at the 3' end of the second reads. |
| BAD_CYCLES.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of cycles with combined no-call and mismatch rates greater than or equal to 80% for the first reads. |
| BAD_CYCLES.PAIR | CollectAlignmentSummaryMetrics | Number of cycles with combined no-call and mismatch rates greater than or equal to 80% for all reads. |
| BAD_CYCLES.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of cycles with combined no-call and mismatch rates greater than or equal to 80% for the second reads. |
| CellID | warp-tools | Unique identifier for each cell provided to the pipeline as input_ids; identical to cell_names and input_id. |
| ESTIMATED_LIBRARY_SIZE | MarkDuplicates | Estimated number of unique molecules in the library based on paired-end duplication. |
| GC_DROPOUT | CollectGcBiasMetrics | Percentage of misaligned reads with GC content above 50%. |
| GC_NC_0_19 | CollectGcBiasMetrics | Normalized coverage over reads with GC content from 0 - 19%. |
| GC_NC_20_39 | CollectGcBiasMetrics | Normalized coverage over reads with GC content from 20 - 39%. |
| GC_NC_40_59 | CollectGcBiasMetrics | Normalized coverage over reads with GC content from 40 - 59%. |
| GC_NC_60_79 | CollectGcBiasMetrics | Normalized coverage over reads with GC content from 60 - 79%. |
| GC_NC_80_100 | CollectGcBiasMetrics | Normalized coverage over reads with GC content from 80 - 100%. |
| MAD_READ_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Median absolute deviation of the lengths of forward reads. |
| MAD_READ_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Median absolute deviation of the lengths of all reads. |
| MAD_READ_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Median absolute deviation of the lengths of reverse reads. |
| MAX_READ_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Maximum length of forward reads. |
| MAX_READ_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Maximum length of all reads. |
| MAX_READ_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Maximum length of reverse reads. |
| MEAN_READ_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Mean length of forward reads. |
| MEAN_READ_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Mean length of all reads. |
| MEAN_READ_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Mean length of reverse reads. |
| MEDIAN_READ_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Median length of forward reads. |
| MEDIAN_READ_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Median length of all reads. |
| MEDIAN_READ_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Median length of reverse reads. |
| MIN_READ_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Minimum length of forward reads. |
| MIN_READ_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Minimum length of all reads. |
| MIN_READ_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Minimum length of reverse reads. |
| PCT_ADAPTER.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter forward reads that are unaligned or aligned with a mapping quality of 0 and match to a known adapter sequence from the start of the read. |
| PCT_ADAPTER.PAIR | CollectAlignmentSummaryMetrics | Fraction of all pass-filter reads that are unaligned or aligned with a mapping quality of 0 and match to a known adapter sequence from the start of the read. |
| PCT_ADAPTER.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter reverse reads that are unaligned or aligned with a mapping quality of 0 and match to a known adapter sequence from the start of the read. |
| PCT_CHIMERAS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of forward reads where the insert is larger than 100 kb or the ends of the pair map to different chromosomes. |
| PCT_CHIMERAS.PAIR | CollectAlignmentSummaryMetrics | Fraction of all reads where the insert is larger than 100 kb or the ends of the pair map to different chromosomes. |
| PCT_CHIMERAS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of reverse reads where the insert is larger than 100 kb or the ends of the pair map to different chromosomes. |
| PCT_HARDCLIP.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter bases that are hard-clipped from aligned forward reads. |
| PCT_HARDCLIP.PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter bases that are hard-clipped from all aligned reads. |
| PCT_HARDCLIP.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter bases that are hard-clipped from aligned reverse reads. |
| PCT_PF_READS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of forward reads that pass vendor check (pass-filter). |
| PCT_PF_READS.PAIR | CollectAlignmentSummaryMetrics | Fraction of reads that pass vendor check (pass-filter). |
| PCT_PF_READS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of reverse reads that pass vendor check (pass-filter). |
| PCT_PF_READS_ALIGNED.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter forward reads that are aligned. |
| PCT_PF_READS_ALIGNED.PAIR | CollectAlignmentSummaryMetrics | Fraction of all pass-filter reads that are aligned. |
| PCT_PF_READS_ALIGNED.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter reverse reads that are aligned. |
| PCT_PF_READS_IMPROPER_PAIRS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of forward reads not properly aligned in pairs. |
| PCT_PF_READS_IMPROPER_PAIRS.PAIR | CollectAlignmentSummaryMetrics | Fraction of reads not properly aligned in pairs. |
| PCT_PF_READS_IMPROPER_PAIRS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of reverse reads not properly aligned in pairs. |
| PCT_READS_ALIGNED_IN_PAIRS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of forward reads properly aligned in pairs. |
| PCT_READS_ALIGNED_IN_PAIRS.PAIR | CollectAlignmentSummaryMetrics | Fraction of reads properly aligned in pairs. |
| PCT_READS_ALIGNED_IN_PAIRS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of reverse reads properly aligned in pairs. |
| PCT_SOFTCLIP.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter bases that are soft-clipped from aligned forward reads. |
| PCT_SOFTCLIP.PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter bases that are soft-clipped from all aligned reads. |
| PCT_SOFTCLIP.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of pass-filter bases that are soft-clipped from aligned reverse reads. |
| PERCENT_DUPLICATION | MarkDuplicates | Fraction of mapped sequence marked as duplicate. |
| PF_ALIGNED_BASES.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Total number of aligned bases in pass-filter forward reads. |
| PF_ALIGNED_BASES.PAIR | CollectAlignmentSummaryMetrics | Total number of aligned bases in all pass-filter reads. |
| PF_ALIGNED_BASES.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Total number of aligned bases in pass-filter reverse reads. |
| PF_HQ_ALIGNED_BASES.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of bases aligned to the reference sequence in forward reads with high mapping quality. |
| PF_HQ_ALIGNED_BASES.PAIR | CollectAlignmentSummaryMetrics | Number of bases aligned to the reference sequence in all reads with high mapping quality. |
| PF_HQ_ALIGNED_BASES.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of bases aligned to the reference sequence in reverse reads with high mapping quality. |
| PF_HQ_ALIGNED_Q20_BASES.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Subset of PF_HQ_ALIGNED_BASES.FIRST_OF_PAIR with a base call quality of at least 20. |
| PF_HQ_ALIGNED_Q20_BASES.PAIR | CollectAlignmentSummaryMetrics | Subset of PF_HQ_ALIGNED_BASES.PAIR with a base call quality of at least 20. |
| PF_HQ_ALIGNED_Q20_BASES.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Subset of PF_HQ_ALIGNED_BASES.SECOND_OF_PAIR with a base call quality of at least 20. |
| PF_HQ_ALIGNED_READS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter forward reads aligned with a mapping quality of at least 20. |
| PF_HQ_ALIGNED_READS.PAIR | CollectAlignmentSummaryMetrics | Number of all pass-filter reads aligned with a mapping quality of at least 20. |
| PF_HQ_ALIGNED_READS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter reverse reads aligned with a mapping quality of at least 20. |
| PF_HQ_ERROR_RATE.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of bases in pass-filter, high-quality forward reads that do not match the reference. |
| PF_HQ_ERROR_RATE.PAIR | CollectAlignmentSummaryMetrics | Fraction of bases in all pass-filter, high-quality reads that do not match the reference. |
| PF_HQ_ERROR_RATE.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Fraction of bases in pass-filter, high-quality reverse reads that do not match the reference. |
| PF_HQ_MEDIAN_MISMATCHES.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Median number of mismatches in high-quality forward reads. |
| PF_HQ_MEDIAN_MISMATCHES.PAIR | CollectAlignmentSummaryMetrics | Median number of mismatches in all high-quality reads. |
| PF_HQ_MEDIAN_MISMATCHES.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Median number of mismatches in high-quality reverse reads. |
| PF_INDEL_RATE.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of insertion and deletion events per 100 aligned bases in forward reads. |
| PF_INDEL_RATE.PAIR | CollectAlignmentSummaryMetrics | Number of insertion and deletion events per 100 aligned bases in all reads. |
| PF_INDEL_RATE.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of insertion and deletion events per 100 aligned bases in reverse reads. |
| PF_MISMATCH_RATE.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Rate of base mismatching for all aligned bases in forward reads. |
| PF_MISMATCH_RATE.PAIR | CollectAlignmentSummaryMetrics | Rate of base mismatching for all aligned bases in all reads. |
| PF_MISMATCH_RATE.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Rate of base mismatching for all aligned bases in reverse reads. |
| PF_NOISE_READS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter forward reads marked as noise. |
| PF_NOISE_READS.PAIR | CollectAlignmentSummaryMetrics | Number of all pass-filter reads marked as noise. |
| PF_NOISE_READS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter reverse reads marked as noise. |
| PF_READS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of forward reads that pass vendor check (pass-filter). |
| PF_READS.PAIR | CollectAlignmentSummaryMetrics | Number of reads that pass vendor check (pass-filter). |
| PF_READS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of reverse reads that pass vendor check (pass-filter). |
| PF_READS_ALIGNED.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter forward reads that are aligned. |
| PF_READS_ALIGNED.PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter reads that are aligned. |
| PF_READS_ALIGNED.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter reverse reads that are aligned. |
| PF_READS_IMPROPER_PAIRS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of forward reads not properly aligned in pairs. |
| PF_READS_IMPROPER_PAIRS.PAIR | CollectAlignmentSummaryMetrics | Number of reads not properly aligned in pairs. |
| PF_READS_IMPROPER_PAIRS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of reverse reads not properly aligned in pairs. |
| READS_ALIGNED_IN_PAIRS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of forward reads properly aligned in pairs. |
| READS_ALIGNED_IN_PAIRS.PAIR | CollectAlignmentSummaryMetrics | Number of reads properly aligned in pairs. |
| READS_ALIGNED_IN_PAIRS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of reverse reads properly aligned in pairs. |
| READS_USED | CollectGcBiasMetrics | String describing whether duplicates are included in metrics produced by CollectGcBiasMetrics; the pipeline removes duplicates before metrics are calculated. |
| READ_PAIRS_EXAMINED | MarkDuplicates | Number of mapped read pairs examined by MarkDuplicates. |
| READ_PAIR_DUPLICATES | MarkDuplicates | Number of read pairs marked as duplicates. |
| READ_PAIR_OPTICAL_DUPLICATES | MarkDuplicates | Number of read pair duplicates caused by optical duplication. |
| SD_READ_LENGTH.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Standard deviation of forward read lengths. |
| SD_READ_LENGTH.PAIR | CollectAlignmentSummaryMetrics | Standard deviation of read lengths. |
| SD_READ_LENGTH.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Standard deviation of reverse read lengths. |
| SECONDARY_OR_SUPPLEMENTARY_RDS | MarkDuplicates | Number of secondary or supplementary reads. |
| STRAND_BALANCE.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter forward reads aligned divided by the total number of pass-filter reads aligned. |
| STRAND_BALANCE.PAIR | CollectAlignmentSummaryMetrics | Average strand balance of forward and reverse reads. |
| STRAND_BALANCE.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Number of pass-filter reverse reads aligned divided by the total number of pass-filter reads aligned. |
| TOTAL_CLUSTERS | CollectGcBiasMetrics | Total number of reads after filtering used in GC bias calculation. |
| TOTAL_READS.FIRST_OF_PAIR | CollectAlignmentSummaryMetrics | Total number of forward reads. |
| TOTAL_READS.PAIR | CollectAlignmentSummaryMetrics | Total number of reads. |
| TOTAL_READS.SECOND_OF_PAIR | CollectAlignmentSummaryMetrics | Total number of reverse reads. |
| UNMAPPED_READS | MarkDuplicates | Number of unmapped reads examined by MarkDuplicates. |
| UNPAIRED_READS_EXAMINED | MarkDuplicates | Number of mapped reads without a mapped mate pair examined by MarkDuplicates. |
| UNPAIRED_READ_DUPLICATES | MarkDuplicates | Number of fragments marked as duplicates. |
| WINDOW_SIZE | CollectGcBiasMetrics | Genomic window size used in GC bias calculation. |
| cell_names | warp-tools | Unique identifier for each cell provided to the pipeline as input_ids; identical to CellID and input_id. |
| input_id | warp-tools | Unique identifier for each cell provided to the pipeline as input_ids; identical to CellID and cell_names. |
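
Many of the cell metrics above appear three times, with a `.FIRST_OF_PAIR`, `.PAIR`, or `.SECOND_OF_PAIR` suffix. The sketch below shows one possible way to group metric names by their base name and to select cells by a single metric. The helper names are hypothetical, the cell identifiers and values are made up, and the threshold of 0.75 is an arbitrary illustration rather than a recommended cutoff.

```python
# Suffixes used by the CollectAlignmentSummaryMetrics-derived metrics in Table 2.
CATEGORIES = ("FIRST_OF_PAIR", "SECOND_OF_PAIR", "PAIR")


def split_metric_name(name):
    """Split 'METRIC.CATEGORY' into (metric, category); category is None if absent."""
    base, dot, suffix = name.rpartition(".")
    if dot and suffix in CATEGORIES:
        return base, suffix
    return name, None


def group_by_metric(names):
    """Map each base metric name to the list of categories it was reported for."""
    grouped = {}
    for name in names:
        base, category = split_metric_name(name)
        grouped.setdefault(base, []).append(category)
    return grouped


def cells_at_or_above(cell_ids, values, minimum):
    """Return the cell identifiers whose metric value is at least `minimum`."""
    return [cell for cell, value in zip(cell_ids, values) if value >= minimum]


demo_names = [
    "PF_READS.FIRST_OF_PAIR",
    "PF_READS.PAIR",
    "PF_READS.SECOND_OF_PAIR",
    "PERCENT_DUPLICATION",
]
grouped = group_by_metric(demo_names)

# Made-up per-cell values standing in for a column attribute
# such as PCT_PF_READS_ALIGNED.PAIR.
kept = cells_at_or_above(["cell_a", "cell_b", "cell_c"], [0.9, 0.4, 0.75], minimum=0.75)
print(grouped)
print(kept)
```

With an open loompy connection `ds`, the names would come from `ds.ca.keys()` and the per-cell values from an attribute lookup such as `ds.ca["PCT_PF_READS_ALIGNED.PAIR"]`.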

Table 3. Row attributes (gene metrics)

| Gene Metrics | Tool | Details |
| --- | --- | --- |
| Gene | GENCODE GTF | The unique gene_ids provided in the GENCODE GTF; identical to the ensembl_ids attribute. |
| ensembl_ids | GENCODE GTF | The unique gene_ids provided in the GENCODE GTF; identical to the Gene attribute. |
| exon_lengths | warp-tools | The length in base pairs of the exons corresponding to this entity. |
| gene_names | GENCODE GTF | The unique gene_name provided in the GENCODE GTF. |
| intron_lengths | warp-tools | The length in base pairs of the introns corresponding to this entity. |
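
The row attributes can be combined to translate between gene identifiers and gene names. The following is a minimal sketch under the assumption that the row attributes behave like a mapping from attribute name to an equal-length sequence; the helper names are hypothetical and the identifiers in the demo are made up.

```python
def gene_id_to_name(row_attrs):
    """Build a dict from ensembl_ids to gene_names using the row attributes."""
    ids = list(row_attrs["ensembl_ids"])
    names = list(row_attrs["gene_names"])
    if len(ids) != len(names):
        raise ValueError("ensembl_ids and gene_names must have the same length")
    return dict(zip(ids, names))


def gene_matches_ensembl_ids(row_attrs):
    """Check the documented claim that Gene and ensembl_ids hold the same values."""
    return list(row_attrs["Gene"]) == list(row_attrs["ensembl_ids"])


# Made-up identifiers for demonstration only.
demo_rows = {
    "Gene": ["DEMO_ID_1", "DEMO_ID_2"],
    "ensembl_ids": ["DEMO_ID_1", "DEMO_ID_2"],
    "gene_names": ["DemoGeneA", "DemoGeneB"],
}
lookup = gene_id_to_name(demo_rows)
consistent = gene_matches_ensembl_ids(demo_rows)
print(lookup["DEMO_ID_2"], consistent)
```

With an open loompy connection `ds`, the object passed in would be `ds.ra`.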