Output File Format
nimble produces different output formats depending on the type of input passed to nimble align. If the input consists of single-end or paired-end FASTQ files, the results are written as a TSV file with one row per feature in the reference library that received at least one count. If the input is a BAM file, the results are also written in TSV format but include additional per-read metadata.
For each reference library, nimble processes one or both FASTQ files through the scoring pipeline and generates a corresponding TSV file. The output format includes:
- Feature calls: A list of features mapped by the read data, which may include ambiguous alignments to multiple features.
- nimble score: The count of the reads or read-pairs that aligned to that feature.
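As a rough illustration of consuming the FASTQ-based output, the sketch below parses rows of feature calls and scores using only the Python standard library. The column order, the absence of a header row, and the comma used to join ambiguous feature calls are assumptions made for this example, not documented nimble behavior; check them against a real output file before relying on this. The function name read_feature_counts is hypothetical.

```python
import csv
import io


def read_feature_counts(handle, ambiguity_sep=","):
    """Parse (feature calls, score) rows from a tab-separated stream.

    Assumed layout (hypothetical, verify against your own file):
    column 1 is the feature call, with ambiguous calls joined by
    `ambiguity_sep`; column 2 is the integer nimble score.
    No header row is assumed.
    """
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        features = tuple(name.strip() for name in record[0].split(ambiguity_sep))
        rows.append((features, int(record[1])))
    return rows


# Made-up data standing in for a real output file.
example = io.StringIO("FeatureA\t12\nFeatureB,FeatureC\t3\n")
counts = read_feature_counts(example)
```

Keeping ambiguous calls together as a tuple preserves the fact that those reads could not be assigned to a single feature, which is usually worth knowing before summing counts.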
When processing BAM files, nimble generates gzip-compressed TSV output (.gz). This format is significantly more detailed than the FASTQ-based output, capturing:
- Feature calls and scores: Similar to the FASTQ-based output.
- BAM metadata: Includes relevant alignment details for both R1 and R2 reads.
- Filtering and scoring fields: Forward and reverse strand alignment details for both reads, alongside filtering flags.
- Triage and alignment direction: Additional metadata describing why a read was included or excluded.
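Because the BAM-derived output is a gzip-compressed TSV, it can be streamed row by row without decompressing it to disk first. The following is a minimal sketch using the Python standard library; the helper name iter_tsv_gz and the sample rows are hypothetical, and no particular column layout is assumed.

```python
import csv
import gzip
import io


def iter_tsv_gz(source):
    """Yield each row of a gzip-compressed TSV as a list of strings.

    `source` may be a filesystem path or a binary file object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        yield from csv.reader(handle, delimiter="\t")


# Made-up rows compressed in memory, standing in for a real .gz output file.
payload = gzip.compress(b"FeatureA\t1\tread-1\nFeatureB\t1\tread-2\n")
rows = list(iter_tsv_gz(io.BytesIO(payload)))
```

Streaming matters here because per-read output can be far larger than the feature-level FASTQ output.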
Unlike the FASTQ-based output, which is rolled up to the feature level, BAM-derived nimble output contains per-read or per-read-pair information. Additionally, if a single read within a UMI aligns to the reference library, nimble writes out the entire UMI. Each row carries much more structured information: many values copied directly from the BAM tags, metadata about nimble-specific filtering decisions, and the individual R1/R2 alignment scores where applicable. In most cases the output of nimble report is all you need to inspect; the raw per-read output of nimble align is mainly useful for debugging or library tuning.
The output from nimble report is a cell-by-feature count matrix, also in TSV format, as expected by many downstream toolchains such as Seurat.
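For a quick look at a count matrix outside of a toolchain such as Seurat, a small loader along the following lines may be enough. This is a sketch under stated assumptions: it assumes a header row whose first cell labels the row names, that the remaining header cells name the columns, and that all values are integers. Which axis holds cells and which holds features is not specified here, so the code simply treats the matrix as rows by columns; the function name read_count_matrix is hypothetical.

```python
import csv
import io


def read_count_matrix(handle):
    """Load a tab-separated count matrix into a dict of dicts.

    Assumed layout (hypothetical): the first line is a header, the first
    column holds row names, and every other value is an integer count.
    Returns {row_name: {column_name: count}}.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    columns = header[1:]
    matrix = {}
    for record in reader:
        if not record:
            continue  # skip blank lines
        matrix[record[0]] = {
            column: int(value) for column, value in zip(columns, record[1:])
        }
    return matrix


# Made-up matrix standing in for a real report file.
example = io.StringIO("name\tcol1\tcol2\nrowA\t4\t0\nrowB\t1\t7\n")
matrix = read_count_matrix(example)
```

For anything beyond inspection, a dataframe or sparse-matrix library is likely a better fit than nested dictionaries.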