Processing w/ a combined genome for spike normalization #3

Hi,

I've been trying to use your pipeline to align samples that have Drosophila spike-ins. Rather than aligning sequentially, I generated a combined mouse and Drosophila genome, with the dmel chromosomes named in the format "dm6_{chrom}". I didn't recover any signal on the Dmel chromosomes. The problem seems to be that when rRNA and chrM are filtered out, the reads are also passed through `grep '_' -v`, which drops every chromosome whose name contains an underscore, here:
https://github.com/Danko-Lab/proseq2.0/blob/c3260bdffb571beb58c33ea086a968d7ac519e6f/proseq2.0.bsh#L873 and https://github.com/Danko-Lab/proseq2.0/blob/c3260bdffb571beb58c33ea086a968d7ac519e6f/proseq2.0.bsh#L1166

I edited those lines to remove the `grep '_' -v` step while still removing the rRNA and chrM reads, and that seems to have fixed the problem. However, I was wondering why the filter was there in the first place. In the mm10 annotation I'm using, none of the chromosome names contain '_'.
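
For illustration, here is a minimal sketch of a filter that keeps prefixed spike-in chromosomes instead of dropping the underscore check entirely. It is not the pipeline's actual code: the function name `filter_chroms` is hypothetical, the input is assumed to be tab-separated BED-like records with the chromosome name in column 1, and the default prefix `dm6_` is simply the naming I used for my combined genome. It also assumes the underscore filter exists to drop unplaced or random contigs (UCSC-style names such as `chr1_GL456210_random`), which is a guess on my part.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of proseq2.0): filter BED-like records on stdin.
# Assumption: column 1 holds the chromosome name; spike-in chromosomes carry a
# fixed prefix (default "dm6_", passed as the first argument).
filter_chroms() {
  local spike="${1:-dm6_}"
  awk -F'\t' -v spike="$spike" '
    # Drop rRNA and mitochondrial records (this also drops e.g. dm6_chrM).
    $1 ~ /rRNA|chrM/ { next }
    {
      name = $1
      # Strip the spike-in prefix before testing for underscores, so that
      # dm6_chr2L is judged as chr2L while dm6_chrUn_xxx is still rejected.
      if (index(name, spike) == 1) name = substr(name, length(spike) + 1)
      if (name !~ /_/) print
    }
  '
}
```

Used as `zcat reads.bed.gz | filter_chroms dm6_` (with `reads.bed.gz` standing in for whatever intermediate file is being filtered), this would keep `chr1` and `dm6_chr2L` while still discarding `chrM` and `chr1_GL456210_random`.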

I'd also suggest documenting this behavior, since a combined genome seems to be a relatively common way of doing spike-in normalization.

