Skip to content

Can AMULET identify doublets with relatively low read counts from cells with a wide range of read counts #17

@Yydtqqqg

Description

@Yydtqqqg

Hi! I am applying AMULET on snATAC-seq data generated from my lab. For a dataset like this, after filtering with TSSE >= 10 and log10(UQ + 1) >= 3 (UQ means the number of uniquely mapped fragments), the cells still have a wide range of UQ.
屏幕快照 2022-04-04 14 23 37
Since the number of overlaps (genomic regions with >2 overlapping reads) for a cell positively correlates with the UQ of the cell, cells have high UQ tend to have a larger number of overlaps. Because the majority of cells fall in the range of 3 < log10(UQ + 1) < 4, I believe the majority of doublets are also there (at least with the same order of magnitude). However, with a doublet formed by two singlets each with UQ = 3000 and a singlet with UQ = 30000, AMULET would very probably locate more overlaps in the singlet than in the doublet. The cells whose UQ > 10000 tend to have more overlaps and are thus more likely to be identified as doublets.
The AMULET paper indicates that 25K median valid reads per nucleus is optimal, but is there a way to maximize the probability of finding doublets in the range of 3 < log10(UQ + 1) < 4 in this dataset given that there are cells in range 4 < log10(UQ + 1) < 5?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions