
HaploBlocker for Sars/CoVid #57

@tpook92

I just finished some initial analysis on the use of HaploBlocker for the Sars/CoVid genomes @superjox provided me with. I see the main use in locally differentiating between variants, as two sequences differ, on average, by only 70 Bins (1w), which could basically lead to the same path for all sequences.
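To make the "average difference between two sequences" figure concrete, here is a minimal Python sketch of how such a number could be computed on aligned sequences. It is not part of HaploBlocker; the function names `hamming` and `mean_pairwise_difference` and the toy sequences are hypothetical and only there for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_difference(seqs):
    """Average Hamming distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Toy data (made up), not the real genomes.
toy = ["AAAAAA", "AAAAAT", "AAAATT"]
print(mean_pairwise_difference(toy))
```

If this average is tiny relative to the sequence length, a whole-genome view will put nearly all lines on the same path, which is why a local view per region seems more useful here.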

When applying HB to just the Sars2 data, the final blocks look like this:
sars2

There is mainly a differentiation between one variant (colored) and everything else that can't be assigned to any block (black). However, the differences between lines seem to vary strongly by region. I feel like coloring according to haplotype block could really help to display local variation.

I also tested HB on a joined dataset with Sars1 / Sars2 lines (with Sars2 being the last 106 lines and no other sorting applied):

Sars1vsSars2_regular

Besides very few short overlapping regions, there isn't really much I can get out of this. Also, the mode to remove overlapping blocks is currently not robust enough to handle the combination of such short / long blocks.

Lastly, I added the constraint to specifically only detect haplotype blocks that are present in Sars1/Sars2 at least 5 times. I even allowed for 5% differences between haplotypes in a block, but there was basically nothing noteworthy to get (same sorting):

Sars1vsSars2
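For readers who want to see what such a constraint means in practice, here is a rough Python sketch. It is not how HaploBlocker implements it; it assumes the constraint reads as "at least 5 matching lines in each of the two groups", and the names `matches` and `block_in_both_groups` as well as the parameters `min_count` and `max_mismatch` are hypothetical.

```python
def matches(block, segment, max_mismatch=0.05):
    """True if segment differs from block in at most max_mismatch of positions."""
    if len(block) != len(segment):
        raise ValueError("block and segment must have the same length")
    diffs = sum(x != y for x, y in zip(block, segment))
    return diffs / len(block) <= max_mismatch


def block_in_both_groups(block, start, group_a, group_b,
                         min_count=5, max_mismatch=0.05):
    """Check that a candidate block occurs often enough in both groups.

    block   : candidate haplotype for the window beginning at `start`
    group_a : aligned sequences of the first group (e.g. Sars1 lines)
    group_b : aligned sequences of the second group (e.g. Sars2 lines)
    """
    end = start + len(block)

    def count(group):
        # Number of lines whose window is close enough to the block.
        return sum(matches(block, seq[start:end], max_mismatch) for seq in group)

    return count(group_a) >= min_count and count(group_b) >= min_count
```

With a 5% tolerance, a block of 40 positions tolerates up to 2 mismatches per line, so short blocks leave very little room for variation while still needing 5 supporting lines on both sides.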

I have not done anything regarding phylogenetic trees so far, as Eric seems to have already contacted that group via Slack. As all Sars1 lines are so similar, I would not expect a tree between Sars1 and Sars2 to be highly informative, but that's probably a topic for a different issue.
