CCHF isolate name potentially misspelled on Genbank #12

@anna-parker

Describe the possible issue

Issue first described here: pathoplexus/pathoplexus#790 (comment).

Previously, the L and S segments could be grouped, resulting in the PPX entry https://pathoplexus.org/seq/PP_000QFTH.3. However, after Genbank added the isolate and strain name to the data ingested by NCBI Virus, ingest broke up the grouping, as the strain names are not the same.

Evidence of the problem

I believe this is actually a typo as all other relevant fields match, e.g. the Genbank files https://www.ncbi.nlm.nih.gov/nuccore/KT384397.1 and https://www.ncbi.nlm.nih.gov/nuccore/KT384388.1 both contain:

  AUTHORS   Yadav,P.D., Shete,A.M. and Mourya,D.T.
  TITLE     First report of nosocomial outbreak of Crimean-Congo hemorrhagic
            fever, Rajasthan State, India
  JOURNAL   Unpublished
...
  AUTHORS   Yadav,P.D., Shete,A.M. and Mourya,D.T.
  TITLE     Direct Submission
  JOURNAL   Submitted (11-AUG-2015) Maximum Containment Laboratory, National
            Institute of Virology, Maximum Containment Laboratory, Pune,
            Maharashtra India, India

and the strain names are very similar: /strain="NIV130776" vs /strain="NIV1310776"
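
To put a number on "very similar", here is a minimal sketch that computes the Levenshtein edit distance between the two strain names quoted above. The helper `edit_distance` is a hypothetical illustration written for this comment; it is not part of the Pathoplexus ingest pipeline or of any NCBI tooling.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, using a two-row DP table."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(
                min(
                    prev[j] + 1,  # delete char_a
                    curr[j - 1] + 1,  # insert char_b
                    prev[j - 1] + substitution_cost,  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


# Strain names copied from the two Genbank records quoted in this issue.
strain_a = "NIV130776"
strain_b = "NIV1310776"

print(edit_distance(strain_a, strain_b))  # → 1
```

The two names differ by a single inserted character, which is consistent with a typo. It does not prove one, so the matching authors, title and submission details remain the main evidence.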

Suggested change

Manually curate the ingested strain names so that they match, which would allow the now-revoked sequence to be unrevoked.

Full list of affected sequences

PP_000QFTH.3 will need to be unrevoked and its isolate name curated. Sequences PP_004G4NH and PP_004G4PF will need to be revoked.
