
Hierarchy of disambiguation information #56

@tilltnet


Hi,
I've noticed that the pruning step of the authors_match function separates entries that were previously matched by the same ORCID, Researcher ID (RID), or e-mail address. In my case that leads to unnecessary under-matching. My quick fix was to set the similarity to 1 for entries matched by ORCID or RID, which excludes them from the pruning. For entries matched by e-mail address, the pruning seemed to do a good job, though!
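
The quick fix described above can be sketched roughly as follows. This is an illustration in Python, not the package's actual code: the `Match` record, the `matched_by` labels, the `protect_trusted` and `prune` helpers, and the 0.9 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, replace

# Hypothetical set of identifier types trusted enough to skip pruning.
TRUSTED = {"orcid", "rid"}

@dataclass(frozen=True)
class Match:
    author: str
    matched_by: str    # "orcid", "rid", "email" or "name" (assumed labels)
    similarity: float  # name similarity in [0, 1]

def protect_trusted(matches):
    """Set similarity to 1.0 for matches made via ORCID or RID."""
    return [
        replace(m, similarity=1.0) if m.matched_by in TRUSTED else m
        for m in matches
    ]

def prune(matches, threshold=0.9):
    """Keep only matches whose similarity reaches the (assumed) threshold."""
    return [m for m in matches if m.similarity >= threshold]

matches = [
    Match("Smith, J.", "orcid", 0.40),  # name changed, same ORCID
    Match("Meier, A.", "email", 0.95),  # e-mail match, names agree
    Match("Meyer, B.", "email", 0.50),  # e-mail match, names differ
]

kept = prune(protect_trusted(matches))
print([m.author for m in kept])
```

In this sketch the ORCID match survives pruning despite its low name similarity, while e-mail matches are still judged by name similarity.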

I don't know whether giving ORCID and RID a higher priority is a universally better solution, but as I understand it, ORCID and RID are quite reliable and can also identify a person whose name has changed, for example due to marriage. Pruning those matches by name initials might therefore not be the best approach.

If there are good reasons to overrule ORCID/RID matches based on differences in name initials, it might be worthwhile to let the user decide the hierarchy of ORCID/RID, e-mail, and names.
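
One way such a user-controlled hierarchy could look is a parameter that lists which match types are exempt from name-based pruning. The sketch below is in Python for illustration only; `prune_matches`, its `protected` parameter, the record fields, and the 0.9 threshold are hypothetical and do not reflect the package's interface.

```python
def prune_matches(matches, protected=("orcid", "rid"), threshold=0.9):
    """Drop a match only if its type is not protected AND its name
    similarity falls below the threshold (both values are assumptions)."""
    protected = set(protected)
    return [
        m for m in matches
        if m["matched_by"] in protected or m["similarity"] >= threshold
    ]

matches = [
    {"author": "Smith, J.", "matched_by": "orcid", "similarity": 0.40},
    {"author": "Meier, A.", "matched_by": "email", "similarity": 0.95},
    {"author": "Meyer, B.", "matched_by": "email", "similarity": 0.50},
]

# Default: ORCID/RID outrank name initials, e-mail does not.
default = prune_matches(matches)
# User also trusts e-mail matches.
trust_email = prune_matches(matches, protected=("orcid", "rid", "email"))
# User lets name similarity overrule every identifier.
names_only = prune_matches(matches, protected=())

print(len(default), len(trust_email), len(names_only))
```

Passing an empty `protected` tuple would reproduce the current behaviour in which name differences can split any match, so a default could stay backward compatible if that is preferred.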

Best,
Till
