Hi,
I'm using your VIAF recon service to reconcile scholars' names from the field of Lexicography and Dictionary Research, in order to construct a domain bibliography and person registry in the Linked Open Data environment.
After reconciling and manually validating 200 person names against VIAF (with very good results in general!), I came across a peculiarity in VIAF that seems to prevent automatic matching in many cases and increases tedious manual validation. Apparently, one of the VIAF contributors, NUKAT, appends a full stop to person name headings, which introduces an edit distance that would otherwise not exist and causes the score to drop below 1. Even with OpenRefine's option to auto-match high-confidence candidates selected during reconciliation, the score is often below the threshold.
Typical example from my data:
Name literal: Quasthoff, Uwe
VIAF candidate: Quasthoff, Uwe. (score: 0.933)
VIAF URI: https://viaf.org/viaf/22741331/
As far as I can see, NUKAT is the only VIAF contributor that puts a full stop after a person's name, and yet this particular heading is always ranked highest in the VIAF cluster. Since we have no way of anticipating whether a matching VIAF cluster includes NUKAT headings, is there a way to modify the matching algorithm so that it chops off the full stop (if there is one) from the candidates returned by VIAF?
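To illustrate what I have in mind, here is a minimal sketch in Python. It is not taken from your code base: the function names are hypothetical, and I am only assuming that the score is something like a Levenshtein distance normalized by the longer string's length (that assumption does give 0.933 for my example, but the service may well compute it differently). The sketch leaves a full stop in place when it closes an initial, e.g. "Smith, J.", since that one probably belongs to the name.

```python
import re


def strip_trailing_full_stop(name: str) -> str:
    """Drop one trailing full stop unless it closes an initial such as 'J.'."""
    name = name.strip()
    # A single letter preceded by start, whitespace, comma, or full stop
    # is treated as an initial and kept untouched.
    if name.endswith(".") and not re.search(r"(^|[\s,.])\w\.$", name):
        return name[:-1].rstrip()
    return name


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed scoring: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


query = "Quasthoff, Uwe"
candidate = "Quasthoff, Uwe."

score_before = similarity(query, candidate)
score_after = similarity(query, strip_trailing_full_stop(candidate))
print(round(score_before, 3), score_after)
```

With the full stop removed from the candidate before scoring, my example would reach a score of 1 and could be auto-matched.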
This would really help to improve your VIAF recon service even further. Thanks for all the work you've already done!
Regards,
Christiane