
Some problems when preprocessing #11

@alderpaw

Hello! Thanks for your contribution.
When I ran this code to preprocess the ACE 2005 corpus, I got some warnings and an error. Do they affect the result?

  • [Warning] The entity in the other sentence is mentioned. This argument will be ignored. This warning occurred multiple times during preprocessing.

  • [Warning] fail to find offset! (start_index: 3348, text: Doctors Without Borders/Médecins Sans Frontières (MSF, path: D:\Data\ace_2005_td_v7\data\English\un/timex2norm/alt.vacation.las-vegas_20050109.0133) This warning is followed by an assertion error (end_idx != -1), so I commented out the corresponding code in main.py to avoid it. I have read the other issues and know that simply deleting the file may solve the problem, but is there any solution other than deleting it? Also, does the output contain mistakes because of this warning?
    Looking forward to your reply!
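
To make the second question concrete, here is a minimal sketch of the kind of alternative I have in mind. It is not the repository's actual code: `find_span` and `resolve_mentions` are hypothetical names, and I am only assuming that the failure comes from a whitespace or line-break mismatch between the annotated mention text and the sentence. The idea is to match the mention while tolerating whitespace differences, and to skip only the mentions that still cannot be located, instead of disabling the assertion or deleting the whole file.

```python
import re
from typing import List, Optional, Tuple


def find_span(sentence: str, mention: str, start_hint: int = 0) -> Optional[Tuple[int, int]]:
    """Return (start, end) of `mention` in `sentence`, or None if it is absent.

    Hypothetical helper: any run of whitespace in the mention matches any run
    of whitespace in the sentence, so a line break inside the source text does
    not cause a miss.
    """
    tokens = mention.split()
    if not tokens:
        return None
    pattern = re.compile(r"\s+".join(re.escape(tok) for tok in tokens))
    # Prefer the first match at or after the hinted offset, then fall back
    # to searching the whole sentence.
    match = pattern.search(sentence, max(start_hint, 0)) or pattern.search(sentence)
    return (match.start(), match.end()) if match else None


def resolve_mentions(sentence: str, mentions: List[str]):
    """Split mentions into those that were located and those that were skipped."""
    resolved, skipped = [], []
    for mention in mentions:
        span = find_span(sentence, mention)
        if span is None:
            skipped.append(mention)  # report it, but keep processing the file
        else:
            resolved.append((mention, span))
    return resolved, skipped
```

With something like this, only the unresolved mentions would be dropped and they could be logged for inspection, so it would be clear exactly which annotations are missing from the output.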
