Standardized POS/Features with Universal Dependency #1332

@Nickersoft

Hey folks – opening a parent issue to discuss something that has been kicking around the back of my mind lately. Currently, for maximum flexibility, Sense blocks (along with some others, like Form) accept <tag> children to denote modifiers on the lexical component. For example, you might pass <tag>female</tag> for the sense "actrice" in French.

However, for the sake of clarity, I'm thinking it may be worth explicitly allowing attributes on these blocks instead, using the official feature list from Universal Dependencies (UD) as the standard.

For example, <sense gender="masc" definite="spec" /> would denote gender and definiteness. In a similar vein, it would be nice to use UD's official POS tags for ODict's PartOfSpeech, though ODict currently supports far more POS tags than UD offers (such as the many Japanese-specific part-of-speech tags).
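
To make the idea concrete, here is a rough sketch of what checking such attributes against a fixed feature list could look like. It is only an illustration, not ODict's actual parser: the `UD_FEATURES` table and the `validate_sense` helper are hypothetical names, the table covers just two UD features (Gender and Definite) with lowercased values, and it assumes every attribute on the element is meant to be a UD feature.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: a small subset of UD features, lowercased.
# UD itself writes these values capitalized (Masc, Fem, Neut, Com, ...).
UD_FEATURES = {
    "gender": {"masc", "fem", "neut", "com"},
    "definite": {"com", "cons", "def", "ind", "spec"},
}


def validate_sense(xml_snippet):
    """Return a list of problems found in the attributes of one element.

    Assumption: every attribute on the element is intended as a UD feature.
    """
    element = ET.fromstring(xml_snippet)
    errors = []
    for name, value in element.attrib.items():
        allowed = UD_FEATURES.get(name)
        if allowed is None:
            errors.append(f"unknown feature: {name}")
        elif value.lower() not in allowed:
            errors.append(f"invalid value for {name}: {value}")
    return errors


if __name__ == "__main__":
    print(validate_sense('<sense gender="masc" definite="spec" />'))
    print(validate_sense('<sense gender="female" />'))
```

With a closed list like this, a free-form value such as "female" gets flagged instead of silently accepted, which is the main gain over arbitrary <tag> children.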

Wondering what everyone's thoughts are on this and whether it is worth the change!
