Prediction of multi-word expression #28

@gifdog97

Is it possible to predict multi-word expressions (MWEs) from raw text?
I ran predict.py with the --raw_text option and found that MWEs are not predicted.

For example, in Italian, "della" is a contraction of "di la", and UD annotates such a token as follows:

31-32	della	_	_	_	_	_	_	_	_
31	di	di	ADP	E	_	35	case	35:case	_
32	la	il	DET	RD	Definite=Def|Gender=Fem|Number=Sing|PronType=Art	35	det	35:det	_

However, the output of UDify is something like this:

31	della	della	ADP	_	_	3	case	_	_
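
To make the difference concrete, here is a minimal sketch (plain Python, not part of UDify) that lists the range lines such as `31-32` in a CoNLL-U string. The function name `multiword_ranges` and the two sample strings are my own, built from the lines quoted above; it assumes tab-separated columns with the ID in the first column and the surface form in the second.

```python
def multiword_ranges(conllu_text):
    """Return (start, end, surface_form) for every range line such as '31-32'."""
    ranges = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        token_id = cols[0]
        # Range IDs contain a hyphen ("31-32"); ordinary word IDs do not.
        if "-" in token_id:
            start, end = token_id.split("-")
            ranges.append((int(start), int(end), cols[1]))
    return ranges


# Hypothetical sample data assembled from the lines quoted in this issue.
expected = (
    "31-32\tdella\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "31\tdi\tdi\tADP\tE\t_\t35\tcase\t35:case\t_\n"
    "32\tla\til\tDET\tRD\tDefinite=Def|Gender=Fem|Number=Sing|PronType=Art"
    "\t35\tdet\t35:det\t_\n"
)
observed = "31\tdella\tdella\tADP\t_\t_\t3\tcase\t_\t_\n"

print(multiword_ranges(expected))  # [(31, 32, 'della')]
print(multiword_ranges(observed))  # []
```

The UD-style annotation yields one range covering words 31 and 32, while the UDify output yields none, which is the gap I am asking about.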

I would like to obtain CoNLL-U output with proper MWE annotation. Is there any way to achieve this?
