How to Use
Marcel Heinz edited this page Jun 3, 2019
5 revisions
Required technology:
- There are too many dependencies to list here; inspect the files referenced below.
- Download Stanford CoreNLP: https://stanfordnlp.github.io/CoreNLP/index.html#download. You need a running CoreNLP server to reproduce the results.
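Before starting the pipeline, it can help to confirm that the CoreNLP server answers requests. The following is a minimal sketch, assuming the server runs locally on port 9000 (the port used in the CoreNLP server documentation). The helper names `build_corenlp_url` and `annotate` are illustrative and are not part of this repository.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server was started locally as described in the CoreNLP docs, e.g.
#   java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
CORENLP_URL = "http://localhost:9000"


def build_corenlp_url(base_url, annotators):
    """Build the request URL; the server reads its settings from a 'properties' JSON parameter."""
    properties = {"annotators": ",".join(annotators), "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return f"{base_url}/?{query}"


def annotate(text, annotators=("tokenize", "ssplit", "pos"), base_url=CORENLP_URL, timeout=30):
    """POST raw text to the server and return the parsed JSON response."""
    url = build_corenlp_url(base_url, annotators)
    request = urllib.request.Request(url, data=text.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server, otherwise urllib raises URLError):
#   result = annotate("Java is a programming language.")
#   print([token["word"] for token in result["sentences"][0]["tokens"]])
```

If the call raises `URLError`, the server is most likely not running or is listening on a different port.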
How to reproduce results:
- The file src/data/init.py serves as the core configuration. Enter the depth level and the root categories there.
- Run src/mine/pipeline.py. The process creates an annotated dictionary of article titles. Most data is mined from DBpedia.
- Run src/check/seed.py to annotate whether an article is a seed.
- src/classify/decision_tree.py configures the decision tree classifier.
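The two "Run" steps above can be sketched as a small driver script. This is only an illustration under stated assumptions: that the scripts are started from the repository root, and that `src` must be on `PYTHONPATH` so that imports such as `from data import ...` resolve. The names `build_commands`, `build_env`, and `run_all` are hypothetical and do not exist in the repository.

```python
import os
import subprocess
import sys
from pathlib import Path

# The two scripts the steps above say to run, in that order.
STEPS = [
    "src/mine/pipeline.py",
    "src/check/seed.py",
]


def build_commands(repo_root, python=sys.executable):
    """Return one command per step, using script paths below repo_root."""
    root = Path(repo_root)
    return [[python, str(root / step)] for step in STEPS]


def build_env(repo_root):
    """Copy the current environment and put <repo_root>/src first on PYTHONPATH (assumption)."""
    env = dict(os.environ)
    src = str(Path(repo_root) / "src")
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = src if not existing else src + os.pathsep + existing
    return env


def run_all(repo_root):
    """Run each step in order; check=True stops at the first failing script."""
    env = build_env(repo_root)
    for command in build_commands(repo_root):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True, cwd=repo_root, env=env)
```

Calling `run_all(".")` from the repository root would then execute both steps. Running the scripts from within PyCharm, with `src` marked as the sources root, should be equivalent.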
Be careful when inspecting the other scripts. Many of them are exploratory and investigate indications directly, in an active-learning manner.
Because the article titles are the keys of the article dictionary, it can be queried conveniently in the Python Console of PyCharm. For example:
from data import load_articledict
ad = load_articledict()
# Get all articles with 'language' as the retrieved hypernym:
[a for a in ad if "COPHypernym" in ad[a] and "language" in ad[a]["COPHypernym"]]
# Get all articles classified as relevant for software languages
# (.get avoids a KeyError for articles that have no "Class" entry yet):
[title for title, entry in ad.items() if entry.get("Class") == "1"]
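To try such queries without loading the mined data, a toy dictionary of the same shape can be used. The sketch below assumes that each title maps to a dictionary of annotations, that "COPHypernym" holds a list of strings, and that "Class" holds the string '1' for relevant articles. The entries and the helper names `with_hypernym` and `with_class` are made up for illustration.

```python
# Toy stand-in for the dictionary returned by load_articledict();
# titles and annotation values are invented for this example.
ad = {
    "Java_(programming_language)": {"COPHypernym": ["language"], "Class": "1"},
    "Eclipse_(software)": {"COPHypernym": ["environment"], "Class": "0"},
    "Haskell_(programming_language)": {"COPHypernym": ["language"]},  # not classified yet
}


def with_hypernym(articles, hypernym):
    """Titles whose retrieved hypernyms include the given word."""
    return [
        title
        for title, entry in articles.items()
        if hypernym in entry.get("COPHypernym", [])
    ]


def with_class(articles, label):
    """Titles whose "Class" annotation equals label; unclassified articles are skipped."""
    return [title for title, entry in articles.items() if entry.get("Class") == label]


print(with_hypernym(ad, "language"))
# ['Java_(programming_language)', 'Haskell_(programming_language)']
print(with_class(ad, "1"))
# ['Java_(programming_language)']
```

Using `.get` with a default keeps the queries from failing on articles that are missing an annotation.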