TextCNN-For-Relation-Classification

Introduction

An implementation of relation classification using TextCNN.

Most of the code is adapted from roomylee, with a few modifications to work with the current version of TensorFlow.

The model is described in the paper Relation Extraction: Perspective from Convolutional Neural Networks.

To make the algorithm clear, here is a picture of the model:
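
As a rough illustration of the core TextCNN operation, the sketch below slides a filter over consecutive word vectors and keeps the maximum response (max-over-time pooling), giving one feature per filter. It is plain Python with toy numbers, not the code in model.py; the function name conv_max_pool, the ReLU nonlinearity, and all values are assumptions made for the example.

```python
def conv_max_pool(embeddings, filt, bias=0.0):
    """Apply one convolution filter over a sentence, then max-over-time pooling.

    embeddings: list of token vectors (one list of floats per token)
    filt:       list of weight vectors; len(filt) is the window size in tokens
    """
    window = len(filt)
    responses = []
    for start in range(len(embeddings) - window + 1):
        total = bias
        for offset in range(window):
            token_vec = embeddings[start + offset]
            weights = filt[offset]
            total += sum(x * w for x, w in zip(token_vec, weights))
        responses.append(max(0.0, total))  # ReLU (illustrative choice)
    return max(responses)  # keep only the strongest response


# Toy sentence: 4 tokens with 2-dimensional embeddings (made-up values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Two hypothetical filters with window sizes 2 and 3.
filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
]

# One pooled feature per filter; a classifier layer would consume this vector.
features = [conv_max_pool(sentence, f) for f in filters]
print(features)
```

In a full model the pooled features from many filters of several window sizes are concatenated and passed to a softmax layer over the relation labels.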

Usage

Requirements

TensorFlow 1.11
scikit-learn

File Organization

┃━ SemEval2010_task8_all_data
    ┃━ SemEval2010_task8_training
        ┃━ TRAIN_FILE.TXT
    ┃━ TRAIN_TEST_DISTRIB.TXT
┃━ runs
    ┃━ 1547716564 (timestamp)
        ┃━ checkpoints
        ┃━ summaries
            ┃━ train
            ┃━ dev
┃━ configure.py
┃━ data_helpers.py
┃━ model.py
┃━ train.py
┃━ utils.py
┃━ GoogleNews-vectors-negative300.bin

Train

The training data is located at "SemEval2010_task8_all_data/SemEval2010_task8_training/TRAIN_FILE.TXT".
"GoogleNews-vectors-negative300" is used for the pre-trained word embeddings. You can also get it from BaiduYun (code: 2muq).
The parameter settings are in configure.py. For help, run: python train.py --help
To run: python train.py (--[parameter_name] [value])
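
For orientation, each record in TRAIN_FILE.TXT consists of a line with a tab-separated id and a quoted sentence containing <e1>/<e2> entity tags, a relation label line, a comment line, and a blank line. The following is a minimal sketch of reading that layout, not the logic in data_helpers.py; the two sample sentences are invented for illustration and parse_records is a hypothetical helper.

```python
import re

# Invented sample in the SemEval-2010 Task 8 layout (not real dataset rows).
SAMPLE = (
    '1\t"The <e1>engine</e1> of the <e2>car</e2> was replaced."\n'
    'Component-Whole(e1,e2)\n'
    'Comment:\n'
    '\n'
    '2\t"The <e1>child</e1> looked at the <e2>sky</e2>."\n'
    'Other\n'
    'Comment:\n'
    '\n'
)


def parse_records(text):
    """Split the text into blank-line-separated records and extract fields."""
    records = []
    for block in text.strip().split("\n\n"):
        lines = block.split("\n")
        sent_id, sentence = lines[0].split("\t", 1)
        sentence = sentence.strip().strip('"')  # drop surrounding quotes
        records.append({
            "id": int(sent_id),
            "sentence": sentence,
            "e1": re.search(r"<e1>(.*?)</e1>", sentence).group(1),
            "e2": re.search(r"<e2>(.*?)</e2>", sentence).group(1),
            "label": lines[1].strip(),  # relation, with direction if any
        })
    return records


records = parse_records(SAMPLE)
for record in records:
    print(record["id"], record["e1"], record["e2"], record["label"])
```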

Evaluation

The test data is located at "SemEval2010_task8_all_data/SemEval2010_task8_testing_keys/TEST_FILE_FULL.TXT".
The checkpoint_dir parameter must be given, along with text_tokenizer_path and pos_tokenizer_path, the paths to the vocab_to_index and pos_to_index JSON files generated by the training step. For example:

python eval.py --checkpoint_dir "runs/1523902663/checkpoints/" --text_tokenizer_path "runs/1523902663/checkpoints/text_tokenizer.json" --pos_tokenizer_path "runs/1523902663/checkpoints/pos_tokenizer.json"
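
The tokenizer files exist so that evaluation maps tokens to the same integer indices that were used in training. The sketch below shows the general idea, assuming the JSON is a flat token-to-index mapping; the real layout written by train.py may differ, and encode, the "<PAD>"/"<UNK>" entries, and the sample vocabulary are all hypothetical.

```python
import json

# Hypothetical content of a token-to-index JSON file (assumed flat mapping).
vocab_json = '{"<PAD>": 0, "<UNK>": 1, "the": 2, "engine": 3, "of": 4, "car": 5}'


def encode(tokens, token_to_index, max_len, pad_id=0, unk_id=1):
    """Map tokens to ids, falling back to unk_id, then pad to max_len."""
    ids = [token_to_index.get(tok.lower(), unk_id) for tok in tokens[:max_len]]
    return ids + [pad_id] * (max_len - len(ids))


token_to_index = json.loads(vocab_json)
ids = encode("The engine of the truck".split(), token_to_index, max_len=7)
print(ids)
```

Reusing the saved mapping rather than rebuilding it from the test set keeps the indices aligned with the embedding rows stored in the checkpoint.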

Result

Performance is evaluated with the macro-averaged F1 score and accuracy.
With the default settings I got an F1 score of 74.85% and an accuracy of 76%.
Dropout and L2 regularization are used, but overfitting is still a big problem.
The best F1 may reach 82%; you can adjust the parameters to get better performance.
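
For reference, the two metrics can be sketched in plain Python as follows. This is an illustrative computation on made-up labels, not the code in eval.py; it averages F1 over every label that appears, whereas the official SemEval-2010 Task 8 scorer, as far as I know, leaves out the Other class, so numbers may not be directly comparable.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)


# Made-up gold and predicted labels for two hypothetical classes.
y_true = ["A", "A", "B", "B"]
y_pred = ["A", "B", "B", "B"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(acc, f1)
```

With scikit-learn installed, f1_score(y_true, y_pred, average="macro") and accuracy_score(y_true, y_pred) from sklearn.metrics compute the same quantities.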

DengChan/TextCNN-For-Relation-Classification
