darcy3000/tf-idf_vectorizer
*********************INFORMATION RETRIEVAL*****************

VECTOR SPACE MODEL

MODULES AND PACKAGES REQUIRED:
NLTK
KIVY
PICKLE
HEAPQ_MAX
TIME
STRING
COLLECTIONS

Download the corpora from NLTK. We use movie_reviews as our corpus, a dataset that others use for sentiment analysis. It contains 2000 documents.

Run the Python file as:
python <filename>.py

The classifier has been saved to pickle files, so you can run the program directly, without training the classifier, by reading from those files. Uncomment the appropriate lines to train the classifier again.

Keep the .kv files in the same folder when running the GUI-based program. A GUI window will open. Follow the instructions. Click on the links to view the entire document.
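The "read from the pickle files, or retrain" workflow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `load_or_build`, `build_toy_model`, and the cache file name are hypothetical, and the toy builder merely stands in for the real training step.

```python
import os
import pickle
import tempfile


def load_or_build(path, build_fn, retrain=False):
    """Return the object cached at `path`; build and cache it if needed.

    Hypothetical helper: `build_fn` is any zero-argument callable that
    performs the expensive training step and returns a picklable object.
    """
    if not retrain and os.path.exists(path):
        # Fast path: reuse the previously pickled object.
        with open(path, "rb") as fh:
            return pickle.load(fh)
    # Slow path: train from scratch, then cache the result.
    obj = build_fn()
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)
    return obj


def build_toy_model():
    # Stand-in for the real training code (an assumption for illustration).
    return {"vocabulary": ["film", "plot"], "trained": True}


if __name__ == "__main__":
    cache = os.path.join(tempfile.gettempdir(), "toy_model.pickle")
    model = load_or_build(cache, build_toy_model)
    print(model["trained"])
```

Passing `retrain=True` plays the role of uncommenting the training lines: it ignores any existing pickle file and overwrites it with a freshly built object.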
About
Information retrieval search engine based on tf-idf. The corpus is the movie_reviews dataset from NLTK (Python).
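To make the idea of a tf-idf vector space search concrete, here is a small self-contained sketch. It is not taken from the repository. The weighting scheme (log-scaled term frequency times log inverse document frequency), the cosine ranking, the function names, and the three toy documents are all assumptions chosen for illustration. It uses the standard-library `heapq` rather than the `heapq_max` package listed in the requirements.

```python
import heapq
import math
import string
from collections import Counter


def tokenize(text):
    # Lowercase, strip ASCII punctuation, split on whitespace.
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def weigh(tokens, idf):
    # Log-scaled term frequency times idf (one common variant, assumed here).
    tf = Counter(t for t in tokens if t in idf)
    return {t: (1 + math.log(c)) * idf[t] for t, c in tf.items()}


def build_index(docs):
    """docs maps doc_id -> text. Returns (idf, vectors)."""
    tokenized = {d: tokenize(text) for d, text in docs.items()}
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))  # count each term once per document
    n = len(docs)
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = {d: weigh(tokens, idf) for d, tokens in tokenized.items()}
    return idf, vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query, idf, vectors, k=3):
    # Return up to k (score, doc_id) pairs, best match first.
    q = weigh(tokenize(query), idf)
    scored = ((cosine(q, v), d) for d, v in vectors.items())
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    toy_docs = {  # invented examples, not from movie_reviews
        "a": "A gripping thriller with a clever plot.",
        "b": "A dull comedy with a weak plot.",
        "c": "A clever comedy.",
    }
    idf, vectors = build_index(toy_docs)
    for score, doc_id in search("clever thriller", idf, vectors):
        print(doc_id, round(score, 3))
```

Terms that occur in every document (here, "a") get an idf of zero and so contribute nothing to the ranking, which is the usual motivation for the idf factor.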