Skip to content

Developed an end-to-end intent detection system using TF-IDF with Logistic Regression and fine-tuned BERT models, including comparative analysis and evaluation.

Notifications You must be signed in to change notification settings

bavi404/IntentDetectionProject

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Intent Detection Project Overview This project aims to develop an Intent Detection system for conversational AI. Two different machine learning models are implemented and compared:

Baseline Model: TF-IDF Vectorization + Logistic Regression Classifier

Advanced Model: Fine-tuned BERT (Bidirectional Encoder Representations from Transformers)

The objective is to predict user intent from text input using supervised learning techniques.

Project Structure Intent-Detection/ │ ├── baseline_model/ │ ├── train_baseline.py │ ├── bert_model/ │ ├── train_bert.py │ ├── data/ │ └── sofmattress_train.csv │ ├── results/ │ └── (saved models, confusion matrices) │ ├── report/ │ └── final_report.md │ ├── README.md ├── requirements.txt Setup Instructions Clone or download this repository.

Install the required Python libraries:

pip install -r requirements.txt The following libraries are used:

pandas

scikit-learn

matplotlib

seaborn

nltk

torch (>=2.2)

transformers (>=4.30)

accelerate

Running the Models

  1. Train the Baseline Model Navigate to the baseline model directory and run the script:

cd baseline_model python train_baseline.py This will:

Train a Logistic Regression classifier using TF-IDF features.

Save the trained model, vectorizer, and label encoder in the results/ directory.

Generate a confusion matrix plot.

  1. Fine-tune the BERT Model Navigate to the BERT model directory and run the script:

cd bert_model python train_bert.py This will:

Fine-tune a pre-trained bert-base-uncased model for the intent classification task.

Save the fine-tuned model and tokenizer in the results/bert_model/ directory.

Note: Fine-tuning BERT may take additional time, especially on CPUs. Using a GPU is recommended if available.

Results The performance of the two models is compared using evaluation metrics such as accuracy, F1-score, and confusion matrix visualization. Details and analysis are provided in the report/final_report.md file.

Notes Ensure that the dataset (sofmattress_train.csv) is placed inside the data/ directory.

Adjust batch size and number of epochs in train_bert.py if facing memory constraints.

Model files and outputs are automatically saved in the results/ directory after training.

About

Developed an end-to-end intent detection system using TF-IDF with Logistic Regression and fine-tuned BERT models, including comparative analysis and evaluation.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages