GitHub - bavi404/IntentDetectionProject: Developed an end-to-end intent detection system using TF-IDF with Logistic Regression and fine-tuned BERT models, including comparative analysis and evaluation.

Intent Detection Project

Overview

This project aims to develop an Intent Detection system for conversational AI. Two different machine learning models are implemented and compared:

Baseline Model: TF-IDF Vectorization + Logistic Regression Classifier

Advanced Model: Fine-tuned BERT (Bidirectional Encoder Representations from Transformers)

The objective is to predict user intent from text input using supervised learning techniques.

Project Structure

Intent-Detection/
│
├── baseline_model/
│   ├── train_baseline.py
│
├── bert_model/
│   ├── train_bert.py
│
├── data/
│   └── sofmattress_train.csv
│
├── results/
│   └── (saved models, confusion matrices)
│
├── report/
│   └── final_report.md
│
├── README.md
├── requirements.txt

Setup Instructions

Clone or download this repository.

Install the required Python libraries:

pip install -r requirements.txt

The following libraries are used:

pandas

scikit-learn

matplotlib

seaborn

nltk

torch (>=2.2)

transformers (>=4.30)

accelerate

Running the Models

Train the Baseline Model

Navigate to the baseline model directory and run the script:

cd baseline_model
python train_baseline.py

This will:

Train a Logistic Regression classifier using TF-IDF features.

Save the trained model, vectorizer, and label encoder in the results/ directory.

Generate a confusion matrix plot.
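The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of train_baseline.py: the column names `sentence` and `label`, the toy utterances and intent names, the hyperparameters, and the output file names are all assumptions, and a temporary directory stands in for results/.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for data/sofmattress_train.csv.
# Column names, utterances, and intent labels are assumptions for illustration.
df = pd.DataFrame({
    "sentence": [
        "where is my order",
        "track my order status",
        "has my order shipped yet",
        "what is the warranty period",
        "is there a warranty on the mattress",
        "how long does the warranty last",
    ],
    "label": ["ORDER_STATUS"] * 3 + ["WARRANTY"] * 3,
})

# Map string intents to integer class ids.
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(df["label"])

# Turn each utterance into a sparse TF-IDF vector over unigrams and bigrams.
vectorizer = TfidfVectorizer(lowercase=True, ngram_range=(1, 2))
X = vectorizer.fit_transform(df["sentence"])

# Fit the linear classifier on the TF-IDF features.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# Persist the three artifacts; the file names here are hypothetical.
out_dir = Path(tempfile.mkdtemp())  # stand-in for results/
joblib.dump(clf, out_dir / "model.joblib")
joblib.dump(vectorizer, out_dir / "vectorizer.joblib")
joblib.dump(label_encoder, out_dir / "label_encoder.joblib")


def predict_intent(text: str) -> str:
    """Vectorize one utterance and return its predicted intent name."""
    features = vectorizer.transform([text])
    return str(label_encoder.inverse_transform(clf.predict(features))[0])
```

At inference time the same fitted vectorizer and label encoder must be reloaded alongside the classifier, which is why all three are saved together.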

Fine-tune the BERT Model

Navigate to the BERT model directory and run the script:

cd bert_model
python train_bert.py

This will:

Fine-tune a pre-trained bert-base-uncased model for the intent classification task.

Save the fine-tuned model and tokenizer in the results/bert_model/ directory.

Note: Fine-tuning BERT takes considerably longer than training the baseline, especially on a CPU. Using a GPU is recommended if available.

Results

The performance of the two models is compared using evaluation metrics such as accuracy, F1-score, and confusion matrix visualization. Details and analysis are provided in the report/final_report.md file.
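As a rough illustration of how these metrics can be computed with scikit-learn, here is a small sketch. The label names and predictions are made-up placeholders, not results from this project, and macro averaging for the F1-score is an assumption, since the averaging mode used in the report is not stated here.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Hypothetical gold labels and model predictions (not project results).
y_true = ["ORDER_STATUS", "ORDER_STATUS", "WARRANTY", "WARRANTY", "EMI"]
y_pred = ["ORDER_STATUS", "WARRANTY", "WARRANTY", "WARRANTY", "EMI"]

# Fix the label order so matrix rows and columns are reproducible.
labels = ["EMI", "ORDER_STATUS", "WARRANTY"]

# Fraction of utterances whose predicted intent matches the gold intent.
accuracy = accuracy_score(y_true, y_pred)

# Macro F1: unweighted mean of per-class F1, so rare intents count equally.
macro_f1 = f1_score(y_true, y_pred, labels=labels, average="macro")

# Rows are true intents, columns are predicted intents.
cm = confusion_matrix(y_true, y_pred, labels=labels)

print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
print(cm)
```

The resulting matrix can be passed to `seaborn.heatmap(cm, annot=True)` to produce a confusion matrix plot like the ones saved under results/.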

Notes

Ensure that the dataset (sofmattress_train.csv) is placed inside the data/ directory.

Adjust the batch size and number of epochs in train_bert.py if you run into memory constraints.

Model files and outputs are automatically saved in the results/ directory after training.
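When lowering the batch size to fit in memory, one common complement (not something this README says the script does) is gradient accumulation, which keeps the effective batch size unchanged. The helper below is a hypothetical sketch of the arithmetic. Its dictionary keys match the Hugging Face `TrainingArguments` parameters `per_device_train_batch_size` and `gradient_accumulation_steps`, which only applies if train_bert.py uses the `Trainer` API; that is an assumption.

```python
def plan_batches(target_batch_size: int, max_per_device: int) -> dict:
    """Split a target effective batch size into a per-step batch size
    and a number of gradient accumulation steps.

    Hypothetical helper: neither the name nor the logic comes from train_bert.py.
    """
    if target_batch_size < 1 or max_per_device < 1:
        raise ValueError("batch sizes must be positive integers")

    # Never use a larger per-step batch than the device can hold.
    per_device = min(target_batch_size, max_per_device)

    # Ceiling division, so the effective batch is never below the target.
    accumulation = -(-target_batch_size // per_device)

    return {
        "per_device_train_batch_size": per_device,
        "gradient_accumulation_steps": accumulation,
    }


# A target batch of 32 on a device that only fits 8 examples per step.
settings = plan_batches(target_batch_size=32, max_per_device=8)
print(settings)
```

Here the model would see 8 examples per forward pass and update its weights every 4 passes, which trades training speed for lower peak memory.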
