RAG PDF Chatbot - 100% Local

This is a Streamlit-based chatbot that lets users upload PDFs and ask questions about their content. It runs entirely on the local machine: no API costs, no internet connection needed once the models have been downloaded, and your documents never leave your computer.

Workflow

Features

📂 Upload multiple PDFs
🔍 Vector search using FAISS
🧠 Embeddings with Hugging Face models
🤖 Uses Ollama (Llama 2) for local inference
⚡ No API keys required
🔒 Runs 100% on your local machine

🛠 Installation

1️⃣ Create a Virtual Environment (Optional but recommended)

python -m venv pdf-chat-env
source pdf-chat-env/bin/activate  # On Windows: pdf-chat-env\Scripts\activate

2️⃣ Install Dependencies

Using `pip`

pip install -r requirements.txt

Using `conda` (for CPU versions only)

conda install -c conda-forge pytorch transformers sentence-transformers faiss-cpu
pip install -r requirements.txt

🏗 Project Structure

📂 pdf-chatbot
├── 📄 app.py               # Main Streamlit app
├── 📂 modules
│   ├── process_pdf.py      # PDF processing (text extraction, chunking)
│   ├── vector_store.py     # FAISS vector database management
│   ├── llm_inference.py    # Running Ollama for answering queries
│   ├── config.py           # Configurations (chunk size, model settings)
│   └── __init__.py         # Module initialization
├── 📂 faiss_index          # Saved FAISS index (generated after processing PDFs)
├── 📄 requirements.txt     # Dependencies
└── 📄 README.md            # This file

▶️ How to Run

streamlit run app.py

Then open http://localhost:8501 in your browser.

📝 Usage

1. Upload one or more PDF files from the sidebar.
2. Set the chunk size (default: 300).
3. Click "Process Documents" to generate embeddings.
4. Enter your question in the text box.
5. Get an AI-powered answer based on the PDFs!
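
To make the chunk size setting more concrete, here is a minimal sketch of how a document might be cut into overlapping chunks before embedding. It is not the code in `process_pdf.py`: the function name `chunk_text`, the `overlap` parameter, and the assumption that chunk size is measured in characters are all hypothetical and used only for illustration.

```python
def chunk_text(text: str, chunk_size: int = 300, overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Hypothetical helper: consecutive chunks share `overlap` characters so
    that a sentence cut at a boundary still appears whole in one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "x" * 700
print([len(c) for c in chunk_text(sample, chunk_size=300, overlap=50)])
# prints [300, 300, 200]
```

Smaller chunks mean more, shorter pieces to embed and search; larger chunks keep more surrounding context in each retrieved piece.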

🏗 Technologies Used

Streamlit (Web UI)
PyPDF2 (PDF text extraction)
LangChain (Text chunking, LLM inference)
FAISS (Vector search)
Hugging Face Embeddings (Text representations)
Ollama (Llama 2) (Local LLM for answering queries)
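
The retrieval step that FAISS performs can be illustrated without any dependencies. The sketch below is a plain-Python stand-in, not FAISS itself and not the code in `vector_store.py`: it scores every chunk vector against the query by cosine similarity and returns the indices of the best matches. The names `cosine_similarity` and `top_k` and the toy three-dimensional vectors are assumptions for illustration; real embeddings have hundreds of dimensions, and the similarity metric the project actually uses may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return indices of the k chunk vectors most similar to the query.

    Exhaustive search: every vector is compared with the query.
    """
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy embeddings standing in for three text chunks.
chunk_vecs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(top_k([1.0, 0.0, 0.0], chunk_vecs, k=2))
# prints [0, 1]
```

The chunks at the returned indices are what gets passed to the LLM as context for the answer.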

⚠️ Troubleshooting

No answer or incorrect response?
- Ensure PDFs contain selectable text (scanned images won’t work)
- Increase chunk size if responses lack context

Performance issues?
- Reduce chunk size for faster processing
- Use tinyllama instead of llama2 for quicker inference
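
As an illustration of what swapping the model involves, the sketch below builds a request for Ollama's local REST endpoint (`/api/generate` on port 11434, Ollama's default) using only the standard library. This is an assumption about one way to talk to Ollama, not the code in `llm_inference.py`, which according to the technology list goes through LangChain. The helper names `build_prompt`, `build_request`, and `ask_ollama`, as well as the prompt wording, are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user's question (hypothetical template)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def build_request(prompt: str, model: str = "llama2") -> urllib.request.Request:
    """Build a non-streaming generate request for the given model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt: str, model: str = "llama2", timeout: float = 120) -> str:
    """Send the request; requires a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Switching to the smaller model only changes the `model` field:
request = build_request(build_prompt("What is the refund policy?", ["chunk A", "chunk B"]),
                        model="tinyllama")
print(json.loads(request.data)["model"])
# prints tinyllama
```

The model has to be available locally before the first call, for example by running `ollama pull tinyllama`.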

Future Enhancements

✅ Add support for PDFs with scanned text (OCR)
✅ Implement multi-document summarization
✅ Improve LLM response formatting and citations

💡 Credits & License

Developed by Swagath Babu. Free for personal and research use.
