This project implements a PDF question-answering system using LangChain, Ollama, and Streamlit. Users upload PDF documents and ask questions about their content, receiving AI-powered answers grounded in the uploaded document.
## Why Ollama?

This project uses Ollama instead of cloud-based APIs for several key advantages:

- **Privacy & Security**
  - All processing happens locally on your machine
  - No data is sent to external servers
  - Complete control over your documents and queries
- **Cost-Effectiveness**
  - No API usage fees or subscription costs
  - No token-based pricing
  - Free to use without limitations
- **Offline Capability**
  - Works without an internet connection
  - No dependency on external services
  - Consistent performance regardless of network status
- **Customization**
  - Ability to run different models locally
  - Possibility of custom model fine-tuning
  - Full control over model parameters
- **Performance**
  - Lower latency, since everything runs locally
  - No rate limiting or API quotas
  - Consistent response times
## Features

- PDF document upload and processing
- Document chunking and vectorization using GPT4All embeddings
- Question answering with the Llama2 model through Ollama
- Interactive web interface built with Streamlit
- History of previous queries and answers
- Concise, accurate responses
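The chunking step above can be sketched in plain Python. This is a simplified stand-in for LangChain's text splitter, assuming fixed-size chunks with a small overlap so that context straddling a boundary still appears intact in at least one chunk:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:  # skip a possible empty tail
            chunks.append(chunk)
    return chunks
```

The overlap trades a little storage for better retrieval: a sentence cut in half at a chunk boundary is still fully contained in the neighboring chunk.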
## Prerequisites

- Python 3.8 or higher
- Ollama installed locally
- A virtual environment (recommended)
## Installation

1. Clone the repository:

   ```bash
   git clone <your-repository-url>
   cd <repository-name>
   ```

2. Create and activate a virtual environment:

   For macOS/Linux:

   ```bash
   python -m venv env
   source env/bin/activate
   ```

   For Windows:

   ```bash
   python -m venv env
   .\env\Scripts\activate
   ```

3. Install Ollama:

   - Visit Ollama's official website and download the appropriate version for your OS
   - Follow the installation instructions for your platform
   - Pull the Llama2 model:

     ```bash
     ollama pull llama2:7b
     ```

4. Install project dependencies:

   ```bash
   pip install -r requirements.txt
   ```

## Usage

1. Start the Streamlit application:

   ```bash
   streamlit run app.py
   ```

2. Open your web browser and navigate to the URL shown in the terminal (typically http://localhost:8501)

3. Upload a PDF file using the file uploader

4. Click "Process PDF" to initialize the document processing

5. Enter your questions in the query input field and click "Submit Query"

6. View the answers and previous conversation history below
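Behind the query step, answering a question means embedding it and retrieving the most similar chunks from the vector store before they are passed to the model. A minimal stand-in for that lookup, assuming embeddings are plain lists of floats (the real app uses GPT4All embeddings and a LangChain vector store):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunk texts whose embeddings are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

A real vector store uses an indexed search rather than a full sort, but the ranking criterion is the same.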
## Project Structure

- `app.py`: Main Streamlit application
- `requirements.txt`: Project dependencies
- `uploaded_file.pdf`: Temporary storage for uploaded PDFs
- `my_vectorstore.pkl`: Vector store for document embeddings
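`my_vectorstore.pkl` holds the pickled vector store, so a processed PDF does not need re-embedding on every run. In outline, the save/load logic might look like this (a sketch with a plain dict standing in for the real store object):

```python
import os
import pickle

STORE_PATH = "my_vectorstore.pkl"

def save_store(store, path: str = STORE_PATH) -> None:
    # Persist the embeddings so the PDF is only processed once.
    with open(path, "wb") as f:
        pickle.dump(store, f)

def load_store(path: str = STORE_PATH):
    # Reuse a previously built store if one exists; None signals a rebuild.
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return None
```

Note that pickle files are tied to the Python objects that produced them; upgrading dependencies may require deleting the file and reprocessing the PDF.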
## Future Enhancements

- **Enhanced Document Processing**
  - Support for multiple document formats (DOCX, TXT, etc.)
  - Batch processing of multiple documents
  - Improved text chunking strategies
- **Advanced Features**
  - Document summarization
  - Key-point extraction
  - Citation tracking for answers
  - Support for images and tables in documents
- **UI/UX Improvements**
  - Dark mode support
  - Better error handling and user feedback
  - Export of conversation history
  - Customizable response length and style
- **Performance Optimization**
  - Caching mechanisms for faster responses
  - Optimized vector storage
  - Support for larger documents
- **Model Enhancements**
  - Support for multiple LLM models
  - Fine-tuning capabilities
  - Custom model integration
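The planned response caching could be as simple as memoizing answers per (document, query) pair with the standard library. A sketch, using a counting stub where the real app would call the Ollama-backed QA chain:

```python
from functools import lru_cache

CALLS = {"count": 0}  # tracks how often the expensive path actually runs

@lru_cache(maxsize=128)
def cached_answer(doc_id: str, query: str) -> str:
    """Memoize answers per (document, query) pair.

    In the app, the body would invoke the QA chain; repeated identical
    queries against the same document are then served from the cache.
    """
    CALLS["count"] += 1  # stand-in for the expensive model call
    return f"answer for {query!r} on {doc_id}"
```

Keying on the document as well as the query matters: the same question against a different PDF must not hit a stale cache entry.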
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.