RAG Tutor is a Streamlit-based Retrieval-Augmented Generation (RAG) application designed to provide personalized tutoring and mentoring based on your uploaded documents. This system processes documents like PDFs and DOCX files, retrieves relevant content, and uses a Large Language Model (LLM) to generate accurate, contextual answers and explanations.
- Multi-Document Support: Easily upload and process `.pdf`, `.docx`, and `.txt` files.
- Contextual Q&A: The RAG pipeline grounds all answers strictly in the content of your documents, minimizing LLM hallucination.
- Intuitive UI: Built with Streamlit for a simple, interactive, and fast chat interface.
- Vector Database: Uses ChromaDB to efficiently store and retrieve document embeddings (vector representations) for rapid similarity search.
- LangChain Orchestration: Utilizes the LangChain framework to manage the entire RAG workflow: document loading, chunking, embedding, retrieval, and response generation.
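The workflow stages listed above (chunking, embedding, retrieval, prompt assembly) can be sketched in plain Python. This is only an illustration of the idea, not the project's code and not the LangChain or ChromaDB API: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and every function name and default value here is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy 'embedding': word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble a grounded prompt to send to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

In a real pipeline the vector store would hold precomputed embeddings rather than re-embedding every chunk per query; the sketch recomputes them only to stay short.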
| Component | Technology | Role |
|---|---|---|
| Frontend/UI | Streamlit | Interactive web application interface. |
| Orchestration | LangChain | Framework connecting the LLM, documents, and vector store. |
| Vector DB | ChromaDB | Stores document embeddings for efficient retrieval. |
| Document Parsing | pypdf, python-docx | Libraries to read and extract text from various file formats. |
| Backend | Python 3.8+ | Core programming language. |
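As a rough illustration of the Document Parsing row, the sketch below dispatches on file extension and extracts plain text. It assumes `pypdf` and `python-docx` are installed for the PDF and DOCX branches; the function name `load_text` is hypothetical and is not taken from this repository.

```python
from pathlib import Path


def load_text(path):
    """Return the plain text of a .txt, .pdf, or .docx file."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    if suffix == ".pdf":
        # Imported lazily so .txt files work without pypdf installed.
        from pypdf import PdfReader
        reader = PdfReader(str(path))
        # extract_text() can yield nothing for image-only pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        # The python-docx package is imported under the name "docx".
        import docx
        document = docx.Document(str(path))
        return "\n".join(p.text for p in document.paragraphs)
    raise ValueError(f"Unsupported file type: {suffix}")
```

Scanned PDFs without a text layer will come back empty from this approach and would need OCR, which is outside the scope of the sketch.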
- Python: Ensure you have Python 3.8+ installed.
- API Key: An API key for your chosen Large Language Model (in our case, a Google Gemini API key).
- Clone the Repository: `git clone [YOUR-REPO-URL]`, then `cd rag-tutor-system`.
- Create and Activate a Virtual Environment (Recommended): `python -m venv venv`, then activate it with `source venv/bin/activate` on Linux/macOS or `.\venv\Scripts\activate` on Windows.
- Install Dependencies: You'll need the `requirements.txt` file (see the next section) to install all necessary packages: `pip install -r requirements.txt`
- Set API Key: Set your LLM API key as an environment variable in `./rag_tutor/generator.py` (`GEMINI_API_KEY`): `os.environ["GOOGLE_API_KEY"] = "********YOUR-API-KEY********"`
- Run the Application: `streamlit run app.py` (replace `app.py` with your main Streamlit file).
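For the API key step, one possible alternative to writing the key into a source file is to read it from the shell environment at startup. The sketch below is an assumption about how that could look, not the project's implementation: the helper name `configure_api_key` is hypothetical, and because the instructions mention both `GEMINI_API_KEY` and `GOOGLE_API_KEY`, it accepts either and mirrors the value into `GOOGLE_API_KEY`.

```python
import os


def configure_api_key(env=os.environ):
    """Find the LLM API key in the environment and expose it as GOOGLE_API_KEY."""
    key = env.get("GOOGLE_API_KEY") or env.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: set GOOGLE_API_KEY or GEMINI_API_KEY "
            "before starting the app."
        )
    # Mirror the key under the name used in the setup instructions.
    env["GOOGLE_API_KEY"] = key
    return key
```

Calling `configure_api_key()` once before the LLM client is created keeps the secret out of version control and fails early with a readable message when the key is missing.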