This project is a modular pipeline for retrieval-augmented generation (RAG) built on Discord, FastAPI, ChromaDB, Ollama, and n8n. It captures Discord messages, embeds and stores them in a vector database, and generates context-aware responses with LLMs.
- Discord Bot (`BotListenDiscord.py`): Listens to Discord messages, enqueues them, and sends data to a local webhook for further processing.
- Queue Dispatcher (`QueueDispatcher.py`): Manages a message queue, dispatches save/query commands to webhooks, and integrates with the Discord bot.
- ChromaDB Integration (`ChromaDBSaver.py`, `UnusedSaver.py`, `AlternateSaverFinished.py`, `FastAPIChromaSaver.py`): Handles embedding generation (via Ollama or HuggingFace), persistent storage, retrieval, and reset operations using ChromaDB.
- FastAPI Services: Expose REST endpoints for saving, viewing, querying, and resetting stored messages and embeddings.
- Ollama: Provides local LLM and embedding model inference (e.g., for generating embeddings or answers).
- n8n: Orchestrates webhooks and automates message flow between Discord, FastAPI, and Ollama.
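The enqueue-and-dispatch pattern behind the bot and `QueueDispatcher.py` can be sketched with Python's standard library. This is an illustrative sketch, not the project's actual code: the webhook URL is an assumed n8n address, and the HTTP call is stubbed out so the example runs without a network (a real version would `requests.post` the payload).

```python
import json
import queue
import threading

SAVE_WEBHOOK = "http://localhost:5678/webhook/save"  # assumed n8n webhook URL

sent = []  # records what the stubbed webhook received

def post_to_webhook(url, payload):
    # Stand-in for: requests.post(url, json=payload, timeout=5)
    sent.append((url, json.dumps(payload)))

def dispatch_loop(q):
    # Drain the queue, forwarding each message until a None sentinel arrives.
    while True:
        msg = q.get()
        if msg is None:
            break
        post_to_webhook(SAVE_WEBHOOK, {"author": msg["author"], "content": msg["content"]})
        q.task_done()

q = queue.Queue()
worker = threading.Thread(target=dispatch_loop, args=(q,), daemon=True)
worker.start()

# The Discord bot would call q.put(...) from its on_message handler.
q.put({"author": "alice", "content": "hello from Discord"})
q.put({"author": "bob", "content": "what is RAG?"})
q.put(None)  # sentinel: stop the worker
worker.join()
```

Decoupling the bot from the webhook through a queue keeps the Discord event loop responsive even when the downstream services are slow.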
- Discord messages are captured and sent to a webhook (via n8n).
- Messages are embedded (using Ollama or HuggingFace models) and stored in ChromaDB.
- FastAPI endpoints allow saving new data, querying for relevant context, and generating LLM-based answers.
- Responses can be sent back to Discord via n8n.
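The embed-store-query loop above can be sketched with a toy in-memory vector store. In the real pipeline, embeddings come from Ollama or a HuggingFace model and live in a ChromaDB collection; here a bag-of-words vector stands in for the embedding so the retrieval flow is runnable without external services.

```python
import math
from collections import Counter

def embed(text):
    # Placeholder embedding: word counts instead of a neural model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

store = []  # (text, vector) pairs; a ChromaDB collection in the real pipeline

def save(text):
    store.append((text, embed(text)))

def query(text, k=1):
    # Rank stored messages by similarity to the query and return the top k.
    qv = embed(text)
    ranked = sorted(store, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [t for t, _ in ranked[:k]]

save("ollama runs local llms")
save("discord bots listen to messages")
context = query("which service runs local llms?")
print(context[0])  # -> "ollama runs local llms"
```

The retrieved context would then be prepended to the user's question in the prompt sent to the LLM, which is the core of the RAG pattern.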
- Python 3.8+
- Docker (for Ollama)
- n8n (for workflow automation)
- ChromaDB, FastAPI, Discord.py, requests, sentence-transformers, transformers, etc.
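The Python dependencies above might be installed in one step as follows; this is an assumed package set (uvicorn is added here as the usual ASGI server for FastAPI), so pin exact versions for a reproducible setup.

```shell
# Assumed dependency set for this pipeline; adjust to your environment.
python3 -m pip install chromadb fastapi "uvicorn[standard]" discord.py requests sentence-transformers transformers
```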
- Start Ollama (Docker) and n8n.
- Run the FastAPI and Discord bot scripts as needed.
- Configure webhooks and endpoints in n8n to connect all services.
- Interact with the Discord bot; messages will be processed, stored, and can be queried with LLM-powered responses.
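The startup steps above might look like the following; the container names, ports, and script entry points are assumptions and may differ in your setup.

```shell
# Start Ollama in Docker (serves LLM/embedding inference on port 11434).
docker run -d --name ollama -p 11434:11434 -v ollama:/root/.ollama ollama/ollama

# Start n8n (webhook orchestration UI on port 5678).
docker run -d --name n8n -p 5678:5678 docker.n8n.io/n8nio/n8n

# Run the FastAPI service and the Discord bot (bot token required).
python3 FastAPIChromaSaver.py
python3 BotListenDiscord.py
```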
This project is intended for research and experimentation with RAG pipelines and LLM integration in real-time chat environments.