A powerful video analysis and retrieval system that processes videos to extract meaningful information and enables natural language queries to find relevant video segments.
- Video Processing: Extract frames, audio, and generate video clips
- Object Detection: Identify objects in video frames using YOLOv8
- Audio Transcription: Convert speech to text using OpenAI's Whisper
- OCR (Optical Character Recognition): Extract text from video frames using Tesseract
- Semantic Search: Find relevant video segments using natural language queries
- Web Interface: User-friendly Gradio interface for easy interaction
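The features above feed a single indexing pipeline: the object labels, transcript text, and OCR text that overlap a time window can be merged into one timestamped text document, which is later embedded and stored in the FAISS index. A minimal sketch of such a record (the field names and layout here are illustrative assumptions, not the project's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class SegmentRecord:
    """One timestamped chunk of video metadata (hypothetical layout)."""
    start: float                                 # segment start time, seconds
    end: float                                   # segment end time, seconds
    objects: list = field(default_factory=list)  # YOLO labels seen in frames
    transcript: str = ""                         # Whisper text in this window
    ocr_text: str = ""                           # Tesseract text from frames

    def to_document(self) -> str:
        """Flatten the record into one text blob for embedding/indexing."""
        parts = []
        if self.objects:
            parts.append("objects: " + ", ".join(sorted(set(self.objects))))
        if self.transcript:
            parts.append("speech: " + self.transcript)
        if self.ocr_text:
            parts.append("on-screen text: " + self.ocr_text)
        return " | ".join(parts)

seg = SegmentRecord(0.0, 5.0, objects=["person", "car", "person"],
                    transcript="welcome to the demo", ocr_text="INTRO")
print(seg.to_document())
# → objects: car, person | speech: welcome to the demo | on-screen text: INTRO
```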
- Python 3.8+
- FFmpeg (for audio/video processing)
- Tesseract OCR (for text recognition)
- Clone the repository:

  ```shell
  git clone <repository-url>
  cd ANN-Video-RAG
  ```
- Create and activate a virtual environment:

  ```shell
  python -m venv venv
  .\venv\Scripts\activate   # On Windows
  source venv/bin/activate  # On Linux/Mac
  ```
- Install the required packages:

  ```shell
  pip install -r requirements.txt
  ```
- Install system dependencies:
  - FFmpeg: Download from ffmpeg.org and add to PATH
  - Tesseract OCR: Download from GitHub - tesseract-ocr/tesseract and add to PATH
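After adding FFmpeg and Tesseract to PATH, it can help to confirm they are actually reachable before processing a video. A small check like the following (a generic sketch, not part of the project's scripts) works in any POSIX shell:

```shell
# check_tool prints whether a command is available on PATH
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: NOT found - install it and add to PATH"
  fi
}

check_tool ffmpeg
check_tool tesseract
```

On Windows, `where ffmpeg` and `where tesseract` in Command Prompt serve the same purpose.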
- Run the application:

  ```shell
  python main.py
  ```
- Open your web browser and go to http://127.0.0.1:7860
- Upload a video file and click "Process Video"
- Once processed, enter natural language queries to find relevant video segments
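Query matching works by embedding both the query and each indexed segment as vectors and taking nearest neighbors; the project uses FAISS for the neighbor search. The sketch below substitutes a tiny bag-of-words cosine similarity so it runs with no model downloads — the tokenizer, scoring, and segment texts are illustrative, not the app's actual embedding:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, segments: dict) -> str:
    """Return the id of the segment whose text best matches the query."""
    q = embed(query)
    return max(segments, key=lambda sid: cosine(q, embed(segments[sid])))

segments = {
    "00:00-00:05": "speech: welcome to the demo | objects: person",
    "00:05-00:10": "objects: car, road | on-screen text: highway exit",
    "00:10-00:15": "speech: thanks for watching | objects: person",
}
print(search("car on the highway", segments))  # → 00:05-00:10
```

In the real app, the matched segment's timestamps are then used to cut a clip from the source video.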
- main.py: Main application code
- requirements.txt: Python dependencies
- uploads/: Directory for uploaded videos
- processed/: Directory for processed data and FAISS index
- clips/: Directory for generated video clips
This project is licensed under the MIT License - see the LICENSE file for details.
- YOLOv8 for object detection
- OpenAI Whisper for speech recognition
- Tesseract OCR for text recognition
- FAISS for efficient similarity search
- Gradio for the web interface