Automatically generate accurate, meaningful chapter markers, titles, and descriptions for long-form videos using Whisper ASR, advanced NLP topic segmentation, and scene/transition detection.
- Automatic Speech Recognition: Uses OpenAI Whisper/Faster-Whisper for high-accuracy audio transcription; supports multiple languages.
- NLP Topic Segmentation: Segments video content into logical chapters using embeddings, semantic similarity, clustering, and topic modeling.
- Scene/Transition Detection: Optionally uses PySceneDetect/OpenCV for visual boundary refinement.
- Export-Ready Chapters: Outputs:
  - YouTube timestamp chapters
  - SRT and VTT subtitles
  - JSON metadata (with timestamps, titles, descriptions)
  - EDL, XML, and other NLE/editor marker files
- High Performance & Scalability: Fast processing using GPU (if available), async API, Docker, and horizontal scaling.
- REST API: FastAPI-powered endpoints for automation and easy integration.
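As a rough illustration of how the segmentation and scene-refinement features could fit together, the sketch below places a chapter boundary wherever the cosine similarity between consecutive sentence embeddings drops below a threshold, then snaps each boundary to a nearby scene cut. It is a minimal sketch and not the project's actual implementation: the function names, the `threshold` and `max_shift` parameters, and the toy 2-D vectors (standing in for Sentence-BERT embeddings) are all assumptions made for this example.

```python
import math
from bisect import bisect_left


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_boundaries(embeddings, starts, threshold=0.5):
    """Return start times of sentences whose similarity to the previous
    sentence falls below `threshold` (a hypothetical tuning knob)."""
    boundaries = [starts[0]]  # the first chapter always opens at the first sentence
    for i in range(1, len(embeddings)):
        if cosine(embeddings[i - 1], embeddings[i]) < threshold:
            boundaries.append(starts[i])
    return boundaries


def snap_to_cuts(boundaries, cuts, max_shift=2.0):
    """Move each boundary to the nearest scene cut if one lies within
    `max_shift` seconds; `cuts` must be sorted ascending."""
    snapped = []
    for t in boundaries:
        i = bisect_left(cuts, t)
        nearby = cuts[max(i - 1, 0):i + 1]  # the cut before t and the cut at/after t
        best = min(nearby, key=lambda c: abs(c - t), default=t)
        snapped.append(best if abs(best - t) <= max_shift else t)
    return snapped


# Toy 2-D vectors standing in for real sentence embeddings (assumption).
embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
starts = [0.0, 12.5, 134.2, 150.0]  # sentence start times in seconds
cuts = [0.0, 135.0, 300.0]          # scene cut times in seconds

boundaries = find_boundaries(embeddings, starts)
chapters = snap_to_cuts(boundaries, cuts)
print(chapters)
```

In this toy input the topic shifts at the third sentence (134.2 s), and the boundary moves to the scene cut at 135.0 s because it lies within the two-second window. A fixed threshold is the simplest choice; the clustering and topic-modeling approaches named above would replace `find_boundaries` rather than `snap_to_cuts`.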
video-chapter-generator/
├── src/
│ ├── audio_extraction/
│ ├── transcription/
│ ├── segmentation/
│ ├── scene_detection/
│ ├── chapter_generation/
│ ├── export/
│ └── api/
├── config/
├── tests/
├── scripts/
├── data/
├── docs/
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── setup.py
├── .env.example
└── README.md
- Python: 3.8+
- FFmpeg: System dependency for audio/video processing
- Docker/Docker Compose (for deployment, optional)
- NVIDIA GPU (optional, for speedup)
git clone https://github.com/yourusername/video-chapter-generator.git
cd video-chapter-generator
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Download a Whisper model (first-run only)
python -c "import whisper; whisper.load_model('base')"
python scripts/process_video.py --input myvideo.mp4 --output-dir data/output/
uvicorn src.api.main:app --reload
# Visit http://localhost:8000/docs for the OpenAPI UI
docker-compose up -d
# FastAPI service at http://localhost:8000
- YouTube format (for direct copy-paste in description)
- .srt/.vtt subtitle files
- .json chapter metadata
- .edl, .xml marker files for NLEs
- SEO-optimized text and optional thumbnails/descriptions
- POST /generate-chapters: Upload a video and generate chapter files.
- GET /download/{job_id}/{format}: Download output in chosen format.
- GET /health: Service status.
See the /docs endpoint for the full interactive API reference.
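For scripted use, a small client helper can build the download URL and poll the health endpoint. The sketch below uses only the Python standard library and the routes listed above. The job ID `abc123` and the format value `srt` are hypothetical placeholders, and the assumption that /health returns JSON is not confirmed here; the interactive docs are the authority on request and response schemas.

```python
import json
from urllib.parse import quote, urljoin
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # default local address of the FastAPI service


def download_url(job_id, fmt, base_url=BASE_URL):
    """Build the URL for GET /download/{job_id}/{format}."""
    path = f"download/{quote(job_id, safe='')}/{quote(fmt, safe='')}"
    return urljoin(base_url + "/", path)


def check_health(base_url=BASE_URL, timeout=5):
    """Call GET /health and return the decoded body.
    Assumes the service answers with JSON (not confirmed by this README)."""
    with urlopen(urljoin(base_url + "/", "health"), timeout=timeout) as resp:
        return json.load(resp)


# "abc123" and "srt" are placeholder values for illustration only.
print(download_url("abc123", "srt"))
```

`check_health()` is defined but not called, so the snippet runs without a live server; call it once the service is up.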
00:00 - Introduction
02:15 - Key Concept 1
05:40 - Case Study
09:55 - Conclusion
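The YouTube format above can be produced from a list of chapter start times and titles. The following is a minimal sketch, assuming chapters arrive as `(start_seconds, title)` pairs; the function names are illustrative and not taken from the project's export module.

```python
def format_timestamp(seconds):
    """Format seconds as MM:SS, or H:MM:SS once the video passes one hour."""
    seconds = int(seconds)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def youtube_chapters(chapters):
    """Render (start_seconds, title) pairs as one line per chapter."""
    return "\n".join(
        f"{format_timestamp(start)} - {title}" for start, title in chapters
    )


chapters = [
    (0, "Introduction"),
    (135, "Key Concept 1"),
    (340, "Case Study"),
    (595, "Conclusion"),
]
print(youtube_chapters(chapters))
```

Run on these four pairs, the sketch reproduces the sample listing shown above. YouTube expects the first chapter to start at 00:00, so a real exporter would likely validate that before writing the description text.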
Run all tests with:
pytest
Test coverage includes unit tests for all core modules and integration tests for the full pipeline and API.
- Local: Use the provided Dockerfile and docker-compose.yml for ease of deployment.
- Cloud/Kubernetes: Ready for container orchestration (EKS, GKE, AKS). Add scaling and monitoring as needed.
- See the included PDF: Video Chapter Generation – Complete Implementation Guide for full code, diagrams, and instructions.
- Sample output files and API usage examples are included in the examples/ folder.
- ASR: OpenAI Whisper
- Embeddings: Sentence-BERT
- Scene Detection: PySceneDetect
- API: FastAPI
MIT (see LICENSE for details)
Enhance your video content—automate logical, discoverable, and user-friendly chapters for every video, at production scale!