An intelligent chatbot powered by LangChain and RAG (Retrieval Augmented Generation) that can answer questions about YouTube video content. Simply provide a YouTube URL, and chat with the video's transcript using AI!
## Features

- **YouTube Transcript Extraction**: Automatically fetches and processes video transcripts
- **RAG Architecture**: Uses Retrieval Augmented Generation for accurate responses
- **Natural Conversation**: Chat naturally about video content
- **Context-Aware**: Maintains conversation context for follow-up questions
- **Fast Retrieval**: Vector-based semantic search for relevant information
- **Interactive Notebook**: Easy-to-use Jupyter notebook interface
## Table of Contents

- How It Works
- Prerequisites
- Installation
- Usage
- Architecture
- Configuration
- Examples
- Troubleshooting
- Contributing
## How It Works

The chatbot uses a Retrieval Augmented Generation (RAG) pipeline:
1. **Transcript Extraction**: Downloads the YouTube video transcript
2. **Text Chunking**: Splits the transcript into manageable chunks
3. **Embedding Generation**: Converts text chunks into vector embeddings
4. **Vector Storage**: Stores embeddings in a vector database
5. **Query Processing**: Converts user questions into embeddings
6. **Semantic Search**: Finds the most relevant chunks from the transcript
7. **Response Generation**: Uses an LLM to generate answers based on the retrieved context
```
YouTube URL → Transcript → Chunks → Embeddings → Vector DB
                                                     ↓
User Question → Query Embedding → Semantic Search → Context
                                                     ↓
                                               LLM → Answer
```
## Prerequisites

Before you begin, ensure you have:
- Python 3.8 or higher
- Jupyter Notebook or JupyterLab
- API keys for:
  - OpenAI API (or other LLM provider)
  - YouTube Data API (optional, for enhanced features)
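Before diving in, you can sanity-check the environment with a short snippet (a minimal sketch; the `OPENAI_API_KEY` variable name matches the setup described below):

```python
# Quick environment check: Python version and API key visibility.
import os
import sys

assert sys.version_info >= (3, 8), "Python 3.8 or higher is required"
print("OPENAI_API_KEY set:", bool(os.getenv("OPENAI_API_KEY")))
```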
## Installation

1. Clone the repository:

```bash
git clone https://github.com/Devatva24/Youtube_Chatbot.git
cd Youtube_Chatbot
```

2. Create and activate a virtual environment:

```bash
python -m venv venv

# On Windows:
venv\Scripts\activate

# On macOS/Linux:
source venv/bin/activate
```

3. Install the dependencies:

```bash
pip install langchain
pip install langchain-openai
pip install youtube-transcript-api
pip install chromadb
pip install tiktoken
pip install openai
pip install jupyter
```

Or create a `requirements.txt`:
```
langchain>=0.1.0
langchain-openai>=0.0.2
youtube-transcript-api>=0.6.1
chromadb>=0.4.0
tiktoken>=0.5.1
openai>=1.0.0
jupyter>=1.0.0
```

Then install:

```bash
pip install -r requirements.txt
```

Create a `.env` file in the project root:
```
OPENAI_API_KEY=your_openai_api_key_here
```

Or set the environment variable directly in the notebook:

```python
import os
os.environ["OPENAI_API_KEY"] = "your_api_key_here"
```
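If you would rather have the notebook pick up the key from `.env` automatically, the `python-dotenv` package can do it (an extra dependency, not in the list above):

```python
# Load OPENAI_API_KEY from the .env file in the project root.
# Assumes: pip install python-dotenv (not in the dependency list above).
from dotenv import load_dotenv

load_dotenv()
```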
## Usage

1. **Launch Jupyter Notebook**

```bash
jupyter notebook
```

2. **Open the Notebook**

Navigate to `rag_using_langchain.ipynb`.

3. **Run the Cells**

Follow these steps in the notebook:
```python
# Step 1: Import libraries
from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Step 2: Load the YouTube video
video_url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
loader = YoutubeLoader.from_youtube_url(video_url)
transcript = loader.load()

# Step 3: Split the text into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
docs = text_splitter.split_documents(transcript)

# Step 4: Create embeddings and the vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)

# Step 5: Create the QA chain
llm = ChatOpenAI(temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever()
)

# Step 6: Ask questions!
question = "What is the main topic of this video?"
answer = qa_chain.run(question)
print(answer)
```
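These import paths match older LangChain releases; on LangChain 0.1+ they still work but raise deprecation warnings. The maintained equivalents keep the same class names (a drop-in sketch):

```python
# Equivalent imports on LangChain 0.1+.
from langchain_community.document_loaders import YoutubeLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
```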
## Architecture

### Components

1. **Document Loader**
   - `YoutubeLoader`: Fetches video transcripts from YouTube
2. **Text Splitter**
   - `RecursiveCharacterTextSplitter`: Intelligently splits text into chunks
   - Chunk size: 1000 characters
   - Chunk overlap: 200 characters
3. **Embeddings**
   - `OpenAIEmbeddings`: Converts text to vector representations
   - Model: `text-embedding-ada-002`
4. **Vector Store**
   - `Chroma`: Stores and retrieves embeddings
   - In-memory or persistent storage options
5. **Language Model**
   - `ChatOpenAI`: Generates responses
   - Model: GPT-3.5-turbo or GPT-4
6. **Retrieval Chain**
   - `RetrievalQA`: Combines retrieval and generation
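To see exactly what the retrieval step hands to the language model, you can query the retriever directly; a minimal sketch, assuming the `vectorstore` built in the usage example above:

```python
# Fetch the top-k transcript chunks for a query, skipping the LLM step.
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
docs = retriever.get_relevant_documents("What is the main topic?")
for i, doc in enumerate(docs, start=1):
    print(f"--- Chunk {i} ---")
    print(doc.page_content[:200])
```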
### Data Flow

```
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ YouTube URL  │────▶│  Transcript  │────▶│    Chunks    │
│              │     │  Extraction  │     │  (1000 chr)  │
└──────────────┘     └──────────────┘     └──────────────┘
                                                  │
                                                  ▼
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│    Answer    │◀────│  LLM + RAG   │◀────│  Embeddings  │
│              │     │   Pipeline   │     │  & VectorDB  │
└──────────────┘     └──────────────┘     └──────────────┘
                            ▲
                            │
                    ┌──────────────┐
                    │User Question │
                    └──────────────┘
```
## Configuration

### Chunk Settings

```python
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1500,    # Larger chunks for more context
    chunk_overlap=300   # More overlap for continuity
)
```

### Model Selection

```python
# Use GPT-4 for better responses
llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Use GPT-3.5-turbo for faster, cheaper responses
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
```

### Retrieval Settings

```python
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(
        search_kwargs={"k": 5}  # Return the top 5 most relevant chunks
    )
)
```

### Vector Store Options

```python
# Persistent Chroma storage
vectorstore = Chroma.from_documents(
    docs,
    embeddings,
    persist_directory="./chroma_db"
)

# Or use FAISS (requires the faiss-cpu package)
from langchain.vectorstores import FAISS
vectorstore = FAISS.from_documents(docs, embeddings)
```
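When Chroma is persisted to disk, the store can be reopened later without re-downloading or re-embedding the transcript; a minimal sketch, assuming the `./chroma_db` directory and `embeddings` object from above:

```python
# Reopen a persisted Chroma store; no re-processing required.
vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)
```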
## Examples

### Example 1: Educational Video

```python
video_url = "https://www.youtube.com/watch?v=educational_video"
# Sample questions:
questions = [
    "What is the main concept explained in this video?",
    "Can you summarize the key points?",
    "What examples were given?",
    "What are the practical applications?",
]

for question in questions:
    answer = qa_chain.run(question)
    print(f"Q: {question}")
    print(f"A: {answer}\n")
```

### Example 2: Tutorial Video

```python
video_url = "https://www.youtube.com/watch?v=tutorial_video"
# Step-by-step questions:
print(qa_chain.run("What tools are needed?"))
print(qa_chain.run("What is step 1?"))
print(qa_chain.run("What are common mistakes to avoid?"))video_url = "https://www.youtube.com/watch?v=podcast_video"
# Extract insights:
print(qa_chain.run("Who are the speakers?"))
print(qa_chain.run("What are the main topics discussed?"))
print(qa_chain.run("What interesting stories were shared?"))Solution:
## Troubleshooting

**Transcript not available**

Solution:
- The video may not have captions/subtitles
- Try videos with auto-generated or manual captions
- Check if video is public and accessible
**OpenAI API key error**

Solution:

```python
import os
os.environ["OPENAI_API_KEY"] = "your-actual-api-key"
```

**Rate limit errors**

Solution:
- Wait a few minutes before retrying (or automate this; see the backoff sketch after this list)
- Reduce the frequency of requests
- Consider upgrading your OpenAI plan
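For transient rate-limit errors, wrapping the call in exponential backoff usually suffices; a minimal sketch using the `tenacity` package (an extra dependency, not in the list above):

```python
# Retry qa_chain.run with exponential backoff on transient failures.
# Assumes: pip install tenacity (not in the dependency list above).
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(wait=wait_exponential(min=1, max=60), stop=stop_after_attempt(5))
def ask(question: str) -> str:
    return qa_chain.run(question)
```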
**Out of memory on long videos**

Solution:
- Reduce chunk size
- Process shorter videos
- Use persistent vector store instead of in-memory
**Poor answer quality**

Solution:
- Increase chunk overlap for better context
- Adjust retrieval parameters (increase the k value, or try MMR; see the sketch after this list)
- Use a more powerful model (GPT-4)
- Improve question phrasing
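Beyond raising the k value, LangChain retrievers also support maximal marginal relevance (MMR) search, which favors diverse rather than redundant chunks; a minimal sketch with the same `vectorstore`:

```python
# MMR: fetch 20 candidate chunks, return the 5 most relevant yet diverse.
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 5, "fetch_k": 20},
)
```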
## Performance Tips

- **Chunk Size**: Balance between context and performance
  - Smaller chunks (500-800): Better for specific questions
  - Larger chunks (1000-1500): Better for broader questions
- **Overlap**: Ensures continuity between chunks
  - Recommended: 10-20% of chunk size
- **Model Selection**:
  - GPT-3.5-turbo: Fast and cost-effective
  - GPT-4: More accurate but slower and more expensive
- **Caching**: Persist the vector store (see Configuration) to avoid re-processing
## Contributing

Contributions are welcome! Here's how you can help:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

Ideas for contributions:
- Add support for multiple videos
- Create a web interface with Streamlit/Gradio
- Add support for different languages
- Implement conversation memory
- Add video timestamp citations
- Support for YouTube playlists
- Export chat history
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- LangChain - Amazing framework for LLM applications
- OpenAI - Powerful language models
- YouTube Transcript API - Easy transcript extraction
## Author

**Devatva Rachit**

- GitHub: [@Devatva24](https://github.com/Devatva24)
- Project Link: [YouTube Chatbot](https://github.com/Devatva24/Youtube_Chatbot)
Give a ⭐️ if this project helped you!
## Roadmap

- Multi-language support
- Video timestamp references
- Streamlit web interface
- Support for multiple LLM providers
- Conversation history export
- Video summary generation
- Topic extraction and tagging
Built with 🧠 and LangChain