A powerful command-line tool to fetch video metadata, thumbnails, and download videos from YouTube channels using the YouTube Data API v3 and yt-dlp.
- Comprehensive Data Fetching: Retrieves video metadata (title, description, published date, duration, thumbnails) for channels.
- Flexible Channel Identification: Supports identifying channels by their unique Channel ID (`UC...`), modern `@handle`, or legacy username.
- Local Data Storage: All channel and video metadata is stored persistently in a local SQLite database (`yt.db`).
- Organized Thumbnail Management: Downloads and saves thumbnails into structured folders (e.g., `thumbnails/@channel_handle/`).
- Video Downloading with Presets: Download videos in various quality presets (`best`, `normal` (up to 720p), `small` (up to 360p)).
- Structured Video Output: Videos are saved in an organized directory structure (e.g., `videos/@channel_handle/`) with detailed filenames (`YYMMDD_video_name_video_duration_videoid_preset.mp4`).
- In-line Download Progress: Displays real-time download progress on a single line in your terminal.
- Cutoff Date Filtering: When fetching, optionally exclude videos published before a specified date.
- Force Fetch/Redownload: Option to force re-fetching all metadata and re-downloading thumbnails.
- API Quota Monitoring: Includes commands to test your API key and get a "best-effort" API quota status check to help you manage your API usage.
- Detailed Logging: Tracks fetch operations, downloads, and errors in `ytextractor.log`. Logging level is configurable via environment variable.
- User-Friendly CLI: Provides straightforward commands via `typer`.
- Secure API Key Handling: Configurable API key via a `.env` file for secure environment variable management.
- Automated Versioning: Integrates `commitizen` for conventional commits and automated version bumping.
- Task Automation: Utilizes `Taskfile.yml` for simplified execution of common development and operational tasks.
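
As an illustration of the quality presets listed above, the sketch below maps each preset name to a yt-dlp format selector. This is an assumption about how such a mapping might look, not the tool's actual implementation: `PRESET_FORMATS`, `format_for_preset`, and `ydl_options` are hypothetical names, while the selector strings and the `format` / `merge_output_format` option keys follow yt-dlp's documented conventions.

```python
# Hypothetical mapping from preset name to a yt-dlp format selector.
# The height limits mirror the presets described in the feature list.
PRESET_FORMATS = {
    "best": "bestvideo+bestaudio/best",
    "normal": "bestvideo[height<=720]+bestaudio/best[height<=720]",
    "small": "bestvideo[height<=360]+bestaudio/best[height<=360]",
}


def format_for_preset(preset: str) -> str:
    """Return the format selector for a preset, rejecting unknown names."""
    try:
        return PRESET_FORMATS[preset]
    except KeyError:
        valid = ", ".join(sorted(PRESET_FORMATS))
        raise ValueError(f"unknown preset {preset!r}; expected one of: {valid}") from None


def ydl_options(preset: str) -> dict:
    """Build an options dict that could be handed to yt_dlp.YoutubeDL."""
    return {
        "format": format_for_preset(preset),
        # Merging separate video and audio streams into MP4 requires FFmpeg.
        "merge_output_format": "mp4",
    }
```

Under these assumptions, `ydl_options("normal")` would yield options that cap the video height at 720 pixels and fall back to a single combined stream when separate video and audio streams are unavailable.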
- Python 3.12+
- Poetry for robust dependency management.
- YouTube Data API v3 Key (requires a Google Cloud Project).
- FFmpeg: Required by `yt-dlp` for video merging and format conversion. Ensure it's installed and accessible in your system's PATH.
- Clone or Download the Repository:

      git clone https://github.com/your-username/ytextractor.git  # Replace with your repo URL
      cd ytextractor

- Create a `.env` File: In the project root, create a file named `.env` and add your YouTube Data API Key. Optionally, you can set the logging level:

      YOUTUBE_API_KEY=YOUR_YOUTUBE_API_KEY_HERE
      YTEXTRACTOR_LOG_LEVEL=INFO  # Optional: DEBUG, INFO, WARNING, ERROR, CRITICAL

- Configure Poetry to Use Your Python Environment & Install Dependencies:

  a. If you want Poetry to manage its own virtual environment (recommended for simplicity, it creates a `.venv` folder in your project):

      poetry install

  b. If you prefer to use an existing Python virtual environment (e.g., managed by `pyenv`, `conda`, or `python -m venv`): First, activate your desired environment (e.g., `source /path/to/my/venv/bin/activate`). Then, tell Poetry to use it:

      poetry env use $(pyenv which python)
      poetry install

  This step (`poetry env use`) only needs to be done once per project unless you switch environments.

- (Optional) Add Channels to `channels.list`: Create a file named `channels.list` in the project root. Add YouTube channel identifiers (Channel ID, `@handle`, or username) one per line. Lines starting with `#` will be ignored.

      # Example channels.list
      UC-lHJZR3Gqxm24_Vd_DKFwa  # PewDiePie's channel ID
      @MrBeast                  # MrBeast's handle
      googledevelopers          # A legacy username example
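
The `channels.list` format described above can be sketched as a small parser. This is an illustration under stated assumptions rather than the tool's own code: `classify_identifier` and `parse_channels` are hypothetical helpers, trailing `# ...` comments are assumed to be stripped as the example file suggests, and the channel ID test relies on the usual shape of YouTube channel IDs (24 characters starting with `UC`).

```python
def classify_identifier(identifier: str) -> str:
    """Guess the identifier type: 'handle', 'channel_id', or 'username'."""
    if identifier.startswith("@"):
        return "handle"
    # Assumption: channel IDs are 24 characters long and start with "UC".
    if identifier.startswith("UC") and len(identifier) == 24:
        return "channel_id"
    return "username"


def parse_channels(text: str) -> list[tuple[str, str]]:
    """Parse channels.list content into (identifier, kind) pairs."""
    entries: list[tuple[str, str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        # Assumption: anything after a '#' is a trailing comment.
        identifier = line.split("#", 1)[0].strip()
        if identifier:
            entries.append((identifier, classify_identifier(identifier)))
    return entries
```

For example, `parse_channels(open("channels.list", encoding="utf-8").read())` would return one pair per channel, which a fetch routine could then resolve through the matching API lookup.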
Once dependencies are installed and your API key is set up, you can run the tool two ways:
- Activate the environment, and call `ytextractor` directly:

      eval "$(poetry env activate)"
      ytextractor --help

- Or run it via Poetry:

      poetry run python ytextractor.py <command>

Note: If you choose the second option, replace all `ytextractor` commands below with `poetry run python ytextractor.py`.
Fetch video metadata and thumbnails for all channels listed in `channels.list`:

    ytextractor fetch

Fetch for a specific channel by ID, handle, or username:

    ytextractor fetch UC-lHJZR3Gqxm24_Vd_DKFwa
    ytextractor fetch @MrBeast

Force fetch all videos from the beginning (overwriting existing metadata and re-downloading thumbnails):

    ytextractor fetch --force-all

Fetch only videos published on or after a specific cutoff date (YYMMDD format):

    ytextractor fetch --cutoff 231231  # Only videos published on or after Dec 31, 2023

You can combine options:

    ytextractor fetch --force-all --cutoff 240101

The `fetch-videos` command downloads videos based on the metadata already present in your database (`yt.db`). Videos will be saved in `videos/@channel_handle/`.
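
To make the output layout concrete, the sketch below assembles a path of the form `videos/@channel_handle/YYMMDD_video_name_video_duration_videoid_preset.mp4`. Only the overall pattern comes from this README. The `slugify` rule, the `2m05s` duration rendering, and the `video_path` helper itself are assumptions made for illustration, so the tool's real filenames may differ in those details.

```python
import re
from datetime import date
from pathlib import Path


def slugify(title: str) -> str:
    """Assumed sanitisation: lowercase, runs of other characters become '_'."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def video_path(
    handle: str,
    published: date,
    title: str,
    duration_seconds: int,
    video_id: str,
    preset: str,
    root: Path = Path("videos"),
) -> Path:
    """Build root/<handle>/YYMMDD_name_duration_videoid_preset.mp4."""
    stamp = published.strftime("%y%m%d")  # same YYMMDD layout as --cutoff
    minutes, seconds = divmod(duration_seconds, 60)
    duration = f"{minutes}m{seconds:02d}s"  # assumed duration format
    filename = f"{stamp}_{slugify(title)}_{duration}_{video_id}_{preset}.mp4"
    return root / handle / filename
```

With these assumptions, a 125-second video titled "Hello, World!" published on Dec 31, 2023 and downloaded with the `normal` preset would land at `videos/@channel_handle/231231_hello_world_2m05s_<videoid>_normal.mp4`.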
Download videos for all channels in `channels.list` using the default `best` quality preset:

    ytextractor fetch-videos

Download videos with a specific quality preset (`normal` for up to 720p, `small` for up to 360p):

    ytextractor fetch-videos --preset normal

The `download` command downloads a single video directly by its URL or ID, without interacting with the database. Videos are saved to the current directory or a specified output directory.
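
Accepting either a URL or a bare ID implies some normalisation step. The following is a minimal sketch of one way to do it, assuming the common 11-character video ID shape and the `youtube.com/watch?v=...` and `youtu.be/...` URL forms. `extract_video_id` is a hypothetical helper, and the tool may instead pass the argument straight to yt-dlp.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from letters, digits, '_' and '-'.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url_or_id: str) -> str:
    """Return the video ID from a bare ID or a common YouTube URL form."""
    value = url_or_id.strip()
    if VIDEO_ID_RE.fullmatch(value):
        return value

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        candidate = ""

    if not VIDEO_ID_RE.fullmatch(candidate):
        raise ValueError(f"could not find a video ID in {url_or_id!r}")
    return candidate
```

Both `extract_video_id("dQw4w9WgXcQ")` and `extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")` resolve to the same ID under this sketch.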
Download a video using its URL or ID (quote the URL so the shell does not interpret the `?`):

    ytextractor download "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    ytextractor download dQw4w9WgXcQ

Download a video with a specific quality preset:

    ytextractor download dQw4w9WgXcQ --preset normal

Download to a specific output directory:

    ytextractor download dQw4w9WgXcQ --dir /path/to/my/videos

The `test-api-key` command tests your YouTube Data API key by making a minimal request. This consumes 1 unit of your daily quota.

    ytextractor test-api-key

The `quota` command provides an estimate of your remaining API quota. This also consumes 1 unit of quota.

    ytextractor quota

The `--version` flag displays the current version of the tool.

    ytextractor --version

MIT License. See LICENSE for more information.