A command-line utility built on top of nvitop that monitors NVIDIA GPUs and automatically executes queued commands when specified GPUs become idle.
- 🔍 GPU Monitoring: Real-time GPU status display with live updates
- 📋 Task Queue: Persistent queue for commands waiting on specific GPUs
- ⚙️ Background Scheduler: Daemon process that continues running after TUI exit
- 🎯 Auto-Execution: Automatically launches tasks when GPUs become idle
- 💾 Data Persistence: All data saved to disk, survives TUI/terminal restarts
- 🎨 Interactive TUI: Beautiful terminal UI for task management
- 🔔 Notifications: Email, webhook, or stdout notifications
- 🔒 Single Instance: Prevents multiple scheduler conflicts
Install globally to use gst anywhere (Recommended):
# Using uv
uv tool install .
# Using pipx
pipx install .
Or install in the current virtual environment:
# Using pip
pip install .
1. Launch TUI
gst
2. Submit a task
- Select "📝 Submit"
- Enter GPU indices: 0
- Enter command: python train.py --epochs 3
- Confirm
3. Start scheduler
- Select "⚙️ Start Scheduler"
- Exit TUI (scheduler keeps running!)
4. Check status anytime
gst # Reopen TUI
# Select "📊 View Status" to see running tasks
That's it! Your tasks will execute automatically when GPUs become idle. 🎉
# Monitor GPU
gpus monitor --watch
# Submit task
gpus submit --gpu 0 -- python train.py
# View queue
gpus queue
# Start scheduler (background)
nohup gpus scheduler > scheduler.log 2>&1 &
New to GPU Scheduler? Start here:
- Quick Start Guide - 5-minute tutorial
- FAQ - Common questions and solutions
Need detailed reference?
- CLI Command Reference - Complete command documentation
- Documentation Center - Full documentation index
Want to understand internals?
- Workflow - How scheduling works
- Persistence - Data storage guarantees
Developer resources:
- Testing Guide - Run and write tests
- UI Theming - TUI customization
Tasks launched by the scheduler do not go through your interactive shell setup, such as .bashrc or conda activation, so commands that rely on an activated virtual environment can fail.
Solution - Use absolute paths:
# ✅ Recommended
gpus submit --gpu 0 -- /path/to/.venv/bin/python train.py
# ✅ Or use wrapper script
cat > run_task.sh << 'EOF'
#!/bin/bash
cd /path/to/project
source .venv/bin/activate
python train.py
EOF
gpus submit --gpu 0 -- bash run_task.sh
See FAQ - Virtual Environments for details.
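If you submit tasks from a Python script while your virtual environment is activated, you can also sidestep the problem by passing the interpreter's absolute path via sys.executable. A minimal sketch (train.py is a placeholder; only the gpus submit form above comes from this document):

```python
import subprocess
import sys

# sys.executable is the absolute path of the interpreter running this script,
# so the queued task keeps using your venv's Python even though the scheduler
# never sources the activation script.
subprocess.run(
    ["gpus", "submit", "--gpu", "0", "--", sys.executable, "train.py"],
    check=True,  # raise if the submission itself fails
)
```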
The scheduler runs independently as a daemon process:
# Start via TUI
gst → Start Scheduler → Exit
# Or via CLI (recommended for production)
nohup gpus scheduler --log-dir ./logs > scheduler.log 2>&1 &
# Scheduler survives:
# ✅ TUI exit
# ✅ Terminal close
# ✅ SSH disconnect
See FAQ - Background Running for production setup.
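For long-running servers you may also want the scheduler restarted automatically if it ever dies. The project does not ship a watchdog, but a small sketch like the following, run periodically (for example from cron), can do it using only the invocation shown above:

```python
#!/usr/bin/env python3
"""Restart the scheduler if it is not running (illustrative sketch, not part of the tool)."""
import subprocess

# pgrep exits non-zero when no process matches "gpus scheduler".
running = subprocess.run(
    ["pgrep", "-f", "gpus scheduler"], capture_output=True
).returncode == 0

if not running:
    with open("scheduler.log", "ab") as log:
        subprocess.Popen(
            ["gpus", "scheduler", "--log-dir", "./logs"],
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the parent, like nohup
        )
```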
All data is stored in ~/.local/share/gpu_scheduler/:
~/.local/share/gpu_scheduler/
├── queue.json # Task queue
├── task_history.json # Execution history
├── scheduler_state.json # Running tasks
└── scheduler.log # Logs
Guarantees:
- ✅ Tasks persist across TUI sessions
- ✅ Queue survives system restarts
- ✅ Atomic operations prevent corruption
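The atomic-update guarantee typically comes from the write-to-a-temp-file-then-rename pattern; the sketch below shows that pattern in general terms (illustrative, not the project's actual code):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data) -> None:
    """Write JSON next to the target file, then rename over it.

    os.replace is atomic on POSIX, so a reader never sees a half-written file
    even if the writer crashes mid-update.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```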
# Submit multiple training jobs
gpus submit --gpu 0 -- python train_model_a.py
gpus submit --gpu 0 -- python train_model_b.py
gpus submit --gpu 1 -- python train_model_c.py
# Start scheduler
gst → Start Scheduler → Exit
# Jobs execute automatically in order
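Because submissions are plain CLI calls, a whole hyperparameter sweep can be queued from a script. A minimal sketch (train.py and its --lr flag are placeholders for your own training setup):

```python
import subprocess

# Queue one task per configuration; the scheduler runs them in order as GPUs free up.
for gpu, lr in [(0, 1e-3), (0, 3e-4), (1, 1e-4)]:
    subprocess.run(
        ["gpus", "submit", "--gpu", str(gpu), "--", "python", "train.py", "--lr", str(lr)],
        check=True,  # stop the loop if a submission fails
    )
```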
# Task requires 2 GPUs
gpus submit --gpu 0 1 -- python large_model.py
# Will wait until BOTH GPUs are idle
# Enable automatic retry (3 attempts, exponential backoff)
gpus submit --gpu 0 --retry -- python flaky_job.py
# Custom retry
gpus submit \
--gpu 0 \
--retry \
--max-retries 5 \
--retry-delay 60.0 \
-- python train.py
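With --retry, exponential backoff means each attempt waits roughly twice as long as the previous one. The sketch below assumes a simple doubling schedule starting at --retry-delay; the scheduler's exact formula may differ:

```python
# Assumed schedule: retry_delay, 2x, 4x, ... (one entry per retry attempt).
def backoff_delays(max_retries: int = 5, retry_delay: float = 60.0) -> list[float]:
    return [retry_delay * (2 ** attempt) for attempt in range(max_retries)]

print(backoff_delays(5, 60.0))  # [60.0, 120.0, 240.0, 480.0, 960.0]
```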
# Via TUI
gst → Stop Task → Select task → Confirm
# Via CLI
gpus queue --stop <task_id>
# Graceful termination:
# 1. SIGTERM sent (5s timeout)
# 2. SIGKILL if needed
# 3. GPU resources released
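The same SIGTERM-then-SIGKILL sequence is easy to reproduce for processes you manage yourself; a generic Python sketch (not the scheduler's code):

```python
import signal
import subprocess

def stop_gracefully(proc: subprocess.Popen, grace_period: float = 5.0) -> None:
    """Ask the process to exit, then force-kill it after the grace period."""
    proc.send_signal(signal.SIGTERM)     # give the task a chance to clean up
    try:
        proc.wait(timeout=grace_period)  # wait up to 5 seconds by default
    except subprocess.TimeoutExpired:
        proc.kill()                      # SIGKILL if SIGTERM was ignored
        proc.wait()                      # reap the process so it is not left as a zombie
```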
# Quick GPU test (allocate 2GB for 60s)
python tests/tools/gpu_test.py --memory 2048 --duration 60
# Interactive test scenarios
bash tests/tools/run_scenarios.sh
# Run automated tests
pytest tests/integration/
See tests/README.md for the comprehensive testing guide.
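If you just need a quick way to occupy a GPU without the bundled script, something like the following works as a stand-in (assumes PyTorch with CUDA is installed; the flags only mimic the description above):

```python
import argparse
import time

import torch  # assumption: PyTorch with CUDA support is available

parser = argparse.ArgumentParser(description="Allocate GPU memory and hold it for a while")
parser.add_argument("--memory", type=int, default=2048, help="MiB to allocate")
parser.add_argument("--duration", type=int, default=60, help="seconds to hold the allocation")
args = parser.parse_args()

# float32 is 4 bytes per element, so MiB * 1024**2 / 4 elements.
elements = args.memory * 1024 * 1024 // 4
blob = torch.empty(elements, dtype=torch.float32, device="cuda")
print(f"Holding about {args.memory} MiB on the GPU for {args.duration}s")
time.sleep(args.duration)
del blob
```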
Tasks not executing?
- Check scheduler is running:
ps aux | grep "gpus scheduler"
- Check GPU status:
gpus monitor
- View logs:
tail -f ~/.local/share/gpu_scheduler/scheduler.log
Virtual environment issues?
- Use absolute Python paths:
/path/to/.venv/bin/python
- See FAQ - Virtual Environments
More help:
- Check FAQ
- Review Documentation
- Submit GitHub Issue
Contributions welcome! Please check:
- Documentation - Understanding the project
- Testing Guide - Running tests
- Existing issues before submitting PRs
See LICENSE file for details.
Built on top of the excellent nvitop library.