Athlete Analytics

Personal training management system that syncs Garmin Connect and TrainingPeaks with macOS Calendar, featuring intelligent workout scheduling and race progress tracking.

What It Does

Core workflow:

  1. Import scheduled workouts from TrainingPeaks ICS feed
  2. Pull completed activities from Garmin Connect
  3. Schedule workouts in macOS Calendar with smart timing (avoids conflicts, optimizes wake-up time)
  4. Track training progress toward race goals with phase-specific benchmarks
  5. (Optional) Generate AI-powered training plans with GPT-4o

Key benefit: Never manually schedule workouts or worry about calendar conflicts. The system finds optimal times based on your work schedule, existing events, workout type, and exercise science.


Quick Start

Prerequisites

  • macOS (for Calendar integration)
  • Python 3.9+
  • Garmin Connect account with activity history
  • TrainingPeaks account with Premium (for ICS calendar feed)
  • OpenAI API key (optional, only for AI workout generation)

Installation

git clone https://github.com/AndrewTKent/athlete-analytics.git
cd athlete-analytics

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate  # or: source activate.sh

# Install dependencies (includes PyObjC for calendar integration)
pip install -U pip
pip install -e ".[dev]"

Grant Calendar Access: On first run, macOS prompts for Calendar access. Click OK. If you denied the prompt, enable access manually:

  • System Settings → Privacy & Security → Calendar → Toggle ON for Python/Terminal

Configuration

1. Create .env with credentials:

cp env.example .env
nano .env  # or your preferred editor

Add your credentials:

# Garmin Connect
GARMIN_EMAIL=your.email@example.com
GARMIN_PASSWORD=your_password

# TrainingPeaks ICS Feed (Settings > Account > Calendar)
TRAININGPEAKS_ICS_URL=webcal://www.trainingpeaks.com/ical/YOUR_ID.ics

# OpenAI (optional, for AI workouts)
OPENAI_API_KEY=sk-your-key-here
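
The `webcal://` scheme in the TrainingPeaks URL is a calendar-subscription convention, and most HTTP clients need it rewritten to `https://` before they can download the feed. A minimal sketch of that rewrite, assuming a hypothetical helper name `normalize_ics_url` (not taken from this repo; the project's own loader may handle this differently):

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_ics_url(url: str) -> str:
    """Rewrite a webcal:// URL to https:// so it can be fetched over HTTP.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url.strip())
    if parts.scheme.lower() == "webcal":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(normalize_ics_url("webcal://www.trainingpeaks.com/ical/YOUR_ID.ics"))
# → https://www.trainingpeaks.com/ical/YOUR_ID.ics
```

URLs that already use `https://` pass through unchanged.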

2. Edit config.yaml for everything else:

# Race goals
races:
  ultramarathon:
    date: "2026-02-14"
    name: "Tarawera Ultra"
    distance_miles: 100
  ironman:
    date: "2026-06-15"
    name: "Ironman 2026"

# Work schedule - workouts will be scheduled around this
work_schedule:
  weekday:
    start_time: "09:00"
    end_time: "20:00"
  weekend:
    enabled: true
    start_time: "09:00"
    end_time: "13:00"

# Scheduling preferences
scheduling:
  buffer_before_work: 60        # Workouts end 60 min before work starts
  prefer_latest_morning_time: true  # Don't wake up earlier than needed
  allow_brick_workouts: true    # Bike→run can be back-to-back
  buffer_between_workouts: 10   # Minutes between multiple workouts

Usage

Test connections:

source activate.sh
python scripts/sync_training.py check-status

Preview sync (safe, no changes):

python scripts/sync_training.py sync --days-forward 7 --dry-run

Sync workouts to calendar:

python scripts/sync_training.py sync --days-forward 14

How Smart Scheduling Works

The Problem

You have a 60-minute swim scheduled for Monday morning. When should it start?

  • Too early (5:00 AM): Unnecessary wake-up if work starts at 9:00 AM
  • Too late (8:00 AM): Conflicts with your 9:00 AM work start

The Solution

The system works backward from your work start time:

Work starts: 9:00 AM
Buffer needed: 60 minutes (shower, breakfast, commute)
→ Workout must end by: 8:00 AM
→ For 60-min swim: Start at 7:00 AM ✅

Examples with different durations:

  • 20-min strength: 7:40 AM → 8:00 AM (latest possible!)
  • 60-min swim: 7:00 AM → 8:00 AM
  • 120-min bike: 6:00 AM → 8:00 AM
  • 270-min long ride: 5:00 AM → 9:30 AM (weekend; may end after the usual work start time because it is a day off)
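
The backward calculation above can be sketched in a few lines. This is an illustration of the arithmetic only; the function name `latest_start` is hypothetical, and the real logic in `calendar_scheduling.py` may differ.

```python
from datetime import datetime, timedelta


def latest_start(work_start: str, buffer_min: int, duration_min: int) -> str:
    """Return the latest start time (HH:MM) that still ends `buffer_min` before work."""
    work = datetime.strptime(work_start, "%H:%M")
    must_end_by = work - timedelta(minutes=buffer_min)
    start = must_end_by - timedelta(minutes=duration_min)
    return start.strftime("%H:%M")


# Work starts at 09:00 with a 60-minute buffer (the config.yaml values above).
for name, minutes in [("strength", 20), ("swim", 60), ("bike", 120)]:
    print(f"{minutes}-min {name}: start {latest_start('09:00', 60, minutes)}")
# 20-min strength: start 07:40
# 60-min swim: start 07:00
# 120-min bike: start 06:00
```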

Conflict Avoidance

Checks these calendars for conflicts:

  1. Your personal calendars (meetings, appointments)
  2. "To Do" calendar (prevents duplicates if you run sync twice)
  3. Existing workouts scheduled in same session

Does NOT check:

  • "Done" calendar (those are completed workouts)

Conflict resolution:

  • First workout: Scheduled at latest safe morning time
  • Additional workouts same day: Spaced 10-15 minutes after previous
  • If won't fit before work: Moved to after work (8:30 PM, 9:45 PM, etc.)
  • Brick workouts (bike→run): Allowed back-to-back with no buffer
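
One way to picture the conflict resolution rules is as an interval search: try the latest slot before work, slide earlier past any clashing event, and fall back to an after-work slot if nothing fits. The sketch below is not the repo's implementation. The function names, the 5:00 AM earliest start, and the 11:00 PM evening cutoff are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Interval = Tuple[datetime, datetime]


def overlaps(a: Interval, b: Interval) -> bool:
    """Half-open intervals [start, end) overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def latest_slot_before(deadline: datetime, duration: timedelta,
                       busy: List[Interval], earliest: datetime) -> Optional[Interval]:
    """Latest conflict-free slot ending at or before `deadline`, or None if none fits."""
    end = deadline
    while end - duration >= earliest:
        slot = (end - duration, end)
        clashes = [b for b in busy if overlaps(slot, b)]
        if not clashes:
            return slot
        # Any later end time would still clash, so jump to the start of the clash.
        end = min(b[0] for b in clashes)
    return None


def earliest_slot_after(start: datetime, duration: timedelta,
                        busy: List[Interval], latest_end: datetime) -> Optional[Interval]:
    """Earliest conflict-free slot starting at or after `start` (after-work fallback)."""
    while start + duration <= latest_end:
        slot = (start, start + duration)
        clashes = [b for b in busy if overlaps(slot, b)]
        if not clashes:
            return slot
        start = max(b[1] for b in clashes)
    return None


day = datetime(2026, 1, 5)


def at(hhmm: str) -> datetime:
    """Build a datetime on the example day from an HH:MM string."""
    hour, minute = map(int, hhmm.split(":"))
    return day.replace(hour=hour, minute=minute)


# A 60-min swim must end by 08:00, but 07:30-08:00 is already booked.
busy = [(at("07:30"), at("08:00"))]
slot = latest_slot_before(at("08:00"), timedelta(minutes=60), busy, at("05:00"))
print(slot[0].strftime("%H:%M"), "->", slot[1].strftime("%H:%M"))  # 06:30 -> 07:30
```

To honor `buffer_between_workouts`, each already-scheduled workout could be added to `busy` with its end padded by the buffer before the next one is placed.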

Workout-Specific Logic

Based on exercise science and practical constraints:

Workout Type                | Weekday             | Weekend      | Reason
----------------------------|---------------------|--------------|-------
Swim                        | 7:00 AM or 8:30 PM  | 7:30 AM      | Pool hours (5 AM-10 PM weekdays, 7 AM-8 PM weekends)
Long bike (3+ hrs)          | After work          | 5:00-7:30 AM | Duration-based: 4.5hr ride needs 5:00 AM start to end by 9:30 AM
Hard run (intervals, tempo) | 8:30 PM             | 8:00 AM      | Peak body temperature (late afternoon)
Easy run                    | Latest safe morning | Any          | Flexible timing
Strength                    | After work          | 10:00 AM     | Gym availability

Commands

Sync Workouts

# Sync next 2 weeks (recommended)
python scripts/sync_training.py sync --days-forward 14

# Dry run to preview (no calendar changes)
python scripts/sync_training.py sync --days-forward 14 --dry-run

# Sync without calendar conflict checking (faster, less safe)
python scripts/sync_training.py sync --days-forward 14 --no-check-calendar

What this does:

  1. Downloads workouts from TrainingPeaks ICS feed (14 days ahead)
  2. Pulls completed activities from Garmin Connect
  3. Schedules each workout at optimal time (avoiding conflicts)
  4. Creates events in "To Do" calendar
  5. Marks completed workouts (moves to "Done" calendar)

Generate AI Workouts

python scripts/sync_training.py generate-workouts \
  --ultramarathon-date 2026-02-14 \
  --ironman-date 2026-06-15 \
  --days-forward 14 \
  --use-ai \
  --push-to-calendar

How it works:

  1. Analyzes your last 4 weeks of Garmin data (volume, intensity, trends)
  2. Determines current training phase (Base/Build/Peak/Taper) based on race dates
  3. Generates workouts using GPT-4o with coaching context
  4. Schedules workouts using same smart scheduling logic
  5. Pushes to "To Do" calendar

Current limitation: Generates all 14 days at once, then schedules. May suggest 3 workouts for a day with only time for 2.

Note: Default limited to 14 days to match TrainingPeaks sync window and avoid conflicts.

Get Training Insights & Visualizations

python scripts/get_insights.py

Analyzes recent training and generates:

  • 4 visualization plots: Weekly volume, sport distribution, training load, HR zones
  • Training summary: Hours, miles, activities by sport
  • Polarized training check: % time in Zone 2 (easy) vs Zone 3+ (hard)
  • Training load analysis: Average per workout, weekly totals
  • Recovery patterns: Body battery drain, VO2max trends
  • AI coaching suggestions: GPT-4o-mini provides 2-3 specific recommendations for next week

Saves plots to data/insights/training_analysis_YYYYMMDD.png (not auto-opened).

Example insight:

❤️ Intensity Distribution:
  Zone 2 (base): 13.2%
  Zone 3+ (hard): 83.9%
  ❌ Not polarized! Too much threshold/tempo. Add easy miles.

Target: 80% Zone 2, 20% Zone 3+ for optimal base building.
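
The polarized check boils down to summing time in each heart-rate zone and comparing shares against the 80/20 target. A minimal sketch, assuming time-in-zone is available in seconds per zone; the function names and the 5-point tolerance are hypothetical, and the sample numbers were chosen only to reproduce the percentages in the example insight.

```python
from typing import Dict


def intensity_distribution(zone_seconds: Dict[int, float]) -> Dict[str, float]:
    """Percent of total zone time spent in Zone 2 and in Zone 3 or higher."""
    total = sum(zone_seconds.values())
    if total <= 0:
        raise ValueError("no heart-rate zone time recorded")
    easy = zone_seconds.get(2, 0.0)
    hard = sum(sec for zone, sec in zone_seconds.items() if zone >= 3)
    return {"zone2_pct": 100 * easy / total, "zone3_plus_pct": 100 * hard / total}


def is_polarized(dist: Dict[str, float], target_easy_pct: float = 80.0,
                 tolerance_pct: float = 5.0) -> bool:
    """True when the Zone 2 share is within `tolerance_pct` of the target or above it."""
    return dist["zone2_pct"] >= target_easy_pct - tolerance_pct


# Invented sample: seconds in zones 1-5.
sample = {1: 290, 2: 1320, 3: 5000, 4: 3000, 5: 390}
dist = intensity_distribution(sample)
print(f"Zone 2: {dist['zone2_pct']:.1f}%  Zone 3+: {dist['zone3_plus_pct']:.1f}%")
print("polarized" if is_polarized(dist) else "not polarized")
```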

Check Progress Toward Race Goals

python scripts/sync_training.py check-progress

Analyzes recent Garmin data against phase-specific benchmarks (Base/Build/Peak/Taper) for your upcoming races.

Pull Wellness Data

python scripts/pull_wellness.py --days-back 30

Pulls daily wellness metrics from Garmin:

  • Sleep data (duration, stages, sleep score)
  • Body Battery (charge/drain, recovery quality)
  • HRV (heart rate variability - recovery indicator)
  • Stress levels
  • Resting heart rate
  • Daily stats (steps, calories, active minutes)

Enables correlating sleep quality with performance and detecting possible overtraining (elevated RHR, low HRV).
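
As a rough illustration of the "elevated RHR, low HRV" signal, the sketch below compares each day against the average of the preceding week. Everything here is an assumption made for the example: the field names (`resting_hr`, `hrv_ms`), the 7-day baseline, and the thresholds are not taken from the repo, and they are not training or medical guidance.

```python
from statistics import mean
from typing import Dict, List


def flag_strain_days(days: List[Dict], baseline_days: int = 7,
                     rhr_rise_bpm: float = 5.0, hrv_drop_pct: float = 15.0) -> List[str]:
    """Return dates where resting HR is up and HRV is down versus the trailing baseline."""
    flagged = []
    for i in range(baseline_days, len(days)):
        window = days[i - baseline_days:i]
        rhr_base = mean(d["resting_hr"] for d in window)
        hrv_base = mean(d["hrv_ms"] for d in window)
        today = days[i]
        rhr_high = today["resting_hr"] >= rhr_base + rhr_rise_bpm
        hrv_low = today["hrv_ms"] <= hrv_base * (1 - hrv_drop_pct / 100)
        if rhr_high and hrv_low:
            flagged.append(today["date"])
    return flagged


# Invented data: a stable week, one rough day, then a return to normal.
history = [{"date": f"2026-01-{d:02d}", "resting_hr": 50, "hrv_ms": 60} for d in range(1, 8)]
history.append({"date": "2026-01-08", "resting_hr": 56, "hrv_ms": 48})
history.append({"date": "2026-01-09", "resting_hr": 51, "hrv_ms": 59})

print(flag_strain_days(history))  # ['2026-01-08']
```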

Pull Garmin Activities

python scripts/pull_garmin.py --days-back 30

Manually fetch Garmin activities. The sync command does this automatically.

View Dashboard

streamlit run src/dashboard/app.py

Interactive analytics dashboard with volume trends, intensity distribution, fitness metrics.


Project Structure

athlete-analytics/
├── config.yaml              # Settings (work schedule, race goals, preferences)
├── .env                      # Credentials (gitignored)
├── scripts/
│   ├── sync_training.py      # Main sync command
│   ├── pull_garmin.py        # Garmin data fetcher
│   └── build_features.py     # Analytics pipeline
├── src/
│   ├── ingest/
│   │   ├── garmin.py         # Garmin Connect API (using garth)
│   │   ├── trainingpeaks_ics.py  # TrainingPeaks ICS parser
│   │   └── macos_calendar.py # macOS Calendar (PyObjC EventKit)
│   ├── models/
│   │   ├── ai_workout_generator.py  # GPT-4o workout generation
│   │   ├── milestone_tracker.py     # Race progress tracking
│   │   └── workout_generator.py     # Rule-based fallback
│   ├── utils/
│   │   ├── calendar_scheduling.py  # Smart scheduling logic
│   │   ├── config.py               # Config loader
│   │   └── units.py                # Mile/km conversions
│   └── dashboard/app.py      # Streamlit dashboard
├── data/
│   ├── raw/garmin/           # Downloaded Garmin activities (JSON)
│   └── warehouse/            # Processed analytics (Parquet)
└── prompts/                  # AI workout generation prompts

Training Phase Tracking

Automatically determines training phase based on race date:

Phase | Weeks to Race | Focus                           | Example Volume (Running)
------|---------------|---------------------------------|-------------------------
Base  | 16+           | Build aerobic base              | 25-37 mi/week
Build | 8-16          | Add intensity, increase volume  | 37-50 mi/week
Peak  | 3-8           | Maximum volume with quality     | 43-56 mi/week
Taper | 1-3           | Reduce volume, maintain fitness | 19-28 mi/week
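
The phase lookup can be sketched as a simple threshold check on weeks remaining. The table's ranges share their endpoints (for example, week 8 appears in both Build and Peak), so this sketch assumes each boundary week belongs to the earlier phase; the function name is hypothetical and `milestone_tracker.py` may resolve boundaries differently.

```python
from datetime import date


def training_phase(race_date: date, today: date) -> str:
    """Map weeks remaining until the race to Base/Build/Peak/Taper."""
    weeks_to_race = (race_date - today).days / 7
    if weeks_to_race < 0:
        raise ValueError("race date is in the past")
    if weeks_to_race >= 16:
        return "Base"
    if weeks_to_race >= 8:   # assumption: exactly 8 weeks out still counts as Build
        return "Build"
    if weeks_to_race >= 3:   # assumption: exactly 3 weeks out still counts as Peak
        return "Peak"
    return "Taper"


today = date(2025, 12, 1)
print(training_phase(date(2026, 2, 14), today))  # Build (75 days, about 10.7 weeks)
print(training_phase(date(2026, 6, 15), today))  # Base (196 days, 28 weeks)
```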

For Ironman training, tracks weekly hours and sport-specific balance (swim/bike/run).


Troubleshooting

Calendar access denied:

  • Go to System Settings → Privacy & Security → Calendar
  • Enable access for Python or Terminal
  • Re-run sync command

Workouts at wrong times:

  • Check config.yaml work schedule (start_time, end_time)
  • Verify buffer_before_work setting (default 60 minutes)
  • Run with --dry-run to preview without making changes

Overlapping workouts:

  • System should automatically space them
  • If you see overlaps, run --dry-run and report the issue
  • Check if "To Do" calendar is included in conflict checking

Garmin auth fails:

  • Delete .tokens/garth_tokens.json
  • Re-run pull command (will re-authenticate)

TrainingPeaks workouts not appearing:

  • Verify ICS URL in .env (get from TrainingPeaks: Settings → Account → Calendar)
  • Check Premium subscription (required for ICS feed)
  • ICS feed only provides 14 days ahead

Missing metrics:

  • Not all Garmin activities include HR, power, or VO₂max
  • Older activities may have limited data

Automation

Run daily with cron:

crontab -e

# Add this line (runs at 6 AM daily)
0 6 * * * cd /path/to/athlete-analytics && .venv/bin/python scripts/sync_training.py sync --days-forward 14 >> /tmp/training_sync.log 2>&1

Or use Docker:

docker compose -f docker/compose.yml up --build

Future Enhancements

Day-by-Day LLM Generation (Recommended)

Current approach:

  • LLM generates 14 days of workouts all at once
  • System tries to fit them into calendar
  • May suggest unrealistic schedules (3 workouts on a day with 2 hours free)

Better approach:

for day in next_14_days:
    free_slots = check_calendar(day)  # "You have 90 min morning, 2 hrs evening"
    workouts = ask_llm(f"Generate workouts that FIT these slots: {free_slots}")
    schedule_immediately(workouts)

This would let the LLM say "You only have 90 minutes free, so here's one 80-minute run" instead of blindly suggesting 3 workouts.
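
The calendar-checking step of that loop could compute free slots by walking the day's busy intervals in order and recording the gaps. A minimal sketch, assuming busy events arrive as (start, end) datetime pairs; the function name `free_slots` and the 20-minute minimum slot are illustrative choices, not existing project code.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Interval = Tuple[datetime, datetime]


def free_slots(window_start: datetime, window_end: datetime,
               busy: List[Interval], min_minutes: int = 20) -> List[Interval]:
    """Return gaps of at least `min_minutes` between busy events inside the window."""
    min_len = timedelta(minutes=min_minutes)
    slots: List[Interval] = []
    cursor = window_start
    for b_start, b_end in sorted(busy):
        if b_start > cursor:
            gap_end = min(b_start, window_end)
            if gap_end - cursor >= min_len:
                slots.append((cursor, gap_end))
        cursor = max(cursor, b_end)
        if cursor >= window_end:
            break
    if window_end - cursor >= min_len:
        slots.append((cursor, window_end))
    return slots


day = datetime(2026, 1, 5)
morning = free_slots(
    day.replace(hour=5), day.replace(hour=8),
    busy=[(day.replace(hour=6, minute=30), day.replace(hour=7))],
)
minutes = [int((end - start).total_seconds() // 60) for start, end in morning]
print(minutes)  # [90, 60]
```

The resulting durations are the kind of constraint that could be passed into the LLM prompt for each day.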

Smart Garmin Caching

Current: Fetches all activities every time (~30 seconds)

Better: Incremental updates

  • Cache activities in data/warehouse/activities.parquet
  • Only fetch NEW activities since last sync (~2-5 seconds)
  • Keep 90-day rolling window for analysis
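
A sketch of how the incremental update might work. An in-memory dict stands in for `activities.parquet`, and `fetch_since` is a hypothetical callable standing in for the Garmin client; neither reflects existing code in this repo.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def incremental_sync(cache: Dict[str, dict],
                     fetch_since: Callable[[datetime], List[dict]],
                     now: datetime, window_days: int = 90) -> Dict[str, dict]:
    """Fetch only activities newer than the cache, then prune to a rolling window."""
    cutoff = now - timedelta(days=window_days)
    # Resume from the newest cached activity, or from the window start on first sync.
    last_seen = max((a["start_time"] for a in cache.values()), default=cutoff)
    for activity in fetch_since(last_seen):
        cache[activity["activity_id"]] = activity  # keyed by id, so re-fetches dedupe
    return {aid: a for aid, a in cache.items() if a["start_time"] >= cutoff}


now = datetime(2026, 1, 10)
remote = [
    {"activity_id": "a1", "start_time": now - timedelta(days=100)},
    {"activity_id": "a2", "start_time": now - timedelta(days=10)},
    {"activity_id": "a3", "start_time": now - timedelta(days=1)},
]


def fake_fetch(since: datetime) -> List[dict]:
    """Stand-in for a remote call: return activities at or after `since`."""
    return [a for a in remote if a["start_time"] >= since]


cache = {a["activity_id"]: a for a in remote[:2]}  # a1 and a2 already cached
cache = incremental_sync(cache, fake_fetch, now)
print(sorted(cache))  # ['a2', 'a3']
```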

Other Ideas

  • Zone-based analysis (time in HR/power zones)
  • FTP estimator from 20-min best power
  • Polarized training balance checker
  • Auto-run feature engineering after Garmin sync

Data Model

Raw data (data/raw/garmin/):

  • activities_index.json: Basic metadata for all activities
  • {activity_id}.json: Full details per activity
  • {activity_id}.gpx: GPS track (optional)

Processed data (data/warehouse/):

  • activities.parquet: One row per activity with 56+ derived metrics

Stored metrics (56 fields per activity):

  • Basic: activity_id, name, time, sport, distance, duration
  • Heart rate: avg/max HR, time in zones 1-5, zone percentages
  • Power: avg/max/norm power, time in zones 1-7, TSS, intensity factor
  • Running: cadence, vertical oscillation, ground contact time, stride length
  • Training load: aerobic/anaerobic training effect, activity load, labels
  • Recovery: calories, body battery change, water loss, intensity minutes
  • Elevation: gain/loss, min/max
  • Derived: pace, week/date keys, zone percentages

Wellness data (data/raw/garmin/wellness/):

  • Daily sleep, HRV, body battery, stress, resting HR, daily stats

Technical Notes

Garmin Integration

Uses the garth library for unofficial Garmin Connect access. Suitable for personal use only. For commercial applications, apply for the official Garmin Health/Activity APIs.

TrainingPeaks Integration

Uses standard ICS calendar feed (requires Premium subscription). Provides 14 days ahead in rolling window.

Calendar Integration

Uses PyObjC EventKit for native macOS Calendar access, which is more reliable than AppleScript. Falls back to AppleScript if PyObjC is unavailable.

AI Workout Generation

Uses OpenAI GPT-4o with custom coaching prompts. Training history and race phase context included in prompt for personalized recommendations.


License

MIT


Tips

Best workflow:

  1. Use TrainingPeaks for workout planning (full featured training platform)
  2. Run sync daily to schedule workouts in macOS Calendar with smart timing
  3. Complete workouts → Garmin auto-uploads
  4. Run sync again → system marks completed (moves to "Done" calendar)
  5. Run scripts/get_insights.py weekly to check training quality (polarized training, recovery, trends)
  6. Run check-progress monthly to track phase benchmarks toward race goals

Don't use AI generation if:

  • You have a coach providing TrainingPeaks workouts
  • You prefer manual control over training

Use AI generation when:

  • Testing the system
  • TrainingPeaks not set up yet
  • Need quick 2-week plan for specific race
  • Want AI coaching context/explanations

Calendar tips:

  • Don't manually edit "To Do" calendar (sync will recreate events)
  • "Done" calendar is for reference only (system moves completed workouts there)
  • Use other calendars (Work, Personal) for non-training events
  • System checks all calendars (except Done) to avoid conflicts

Performance:

  • First Garmin sync: ~30 seconds (fetches all activities)
  • Subsequent syncs: ~5-10 seconds (incremental)
  • TrainingPeaks ICS download: ~2 seconds
  • Calendar scheduling: <1 second per workout
  • AI generation: ~10-30 seconds (depends on OpenAI API response time)
