Download and maintain a complete offline archive of Apple Developer Documentation in AI-optimized formats (Markdown + JSON), with git-like incremental updates.
- Offline Access - Complete documentation available without internet
- AI-Optimized - Clean Markdown format perfect for AI/LLM processing
- Git-Like Updates - Incremental updates with check, pull, and status commands
- Multiple Formats - Raw JSON + AI-friendly Markdown
- Smart Caching - Only downloads changed pages using ETags
- Resume Support - Continue interrupted downloads
- Framework Selection - Download only what you need
# Clone or download this repository
git clone https://github.com/OxADD1/Apple-Developer-Documentation-Offline-Archive.git
cd Apple-Developer-Documentation-Offline-Archive
# Setup
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r scripts/requirements.txt
# Download Swift documentation (test with one framework)
python scripts/01_discover_docs.py --frameworks swift
python scripts/02_download_json.py --frameworks swift
python scripts/03_json_to_markdown.py --frameworks swift
# Later: Check for updates
python scripts/update_check.py
python scripts/update_pull.py
- Python 3.8+
- 5-10 GB free disk space (depending on frameworks)
- Internet connection for initial download
- Create Virtual Environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install Dependencies
pip install -r scripts/requirements.txt
- Optional: PDF Support (not required for Markdown/JSON)
playwright install chromium
Crawls Apple's API to discover all available documentation pages.
Important: The discovery process respects framework boundaries.
- If you run it without arguments, it will crawl all default frameworks.
- If you specify frameworks (e.g., --frameworks swift), it will only crawl those specific frameworks.
- Cross-framework references are intentionally ignored to allow selective downloads.
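The boundary rule above can be pictured as a path-prefix filter. This is only a minimal sketch, not the code in 01_discover_docs.py: the function name in_selected_frameworks and the /documentation/<framework>/<page> path shape are assumptions made for illustration.

```python
def in_selected_frameworks(ref_path, frameworks):
    """Return True if a reference path belongs to one of the selected frameworks.

    Assumes (hypothetically) that paths look like /documentation/<framework>/<page>.
    """
    parts = ref_path.strip("/").lower().split("/")
    return len(parts) >= 2 and parts[0] == "documentation" and parts[1] in frameworks


selected = {"swift"}
refs = [
    "/documentation/swift/array",
    "/documentation/swiftui/view",  # cross-framework reference, skipped
    "/documentation/swift/string",
]
kept = [r for r in refs if in_selected_frameworks(r, selected)]
print(kept)  # ['/documentation/swift/array', '/documentation/swift/string']
```

Dropping cross-framework links at this point is what keeps a --frameworks swift run from pulling in SwiftUI or Foundation pages.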
# Discover all default frameworks (swift, swiftui, uikit, foundation, etc.)
python scripts/01_discover_docs.py
# Discover specific frameworks only
python scripts/01_discover_docs.py --frameworks swift swiftui
# Resume interrupted discovery
python scripts/01_discover_docs.py --resume
Duration: 1-3 hours
Output: index.json with complete documentation hierarchy
Downloads all discovered pages as JSON files:
# All frameworks
python scripts/02_download_json.py
# Specific frameworks
python scripts/02_download_json.py --frameworks swift swiftui
Duration: 4-12 hours (rate limited at 5 req/sec)
Output:
- raw-json/ - Original Apple JSON
- .docsync/manifest.json - Metadata for update tracking
Converts JSON to clean, AI-readable Markdown:
# All frameworks
python scripts/03_json_to_markdown.py
# Specific frameworks
python scripts/03_json_to_markdown.py --frameworks swift swiftui
Duration: 1-3 hours
Output: markdown/ - Markdown files with YAML frontmatter
python scripts/update_check.py
Output:
✓ Checking for updates...
Updates available:
[CHANGED] SwiftUI/TabView - Navigation updates for iOS 18
[CHANGED] Swift/Array - Performance improvements documentation
...
Total: 12 modified pages
To download: python scripts/update_pull.py
python scripts/update_status.py
Shows current documentation state, last update check, and available updates.
# Download all updates
python scripts/update_pull.py
# Specific framework only
python scripts/update_pull.py --frameworks swiftui
# Skip markdown conversion
python scripts/update_pull.py --no-convert
Output:
- Updated JSON and Markdown files
- Changelog in .docsync/changelog/
- Version snapshot
Apple-Developer-Documentation-Offline-Archive/
├── scripts/
│ ├── 01_discover_docs.py # Recursive documentation crawler
│ ├── 02_download_json.py # JSON downloader with manifest
│ ├── 03_json_to_markdown.py # JSON → Markdown converter
│ ├── 04_markdown_to_pdf.py # Generate PDFs from Markdown
│ ├── 05_markdown_to_html.py # Generate browsable HTML site
│ ├── update_check.py # Check for updates (git fetch)
│ ├── update_pull.py # Download updates (git pull)
│ ├── update_status.py # Show status (git status)
│ └── requirements.txt # Python dependencies
│
├── markdown/ # AI-optimized Markdown
│ ├── swift/
│ │ ├── Array.md
│ │ ├── String.md
│ │ └── ... (~30,000 files)
│ ├── swiftui/
│ │ ├── View.md
│ │ └── ... (~5,000 files)
│ └── ... (8 more frameworks)
│
├── pdf/ # Human-readable PDFs
│ ├── swift_documentation.pdf
│ ├── swiftui_documentation.pdf
│ └── ... (one per framework)
│
├── html/ # Browsable HTML documentation
│ ├── index.html
│ ├── swift/
│ │ ├── index.html
│ │ └── ...
│ └── ...
│
├── raw-json/ # Original Apple JSON (backup)
│ ├── swift/
│ ├── swiftui/
│ └── ...
│
├── .docsync/ # Update tracking metadata
│ ├── manifest.json # All pages with ETags/hashes
│ ├── versions/ # Version snapshots
│ ├── changelog/ # Update changelogs
│ └── cache/ # Update check cache
│
├── index.json # Discovery index
├── README.md # This file
└── venv/ # Python virtual environment
The following frameworks are supported by default:
| Framework | Description |
|---|---|
| swift | Swift Language Documentation |
| swiftui | SwiftUI Framework |
| uikit | UIKit Framework |
| foundation | Foundation Framework |
| coredata | Core Data Framework |
| combine | Combine Framework |
| swiftdata | SwiftData Framework |
| coreml | Core ML Framework |
| mapkit | MapKit Framework |
| avfoundation | AVFoundation Framework |
You can add more frameworks by editing FRAMEWORK_ROOTS in scripts/01_discover_docs.py.
The Markdown output is optimized for AI processing:
- YAML Frontmatter - Structured metadata (title, role, platforms)
- Clean Hierarchy - Proper heading levels, lists, code blocks
- Code Highlighting - Syntax information preserved
- Cross-References - Relative links to related documentation
- No HTML - Pure Markdown for easy parsing
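To illustrate how the frontmatter can be separated from the body before further processing, here is a minimal standard-library sketch. The helper name split_frontmatter and the sample document (including its field values) are made up for illustration; real files may carry more fields.

```python
def split_frontmatter(text):
    """Split a Markdown document into (frontmatter, body).

    Returns ("", text) when the document does not start with a '---' block.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # no closing delimiter: treat everything as body


# Hypothetical sample document, not copied from the archive
sample = """---
title: View
role: symbol
---
# View

A type that represents part of your app's user interface.
"""
front, body = split_frontmatter(sample)
print(front)
```

The frontmatter string can then be handed to a YAML parser such as PyYAML's yaml.safe_load to obtain a metadata dictionary.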
Read the SwiftUI View documentation:
markdown/swiftui/view.md
Explain the main protocol requirements.
from pathlib import Path
# Load all markdown files
docs_dir = Path("markdown")
for md_file in docs_dir.rglob("*.md"):
    # Read each file as UTF-8 text
    content = md_file.read_text(encoding="utf-8")
    # Add to your vector database
    # vector_db.add(content, metadata={"source": str(md_file)})
| Component | Size |
|---|---|
| JSON (raw-json/) | ~2-3 GB |
| Markdown (markdown/) | ~1-2 GB |
| Total | ~3-5 GB |
All scripts implement respectful rate limiting:
- Discovery: 5 requests/second
- Download: 5 requests/second with exponential backoff
- Update Check: 10 requests/second (HEAD requests only)
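The limits above translate into a minimum gap between requests plus a growing wait after failures. The following is a rough sketch of that arithmetic, not the scripts' actual implementation; the base delay of 1 second and the 60-second cap are assumed values.

```python
def min_interval(requests_per_second):
    """Smallest gap in seconds between two requests at the given rate."""
    return 1.0 / requests_per_second


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Wait before retry number `attempt` (0-based): base * 2**attempt, capped.

    `base` and `cap` are assumptions chosen for this sketch.
    """
    return min(cap, base * (2 ** attempt))


print(min_interval(5))   # 0.2 seconds between discovery/download requests
print(min_interval(10))  # 0.1 seconds between update-check HEAD requests
delays = [backoff_delay(n) for n in range(8)]
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Capping the delay keeps a long outage from pushing individual waits into hours, while the doubling still backs off quickly when the server starts rejecting requests.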
The manifest is created during the first download:
python scripts/02_download_json.py
Run an update check first:
python scripts/update_check.py
Scripts automatically retry with backoff. If issues persist:
- Wait 10-15 minutes
- Resume with the --resume flag (for discovery)
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r scripts/requirements.txt
For faster testing, download only one framework:
python scripts/01_discover_docs.py --frameworks swift
python scripts/02_download_json.py --frameworks swift
python scripts/03_json_to_markdown.py --frameworks swift
Discovery supports resume:
python scripts/01_discover_docs.py --resume
Download automatically skips existing files.
python scripts/update_check.py --frameworks swiftui
python scripts/update_pull.py --frameworks swiftui
All scripts support --help:
python scripts/01_discover_docs.py --help
python scripts/02_download_json.py --help
python scripts/03_json_to_markdown.py --help
python scripts/04_markdown_to_pdf.py --help
python scripts/05_markdown_to_html.py --help
python scripts/update_check.py --help
python scripts/update_pull.py --help
python scripts/update_status.py --help
To download the complete Apple Developer Documentation for all frameworks:
# Full download sequence (runs for 12-24 hours)
python scripts/01_discover_docs.py
python scripts/02_download_json.py
python scripts/03_json_to_markdown.py
Expected Results:
- ~68,500 pages across all frameworks
- ~3-5 GB total storage
- 12-24 hours download time (due to rate limiting)
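As a sanity check on these figures, the page count and the download rate limit stated in this README give a rough lower bound for the download step alone. The calculation ignores network latency, retries, and the separate discovery and conversion steps, so real runs take longer.

```python
PAGES = 68_500            # approximate page count from this README
REQUESTS_PER_SECOND = 5   # download rate limit from this README

seconds = PAGES / REQUESTS_PER_SECOND
hours = seconds / 3600
print(f"{seconds:.0f} s = {hours:.1f} h")  # 13700 s = 3.8 h
```

Roughly 3.8 hours of pure request time is consistent with the 4-12 hour range quoted for the download step once latency and backoff are added.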
Run in background:
# macOS/Linux - run overnight
nohup bash -c '
python scripts/01_discover_docs.py && \
python scripts/02_download_json.py && \
python scripts/03_json_to_markdown.py
' > download.log 2>&1 &
# Keep Mac awake during download
caffeinate -i -t 86400 &
# Monitor progress
tail -f download.log
After downloading markdown files, generate searchable PDFs:
# Install pandoc (required for PDF generation)
brew install pandoc basictex # macOS
# sudo apt-get install pandoc texlive-xetex # Linux
# Install Python dependency
pip install markdown
# Generate PDFs (one per framework, recommended sizes)
python scripts/04_markdown_to_pdf.py --framework swift --max-files 500
python scripts/04_markdown_to_pdf.py --framework swiftui --max-files 300
python scripts/04_markdown_to_pdf.py --framework foundation --max-files 400
PDF Features:
- ✅ Auto-generated table of contents
- ✅ Syntax highlighting for code blocks
- ✅ Numbered sections
- ✅ Working hyperlinks
- ✅ Professional layout (1-inch margins, 10pt font)
Output: pdf/<framework>_documentation.pdf (~20-50 MB each)
Framework-Specific Recommendations:
| Framework | Recommended Files | Expected PDF Size | Coverage |
|---|---|---|---|
| swift | 500 | ~50 MB | Core language features |
| swiftui | 300 | ~30 MB | Essential UI components |
| uikit | 400 | ~40 MB | Major view controllers & views |
| foundation | 400 | ~40 MB | Core data types & APIs |
| coredata | 200 | ~20 MB | Full framework |
| combine | 150 | ~15 MB | Full framework |
| swiftdata | 100 | ~10 MB | Full framework |
| coreml | 200 | ~20 MB | ML essentials |
| mapkit | 150 | ~15 MB | Map & location APIs |
| avfoundation | 250 | ~25 MB | A/V processing APIs |
Test with smaller subset first:
# Generate a small test PDF (50 pages)
python scripts/04_markdown_to_pdf.py --framework swift --max-files 50
open pdf/swift_documentation.pdf
Create a browsable, searchable static HTML website:
# Install dependency
pip install markdown
# Generate HTML for all frameworks
python scripts/05_markdown_to_html.py
# Or specific frameworks only
python scripts/05_markdown_to_html.py --frameworks swift swiftui
# Open in browser
open html/index.html
HTML Features:
- ✅ Complete offline browsing
- ✅ Search functionality per framework
- ✅ Apple-style dark code highlighting
- ✅ Responsive design
- ✅ Fast navigation
- ✅ No server required
Output: html/ directory with complete static website
After downloading all frameworks and generating documentation:
Apple-Developer-Documentation-Offline-Archive/
├── markdown/ # AI-optimized Markdown (for LLMs, RAG systems)
│ ├── swift/ # ~30,000 files (~500 MB)
│ ├── swiftui/ # ~5,000 files (~80 MB)
│ ├── uikit/ # ~8,000 files (~130 MB)
│ ├── foundation/ # ~12,000 files (~200 MB)
│ └── ... # 6 more frameworks
│
├── pdf/ # Human-readable PDFs
│ ├── swift_documentation.pdf
│ ├── swiftui_documentation.pdf
│ ├── foundation_documentation.pdf
│ └── ...
│
├── html/ # Browsable HTML documentation
│ ├── index.html
│ ├── swift/
│ │ ├── index.html
│ │ └── ...
│ └── ...
│
├── raw-json/ # Original Apple JSON (backup)
│ ├── swift/ # ~1.2 GB
│ ├── swiftui/ # ~200 MB
│ └── ...
│
├── .docsync/ # Update tracking
│ ├── manifest.json # ETags & hashes
│ ├── versions/ # Version snapshots
│ └── changelog/ # Update history
│
└── scripts/ # Python tools
├── 01_discover_docs.py
├── 02_download_json.py
├── 03_json_to_markdown.py
├── 04_markdown_to_pdf.py
├── 05_markdown_to_html.py
├── update_check.py
├── update_pull.py
└── update_status.py
- Starts with framework root URLs (e.g., swift.json)
- Recursively extracts references from topicSections and references
- Builds complete documentation hierarchy
- Saves to index.json
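The discovery steps above amount to a graph traversal with a visited set. The sketch below runs against a tiny in-memory stand-in instead of the network; the page contents, the simplified identifiers, and the helper names extract_links and discover are assumptions for illustration, and Apple's real JSON is richer than this.

```python
from collections import deque

# Tiny in-memory stand-in for Apple's JSON pages (contents are made up).
FAKE_PAGES = {
    "/documentation/swift": {
        "topicSections": [{"identifiers": ["/documentation/swift/array"]}],
        "references": {"/documentation/swift/string": {}},
    },
    "/documentation/swift/array": {
        "topicSections": [],
        "references": {"/documentation/swift": {}},  # back-reference, already seen
    },
    "/documentation/swift/string": {"topicSections": [], "references": {}},
}


def extract_links(page):
    """Collect link targets from topicSections and references."""
    links = []
    for section in page.get("topicSections", []):
        links.extend(section.get("identifiers", []))
    links.extend(page.get("references", {}).keys())
    return links


def discover(root, fetch):
    """Breadth-first crawl from `root`; `fetch` maps a path to its page dict."""
    seen = {root}
    queue = deque([root])
    order = []
    while queue:
        path = queue.popleft()
        order.append(path)
        for link in extract_links(fetch(path)):
            if link not in seen:  # the visited set prevents infinite loops
                seen.add(link)
                queue.append(link)
    return order


index = discover("/documentation/swift", FAKE_PAGES.__getitem__)
print(index)
```

In a real crawler, fetch would perform a rate-limited HTTP request, and the resulting list (or a nested hierarchy built from it) would be written to index.json.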
- Reads index from discovery step
- Downloads each page with rate limiting
- Stores ETag and content hash in manifest
- Creates .docsync/manifest.json for update tracking
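To make the manifest idea concrete, here is one possible shape for a per-page entry. The field names (etag, sha256, size), the page key, and the ETag value are assumptions for this sketch; the actual layout of .docsync/manifest.json may differ.

```python
import hashlib
import json


def manifest_entry(body, etag):
    """Build a manifest record from the downloaded bytes and the server's ETag."""
    return {
        "etag": etag,
        "sha256": hashlib.sha256(body).hexdigest(),  # content hash of the raw bytes
        "size": len(body),
    }


# Hypothetical downloaded page and ETag value
body = json.dumps({"metadata": {"title": "Array"}}).encode("utf-8")
manifest = {"swift/array": manifest_entry(body, '"abc123"')}
print(json.dumps(manifest, indent=2))
```

Storing a content hash next to the ETag gives a fallback comparison if a server ever changes an ETag without changing the content.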
Converts Apple's JSON schema to Markdown:
JSON Structure → Markdown Output
═══════════════════════════════════════════
metadata → YAML frontmatter
primaryContentSections → Main content
├─ heading → ## Headers
├─ paragraph → Text paragraphs
├─ codeListing → ```swift code blocks
├─ unorderedList → - Bullet lists
└─ orderedList → 1. Numbered lists
topicSections → ## Topics section
references → Internal links
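The mapping above can be sketched as a small dispatch on the node type. This is a simplified illustration, not the converter in 03_json_to_markdown.py: the flat text, items, code, syntax, and level fields are assumptions, and Apple's real content nodes nest inline content more deeply.

```python
FENCE = "`" * 3  # built at runtime so this example nests cleanly in Markdown


def render_node(node):
    """Render one simplified content node as Markdown."""
    kind = node.get("type")
    if kind == "heading":
        return "#" * node.get("level", 2) + " " + node["text"]
    if kind == "paragraph":
        return node["text"]
    if kind == "codeListing":
        lang = node.get("syntax", "")
        return FENCE + lang + "\n" + "\n".join(node["code"]) + "\n" + FENCE
    if kind == "unorderedList":
        return "\n".join(f"- {item}" for item in node["items"])
    if kind == "orderedList":
        return "\n".join(f"{i}. {item}" for i, item in enumerate(node["items"], 1))
    return ""  # unknown node types are skipped in this sketch


# Hypothetical content nodes
nodes = [
    {"type": "heading", "level": 2, "text": "Overview"},
    {"type": "paragraph", "text": "An ordered, random-access collection."},
    {"type": "codeListing", "syntax": "swift", "code": ["let xs = [1, 2, 3]"]},
    {"type": "orderedList", "items": ["Create", "Append"]},
]
markdown = "\n\n".join(render_node(n) for n in nodes)
print(markdown)
```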
- Checks ETags via HTTP HEAD requests
- Compares with stored ETags in manifest
- Only downloads changed pages
- Generates changelog
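The comparison step above can be reduced to a dictionary diff. In this sketch the stored ETags stand in for the manifest and the remote ETags stand in for values read from HEAD responses; the function name changed_pages, the page keys, and the extra new and removed categories are assumptions beyond what this README describes.

```python
def changed_pages(stored, remote):
    """Classify pages by comparing stored ETags with freshly fetched ones."""
    result = {"changed": [], "new": [], "removed": []}
    for page, etag in remote.items():
        if page not in stored:
            result["new"].append(page)
        elif stored[page] != etag:
            result["changed"].append(page)
    result["removed"] = [page for page in stored if page not in remote]
    return result


# Hypothetical ETag values
stored = {"swiftui/tabview": '"v1"', "swift/array": '"v7"'}
remote = {"swiftui/tabview": '"v2"', "swift/array": '"v7"', "swift/span": '"v1"'}
diff = changed_pages(stored, remote)
print(diff)  # {'changed': ['swiftui/tabview'], 'new': ['swift/span'], 'removed': []}
```

Only the pages listed under changed (and, in this sketch, new) would need a full download, which is why an update run is much cheaper than the initial one.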
# Activate environment
source venv/bin/activate
# Check status
python scripts/update_status.py
# Check for updates
python scripts/update_check.py
# Download if available
python scripts/update_pull.py
# Review changes
cat .docsync/changelog/*.md
- PDF generation (04_markdown_to_pdf.py) ✅
- HTML documentation website (05_markdown_to_html.py) ✅
- Update history viewer (update_history.py)
- Version rollback (update_rollback.py)
- Swift Book integration
- Full-text search index for HTML
- Desktop app wrapper (Electron/Tauri)
This tool downloads publicly available Apple Developer Documentation for offline use.
Important:
- Documentation content © Apple Inc.
- For personal, non-commercial use only
- Respects Apple's servers via rate limiting
- Not affiliated with or endorsed by Apple
Contributions welcome! Areas for improvement:
- Additional framework support
- PDF generation templates
- Search functionality
- Performance optimizations
- Bug fixes
- Check this README
- Review script output for error messages
- Use the --help flag for script-specific documentation
- Check existing issues
Built with:
- aiohttp - Async HTTP
- BeautifulSoup - HTML parsing
- tqdm - Progress bars
- PyYAML - YAML processing
Made for the developer community 🚀 Download once, reference forever