A Snapshot-Hypervisor for Python Tests
Replace pytest's execution model with microsecond-scale memory snapshots.
Alpha Release: APIs may change. Not recommended for production use. See CHANGELOG.md for roadmap.
Tach is a Runtime Hypervisor for Python Tests. It abandons the traditional process-creation model (fork() or spawn()) in favor of a Snapshot/Restore architecture built on Linux userfaultfd.
Instead of creating a new process for every test (~2ms + import time), Tach creates a process once, captures a memory snapshot, runs a test, and restores the memory state in less than 50 microseconds.
Traditional test runners suffer from three fundamental performance bottlenecks:
- Import Tax: Python module imports are expensive. `import pandas` takes 200ms+.
- Fork Safety: `fork()` copies locked mutexes from background threads, causing deadlocks.
- Allocator Churn: Python's `obmalloc` fragments memory, making snapshots unstable.
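To make the scale of these costs concrete, here is a rough overhead model built only from the figures quoted above (~2ms per process, 200ms for a heavy import, under 50 microseconds per restore). It is a sketch, not a benchmark: the function names are illustrative, test execution time is ignored, and the restore figure is treated as an upper bound.

```python
# Back-of-the-envelope overhead model using the figures quoted above.
# All costs are in microseconds; the time spent running tests is ignored.
FORK_US = 2_000        # ~2 ms to create a process
IMPORT_US = 200_000    # ~200 ms for a heavy import such as pandas
RESTORE_US = 50        # < 50 us snapshot restore (upper bound)


def per_test_process_overhead(num_tests: int) -> int:
    """Every test pays for process creation and imports."""
    return num_tests * (FORK_US + IMPORT_US)


def snapshot_overhead(num_tests: int) -> int:
    """Pay for process creation and imports once, then one restore per test."""
    return (FORK_US + IMPORT_US) + num_tests * RESTORE_US


if __name__ == "__main__":
    n = 1_000
    print(f"per-test process: {per_test_process_overhead(n) / 1e6:.3f} s")
    print(f"snapshot/restore: {snapshot_overhead(n) / 1e6:.3f} s")
```

For 1,000 tests this model gives 202 seconds of setup overhead against roughly a quarter of a second. Real speedups will be smaller because the tests themselves still take time to run.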
flowchart LR
subgraph Traditional["TRADITIONAL (pytest-xdist)"]
direction TB
T1[Fork] --> T2[Import] --> T3[Run Test] --> T4[Exit]
T5[Fork] --> T6[Import] --> T7[Run Test] --> T8[Exit]
end
subgraph Tach["TACH HYPERVISOR"]
direction TB
Z1[Initialize Once] --> Z2[Snapshot]
Z2 --> W1 & W2 & WN
subgraph Workers["Parallel Workers"]
W1["Worker 1:<br/>Run → Reset → Run..."]
W2["Worker 2:<br/>Run → Reset → Run..."]
WN["Worker N:<br/>Run → Reset → Run..."]
end
end
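The Run → Reset → Run loop in the diagram can be mimicked in pure Python, purely as an analogy. Tach resets process memory through userfaultfd, whereas this sketch deep-copies a dictionary. `SnapshotWorker` and `polluting_test` are hypothetical names, not part of Tach.

```python
import copy


class SnapshotWorker:
    """Toy stand-in for a worker: state is a plain dict, not process memory."""

    def __init__(self, initial_state: dict):
        self.state = initial_state
        # "Snapshot" step: keep a pristine copy of the initialized state.
        self._snapshot = copy.deepcopy(initial_state)

    def run(self, test):
        """Run one test against the current state, then reset."""
        try:
            return test(self.state)
        finally:
            # "Reset" step: discard whatever the test mutated.
            self.state = copy.deepcopy(self._snapshot)


def polluting_test(state):
    """A test that leaves data behind in shared state."""
    state["cache"].append("leftover")
    return len(state["cache"])


worker = SnapshotWorker({"cache": []})
first = worker.run(polluting_test)
second = worker.run(polluting_test)  # sees a clean cache again
```

Both runs observe an empty cache on entry, which is the isolation property a per-test process would otherwise provide.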
| Metric | pytest (Standard) | Tach (Hypervisor) |
|---|---|---|
| Reset Latency | ~200ms | < 50us |
| Throughput | 1x | 100x+ |
| Fork Safety | Unsafe (Deadlocks) | Safe (Lock Reset) |
| Security | None | Landlock + Seccomp |
# Clone and build
git clone https://github.com/user/tach-core.git && cd tach-core
python -m venv .venv && source .venv/bin/activate && pip install pytest
export PYO3_PYTHON=$(which python) && cargo build --release
# Run tests
./target/release/tach-core .
# With coverage (Python 3.12+)
./target/release/tach-core --coverage .

| Requirement | Specification |
|---|---|
| OS | Linux Kernel 5.13+ (Ubuntu 22.04+, Fedora 34+) |
| Python | 3.10+ (3.12+ for PEP 669 coverage) |
| Rust | 1.75+ |
| Privileges | CAP_SYS_PTRACE or root |
Tach uses a modular architecture organized into 5 domain-based subdirectories:
src/
├── core/ # Infrastructure
│ ├── allocator.rs # Jemalloc integration
│ ├── config.rs # Configuration engine
│ ├── environment.rs # Environment variable handling
│ ├── lifecycle.rs # Process lifecycle management
│ ├── protocol.rs # Binary IPC protocol
│ └── signals.rs # Signal handling
│
├── discovery/ # Test discovery
│ ├── scanner.rs # AST-based test scanning
│ ├── resolver.rs # Fixture resolution
│ ├── loader.rs # Zero-copy bytecode loading
│ ├── graph.rs # Dependency graph (petgraph)
│ └── analysis.rs # Toxicity analysis
│
├── execution/ # Test execution
│ ├── scheduler.rs # Dual-queue test scheduler
│ ├── watch.rs # File watch mode
│ └── zygote.rs # Process spawning and worker pool
│
├── isolation/ # Process isolation
│ ├── namespace.rs # Linux namespaces
│ ├── sandbox.rs # Landlock + Seccomp (Iron Dome)
│ └── snapshot.rs # userfaultfd memory snapshots
│
├── reporting/ # Result reporting
│ ├── reporter.rs # Progress and output formatting
│ ├── junit.rs # JUnit XML generation
│ ├── logcapture.rs # stdout/stderr capture
│ ├── debugger.rs # Debug output
│ └── coverage.rs # PEP 669 coverage collection
│
├── lib.rs # Library entry point (re-exports all modules)
├── main.rs # CLI entry point
└── tach_harness.py # Python test harness
All modules are re-exported at the top level via lib.rs for backward compatibility.
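The tree lists scanner.rs as AST-based test scanning. The idea can be sketched with Python's own `ast` module: test names are found by parsing a file, never by importing it. This is a simplified illustration, not Tach's scanner, which is written in Rust. The `test_` and `Test` naming rules are assumed from pytest's defaults, and `discover_tests` is a hypothetical helper.

```python
import ast


def discover_tests(source: str) -> list[str]:
    """Return test identifiers found by parsing, without executing the file."""
    tree = ast.parse(source)
    function_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    found = []
    for node in tree.body:
        if isinstance(node, function_types) and node.name.startswith("test_"):
            found.append(node.name)
        elif isinstance(node, ast.ClassDef) and node.name.startswith("Test"):
            for item in node.body:
                if isinstance(item, function_types) and item.name.startswith("test_"):
                    found.append(f"{node.name}::{item.name}")
    return found


SAMPLE = '''
import pandas  # never executed: the file is only parsed

def helper():
    pass

def test_addition():
    assert 1 + 1 == 2

class TestMath:
    def test_multiply(self):
        assert 2 * 3 == 6

    def setup_method(self):
        pass
'''

tests = discover_tests(SAMPLE)
```

Because the source is only parsed, the `import pandas` line costs nothing during discovery.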
Tach consists of 5 domain modules with interconnected subsystems:
flowchart TB
subgraph Supervisor["RUST SUPERVISOR"]
subgraph Core["core/"]
Config["Config Engine"]
Protocol["IPC Protocol"]
Allocator["Allocator<br/>(Jemalloc)"]
end
subgraph DiscoveryMod["discovery/"]
Scanner["Scanner<br/>(rustpython-parser)"]
Analysis["Toxicity Analyzer<br/>(petgraph)"]
Loader["Zero-Copy Loader<br/>(PyMarshal)"]
end
subgraph Execution["execution/"]
Scheduler["Test Scheduler<br/>(Dual-Queue)"]
Zygote["Zygote<br/>(Worker Pool)"]
end
subgraph IsolationMod["isolation/"]
Sandbox["Iron Dome<br/>(Landlock+Seccomp)"]
Namespace["Namespaces<br/>(OverlayFS)"]
Snapshot["Physics Engine<br/>(userfaultfd)"]
end
subgraph Reporting["reporting/"]
Reporter["Reporter<br/>(indicatif)"]
Coverage["Coverage<br/>(PEP 669)"]
end
end
subgraph Worker["PYTHON WORKER"]
Harness["Test Harness<br/>(tach_harness.py)"]
end
Scanner --> Analysis --> Scheduler
Loader --> Zygote
Scheduler --> Zygote
Zygote --> Sandbox --> Namespace --> Harness
Harness --> Coverage
Snapshot <--> Worker
Allocator --> Snapshot
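The Zero-Copy Loader node in the diagram is labeled PyMarshal. At the Python level, the counterpart of the C API's marshal readers is `marshal.loads`, so the idea of rebuilding a module from serialized bytecode can be sketched as follows. This is an illustration under assumptions: `load_module_from_bytes` is a hypothetical helper, and Tach's loader performs the equivalent step from Rust.

```python
import marshal
import types

SOURCE = "ANSWER = 6 * 7\n"

# Compile once and serialize the code object, much like the body of a .pyc file.
code = compile(SOURCE, "<demo_module>", "exec")
payload = marshal.dumps(code)


def load_module_from_bytes(name: str, data: bytes) -> types.ModuleType:
    """Rebuild a module from marshalled bytecode without going through importlib."""
    module = types.ModuleType(name)
    exec(marshal.loads(data), module.__dict__)
    return module


demo = load_module_from_bytes("demo_module", payload)
```

Marshalled bytecode is specific to the Python version that produced it, so any scheme like this has to compile and load with the same interpreter.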
Detailed technical documentation for each subsystem:
| Document | Description |
|---|---|
| Architecture | |
| Overview | System architecture and component interactions |
| Discovery Engine | AST-based test discovery with rustpython-parser |
| Zero-Copy Loader | Bytecode compilation and injection |
| Toxicity Analysis | Module toxicity detection and propagation |
| Physics Engine | userfaultfd memory snapshots |
| Zygote Lifecycle | Process management and worker spawning |
| Iron Dome | Landlock and Seccomp security |
| Isolation | Namespaces and OverlayFS |
| Coverage System | PEP 669 ring buffer coverage |
| Fixture Resolver | Fixture discovery and resolution |
| Allocator | Jemalloc integration |
| IPC Protocol | Binary protocol and message format |
| Scheduler | Test scheduling and dispatch |
| Reporter | Output formatting and progress |
| TTY Debugger | Interactive debugging via breakpoint() |
| Restoration Physics | Memory restoration invariants |
| Security | |
| EPERM Doctrine | Kernel-level security validation |
| Operations | |
| CI Runner Setup | Self-hosted runner configuration |
| WSL2 Setup | Platform-specific setup for WSL2 |
| Reference | |
| Configuration | CLI, pyproject.toml, environment variables |
| Development | Build, test, project structure |
| Troubleshooting | Common issues and debug commands |
| API Reference | FFI functions and data structures |
| Decisions | |
| ADR: Rust 2024 Edition | Edition migration rationale |
# Run all tests
tach-core .
# Run specific file
tach-core tests/test_example.py
# Parallel execution with 4 workers
tach-core -n 4 .
# Filter tests by keyword or marker
tach-core -k "network" .
tach-core -m "not slow" .
# Fail fast - stop on first failure
tach-core -x .
# Verbose output
tach-core -v .
# List tests without running
tach-core list .
# Self-test kernel support
tach-core self-test
# Enable coverage
tach-core --coverage .
# JSON output
tach-core --format json .
# JUnit XML report
tach-core --junit-xml results.xml .
# Disable sandbox (development)
tach-core --no-isolation .
# Watch mode
tach-core --watch .

See Configuration Reference for all CLI flags and options.
Configure via pyproject.toml:
[tool.tach]
test_pattern = "test_*.py"
timeout = 60
workers = 4
[tool.tach.coverage]
enabled = true
source = ["src"]
omit = ["**/test_*"]
format = "lcov"
[tool.pytest_env]
DATABASE_URL = "sqlite:///:memory:"

See Configuration Reference for full details.
| Component | Status |
|---|---|
| Physics Check (userfaultfd) | Complete |
| Zero-Copy Loader | Complete |
| Toxicity Filter | Complete |
| Worker Loop | Complete |
| Coverage (PEP 669) | Complete |
| Iron Dome (Sandbox) | Complete |
| Hot Reload | Complete |
| Allocator (Jemalloc) | Complete |
| Coverage Resolution | Complete |
| Configuration Engine | Complete |
| Progress Reporter | Complete |
| Security Hardening | Complete |
See CHANGELOG.md for version history.
- Zero-Copy Module Loading: Bypasses `importlib` entirely via `PyMarshal_ReadObjectFromString`
- userfaultfd Snapshots: Sub-50us memory reset via `madvise(MADV_DONTNEED)`
- Landlock + Seccomp: Defense-in-depth sandbox for worker processes
- PEP 669 Coverage: Lock-free ring buffer with `memfd_create`
- Jemalloc Integration: Deterministic heap via `mallctl` tcache flush
- Toxicity Propagation: Fixed-point algorithm over petgraph dependency graph
- Django Integration: Automatic transaction rollback and connection pooling
- Async Support: Built-in asyncio loop management for coroutine tests
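The fixed-point idea behind toxicity propagation can be sketched in a few lines. The sketch rests on assumptions this README does not spell out: that a module becomes toxic when it imports a toxic module, and that the seed set comes from some earlier analysis. The module names and `propagate_toxicity` are hypothetical, and Tach's implementation runs over a petgraph graph in Rust.

```python
def propagate_toxicity(imports: dict[str, set[str]], seeds: set[str]) -> set[str]:
    """Mark every module that transitively imports a toxic module.

    `imports` maps each module to the modules it imports directly.
    The loop repeats until a full pass adds nothing new (the fixed point).
    """
    toxic = set(seeds)
    changed = True
    while changed:
        changed = False
        for module, deps in imports.items():
            if module not in toxic and deps & toxic:
                toxic.add(module)
                changed = True
    return toxic


IMPORTS = {
    "tests.test_api": {"app.client"},
    "app.client": {"app.threads"},
    "app.threads": set(),
    "tests.test_math": {"app.math"},
    "app.math": set(),
}

result = propagate_toxicity(IMPORTS, seeds={"app.threads"})
```

Here `tests.test_api` ends up toxic through `app.client`, while `tests.test_math` stays clean because nothing it imports reaches the seed.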
Tach implements comprehensive security hardening across all subsystems:
| Fix | Description |
|---|---|
| Static Mut Elimination | Replaced static mut with OnceLock/Mutex for thread-safe global state |
| Dangling Pointer Prevention | Fixed CString lifetime issues in FFI calls to prevent use-after-free |
| TOCTOU Race Fix | Lock-free CAS loop in ring buffer prevents race conditions |
| Mutex Poisoning Recovery | All mutex locks use unwrap_or_else(\|e\| e.into_inner()) for crash resilience |
| Fix | Description |
|---|---|
| Seccomp Hardening | Added ptrace, mount, umount2, unshare, setns to syscall blacklist |
| Landlock TOCTOU | Removed path.exists() check, handle ENOENT atomically |
| Environment Denylist | Blocks 11 dangerous env vars: LD_PRELOAD, PYTHONPATH, PYTHONMALLOC, etc. |
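An environment denylist of the kind described in the last row can be sketched as a filter applied before a worker starts. Only the three variable names quoted in the table are used; the remaining entries are not listed here and are left out. Whether Tach strips these variables or refuses to run is not stated, so stripping is an assumption, and `scrub_environment` is a hypothetical helper.

```python
# Only the three names quoted in the table; the full list has 11 entries.
DENYLIST = frozenset({"LD_PRELOAD", "PYTHONPATH", "PYTHONMALLOC"})


def scrub_environment(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with denylisted variables removed."""
    return {key: value for key, value in env.items() if key not in DENYLIST}


parent_env = {
    "PATH": "/usr/bin",
    "LD_PRELOAD": "/tmp/injected.so",
    "PYTHONPATH": "/tmp",
}
worker_env = scrub_environment(parent_env)
```

The function returns a new dictionary, so the parent's environment is left untouched.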
| Fix | Description |
|---|---|
| RwLock for Read-Heavy Data | code_map uses RwLock instead of Mutex for concurrent reads |
| Zero-Copy Data Extraction | take_data() uses std::mem::take() to avoid HashMap cloning |
| Pre-sized Collections | Thread-local HashSet pre-allocated with 1024 capacity |
Regression tests added across critical subsystems:
- `namespace.rs`: Path logic, overlay options, isolation bypass
- `logcapture.rs`: memfd operations, read/clear, fd lifecycle
- `scheduler.rs`: Queue separation, priority dispatch, slot calculation
MIT License. See LICENSE for details.
# Setup
python -m venv .venv && source .venv/bin/activate && pip install pytest
rustup component add rustfmt clippy
# Build
export PYO3_PYTHON=$(which python)
cargo build --release
# Test
cargo test --lib # Unit tests
cargo test --test '*' # Integration tests
# Lint
cargo fmt --check && cargo clippy -- -D warnings

This project uses pre-commit for automated code quality checks:
pip install pre-commit && pre-commit install

See Development Guide for complete build commands, testing details, and project structure.
Built with Rust for performance and reliability.