Pyrope: AI-Controlled Vector Database

Cut AI search costs by 50% and speed up responses by 3x.

Pyrope is an "AI Cache Controlled" Vector Database designed to optimize both latency (P99) and compute costs (GPU/CPU seconds) for AI applications. It uniquely combines a high-performance frontend, an industry-standard ANN engine, and an intelligent AI controller to manage caching and query routing dynamically.

🚀 Key Features

AI-Driven Caching: A 2-stage architecture (Hot/Warm paths) that intelligently decides what to cache (Admission), how long (Variable TTL), and when to evict based on predicted future utility and cost.
Cost-Aware Semantic Cache: Goes beyond exact matches. Pyrope uses clustering and quantization to answer "similar" queries from cache, trading off a specific error margin for significant cost savings (30-50% reduction in FAISS compute).
Cost-Aware Query Routing: Automatically adjusts search parameters (e.g., nprobe, efSearch) based on predicted query cost and tenant quotas, ensuring stability under load.
Query Prediction & Prefetching: Learns from user session history to pre-calculate results for potential next queries, drastically reducing latency for interactive RAG applications.
SLO Guardrails: Includes fail-safes that automatically downgrade precision or fallback to rule-based caching if P99 latency spikes, ensuring service reliability.

🏗 Architecture

Pyrope uses a robust, layered architecture:

Front/Cache Layer (Garnet):
- Handles RESP-compatible commands.
- Manages Result, Candidate, and Meta caches.
- Executes lightweight "Hot Path" policy decisions (< 0.1ms).
Vector Engine (Native C#):
- HNSW: High-performance graph index (SIMD-accelerated) for low-latency search.
- IVF-PQ: Product Quantization for high-scale, memory-efficient indexing.
- Delta Indexing: Real-time updates via Head (HNSW/Flat) + Tail (IVF-PQ/Flat) generic architecture.
AI Cache Controller:
- Warm Path (Sidecar): Runs complex inference (Python/ONNX) to update caching policies and scoring models asynchronously (10-50ms).
- Learns and evolves caching strategies continuously based on query logs.

⚙️ Configuration

Pyrope uses standard .NET configuration (appsettings/environment variables). Sidecar settings:

Setting	Default	Description
`Sidecar:Endpoint`	(unset)	gRPC endpoint for the AI sidecar (also supports `PYROPE_SIDECAR_ENDPOINT`).
`Sidecar:MetricsIntervalSeconds`	10	Interval in seconds between metrics reports to the sidecar.
`Sidecar:WarmPathTimeoutMs`	50	Timeout in milliseconds for warm-path responses before falling back to cached rules and incrementing `ai_fallback_total`.

🎯 Use Cases

RAG (Retrieval-Augmented Generation): Stabilize P99 latency and reduce the cost of repetitive semantic queries.
Search Infrastructure: Simplify ANN tuning and operation complexity for ML engineers.
High-Traffic AI Services: Manage "noisy neighbor" problems with multi-tenant QoS and strict quotas.
FinOps: Directly control and reduce the unit cost of vector search requests.

🛠 Usage (Conceptual)

Pyrope speaks the RESP protocol (Redis serialization), making it compatible with many existing clients for basic commands, while offering extended commands for vector operations.

Vector Search

# Search for top 10 similar vectors
VEC.SEARCH my_app main_idx TOPK 10 VECTOR \x00\x01...

Adding Vectors

# Add a new vector with metadata
VEC.ADD my_app main_idx "doc1" VECTOR \x00\x01... META {"category":"news"}

📈 Benchmarking (P1-10)

Pyrope includes a benchmarking tool to load common datasets and measure latency/QPS/Recall.

Start server

# Set Admin Key and start server
PYROPE_ADMIN_API_KEY=your_admin_key dotnet run --project src/Pyrope.GarnetServer -- --port 3278 --bind 127.0.0.1

Run benchmark (Quick Start)

Baseline (IVF-Flat):

./scripts/bench_vectors.sh \
  --dataset synthetic \
  --dim 128 --base-limit 10000 --query-limit 100 \
  --algorithm IVF_FLAT --params nlist=100 \
  --api-key tenant1 --http http://localhost:5000 --admin-api-key your_admin_key --build-index

High Performance (HNSW):

./scripts/bench_vectors.sh \
  --dataset synthetic \
  --dim 128 --base-limit 10000 --query-limit 100 \
  --algorithm HNSW --params m=16,ef_construction=200,ef_search=50 \
  --api-key tenant1 --http http://localhost:5000 --admin-api-key your_admin_key --build-index

Memory Efficient (IVF-PQ):

./scripts/bench_vectors.sh \
  --dataset synthetic \
  --dim 128 --base-limit 10000 --query-limit 100 \
  --algorithm IVF_PQ --params "nlist=100,m=4,k=256" \
  --api-key tenant1 --http http://localhost:5000 --admin-api-key your_admin_key --build-index

Benchmark Options

Option	Description
`--algorithm`	`HNSW`, `IVF_PQ`, `IVF_FLAT`, `FLAT`
`--params`	Algorithm params (e.g. `nlist=1024`, `m=16`, `ef_search=50`)

📊 Comparison

Feature	Pyrope	Pinecone	Milvus	Weaviate
Search Engine	Native C# (HNSW/PQ)	Proprietary	FAISS/HNSW	HNSW
AI Cache Control	✅ Unique	❌	❌	❌
Semantic Cache	✅ Cost-Aware	❌	❌	❌
SLO Guardrails	✅ Pro	❌	❌	Partial
Query Prediction	✅	❌	❌	❌

📄 License

[TBD]

Name		Name	Last commit message	Last commit date
Latest commit History 197 Commits
.github/workflows		.github/workflows
docs/benchmarks		docs/benchmarks
example		example
hackathon		hackathon
prompt		prompt
review		review
scripts		scripts
src		src
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
AGENTS.md		AGENTS.md
Directory.Build.props		Directory.Build.props
Directory.Build.targets		Directory.Build.targets
Pyrope.sln		Pyrope.sln
README.md		README.md
build_log.txt		build_log.txt
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Pyrope: AI-Controlled Vector Database

🚀 Key Features

🏗 Architecture

⚙️ Configuration

🎯 Use Cases

🛠 Usage (Conceptual)

Vector Search

Adding Vectors

📈 Benchmarking (P1-10)

Start server

Run benchmark (Quick Start)

Benchmark Options

📊 Comparison

📄 License

About

Uh oh!

Releases

Packages

Languages

takurot/Pyrope

Folders and files

Latest commit

History

Repository files navigation

Pyrope: AI-Controlled Vector Database

🚀 Key Features

🏗 Architecture

⚙️ Configuration

🎯 Use Cases

🛠 Usage (Conceptual)

Vector Search

Adding Vectors

📈 Benchmarking (P1-10)

Start server

Run benchmark (Quick Start)

Benchmark Options

📊 Comparison

📄 License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages