Edge Inference Fabric MVP

A minimum viable edge inference fabric that demonstrates distributed inference routing on a single machine. This proof of concept shows how inference requests can be routed to available workers based on load and latency.

Features

  • Control Plane Server: Routes inference requests to available workers
  • Device Agents: Execute inference workloads (run several on different ports to simulate multiple devices)
  • Client API: Simple REST interface for submitting inference requests
  • Web Dashboard: Visualize routing decisions and performance metrics
  • Real ML Model: Uses ONNX Runtime with MobileNetV2

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Run the Demo

python run_demo.py

This will:

  • Start the control plane on port 8000
  • Start 3 device agents on ports 8001, 8002, 8003
  • Download the MobileNetV2 model if needed

3. Access the Dashboard

Open http://localhost:8000 in your browser to see:

  • Active devices with CPU/memory stats
  • Real-time metrics
  • Test inference button

4. Run Tests

python test_fabric.py

Architecture

┌─────────────────┐     ┌─────────────────┐
│     Client      │────▶│  Control Plane  │
│  (API/Browser)  │     │   (Port 8000)   │
└─────────────────┘     └────────┬────────┘
                                 │
                    ┌────────────┼────────────┐
                    │            │            │
                    ▼            ▼            ▼
             ┌──────────┐ ┌──────────┐ ┌──────────┐
             │ Device 1 │ │ Device 2 │ │ Device 3 │
             │ (8001)   │ │ (8002)   │ │ (8003)   │
             └──────────┘ └──────────┘ └──────────┘
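
As a rough sketch of the flow above: each agent announces itself to the control plane at startup through the registration endpoint. The payload fields below (device_id, host, port) are illustrative assumptions; the actual schema is defined in control_plane/models.py.

import requests

# Hypothetical registration payload; see control_plane/models.py for the real schema
payload = {"device_id": "device-1", "host": "localhost", "port": 8001}
resp = requests.post("http://localhost:8000/api/devices/register", json=payload)
print(resp.status_code)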

API Endpoints

Control Plane (http://localhost:8000)

Method  Endpoint                Description
GET     /                       Dashboard
GET     /api/devices            List all devices
POST    /api/devices/register   Register a device
POST    /api/infer              Submit inference request
GET     /api/metrics            System metrics
GET     /api/routing/strategy   Get routing strategy
POST    /api/routing/strategy   Set routing strategy
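
The routing-strategy endpoints can be exercised directly from Python. This is a minimal sketch assuming the strategy name is sent as a JSON body with a strategy field; the exact request shape may differ.

import requests

BASE = "http://localhost:8000"

# Read the current strategy
print(requests.get(f"{BASE}/api/routing/strategy").json())

# Switch to round_robin (the {"strategy": ...} body shape is an assumption)
requests.post(f"{BASE}/api/routing/strategy", json={"strategy": "round_robin"})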

Device Agent (http://localhost:800X)

Method  Endpoint   Description
GET     /health    Health check
GET     /info      Device and model info
POST    /execute   Execute inference
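
For debugging, an agent can also be called directly, bypassing the control plane. The /execute body below mirrors the control plane's /api/infer example and is an assumption; check device_agent/agent.py for the actual schema.

import requests

agent = "http://localhost:8001"
print(requests.get(f"{agent}/health").json())
print(requests.get(f"{agent}/info").json())

# Assumed request shape, mirroring /api/infer
body = {"model_name": "mobilenet", "input_data": [0.5], "input_shape": [1, 3, 224, 224]}
print(requests.post(f"{agent}/execute", json=body).json())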

Project Structure

edge-fabric-mvp/
├── control_plane/
│   ├── __init__.py
│   ├── server.py          # FastAPI app for routing
│   ├── router.py          # Routing logic
│   └── models.py          # Data models
├── device_agent/
│   ├── __init__.py
│   ├── agent.py           # Device agent
│   └── inference.py       # ONNX inference wrapper
├── client/
│   ├── __init__.py
│   └── api.py             # Client library
├── static/
│   ├── index.html         # Dashboard
│   └── style.css          # Styling
├── models/
│   ├── download_model.py  # Model downloader
│   └── mobilenetv2-7.onnx # Model file
├── requirements.txt
├── run_demo.py            # Demo runner
├── test_fabric.py         # Test suite
└── README.md

Routing Strategies

  • least_loaded (default): Routes to the device with the lowest CPU/memory usage (sketched below)
  • round_robin: Distributes requests evenly across devices
  • lowest_latency: Routes to the device with the best historical latency
  • random: Selects a device at random
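
As an illustration, the default least_loaded strategy reduces to a min over reported load. This sketch assumes DeviceInfo exposes cpu_percent and memory_percent fields; the real field names live in control_plane/models.py.

def _least_loaded_select(devices):
    # Pick the device with the lowest combined CPU + memory load.
    # cpu_percent / memory_percent are assumed field names.
    return min(devices, key=lambda d: d.cpu_percent + d.memory_percent)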

Usage Examples

Python Client

from client.api import EdgeInferenceClient
import numpy as np

client = EdgeInferenceClient("http://localhost:8000")

# Run inference
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
result = client.infer("mobilenet", input_data)
print(f"Latency: {result['latency_ms']}ms")
print(f"Device: {result['device_id']}")

# Get devices
devices = client.get_devices()
print(f"Active devices: {len(devices)}")

# Run benchmark
benchmark = client.benchmark(num_requests=100)
print(f"P95 latency: {benchmark['p95_latency_ms']}ms")

cURL

# List devices
curl http://localhost:8000/api/devices

# Submit inference
curl -X POST http://localhost:8000/api/infer \
  -H "Content-Type: application/json" \
  -d '{"model_name": "mobilenet", "input_data": [0.5], "input_shape": [1, 3, 224, 224]}'

# Get metrics
curl http://localhost:8000/api/metrics

Configuration

Running with Custom Settings

# Start with 5 agents
python run_demo.py --agents 5

# Start only control plane
python run_demo.py --no-agents

# Custom ports
python run_demo.py --control-port 9000 --agent-start-port 9001

Starting Components Individually

# Control plane
python -m uvicorn control_plane.server:app --port 8000

# Device agent
python -m device_agent.agent --port 8001 --device-id my-device

Performance

Typical performance on modern hardware:

  • Single inference: 20-50ms (MobileNetV2)
  • Routing overhead: <1ms
  • Throughput: 20-50 req/s per device

Extending

Adding a New Model

  1. Download an ONNX model into models/
  2. Update device_agent/inference.py to support the new model (see the sketch below)
  3. Agents automatically load available models on startup
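
A sketch of what step 2 might look like. The MODEL_PATHS registry and load_session helper are illustrative assumptions, not the actual structure of device_agent/inference.py.

import onnxruntime as ort

# Hypothetical name-to-file registry for models under models/
MODEL_PATHS = {
    "mobilenet": "models/mobilenetv2-7.onnx",
    "resnet18": "models/resnet18-v1-7.onnx",  # newly added model
}

def load_session(model_name: str) -> ort.InferenceSession:
    # onnxruntime parses the ONNX graph and prepares it for execution
    return ort.InferenceSession(MODEL_PATHS[model_name])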

Custom Routing Strategy

Add to control_plane/router.py:

def _my_custom_select(self, devices: list[DeviceInfo]) -> DeviceInfo:
    # Example: pick the device with the lowest reported CPU usage.
    # cpu_percent is an assumed DeviceInfo field; check models.py.
    return min(devices, key=lambda d: d.cpu_percent)
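
Depending on how router.py dispatches strategies, you may also need to map a name to the new method so it can be selected via POST /api/routing/strategy.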
