A middleware server that intercepts and modifies sampling parameters for generation requests to OpenAI-compatible backends. It can fill in model-specific parameters that a request leaves unset, or enforce overrides even for parameters the request does set. The server supports both OpenAI-compatible and Anthropic request formats, enabling the use of Claude Code with OpenAI-compatible backends.
- Parameter Override: Automatically applies custom sampling parameters to generation requests
- Model-Specific Settings: Configure different parameters for different models
- Format Conversion: Converts between Anthropic and OpenAI request/response formats
- Streaming Support: Handles both streaming and non-streaming responses
- Enforced Parameters: Option to enforce specific parameters that override all others
- Debug Logging: Comprehensive logging for troubleshooting
- Python 3.8 or higher
- pip (Python package manager)
- Clone or download the project:

  ```bash
  git clone https://github.com/avtc/sampling-proxy.git
  cd sampling-proxy
  ```

- Create a virtual environment:

  ```bash
  python -m venv sampling-proxy
  ```

- Activate the virtual environment:

  On Windows:

  ```bash
  sampling-proxy\Scripts\activate
  ```

  On macOS/Linux:

  ```bash
  source sampling-proxy/bin/activate
  ```

- Make the shell script executable:

  ```bash
  chmod +x ./sampling_proxy.sh
  ```

- Create the configuration file:

  ```bash
  cp config_sample.json config.json
  ```

  Then edit `config.json` to match your specific configuration needs.

- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```
To update your existing installation to the latest version from the git repository:
- Navigate to the project directory:

  ```bash
  cd sampling-proxy
  ```

- Activate the virtual environment:

  On Windows:

  ```bash
  sampling-proxy\Scripts\activate
  ```

  On macOS/Linux:

  ```bash
  source sampling-proxy/bin/activate
  ```

- Pull the latest changes:

  ```bash
  git pull origin main
  ```

- Update dependencies (if `requirements.txt` has changed):

  ```bash
  pip install -r requirements.txt --upgrade
  ```

- Restart the proxy server if it is currently running.
Run the proxy server with default settings:

```bash
python sampling_proxy.py
```

This will start the proxy server on http://0.0.0.0:8001 and forward requests to an OpenAI-compatible backend at http://127.0.0.1:8000/v1.
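To verify the proxy is up, you can query its root endpoint, which returns the proxy configuration and status (see the endpoint list below). A minimal check, assuming the default host and port:

```python
import requests  # third-party HTTP client: pip install requests

# The proxy's root endpoint returns its configuration and status.
status = requests.get("http://localhost:8001/")
print(status.status_code)  # expect 200 while the proxy is running
print(status.text)         # configuration and status payload
```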
To see all available options:

```bash
python sampling_proxy.py --help
```

Available options:

- `--config, -c`: Path to configuration JSON file (default: `config.json`)
- `--host`: Host address for the proxy server (overrides config)
- `--port`: Port for the proxy server (overrides config)
- `--base-path`: Base path for the proxy server (overrides config)
- `--target-base-url`: OpenAI-compatible backend base URL (overrides config)
- `--debug-logs, -d`: Enable detailed debug logging (overrides config)
- `--override-logs, -o`: Show when sampling parameters are overridden (overrides config)
- `--enforce-params, -e`: Enforce specific parameters as a JSON string (overrides config)
- Run with a custom target base URL and debug logging:

  ```bash
  python sampling_proxy.py --target-base-url http://127.0.0.1:8000/v1 --debug-logs
  ```

- Run with a custom configuration file:

  ```bash
  python sampling_proxy.py --config my-config.json
  ```

- Run with enforced parameters:

  ```bash
  python sampling_proxy.py --enforce-params '{"temperature": 0.7, "top_p": 0.9}'
  ```

- Run with override logs to see parameter changes:

  ```bash
  python sampling_proxy.py --override-logs
  ```
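If the enforced parameters live in Python, it is safer to build the `--enforce-params` JSON argument programmatically than to hand-quote it. A small sketch:

```python
import json
import shlex

# Serialize the parameters to JSON, then quote the result for the shell.
enforced = {"temperature": 0.7, "top_p": 0.9}
arg = shlex.quote(json.dumps(enforced))
print(f"python sampling_proxy.py --enforce-params {arg}")
```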
The proxy uses an external `config.json` file for configuration. A sample configuration file, `config_sample.json`, is provided; copy it to `config.json` and modify as needed. You can specify a custom config file path with the `--config` command-line argument.
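For illustration only, a configuration could be generated like the following sketch. The key names here are hypothetical guesses based on the command-line flags and parameter categories described in this README; defer to `config_sample.json` for the actual schema.

```python
import json

# Hypothetical configuration; key names are illustrative, not the real schema.
config = {
    "host": "0.0.0.0",
    "port": 8001,
    "target_base_url": "http://127.0.0.1:8000/v1",
    "default_sampling_params": {"temperature": 0.7, "top_p": 0.9},
    "model_sampling_params": {"your-model": {"temperature": 0.6}},
    "enforce_params": {},
}

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)
```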
The proxy applies sampling parameters in the following priority order, from highest to lowest (see the sketch after this list):
- Enforced sampling parameters (always override everything)
- Parameters specified in the request
- Model-specific sampling parameters
- Default sampling parameters (fallback values)
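In dictionary terms, this amounts to layering the four sources and letting higher-priority values win. A minimal sketch of that merge; the function and variable names are illustrative, not the proxy's actual internals:

```python
def resolve_sampling_params(request_params, model_params, default_params, enforced_params):
    merged = dict(default_params)   # 4. fallback values
    merged.update(model_params)     # 3. model-specific settings
    merged.update(request_params)   # 2. values set in the request
    merged.update(enforced_params)  # 1. enforced values always win
    return merged

# Example: the request's temperature survives every layer except enforcement.
print(resolve_sampling_params(
    request_params={"temperature": 1.0},
    model_params={"temperature": 0.6, "top_p": 0.95},
    default_params={"temperature": 0.7, "top_p": 0.9, "max_tokens": 512},
    enforced_params={"temperature": 0.2},
))
# {'temperature': 0.2, 'top_p': 0.95, 'max_tokens': 512}
```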
The proxy handles the following endpoints:
- `/generate` - SGLang generation endpoint
- `/completions` - OpenAI completions
- `/chat/completions` - OpenAI chat completions
- `/messages` - Anthropic messages (converted to OpenAI format)
- `/models` - List available models
- `/` - Returns proxy configuration and status
- All other endpoints are passed through to the backend
Point the OpenAI Python client at the proxy:

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8001",  # Point to the proxy
    api_key="not-required"
)

response = client.chat.completions.create(
    model="your-model",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
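Streaming works through the proxy as well (see Streaming Support above). A minimal sketch of the streamed variant, reusing the client from the previous block:

```python
# Ask the backend to stream; the proxy forwards the streamed response.
stream = client.chat.completions.create(
    model="your-model",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```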
The Anthropic client works the same way, since the proxy converts `/messages` requests to the OpenAI format:

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:8001",  # Point to the proxy
    api_key="not-required"
)

response = client.messages.create(
    model="your-model",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello!"}]
)
```

To troubleshoot, run with full logging:

```bash
python sampling_proxy.py --debug-logs --override-logs
```

Common issues:

- Connection Refused: Ensure your backend server is running and accessible
- 404 Errors: Check if the backend supports the requested endpoints
- Parameter Not Applied: Use `--override-logs` to see when parameters are being overridden
The proxy provides detailed logging including:
- Incoming requests
- Parameter overrides
- Backend communication
- Error details
This project is licensed under the MIT License. See the LICENSE file for details.
For convenience, use the provided scripts to start the proxy with the correct virtual environment.

On macOS/Linux:

```bash
./sampling_proxy.sh
```

On Windows:

```powershell
.\sampling_proxy.ps1
```

Both scripts automatically activate the sampling-proxy virtual environment and start the proxy server.