126 changes: 65 additions & 61 deletions README.md
# LIRA: Local Inference for Realtime Audio
<p align="center">
<img src="images/logo.png" alt="LIRA logo" width="1280" style="border-radius:24px; height:400px; object-fit:cover;">
</p>

**Local, efficient automatic speech recognition (ASR). Run ASR models on your local machine—fast, simple, and developer-friendly.**

LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with the `lira run` and `lira serve` commands to integrate with your apps and tools.

---

## 🧩 Supported Model Architectures & Runtimes

LIRA supports multiple speech-model architectures. Device support depends on the exported model artifacts and the chosen runtime.

| Model | Typical use case | Runs on | Supported datatypes |
|----------------------|-----------------------------------------|-----------------|------------------------------------|
| whisper-small | Low-latency, resource-constrained | CPU, NPU* | FP32, BFP16 |
| whisper-base | Balanced accuracy and performance | CPU, NPU* | FP32, BFP16 |
| whisper-medium | Higher accuracy for challenging audio | CPU, NPU* | FP32, BFP16 |
| whisper-large-v3 | Highest accuracy (more compute) | CPU | FP32, BFP16 |
| zipformer | Streaming / low-latency ASR encoder | CPU, NPU* | FP32, BFP16 |

<sub>*NPU support depends on available Vitis AI export artifacts and target hardware.</sub>

---

## 🚀 Getting Started

**Prerequisites:**

- **Python 3.11** is required.
- We recommend using **conda** for environment management.
- For the Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify the drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
- Current recommended Ryzen AI version: 1.6.0 with the 32.0.203.280 NPU driver.
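
As a quick sanity check before installing LIRA, you can confirm the prerequisites above are in place (a minimal sketch; the environment name comes from the Ryzen AI installer):

```bash
# The Ryzen AI conda environment should appear in this list
conda env list

# Inside that environment, Python should report 3.11.x
python --version
```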

**Minimal install steps:**

1. **Clone the repo and change directory:**
```bash
git clone https://github.com/amd/LIRA.git
cd LIRA
```

2. **Activate your conda environment:**
This conda environment should already be installed from the Ryzen AI SW installation mentioned earlier.
```bash
conda activate ryzen-ai-*.*.*
```
Replace `ryzen-ai-*.*.*` with the version you installed, for example `ryzen-ai-1.6.0`.

3. **Install LIRA in editable mode:**
```bash
pip install -e .
```

You can run `lira --help` to see available commands.

---

## ⚡ CLI-first Design

LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models. You can run a model locally, or host an OpenAI API-compatible server for any app you want to build on top of LIRA.

**Quick examples:**

```bash
# Run a model locally (inference)
lira run whisper --model-type whisper-base --export --device cpu --audio audio_files/test.wav
```
✅ To learn more about `lira run`, visit the detailed [Running Models with `lira run`](#-running-models-with-lira-run) section.

```bash
# Serve the model for local apps (OpenAI-compatible endpoints)
lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
```
✅ To learn more about `lira serve`, visit the detailed [LIRA Server](#️-lira-server) section.

**NPU Acceleration:**

🕙 For NPU acceleration, change `--device cpu` to `--device npu`.
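
For example, following the note above, the two quick-start commands become (all other flags unchanged):

```bash
# Run a model locally on the NPU
lira run whisper --model-type whisper-base --export --device npu --audio audio_files/test.wav

# Serve the model with NPU acceleration
lira serve --backend openai --model whisper-base --device npu --host 0.0.0.0 --port 5000
```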

---

## 🏃 Running Models with `lira run`

- Run, export, or benchmark models locally directly from the command line.
- Use for local inference, ONNX export, or rapid prototyping.

To run a model using the CLI:
```bash
lira run <model> [options]
```
Replace `<model>` with the model name or path.

_Tip: run `lira run <model> --help` for model-specific flags._

---
## 🖥️ LIRA Server
OpenAI API-compatible local model serving with `lira serve`.
- Launch a FastAPI server with OpenAI API-compatible endpoints.
- Expose models as HTTP APIs for real-time transcription and seamless integration.
- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.

```bash
lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
```

> Interested in more server features?
> Try the **LIRA server demo** with Open WebUI.
> See [examples/openwebui](examples/openwebui) for setup instructions.

- Configure models via [config/model_config.json](config/model_config.json).
- Set API keys (dummy) as environment variables for protected backends.
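
As a quick smoke test, any OpenAI-compatible client can talk to the server. A minimal `curl` sketch, assuming the server exposes the standard OpenAI transcription route `/v1/audio/transcriptions` on the host and port passed to `lira serve`:

```bash
# Send a local WAV file to the (assumed) OpenAI-compatible transcription endpoint.
# The route, form fields, and dummy API key follow the OpenAI API convention;
# adjust them if your LIRA configuration differs.
curl http://localhost:5000/v1/audio/transcriptions \
  -H "Authorization: Bearer dummy-key" \
  -F "file=@audio_files/test.wav" \
  -F "model=whisper-base"
```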


---

### 🗣️ Running Whisper

Whisper supports export/optimization and model-specific flags.

**Example:**
Export the Whisper base model to ONNX, optimize it, and run it on the NPU:
```bash
lira run whisper --model-type whisper-base --export --device npu --audio <input/.wav file> --use-kv-cache
```

Run inference on a sample audio file on CPU:
```bash
lira run whisper -m exported_models/whisper_base --device cpu --audio "audio_files/test.wav"
```

_Tip: Run `lira run zipformer --help` for all options._

Model and runtime configs live in `config/`:

- [config/model_config.json](config/model_config.json) — model routing and defaults
- [config/vitisai_config_whisper_base_encoder.json](config/vitisai_config_whisper_base_encoder.json) — Vitis AI Whisper encoder config for NPU exports
- [config/vitisai_config_whisper_base_decoder.json](config/vitisai_config_whisper_base_decoder.json) — Vitis AI Whisper decoder config for NPU exports
- [config/vitisai_config_zipformer_encoder.json](config/vitisai_config_zipformer_encoder.json) — Vitis AI Zipformer encoder config for NPU exports

You can point to custom config files or modify those in the repo.

---


## 🧪 Early Access & Open Source Intentions

19 changes: 9 additions & 10 deletions docs/OpenWebUI_README.md → examples/openwebui/README.md
Ready to turn your browser into a voice-powered AI playground? With LIRA and Open WebUI…
## 1. Set up environments

**Recommended:** Use separate conda environments to avoid dependency conflicts.
- For LIRA, reuse `ryzen-ai-*.*.*` to leverage NPU support, where `*.*.*` is the version number.
- For OpenWebUI, create a new environment.

### LIRA and OpenWebUI setup:
Follow the instructions in the [Getting Started](../README.md#getting-started) section of the main README.md to install and set up the Ryzen AI environment.

Let's set up OpenWebUI by first cloning the `ryzen-ai-*.*.*` environment, and then installing `open-webui`.
```powershell
conda create -n openwebui python=3.11 -y --clone ryzen-ai-*.*.*
```
```bash
conda activate openwebui
```
```bash
pip install open-webui
```

Record from your mic or upload audio files (`.wav`, `.mp3`)—OpenWebUI will send them to LIRA for transcription.
## 📝 Notes & Tips

- If you exported a Whisper ONNX model to a custom directory, set `LIRA_MODEL_DIR` before starting the server (see the sketch after this list), or use `lira serve` flags to point at the export.
- For NPU runs, start `lira serve` from the `ryzen-ai-*.*.*` conda environment so Vitis AI tooling and drivers are available.
- If running behind a reverse proxy, update OpenWebUI's API Base URL accordingly.
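
A minimal sketch of the custom export directory tip above, assuming a bash-style shell (use `$env:LIRA_MODEL_DIR = "..."` in PowerShell) and a hypothetical export path:

```bash
# Point LIRA at a custom Whisper ONNX export directory (hypothetical path),
# then start the OpenAI-compatible server as in the main README.
export LIRA_MODEL_DIR=/path/to/exported_models/whisper_base
lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
```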

See the main [README.md](../README.md) for full LIRA setup and model export instructions.