diff --git a/README.md b/README.md
index b3b9516..2bf087e 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,28 @@
-# LIRA: Local Inference tool for Realtime Audio
+# LIRA: Local Inference for Realtime Audio
-**Local, efficient speech recognition.
-Run ASR models on your machine—fast, simple, and developer-friendly.**
+**Local, efficient automatic speech recognition (ASR). Run ASR models on your local machine—fast, simple, and developer-friendly.**
-LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with `lira run` and `lira serve` to integrate with your apps and tools.
+LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with the `lira run` and `lira serve` commands to integrate with your apps and tools.
+
+---
+
+## 🧩 Supported Model Architectures & Runtimes
+
+LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
+
+| Model | Typical use case | Runs on | Supported datatypes |
+|----------------------|-----------------------------------------|-----------------|------------------------------------|
+| whisper-small | Low-latency, resource-constrained | CPU, NPU* | FP32, BFP16 |
+| whisper-base | Balanced accuracy and performance | CPU, NPU* | FP32, BFP16 |
+| whisper-medium | Higher accuracy for challenging audio | CPU, NPU* | FP32, BFP16 |
+| whisper-large-v3 | Highest accuracy (more compute) | CPU | FP32, BFP16 |
+| zipformer | Streaming / low-latency ASR encoder | CPU, NPU* | FP32, BFP16 |
+
+*NPU support depends on available Vitis AI export artifacts and target hardware.
---
@@ -15,85 +30,63 @@ LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models local
**Prerequisites:**
-- **Python 3.10** is required.
+- **Python 3.11** is required.
- We recommend using **conda** for environment management.
-- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to nebale NPU use cases.
-- Current recommended Ryzen AI Version: RAI 1.5.1 with 32.0.203.280 driver.
+- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
+- Current recommended Ryzen AI version: 1.6.0 with the 32.0.203.280 NPU driver.
**Minimal install steps:**
1. **Clone the repo and change directory:**
```bash
- git clone https://github.com/aigdat/LIRA.git
+ git clone https://github.com/amd/LIRA.git
cd LIRA
```
2. **Activate your conda environment:**
+This conda environment should already exist from the Ryzen AI SW installation mentioned earlier.
```bash
- conda activate ryzen-ai-1.5.0
+ conda activate ryzen-ai-*.*.*
```
+Replace `ryzen-ai-*.*.*` with the version you have installed, for example `ryzen-ai-1.6.0`.
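+
+If you are unsure which Ryzen AI environment is installed, list your conda environments to find the exact name to activate:
+```bash
+# List existing conda environments to find the installed ryzen-ai-* name
+conda env list
+```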
3. **Install LIRA in editable mode:**
```bash
pip install -e .
```
-Now you can run `lira --help` to see available commands.
+You can run `lira --help` to see available commands.
---
## ⚡ CLI-first Design
-LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models.
-
-**Primary commands:**
-
-- **`lira run`**
- Run, export, or benchmark models directly from the command line.
- Use for local inference, ONNX export, or rapid prototyping.
-
-- **`lira serve`**
- Launch a FastAPI server with OpenAI-compatible endpoints.
- Expose models as HTTP APIs for real-time transcription and seamless integration.
- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models. You can run your own model locally, or host an OpenAI API-compatible server for any app you want to build on top of LIRA.
**Quick examples:**
+
```bash
-# Run a model locally (inference)
lira run whisper --model-type whisper-base --export --device cpu --audio audio_files/test.wav
+```
+✅ To learn more about `lira run`, visit the detailed [Running Models with `lira run`](#-running-models-with-lira-run) section.
-# Serve the model for local apps (OpenAI-compatible endpoints)
+```bash
lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
```
----
+✅ To learn more about `lira serve`, visit the detailed [LIRA Server](#️-lira-server) section.
-## 🖥️ LIRA Server
-
-LIRA includes a FastAPI-based HTTP server for rapid integration with your applications. The server offers **OpenAI API compatibility** for real-time speech recognition.
+**NPU Acceleration:**
-**Start the server:**
-
-- **CPU acceleration:**
- ```bash
- lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
- ```
-- **NPU acceleration:**
- ```bash
- lira serve --backend openai --model whisper-base --device npu --host 0.0.0.0 --port 5000
- ```
-
-> Interested in more server features?
-> Try the **LIRA server demo** with Open WebUI.
-> See [docs/OpenWebUI_README.md](docs/OpenWebUI_README.md) for setup instructions.
-
-- Configure models via `config/model_config.json`.
-- Set API keys (dummy) as environment variables for protected backends.
+🕙 For NPU acceleration, change `--device cpu` to `--device npu`.
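+
+For example, the serve command above becomes:
+```bash
+# Same OpenAI-compatible server, now targeting the NPU
+lira serve --backend openai --model whisper-base --device npu --host 0.0.0.0 --port 5000
+```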
---
## 🏃 Running Models with `lira run`
+- Run, export, or benchmark models directly from the command line.
+- Use for local inference, ONNX export, or rapid prototyping.
+
To run a model using the CLI:
```bash
lira run [options]
@@ -108,6 +101,25 @@ Replace `` with the model name or path.
_Tip: run `lira run --help` for model-specific flags._
+---
+## 🖥️ LIRA Server
+OpenAI API-compatible local model serving with `lira serve`.
+- Launch a FastAPI server with OpenAI API-compatible endpoints.
+- Expose models as HTTP APIs for real-time transcription and seamless integration.
+- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+
+```bash
+lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
+```
+
+> Interested in more server features?
+> Try the **LIRA server demo** with Open WebUI.
+> See [examples/openwebui](examples/openwebui) for setup instructions.
+
+- Configure models via [config/model_config.json](config/model_config.json).
+- Set API keys (dummy) as environment variables for protected backends.
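+
+Once the server is running, any OpenAI-compatible client can talk to it. A minimal request sketch, assuming the server mirrors the standard OpenAI transcription route (`/v1/audio/transcriptions`); adjust the path, key, and file to your deployment:
+```bash
+# Hypothetical request: the route and dummy key assume OpenAI API compatibility
+curl http://localhost:5000/v1/audio/transcriptions \
+  -H "Authorization: Bearer dummy-key" \
+  -F "model=whisper-base" \
+  -F "file=@audio_files/test.wav"
+```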
+
+
---
### 🗣️ Running Whisper
@@ -115,11 +127,13 @@ _Tip: run `lira run --help` for model-specific flags._
Whisper supports export/optimization and model-specific flags.
**Example:**
+Export the Whisper base model to ONNX, optimize it, and run it on the NPU:
```bash
-# Export Whisper base model to ONNX, optimize and run on NPU
lira run whisper --model-type whisper-base --export --device npu --audio --use-kv-cache
+```
-# Run inference on a sample audio file
+Run inference on a sample audio file on the CPU:
+```bash
lira run whisper -m exported_models/whisper_base --device cpu --audio "audio_files/test.wav"
```
@@ -159,26 +173,16 @@ _Tip: Run `lira run zipformer --help` for all options._
Model and runtime configs live in `config/`:
-- `config/model_config.json` — model routing and defaults
-- `vitisai_config_*.json` — Vitis AI configs for NPU exports
+- [config/model_config.json](config/model_config.json) — model routing and defaults
+- [config/vitisai_config_whisper_base_encoder.json](config/vitisai_config_whisper_base_encoder.json) — Vitis AI Whisper encoder config for NPU exports
+- [config/vitisai_config_whisper_base_decoder.json](config/vitisai_config_whisper_base_decoder.json) — Vitis AI Whisper decoder config for NPU exports
+- [config/vitisai_config_zipformer_encoder.json](config/vitisai_config_zipformer_encoder.json) — Vitis AI Zipformer encoder config for NPU exports
You can point to custom config files or modify those in the repo.
---
-## 🧩 Supported Model Architectures & Runtimes
-LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
-
-| Model | Typical use case | Runs on | Supported datatypes |
-|----------------------|-----------------------------------------|-----------------|------------------------------------|
-| Whisper (small) | Low-latency, resource-constrained | CPU, GPU, NPU* | FP32, BFP16 |
-| Whisper (base) | Balanced accuracy and performance | CPU, GPU, NPU* | FP32, BFP16 |
-| Whisper (medium) | Higher accuracy for challenging audio | CPU, GPU, NPU* | FP32, BFP16 |
-| Whisper (large) | Highest accuracy (more compute) | CPU, GPU | FP32, BFP16 |
-| Zipformer | Streaming / low-latency ASR encoder | CPU, GPU, NPU* | FP32, BFP16 |
-
-*NPU support depends on available Vitis AI export artifacts and target hardware.
## 🧪 Early Access & Open Source Intentions
diff --git a/docs/OpenWebUI_README.md b/examples/openwebui/README.md
similarity index 87%
rename from docs/OpenWebUI_README.md
rename to examples/openwebui/README.md
index 946791b..50d52c3 100644
--- a/docs/OpenWebUI_README.md
+++ b/examples/openwebui/README.md
@@ -9,21 +9,20 @@ Ready to turn your browser into a voice-powered AI playground? With LIRA and Ope
## 1. Set up environments
**Recommended:** Use separate conda environments to avoid dependency conflicts.
-- For LIRA, reuse `ryzen-ai-1.5.0` to leverage NPU support.
+- For LIRA, reuse `ryzen-ai-*.*.*` to leverage NPU support, where `*.*.*` is the version number.
- For OpenWebUI, create a new environment.
-### LIRA setup:
+### LIRA and OpenWebUI setup:
Follow the instructions in the [Getting Started](../README.md#getting-started) section of the main README.md to install and set up the Ryzen AI environment.
-```powershell
-conda activate ryzen-ai-1.5.0
-lira serve --help
-```
-### OpenWebUI setup:
-In a new environment, let's set up OpenWebUI:
+Let's set up OpenWebUI by first cloning the `ryzen-ai-*.*.*` environment, and then installing `open-webui`.
```powershell
-conda create -n openwebui python=3.11 -y
+# --clone copies the existing Ryzen AI environment (package specs cannot be combined with --clone)
+conda create -n openwebui --clone ryzen-ai-*.*.* -y
+```
+```bash
conda activate openwebui
+```
+```bash
pip install open-webui
```
@@ -90,7 +89,7 @@ Record from your mic or upload audio files (`.wav`, `.mp3`)—OpenWebUI will sen
## 📝 Notes & Tips
- If you exported a Whisper ONNX model to a custom directory, set `LIRA_MODEL_DIR` before starting the server, or use `lira serve` flags to point at the export.
-- For NPU runs, start `lira serve` from `ryzen-ai-1.5.0` so Vitis AI tooling and drivers are available.
+- For NPU runs, start `lira serve` from the `ryzen-ai-*.*.*` conda environment so Vitis AI tooling and drivers are available.
- If running behind a reverse proxy, update OpenWebUI's API Base URL accordingly.
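+
+A minimal sketch of the first tip above (PowerShell, assuming the export path used in the main README):
+```powershell
+# Hypothetical example: point LIRA at a custom Whisper export, then start the server
+$env:LIRA_MODEL_DIR = "exported_models/whisper_base"
+lira serve --backend openai --model whisper-base --device npu --host 0.0.0.0 --port 5000
+```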
See the main [README.md](../README.md) for full LIRA setup and model export instructions.
diff --git a/docs/assets/image-add-connection.png b/examples/openwebui/assets/image-add-connection.png
similarity index 100%
rename from docs/assets/image-add-connection.png
rename to examples/openwebui/assets/image-add-connection.png
diff --git a/docs/assets/image.png b/examples/openwebui/assets/image.png
similarity index 100%
rename from docs/assets/image.png
rename to examples/openwebui/assets/image.png
diff --git a/docs/assets/openwebui audio.png b/examples/openwebui/assets/openwebui audio.png
similarity index 100%
rename from docs/assets/openwebui audio.png
rename to examples/openwebui/assets/openwebui audio.png