diff --git a/README.md b/README.md
index b3b9516..2bf087e 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,28 @@
-# LIRA: Local Inference tool for Realtime Audio
+# LIRA: Local Inference for Realtime Audio

LIRA logo

-**Local, efficient speech recognition.
-Run ASR models on your machine—fast, simple, and developer-friendly.**
+**Local, efficient automatic speech recognition (ASR). Run ASR models on your local machine—fast, simple, and developer-friendly.**
 
-LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with `lira run` and `lira serve` to integrate with your apps and tools.
+LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with the `lira run` and `lira serve` commands to integrate with your apps and tools.
+
+---
+
+## 🧩 Supported Model Architectures & Runtimes
+
+LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
+
+| Model            | Typical use case                      | Runs on   | Supported datatypes |
+|------------------|---------------------------------------|-----------|---------------------|
+| whisper-small    | Low-latency, resource-constrained     | CPU, NPU* | FP32, BFP16         |
+| whisper-base     | Balanced accuracy and performance     | CPU, NPU* | FP32, BFP16         |
+| whisper-medium   | Higher accuracy for challenging audio | CPU, NPU* | FP32, BFP16         |
+| whisper-large-v3 | Highest accuracy (more compute)       | CPU       | FP32, BFP16         |
+| zipformer        | Streaming / low-latency ASR encoder   | CPU, NPU* | FP32, BFP16         |
+
+*NPU support depends on available Vitis AI export artifacts and target hardware.
 
 ---
 
@@ -15,85 +30,63 @@ LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models local
 
 **Prerequisites:**
 
-- **Python 3.10** is required.
+- **Python 3.11** is required.
 - We recommend using **conda** for environment management.
-- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to nebale NPU use cases.
-- Current recommended Ryzen AI Version: RAI 1.5.1 with 32.0.203.280 driver.
+- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
+- Current recommended Ryzen AI version: 1.6.0 with the 32.0.203.280 NPU driver.
 
 **Minimal install steps:**
 
 1. **Clone the repo and change directory:**
    ```bash
-   git clone https://github.com/aigdat/LIRA.git
+   git clone https://github.com/amd/LIRA.git
    cd LIRA
    ```
 2. **Activate your conda environment:**
+   This conda environment should already exist from the Ryzen AI software installation mentioned earlier.
    ```bash
-   conda activate ryzen-ai-1.5.0
+   conda activate ryzen-ai-*.*.*
   ```
+   Replace `ryzen-ai-*.*.*` with the version you have installed, such as `ryzen-ai-1.6.0`.
 3. **Install LIRA in editable mode:**
    ```bash
    pip install -e .
    ```
 
-Now you can run `lira --help` to see available commands.
+You can run `lira --help` to see available commands.
 
 ---
 
 ## ⚡ CLI-first Design
 
-LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models.
-
-**Primary commands:**
-
-- **`lira run`**
-  Run, export, or benchmark models directly from the command line.
-  Use for local inference, ONNX export, or rapid prototyping.
-
-- **`lira serve`**
-  Launch a FastAPI server with OpenAI-compatible endpoints.
-  Expose models as HTTP APIs for real-time transcription and seamless integration.
-  Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models. You can run your own model locally, or host an OpenAI API-compatible server for any app you want to build on top of LIRA.
 
 **Quick examples:**
+
 ```bash
-# Run a model locally (inference)
 lira run whisper --model-type whisper-base --export --device cpu --audio audio_files/test.wav
+```
+✅ To learn more about `lira run`, see the detailed [Running Models with `lira run`](#-running-models-with-lira-run) section below.
 
-# Serve the model for local apps (OpenAI-compatible endpoints)
+```bash
 lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
 ```
----
+✅ To learn more about `lira serve`, see the detailed [LIRA Server](#️-lira-server) section below.
 
-## 🖥️ LIRA Server
-
-LIRA includes a FastAPI-based HTTP server for rapid integration with your applications. The server offers **OpenAI API compatibility** for real-time speech recognition.
+**NPU Acceleration:**
 
-**Start the server:**
-
-- **CPU acceleration:**
-  ```bash
-  lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
-  ```
-- **NPU acceleration:**
-  ```bash
-  lira serve --backend openai --model whisper-base --device npu --host 0.0.0.0 --port 5000
-  ```
-
-> Interested in more server features?
-> Try the **LIRA server demo** with Open WebUI.
-> See [docs/OpenWebUI_README.md](docs/OpenWebUI_README.md) for setup instructions.
-
-- Configure models via `config/model_config.json`.
-- Set API keys (dummy) as environment variables for protected backends.
+🕙 For NPU acceleration, change `--device cpu` to `--device npu`.
 
 ---
 
 ## 🏃 Running Models with `lira run`
 
+- Run, export, or benchmark models locally, directly from the command line.
+- Use for local inference, ONNX export, or rapid prototyping.
+
 To run a model using the CLI:
 ```bash
 lira run <model> [options]
@@ -108,6 +101,25 @@ Replace `<model>` with the model name or path.
 
 _Tip: run `lira run --help` for model-specific flags._
 
+---
+## 🖥️ LIRA Server
+OpenAI API-compatible local model serving with `lira serve`.
+- Launch a FastAPI server with OpenAI API-compatible endpoints.
+- Expose models as HTTP APIs for real-time transcription and seamless integration.
+- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+
+```bash
+lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
+```
+
+> Interested in more server features?
+> Try the **LIRA server demo** with Open WebUI.
+> See [examples/openwebui](examples/openwebui) for setup instructions.
+
+- Configure models via [config/model_config.json](config/model_config.json).
+- Set API keys (dummy) as environment variables for protected backends.
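+
+As a quick smoke test, you can hit the server with a plain REST call. The sketch below is an assumption, not a documented LIRA endpoint: it uses the standard OpenAI audio transcription route (`/v1/audio/transcriptions`) with the usual `file` and `model` form fields and a dummy API key; verify the exact path and fields against your running server.
+
+```bash
+# Hypothetical request against a locally running `lira serve` instance
+# (endpoint path and form fields follow the OpenAI Audio API convention).
+curl http://localhost:5000/v1/audio/transcriptions \
+  -H "Authorization: Bearer dummy-key" \
+  -F "file=@audio_files/test.wav" \
+  -F "model=whisper-base"
+```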
+
 ---
 
 ### 🗣️ Running Whisper
 
@@ -115,11 +127,13 @@ _Tip: run `lira run --help` for model-specific flags._
 
 Whisper supports export/optimization and model-specific flags.
 
 **Example:**
+Export the Whisper base model to ONNX, optimize it, and run it on the NPU:
 ```bash
-# Export Whisper base model to ONNX, optimize and run on NPU
 lira run whisper --model-type whisper-base --export --device npu --audio <audio_file> --use-kv-cache
+```
 
-# Run inference on a sample audio file
+Run inference on a sample audio file on the CPU:
+```bash
 lira run whisper -m exported_models/whisper_base --device cpu --audio "audio_files/test.wav"
 ```
 
@@ -159,26 +173,16 @@ _Tip: Run `lira run zipformer --help` for all options._
 
 Model and runtime configs live in `config/`:
 
-- `config/model_config.json` — model routing and defaults
-- `vitisai_config_*.json` — Vitis AI configs for NPU exports
+- [config/model_config.json](config/model_config.json) — model routing and defaults
+- [config/vitisai_config_whisper_base_encoder.json](config/vitisai_config_whisper_base_encoder.json) — Vitis AI Whisper encoder config for NPU exports
+- [config/vitisai_config_whisper_base_decoder.json](config/vitisai_config_whisper_base_decoder.json) — Vitis AI Whisper decoder config for NPU exports
+- [config/vitisai_config_zipformer_encoder.json](config/vitisai_config_zipformer_encoder.json) — Vitis AI Zipformer encoder config for NPU exports
 
 You can point to custom config files or modify those in the repo.
 
 ---
 
-## 🧩 Supported Model Architectures & Runtimes
-LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
-
-| Model                | Typical use case                        | Runs on         | Supported datatypes                |
-|----------------------|-----------------------------------------|-----------------|------------------------------------|
-| Whisper (small)      | Low-latency, resource-constrained       | CPU, GPU, NPU*  | FP32, BFP16                        |
-| Whisper (base)       | Balanced accuracy and performance       | CPU, GPU, NPU*  | FP32, BFP16                        |
-| Whisper (medium)     | Higher accuracy for challenging audio   | CPU, GPU, NPU*  | FP32, BFP16                        |
-| Whisper (large)      | Highest accuracy (more compute)         | CPU, GPU        | FP32, BFP16                        |
-| Zipformer            | Streaming / low-latency ASR encoder     | CPU, GPU, NPU*  | FP32, BFP16                        |
-
-*NPU support depends on available Vitis AI export artifacts and target hardware.
 
 ## 🧪 Early Access & Open Source Intentions
 
diff --git a/docs/OpenWebUI_README.md b/examples/openwebui/README.md
similarity index 87%
rename from docs/OpenWebUI_README.md
rename to examples/openwebui/README.md
index 946791b..50d52c3 100644
--- a/docs/OpenWebUI_README.md
+++ b/examples/openwebui/README.md
@@ -9,21 +9,20 @@ Ready to turn your browser into a voice-powered AI playground? With LIRA and Ope
 
 ## 1. Set up environments
 
 **Recommended:** Use separate conda environments to avoid dependency conflicts.
-- For LIRA, reuse `ryzen-ai-1.5.0` to leverage NPU support.
+- For LIRA, reuse `ryzen-ai-*.*.*` to leverage NPU support, where `*.*.*` is the version number.
 - For OpenWebUI, create a new environment.
 
-### LIRA setup:
+### LIRA and OpenWebUI setup:
 Follow the instructions in the [Getting Started](../README.md#getting-started) section of the main README.md to install and set up the Ryzen AI environment.
-```powershell
-conda activate ryzen-ai-1.5.0
-lira serve --help
-```
 
-### OpenWebUI setup:
-In a new environment, let's set up OpenWebUI:
+Let's set up OpenWebUI by first cloning the `ryzen-ai-*.*.*` environment and then installing `open-webui`.
 ```powershell
-conda create -n openwebui python=3.11 -y
+conda create -n openwebui -y --clone ryzen-ai-*.*.*
+```
+```bash
 conda activate openwebui
+```
+```bash
 pip install open-webui
 ```
 
@@ -90,7 +89,7 @@ Record from your mic or upload audio files (`.wav`, `.mp3`)—OpenWebUI will sen
 
 ## 📝 Notes & Tips
 
 - If you exported a Whisper ONNX model to a custom directory, set `LIRA_MODEL_DIR` before starting the server, or use `lira serve` flags to point at the export.
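+  For example, in a bash-style shell (the export directory below is hypothetical; in PowerShell set `$env:LIRA_MODEL_DIR` instead):
+  ```bash
+  # Point LIRA at a custom export directory before serving; replace the path with your own.
+  export LIRA_MODEL_DIR=/path/to/exported_models/whisper_base
+  lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
+  ```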
-- For NPU runs, start `lira serve` from `ryzen-ai-1.5.0` so Vitis AI tooling and drivers are available.
+- For NPU runs, start `lira serve` from the `ryzen-ai-*.*.*` conda environment so Vitis AI tooling and drivers are available.
 - If running behind a reverse proxy, update OpenWebUI's API Base URL accordingly.
 
 See the main [README.md](../README.md) for full LIRA setup and model export instructions.
diff --git a/docs/assets/image-add-connection.png b/examples/openwebui/assets/image-add-connection.png
similarity index 100%
rename from docs/assets/image-add-connection.png
rename to examples/openwebui/assets/image-add-connection.png
diff --git a/docs/assets/image.png b/examples/openwebui/assets/image.png
similarity index 100%
rename from docs/assets/image.png
rename to examples/openwebui/assets/image.png
diff --git a/docs/assets/openwebui audio.png b/examples/openwebui/assets/openwebui audio.png
similarity index 100%
rename from docs/assets/openwebui audio.png
rename to examples/openwebui/assets/openwebui audio.png