From fbac2793bf8d1f8f7e738807f079d826e3e0ce1c Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 09:16:38 -0500
Subject: [PATCH 1/8] README modifications: bash tweaks, supported models at top, links to models in HF, RAI 1.6.0

---
 README.md                | 93 ++++++++++++++++++----------------------
 docs/OpenWebUI_README.md | 15 +++----
 2 files changed, 47 insertions(+), 61 deletions(-)

diff --git a/README.md b/README.md
index b3b9516..6bde33b 100644
--- a/README.md
+++ b/README.md
@@ -1,36 +1,51 @@
-# LIRA: Local Inference tool for Realtime Audio
+# LIRA: Local Inference for Realtime Audio

LIRA logo

-**Local, efficient speech recognition.
-Run ASR models on your machine—fast, simple, and developer-friendly.**
+**Local, efficient automatic speech recognition (ASR). Run ASR models on your local machine—fast, simple, and developer-friendly.**
 
-LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with `lira run` and `lira serve` to integrate with your apps and tools.
+LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with the `lira run` and `lira serve` commands to integrate with your apps and tools.
+
+
+
+## 🧩 Supported Model Architectures & Runtimes
+
+LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
+
+| Model                | Typical use case                         | Runs on         | Supported datatypes                |
+|----------------------|-----------------------------------------|-----------------|------------------------------------|
+| [whisper-small](https://huggingface.co/openai/whisper-small) | Low-latency, resource-constrained | CPU, GPU, NPU* | FP32, BFP16 |
+| [whisper-base](https://huggingface.co/openai/whisper-base) | Balanced accuracy and performance | CPU, GPU, NPU* | FP32, BFP16 |
+| [whisper-medium](https://huggingface.co/openai/whisper-medium) | Higher accuracy for challenging audio | CPU, GPU, NPU* | FP32, BFP16 |
+| [whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | Highest accuracy (more compute) | CPU, GPU | FP32, BFP16 |
+| [zipformer](https://huggingface.co/papers/2310.11230) | Streaming / low-latency ASR encoder | CPU, GPU, NPU* | FP32, BFP16 |
+
+*NPU support depends on available Vitis AI export artifacts and target hardware.
 
----
 
 ## 🚀 Getting Started
 
 **Prerequisites:**
-- **Python 3.10** is required.
+- **Python 3.11** is required.
 - We recommend using **conda** for environment management.
-- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to nebale NPU use cases.
-- Current recommended Ryzen AI Version: RAI 1.5.1 with 32.0.203.280 driver.
+- For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
+- Current recommended Ryzen AI version: 1.6.0 with the 32.0.203.280 NPU driver.
 
 **Minimal install steps:**
 
 1. **Clone the repo and change directory:**
    ```bash
-   git clone https://github.com/aigdat/LIRA.git
+   git clone https://github.com/amd/LIRA.git
    cd LIRA
    ```
 
 2. **Activate your conda environment:**
+This conda environment should already exist from the Ryzen AI SW installation mentioned earlier. 
    ```bash
-   conda activate ryzen-ai-1.5.0
+   conda activate ryzen-ai-1.6.0
   ```
 
 3. **Install LIRA in editable mode:**
@@ -38,7 +53,7 @@ LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models local
    pip install -e .
    ```
 
-Now you can run `lira --help` to see available commands.
+You can run `lira --help` to see available commands.
 
 ---
 
@@ -48,40 +63,26 @@ LIRA is a CLI-first toolkit focused on simple developer workflows for exporting,
 
 **Primary commands:**
 
-- **`lira run`**
-  Run, export, or benchmark models directly from the command line.
-  Use for local inference, ONNX export, or rapid prototyping.
-
-- **`lira serve`**
-  Launch a FastAPI server with OpenAI-compatible endpoints.
-  Expose models as HTTP APIs for real-time transcription and seamless integration.
-  Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+1. Run models locally with `lira run`:
 
-**Quick examples:**
+- Run, export, or benchmark models directly from the command line.
+- Use for local inference, ONNX export, or rapid prototyping.
 
 ```bash
-# Run a model locally (inference)
 lira run whisper --model-type whisper-base --export --device cpu --audio audio_files/test.wav
-
-# Serve the model for local apps (OpenAI-compatible endpoints)
-lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
 ```
 
----
-
-## 🖥️ LIRA Server
+2. OpenAI API-compatible local model serving with `lira serve`:
 
-LIRA includes a FastAPI-based HTTP server for rapid integration with your applications. The server offers **OpenAI API compatibility** for real-time speech recognition.
+- Launch and serve the model via a FastAPI server with OpenAI API-compatible endpoints.
+- Expose models as HTTP APIs for real-time transcription and seamless integration.
+- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+```bash
+lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
+```
 
-**Start the server:**
+**NPU Acceleration:**
 
-- **CPU acceleration:**
-  ```bash
-  lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
-  ```
-- **NPU acceleration:**
-  ```bash
-  lira serve --backend openai --model whisper-base --device npu --host 0.0.0.0 --port 5000
-  ```
+For NPU acceleration, change `--device cpu` to `--device npu`.
 
 > Interested in more server features?
 > Try the **LIRA server demo** with Open WebUI.
 > See [docs/OpenWebUI_README.md](docs/OpenWebUI_README.md) for setup instructions.
 
 - Configure models via `config/model_config.json`.
 - Set API keys (dummy) as environment variables for protected backends.
@@ -115,11 +116,13 @@ _Tip: run `lira run --help` for model-specific flags._
 Whisper supports export/optimization and model-specific flags.
 
 **Example:**
+Export Whisper base model to ONNX, optimize and run on NPU.
 ```bash
-# Export Whisper base model to ONNX, optimize and run on NPU
 lira run whisper --model-type whisper-base --export --device npu --audio <audio_file> --use-kv-cache
+```
 
-# Run inference on a sample audio file
+Run inference on a sample audio file on CPU.
+```bash
 lira run whisper -m exported_models/whisper_base --device cpu --audio "audio_files/test.wav"
 ```
 
@@ -166,19 +169,7 @@ You can point to custom config files or modify those in the repo.
 
 ---
 
-## 🧩 Supported Model Architectures & Runtimes
-
-LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
-
-| Model                | Typical use case                         | Runs on         | Supported datatypes                |
-|----------------------|-----------------------------------------|-----------------|------------------------------------|
-| Whisper (small)      | Low-latency, resource-constrained        | CPU, GPU, NPU*  | FP32, BFP16                        |
-| Whisper (base)       | Balanced accuracy and performance        | CPU, GPU, NPU*  | FP32, BFP16                        |
-| Whisper (medium)     | Higher accuracy for challenging audio    | CPU, GPU, NPU*  | FP32, BFP16                        |
-| Whisper (large)      | Highest accuracy (more compute)          | CPU, GPU        | FP32, BFP16                        |
-| Zipformer            | Streaming / low-latency ASR encoder      | CPU, GPU, NPU*  | FP32, BFP16                        |
-*NPU support depends on available Vitis AI export artifacts and target hardware.
 
 ## 🧪 Early Access & Open Source Intentions
 
diff --git a/docs/OpenWebUI_README.md b/docs/OpenWebUI_README.md
index 946791b..9c6319d 100644
--- a/docs/OpenWebUI_README.md
+++ b/docs/OpenWebUI_README.md
@@ -9,20 +9,15 @@ Ready to turn your browser into a voice-powered AI playground? With LIRA and Ope
 ## 1. Set up environments
 
 **Recommended:** Use separate conda environments to avoid dependency conflicts.
-- For LIRA, reuse `ryzen-ai-1.5.0` to leverage NPU support.
+- For LIRA, reuse `ryzen-ai-1.6.0` to leverage NPU support.
 - For OpenWebUI, create a new environment.
 
-### LIRA setup:
+### LIRA and OpenWebUI setup:
 Follow the instructions in the [Getting Started](../README.md#getting-started) section of the main README.md to install and set up the Ryzen AI environment.
-```powershell
-conda activate ryzen-ai-1.5.0
-lira serve --help
-```
-### OpenWebUI setup:
-In a new environment, let's set up OpenWebUI:
+Let's set up OpenWebUI by first cloning the ryzen-ai-1.6.0 environment, and then installing `open-webui`.
 ```powershell
-conda create -n openwebui python=3.11 -y
+conda create -n openwebui -y --clone ryzen-ai-1.6.0
 conda activate openwebui
 pip install open-webui
 ```
 
@@ -90,7 +85,7 @@ Record from your mic or upload audio files (`.wav`, `.mp3`)—OpenWebUI will sen
 ## 📝 Notes & Tips
 
 - If you exported a Whisper ONNX model to a custom directory, set `LIRA_MODEL_DIR` before starting the server, or use `lira serve` flags to point at the export.
-- For NPU runs, start `lira serve` from `ryzen-ai-1.5.0` so Vitis AI tooling and drivers are available.
+- For NPU runs, start `lira serve` from `ryzen-ai-1.6.0` so Vitis AI tooling and drivers are available.
 - If running behind a reverse proxy, update OpenWebUI's API Base URL accordingly.
 
 See the main [README.md](../README.md) for full LIRA setup and model export instructions.

From 3d14ee28b8c94dbac91ff5806dea0ef5a8417d8a Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 09:29:32 -0500
Subject: [PATCH 2/8] divider lines

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6bde33b..084b8a0 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@
 LIRA is a **CLI-first, developer-friendly tool**: run and serve ASR models locally with the `lira run` and `lira serve` commands to integrate with your apps and tools.
 
-
+---
 
 ## 🧩 Supported Model Architectures & Runtimes
 
 LIRA supports multiple speech-model architectures. Runtime support depends on the exported model and chosen runtime.
@@ -24,6 +24,7 @@ LIRA supports multiple speech-model architectures. Runtime support depends on th
 
 *NPU support depends on available Vitis AI export artifacts and target hardware.
 
+---
 
 ## 🚀 Getting Started
 

From 73f2ea8a4356cfa296251461508f3dac220cf3 Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 09:46:34 -0500
Subject: [PATCH 3/8] config links

---
 README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 084b8a0..8ee9859 100644
--- a/README.md
+++ b/README.md
@@ -163,9 +163,10 @@ _Tip: Run `lira run zipformer --help` for all options._
 
 Model and runtime configs live in `config/`:
 
-- `config/model_config.json` — model routing and defaults
-- `vitisai_config_*.json` — Vitis AI configs for NPU exports
-
+- [config/model_config.json](config/model_config.json) — model routing and defaults
+- [config/vitisai_config_whisper_base_encoder.json](config/vitisai_config_whisper_base_encoder.json) — Vitis AI Whisper encoder config for NPU exports
+- [config/vitisai_config_whisper_base_decoder.json](config/vitisai_config_whisper_base_decoder.json) — Vitis AI Whisper decoder config for NPU exports
+- [config/vitisai_config_zipformer_encoder.json](config/vitisai_config_zipformer_encoder.json) — Vitis AI Zipformer encoder config for NPU exports
 You can point to custom config files or modify those in the repo.
 
 ---

From d34e6ff574d2751c6f402d1d7e518469332d6623 Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 09:47:53 -0500
Subject: [PATCH 4/8] syntax

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 8ee9859..8b569c7 100644
--- a/README.md
+++ b/README.md
@@ -167,6 +167,7 @@ Model and runtime configs live in `config/`:
 - [config/vitisai_config_whisper_base_encoder.json](config/vitisai_config_whisper_base_encoder.json) — Vitis AI Whisper encoder config for NPU exports
 - [config/vitisai_config_whisper_base_decoder.json](config/vitisai_config_whisper_base_decoder.json) — Vitis AI Whisper decoder config for NPU exports
 - [config/vitisai_config_zipformer_encoder.json](config/vitisai_config_zipformer_encoder.json) — Vitis AI Zipformer encoder config for NPU exports
+
 You can point to custom config files or modify those in the repo.
 
 ---

From 6853c967f1ec9444a395827cc7a1a11ffad204d2 Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 09:53:01 -0500
Subject: [PATCH 5/8] examples folder instead of docs

---
 README.md                                                     |   2 +-
 .../openwebui/README.md                                       |   0
 {docs => examples/openwebui}/assets/image-add-connection.png  | Bin
 {docs => examples/openwebui}/assets/image.png                 | Bin
 {docs => examples/openwebui}/assets/openwebui audio.png       | Bin
 5 files changed, 1 insertion(+), 1 deletion(-)
 rename docs/OpenWebUI_README.md => examples/openwebui/README.md (100%)
 rename {docs => examples/openwebui}/assets/image-add-connection.png (100%)
 rename {docs => examples/openwebui}/assets/image.png (100%)
 rename {docs => examples/openwebui}/assets/openwebui audio.png (100%)

diff --git a/README.md b/README.md
index 8b569c7..6108224 100644
--- a/README.md
+++ b/README.md
@@ -87,7 +87,7 @@ For NPU acceleration, change `--device cpu` to `--device npu`.
 
 > Interested in more server features?
 > Try the **LIRA server demo** with Open WebUI.
-> See [docs/OpenWebUI_README.md](docs/OpenWebUI_README.md) for setup instructions.
+> See [examples/openwebui](examples/openwebui) for setup instructions.
 
 - Configure models via `config/model_config.json`.
 - Set API keys (dummy) as environment variables for protected backends.
diff --git a/docs/OpenWebUI_README.md b/examples/openwebui/README.md
similarity index 100%
rename from docs/OpenWebUI_README.md
rename to examples/openwebui/README.md
diff --git a/docs/assets/image-add-connection.png b/examples/openwebui/assets/image-add-connection.png
similarity index 100%
rename from docs/assets/image-add-connection.png
rename to examples/openwebui/assets/image-add-connection.png
diff --git a/docs/assets/image.png b/examples/openwebui/assets/image.png
similarity index 100%
rename from docs/assets/image.png
rename to examples/openwebui/assets/image.png
diff --git a/docs/assets/openwebui audio.png b/examples/openwebui/assets/openwebui audio.png
similarity index 100%
rename from docs/assets/openwebui audio.png
rename to examples/openwebui/assets/openwebui audio.png

From 3404b4f4a3954fdf74b7eee14f5dd129fa5a85db Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 14:45:26 -0500
Subject: [PATCH 6/8] ryzen generic version number

---
 README.md                    |  7 ++++---
 examples/openwebui/README.md | 12 ++++++++----
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 6108224..ecce421 100644
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@ LIRA supports multiple speech-model architectures. Runtime support depends on th
 - **Python 3.11** is required.
 - We recommend using **conda** for environment management.
 - For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
-- Current recommended Ryzen AI version: 1.6.0 with the 32.0.203.280 NPU driver.
+- Current recommended Ryzen AI version: 1.5.0 with the 32.0.203.280 NPU driver.
 
 **Minimal install steps:**
 
@@ -44,10 +44,11 @@
    ```
 
 2. **Activate your conda environment:**
-This conda environment should already exist from the Ryzen AI SW installation mentioned earlier. 
+This conda environment should already exist from the Ryzen AI SW installation mentioned earlier.
    ```bash
-   conda activate ryzen-ai-1.6.0
+   conda activate ryzen-ai-*.*.*
   ```
+Replace the `ryzen-ai-*.*.*` with the version number you are using, such as `ryzen-ai-1.5.0`
 
 3. **Install LIRA in editable mode:**
    ```bash
diff --git a/examples/openwebui/README.md b/examples/openwebui/README.md
index 9c6319d..50d52c3 100644
--- a/examples/openwebui/README.md
+++ b/examples/openwebui/README.md
@@ -9,15 +9,19 @@ Ready to turn your browser into a voice-powered AI playground? With LIRA and Ope
 ## 1. Set up environments
 
 **Recommended:** Use separate conda environments to avoid dependency conflicts.
-- For LIRA, reuse `ryzen-ai-1.6.0` to leverage NPU support.
+- For LIRA, reuse `ryzen-ai-*.*.*` to leverage NPU support, where `*.*.*` is the version number.
 - For OpenWebUI, create a new environment.
 
 ### LIRA and OpenWebUI setup:
 Follow the instructions in the [Getting Started](../README.md#getting-started) section of the main README.md to install and set up the Ryzen AI environment.
-Let's set up OpenWebUI by first cloning the ryzen-ai-1.6.0 environment, and then installing `open-webui`.
+Let's set up OpenWebUI by first cloning the `ryzen-ai-*.*.*` environment, and then installing `open-webui`.
 ```powershell
-conda create -n openwebui -y --clone ryzen-ai-1.6.0
+conda create -n openwebui -y --clone ryzen-ai-*.*.*
+```
+```powershell
 conda activate openwebui
+```
+```powershell
 pip install open-webui
 ```
 
@@ -85,7 +89,7 @@ Record from your mic or upload audio files (`.wav`, `.mp3`)—OpenWebUI will sen
 ## 📝 Notes & Tips
 
 - If you exported a Whisper ONNX model to a custom directory, set `LIRA_MODEL_DIR` before starting the server, or use `lira serve` flags to point at the export.
-- For NPU runs, start `lira serve` from `ryzen-ai-1.6.0` so Vitis AI tooling and drivers are available.
+- For NPU runs, start `lira serve` from the `ryzen-ai-*.*.*` conda environment so Vitis AI tooling and drivers are available.
 - If running behind a reverse proxy, update OpenWebUI's API Base URL accordingly.
 
 See the main [README.md](../README.md) for full LIRA setup and model export instructions.

From a2a25ec14338b623d95af349b194c3365ae22a1f Mon Sep 17 00:00:00 2001
From: Consolvo
Date: Mon, 6 Oct 2025 17:39:24 -0500
Subject: [PATCH 7/8] changes for 1.6 and making separate lira run and lira server in README

---
 README.md | 61 +++++++++++++++++++++++++++++++------------------------
 1 file changed, 35 insertions(+), 26 deletions(-)

diff --git a/README.md b/README.md
index ecce421..6992339 100644
--- a/README.md
+++ b/README.md
@@ -16,11 +16,11 @@ LIRA supports multiple speech-model architectures. Runtime support depends on th
 
 | Model                | Typical use case                         | Runs on         | Supported datatypes                |
 |----------------------|-----------------------------------------|-----------------|------------------------------------|
-| [whisper-small](https://huggingface.co/openai/whisper-small) | Low-latency, resource-constrained | CPU, GPU, NPU* | FP32, BFP16 |
-| [whisper-base](https://huggingface.co/openai/whisper-base) | Balanced accuracy and performance | CPU, GPU, NPU* | FP32, BFP16 |
-| [whisper-medium](https://huggingface.co/openai/whisper-medium) | Higher accuracy for challenging audio | CPU, GPU, NPU* | FP32, BFP16 |
-| [whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | Highest accuracy (more compute) | CPU, GPU | FP32, BFP16 |
-| [zipformer](https://huggingface.co/papers/2310.11230) | Streaming / low-latency ASR encoder | CPU, GPU, NPU* | FP32, BFP16 |
+| whisper-small | Low-latency, resource-constrained | CPU, NPU* | FP32, BFP16 |
+| whisper-base | Balanced accuracy and performance | CPU, NPU* | FP32, BFP16 |
+| whisper-medium | Higher accuracy for challenging audio | CPU, NPU* | FP32, BFP16 |
+| whisper-large-v3 | Highest accuracy (more compute) | CPU | FP32, BFP16 |
+| zipformer | Streaming / low-latency ASR encoder | CPU, NPU* | FP32, BFP16 |
 
 *NPU support depends on available Vitis AI export artifacts and target hardware.
 
@@ -33,7 +33,7 @@ LIRA supports multiple speech-model architectures. Runtime support depends on th
 - **Python 3.11** is required.
 - We recommend using **conda** for environment management.
 - For Ryzen™ AI NPU flow, follow the [Ryzen AI installation instructions](https://ryzenai.docs.amd.com/en/latest/inst.html) and verify drivers/runtime for your device. Ensure that you have a Ryzen AI 300 Series machine to enable NPU use cases.
-- Current recommended Ryzen AI version: 1.5.0 with the 32.0.203.280 NPU driver.
+- Current recommended Ryzen AI version: 1.6.0 with the 32.0.203.280 NPU driver.
 
 **Minimal install steps:**
 
@@ -48,7 +48,7 @@ This conda environment should already exist from the Ryzen AI SW installation me
    ```bash
    conda activate ryzen-ai-*.*.*
   ```
-Replace the `ryzen-ai-*.*.*` with the version number you are using, such as `ryzen-ai-1.5.0`
+Replace `ryzen-ai-*.*.*` with the version number you are using, such as `ryzen-ai-1.6.0`.
 
 3. **Install LIRA in editable mode:**
    ```bash
@@ -61,42 +61,32 @@ You can run `lira --help` to see available commands.
 
 ## ⚡ CLI-first Design
 
-LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models.
+LIRA is a CLI-first toolkit focused on simple developer workflows for exporting, running, and serving speech models. You can run your own model locally, or host an OpenAI API-compatible server for any app you want to use on top of LIRA.
 
-**Primary commands:**
+**Quick examples:**
 
-1. Run models locally with `lira run`:
-
-- Run, export, or benchmark models directly from the command line.
-- Use for local inference, ONNX export, or rapid prototyping.
 ```bash
 lira run whisper --model-type whisper-base --export --device cpu --audio audio_files/test.wav
 ```
+✅ To learn more about `lira run`, visit the detailed [Running Models with `lira run`](#-running-models-with-lira-run) section.
 
-2. OpenAI API-compatible local model serving with `lira serve`:
-
-- Launch and serve the model via a FastAPI server with OpenAI API-compatible endpoints.
-- Expose models as HTTP APIs for real-time transcription and seamless integration.
-- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
 ```bash
 lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
 ```
+✅ To learn more about `lira serve`, visit the detailed [LIRA Server](#️-lira-server) section.
 
 **NPU Acceleration:**
 
-For NPU acceleration, change `--device cpu` to `--device npu`.
+🕙 For NPU acceleration, change `--device cpu` to `--device npu`.
 
 ---
 
 ## 🏃 Running Models with `lira run`
 
+- Run, export, or benchmark models directly from the command line.
+- Use for local inference, ONNX export, or rapid prototyping.
+
 To run a model using the CLI:
 ```bash
 lira run <model> [options]
 ```
@@ -111,6 +101,25 @@ Replace `<model>` with the model name or path.
 
 _Tip: run `lira run --help` for model-specific flags._
 
+---
+## 🖥️ LIRA Server
+OpenAI API-compatible local model serving with `lira serve`.
+- Launch and serve the model via a FastAPI server with OpenAI API-compatible endpoints.
+- Expose models as HTTP APIs for real-time transcription and seamless integration.
+- Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls.
+
+ ```bash
+ lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000
+ ```
+
+> Interested in more server features?
+> Try the **LIRA server demo** with Open WebUI.
+> See [examples/openwebui](examples/openwebui) for setup instructions.
+
+- Configure models via [config/model_config.json](config/model_config.json).
+- Set API keys (dummy) as environment variables for protected backends.
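+
+For example, once the server is up you can exercise it with a plain REST call. The snippet below is a minimal sketch: it assumes the server mirrors OpenAI's `/v1/audio/transcriptions` route and reuses the host, port, model name, dummy key, and sample audio file from the commands above, so adjust these to match your setup.
+```bash
+# Transcribe a local WAV file through the OpenAI-compatible endpoint (assumed route).
+curl http://localhost:5000/v1/audio/transcriptions \
+  -H "Authorization: Bearer dummy-key" \
+  -F "model=whisper-base" \
+  -F "file=@audio_files/test.wav"
+```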
+ + --- ### 🗣️ Running Whisper From 6b85e6a3b4acab3c4c2199b69f0de9cb979a19dc Mon Sep 17 00:00:00 2001 From: Consolvo Date: Mon, 6 Oct 2025 17:40:55 -0500 Subject: [PATCH 8/8] bullet point syntax --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 6992339..2bf087e 100644 --- a/README.md +++ b/README.md @@ -108,9 +108,9 @@ OpenAI API-compatible local model serving with `lira serve`. - Expose models as HTTP APIs for real-time transcription and seamless integration. - Add speech recognition to your apps, automate workflows, or build custom endpoints using standard REST calls. - ```bash - lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000 - ``` +```bash +lira serve --backend openai --model whisper-base --device cpu --host 0.0.0.0 --port 5000 +``` > Interested in more server features? > Try the **LIRA server demo** with Open WebUI.