feat(chat): add interactive terminal UI with live transcription #185

basnijholt · 2026-01-04T15:26:56Z

Summary

Add interactive terminal UI to the chat command with live transcription mode
Support pause/resume (Escape key) to mute mic for side conversations
Add slash commands: /tts, /mode, /tools, /clear, /help
Enable tool toggling at runtime via /tools disable|enable <name>
Support two input modes: "live" (default, VAD-based) and "direct" (Ctrl+C to end)
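
The slash commands and runtime tool toggling listed above could be dispatched roughly as follows. This is a minimal sketch, not the code in agent_cli/core/chat_state.py: the `ChatState` fields, the `handle_slash_command` name, the status strings, and the choice to make `/tts` a plain toggle are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

HELP_TEXT = "Commands: /tts, /mode live|direct, /tools disable|enable <name>, /clear, /help"


@dataclass
class ChatState:
    """Hypothetical session state; field names are illustrative only."""

    tts_enabled: bool = True
    input_mode: str = "live"  # "live" (VAD-based) or "direct" (Ctrl+C to end)
    disabled_tools: set = field(default_factory=set)
    history: list = field(default_factory=list)


def handle_slash_command(state: ChatState, line: str) -> Optional[str]:
    """Apply a slash command to the state and return a status message.

    Returns None when the line is not a slash command, so the caller can
    send it to the model as a normal chat message.
    """
    if not line.startswith("/"):
        return None
    parts = line[1:].split()
    cmd = parts[0] if parts else ""
    args = parts[1:]
    if cmd == "tts":
        # Assumption: /tts toggles text-to-speech on and off.
        state.tts_enabled = not state.tts_enabled
        return f"TTS {'on' if state.tts_enabled else 'off'}"
    if cmd == "mode" and args and args[0] in ("live", "direct"):
        state.input_mode = args[0]
        return f"Input mode: {args[0]}"
    if cmd == "tools" and len(args) == 2 and args[0] in ("disable", "enable"):
        action, name = args
        if action == "disable":
            state.disabled_tools.add(name)
        else:
            state.disabled_tools.discard(name)
        return f"Tool {name} {action}d"
    if cmd == "clear":
        state.history.clear()
        return "History cleared"
    # /help and anything unrecognized fall back to the help text.
    return HELP_TEXT
```

Returning None for non-command input keeps the main loop simple: anything that is not handled here is treated as a message.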

New Files

agent_cli/core/voice_input.py: Shared VAD recording loop extracted for reuse
agent_cli/core/chat_state.py: Session state management & slash command handling

Dependencies

Added prompt_toolkit>=3.0.0 for async editable input with key bindings

CLI Options

--vad-threshold (0.0-1.0): VAD speech detection threshold
--silence-threshold: Seconds of silence to end a speech segment
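
To illustrate how the two options interact, here is a small sketch that splits a stream of per-frame speech probabilities into segments. It is not the loop in agent_cli/core/voice_input.py: the function name, the frame-based representation, and the default values are assumptions chosen for the example.

```python
def split_segments(probs, frame_seconds, vad_threshold=0.5, silence_threshold=1.0):
    """Return (start, end) frame index pairs for detected speech segments.

    probs: per-frame speech probabilities in the range 0.0-1.0.
    frame_seconds: duration of one frame in seconds.
    vad_threshold: a frame counts as speech when its probability is >= this.
    silence_threshold: seconds of continuous silence that end a segment.
    The end index is exclusive and points just past the last speech frame.
    """
    segments = []
    start = None        # index of the first speech frame in the open segment
    last_speech = None  # index of the most recent speech frame
    for i, p in enumerate(probs):
        if p >= vad_threshold:
            if start is None:
                start = i
            last_speech = i
        elif start is not None:
            # Silence so far covers frames last_speech + 1 through i.
            silence = (i - last_speech) * frame_seconds
            if silence >= silence_threshold:
                segments.append((start, last_speech + 1))
                start = None
    if start is not None:
        # Input ended while a segment was still open.
        segments.append((start, last_speech + 1))
    return segments
```

A higher `vad_threshold` makes the detector less sensitive to background noise, while a longer `silence_threshold` tolerates pauses within a sentence at the cost of a slower response.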

Test plan

All existing tests pass (476 passed, excluding VAD tests with SSL issues)
Pre-commit checks pass (ruff, mypy, formatting)
Manual testing: run agent-cli chat and verify live transcription works
Manual testing: verify Escape pauses/resumes recording
Manual testing: verify slash commands work (/help, /tts, /mode, /tools, /clear)
Manual testing: verify /mode direct switches to original Ctrl+C behavior

Enhance the chat command with an interactive terminal UI that supports:
- Live transcription mode: text appears as you speak, editable before sending
- Pause/resume: Escape key to mute mic for side conversations
- Slash commands: /tts, /mode, /tools, /clear, /help
- Tool toggling: enable/disable specific tools at runtime
- Two input modes: "live" (default, VAD-based) and "direct" (Ctrl+C to end)

New files:
- agent_cli/core/voice_input.py: shared VAD recording loop
- agent_cli/core/chat_state.py: session state & slash command handling

Added prompt_toolkit dependency for async editable input with key bindings.

- Add explicit Ctrl+C key binding to properly exit live input mode
- Track accumulated text length to append new transcriptions instead of replacing entire buffer, preserving cursor position when editing
- Remove conflicting console.print status updates that caused flickering with prompt_toolkit's display management
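
The length-tracking idea can be sketched independently of prompt_toolkit. The class below is hypothetical and assumes the transcriber reports a cumulative transcript that only grows; only the part not yet seen is appended, so edits the user made to earlier text survive.

```python
class TranscriptAppender:
    """Append only the unseen tail of a growing transcript to a text buffer."""

    def __init__(self) -> None:
        # Number of transcript characters already copied into the buffer.
        self.seen_length = 0

    def apply(self, buffer_text: str, transcript: str) -> str:
        """Return buffer_text with the new part of transcript appended."""
        new_part = transcript[self.seen_length:]
        self.seen_length = len(transcript)
        return buffer_text + new_part


appender = TranscriptAppender()
buffer_text = appender.apply("", "hello")
buffer_text = buffer_text.capitalize()  # the user edits the buffer by hand
buffer_text = appender.apply(buffer_text, "hello world")
print(buffer_text)  # Hello world
```

Replacing the whole buffer on each update would have reset the text to "hello world" and discarded the manual edit.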

- Return None from _get_live_input on Ctrl+C to signal exit, main loop now breaks on None instead of continuing with "No input received"
- Add bottom_toolbar to PromptSession showing live status (🎤 Listening, 🔴 Recording, ⏳ Processing, ⏸️ Paused, ✓ Ready)
- Insert transcribed text at cursor position instead of always appending at end, allowing users to position cursor before speaking
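
Inserting at the cursor rather than at the end amounts to a string splice plus a cursor shift. The helper below is a hypothetical stand-alone illustration; inside a prompt_toolkit application, `Buffer.insert_text` offers comparable behaviour, and this PR may well use that or something else entirely.

```python
def insert_at_cursor(text: str, cursor: int, new_text: str):
    """Insert new_text at the cursor and return (updated_text, new_cursor).

    The cursor is clamped to the valid range and moved to the end of the
    inserted text, so consecutive transcriptions land one after another.
    """
    cursor = max(0, min(cursor, len(text)))
    updated = text[:cursor] + new_text + text[cursor:]
    return updated, cursor + len(new_text)


text, cursor = insert_at_cursor("hello world", 5, " big")
print(text, cursor)  # hello big world 9
```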

- Replace emoji status icons with ASCII text to avoid rendering issues
- Use call_soon_threadsafe for all UI updates from background voice task
- Use mutable list holder for status to avoid race conditions
- Schedule buffer text updates on event loop for thread safety
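
For context, the pattern named in this commit looks roughly like the sketch below, which uses only the standard library. The worker thread, status strings, and function names are assumptions for illustration. `loop.call_soon_threadsafe` is only required when the caller runs on a different thread from the event loop; callbacks that already run on the loop can update state directly.

```python
import asyncio
import threading


async def run_demo() -> list:
    """Hand status updates from a worker thread over to the event loop."""
    loop = asyncio.get_running_loop()
    status = ["listening"]  # mutable holder shared with the callbacks
    seen = []
    done = asyncio.Event()

    def set_status(value: str) -> None:
        # Runs on the event loop thread, so touching UI state here is safe.
        status[0] = value
        seen.append(value)
        if value == "ready":
            done.set()

    def voice_worker() -> None:
        # Runs on a background thread; it never mutates state directly and
        # only schedules callbacks on the loop.
        for value in ("recording", "processing", "ready"):
            loop.call_soon_threadsafe(set_status, value)

    thread = threading.Thread(target=voice_worker)
    thread.start()
    await done.wait()
    thread.join()
    return seen


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

Callbacks scheduled this way run in the order they were submitted, so the status transitions arrive in sequence.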

…wrappers
- Remove STATUS_ICONS and status toolbar (was causing display issues)
- Remove unused _create_input_panel function
- Remove call_soon_threadsafe wrappers (callbacks run on same event loop)
- Simplify on_text_update to just set buffer text directly
- Remove unused imports (Panel, Text, VoiceInputStatus)

basnijholt and others added 7 commits January 4, 2026 07:26

Merge 169b650 into d790558

4e51def

Update auto-generated docs

c07d71b

2 participants
