A local AI model runner/server for various Hugging Face models. Developed for MyDictionary, but free for anyone to use.

jhfnetboy/Candle-local-AI-Server

🎵 TTS Server - Local Text-to-Speech Service

Version 0.2.0 | High-performance local TTS server powered by Kokoro-82M ONNX model

License: MIT Rust

📖 Overview

A lightweight, blazing-fast text-to-speech server designed for the MyDictionary Chrome extension. Features 54 high-quality voices with automatic model downloading and intelligent caching. The macOS version now runs as a background menubar application.

✨ Features

  • 🎤 54 Premium Voices - British/American English, male/female options
  • ⚡ Lightning Fast - Rust-powered, sub-second synthesis
  • 💾 Smart Caching - SHA256-based file caching with TTL, stored in ~/Library/Application Support/tts-server/
  • 🔄 Auto Download - Models download automatically on first run
  • 🌐 REST API - Simple HTTP endpoints for easy integration
  • 🎯 Browser Compatible - 16-bit PCM WAV output
  • 🖥️ macOS Menubar App - Runs silently in the background with a menubar icon for quick access and control.
  • 🔒 Single Instance - Prevents multiple instances from running concurrently.
  • 🪵 Detailed Logging - Logs are written to ~/Library/Application Support/tts-server/logs/
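
The "16-bit PCM WAV" claim in the feature list can be checked on any file fetched from the server. The following is a minimal sketch using only Python's standard `wave` module; the helper name `describe_wav` and the file name `output.wav` are illustrative and not part of this project.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read the header of a PCM WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,  # 2 bytes -> 16-bit
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }


# Usage (assumes output.wav was downloaded from the server beforehand):
# print(describe_wav("output.wav"))
```

A file produced by the server is expected to report 16 bits per sample. The sample rate and channel count are not stated in this README, so treat those values as something to observe rather than assume.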

🚀 Quick Start

Option 1: Download Pre-built Binary (Recommended)

macOS (Apple Silicon & Intel)

The macOS version is now a self-contained .app bundle that runs as a background menubar application.

# 1. Download the latest TTS Server.app from the releases page:
#    (e.g., https://github.com/jhfnetboy/Candle-local-AI-Server/releases/download/v0.2.0/TTS_Server.app.zip)

# 2. Extract the downloaded archive (if it's a .zip or .tar.gz)
#    (Example for .zip):
#    unzip TTS_Server.app.zip

# 3. Move (or drag) "TTS Server.app" to your /Applications folder.
mv "TTS Server.app" /Applications/

# 4. Install espeak-ng (required for phonemization)
brew install espeak-ng

# 5. Launch the application
#    You can double-click it from your /Applications folder, or run:
open /Applications/TTS\ Server.app

The application will:

  • Run silently in the background with an icon in your macOS menubar (top-right).
  • Start the server on http://localhost:9527.
  • Download models automatically on first run (~310MB ONNX model, ~50MB voice data). These are stored in ~/Library/Application Support/tts-server/checkpoints/ and ~/Library/Application Support/tts-server/data/.
  • Create a cache directory for audio files in ~/Library/Application Support/tts-server/cache/audio/.
  • Generate detailed logs in ~/Library/Application Support/tts-server/logs/.

Menubar Icon Usage:

  • Left-click on the icon to show "Open UI" and "Quit" options.
  • "Open UI" will open http://localhost:9527 in your default browser.
  • "Quit" will gracefully shut down the server.

Common issues:

  • If you see "cannot be opened because it is from an unidentified developer"
    • Right-click TTS Server.app in the /Applications folder and choose "Open". macOS may ask you to confirm; click "Open". This is usually needed only once.
  • If you see "espeak-ng: command not found"
    • Install it: brew install espeak-ng

Windows (x64)

⚠️ A Windows build will ship in a future release (expected after v0.2.0).

Only macOS is currently supported. Windows users can build from source.


Option 2: Build from Source

Prerequisites:

# Clone the repository
git clone https://github.com/jhfnetboy/Candle-local-AI-Server.git
cd Candle-local-AI-Server

# Install espeak-ng
# macOS:
brew install espeak-ng
# Ubuntu:
sudo apt-get install espeak-ng
# Windows:
choco install espeak-ng

# Build release version (for macOS, this generates a .app bundle)
# Requires the cargo-bundle subcommand: cargo install cargo-bundle
cargo bundle --release

# For macOS, move the generated .app to Applications and launch:
mv target/release/bundle/osx/TTS\ Server.app /Applications/
open /Applications/TTS\ Server.app

# For Linux/Windows, run the raw binary (if you don't need a UI)
# ./target/release/tts-server

🔗 Integration with MyDictionary Extension

Step 1: Start TTS Server

# Make sure the server is running (e.g., double-click TTS Server.app or run from terminal)
# You should see the menubar icon if on macOS.

# You can check server health via:
curl http://localhost:9527/health
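
The same health check can be scripted, for example to wait for the server before sending requests. This is a minimal sketch using Python's standard library. It assumes that a healthy server answers `GET /health` with HTTP 200; the README does not document the response body, so the body is ignored. The function name `server_is_healthy` is illustrative.

```python
import urllib.error
import urllib.request


def server_is_healthy(base_url: str = "http://localhost:9527",
                      timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not healthy".
        return False


# Usage:
# print("TTS server reachable:", server_is_healthy())
```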

Step 2: Install MyDictionary Extension

  1. Download the MyDictionary extension from the Chrome Web Store or build it from source
  2. The extension will automatically detect the local TTS server
  3. Open extension settings → TTS Voice Settings
  4. You'll see a green "✅ Connected" indicator if the server is running

Step 3: Select Your Voice

  1. Go to TTS Voice Settings (Extension popup → Settings → Voice Settings)
  2. Choose from 54 voices:
    • 🇬🇧 British English: George, Daniel, Alice, Emma... (Recommended for learning)
    • 🇺🇸 American English: Michael, Nova, Sarah...
  3. Click Save Settings

Step 4: Enjoy!

Select any text on a webpage and click the 🔊 TTS button in the sidebar.


📡 API Reference

Endpoints

GET / - Server Info

curl http://localhost:9527/

Response:

{
  "success": true,
  "data": {
    "name": "TTS Server",
    "version": "0.2.0",
    "status": "running",
    "framework": "Candle"
  }
}

GET /health - Health Check

curl http://localhost:9527/health

POST /synthesize - Text to Speech

Request:

curl -X POST http://localhost:9527/synthesize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, world!",
    "voice": "bm_george",
    "format": "wav"
  }'

Parameters:

  • text (required): Text to synthesize
  • voice (optional): Voice ID (default: bm_george)
  • format (optional): Output format, currently only wav (reserved for future mp3/ogg support)

Response:

{
  "file_id": "51f91581302698db",
  "url": "http://localhost:9527/audio/51f91581302698db.wav",
  "cached": false
}

GET /audio/:filename - Get Audio File

curl http://localhost:9527/audio/51f91581302698db.wav --output output.wav
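
The two calls above (POST /synthesize, then GET on the returned `url`) can be combined in a small client. The following is a sketch using only Python's standard library. It relies on the request and response fields documented above (`text`, `voice`, `format`, `url`); the function names and the 30-second timeout are illustrative choices, not part of the server.

```python
import json
import urllib.request


def build_synthesize_request(text: str, voice: str = "bm_george",
                             base_url: str = "http://localhost:9527"):
    """Build the POST /synthesize request with a JSON body."""
    body = json.dumps({"text": text, "voice": voice, "format": "wav"}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/synthesize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, out_path: str, voice: str = "bm_george",
                       base_url: str = "http://localhost:9527") -> dict:
    """Synthesize text, download the resulting WAV, and return the JSON response."""
    request = build_synthesize_request(text, voice, base_url)
    with urllib.request.urlopen(request, timeout=30) as resp:
        info = json.load(resp)  # expected keys: file_id, url, cached
    with urllib.request.urlopen(info["url"], timeout=30) as audio:
        with open(out_path, "wb") as out:
            out.write(audio.read())
    return info


# Usage (requires a running server):
# info = synthesize_to_file("Hello, world!", "output.wav")
# print(info["file_id"], "cached:", info["cached"])
```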

Voice List

See VOICE_API.md for complete list of 54 available voices.

Recommended voices for English learning:

  • bm_george - British male, clear and standard
  • bm_daniel - British male, accurate pronunciation
  • af_nova - American female, recommended
  • am_michael - American male, standard
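
The four IDs above suggest a naming pattern: the first letter encodes the accent (a = American, b = British), the second the gender (f = female, m = male), followed by the voice name. That pattern is inferred from these examples only, and VOICE_API.md remains the authoritative list. Under that assumption, a hypothetical helper for labelling voices in a UI might look like this:

```python
# Prefix meanings inferred from the IDs listed in this README (an assumption).
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice_id: str) -> str:
    """Turn an ID such as 'bm_george' into a human-readable label."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice ID format: {voice_id!r}")
    accent = ACCENTS.get(prefix[0], "unknown accent")
    gender = GENDERS.get(prefix[1], "unknown gender")
    return f"{accent} {gender} ({name})"


# Usage:
# print(describe_voice("bm_george"))
```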

🛠️ Configuration

Port Configuration

By default, the server runs on port 9527. To change:

Edit src/main.rs:

let addr = SocketAddr::from(([0, 0, 0, 0], 9527));  // Change port here

Then rebuild:

cargo build --release

Cache Configuration

  • Location: ~/Library/Application Support/tts-server/cache/audio/
  • TTL: 1 hour (3600 seconds)
  • Format: SHA256-based file IDs

To change cache settings, edit src/main.rs:

AudioCache::new("cache/audio", 3600)  // Change TTL (seconds)

🐛 Troubleshooting

Problem: Server won't start

Solution 1: Check if port 9527 is already in use

# macOS/Linux:
lsof -i :9527

# Windows:
netstat -ano | findstr :9527
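
The port check can also be done in a platform-independent way. This is a small sketch using Python's standard `socket` module; the function name `port_in_use` is illustrative. It only reports whether something accepts TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int = 9527, host: str = "127.0.0.1",
                timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


# Usage:
# print("Port 9527 in use:", port_in_use())
```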

Solution 2: Check espeak-ng installation

espeak-ng --version

If not installed, see Quick Start for installation instructions.

Problem: Extension shows "Disconnected"

  1. Make sure the TTS server is running: http://localhost:9527/health
  2. Check browser console for CORS errors
  3. Restart the server and reload the extension

Problem: "Model not found" error

The models should download automatically on first run and are stored in ~/Library/Application Support/tts-server/checkpoints/ and ~/Library/Application Support/tts-server/data/. If the download fails:

# Manual download (you might need to provide the full path to download_models.sh inside the .app bundle)
# For example, if TTS Server.app is in /Applications:
/Applications/TTS\ Server.app/Contents/Resources/download_models.sh

Problem: Windows - "espeak-ng not found"

⚠️ A Windows build will ship in a future release (expected after v0.2.0).

For now, Windows users can build from source or wait for the official Windows release.


🏗️ Project Structure

tts-server/
├── src/
│   ├── main.rs           # HTTP server & routes
│   ├── tts_engine.rs     # Kokoro ONNX inference
│   ├── cache.rs          # File caching system
│   ├── vocab.rs          # Tokenization
│   └── wav_encoder.rs    # WAV audio encoding
├── checkpoints/          # ONNX models (auto-downloaded to Application Support)
├── data/voices/          # 54 voice embeddings (auto-downloaded to Application Support)
├── Cargo.toml            # Rust dependencies
├── Info.plist.in         # macOS app bundle configuration
└── README.md             # This file

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments


📞 Support


Made with ❤️ by Jason
