Text-to-speech running locally on a Mac Mini M4 Pro. No cloud APIs.

Live · Local · No Cloud
Voice clone generation runs Chatterbox on Apple Silicon and takes 15–20 seconds.
Input limit: 500 characters.
Reported for each generation: output waveform, audio length, generation time, and realtime factor.
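
The page does not define the realtime factor. Here is a minimal sketch, assuming the common convention of generation time divided by audio length, where values below 1 mean faster than real time. The function name and the sample numbers are illustrative, not measurements from this demo.

```python
def realtime_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return generation time divided by audio duration (assumed convention)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if generation_seconds < 0:
        raise ValueError("generation_seconds must be non-negative")
    return generation_seconds / audio_seconds


# Illustrative numbers only: 2.0 s of compute for 5.0 s of audio.
rtf = realtime_factor(2.0, 5.0)
print(f"Realtime factor: {rtf:.2f}")  # prints "Realtime factor: 0.40"
```

Some tools report the inverse (audio length divided by generation time), so it is worth checking which convention a given number follows before comparing results.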

Under 2 Seconds

Typical generation time for short phrases using stock voices on Apple Silicon; voice cloning takes 15–20 seconds.

🔒

100% Local

Your text never leaves this server. No OpenAI, no ElevenLabs, no API keys.

🧠

Kokoro + Chatterbox

Stock voices via Kokoro (82M parameters). Voice clone via Chatterbox Turbo (350M parameters). Both run locally on Apple Silicon.
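
The split between the two models can be pictured as a small routing step. Here is a minimal sketch, assuming the server picks Chatterbox Turbo whenever a reference voice sample is supplied and Kokoro otherwise. `TTSRequest`, `pick_engine`, and the engine labels are hypothetical names for illustration, not the actual server code or either library's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TTSRequest:
    text: str
    # Hypothetical field: path to a reference recording for voice cloning.
    reference_audio: Optional[str] = None


def pick_engine(request: TTSRequest) -> str:
    """Return the label of the model assumed to handle this request."""
    if request.reference_audio:
        return "chatterbox-turbo"  # voice clone path
    return "kokoro"  # stock voice path


print(pick_engine(TTSRequest("Hello there.")))
print(pick_engine(TTSRequest("Hello there.", reference_audio="sample.wav")))
```

Keeping the choice in one function would make it easy to give each path its own timeout, since the page quotes very different generation times for the two.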