Pocket TTS is a lightweight local text-to-speech (TTS) model from Kyutai that runs on CPU. It offers ~200 ms latency, voice cloning, and 8 built-in voices without requiring a GPU or an external API.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
Quick Start
Pocket TTS runs locally. No API key required.
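A minimal sketch of creating the TTS instance, assuming the plugin follows the usual Vision Agents plugin layout: the import path `vision_agents.plugins.pocket`, the `TTS` class name, and the `build_tts` helper are assumptions for illustration, not confirmed API. The only parameter used is `voice`, as listed under Parameters.

```python
def build_tts(voice: str = "alba"):
    """Create a Pocket TTS instance for the given voice (hypothetical helper)."""
    # Assumption: the plugin is importable as vision_agents.plugins.pocket
    # and exposes a TTS class that accepts `voice`. The import is done lazily
    # so this module still loads when the plugin is not installed.
    from vision_agents.plugins import pocket

    # No API key is passed because Pocket TTS runs locally.
    return pocket.TTS(voice=voice)
```

The returned object would typically be supplied as the TTS component when constructing an agent.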
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| voice | str | "alba" | Built-in voice name, or path to a WAV file for cloning |
Built-in Voices
alba, marius, javert, jean, fantine, cosette, eponine, azelma
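Because `voice` accepts either one of these names or a WAV path, it can help to check the value before starting the agent. The sketch below is a hypothetical helper (`classify_voice` is not part of the plugin), and the `.wav` extension check is an assumption about how a cloning path would be recognized:

```python
# The eight built-in voice names listed above.
BUILT_IN_VOICES = (
    "alba", "marius", "javert", "jean",
    "fantine", "cosette", "eponine", "azelma",
)


def classify_voice(voice: str) -> str:
    """Return "built-in" for a known voice name, "clone" for a .wav path."""
    if voice in BUILT_IN_VOICES:
        return "built-in"
    # Assumption: anything ending in .wav is treated as a cloning reference.
    if voice.lower().endswith(".wav"):
        return "clone"
    raise ValueError(
        f"Unknown voice {voice!r}; expected one of {BUILT_IN_VOICES} or a .wav path"
    )
```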
Voice Cloning
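Per the Parameters table, cloning is selected by passing a path to a WAV file as `voice`. A minimal sketch that checks the reference file is a readable WAV before handing it over, using only the standard library `wave` module. `checked_wav_path` is a hypothetical helper, and since no sample-rate or duration requirements are stated here, none are enforced:

```python
import wave
from pathlib import Path


def checked_wav_path(path: str) -> str:
    """Return the path as a string if it is a readable, non-empty PCM WAV file."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Reference audio not found: {p}")
    # wave.open raises wave.Error if the file is not a WAV it can parse.
    with wave.open(str(p), "rb") as wav:
        if wav.getnframes() == 0:
            raise ValueError(f"Reference audio is empty: {p}")
    return str(p)
```

The validated path would then be given as the `voice` value in place of a built-in name, for example `voice=checked_wav_path("my_voice.wav")`.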
Next Steps
Build a Voice Agent
Get started with voice
Build a Video Agent
Add video processing

