# Pocket TTS

[Pocket TTS](https://huggingface.co/kyutai/pocket-tts) is a lightweight, local text-to-speech (TTS) model from Kyutai that runs on CPU. It offers \~200 ms latency, voice cloning, and 8 built-in voices, with no GPU or external API required.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

## Installation

```sh theme={null}
uv add "vision-agents[pocket]"
```

## Quick Start

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import pocket, gemini, deepgram, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-3-flash-preview"),
    stt=deepgram.STT(),
    tts=pocket.TTS(),
)
```

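<Note>Pocket TTS runs locally, so the TTS itself needs no API key. The other plugins in this example (Gemini, Deepgram, Stream) still need their own credentials.</Note>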

## Parameters

| Name    | Type  | Default  | Description                                                                         |
| ------- | ----- | -------- | ----------------------------------------------------------------------------------- |
| `voice` | `str` | `"alba"` | Built-in voice name, or a local path or `hf://` URL to a WAV file for voice cloning |

## Built-in Voices

`alba`, `marius`, `javert`, `jean`, `fantine`, `cosette`, `eponine`, `azelma`

## Voice Cloning

```python theme={null}
from vision_agents.plugins import pocket

# Clone a voice from a local WAV file
tts = pocket.TTS(voice="path/to/your/voice.wav")

# Or use a voice hosted on Hugging Face
tts = pocket.TTS(voice="hf://kyutai/tts-voices/alba-mackenna/casual.wav")
```
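Because `voice` accepts a built-in name, a local WAV path, or an `hf://` URL, a typo in a built-in name could easily be mistaken for a file path. The sketch below shows one way to check the value before constructing the TTS. `classify_voice` and `BUILTIN_VOICES` are hypothetical helpers written for this page, not part of the plugin, and the `.wav` suffix check is an assumption about how you name your reference files.

```python
from pathlib import Path

# Built-in voice names as listed on this page (hypothetical constant, not a plugin export).
BUILTIN_VOICES = {
    "alba", "marius", "javert", "jean",
    "fantine", "cosette", "eponine", "azelma",
}


def classify_voice(voice: str) -> str:
    """Return "builtin", "hf", or "file" for a `voice` value.

    Hypothetical helper: raises ValueError when the value is neither a
    known built-in name nor something that looks like a WAV reference.
    It does not check that the file or URL actually exists.
    """
    if voice in BUILTIN_VOICES:
        return "builtin"
    if voice.startswith("hf://"):
        return "hf"
    if Path(voice).suffix.lower() == ".wav":
        return "file"
    raise ValueError(
        f"Unrecognized voice {voice!r}: expected one of "
        f"{sorted(BUILTIN_VOICES)}, an hf:// URL, or a .wav path"
    )
```

After a call such as `classify_voice("alba")` succeeds, the same string can be passed on as `pocket.TTS(voice=...)`. A misspelled name like `"albaa"` raises a `ValueError` up front instead of surfacing later as a failed file lookup.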

## Next Steps

<CardGroup cols={2}>
  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>

  <Card title="Build a Video Agent" icon="video" href="/introduction/video-agents">
    Add video processing
  </Card>
</CardGroup>
