# Fish Audio TTS

[Fish Audio](https://fish.audio) provides high-quality text-to-speech with fine-grained prosody control, voice cloning support, and multiple backend models. It is well suited to multilingual applications.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

<Tip>
  Fish Audio also provides [speech-to-text](/integrations/stt/fish) with automatic language detection. You can use both in the same agent.
</Tip>

## Installation

```sh theme={null}
uv add "vision-agents[fish]"
```

## Quick Start

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import fish, gemini, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-3-flash-preview"),
    stt=fish.STT(),
    tts=fish.TTS(),  # Uses the s2-pro model by default
)
```

<Warning>
  Set `FISH_API_KEY` in your environment, or pass `api_key` directly to `fish.TTS()`.
</Warning>
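
If you would rather fail early when the key is missing, you can resolve it yourself before constructing the TTS. The following is a minimal sketch: `resolve_fish_api_key` is a hypothetical helper written for illustration and is not part of the plugin. Only the `FISH_API_KEY` variable and the `api_key` parameter come from this page.

```python
import os
from typing import Mapping, Optional


def resolve_fish_api_key(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit key if given, else FISH_API_KEY from the environment.

    Hypothetical helper for illustration; not part of vision-agents.
    """
    key = explicit or env.get("FISH_API_KEY")
    if not key:
        raise RuntimeError("Set FISH_API_KEY or pass api_key explicitly")
    return key


# Usage (assumes vision-agents[fish] is installed):
# from vision_agents.plugins import fish
# tts = fish.TTS(api_key=resolve_fish_api_key())
```

Raising before the agent starts keeps a configuration mistake from surfacing later as a failed synthesis request.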

## Basic Usage

```python theme={null}
# reference_id is optional; set it to the voice ID of a cloned voice
tts = fish.TTS(reference_id="your_voice_id")
```

## Prosody Control

The default `s2-pro` model supports inline control tags, written in square brackets, for steering prosody:

```python theme={null}
tts = fish.TTS()  # Uses s2-pro by default

# Include prosody tags directly in the text to be spoken
text = "[whisper] This is a secret. [super happy] But this is great news!"

# Tags can also appear mid-text
text_with_laugh = "Hello! [laugh] That's so funny."
```
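
When tagged text is assembled programmatically, a small helper can keep the bracket syntax consistent. This is only a sketch: `with_tag` is a hypothetical function, and the tag names used here are the ones shown above, not an exhaustive list of what the model accepts.

```python
def with_tag(tag: str, text: str) -> str:
    """Prefix text with an inline control tag such as [whisper].

    Hypothetical helper; it only formats the string and does not
    check whether the model recognizes the tag.
    """
    tag = tag.strip()
    if not tag or "[" in tag or "]" in tag:
        raise ValueError(f"invalid tag name: {tag!r}")
    return f"[{tag}] {text}"


line = " ".join([
    with_tag("whisper", "This is a secret."),
    with_tag("super happy", "But this is great news!"),
])
print(line)
# [whisper] This is a secret. [super happy] But this is great news!
```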

## Selecting a Model

```python theme={null}
# Use the latest S2-Pro model with prosody control
tts = fish.TTS(model="s2-pro")

# Use legacy models if needed
tts = fish.TTS(model="speech-1.5")
tts = fish.TTS(model="speech-1.6")

# Use fast models for lower latency
tts = fish.TTS(model="s1")
tts = fish.TTS(model="s1-mini")
```
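
If the model is chosen at deploy time, it can help to validate the name before passing it to `fish.TTS`. The sketch below assumes a made-up environment variable, `FISH_TTS_MODEL`, which the plugin itself does not read. The model names are the ones listed on this page.

```python
import os
from typing import Mapping

# Model names as listed on this page.
SUPPORTED_MODELS = ("s2-pro", "speech-1.5", "speech-1.6", "s1", "s1-mini")
DEFAULT_MODEL = "s2-pro"


def pick_model(env: Mapping[str, str] = os.environ) -> str:
    """Read the model name from FISH_TTS_MODEL (a hypothetical variable).

    Falls back to the default model when the variable is unset and
    raises ValueError for names not listed in SUPPORTED_MODELS.
    """
    name = env.get("FISH_TTS_MODEL", DEFAULT_MODEL).strip().lower()
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"unknown model {name!r}; expected one of {SUPPORTED_MODELS}"
        )
    return name


# Usage (assumes vision-agents[fish] is installed):
# from vision_agents.plugins import fish
# tts = fish.TTS(model=pick_model())
```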

## Parameters

| Name           | Type  | Default    | Description                                                                    |
| -------------- | ----- | ---------- | ------------------------------------------------------------------------------ |
| `model`        | `str` | `"s2-pro"` | Backend model: `"s2-pro"`, `"speech-1.5"`, `"speech-1.6"`, `"s1"`, `"s1-mini"` |
| `reference_id` | `str` | `None`     | Voice ID for voice cloning                                                     |
| `api_key`      | `str` | `None`     | API key (defaults to `FISH_API_KEY` env var)                                   |
| `base_url`     | `str` | `None`     | Custom API endpoint                                                            |
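
To keep these settings in one place, you could collect them in a small container and pass only the values you actually set. This is a sketch under assumptions: `FishTTSSettings` is a hypothetical class, not something the plugin provides, and its fields simply mirror the table above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class FishTTSSettings:
    """Hypothetical container mirroring the parameters in the table above."""

    model: str = "s2-pro"
    reference_id: Optional[str] = None
    api_key: Optional[str] = None
    base_url: Optional[str] = None

    def to_kwargs(self) -> Dict[str, Any]:
        # Drop unset values so the plugin's own defaults apply.
        return {k: v for k, v in asdict(self).items() if v is not None}


settings = FishTTSSettings(reference_id="your_voice_id")
print(settings.to_kwargs())
# {'model': 's2-pro', 'reference_id': 'your_voice_id'}

# Usage (assumes vision-agents[fish] is installed):
# from vision_agents.plugins import fish
# tts = fish.TTS(**settings.to_kwargs())
```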

## Next Steps

<CardGroup cols={2}>
  <Card title="Fish Audio STT" icon="microphone" href="/integrations/stt/fish">
    Speech-to-text with auto language detection
  </Card>

  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>
</CardGroup>
