# Fast-Whisper

[Fast-Whisper](https://github.com/guillaumekln/faster-whisper) is a high-performance, local speech-to-text (STT) engine built on CTranslate2. It provides 2-4x faster inference than the standard Whisper implementation and supports both CPU and GPU acceleration.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

## Installation

```sh theme={null}
uv add "vision-agents[fast-whisper]"
```

## Quick Start

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import fast_whisper, gemini, elevenlabs, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-3-flash-preview"),
    stt=fast_whisper.STT(model_size="base"),
    tts=elevenlabs.TTS(),
)
```

<Note>
  Fast-Whisper runs locally, so no API key is required. Models download
  automatically on first use.
</Note>

## Parameters

| Name           | Type  | Default  | Description                                                          |
| -------------- | ----- | -------- | -------------------------------------------------------------------- |
| `model_size`   | `str` | `"base"` | Model size (`"tiny"`, `"base"`, `"small"`, `"medium"`, `"large-v3"`) |
| `language`     | `str` | `None`   | Language code or `None` for auto-detect                              |
| `device`       | `str` | `"cpu"`  | Device (`"cpu"`, `"cuda"`, `"auto"`)                                 |
| `compute_type` | `str` | `"int8"` | Precision (`"int8"`, `"float16"`, `"float32"`)                       |

## Model Sizes

| Model      | Speed     | Use Case                        |
| ---------- | --------- | ------------------------------- |
| `tiny`     | Fastest   | Real-time, resource-constrained |
| `base`     | Very Fast | General purpose                 |
| `small`    | Fast      | Balanced                        |
| `medium`   | Moderate  | Higher accuracy                 |
| `large-v3` | Slowest   | Maximum accuracy                |

## Optimization

```python theme={null}
from vision_agents.plugins import fast_whisper

# CPU (default) - use int8 for best performance
stt = fast_whisper.STT(device="cpu", compute_type="int8")

# GPU - use float16 for speed and accuracy
stt = fast_whisper.STT(device="cuda", compute_type="float16")
```
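When the same agent code runs on machines with and without a GPU, the two configurations above can be selected at runtime. The following is a minimal sketch, assuming the `ctranslate2` package (the backend Fast-Whisper is built on) is importable. The helper names `pick_compute_settings` and `count_cuda_devices` are hypothetical and not part of Vision Agents.

```python
def pick_compute_settings(cuda_device_count: int) -> dict:
    """Map the number of visible CUDA devices to the settings shown above."""
    if cuda_device_count > 0:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def count_cuda_devices() -> int:
    """Ask CTranslate2 how many CUDA devices it can see; 0 if it is not installed."""
    try:
        import ctranslate2
    except ImportError:
        return 0
    return ctranslate2.get_cuda_device_count()


settings = pick_compute_settings(count_cuda_devices())
# With the fast-whisper extra installed:
# stt = fast_whisper.STT(model_size="base", **settings)
```

The Parameters table also lists `device="auto"`, which may be enough when only the device needs to vary. An explicit helper is mainly useful when `compute_type` should change along with the device.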

## Next Steps

<CardGroup cols={2}>
  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>

  <Card title="Build a Video Agent" icon="video" href="/introduction/video-agents">
    Add video processing
  </Card>
</CardGroup>