> ## Documentation Index
> Fetch the complete documentation index at: https://visionagents.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Qwen Realtime

[Qwen3 Realtime](https://www.alibabacloud.com/en/solutions/generative-ai/qwen) provides native audio input and output over WebSocket, with built-in speech-to-text (STT) and text-to-speech (TTS), so no external speech services are required.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

<Tip>
  Qwen models can also be used as a traditional [LLM](/integrations/llm/qwen) via their OpenAI-compatible endpoint.
</Tip>

## Installation

```sh theme={null}
uv add "vision-agents[qwen]"
```

## Quick Start

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import qwen, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=qwen.Realtime(fps=1),  # Enable video with fps > 0
)
```

<Warning>Set `DASHSCOPE_API_KEY` in your environment, or pass `api_key` to `qwen.Realtime`.</Warning>
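
A missing key may only surface once the agent tries to connect, so it can help to check for it at startup. The following is a minimal sketch using only the standard library; the helper name `require_dashscope_key` is hypothetical and not part of Vision Agents.

```python
import os


def require_dashscope_key(env=None):
    """Return the DashScope API key or raise a clear error.

    Hypothetical helper (not part of Vision Agents). `env` defaults to
    os.environ and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    # Treat a whitespace-only value the same as a missing one.
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it or pass api_key to qwen.Realtime"
        )
    return key
```

Calling `require_dashscope_key()` before constructing the `Agent` fails fast with a readable message instead of a connection error later on.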

## Parameters

| Name                      | Type   | Default                       | Description                                       |
| ------------------------- | ------ | ----------------------------- | ------------------------------------------------- |
| `model`                   | `str`  | `"qwen3-omni-flash-realtime"` | Qwen Realtime model                               |
| `voice`                   | `str`  | `"Cherry"`                    | Voice for audio output                            |
| `fps`                     | `int`  | `1`                           | Video frames per second                           |
| `include_video`           | `bool` | `False`                       | Include video frames                              |
| `vad_silence_duration_ms` | `int`  | `900`                         | Milliseconds of silence before a turn ends        |
| `api_key`                 | `str`  | `None`                        | API key (defaults to `DASHSCOPE_API_KEY` env var) |

<Note>
  Qwen Realtime does not support text input. Start speaking once you join the
  call.
</Note>
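
The parameters in the table above can be collected in one place before the LLM is constructed. The sketch below assumes they are all accepted as keyword arguments by `qwen.Realtime`, as the table suggests. The helpers `realtime_kwargs` and `build_realtime` are hypothetical, and the range checks are illustrative assumptions rather than rules enforced by the library.

```python
def realtime_kwargs(
    model="qwen3-omni-flash-realtime",
    voice="Cherry",
    fps=1,
    include_video=False,
    vad_silence_duration_ms=900,
    api_key=None,
):
    """Collect keyword arguments for qwen.Realtime using the documented defaults.

    Hypothetical helper; the range checks below are assumptions.
    """
    if fps < 0:
        raise ValueError("fps must be non-negative")
    if vad_silence_duration_ms <= 0:
        raise ValueError("vad_silence_duration_ms must be positive")
    kwargs = {
        "model": model,
        "voice": voice,
        "fps": fps,
        "include_video": include_video,
        "vad_silence_duration_ms": vad_silence_duration_ms,
    }
    # Leave api_key out when unset so the DASHSCOPE_API_KEY fallback applies.
    if api_key is not None:
        kwargs["api_key"] = api_key
    return kwargs


def build_realtime(**overrides):
    """Create the Realtime LLM; requires `vision-agents[qwen]` to be installed."""
    from vision_agents.plugins import qwen  # imported lazily on purpose

    return qwen.Realtime(**realtime_kwargs(**overrides))
```

For example, `build_realtime(include_video=True, fps=2, vad_silence_duration_ms=600)` would request video frames at 2 fps and end a turn after 600 ms of silence, assuming the parameters behave as described in the table.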

## Next Steps

<CardGroup cols={2}>
  <Card title="Qwen LLM" icon="brain" href="/integrations/llm/qwen">
    Traditional LLM via OpenAI-compatible API
  </Card>

  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>
</CardGroup>
