Qwen provides powerful language models via the DashScope API. Use ChatCompletionsLLM from the OpenAI plugin with Qwen’s OpenAI-compatible endpoint.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Qwen also provides Realtime speech-to-speech with native audio I/O and built-in STT/TTS.

Installation

uv add "vision-agents[openai]"

Quick Start

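import os

from vision_agents.core import Agent, User
from vision_agents.plugins import openai, deepgram, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    # Point the OpenAI-compatible client at Qwen's DashScope endpoint.
    llm=openai.ChatCompletionsLLM(
        model="qwen-plus",
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
        # Read the key from the environment; raises KeyError if it is unset.
        api_key=os.environ["DASHSCOPE_API_KEY"],
    ),
    stt=deepgram.STT(),
    tts=deepgram.TTS(),
)
Set DASHSCOPE_API_KEY in your environment before running the agent. The example reads the key from that variable rather than hardcoding it.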
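
One way to keep the key out of source code and fail early when it is missing is to collect the LLM arguments in a small function. The following is a minimal sketch, not part of Vision Agents: the helper name qwen_llm_kwargs and the constant DASHSCOPE_BASE_URL are hypothetical, while the model name, endpoint, and argument names are the ones used in the Quick Start.

```python
import os

# Endpoint taken from the Quick Start; the constant name is illustrative.
DASHSCOPE_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"


def qwen_llm_kwargs(model: str = "qwen-plus") -> dict:
    """Hypothetical helper: build keyword arguments for ChatCompletionsLLM."""
    api_key = os.environ.get("DASHSCOPE_API_KEY")
    if not api_key:
        # Fail with a clear message instead of sending an empty key.
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    return {
        "model": model,
        "base_url": DASHSCOPE_BASE_URL,
        "api_key": api_key,
    }
```

Under these assumptions, the agent's LLM could be created with llm=openai.ChatCompletionsLLM(**qwen_llm_kwargs()), which passes the same three arguments as the Quick Start.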

Next Steps

Qwen Realtime: Native speech-to-speech with built-in STT/TTS

Build a Voice Agent: Get started with voice