xAI provides realtime speech-to-speech over WebSocket with server-side voice activity detection (VAD), built-in web search, and X search, so no separate STT or TTS service is needed.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
xAI also provides a traditional LLM and standalone text-to-speech.

Installation

uv add "vision-agents[xai]"

Quick start

from vision_agents.core import Agent, User
from vision_agents.plugins import xai, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful voice assistant.",
    llm=xai.Realtime(),
)
Set XAI_API_KEY in your environment or pass api_key directly.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | "grok-4-1-fast-non-reasoning" | Grok realtime model |
| voice | str | "Ara" | Voice ("Ara", "Rex", "Sal", "Eve", "Leo") |
| api_key | str | None | API key (defaults to XAI_API_KEY env var) |
| turn_detection | str or None | "server_vad" | Turn detection mode ("server_vad" or None for manual) |
| vad_interrupt_response | bool | False | Allow VAD to auto-cancel the assistant response on detected speech |
| web_search | bool | True | Enable web search tool |
| x_search | bool | True | Enable X (Twitter) search tool |
| x_search_allowed_handles | list[str] | None | Restrict X search to specific handles |
vad_interrupt_response defaults to False because speaker-to-mic echo can cause the server to cancel the agent’s own response mid-sentence. Set to True only if your audio setup avoids echo feedback.
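As a sketch, the parameters above can be combined into a more explicit configuration. The values here are illustrative assumptions, not recommended settings:

```python
from vision_agents.plugins import xai

llm = xai.Realtime(
    model="grok-4-1-fast-non-reasoning",  # default realtime model
    voice="Rex",                          # one of "Ara", "Rex", "Sal", "Eve", "Leo"
    turn_detection=None,                  # manual turn handling instead of "server_vad"
    vad_interrupt_response=False,         # keep False unless your audio setup avoids echo
    web_search=True,
    x_search=True,
    x_search_allowed_handles=["xai"],     # illustrative handle list
)
```

With turn_detection=None the server no longer detects turn boundaries for you, so your application is responsible for signaling when the user has finished speaking.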

Function calling

@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny and 72°F"
See the Function Calling guide for details.
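Registered functions are ordinary async Python callables, so they can be exercised directly in tests before wiring them into an agent. A minimal sketch, using the weather example from above outside the decorator:

```python
import asyncio

# The same tool function as above; inside an agent it would be registered with
# @agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> str:
    # A real implementation would call a weather API here.
    return f"The weather in {location} is sunny and 72°F"

# Plain async Python, so it can be run and asserted on directly:
print(asyncio.run(get_weather("Austin")))
```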

Next steps

xAI LLM

Advanced reasoning with Grok

xAI TTS

Text-to-speech with expressive voices