xAI TTS

xAI provides text-to-speech with five expressive voices, inline speech tags for delivery control, and multiple output codecs.

Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.

xAI also provides an LLM and Realtime speech-to-speech. You can use all three in the same agent.

Installation

uv add "vision-agents[xai]"

Quick start

from vision_agents.core import Agent, User
from vision_agents.plugins import xai, getstream, deepgram

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=xai.LLM(model="grok-4.1"),
    stt=deepgram.STT(),
    tts=xai.TTS(),
)

Set `XAI_API_KEY` in your environment, or pass `api_key` directly to the constructor.
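The lookup order described above can be sketched in a few lines. `resolve_api_key` is a hypothetical helper written for illustration and is not part of the plugin; it only mirrors the documented order (explicit argument first, then the `XAI_API_KEY` environment variable) so that a missing key fails early with a clear message.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Return the explicit key if given, else XAI_API_KEY from the environment.

    Hypothetical helper for illustration; the plugin performs its own lookup.
    """
    key = explicit or os.environ.get("XAI_API_KEY")
    if not key:
        raise RuntimeError("Set XAI_API_KEY or pass api_key explicitly")
    return key
```

The result could then be handed to the plugin, for example `xai.TTS(api_key=resolve_api_key())`.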

Parameters

tts = xai.TTS(voice="eve", language="en", codec="pcm", sample_rate=24000)

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str` | `None` | API key (defaults to the `XAI_API_KEY` env var) |
| `voice` | `str` | `"eve"` | Voice (`"eve"`, `"ara"`, `"leo"`, `"rex"`, `"sal"`) |
| `language` | `str` | `"en"` | BCP-47 language code (e.g. `"en"`, `"zh"`, `"pt-BR"`) or `"auto"` |
| `codec` | `str` | `"pcm"` | Output codec (`"pcm"`, `"wav"`, `"mp3"`, `"mulaw"`, `"alaw"`) |
| `sample_rate` | `int` | `24000` | Output sample rate in Hz (8000, 16000, 22050, 24000, 44100, or 48000) |
| `bit_rate` | `int` | `None` | MP3 bit rate (only used when codec is `"mp3"`) |
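Since several parameters accept only a fixed set of values, it can help to check them before constructing the TTS object. The sketch below is a hypothetical helper (`build_tts_options` is not part of the plugin) that validates against the values listed in the table and returns keyword arguments. Rejecting `bit_rate` for non-MP3 codecs is a deliberately stricter choice than the table implies, which only says the value is unused in that case.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter table above.
VOICES = {"eve", "ara", "leo", "rex", "sal"}
CODECS = {"pcm", "wav", "mp3", "mulaw", "alaw"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def build_tts_options(
    voice: str = "eve",
    language: str = "en",
    codec: str = "pcm",
    sample_rate: int = 24000,
    bit_rate: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate options and return them as keyword arguments.

    Hypothetical helper; `language` is passed through unchecked because
    BCP-47 codes are open-ended.
    """
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if codec not in CODECS:
        raise ValueError(f"unknown codec: {codec!r}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    if bit_rate is not None and codec != "mp3":
        # Stricter than the docs: fail instead of silently ignoring it.
        raise ValueError("bit_rate only applies when codec is 'mp3'")

    options: Dict[str, Any] = {
        "voice": voice,
        "language": language,
        "codec": codec,
        "sample_rate": sample_rate,
    }
    if bit_rate is not None:
        options["bit_rate"] = bit_rate
    return options
```

The returned dictionary could be unpacked into the constructor, for example `xai.TTS(**build_tts_options(voice="rex", codec="mulaw", sample_rate=8000))`.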

Voices

| Voice | Description |
| --- | --- |
| `eve` | Energetic, upbeat; engaging and enthusiastic (default) |
| `ara` | Warm, friendly; balanced and conversational |
| `leo` | Authoritative, strong; commanding, great for instructional content |
| `rex` | Confident, clear; professional, ideal for business |
| `sal` | Smooth, balanced; versatile for a wide range of contexts |

Speech tags

You can use inline speech tags in your text for fine-grained delivery control.

Inline tags: [pause] [long-pause] [laugh] [chuckle] [giggle] [cry] [tsk] [tongue-click] [lip-smack] [breath] [inhale] [exhale] [sigh] [hum-tune]

Wrapping tags: <whisper>, <shout>, <slow>, <fast>, <soft>, <loud>, <high-pitch>, <low-pitch>, <sing>
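As a rough illustration of how tagged text might be assembled, the sketch below defines two hypothetical helpers, `inline` and `wrap`, that check tag names against the lists above before formatting them. The closing form `</tag>` for wrapping tags is an assumption; the lists above show only the opening form.

```python
# Tag names copied from the lists above.
INLINE_TAGS = {
    "pause", "long-pause", "laugh", "chuckle", "giggle", "cry", "tsk",
    "tongue-click", "lip-smack", "breath", "inhale", "exhale", "sigh",
    "hum-tune",
}
WRAPPING_TAGS = {
    "whisper", "shout", "slow", "fast", "soft", "loud",
    "high-pitch", "low-pitch", "sing",
}


def inline(tag: str) -> str:
    """Format an inline tag such as [pause]; hypothetical helper."""
    if tag not in INLINE_TAGS:
        raise ValueError(f"unknown inline tag: {tag!r}")
    return f"[{tag}]"


def wrap(tag: str, text: str) -> str:
    """Wrap text in a tag pair; the </tag> closing form is an assumption."""
    if tag not in WRAPPING_TAGS:
        raise ValueError(f"unknown wrapping tag: {tag!r}")
    return f"<{tag}>{text}</{tag}>"


text = f"Welcome back. {inline('pause')} {wrap('whisper', 'This part is a secret.')}"
print(text)
```

Checking names up front turns a typo such as `[puase]` into an immediate error rather than text passed through to the synthesizer unchanged.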

Next steps

xAI LLM

xAI Realtime
