# xAI Realtime

> Speech-to-speech using xAI's Grok models over WebSocket with built-in VAD.

[xAI](https://x.ai/) provides realtime speech-to-speech over WebSocket with server-side voice activity detection (VAD), built-in web search, and X search. You don't need separate STT or TTS components.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

<Tip>
  xAI also provides a traditional [LLM](/integrations/llm/xai) and standalone [text-to-speech](/integrations/tts/xai).
</Tip>

## Installation

```sh
uv add "vision-agents[xai]"
```

## Quick start

```python
from vision_agents.core import Agent, User
from vision_agents.plugins import xai, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful voice assistant.",
    llm=xai.Realtime(),
)
```

<Warning>
  Set `XAI_API_KEY` in your environment or pass `api_key` directly to `xai.Realtime()`.
</Warning>
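
If you prefer to fail fast when the key is missing, a small check before constructing the agent can help. The following is a minimal sketch using only the standard library; `resolve_xai_api_key` is a hypothetical helper for illustration, not part of the plugin.

```python
import os


def resolve_xai_api_key(explicit_key=None):
    """Hypothetical helper: prefer an explicit key, then the XAI_API_KEY env var."""
    key = explicit_key or os.environ.get("XAI_API_KEY")
    if not key:
        # Raise early with a clear message instead of failing later at connect time.
        raise RuntimeError("Set XAI_API_KEY or pass an API key explicitly")
    return key


# With the plugin installed, the result can be passed as the documented parameter:
#   llm = xai.Realtime(api_key=resolve_xai_api_key())
```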

## Parameters

| Name                       | Type            | Default                       | Description                                                        |
| -------------------------- | --------------- | ----------------------------- | ------------------------------------------------------------------ |
| `model`                    | `str`           | `"grok-voice-think-fast-1.0"` | Grok realtime model                                                |
| `voice`                    | `str`           | `"ara"`                       | Voice (`"ara"`, `"rex"`, `"sal"`, `"eve"`, `"leo"`)                |
| `api_key`                  | `str`           | `None`                        | API key (defaults to `XAI_API_KEY` env var)                        |
| `turn_detection`           | `str` or `None` | `"server_vad"`                | Turn detection mode (`"server_vad"` or `None` for manual)          |
| `vad_interrupt_response`   | `bool`          | `False`                       | Allow VAD to auto-cancel the assistant response on detected speech |
| `web_search`               | `bool`          | `True`                        | Enable web search tool                                             |
| `x_search`                 | `bool`          | `True`                        | Enable X (Twitter) search tool                                     |
| `x_search_allowed_handles` | `list[str]`     | `None`                        | Restrict X search to specific handles                              |

<Note>
  `vad_interrupt_response` defaults to `False` because speaker-to-mic echo can cause the server to cancel the agent's own response mid-sentence. Set to `True` only if your audio setup avoids echo feedback.
</Note>
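
The parameters above can be collected as keyword arguments before constructing the model. This is a sketch under stated assumptions: `build_realtime_kwargs` is a hypothetical helper, the handle `"xai"` is only an example value, and the parameter names and voice list are copied from the table.

```python
# Voice names copied from the parameters table.
VOICES = ("ara", "rex", "sal", "eve", "leo")


def build_realtime_kwargs(voice="ara", allowed_handles=None, echo_free_audio=False):
    """Hypothetical helper: return keyword arguments for xai.Realtime."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {VOICES}")
    return {
        "voice": voice,
        "turn_detection": "server_vad",
        # Only let VAD cancel responses when the audio path avoids echo feedback.
        "vad_interrupt_response": echo_free_audio,
        "web_search": True,
        "x_search": True,
        # None is the documented default for this parameter.
        "x_search_allowed_handles": allowed_handles,
    }


kwargs = build_realtime_kwargs(voice="rex", allowed_handles=["xai"])
print(kwargs)

# With the plugin installed:
#   llm = xai.Realtime(**kwargs)
```

Validating the voice name locally is a design choice for this sketch; it surfaces typos before a connection is opened.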

## Function calling

```python
@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny and 72°F"
```

See the [Function Calling guide](/guides/mcp-tool-calling) for details.
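
A tool is an ordinary coroutine, so its logic can be exercised on its own before it is registered. The sketch below assumes a hypothetical in-memory `FORECASTS` table in place of a real weather API, and returns a spoken-friendly fallback for unknown locations rather than raising, which is one possible design.

```python
import asyncio

# Hypothetical data standing in for a real weather service.
FORECASTS = {"Boulder": "sunny and 72°F", "Oslo": "cloudy and 41°F"}


async def get_weather(location: str) -> str:
    """Return a short sentence the model can read aloud."""
    # Normalize the lookup key so "oslo" and " Oslo " both match.
    forecast = FORECASTS.get(location.strip().title())
    if forecast is None:
        return f"I don't have weather data for {location}."
    return f"The weather in {location} is {forecast}"


# Run the tool directly, without an agent, to check its behavior.
print(asyncio.run(get_weather("oslo")))

# To register it on an existing agent (decorator applied as a plain call):
#   agent.llm.register_function(description="Get weather for a location")(get_weather)
```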

## Next steps

<CardGroup cols={2}>
  <Card title="xAI LLM" icon="brain" href="/integrations/llm/xai">
    Advanced reasoning with Grok
  </Card>

  <Card title="xAI TTS" icon="volume-high" href="/integrations/tts/xai">
    Text-to-speech with expressive voices
  </Card>
</CardGroup>
