xAI’s Grok provides advanced reasoning capabilities and real-time knowledge. The plugin supports conversation memory, streaming responses, and function calling (Grok 4.1+).
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
```shell
uv add "vision-agents[xai]"
```
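If you are not managing the project with uv, the standard pip extras syntax should work as well (assuming the package is published under the same name on PyPI):

```shell
pip install "vision-agents[xai]"
```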
Quick Start
```python
from vision_agents.core import Agent, User
from vision_agents.plugins import xai, getstream, deepgram, elevenlabs

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=xai.LLM(model="grok-4.1"),
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(),
)
```
Set `XAI_API_KEY` in your environment or pass `api_key` directly.
Parameters
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `model` | `str` | `"grok-4"` | Model (`"grok-4"`, `"grok-4.1"`) |
| `api_key` | `str` | `None` | API key (defaults to `XAI_API_KEY` env var) |
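The default-key behavior can be illustrated with a minimal stdlib-only sketch. This is not the plugin's actual implementation, just the resolution order the table describes: an explicitly passed key wins, otherwise the environment variable is used.

```python
import os

def resolve_api_key(api_key=None):
    # Mirrors the documented behavior: an explicit api_key argument wins;
    # otherwise fall back to the XAI_API_KEY environment variable.
    return api_key if api_key is not None else os.environ.get("XAI_API_KEY")

os.environ["XAI_API_KEY"] = "xai-example-key"
print(resolve_api_key())            # env fallback -> "xai-example-key"
print(resolve_api_key("explicit"))  # explicit key wins -> "explicit"
```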
Function Calling
Grok 4.1+ supports function calling:
```python
@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny and 72°F"
```
See the Function Calling guide for details.
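Function-calling registries like this typically derive a JSON tool schema from the decorated function's signature and description. The internals of `register_function` are not documented here, but the idea can be sketched with the stdlib `inspect` module (the `tool_schema` helper below is hypothetical, and it simplistically maps every parameter to a JSON string type):

```python
import inspect

async def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny and 72°F"

def tool_schema(fn, description):
    # Hypothetical helper: build a minimal OpenAI-style tool schema
    # from the function signature. Real implementations would inspect
    # type hints to choose the right JSON type for each parameter.
    props = {name: {"type": "string"}
             for name in inspect.signature(fn).parameters}
    return {
        "name": fn.__name__,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": props,
            "required": list(props),
        },
    }

schema = tool_schema(get_weather, "Get weather for a location")
print(schema["name"])  # -> "get_weather"
```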
Events
The xAI plugin emits a low-level event for streaming chunks. Most developers should use the core `LLMResponseCompletedEvent` instead.
```python
from vision_agents.plugins.xai.events import XAIChunkEvent

@agent.events.subscribe
async def on_xai_chunk(event: XAIChunkEvent):
    print(f"Chunk: {event.chunk}")
```
Next Steps
- Build a Voice Agent: get started with voice
- Build a Video Agent: add video processing