xAI’s Grok is a powerful language model that provides advanced reasoning capabilities and real-time knowledge. The xAI plugin for Vision Agents enables you to use Grok models in your conversational AI applications with full support for function calling and tool use.
The xAI plugin provides native integration with xAI’s chat completion API, including conversation memory management, streaming responses, and multimodal support.
Installation
Install the xAI plugin with:
uv add "vision-agents[xai]"
Example
from vision_agents.core import Agent, User
from vision_agents.plugins import xai, getstream, deepgram, elevenlabs
# Create agent with Grok
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="AI Assistant", id="agent"),
    instructions="You are a helpful assistant with access to real-time information.",
    llm=xai.LLM(model="grok-4.1"),
    stt=deepgram.STT(),
    tts=elevenlabs.TTS()
)
Initialization
The xAI plugin is exposed through the LLM class:
from vision_agents.plugins import xai
llm = xai.LLM(
    model="grok-4.1",
    api_key="your_xai_api_key"  # or set the XAI_API_KEY environment variable
)
To initialize without passing the API key explicitly, make sure XAI_API_KEY is available as an environment variable.
You can do this either by defining it in a .env file or exporting it directly in your terminal.
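For example, if you keep the key in a .env file, you can load it before constructing the LLM. This is a minimal sketch assuming the python-dotenv package (any equivalent loader works):

from dotenv import load_dotenv  # assumed dependency: python-dotenv

from vision_agents.plugins import xai

load_dotenv()  # reads .env and populates the environment, including XAI_API_KEY

llm = xai.LLM(model="grok-4.1")  # picks the key up from the environment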
Parameters
These are the parameters available in the xAI LLM plugin:
| Name | Type | Default | Description |
|---|---|---|---|
| model | str | "grok-4" | The xAI model to use. Options include "grok-4", "grok-4.1", and other available Grok models. |
| api_key | str or None | None | Your xAI API key. If not provided, the plugin looks for the XAI_API_KEY environment variable. |
| client | AsyncClient or None | None | Optional pre-configured xAI AsyncClient instance. If provided, it is used instead of creating a new one. |
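For example, you can supply your own client to reuse an existing connection or set custom options. This sketch assumes AsyncClient comes from the xai_sdk package; verify the import path against your installed xAI SDK:

from xai_sdk import AsyncClient  # assumed import path; check your SDK version

from vision_agents.plugins import xai

# Reuse a pre-configured client instead of letting the plugin create one.
client = AsyncClient(api_key="your_xai_api_key")
llm = xai.LLM(model="grok-4.1", client=client)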
Features
Conversation Memory
The xAI plugin automatically manages conversation history, allowing for natural multi-turn conversations:
llm = xai.LLM(model="grok-4.1")
# First message
await llm.simple_response("My name is Alice and I have 2 cats")
# Second message - the LLM remembers the context
response = await llm.simple_response("How many pets do I have?")
print(response.text) # Will mention the 2 cats
Streaming Responses
The plugin supports streaming responses for real-time text generation:
response = await llm.create_response(
    input="Tell me about quantum computing",
    instructions="You are a helpful science educator.",
    stream=True
)
Function Calling (Grok 4.1+)
Grok 4.1 and later models support function calling, allowing the model to use tools and take actions:
from vision_agents.plugins import xai
llm = xai.LLM(model="grok-4.1")
@llm.register_function(
    description="Get the current weather for a location"
)
async def get_weather(location: str) -> str:
    # Your weather API logic here
    return f"The weather in {location} is sunny and 72°F"
# The model can now call this function when appropriate
response = await llm.simple_response("What's the weather in San Francisco?")
Function calling is supported in Grok 4.1 and later models. Earlier versions of Grok do not support this feature.
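In practice, a registered function typically calls out to an external service. A sketch of that shape, where the endpoint and the aiohttp dependency are hypothetical placeholders:

import aiohttp  # assumed HTTP client for this sketch

@llm.register_function(
    description="Get the current price of a stock ticker"
)
async def get_stock_price(ticker: str) -> str:
    # Hypothetical endpoint; substitute your real data source.
    url = f"https://example.com/api/price/{ticker}"
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            data = await resp.json()
    return f"{ticker} is trading at ${data['price']}"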
Methods
simple_response(text, processors, participant)
Generate a response to text input. This method is called automatically when new STT transcripts are received.
response = await llm.simple_response("Hello, how are you?")
print(response.text)
Parameters:
text (str): Input text to respond to
processors (list[Processor] | None): Optional list of processors for video/voice AI context
participant (Participant | None): Optional participant object
Returns: LLMResponseEvent with the generated text
create_response(input, instructions, model, stream)
Create a response with full control over parameters:
response = await llm.create_response(
    input="Explain machine learning",
    instructions="You are a technical educator. Be concise and clear.",
    model="grok-4.1",
    stream=True
)
Parameters:
input (str): Input text
instructions (str): System instructions for the model
model (str | None): Override the default model
stream (bool): Whether to stream the response (default: True)
Returns: LLMResponseEvent with the generated text
Events
The xAI plugin emits standard Vision Agents LLM events:
from vision_agents.core.llm.events import (
    LLMResponseChunkEvent,
    LLMResponseCompletedEvent,
)
@llm.events.on(LLMResponseChunkEvent)
async def on_chunk(event: LLMResponseChunkEvent):
    print(f"Chunk: {event.delta}")

@llm.events.on(LLMResponseCompletedEvent)
async def on_completed(event: LLMResponseCompletedEvent):
    print(f"Complete response: {event.text}")
Usage with Agent
Use the xAI LLM as part of a complete voice or video agent:
from vision_agents.core import Agent, User
from vision_agents.plugins import xai, getstream, deepgram, elevenlabs
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Grok Assistant", id="agent"),
    instructions="You are a helpful AI assistant powered by Grok.",
    llm=xai.LLM(model="grok-4.1"),
    stt=deepgram.STT(),
    tts=elevenlabs.TTS()
)
# Join a call (here, client is your Stream client and call_id identifies the call)
call = client.video.call("default", call_id)
await call.get_or_create(data={"created_by_id": agent.agent_user.id})

with await agent.join(call):
    await agent.say("Hello! I'm powered by Grok. How can I help you today?")
    await agent.finish()
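Note that the await calls above must run inside an async function. A minimal entry-point sketch, with the agent and call constructed as shown above:

import asyncio

async def main() -> None:
    # ... construct the agent and call as shown above ...
    with await agent.join(call):
        await agent.say("Hello! I'm powered by Grok. How can I help you today?")
        await agent.finish()

if __name__ == "__main__":
    asyncio.run(main())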
Model Options
xAI offers several Grok models with different capabilities:
- grok-4: The base Grok 4 model with strong reasoning capabilities
- grok-4.1: Enhanced version with function calling support and improved performance
- Check xAI’s documentation for the latest available models
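You can also override the configured model for a single request via the model parameter of create_response (see Methods above):

llm = xai.LLM(model="grok-4")

# One-off request on a different model without rebuilding the LLM.
response = await llm.create_response(
    input="Summarize the trade-offs of streaming responses.",
    instructions="You are a concise technical writer.",
    model="grok-4.1"
)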
Getting Started
- Get your xAI API key from the xAI Console
- Set the XAI_API_KEY environment variable:
export XAI_API_KEY="your_api_key_here"
- Use the plugin in your Vision Agents application
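To verify the setup end to end, a quick smoke-test sketch (assuming XAI_API_KEY is set):

import asyncio
import os

from vision_agents.plugins import xai

async def main() -> None:
    assert os.environ.get("XAI_API_KEY"), "XAI_API_KEY is not set"
    llm = xai.LLM(model="grok-4.1")
    response = await llm.simple_response("Say hello in one short sentence.")
    print(response.text)

asyncio.run(main())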