OpenAI provides industry-leading language models. The plugin supports the Responses API (for GPT-5+) via LLM, and any OpenAI-compatible API via ChatCompletionsLLM. It requires separate STT/TTS plugins.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
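Install the core package together with the OpenAI plugin. The exact package and extra names below are assumptions; verify them against the install guide:

```shell
# Assumed package names — check the Vision Agents install docs for the canonical ones
uv add vision-agents vision-agents-plugins-openai

# or with pip
pip install vision-agents vision-agents-plugins-openai
```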
LLM (Responses API)
Uses the Responses API (default for GPT-5+).

| Name | Type | Default | Description |
|---|---|---|---|
| model | str | "gpt-5.4" | Model name (e.g., "gpt-5.4") |
| api_key | str | None | API key (defaults to OPENAI_API_KEY env var) |
| base_url | str | None | Custom API endpoint |
| max_tool_rounds | int | 3 | Maximum tool-call rounds per response |
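The defaults above can be sketched as a plain dataclass. This is a hypothetical mirror of the constructor's parameters, not the plugin's actual class; it illustrates the documented defaults and the OPENAI_API_KEY fallback:

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResponsesLLMParams:
    """Hypothetical mirror of the LLM constructor's parameters."""
    model: str = "gpt-5.4"
    api_key: Optional[str] = None
    base_url: Optional[str] = None
    max_tool_rounds: int = 3

    def resolved_api_key(self) -> Optional[str]:
        # When no key is passed, fall back to the OPENAI_API_KEY env var
        if self.api_key is not None:
            return self.api_key
        return os.environ.get("OPENAI_API_KEY")
```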
ChatCompletionsLLM
Works with any OpenAI-compatible API (Together AI, Fireworks, DeepSeek, etc.).

| Name | Type | Default | Description |
|---|---|---|---|
| model | str | — | Model identifier (required) |
| api_key | str | None | API key (defaults to OPENAI_API_KEY env var) |
| base_url | str | None | Custom API endpoint |
| tools_max_rounds | int | 3 | Maximum tool-call rounds per response |
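The key difference from the Responses-API table above is that model has no default here. A hypothetical mirror of the signature (not the plugin's actual class; the endpoint URL is illustrative, check your provider's docs):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ChatCompletionsParams:
    """Hypothetical mirror of the ChatCompletionsLLM parameters."""
    model: str                      # required: the provider's model identifier
    api_key: Optional[str] = None
    base_url: Optional[str] = None  # points at an OpenAI-compatible endpoint
    tools_max_rounds: int = 3

# Targeting an OpenAI-compatible provider (URL shown for illustration only)
params = ChatCompletionsParams(
    model="deepseek-chat",
    base_url="https://api.deepseek.com/v1",
)
```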
Function Calling
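The max_tool_rounds / tools_max_rounds cap from the tables above can be illustrated with a minimal loop. This is a sketch of the general pattern, not the plugin's implementation: the model is re-invoked with each tool result until it stops requesting tools or the round limit is reached.

```python
def respond_with_tools(call_model, run_tool, max_tool_rounds=3):
    """Sketch of a capped tool-call loop.

    call_model and run_tool are stand-ins: call_model takes the previous
    tool result (or None on the first call) and returns a response dict;
    run_tool executes a requested tool call and returns its result.
    """
    response = call_model(None)
    for _ in range(max_tool_rounds):
        tool_call = response.get("tool_call")
        if tool_call is None:
            break  # the model produced a final answer
        # Execute the requested tool and feed the result back to the model
        response = call_model(run_tool(tool_call))
    return response
```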
Events
The OpenAI plugin emits a low-level event for raw stream data. Most developers should use the core events (LLMResponseCompletedEvent, RealtimeUserSpeechTranscriptionEvent, etc.) instead.

Next Steps
OpenAI Realtime
Speech-to-speech over WebRTC
OpenAI TTS
Text-to-speech synthesis

