Vision Agents requires a Stream account
for real-time transport. Most providers offer free tiers to get started.
Installation
Set INWORLD_API_KEY in your environment (or pass api_key= explicitly).
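The lookup order described above (explicit argument first, then the environment variable) can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `resolve_api_key` is a hypothetical helper name, and the error raised when no key is found is an assumption.

```python
import os
from typing import Optional

ENV_VAR = "INWORLD_API_KEY"


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else fall back to the env var."""
    key = api_key or os.environ.get(ENV_VAR)
    if not key:
        # Assumed behavior: fail early with a clear message instead of
        # letting the connection attempt fail later.
        raise RuntimeError(f"No API key: set {ENV_VAR} or pass api_key= explicitly")
    return key
```

Failing before any connection is attempted keeps a missing key from surfacing as a less obvious transport error.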
Quick Start
Pair Inworld Realtime with a WebRTC-capable edge transport like getstream.Edge().
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| model | str | "openai/gpt-4o-mini" | Provider-prefixed model ID (e.g. "openai/gpt-4o-mini", "google-ai-studio/gemini-2.5-flash", "inworld/<router-id>") |
| voice | str | "Dennis" | Voice for audio responses ("Dennis", "Clive", "Olivia", or custom) |
| api_key | str | None | API key (defaults to INWORLD_API_KEY env var) |
| instructions | str | None | System prompt |
| realtime_session | RealtimeSessionCreateRequestParam | None | Advanced: pass a full session param for custom turn detection, tool_choice, and other fields |
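To make the provider-prefixed model ID format concrete, here is a small sketch that splits an ID into its provider and model parts. It assumes the provider is everything before the first "/", which matches the examples in the table but is not confirmed by the plugin's documentation; `ModelRef` and `split_model_id` are hypothetical names.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    provider: str
    name: str


def split_model_id(model: str) -> ModelRef:
    """Hypothetical helper: split 'provider/model' on the first slash."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"Expected a provider-prefixed model ID, got {model!r}")
    return ModelRef(provider, name)
```

A check like this can catch a bare model name (for example "gpt-4o-mini" without the "openai/" prefix) before it is sent to the service.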
Registering Tools
Inworld’s Realtime API is protocol-compatible with OpenAI’s Realtime API, so registered functions follow the OpenAI function-calling schema.

v1 is WebRTC only; a WebSocket transport may be added later. Video input is not currently supported by Inworld’s Realtime API.
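For reference, a tool definition in the OpenAI function-calling schema looks roughly like the dictionary built below. This sketch only shows the shape of the schema (the flat form used for Realtime session tools); how Vision Agents derives it from a registered Python function is not shown here, and `function_tool` and the `get_weather` example are hypothetical.

```python
import json
from typing import Any, Dict, List


def function_tool(
    name: str,
    description: str,
    properties: Dict[str, Any],
    required: List[str],
) -> Dict[str, Any]:
    """Hypothetical helper: build a function tool entry with JSON Schema parameters."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


# Illustrative tool; the name and fields are made up for this example.
weather_tool = function_tool(
    name="get_weather",
    description="Get the current weather for a city.",
    properties={"location": {"type": "string", "description": "City name"}},
    required=["location"],
)

print(json.dumps(weather_tool, indent=2))
```

The `parameters` field is ordinary JSON Schema, so argument descriptions and required fields are what the model sees when deciding how to call the function.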
Next Steps
Inworld TTS: Standalone text-to-speech
Build a Voice Agent: Get started with voice

