Inworld AI provides a WebRTC-based Realtime speech-to-speech API. It uses native Opus over UDP, which gives lower latency than WebSocket-based alternatives.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Inworld also provides standalone text-to-speech.

Installation

uv add "vision-agents[inworld]"
Get your API key from the Inworld Portal and set INWORLD_API_KEY in your environment (or pass api_key= explicitly).
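As a rough illustration of the lookup order described above (explicit argument first, then the environment), the sketch below resolves the key and fails early when neither is set. The helper name `resolve_api_key` and its `env` parameter are hypothetical and not part of the plugin; only the `INWORLD_API_KEY` variable name comes from this page.

```python
import os


def resolve_api_key(api_key=None, env=None):
    """Return the explicit key if given, else INWORLD_API_KEY from the environment.

    Hypothetical helper for illustration; the plugin performs its own lookup.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("INWORLD_API_KEY")
    if not key:
        # Fail before any session is created, with an actionable message.
        raise RuntimeError("Set INWORLD_API_KEY or pass api_key= explicitly.")
    return key
```

Calling a check like this at startup surfaces a missing key before the agent tries to join a call.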

Quick Start

Pair Inworld Realtime with a WebRTC-capable edge transport like getstream.Edge().
from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, inworld, smart_turn

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="My Agent", id="agent"),
    llm=inworld.Realtime(
        model="openai/gpt-4o-mini",
        voice="Dennis",
        instructions="You are a friendly voice assistant.",
    ),
    turn_detection=smart_turn.TurnDetection(),
)

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | "openai/gpt-4o-mini" | Provider-prefixed model ID (e.g. "openai/gpt-4o-mini", "google-ai-studio/gemini-2.5-flash", "inworld/<router-id>") |
| voice | str | "Dennis" | Voice for audio responses ("Dennis", "Clive", "Olivia", or custom) |
| api_key | str | None | API key (defaults to INWORLD_API_KEY env var) |
| instructions | str | None | System prompt |
| realtime_session | RealtimeSessionCreateRequestParam | None | Advanced: pass a full session param for custom turn detection, tool_choice, and other fields |
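The model parameter takes a provider-prefixed ID of the form "provider/name". A minimal sketch of validating that shape before passing it along might look like the following. The function `split_model_id` is a hypothetical helper, not part of the plugin, and it assumes the first "/" separates the provider from the model name.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a provider-prefixed model ID into (provider, name).

    Hypothetical helper: assumes the first "/" is the separator.
    """
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"Expected 'provider/name', got {model_id!r}")
    return provider, name


provider, name = split_model_id("google-ai-studio/gemini-2.5-flash")
```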

Registering Tools

Inworld’s Realtime API is protocol-compatible with OpenAI’s Realtime API, so registered functions follow the OpenAI function-calling schema.
realtime = inworld.Realtime()

@realtime.register_function(description="Get the current weather for a city.")
async def get_weather(city: str) -> str:
    return f"It's sunny in {city}."
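To make the schema reference concrete, the sketch below derives an OpenAI-style function tool definition from a Python signature using only the standard library. It is an illustration of the general shape of such a definition, not the plugin's actual implementation: `function_tool_schema` and the type mapping are assumptions, and the real `register_function` may handle annotations and defaults differently.

```python
import inspect
import json

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_tool_schema(func, description):
    """Build an OpenAI-style function tool definition from a signature (illustrative)."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "name": func.__name__,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


async def get_weather(city: str) -> str:
    return f"It's sunny in {city}."


schema = function_tool_schema(get_weather, "Get the current weather for a city.")
print(json.dumps(schema, indent=2))
```

Typed parameters and a clear description matter because they are what the model sees when deciding whether and how to call the function.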
v1 is WebRTC only; a WebSocket transport may be added later. Video input is not currently supported by Inworld’s Realtime API.

Next Steps

Inworld TTS: Standalone text-to-speech

Build a Voice Agent: Get started with voice