# Inworld Realtime

[Inworld AI](https://inworld.ai) provides a WebRTC-based Realtime speech-to-speech API. It sends native Opus audio over UDP, which gives lower latency than WebSocket-based alternatives.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

<Tip>
  Inworld also provides standalone [text-to-speech](/integrations/tts/inworld).
</Tip>

## Installation

```sh theme={null}
uv add "vision-agents[inworld]"
```

Get your API key from the [Inworld Portal](https://studio.inworld.ai/) and set `INWORLD_API_KEY` in your environment (or pass `api_key=` explicitly).
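
The lookup order described above (explicit argument first, then the environment variable) can be sketched as follows. `resolve_inworld_api_key` is a hypothetical helper for illustration and is not part of Vision Agents; the plugin performs its own fallback, so a helper like this is only useful if you want to fail fast at startup with a clear message.

```python
import os


def resolve_inworld_api_key(explicit=None):
    """Return the explicit key if given, else fall back to INWORLD_API_KEY.

    Hypothetical helper for illustration; not part of the Vision Agents API.
    """
    key = explicit or os.environ.get("INWORLD_API_KEY")
    if not key:
        # Neither source provided a key, so stop before any session is opened.
        raise RuntimeError("Set INWORLD_API_KEY or pass api_key= explicitly.")
    return key
```

The result could then be passed along as `inworld.Realtime(api_key=resolve_inworld_api_key())`.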

## Quick Start

Pair Inworld Realtime with a WebRTC-capable edge transport like `getstream.Edge()`.

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, inworld, smart_turn

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="My Agent", id="agent"),
    llm=inworld.Realtime(
        model="openai/gpt-4o-mini",
        voice="Dennis",
        instructions="You are a friendly voice assistant.",
    ),
    turn_detection=smart_turn.TurnDetection(),
)
```

## Parameters

| Name               | Type                                | Default                | Description                                                                                                              |
| ------------------ | ----------------------------------- | ---------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| `model`            | `str`                               | `"openai/gpt-4o-mini"` | Provider-prefixed model ID (e.g. `"openai/gpt-4o-mini"`, `"google-ai-studio/gemini-2.5-flash"`, `"inworld/<router-id>"`) |
| `voice`            | `str`                               | `"Dennis"`             | Voice for audio responses (`"Dennis"`, `"Clive"`, `"Olivia"`, or custom)                                                 |
| `api_key`          | `str`                               | `None`                 | API key (defaults to `INWORLD_API_KEY` env var)                                                                          |
| `instructions`     | `str`                               | `None`                 | System prompt                                                                                                            |
| `realtime_session` | `RealtimeSessionCreateRequestParam` | `None`                 | Advanced: pass a full session param for custom turn detection, `tool_choice`, and other fields                           |

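To make the `model` format concrete, here is a minimal sketch that checks an ID has the `<provider>/<model>` shape before it is handed to the plugin. `split_model_id` is a hypothetical helper, not part of Vision Agents, and the plugin may validate IDs differently or not at all.

```python
def split_model_id(model: str) -> tuple[str, str]:
    """Split a provider-prefixed model ID such as "openai/gpt-4o-mini".

    Hypothetical helper for illustration; not part of the Vision Agents API.
    """
    # Split on the first slash only, so the model part may itself contain slashes.
    provider, sep, name = model.partition("/")
    if not (sep and provider and name):
        raise ValueError(f"expected '<provider>/<model>', got {model!r}")
    return provider, name


# Keyword arguments named after the parameters in the table above.
realtime_kwargs = {
    "model": "google-ai-studio/gemini-2.5-flash",
    "voice": "Olivia",
    "instructions": "You are a friendly voice assistant.",
}

provider, name = split_model_id(realtime_kwargs["model"])
print(provider, name)
```

Once checked, the dictionary can be unpacked into the constructor, as in `inworld.Realtime(**realtime_kwargs)`.
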
## Registering Tools

Inworld's Realtime API is protocol-compatible with OpenAI's Realtime API, so registered functions follow the OpenAI function-calling schema.

```python theme={null}
realtime = inworld.Realtime()

@realtime.register_function(description="Get the current weather for a city.")
async def get_weather(city: str) -> str:
    return f"It's sunny in {city}."
```
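
For orientation, the sketch below shows roughly what an OpenAI Realtime-style tool definition for `get_weather` looks like. It derives the definition from the function signature with the standard library only. `to_tool_schema` and the type mapping are assumptions made for illustration; this page does not show how the plugin builds its schema, and the real output may differ in detail.

```python
import inspect
import json

# Assumed mapping from Python annotations to JSON Schema types (illustrative only).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_tool_schema(fn, description: str) -> dict:
    """Build an OpenAI Realtime-style tool definition from a function signature.

    Hypothetical helper; not the plugin's actual implementation.
    """
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "name": fn.__name__,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


async def get_weather(city: str) -> str:
    return f"It's sunny in {city}."


schema = to_tool_schema(get_weather, "Get the current weather for a city.")
print(json.dumps(schema, indent=2))
```

Seeing the definition in this form can help when writing descriptions and type hints, since those are what the model relies on to decide when and how to call a function.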

<Note>
  v1 is WebRTC only; a WebSocket transport may be added later. Video input is not currently supported by Inworld's Realtime API.
</Note>

## Next Steps

<CardGroup cols={2}>
  <Card title="Inworld TTS" icon="waveform" href="/integrations/tts/inworld">
    Standalone text-to-speech
  </Card>

  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>
</CardGroup>
