AWS Bedrock is Amazon’s fully managed service for building generative AI applications with foundation models. The AWS Bedrock plugin for the Stream Python AI SDK provides realtime speech-to-speech capabilities using Amazon Nova models.

Installation

Install the Stream AWS plugin:
uv add "vision-agents[aws]"
Then add the following to your .env file, using your Stream keys from the Stream dashboard and your AWS credentials from the AWS Console:
STREAM_API_KEY=your_stream_api_key_here
STREAM_API_SECRET=your_stream_api_secret_here

AWS_BEDROCK_API_KEY=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=

Example

Check out our AWS Bedrock Realtime example to see a working code sample using the plugin, or read on for extra details.

Initialisation

The AWS Bedrock plugin provides the Realtime class for speech-to-speech interactions:
from vision_agents.plugins import aws

llm = aws.Realtime()
AWS credentials are resolved via the standard AWS SDK chain (environment variables, AWS profiles, or IAM roles). Make sure your AWS credentials are properly configured with access to Amazon Bedrock.
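If you are unsure which credentials are being picked up, a quick standalone check with boto3 (used here only for illustration; it is the standard AWS SDK for Python) can confirm the chain resolves:
import boto3

# Resolves credentials through the same chain: environment variables, shared profiles, then IAM roles
session = boto3.Session(region_name="us-east-1")
print("Credentials resolved:", session.get_credentials() is not None)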

Parameters

These are the parameters available in the AWS Bedrock Realtime plugin:
Name | Type | Default | Description
model | str | amazon.nova-2-sonic-v1:0 | The Nova model to use for realtime speech-to-speech.
region_name | str | us-east-1 | AWS region name where Bedrock is available.
voice_id | str | matthew | The voice ID to use for audio output. See available voices.
reconnect_after_minutes | float | 5.0 | Attempt to reconnect during silence after this many minutes. Reconnect is forced after 7 minutes.
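For example, all of the above can be set explicitly when constructing the plugin:
from vision_agents.plugins import aws

llm = aws.Realtime(
    model="amazon.nova-2-sonic-v1:0",  # Nova model for realtime speech-to-speech
    region_name="us-east-1",           # AWS region where Bedrock is available
    voice_id="matthew",                # voice used for audio output
    reconnect_after_minutes=5.0,       # reconnect during silence after this many minutes
)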

Functionality

Connect to the session

Before sending audio or text, you must connect to the Bedrock session:
await llm.connect()

Send text for response

Use simple_response() to send text instructions to the model:
await llm.simple_response("Tell me a story about a dragon")
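Both connect() and simple_response() are coroutines, so outside of an agent they need an event loop. A minimal standalone sketch:
import asyncio
from vision_agents.plugins import aws


async def main():
    # Connect to the Bedrock session first, then send a text instruction
    llm = aws.Realtime()
    await llm.connect()
    await llm.simple_response("Tell me a story about a dragon")


asyncio.run(main())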

Function calling

AWS Bedrock Realtime supports function calling (tool use). Register functions using the @llm.register_function decorator:
from vision_agents.plugins import aws

llm = aws.Realtime()

@llm.register_function(
    name="get_weather",
    description="Get the current weather for a given city"
)
async def get_weather(location: str) -> dict:
    return {"city": location, "temperature": 72, "condition": "Sunny"}
When the model decides it needs to call a function, the plugin executes the registered function and returns the result to the model, which uses it to continue the conversation.
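For example, after connecting, a prompt that needs weather data should trigger the registered function:
await llm.connect()
await llm.simple_response("What's the weather like in Boulder?")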

Automatic reconnection

AWS Nova has an 8-minute connection window limit. The Realtime plugin handles this automatically:
  • After 5 minutes (configurable via reconnect_after_minutes), the plugin reconnects at the next moment of silence
  • After 7 minutes, a reconnect is forced regardless of audio activity
This ensures uninterrupted conversations without manual intervention.
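If your sessions have long silent stretches, you can tune the silence threshold through the constructor parameter from the table above; for example:
# Attempt the silent-window reconnect after 3 minutes instead of the default 5
llm = aws.Realtime(reconnect_after_minutes=3.0)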

Voice activity detection

The plugin uses Silero VAD (Voice Activity Detection) to track audio activity. This is used for:
  • Determining when there’s silence for optimal reconnection timing
  • Tracking the last audio activity timestamp

Complete example

Here’s a complete example using AWS Bedrock Realtime with function calling:
import asyncio
from dotenv import load_dotenv

from vision_agents.core import User, Agent, cli, AgentLauncher
from vision_agents.plugins import aws, getstream

load_dotenv()


async def create_agent(**kwargs) -> Agent:
    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="Weather Bot", id="agent"),
        instructions="""You are a helpful weather assistant. When asked about weather,
        use the get_weather function to fetch current conditions.""",
        llm=aws.Realtime(
            model="amazon.nova-2-sonic-v1:0",
            region_name="us-east-1",
        ),
    )

    @agent.llm.register_function(
        name="get_weather", description="Get the current weather for a given city"
    )
    async def get_weather(location: str) -> dict:
        return {"city": location, "temperature": 72, "condition": "Sunny"}

    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    call = await agent.create_call(call_type, call_id)

    with await agent.join(call):
        await asyncio.sleep(2)
        await agent.llm.simple_response(
            text="What's the weather like in Boulder?"
        )
        await agent.finish()


if __name__ == "__main__":
    cli(AgentLauncher(create_agent=create_agent, join_call=join_call))