AWS Bedrock provides realtime speech-to-speech using Amazon Nova models with automatic session management. The plugin handles Nova’s 8-minute connection limit transparently, reconnecting during silence to ensure uninterrupted conversations.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
uv add "vision-agents[aws]"
Quick Start
from vision_agents.core import Agent, User
from vision_agents.plugins import aws, getstream
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=aws.Realtime(),
)
AWS credentials are resolved via the standard AWS SDK chain (environment variables, AWS profiles, or IAM roles).
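Before starting an agent, it can help to check which environment-based credential source is configured. The sketch below is a rough pre-flight check, not the SDK's actual resolution logic: it only inspects the standard `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_PROFILE` environment variables, and the function name `env_credential_source` is hypothetical.

```python
import os
from typing import Mapping


def env_credential_source(env: Mapping[str, str] = os.environ) -> str:
    """Return a rough label for the credential source visible in the environment.

    Illustrative only: the real AWS SDK chain checks more sources
    (shared config files, container and instance roles, and so on).
    """
    # Static keys in the environment are the first thing the SDK chain looks at.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "static-keys"
    # A named profile selects an entry from the shared credentials/config files.
    if env.get("AWS_PROFILE"):
        return "profile"
    # Nothing explicit is set; the SDK falls back to the rest of its chain
    # (for example an IAM role attached to the host).
    return "default-chain"


if __name__ == "__main__":
    print(env_credential_source())
```

If this reports `default-chain` on a development machine without an attached IAM role, credentials are probably not configured yet.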
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| model | str | "amazon.nova-2-sonic-v1:0" | Nova model ID |
| region_name | str | "us-east-1" | AWS region |
| voice_id | str | "matthew" | Voice (available voices) |
| reconnect_after_minutes | float | 5.0 | Reconnect during silence after N minutes |
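As a sketch of how these parameters might be supplied, the example below assumes `aws.Realtime` accepts them as keyword arguments under the names listed in the table; the full constructor signature is not shown here. `REALTIME_KWARGS` and `build_llm` are hypothetical names, and the import is deferred so the snippet loads even where the plugin is not installed.

```python
# Values mirror the documented defaults, except for an earlier reconnect.
REALTIME_KWARGS = {
    "model": "amazon.nova-2-sonic-v1:0",
    "region_name": "us-east-1",
    "voice_id": "matthew",
    "reconnect_after_minutes": 4.0,  # reconnect during silence after 4 minutes
}


def build_llm():
    """Create the realtime LLM (assumes keyword names match the table)."""
    # Deferred import: only needed when the agent is actually built.
    from vision_agents.plugins import aws

    return aws.Realtime(**REALTIME_KWARGS)
```

The result of `build_llm()` would be passed as the `llm` argument to `Agent`, as in the Quick Start.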
Automatic Reconnection
AWS Bedrock has an 8-minute connection limit. The plugin handles this automatically:
- After 5 minutes (configurable via reconnect_after_minutes), reconnects at the next moment of silence
- After 7 minutes, forces reconnect regardless of audio activity
This ensures uninterrupted conversations without manual intervention.
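The two thresholds above can be sketched as a small decision function. This is a minimal illustration of the described behavior, not the plugin's implementation: `ReconnectPolicy`, `should_reconnect`, and `force_after_minutes` are hypothetical names, and the silence flag is assumed to come from voice activity detection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReconnectPolicy:
    """Hypothetical model of the reconnection rules described above."""

    reconnect_after_minutes: float = 5.0  # soft threshold: wait for silence
    force_after_minutes: float = 7.0      # hard threshold: reconnect regardless

    def should_reconnect(self, session_age_minutes: float, is_silent: bool) -> bool:
        # Past the hard threshold, reconnect even if someone is speaking,
        # so the session never reaches the 8-minute connection limit.
        if session_age_minutes >= self.force_after_minutes:
            return True
        # Past the soft threshold, reconnect only during silence.
        return is_silent and session_age_minutes >= self.reconnect_after_minutes


if __name__ == "__main__":
    policy = ReconnectPolicy()
    for age, silent in [(3.0, True), (5.5, False), (5.5, True), (7.2, False)]:
        print(age, silent, policy.should_reconnect(age, silent))
```

Keeping the forced threshold below the 8-minute limit leaves a margin for the reconnect itself to complete.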
Voice Activity Detection
The plugin uses Silero VAD to track audio activity for optimal reconnection timing.
Function Calling
@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> dict:
    return {"city": location, "temperature": 72, "condition": "Sunny"}
See the Function Calling guide for details.
Next Steps