The local plugin replaces the cloud edge with your machine’s microphone, speakers, and camera. It is useful for local development, desktop apps, and demos where you don’t want to round-trip through a real-time transport.
No Stream account is required for the local edge — but you’ll still need API
keys for whichever LLM / STT / TTS plugins you use.
Installation
You may need to install portaudio separately.
Quick Start
The select_* helpers prompt interactively in the terminal. For headless use, instantiate AudioInputDevice, AudioOutputDevice, and CameraDevice directly with a known device index.
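For headless runs, the device indices have to come from somewhere other than an interactive prompt. The sketch below shows one possible approach: reading them from environment variables with plain standard-library Python. The function `device_index_from_env` and the `AGENT_*` variable names are hypothetical and not part of the plugin; how the resulting integers are passed to the device classes depends on their constructors, which are not shown on this page.

```python
import os
from typing import Mapping, Optional


def device_index_from_env(
    name: str,
    env: Mapping[str, str] = os.environ,
    default: Optional[int] = None,
) -> Optional[int]:
    """Read a non-negative device index from an environment variable.

    Hypothetical helper for illustration; not part of the local plugin.
    Returns `default` when the variable is unset or empty.
    """
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        index = int(raw)
    except ValueError:
        raise ValueError(
            f"{name} must be an integer device index, got {raw!r}"
        ) from None
    if index < 0:
        raise ValueError(f"{name} must be non-negative, got {index}")
    return index


# Hypothetical variable names. A camera index of None means "no camera".
mic_index = device_index_from_env("AGENT_MIC_INDEX", default=0)
speaker_index = device_index_from_env("AGENT_SPEAKER_INDEX", default=0)
camera_index = device_index_from_env("AGENT_CAMERA_INDEX")

# These integers would then be handed to AudioInputDevice, AudioOutputDevice,
# and CameraDevice. The exact constructor arguments are an assumption to
# verify against the plugin's API reference.
```

Failing fast on a malformed index keeps a misconfigured headless deployment from silently falling back to the wrong microphone or speaker.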
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| audio_input | AudioInputDevice | — | Microphone for capturing user audio. |
| audio_output | AudioOutputDevice | — | Speaker for playing agent audio. |
| video_input | CameraDevice | None | Camera for capturing user video. None disables video. |
| video_width | int | 640 | Output video width in pixels. |
| video_height | int | 480 | Output video height in pixels. |
| video_fps | int | 30 | Output video frame rate. |
When video_input is set, agent video is rendered locally in a tkinter window. Subclass the device classes (AudioInputDevice, AudioOutputDevice, CameraDevice) to swap in alternative backends (e.g. GStreamer).
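To make the defaults in the parameter table concrete, here is a small stand-in that mirrors them. `LocalEdgeSettings` is a hypothetical dataclass written for illustration, not the plugin's own class, and the string device values are placeholders for real device objects. It shows only how the documented defaults combine and that leaving video_input as None disables video.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalEdgeSettings:
    """Hypothetical stand-in mirroring the parameter table above."""

    audio_input: object                   # an AudioInputDevice in the plugin
    audio_output: object                  # an AudioOutputDevice in the plugin
    video_input: Optional[object] = None  # a CameraDevice, or None for no video
    video_width: int = 640
    video_height: int = 480
    video_fps: int = 30

    def __post_init__(self) -> None:
        # Reject non-positive video dimensions and frame rates early.
        for field_name in ("video_width", "video_height", "video_fps"):
            value = getattr(self, field_name)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(
                    f"{field_name} must be a positive int, got {value!r}"
                )

    @property
    def video_enabled(self) -> bool:
        # Per the table, video_input=None disables video.
        return self.video_input is not None


# Placeholder strings stand in for real device objects.
audio_only = LocalEdgeSettings(audio_input="mic-0", audio_output="speaker-0")
with_camera = LocalEdgeSettings(
    audio_input="mic-0",
    audio_output="speaker-0",
    video_input="camera-0",
    video_width=1280,
    video_height=720,
)
```

In this sketch `audio_only.video_enabled` is False and `with_camera.video_enabled` is True, while `with_camera.video_fps` keeps the documented default of 30.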
Next Steps
Build a Voice Agent: Get started with voice
Build a Video Agent: Add video processing

