Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
LLM
Text-only language model with streaming and function calling.

| Name | Type | Default | Description |
|---|---|---|---|
| model | str | — | HuggingFace model ID |
| provider | str | None | Provider ("together", "groq", "fastest", "cheapest") |
| api_key | str | None | API key (defaults to HF_TOKEN env var) |
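
The `api_key` fallback listed in the table can be sketched as follows. This is only an illustration of the documented behavior, not the plugin's own code: `LLMConfig` and `resolve_api_key` are hypothetical names, and the model ID is just an example value.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMConfig:
    """Hypothetical container mirroring the LLM parameter table."""
    model: str                      # required: HuggingFace model ID
    provider: Optional[str] = None  # e.g. "together", "groq", "fastest", "cheapest"
    api_key: Optional[str] = None   # falls back to HF_TOKEN when left as None


def resolve_api_key(config: LLMConfig) -> Optional[str]:
    """Return the explicit key if one was given, else the HF_TOKEN env var."""
    if config.api_key is not None:
        return config.api_key
    return os.environ.get("HF_TOKEN")


# Example model ID; any HuggingFace model ID would go here.
config = LLMConfig(model="meta-llama/Llama-3.1-8B-Instruct", provider="fastest")
```

With `api_key` left unset, `resolve_api_key(config)` returns whatever `HF_TOKEN` holds, or `None` if the variable is not defined.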
VLM
Vision language model with automatic video frame buffering.

| Name | Type | Default | Description |
|---|---|---|---|
| model | str | — | HuggingFace VLM model ID |
| fps | int | 1 | Video frames per second to buffer |
| frame_buffer_seconds | int | 10 | Seconds of video to buffer |
| provider | str | None | Inference provider |
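
To see how `fps` and `frame_buffer_seconds` interact, here is a rough sketch of a rolling frame buffer. It assumes the buffer holds at most `fps * frame_buffer_seconds` frames and discards the oldest first; the plugin's real buffering may work differently, and the integer "frames" are stand-ins for actual video frames.

```python
from collections import deque

fps = 1                    # default from the table
frame_buffer_seconds = 10  # default from the table

# Assumed capacity: frames per second times seconds of video to keep.
max_frames = fps * frame_buffer_seconds

# A bounded deque drops the oldest item once it reaches maxlen.
buffer = deque(maxlen=max_frames)

# Simulate 25 seconds of video sampled at `fps`; frames are plain integers here.
for frame_index in range(25 * fps):
    buffer.append(frame_index)

print(len(buffer))  # 10
print(buffer[0])    # 15, the oldest frame still held
```

Under this assumption, the defaults keep the 10 most recent frames, and raising either parameter increases how much visual context is available per request.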

