Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
Cloud Detection
Uses Roboflow’s hosted API with pre-trained models.

| Name | Type | Default | Description |
|---|---|---|---|
| model_id | str | — | Roboflow Universe model ID |
| classes | List[str] | None | Classes to detect (all classes if None) |
| conf_threshold | float | 0.5 | Confidence threshold |
| fps | int | 5 | Frame processing rate |
| annotate | bool | True | Draw bounding boxes |
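
To make the `classes`, `conf_threshold`, and `fps` parameters concrete, here is a minimal sketch of the kind of filtering and frame throttling they describe. It is not the plugin's implementation: `Detection`, `filter_detections`, and `should_process` are hypothetical names introduced for illustration, and the sketch assumes the threshold is inclusive and that frames are sampled at a fixed stride.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """Hypothetical detection record; not the plugin's actual type."""
    label: str
    confidence: float


def filter_detections(
    detections: List[Detection],
    classes: Optional[List[str]] = None,
    conf_threshold: float = 0.5,
) -> List[Detection]:
    """Keep detections at or above the threshold (assumed inclusive).

    When `classes` is None, every class is kept; otherwise only the
    listed class names pass.
    """
    return [
        d
        for d in detections
        if d.confidence >= conf_threshold
        and (classes is None or d.label in classes)
    ]


def should_process(frame_index: int, source_fps: int, fps: int = 5) -> bool:
    """Return True for the frames to process so that roughly `fps`
    frames per second are handled out of a `source_fps` stream."""
    step = max(1, round(source_fps / fps))
    return frame_index % step == 0


detections = [
    Detection("person", 0.9),
    Detection("dog", 0.4),
    Detection("car", 0.7),
]
kept_all_classes = filter_detections(detections)
kept_people = filter_detections(detections, classes=["person"])
processed_per_second = sum(should_process(i, source_fps=30, fps=5) for i in range(30))
```

With the defaults from the table, the low-confidence `dog` detection is dropped, and a 30 fps source is sampled every sixth frame.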
Local Detection
Runs RF-DETR models locally without API calls.

| Name | Type | Default | Description |
|---|---|---|---|
| model_id | str | "rfdetr-seg-preview" | RF-DETR model (e.g. "rfdetr-nano", "rfdetr-base", "rfdetr-large") |
| classes | List[str] | None | Classes to detect (all classes if None) |
| conf_threshold | float | 0.5 | Confidence threshold |
Cloud vs Local
| | Cloud | Local |
|---|---|---|
| Use when | Access to Roboflow Universe models | Higher throughput, avoid rate limits |
| Pros | Thousands of pre-trained models, no GPU required | No API costs, lower latency, works offline |
| Cons | Requires API key, potential rate limits | Requires local compute, RF-DETR models only |
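
The trade-offs in the table can be expressed as a small decision helper. This is only an illustrative sketch: `choose_backend` and its arguments are hypothetical and not part of Vision Agents or Roboflow, and the rules simply encode the table (Universe models need the cloud and an API key; local detection needs local compute and works offline).

```python
def choose_backend(
    needs_universe_model: bool,
    has_api_key: bool,
    has_local_compute: bool,
    must_work_offline: bool = False,
) -> str:
    """Return "cloud" or "local" based on the trade-offs in the table.

    Raises ValueError when the stated constraints cannot be satisfied.
    """
    if needs_universe_model:
        # Roboflow Universe models are only reachable through the hosted API.
        if must_work_offline:
            raise ValueError("Universe models require network access")
        if not has_api_key:
            raise ValueError("Cloud detection requires an API key")
        return "cloud"
    if has_local_compute:
        # RF-DETR locally: no API costs or rate limits, works offline.
        return "local"
    if has_api_key and not must_work_offline:
        # No local compute available, so fall back to the hosted API.
        return "cloud"
    raise ValueError("Neither cloud nor local detection is possible")


universe_choice = choose_backend(
    needs_universe_model=True, has_api_key=True, has_local_compute=True
)
offline_choice = choose_backend(
    needs_universe_model=False,
    has_api_key=False,
    has_local_compute=True,
    must_work_offline=True,
)
```

Here `universe_choice` is `"cloud"` and `offline_choice` is `"local"`.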
Next Steps
Build a Voice Agent: Get started with voice
Build a Video Agent: Add video processing

