NVIDIA provides vision language models through its NIM platform. The plugin enables real-time video understanding using models like Cosmos Reason2, with automatic frame buffering and NVCF asset management.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
Quick Start
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `"nvidia/cosmos-reason2-8b"` | NVIDIA model ID |
| `fps` | `int` | `1` | Video frames per second to buffer |
| `frame_buffer_seconds` | `int` | `10` | Seconds of video to buffer |
| `frame_width` | `int` | `800` | Frame width |
| `frame_height` | `int` | `600` | Frame height |
| `api_key` | `str` | `None` | API key (defaults to `NVIDIA_API_KEY` env var) |
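
To make the relationship between `fps` and `frame_buffer_seconds` concrete, here is a minimal sketch in plain Python. It is not the plugin's implementation: `VLMSettings`, `resolved_api_key`, `buffer_capacity`, and `make_frame_buffer` are hypothetical names used only for illustration. It also assumes that the buffer keeps the most recent `fps * frame_buffer_seconds` frames and discards the oldest, which the table does not state. Only the default values and the `NVIDIA_API_KEY` fallback come from the table.

```python
import os
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class VLMSettings:
    """Hypothetical container mirroring the defaults in the parameter table."""
    model: str = "nvidia/cosmos-reason2-8b"
    fps: int = 1
    frame_buffer_seconds: int = 10
    frame_width: int = 800
    frame_height: int = 600
    api_key: Optional[str] = None

    def resolved_api_key(self) -> Optional[str]:
        # An explicit key wins; otherwise fall back to the NVIDIA_API_KEY env var.
        return self.api_key or os.environ.get("NVIDIA_API_KEY")

    @property
    def buffer_capacity(self) -> int:
        # Frames retained = sampling rate * window length (assumed relationship).
        return self.fps * self.frame_buffer_seconds


def make_frame_buffer(settings: VLMSettings) -> deque:
    # A bounded deque silently drops the oldest item once it is full.
    return deque(maxlen=settings.buffer_capacity)


settings = VLMSettings()
frames = make_frame_buffer(settings)
for i in range(25):  # stand-in for 25 sampled video frames
    frames.append(f"frame-{i}")

print(settings.buffer_capacity)  # 10 frames with the default settings
print(len(frames), frames[0])    # 10 frame-15
```

Under these assumptions the defaults hold 10 frames at a time, so raising `fps` or `frame_buffer_seconds` increases the amount of video context proportionally.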
Next Steps
- Build a Voice Agent: Get started with voice
- Build a Video Agent: Add video processing

