Vogent uses neural models to predict when a speaker has completed their conversational turn, providing intelligent turn-taking for natural conversation flow.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
Quick Start
Models download automatically on first use.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| buffer_in_seconds | float | 2.0 | Audio buffer duration, in seconds |
| confidence_threshold | float | 0.5 | Turn completion threshold (0-1) |
| sample_rate | int | 16000 | Audio sample rate, in Hz |
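
To make the three parameters concrete, here is a minimal sketch of how they might interact. It is not Vogent's or Vision Agents' implementation: `TurnConfig`, `buffer_samples`, and `is_turn_complete` are hypothetical names introduced for illustration, and the decision rule (a confidence at or above the threshold ends the turn) is an assumption, not something this page confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnConfig:
    """Hypothetical container mirroring the defaults in the table above."""

    buffer_in_seconds: float = 2.0
    confidence_threshold: float = 0.5
    sample_rate: int = 16000

    def __post_init__(self) -> None:
        # The table documents the threshold as a value in the 0-1 range.
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0 and 1")
        if self.buffer_in_seconds <= 0 or self.sample_rate <= 0:
            raise ValueError("buffer_in_seconds and sample_rate must be positive")


def buffer_samples(cfg: TurnConfig) -> int:
    """Number of audio samples that fit in the configured buffer."""
    return int(round(cfg.buffer_in_seconds * cfg.sample_rate))


def is_turn_complete(confidence: float, cfg: TurnConfig) -> bool:
    """Assumed rule: the turn ends once confidence reaches the threshold."""
    return confidence >= cfg.confidence_threshold


cfg = TurnConfig()
print(buffer_samples(cfg))          # samples held by the default buffer
print(is_turn_complete(0.7, cfg))   # confidence above the default threshold
print(is_turn_complete(0.3, cfg))   # confidence below the default threshold
```

With the defaults, the buffer holds 2.0 × 16000 = 32000 samples. Under the assumed rule, raising `confidence_threshold` would make the detector wait for stronger evidence before ending a turn, while lowering it would end turns sooner but risk cutting the speaker off.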
Events
Next Steps
- Build a Voice Agent: Get started with voice
- Build a Video Agent: Add video processing

