In this example, we build a real-time sports commentator using Roboflow’s RF-DETR for player and ball detection, combined with real-time models from Gemini and OpenAI, running on Stream’s low-latency edge network. The system annotates video with bounding boxes, detects game events (such as the ball reappearing after fast action), and triggers AI commentary. Along the way, the example explores the current limitations of real-time video understanding and shows how Vision Agents can hot-swap between AI providers with minimal code changes.
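
To make the event-detection idea concrete, here is a minimal sketch of one way a "ball reappeared" event could be derived from per-frame detections. It is illustrative only and is not the Vision Agents or RF-DETR API: the class `BallReappearanceDetector`, its `update` method, and the `min_missing_frames` and `cooldown_frames` thresholds are hypothetical names and values chosen for this sketch. It assumes some upstream detector already tells us, frame by frame, whether the ball is visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BallReappearanceDetector:
    """Hypothetical helper: fires when the ball is seen again after a gap."""

    min_missing_frames: int = 5  # assumed: frames the ball must be absent first
    cooldown_frames: int = 30    # assumed: minimum frames between two events
    _missing: int = 0            # consecutive frames without a ball detection
    _since_last_event: Optional[int] = None  # frames since the last event, if any

    def update(self, ball_visible: bool) -> bool:
        """Feed one frame's result; return True if commentary should trigger."""
        if self._since_last_event is not None:
            self._since_last_event += 1

        if not ball_visible:
            self._missing += 1
            return False

        # The ball is visible: check whether it was gone long enough to count.
        was_missing_long_enough = self._missing >= self.min_missing_frames
        self._missing = 0
        cooled_down = (
            self._since_last_event is None
            or self._since_last_event >= self.cooldown_frames
        )
        if was_missing_long_enough and cooled_down:
            self._since_last_event = 0
            return True
        return False


# Usage: three visible frames, six frames without the ball, then it returns.
detector = BallReappearanceDetector()
frames = [True] * 3 + [False] * 6 + [True] * 2
events = [detector.update(visible) for visible in frames]
```

The cooldown is a design choice rather than a requirement: without it, a ball that flickers in and out of detection during fast play could trigger commentary on nearly every reappearance, which a real-time voice model cannot keep up with.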