Audio and Video Processors
Processors extend the agent’s capabilities by analysing and transforming audio and video streams in real time. They have access to the audio and video streams and can provide state to the LLM. Examples of what processors can support:
- API calls or state: Agents often need additional state, such as the score or stats of a video game or sports match.
- Video analysis: Pose detection, object recognition, etc. The output can be shared with the realtime LLM.
- Video/image capture: Support AI-driven capture of video or images.
- Video/audio transform: Video avatars, video effects, etc.
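
To make the pattern concrete, here is a minimal sketch of two processors chained together: one analyses frames and exposes state, the other transforms frames. This is not the framework's actual interface. The `Processor` base class, its `process_video` and `state` methods, and the `run_pipeline` helper are all hypothetical names chosen for illustration, and a frame is modelled as a plain nested list of grayscale pixel values.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Frame = List[List[int]]  # assumption: a frame is rows of 0-255 grayscale pixels


class Processor:
    """Hypothetical base class, not the framework's real interface."""

    def process_video(self, frame: Frame) -> Frame:
        # Default behaviour: pass the frame through unchanged.
        return frame

    def state(self) -> Dict[str, Any]:
        # Default behaviour: nothing to share with the LLM.
        return {}


@dataclass
class BrightnessProcessor(Processor):
    """Analysis example: tracks the mean brightness of the latest frame."""

    frames_seen: int = 0
    mean_brightness: float = 0.0

    def process_video(self, frame: Frame) -> Frame:
        pixels = [p for row in frame for p in row]
        self.mean_brightness = sum(pixels) / len(pixels) if pixels else 0.0
        self.frames_seen += 1
        return frame  # analysis only, the frame is not modified

    def state(self) -> Dict[str, Any]:
        return {
            "frames_seen": self.frames_seen,
            "mean_brightness": self.mean_brightness,
        }


class InvertProcessor(Processor):
    """Transform example: inverts every pixel, a stand-in for a video effect."""

    def process_video(self, frame: Frame) -> Frame:
        return [[255 - p for p in row] for row in frame]


def run_pipeline(processors: List[Processor], frame: Frame):
    """Pass a frame through each processor in order, then merge their state."""
    for processor in processors:
        frame = processor.process_video(frame)
    merged: Dict[str, Any] = {}
    for processor in processors:
        merged.update(processor.state())
    return frame, merged


if __name__ == "__main__":
    out_frame, llm_state = run_pipeline(
        [BrightnessProcessor(), InvertProcessor()],
        [[0, 255], [255, 0]],
    )
    print(out_frame)
    print(llm_state)
```

In this sketch the merged state dictionary is what would be handed to the LLM as extra context, while the transformed frame is what would be published back to the call. A real processor would work on the framework's own frame and audio types rather than nested lists.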