Anam provides real-time interactive avatar video with automatic lip-sync. Add a video avatar to your agent that speaks with natural movements synchronized to your agent’s voice output.
Vision Agents requires a Stream account for real-time transport. Anam provides API keys and avatar IDs through its dashboard.
Agent TTS audio is resampled to 24 kHz mono and streamed to Anam
Anam generates lip-synced avatar video and audio from the input
Avatar video and audio frames are streamed back to call participants via Stream Edge
When a user starts speaking, the avatar is automatically interrupted
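
The first step above, converting agent audio to 24 kHz mono before it is streamed, can be illustrated with a minimal standard-library sketch. This is not the integration's actual code: the function names (`to_mono`, `resample_linear`, `prepare_for_avatar`) are hypothetical, the input is assumed to be 16-bit interleaved PCM, and linear interpolation stands in for whatever resampler is really used.

```python
from array import array

TARGET_RATE = 24_000  # Hz, the rate named in the text


def to_mono(samples, channels):
    """Average interleaved channels into a single mono channel."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample mono samples with linear interpolation (illustrative only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = len(samples) * dst_rate // src_rate
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for n in range(out_len):
        pos = n * step
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]  # clamp at the last sample
        out.append(round(samples[i] * (1 - frac) + nxt * frac))
    return out


def prepare_for_avatar(pcm_bytes, src_rate, channels):
    """Convert 16-bit PCM bytes to 24 kHz mono 16-bit PCM bytes."""
    samples = array("h")
    samples.frombytes(pcm_bytes)
    mono = to_mono(samples, channels)
    return array("h", resample_linear(mono, src_rate)).tobytes()
```

A production resampler would normally apply a low-pass filter before downsampling to avoid aliasing; the sketch omits this to keep the two operations (downmix, then rate conversion) easy to follow.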
With Realtime LLMs

Anam also works with realtime speech-to-speech models. It subscribes to both TTS audio events and realtime audio output, so you can swap in a realtime LLM without any changes to the avatar setup.
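
Why the avatar setup stays unchanged can be shown with a toy event bus: a consumer that registers one handler for both audio event types does not care which source produced the audio. Every name here (`EventBus`, `TTSAudioEvent`, `RealtimeAudioOutputEvent`, `AvatarSink`) is hypothetical and chosen for illustration; none of them is taken from the Vision Agents or Anam APIs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TTSAudioEvent:  # hypothetical: audio produced by a TTS step
    pcm: bytes


@dataclass
class RealtimeAudioOutputEvent:  # hypothetical: audio from a speech-to-speech model
    pcm: bytes


class EventBus:
    """Minimal publish/subscribe dispatcher keyed by event type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


class AvatarSink:
    """Stands in for the avatar integration; collects audio from either source."""

    def __init__(self, bus):
        self.received = []
        # The same handler is registered for both event types.
        bus.subscribe(TTSAudioEvent, self._on_audio)
        bus.subscribe(RealtimeAudioOutputEvent, self._on_audio)

    def _on_audio(self, event):
        self.received.append(event.pcm)
```

With this shape, replacing a TTS-based pipeline with a realtime model only changes which event type is emitted; the sink's construction and handler are untouched.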