HeyGen is a service that provides realistic AI avatars with automatic lip-sync. The HeyGen plugin for the Stream Python AI SDK adds a video avatar to your AI agent that speaks with natural movements and expressions synchronized to your agent's voice, creating more engaging and human-like AI interactions.

Features

  • 🎤 Automatic Lip-Sync: Avatar automatically syncs with audio
  • 🚀 WebRTC Streaming: Low-latency real-time video streaming
  • 🎨 Customizable: Change avatar, quality, and resolution

Installation

Install the Stream HeyGen plugin with:
uv add vision-agents[heygen]
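
If you manage dependencies with pip instead of uv, the equivalent install is:
pip install "vision-agents[heygen]"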

Example

Check out our HeyGen examples to see working code samples using the plugin, or read on for some key details.

Initialisation

The HeyGen plugin for Stream is exposed as the AvatarPublisher class:
from vision_agents.plugins import heygen

avatar = heygen.AvatarPublisher(
    avatar_id="default",
    quality=heygen.VideoQuality.HIGH
)
To initialise without passing in the API key, make sure HEYGEN_API_KEY is set as an environment variable. You can do this either by defining it in a .env file or by exporting it directly in your terminal.
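
As a minimal sketch, assuming you use the python-dotenv package to load the .env file, the key only needs to be in the process environment before the publisher is constructed:
from dotenv import load_dotenv
from vision_agents.plugins import heygen

load_dotenv()  # loads HEYGEN_API_KEY from a local .env file into the environment

avatar = heygen.AvatarPublisher(avatar_id="default")  # falls back to HEYGEN_API_KEY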

Parameters

These are the parameters available in the HeyGen AvatarPublisher plugin for you to customise:
  • avatar_id (str, default "default"): HeyGen avatar ID to use for streaming. Get this from your HeyGen dashboard.
  • quality (VideoQuality, default VideoQuality.HIGH): Video quality setting. Options: VideoQuality.LOW, VideoQuality.MEDIUM, or VideoQuality.HIGH.
  • resolution (Tuple[int, int], default (1920, 1080)): Output video resolution as (width, height).
  • api_key (str or None, default None): Your HeyGen API key. If not provided, the plugin will look for the HEYGEN_API_KEY environment variable.
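
For example, here is a publisher with every parameter spelled out (the key string is a placeholder; in real deployments, prefer the environment variable):
from vision_agents.plugins import heygen

avatar = heygen.AvatarPublisher(
    avatar_id="default",
    quality=heygen.VideoQuality.MEDIUM,
    resolution=(1280, 720),          # (width, height)
    api_key="your-heygen-api-key",   # placeholder; omit to fall back to HEYGEN_API_KEY
)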

How It Works

The HeyGen avatar integration works differently depending on whether you're using a standard streaming LLM or a Realtime LLM.

With Standard LLMs

When using a standard streaming LLM (like Gemini LLM), the flow is:
  1. Text Generation: Your LLM generates text responses
  2. Lip-Sync: Text is sent directly to HeyGen for avatar lip-sync generation
  3. Audio Synthesis: HeyGen generates both the avatar video and audio with TTS
  4. Streaming: Avatar video and audio are streamed to call participants
This approach has lower latency because text goes directly to HeyGen without transcription delays.
from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, gemini, deepgram, heygen

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Avatar Assistant"),
    instructions="You're a friendly AI assistant.",
    
    llm=gemini.LLM("gemini-2.0-flash-exp"),
    stt=deepgram.STT(),
    
    processors=[
        heygen.AvatarPublisher(
            avatar_id="default",
            quality=heygen.VideoQuality.HIGH
        )
    ]
)

With Realtime LLMs

When using a Realtime LLM (like Gemini Realtime), the flow is:
  1. Audio Generation: Realtime LLM generates audio directly
  2. Transcription: Audio is transcribed to text
  3. Lip-Sync: Text transcription is sent to HeyGen for avatar lip-sync
  4. Video Only: HeyGen generates avatar video (audio comes from the Realtime LLM)
  5. Streaming: Avatar video and LLM audio are streamed together
from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, gemini, heygen

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Avatar Assistant"),
    instructions="You're a friendly AI assistant.",
    
    llm=gemini.Realtime(model="gemini-2.5-flash-native-audio-preview-09-2025"),
    
    processors=[
        heygen.AvatarPublisher(
            avatar_id="default",
            quality=heygen.VideoQuality.HIGH
        )
    ]
)

Usage in Agent

Add the AvatarPublisher to your agent’s processors list:
from uuid import uuid4
from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, gemini, deepgram, heygen

async def start_avatar_agent():
    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="AI Assistant with Avatar", id="agent"),
        instructions="You're a friendly AI assistant.",
        
        llm=gemini.LLM("gemini-2.0-flash"),
        stt=deepgram.STT(),
        
        processors=[
            heygen.AvatarPublisher(
                avatar_id="default",
                quality=heygen.VideoQuality.HIGH,
                resolution=(1920, 1080)
            )
        ]
    )
    
    call = agent.edge.client.video.call("default", str(uuid4()))
    
    async with await agent.join(call):
        await agent.edge.open_demo(call)
        await agent.simple_response("Hello! I'm your AI assistant with an avatar.")
        await agent.finish()
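
Since start_avatar_agent is a coroutine, drive it with asyncio when running the script directly:
import asyncio

if __name__ == "__main__":
    asyncio.run(start_avatar_agent())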

Video Quality Options

Choose the appropriate quality based on your bandwidth and requirements (a selection sketch follows this list):
  • VideoQuality.LOW: Lower bandwidth usage, suitable for slower connections
  • VideoQuality.MEDIUM: Balanced quality and bandwidth
  • VideoQuality.HIGH: Best quality, requires stable high-bandwidth connection
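
As an illustrative sketch (the bandwidth thresholds below are assumptions, not part of the plugin), you might select a quality level from a measured uplink estimate:
from vision_agents.plugins import heygen

def pick_quality(uplink_kbps: int) -> heygen.VideoQuality:
    # Illustrative thresholds; tune them for your own network conditions.
    if uplink_kbps < 1500:
        return heygen.VideoQuality.LOW
    if uplink_kbps < 4000:
        return heygen.VideoQuality.MEDIUM
    return heygen.VideoQuality.HIGH

avatar = heygen.AvatarPublisher(avatar_id="default", quality=pick_quality(2500))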

Getting Your Avatar ID

  1. Sign up for a HeyGen account
  2. Navigate to your HeyGen dashboard
  3. Find your avatar ID in the avatar settings
  4. Use this ID in the avatar_id parameter, as shown below
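
For example (the ID below is a placeholder; substitute the one from your dashboard):
from vision_agents.plugins import heygen

avatar = heygen.AvatarPublisher(avatar_id="your-avatar-id-from-dashboard")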

Troubleshooting

Connection Issues

If you experience connection problems:
  • Verify your HeyGen API key is valid (a quick sanity check is sketched after this list)
  • Ensure network access to HeyGen’s servers
  • Check firewall settings for WebRTC traffic
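
As a quick sanity check (plain Python, nothing plugin-specific), confirm the key is actually visible to your process:
import os

key = os.environ.get("HEYGEN_API_KEY")
if not key:
    raise RuntimeError("HEYGEN_API_KEY is not set; export it or add it to your .env file")
print(f"HEYGEN_API_KEY is set ({len(key)} characters)")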

Video Quality Issues

To optimize video quality:
  • Use quality=VideoQuality.HIGH for best results
  • Ensure stable internet connection
  • Consider lowering resolution if bandwidth is limited

No Avatar Appearing

  • Check browser console for errors
  • Verify Stream credentials are correct
  • Ensure HeyGen API key has proper permissions