View virtual try-on example on GitHub
Check out the complete virtual try-on example in our GitHub repository
Powered by Decart's real-time Lucy-2 model (lucy_2_rt), the agent listens for voice requests and restyles your video feed so you appear to be wearing different outfits, driven by both a text prompt and a reference image.
Lucy-2 is purpose-built for virtual try-on and costume-swap use cases. It accepts a reference image alongside a prompt, enabling accurate outfit transfer onto the user’s live video. Prompt and image updates are applied atomically via update_state, so the output video never shows a half-updated frame.
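The atomic update described above can be sketched with a stand-in processor. This is an illustrative model of the behavior, not the actual Vision Agents API: the class and field names here are assumptions, but the key idea matches the text, a new state is built in full and then swapped in as one unit so no frame ever mixes an old image with a new prompt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RestyleState:
    """Prompt and reference image that drive the restyled output."""
    prompt: str = ""
    image: Optional[bytes] = None


class Lucy2Processor:
    """Illustrative stand-in for the Lucy-2 video processor."""

    def __init__(self) -> None:
        self._state = RestyleState()

    def update_state(self, prompt: Optional[str] = None,
                     image: Optional[bytes] = None) -> None:
        # Build the complete new state first, then replace the old state in
        # a single assignment, so a frame is never rendered with a fresh
        # prompt but a stale reference image (or vice versa).
        self._state = RestyleState(
            prompt=prompt if prompt is not None else self._state.prompt,
            image=image if image is not None else self._state.image,
        )

    @property
    def state(self) -> RestyleState:
        return self._state
```

Because the swap is a single reference assignment, any frame rendered mid-update sees either the old prompt-and-image pair or the new one, never a mix.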
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
What you will build
- Listen to voice input and swap outfits in real time
- Use Decart Lucy-2 to restyle your video feed with both a prompt and a reference image
- Atomically swap costumes via processor.update_state(prompt=..., image=...)
- Fall back to prompt-only changes for freeform outfit requests
- Speak with an expressive voice using ElevenLabs
- Run on Stream’s low-latency edge network
Next steps
Decart Integration
Explore Decart’s video restyling and try-on capabilities
Expressive Voice Narrator
See another storytelling example with Cartesia’s expressive TTS

