
Check out the complete virtual try-on example in our GitHub repository.
Build a real-time virtual try-on agent using Vision Agents and Decart. Powered by Decart’s Lucy-2 real-time model (lucy_2_rt), the agent listens for voice requests and restyles your video feed so you appear to be wearing different outfits, driven by both a text prompt and a reference image.

Lucy-2 is purpose-built for virtual try-on and costume-swap use cases. It accepts a reference image alongside a prompt, enabling accurate outfit transfer onto the user’s live video. Prompt and image updates are applied atomically via update_state, so the output video never shows a half-updated frame.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.

What you will build

  • Listen to voice input and swap outfits in real time
  • Use Decart Lucy-2 to restyle your video feed with both a prompt and a reference image
  • Atomically swap costumes via processor.update_state(prompt=..., image=...)
  • Fall back to prompt-only changes for freeform outfit requests
  • Speak with an expressive voice using ElevenLabs
  • Run on Stream’s low-latency edge network
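To make the atomicity guarantee concrete, here is a minimal, self-contained sketch of the pattern behind update_state(prompt=..., image=...). The TryOnState and TryOnProcessor classes are illustrative stand-ins, not the real Vision Agents API; only the keyword-argument shape of update_state comes from this page. The idea is that prompt and reference image are swapped together under a lock, so a frame is never rendered with a new prompt but a stale image (or vice versa).

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TryOnState:
    """Immutable snapshot of the restyling parameters (illustrative)."""
    prompt: str
    image: Optional[bytes]  # reference image for outfit transfer


class TryOnProcessor:
    """Stand-in for a video processor with atomic state updates."""

    def __init__(self, prompt: str, image: Optional[bytes] = None):
        self._lock = threading.Lock()
        self._state = TryOnState(prompt, image)

    def update_state(self, *, prompt: Optional[str] = None,
                     image: Optional[bytes] = None) -> None:
        # Replace prompt and image in one step: readers either see the
        # old pair or the new pair, never a half-updated mix.
        with self._lock:
            current = self._state
            self._state = TryOnState(
                prompt if prompt is not None else current.prompt,
                image if image is not None else current.image,
            )

    def snapshot(self) -> TryOnState:
        # Each frame reads a single immutable state object.
        with self._lock:
            return self._state


processor = TryOnProcessor(prompt="casual t-shirt")
processor.update_state(prompt="red victorian gown",
                       image=b"<reference image bytes>")
state = processor.snapshot()
print(state.prompt)  # red victorian gown
```

Freezing the dataclass and swapping whole state objects (rather than mutating fields in place) is what makes the prompt-only fallback safe too: calling update_state(prompt=...) alone keeps the existing reference image.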

Next steps

Decart Integration

Explore Decart’s video restyling and try-on capabilities

Expressive Voice Narrator

See another storytelling example with Cartesia’s expressive TTS