Wizper is a real-time variant of OpenAI’s Whisper v3, hosted by Fal.ai, that provides speech-to-text and highly accurate on-the-fly translation. With the Vision Agents SDK, you can use Wizper in your video calls with just a few lines of code.

Installation

Install the Stream Wizper plugin with:
uv add "vision-agents[wizper]"

Example

Check out our auto-translation example to see a practical implementation of the plugin and get inspiration for your own projects, or read on for some key details.
from vision_agents.plugins import wizper

# 1. Pure transcription (default)
stt = wizper.STT()

# 2. Translate everything participants say to Spanish
stt = wizper.STT(target_language="es")

@stt.on("transcript")
async def on_transcript(text: str, user: dict, metadata: dict):
    print(f"{user['name']} said → {text}")

# Send Stream PCM audio frames to the plugin
await stt.process_audio(pcm_data)

# Close when finished
await stt.close()

Initialisation

First, create an API key for the Fal.ai service and set the FAL_KEY environment variable to that key. The Wizper plugin is exposed via the STT class:
from vision_agents.plugins import wizper

# 1. Pure transcription (default)
stt = wizper.STT()

# 2. Translate everything participants say to Spanish
stt = wizper.STT(target_language="es")
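
Because the plugin relies on the FAL_KEY environment variable, it can help to fail fast with a clear message when the variable is missing. The following is a minimal sketch using only the standard library; `require_fal_key` is a hypothetical helper for illustration, not part of the Wizper plugin.

```python
import os
from typing import Mapping


def require_fal_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the FAL_KEY value, or raise a clear error if it is missing.

    Hypothetical helper: the plugin itself does not provide this function.
    """
    key = env.get("FAL_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FAL_KEY is not set; create a Fal.ai API key and export it first."
        )
    return key
```

Calling `require_fal_key()` once at start-up, before constructing `wizper.STT()`, surfaces a missing key immediately rather than at the first transcription request.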

Parameters

You can customise the behaviour of Wizper through the following parameters:
Name             Type        Default       Description
task             str | None  "transcribe"  Task to perform on the audio: either "transcribe" or "translate".
target_language  str | None  None          ISO-639-1 code. If set, Wizper translates the recognised text to this language.
sample_rate      int         48000         Incoming PCM sample rate in Hertz.
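
The constraints in the table can be checked before the plugin is constructed. The sketch below is a hypothetical helper (`check_wizper_options` is not part of the plugin) that validates the three documented parameters. It only checks that `target_language` has the two-letter shape of an ISO-639-1 code; it does not know which languages Wizper actually supports.

```python
import re
from typing import Optional

# The two task values named in the parameter table.
VALID_TASKS = {"transcribe", "translate"}


def check_wizper_options(
    task: Optional[str] = "transcribe",
    target_language: Optional[str] = None,
    sample_rate: int = 48000,
) -> dict:
    """Validate the documented parameters and return them as keyword arguments."""
    if task is not None and task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")
    # Shape check only: two lowercase letters, as in ISO-639-1 codes such as "es".
    if target_language is not None and not re.fullmatch(r"[a-z]{2}", target_language):
        raise ValueError(f"target_language must be a two-letter code, got {target_language!r}")
    if sample_rate <= 0:
        raise ValueError(f"sample_rate must be positive, got {sample_rate}")
    return {"task": task, "target_language": target_language, "sample_rate": sample_rate}
```

Assuming the constructor accepts the parameter names listed in the table as keyword arguments, the result could be passed on as `wizper.STT(**check_wizper_options(target_language="es"))`.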

Functionality

Process Audio

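from getstream.video import rtc
from getstream.video.rtc.track_util import PcmData

# `stt` is the wizper.STT instance created during initialisation
async with rtc.join(call, bot_user_id) as connection:

    @connection.on("audio")
    async def _on_audio(pcm: PcmData, user):
        # Forward each incoming PCM frame, tagged with its speaker, to Wizper
        await stt.process_audio(pcm, user)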
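
When debugging the audio handler, it can be useful to know how much audio a chunk holds. The sketch below computes that from a raw byte count. It assumes 16-bit mono samples, which this page does not state, together with the documented default sample rate of 48000 Hz; `pcm_duration_seconds` is a hypothetical helper.

```python
def pcm_duration_seconds(
    num_bytes: int,
    sample_rate: int = 48000,   # documented default, in Hertz
    bytes_per_sample: int = 2,  # assumption: 16-bit samples
    channels: int = 1,          # assumption: mono audio
) -> float:
    """Return the duration in seconds of a raw PCM buffer of num_bytes bytes."""
    frame_size = bytes_per_sample * channels
    if num_bytes % frame_size != 0:
        raise ValueError("buffer length is not a whole number of frames")
    frames = num_bytes // frame_size
    return frames / sample_rate
```

Under these assumptions, a 1920-byte buffer holds 960 samples, which is 20 ms of audio at 48 kHz.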

Events

The plugin emits the standard Stream STT events:

Transcript Event

from typing import Any

@stt.on("transcript")
async def _on_transcript(text: str, user: Any, metadata: dict):
    ...
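
A transcript handler often needs to keep what each participant said. The following sketch groups transcripts by speaker using a handler with the same `(text, user, metadata)` signature as above. `TranscriptLog` is a hypothetical class, and it assumes `user` is a dict with a `"name"` key, as in the first example on this page.

```python
from collections import defaultdict
from typing import Any


class TranscriptLog:
    """Collects transcript text per speaker (hypothetical helper)."""

    def __init__(self) -> None:
        self._lines = defaultdict(list)

    async def on_transcript(self, text: str, user: Any, metadata: dict) -> None:
        # Assumption: user is a dict carrying a "name" key; fall back otherwise.
        name = user.get("name", "unknown") if isinstance(user, dict) else str(user)
        self._lines[name].append(text)

    def text_for(self, name: str) -> str:
        """Return everything the named speaker has said so far, space-joined."""
        return " ".join(self._lines.get(name, []))
```

If `stt.on("transcript")` behaves as an ordinary decorator, as its usage above suggests, the handler could be registered with `stt.on("transcript")(log.on_transcript)`.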

Error Event

@stt.on("error")
async def _on_error(error: Exception):
    # Handle errors returned by Wizper here
    print(f"Wizper error: {error}")

Close

When you’re done, close the Wizper connection with close():
await stt.close()
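
To make sure close() runs even when the surrounding code raises, the call can be wrapped in an async context manager. This is a minimal sketch using the standard library; `closing_stt` is a hypothetical helper that works with any object exposing an awaitable `close()` method, which is all it assumes about the plugin.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def closing_stt(stt):
    """Yield stt and always await stt.close() on exit, even after an error."""
    try:
        yield stt
    finally:
        await stt.close()
```

Used as `async with closing_stt(wizper.STT()) as stt:`, the connection is closed when the block exits, whether normally or through an exception.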