This guide takes you through what you need to know to create your own plugin that connects the Stream Python AI SDK to a third-party AI provider. Currently, we support STT, TTS, VAD, LLM and Realtime (speech-to-speech) plugins, and we'll be adding more functionality and features over time.
TL;DR - Copy the Quickstart Template into a new directory, fill in the blanks, run the test suite with uv run pytest, and open a PR.

Plugin Categories

Each plugin implements one of the abstract base classes in vision_agents/core:
Category                        Base class   Typical provider examples
STT (speech-to-text)            STT          Deepgram
TTS (text-to-speech)            TTS          ElevenLabs
VAD (voice activity detection)  VAD          Silero
LLM (large language model)      LLM          OpenAI
Realtime                        Realtime     Gemini Live
All base classes ship with:
  • WebRTC integration: Your plugin sends and receives audio frames to and from a Stream video call, with the WebRTC plumbing handled for you.
  • Events: Plugins use our event system to emit events which can be handled by event listeners in your application.
Implementing the abstract methods in the base class is all that is required for a minimal integration.

System Architecture & Lifecycle

An example workflow could look like this:
  1. You instantiate the plugin client and add it to your app.
  2. You listen for an event (e.g. audio received), which fires and triggers your plugin.
  3. Your plugin calls the third-party API.
  4. Results are dispatched via an event or directly into the call, e.g. for an STT plugin a transcript event is fired.
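For step 1, instantiating a plugin might look like the sketch below. It uses the Deepgram STT plugin from the table above and assumes it reads its API key from the environment; check each plugin's README for the exact constructor arguments.

from vision_agents.plugins import deepgram

# Step 1: create the plugin client; your app then wires it into the call.
# Hypothetical no-argument constructor, assuming the API key is read from the environment.
stt = deepgram.STT()
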
For an STT plugin, the workflow looks like:
Call → WebRTC track → STT.process_audio()
                          ↓  calls the provider's API
Provider transcript → self.events.send(events.STTTranscriptEvent(
                          session_id=self.session_id,
                          plugin_name=self.provider_name,
                          text=text,
                          user_metadata=user_metadata,
                          confidence=metadata.get("confidence"),
                          language=metadata.get("language"),
                          processing_time_ms=metadata.get("processing_time_ms"),
                          audio_duration_ms=metadata.get("audio_duration_ms"),
                          model_name=metadata.get("model_name"),
                          words=metadata.get("words"),
                      ))
                          ↓
Your application consumes the event
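
On the application side, a listener for that transcript event might look like this sketch. The stt.events import path and the subscribe call are assumptions about the EventManager API; check the core events documentation for the exact registration mechanism.

from vision_agents.core import stt

async def on_transcript(event: stt.events.STTTranscriptEvent):
    # The fields used here are the ones shown in the event above
    print(f"[{event.plugin_name}] {event.text} (confidence={event.confidence})")

# my_stt is your plugin instance; the registration call is an assumption
my_stt.events.subscribe(on_transcript)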

Required Overrides

Each plugin type's base class defines abstract methods that your plugin must implement. For an STT plugin, for example, you must override the STT.process_audio and STT.close methods.
from vision_agents.core import stt
from getstream.video.rtc.track_util import PcmData

class MySTT(stt.STT):

    async def process_audio(self, pcm: PcmData, user):
        """Send PCM data to provider and await transcripts."""

    async def close(self):
        """Gracefully shut down connections."""

Quickstart Template

Below is the skeleton for a new STT plugin named AcmeSTT. Replace the placeholders with real provider logic; for other plugin types, substitute the equivalent base class and methods. Create these files in the agents repo under the plugins/acme folder:
plugins/acme/
        ├── pyproject.toml
        ├── README.md
        ├── vision_agents/
        │   └── plugins/
        │       └── acme/
        │           ├── __init__.py
        │           ├── stt.py
        │           └── events.py   # Data classes for events specific to your plugin
        └── tests/
            └── test_stt.py

pyproject.toml

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "vision-agents-plugins-acme"
version = "0.1.0"
description = "Acme STT integration for GetStream"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "vision-agents",
    "acmestt-sdk~=1.2",    # provider SDK
]

[tool.hatch.build.targets.wheel]
packages = ["."]

[tool.hatch.metadata]
allow-direct-references = true

[tool.uv.sources]
vision-agents = { workspace = true } # Use the local version of Vision Agents on your file system 

[dependency-groups]
dev = [
    "pytest>=8.4.1",
    "pytest-asyncio>=1.0.0",
]

/plugins/acme/vision_agents/plugins/acme/__init__.py

from .stt import AcmeSTT as STT

# Extend the namespace package so this plugin can live alongside the other vision_agents plugins
__path__ = __import__("pkgutil").extend_path(__path__, __name__)

__all__ = ["STT"]

/plugins/acme/vision_agents/plugins/acme/stt.py

from __future__ import annotations
from vision_agents.core import stt
from getstream.video.rtc.track_util import PcmData

class AcmeSTT(stt.STT):
    """Real-time STT via fictional provider Acme."""

    # This will be called by the base class `STT.process_audio` method
    async def _process_audio_impl(self, pcm: PcmData, user):
        """Send PCM data to provider and await transcripts."""

    async def close(self):
        """Gracefully shut down connections."""

/plugins/acme/vision_agents/tests/test_stt.py

Fill out your test suite to cover all the plugin’s functions and features. Run the suite with:
# Install dependencies and create virtual environment
uv sync

# Run tests
uv run pytest -v
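
A minimal async test could look like the sketch below; the constructor argument is an assumption, and a real suite should also feed recorded PCM audio through the plugin and assert on the emitted transcript events.

import pytest
from vision_agents.plugins.acme import STT

@pytest.mark.asyncio
async def test_create_and_close():
    # Hypothetical constructor argument; adapt to however AcmeSTT is configured
    acme_stt = STT(api_key="test-key")
    assert acme_stt is not None
    await acme_stt.close()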

README.md

Model your README on those of existing plugins: explain how to use the plugin, what it can do, which settings or environment variables are required, the events it emits, and so on.

Emitting Custom Events

Each base class has an event system attached, which uses our built-in EventManager. This means you can emit events by calling:
    # Inside the AcmeSTT class
    self.events.send(YourEventModel(...))
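
For example, events.py could define a plugin-specific event such as the one below. This is a sketch using a plain dataclass; check the core events module for a base event class your events may need to extend.

from dataclasses import dataclass

@dataclass
class AcmeConnectionStateEvent:
    """Hypothetical example: emitted when the Acme connection opens or closes."""
    plugin_name: str
    state: str  # e.g. "connected" or "disconnected"

Inside the plugin you would then emit it with self.events.send(AcmeConnectionStateEvent(plugin_name=self.provider_name, state="connected")).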
Don’t forget to document them!

Packaging & Distribution

First, install dev dependencies and make sure your tests all pass. Then, run quality checks:
uv sync
uv run pre-commit run --all-files
When everything looks good, open a PR against the Stream agents repo. Once merged, we will publish the plugin under the vision-agents-plugins-<provider> namespace on PyPI.
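Once published, application developers can pull your plugin in through the shared namespace; constructor arguments are omitted here because they depend on your provider.

from vision_agents.plugins import acme

acme_stt = acme.STT()  # configure with your provider credentials / settings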

Contribution Checklist

  1. Fork agents, create a feature branch
  2. Add your plugin by implementing one of the base classes
  3. Ensure tests are all passing
  4. Ensure new code passes the pre-commit hooks
  5. List runtime and test dependencies in pyproject.toml
  6. Add documentation (README.md in the plugin folder)

Support

Need help? Open an issue or discussion on the Agents GitHub repo or email support@getstream.io. We look forward to seeing what you build with us!