Sarvam AI provides streaming text-to-speech using the Bulbul model, with configurable speaker, pace, and language support for Indian languages.
Vision Agents requires a Stream account for real-time transport. Get your Sarvam API key from the Sarvam dashboard.
Sarvam also provides speech-to-text and an LLM. You can use all three in the same agent.

Installation

uv add "vision-agents[sarvam]"

Quick start

from vision_agents.core import Agent, User
from vision_agents.plugins import sarvam, getstream, smart_turn

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Sarvam Agent", id="agent"),
    instructions="Reply in the same language the user speaks.",
    llm=sarvam.LLM(model="sarvam-m"),
    stt=sarvam.STT(language="hi-IN"),
    tts=sarvam.TTS(speaker="shubh"),
    turn_detection=smart_turn.TurnDetection(),
)
Set SARVAM_API_KEY in your environment or pass api_key directly.
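
The key lookup described above can be sketched in plain Python. This is only an illustration of the documented behavior (an explicit `api_key` wins, otherwise the `SARVAM_API_KEY` environment variable is used). `resolve_sarvam_api_key` is a hypothetical helper, not part of the plugin, and the plugin's own handling of a missing key may differ.

```python
import os
from typing import Optional


def resolve_sarvam_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: explicit key first, then the SARVAM_API_KEY env var."""
    key = api_key or os.environ.get("SARVAM_API_KEY")
    if not key:
        # Fail early with a clear message instead of at the first API call.
        raise RuntimeError("Set SARVAM_API_KEY or pass api_key explicitly.")
    return key
```

The returned value could then be passed on, for example as `sarvam.TTS(api_key=resolve_sarvam_api_key())`.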

Parameters

tts = sarvam.TTS(
    model="bulbul:v3",
    language="hi-IN",
    speaker="shubh",
    pace=1.0,
)

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | "bulbul:v3" | TTS model (bulbul:v2, bulbul:v3-beta, or bulbul:v3) |
| language | str | "hi-IN" | Target language code (e.g. hi-IN, en-IN) |
| speaker | str | "shubh" | Speaker voice ID; must be compatible with the chosen model (see below) |
| sample_rate | int | 24000 | Output sample rate in Hz |
| pace | float | None | Speech pace (bulbul:v3 supports 0.5–2.0) |
| pitch | float | None | Speech pitch (bulbul:v2 only) |
| loudness | float | None | Speech loudness (bulbul:v2 only) |
| temperature | float | None | Sampling temperature (bulbul:v3 and bulbul:v3-beta only) |
| enable_preprocessing | bool | True | Normalize mixed-language and numeric text |
| api_key | str | None | API key (defaults to SARVAM_API_KEY env var) |
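
Several of the parameters above only apply to certain models. A client-side pre-check along the following lines may help catch mismatches before constructing the TTS object. This is a minimal sketch: `build_tts_kwargs` is a hypothetical helper, the constraints are copied from the table, and the source does not say whether the plugin itself validates these combinations.

```python
from typing import Any, Dict, Optional

# Constraints copied from the parameter table; they may change between releases.
V3_MODELS = {"bulbul:v3", "bulbul:v3-beta"}
PACE_RANGE = (0.5, 2.0)  # documented for bulbul:v3


def build_tts_kwargs(
    model: str = "bulbul:v3",
    pace: Optional[float] = None,
    pitch: Optional[float] = None,
    loudness: Optional[float] = None,
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: reject parameter/model pairs the table rules out."""
    if model != "bulbul:v2" and (pitch is not None or loudness is not None):
        raise ValueError("pitch and loudness are documented for bulbul:v2 only")
    if model not in V3_MODELS and temperature is not None:
        raise ValueError("temperature is documented for bulbul:v3 and bulbul:v3-beta only")
    if model == "bulbul:v3" and pace is not None:
        low, high = PACE_RANGE
        if not low <= pace <= high:
            raise ValueError(f"pace must be between {low} and {high} for bulbul:v3")

    # Only forward values that were set, so unset ones keep their defaults.
    kwargs: Dict[str, Any] = {"model": model}
    for name, value in (
        ("pace", pace),
        ("pitch", pitch),
        ("loudness", loudness),
        ("temperature", temperature),
    ):
        if value is not None:
            kwargs[name] = value
    return kwargs
```

The result is intended to be unpacked into the constructor, for example `sarvam.TTS(speaker="shubh", **build_tts_kwargs(pace=1.2))`.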

Speaker compatibility

Each model supports a specific set of speakers. Passing an incompatible speaker raises a ValueError.

| Model | Speakers |
| --- | --- |
| bulbul:v2 | abhilash, anushka, arya, hitesh, karun, manisha, vidya |
| bulbul:v3-beta | aayan, aditya, advait, amelia, amit, ashutosh, dev, ishita, kabir, kavya, manan, neha, pooja, priya, rahul, ratan, ritu, rohan, roopa, shubh, shreya, simran, sophia, sumit, varun |
| bulbul:v3 | aayan, aditya, advait, amelia, amit, ashutosh, dev, ishita, kabir, kavya, manan, neha, pooja, priya, rahul, ratan, ritu, rohan, roopa, shubh, shreya, simran, sophia, sumit, varun |
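
When the model and speaker come from user configuration, it can be useful to validate the pair before building the agent. The sketch below mirrors the compatibility table in plain Python. `check_speaker` and the lookup tables are hypothetical names for illustration only, and the speaker lists are copied from this page, so they may go stale as models are updated.

```python
# Speaker lists copied from the compatibility table on this page.
V2_SPEAKERS = frozenset(
    {"abhilash", "anushka", "arya", "hitesh", "karun", "manisha", "vidya"}
)
V3_SPEAKERS = frozenset(
    {
        "aayan", "aditya", "advait", "amelia", "amit", "ashutosh", "dev",
        "ishita", "kabir", "kavya", "manan", "neha", "pooja", "priya",
        "rahul", "ratan", "ritu", "rohan", "roopa", "shubh", "shreya",
        "simran", "sophia", "sumit", "varun",
    }
)
SPEAKERS_BY_MODEL = {
    "bulbul:v2": V2_SPEAKERS,
    "bulbul:v3-beta": V3_SPEAKERS,
    "bulbul:v3": V3_SPEAKERS,
}


def check_speaker(model: str, speaker: str) -> None:
    """Hypothetical pre-check: raise ValueError for an incompatible pair."""
    allowed = SPEAKERS_BY_MODEL.get(model)
    if allowed is None:
        raise ValueError(f"unknown model: {model!r}")
    if speaker not in allowed:
        raise ValueError(
            f"speaker {speaker!r} is not listed for {model}; "
            f"choose one of: {', '.join(sorted(allowed))}"
        )
```

For example, `check_speaker("bulbul:v3", "shubh")` returns without error, while `check_speaker("bulbul:v2", "shubh")` raises a `ValueError` because `shubh` is not among the bulbul:v2 speakers.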

Next steps

Sarvam STT: Streaming speech-to-text for Indian languages

Sarvam LLM: Chat completions with Sarvam models