Sarvam AI provides streaming text-to-speech using the Bulbul model, with configurable speaker, pace, and language support for Indian languages.
Vision Agents requires a Stream account for real-time transport. Get your Sarvam API key from the Sarvam dashboard.
Installation
Quick start
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `"bulbul:v3"` | TTS model (`bulbul:v2`, `bulbul:v3-beta`, or `bulbul:v3`) |
| `language` | `str` | `"hi-IN"` | Target language code (e.g. `hi-IN`, `en-IN`) |
| `speaker` | `str` | `"shubh"` | Speaker voice ID; must be compatible with the chosen model (see below) |
| `sample_rate` | `int` | `24000` | Output sample rate in Hz |
| `pace` | `float` | `None` | Speech pace (`bulbul:v3` supports 0.5–2.0) |
| `pitch` | `float` | `None` | Speech pitch (`bulbul:v2` only) |
| `loudness` | `float` | `None` | Speech loudness (`bulbul:v2` only) |
| `temperature` | `float` | `None` | Sampling temperature (`bulbul:v3` and `bulbul:v3-beta` only) |
| `enable_preprocessing` | `bool` | `True` | Normalize mixed-language and numeric text |
| `api_key` | `str` | `None` | API key (defaults to the `SARVAM_API_KEY` environment variable) |
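
The model-specific notes in the table can be checked before a TTS instance is constructed. The following is a minimal sketch of such a check, not the plugin's own validation: `check_tts_params` is a hypothetical helper, and it assumes the 0.5–2.0 pace range applies to `bulbul:v3` only, as the table states.

```python
from typing import Optional

# Models that accept `temperature`, per the parameter table.
V3_MODELS = {"bulbul:v3", "bulbul:v3-beta"}


def check_tts_params(
    model: str = "bulbul:v3",
    pace: Optional[float] = None,
    pitch: Optional[float] = None,
    loudness: Optional[float] = None,
    temperature: Optional[float] = None,
) -> list:
    """Hypothetical helper: return a list of conflicts (empty means none)."""
    problems = []
    # The table documents a 0.5-2.0 pace range for bulbul:v3.
    if model == "bulbul:v3" and pace is not None and not 0.5 <= pace <= 2.0:
        problems.append(f"pace {pace} is outside 0.5-2.0 for {model}")
    # pitch and loudness are documented as bulbul:v2 only.
    if model != "bulbul:v2":
        if pitch is not None:
            problems.append(f"pitch is not supported by {model}")
        if loudness is not None:
            problems.append(f"loudness is not supported by {model}")
    # temperature is documented for the v3 family only.
    if temperature is not None and model not in V3_MODELS:
        problems.append(f"temperature is not supported by {model}")
    return problems


if __name__ == "__main__":
    for problem in check_tts_params(model="bulbul:v3", pace=3.0, pitch=0.1):
        print(problem)
```

Returning a list instead of raising on the first conflict lets a caller report every incompatible setting at once.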
Speaker compatibility
Each model supports a specific set of speakers. Passing an incompatible speaker raises a `ValueError`.
| Model | Speakers |
|---|---|
| `bulbul:v2` | abhilash, anushka, arya, hitesh, karun, manisha, vidya |
| `bulbul:v3-beta` | aayan, aditya, advait, amelia, amit, ashutosh, dev, ishita, kabir, kavya, manan, neha, pooja, priya, rahul, ratan, ritu, rohan, roopa, shubh, shreya, simran, sophia, sumit, varun |
| `bulbul:v3` | aayan, aditya, advait, amelia, amit, ashutosh, dev, ishita, kabir, kavya, manan, neha, pooja, priya, rahul, ratan, ritu, rohan, roopa, shubh, shreya, simran, sophia, sumit, varun |
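
The compatibility rule above can be illustrated with a small lookup. This is a sketch under stated assumptions, not the plugin's implementation: `SPEAKERS_BY_MODEL` and `validate_speaker` are hypothetical names, and the speaker sets are copied from the table.

```python
# Speaker sets copied from the compatibility table.
V2_SPEAKERS = frozenset(
    {"abhilash", "anushka", "arya", "hitesh", "karun", "manisha", "vidya"}
)
V3_SPEAKERS = frozenset(
    {
        "aayan", "aditya", "advait", "amelia", "amit", "ashutosh", "dev",
        "ishita", "kabir", "kavya", "manan", "neha", "pooja", "priya",
        "rahul", "ratan", "ritu", "rohan", "roopa", "shubh", "shreya",
        "simran", "sophia", "sumit", "varun",
    }
)

# Hypothetical lookup: bulbul:v3 and bulbul:v3-beta share one speaker set.
SPEAKERS_BY_MODEL = {
    "bulbul:v2": V2_SPEAKERS,
    "bulbul:v3-beta": V3_SPEAKERS,
    "bulbul:v3": V3_SPEAKERS,
}


def validate_speaker(model: str, speaker: str) -> None:
    """Hypothetical check: raise ValueError for an incompatible pairing."""
    allowed = SPEAKERS_BY_MODEL.get(model)
    if allowed is None:
        raise ValueError(f"unknown model: {model!r}")
    if speaker not in allowed:
        raise ValueError(
            f"speaker {speaker!r} is not available for {model}; "
            f"choose one of: {', '.join(sorted(allowed))}"
        )


if __name__ == "__main__":
    validate_speaker("bulbul:v3", "shubh")  # compatible, no error
    try:
        validate_speaker("bulbul:v2", "shubh")
    except ValueError as exc:
        print(exc)
```

The default speaker `shubh` appears only in the `bulbul:v3` and `bulbul:v3-beta` lists, so switching `model` to `bulbul:v2` also requires choosing a `bulbul:v2` speaker.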
Next steps
Sarvam STT: Streaming speech-to-text for Indian languages
Sarvam LLM: Chat completions with Sarvam models

