Skip to main content
Sarvam AI provides language models built for Indian languages. The plugin uses Sarvam’s OpenAI-compatible Chat Completions endpoint and automatically strips <think> reasoning blocks from streamed output.
Vision Agents requires a Stream account for real-time transport. Get your Sarvam API key from the Sarvam dashboard.
Sarvam also provides speech-to-text and text-to-speech. You can use all three in the same agent.

Installation

uv add "vision-agents[sarvam]"

Quick start

from vision_agents.core import Agent, User
from vision_agents.plugins import sarvam, getstream, smart_turn

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Sarvam Agent", id="agent"),
    instructions="Reply in Hindi or English, whichever the user speaks.",
    llm=sarvam.LLM(model="sarvam-30b"),
    stt=sarvam.STT(language="hi-IN"),
    tts=sarvam.TTS(speaker="shubh"),
    turn_detection=smart_turn.TurnDetection(),
)
Set SARVAM_API_KEY in your environment or pass api_key directly.

Parameters

NameTypeDefaultDescription
modelstr"sarvam-m"Model id (sarvam-m, sarvam-30b, or sarvam-105b)
api_keystrNoneAPI key (defaults to SARVAM_API_KEY env var)
base_urlstr"https://api.sarvam.ai/v1"Sarvam API endpoint

Available models

ModelDescription
sarvam-mDefault model with hybrid thinking support
sarvam-30b30B parameter model for balanced performance
sarvam-105b105B parameter model for maximum capability
Sarvam-m supports “hybrid thinking” — it emits <think> reasoning blocks before the answer. The plugin automatically strips these so they don’t reach TTS.

Function calling

@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny and 30°C"
See the Function Calling guide for details.

Next steps

Sarvam STT

Streaming speech-to-text for Indian languages

Sarvam TTS

Streaming text-to-speech for Indian languages