Deploy your agents as HTTP services using the built-in FastAPI server. The server provides REST endpoints for session management, health checks, and metrics.

Quick Start

from vision_agents.core import Agent, AgentLauncher, Runner, User
from vision_agents.plugins import gemini, deepgram, elevenlabs, getstream

async def create_agent(**kwargs) -> Agent:
    return Agent(
        edge=getstream.Edge(),
        agent_user=User(name="Assistant", id="agent"),
        instructions="You're a helpful voice assistant.",
        llm=gemini.LLM("gemini-2.5-flash"),
        tts=elevenlabs.TTS(),
        stt=deepgram.STT(),
    )

async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    call = await agent.create_call(call_type, call_id)
    async with agent.join(call):
        await agent.finish()

if __name__ == "__main__":
    Runner(AgentLauncher(create_agent=create_agent, join_call=join_call)).cli()
Run the server:
python agent.py serve --host 0.0.0.0 --port 8000

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /sessions | Start a new agent session |
| GET | /sessions/{session_id} | Get session info |
| DELETE | /sessions/{session_id} | Close a session |
| POST | /sessions/{session_id}/close | Close via sendBeacon (POST alternative) |
| GET | /sessions/{session_id}/metrics | Get session metrics |
| GET | /health | Health check |
| GET | /ready | Readiness check |
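
A deployment script can poll the readiness endpoint before sending traffic. The following is a minimal sketch, assuming `/ready` answers with HTTP 200 once the server can accept sessions. The response body is not documented here, so only the status code is checked. `http_probe` and `wait_until_ready` are hypothetical helper names, not part of `vision_agents`.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as not ready.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay_s: float = 0.5,
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay_s` after each failure."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False


# Example (requires a running server):
# wait_until_ready(lambda: http_probe("http://localhost:8000/ready"))
```

Passing the probe as a callable keeps the retry loop independent of the HTTP call, so the same loop can be reused for `/health`.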

Start a Session

curl -X POST http://localhost:8000/sessions \
  -H "Content-Type: application/json" \
  -d '{"call_id": "my-call-123", "call_type": "default"}'
Response:
{
  "session_id": "abc-123",
  "call_id": "my-call-123",
  "session_started_at": "2025-01-15T10:30:00Z"
}
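
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the server from Quick Start is listening on `localhost:8000`. `build_start_session_request` and `start_session` are hypothetical helper names, not part of `vision_agents`.

```python
import json
import urllib.request


def build_start_session_request(
    base_url: str, call_id: str, call_type: str = "default"
) -> urllib.request.Request:
    """Build the POST /sessions request with the same JSON body as the curl example."""
    body = json.dumps({"call_id": call_id, "call_type": call_type}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_session(base_url: str, call_id: str, call_type: str = "default") -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_start_session_request(base_url, call_id, call_type)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Example (requires a running server):
# session = start_session("http://localhost:8000", "my-call-123")
# print(session["session_id"])
```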

Get Session Metrics

curl http://localhost:8000/sessions/abc-123/metrics
Response:
{
  "session_id": "abc-123",
  "call_id": "my-call-123",
  "session_started_at": "2025-01-15T10:30:00Z",
  "metrics_generated_at": "2025-01-15T10:35:00Z",
  "metrics": {
    "llm_latency_ms__avg": 245.5,
    "llm_time_to_first_token_ms__avg": 120.3,
    "llm_input_tokens__total": 1500,
    "llm_output_tokens__total": 800,
    "stt_latency_ms__avg": 85.2,
    "tts_latency_ms__avg": 95.1
  }
}
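
The metric keys in the sample response follow a `<component>_<metric>__<aggregation>` pattern, for example `llm_latency_ms__avg`. The following is a minimal sketch that groups a metrics payload by component, assuming that pattern holds for every key. The pattern is inferred from the sample, not from a documented schema, and `group_metrics` is a hypothetical helper.

```python
from collections import defaultdict


def group_metrics(metrics: dict) -> dict:
    """Group flat metric keys into {component: {metric: {aggregation: value}}}."""
    grouped: dict = defaultdict(dict)
    for key, value in metrics.items():
        # "llm_latency_ms__avg" -> name "llm_latency_ms", aggregation "avg"
        name, _, aggregation = key.partition("__")
        # "llm_latency_ms" -> component "llm", metric "latency_ms"
        component, _, metric = name.partition("_")
        grouped[component].setdefault(metric, {})[aggregation] = value
    return dict(grouped)


sample = {
    "llm_latency_ms__avg": 245.5,
    "llm_input_tokens__total": 1500,
    "stt_latency_ms__avg": 85.2,
}
by_component = group_metrics(sample)
```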

Permissions

Control access to endpoints with ServeOptions:
from fastapi import Depends, Header, HTTPException
from vision_agents.core import Runner, AgentLauncher, ServeOptions

async def verify_api_key(x_api_key: str = Header(...)):
    if x_api_key != "secret-key":
        raise HTTPException(status_code=401, detail="Invalid API key")

options = ServeOptions(
    can_start_session=verify_api_key,
    can_close_session=verify_api_key,
    can_view_session=verify_api_key,
    can_view_metrics=verify_api_key,
)

Runner(
    AgentLauncher(create_agent=create_agent, join_call=join_call),
    serve_options=options,
).cli()
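
The `verify_api_key` example hard-codes the key and compares it with `!=`. In a deployment, the key would more typically be read from the environment and compared in constant time. The following is a minimal standard-library sketch of that check, assuming a hypothetical `AGENT_API_KEY` environment variable. `is_valid_api_key` is an illustrative helper that `verify_api_key` could call before raising `HTTPException`.

```python
import hmac
import os
from typing import Optional


def is_valid_api_key(provided: str, expected: Optional[str] = None) -> bool:
    """Compare `provided` against the expected key in constant time."""
    if expected is None:
        # AGENT_API_KEY is an assumed variable name, not one the library defines.
        expected = os.environ.get("AGENT_API_KEY", "")
    if not expected:
        # Fail closed: with no key configured, reject every request.
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

`hmac.compare_digest` avoids leaking, through response timing, how many leading characters of a guessed key were correct.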

CORS Configuration

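from vision_agents.core import ServeOptions

options = ServeOptions(
    # Browsers reject credentialed responses when the allowed origin is "*",
    # so list each origin explicitly when cors_allow_credentials is True.
    cors_allow_origins=["https://myapp.com"],
    cors_allow_methods=["GET", "POST", "DELETE"],
    cors_allow_headers=["Authorization", "Content-Type"],
    cors_allow_credentials=True,
)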

CLI Options

python agent.py serve --help

| Option | Default | Description |
| --- | --- | --- |
| --host | 127.0.0.1 | Server host |
| --port | 8000 | Server port |
| --agents-log-level | INFO | Log level for agents |
| --http-log-level | INFO | Log level for HTTP server |

Custom FastAPI App

For full control, pass your own FastAPI instance:
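from fastapi import FastAPI
from vision_agents.core import ServeOptions

app = FastAPI(title="My Agent API")

# A custom route defined on your own app instance.
@app.get("/custom")
async def custom_endpoint():
    return {"status": "ok"}

# Hand the app to the server through ServeOptions.
options = ServeOptions(fast_api=app)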

Next Steps