
# Quickstart

> Build and run your first AI voice agent in under 5 minutes

You'll build a real-time voice agent you can talk to in your browser, using [Gemini Realtime](https://ai.google.dev/gemini-api/docs/live) on [Stream's](https://getstream.io/video/) edge network. The whole agent is about 18 lines of Python.

<Prompt description="Copy this prompt into Claude Code, Cursor, Windsurf, or any coding agent to scaffold your project." actions={["copy", "cursor"]}>
  {`Create a Python project for a Vision Agents voice assistant using uv and Python 3.12.

    Steps:
    1. Initialize: uv init --python 3.12 my-agent && cd my-agent && uv add "vision-agents[getstream,gemini]" python-dotenv
    2. Create .env with: STREAM_API_KEY, STREAM_API_SECRET (from getstream.io), GOOGLE_API_KEY (from aistudio.google.com)
    3. Create main.py:

    from dotenv import load_dotenv
    from vision_agents.core import Agent, AgentLauncher, User, Runner
    from vision_agents.plugins import getstream, gemini

    load_dotenv()

    async def create_agent(**kwargs) -> Agent:
        return Agent(
            edge=getstream.Edge(),
            agent_user=User(name="Assistant", id="agent"),
            instructions="You're a helpful voice assistant. Be concise.",
            llm=gemini.Realtime(),
        )

    async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
        call = await agent.create_call(call_type, call_id)
        async with agent.join(call):
            await agent.simple_response("Greet the user")
            await agent.finish()

    if __name__ == "__main__":
        Runner(AgentLauncher(create_agent=create_agent, join_call=join_call)).cli()

    4. Run with: uv run main.py run

    Reference docs: https://visionagents.ai
    MCP server: https://visionagents.ai/mcp
    Skill.md: https://visionagents.ai/skill.md`}
</Prompt>

## Build your agent

<Steps>
  <Step title="Set up your project" icon="folder-plus">
    Create a project directory and install Vision Agents with the `getstream` and `gemini` plugins. If you don't have **[uv](https://docs.astral.sh/uv/getting-started/installation/)** yet, install it first.

    ```bash
    uv init --python 3.12 my-agent && cd my-agent
    uv add "vision-agents[getstream,gemini]" python-dotenv
    ```
  </Step>

  <Step title="Add your API keys" icon="key">
    Get the keys you'll need:

    * Create a free **[Stream account](https://getstream.io/try-for-free/)** for `STREAM_API_KEY` and `STREAM_API_SECRET`.
    * Get an API key from **[Google AI Studio](https://aistudio.google.com/)** for `GOOGLE_API_KEY`.

    Then create a `.env` file in the project root. `load_dotenv()` reads it at startup, and each plugin picks up its own keys from the environment, so you never pass them in code.

    ```bash .env
    STREAM_API_KEY=your_stream_api_key
    STREAM_API_SECRET=your_stream_api_secret
    GOOGLE_API_KEY=your_google_api_key
    ```
  </Step>

  <Step title="Write the agent" icon="file-code">
    Create `main.py`. The agent joins a Stream call and responds via Gemini Realtime.

    ```python main.py
    from dotenv import load_dotenv

    from vision_agents.core import Agent, AgentLauncher, User, Runner
    from vision_agents.plugins import getstream, gemini

    load_dotenv()


    async def create_agent(**kwargs) -> Agent:
        return Agent(
            edge=getstream.Edge(),
            agent_user=User(name="Assistant", id="agent"),
            instructions="You're a helpful voice assistant. Be concise.",
            llm=gemini.Realtime(),
        )


    async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
        call = await agent.create_call(call_type, call_id)
        async with agent.join(call):
            await agent.simple_response("Greet the user")
            await agent.finish()


    if __name__ == "__main__":
        Runner(AgentLauncher(create_agent=create_agent, join_call=join_call)).cli()
    ```
  </Step>

  <Step title="Run it" icon="play">
    Start the agent. The CLI prints a join link; open it to talk to your agent in the browser.

    ```bash
    uv run main.py run
    ```

    The agent greets you as soon as you join the call.
  </Step>
</Steps>
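
Before launching, it can help to confirm that the three keys from the `.env` step are actually set. The following is a minimal sketch using only the standard library. The `missing_keys` and `check_or_exit` helpers are illustrative names, not part of Vision Agents, and the sketch assumes the variables are already in the process environment (for example, after `load_dotenv()` has run).

```python
import os
from typing import Iterable, Mapping

# Keys named in the "Add your API keys" step.
REQUIRED_KEYS = ("STREAM_API_KEY", "STREAM_API_SECRET", "GOOGLE_API_KEY")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in required if not env.get(key, "").strip()]


def check_or_exit() -> None:
    """Hypothetical helper: stop with a readable message if any key is missing."""
    missing = missing_keys(os.environ)
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```

To use it, call `check_or_exit()` right after `load_dotenv()` in `main.py`. It reads from `os.environ` rather than parsing the `.env` file itself, so it has to run after the file has been loaded.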

## Next steps

<CardGroup cols={2}>
  <Card title="Voice Agents" icon="microphone" href="/introduction/voice-agents">
    Custom STT/LLM/TTS pipelines, function calling, provider options
  </Card>

  <Card title="Video Agents" icon="video" href="/introduction/video-agents">
    VLMs, YOLO processors, real-time video analysis
  </Card>

  <Card title="Deploy to Production" icon="server" href="/guides/deploying-overview">
    Docker, Kubernetes, and monitoring
  </Card>

  <Card title="Browse Integrations" icon="plug" href="/integrations/introduction-to-integrations">
    25+ AI providers to mix and match
  </Card>
</CardGroup>
