> ## Documentation Index
> Fetch the complete documentation index at: https://visionagents.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Gemini LLM

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/8lA6bF2EnvA" title="Gemini integration" frameBorder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />

[Google's Gemini](https://ai.google.dev/gemini-api/docs/live) provides powerful language models with built-in tools for search, code execution, RAG, and URL context. The LLM mode requires separate STT/TTS.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

<Tip>
  Gemini also provides [Realtime speech-to-speech](/integrations/realtime/gemini) with optional video over WebSocket.
</Tip>

## Installation

```sh theme={null}
uv add "vision-agents[gemini]"
```

## Quick Start

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import gemini, getstream, deepgram, elevenlabs

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM(model="gemini-3.1-pro-preview"),
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(),
)
```

## Built-in Tools

Gemini provides built-in tools you can enable:

```python theme={null}
llm = gemini.LLM(
    model="gemini-3-flash-preview",
    tools=[
        gemini.tools.GoogleSearch(),
        gemini.tools.CodeExecution(),
        gemini.tools.FileSearch(store),  # RAG
        gemini.tools.URLContext(),
    ]
)
```

| Tool            | Description                    |
| --------------- | ------------------------------ |
| `GoogleSearch`  | Ground responses with web data |
| `CodeExecution` | Run Python code                |
| `FileSearch`    | RAG over your documents        |
| `URLContext`    | Read specific web pages        |

## File Search (RAG)

Managed RAG with automatic chunking and retrieval:

```python theme={null}
from vision_agents.plugins import gemini

store = gemini.GeminiFilesearchRAG(name="my-knowledge-base")
await store.create()
await store.add_directory("./knowledge")

llm = gemini.LLM(
    model="gemini-3-flash-preview",
    tools=[gemini.tools.FileSearch(store)]
)
```

See the [RAG guide](/guides/rag) for more details.

## Function Calling

```python theme={null}
@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> dict:
    return {"temperature": "22°C", "condition": "Sunny"}
```

See the [Function Calling guide](/guides/mcp-tool-calling) for details.

## Events

The Gemini plugin no longer documents provider-specific event classes for normal app code. Prefer core events and provider-agnostic response handling.

## Next Steps

<CardGroup cols={2}>
  <Card title="Gemini Realtime" icon="bolt" href="/integrations/realtime/gemini">
    Speech-to-speech with optional video
  </Card>

  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>
</CardGroup>
