# Gemini LLM

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/8lA6bF2EnvA" title="Gemini integration" frameBorder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />

[Google's Gemini](https://ai.google.dev/gemini-api/docs/live) provides language models with built-in tools for search, code execution, RAG, and URL context. The LLM mode does not handle speech itself, so a voice agent needs separate speech-to-text (STT) and text-to-speech (TTS) plugins.

<Info>
  Vision Agents requires a [Stream](https://getstream.io/try-for-free/) account
  for real-time transport. Most providers offer free tiers to get started.
</Info>

<Tip>
  Gemini also provides [Realtime speech-to-speech](/integrations/realtime/gemini) with optional video over WebSocket.
</Tip>

## Installation

```sh theme={null}
uv add "vision-agents[gemini]"
```

## Quick Start

```python theme={null}
from vision_agents.core import Agent, User
from vision_agents.plugins import gemini, getstream, deepgram, elevenlabs

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-3-flash-preview"),
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(),
)
```

## Built-in Tools

Enable Gemini's built-in tools by passing them in the `tools` list:

```python theme={null}
llm = gemini.LLM(
    model="gemini-3-flash-preview",
    tools=[
        gemini.tools.GoogleSearch(),
        gemini.tools.CodeExecution(),
        gemini.tools.FileSearch(store),  # RAG; `store` is a GeminiFilesearchRAG (see File Search below)
        gemini.tools.URLContext(),
    ]
)
```

| Tool            | Description                    |
| --------------- | ------------------------------ |
| `GoogleSearch`  | Ground responses with web data |
| `CodeExecution` | Run Python code                |
| `FileSearch`    | RAG over your documents        |
| `URLContext`    | Read specific web pages        |

## File Search (RAG)

File Search is managed RAG with automatic chunking and retrieval. Create a store, add your documents, then pass the store to the `FileSearch` tool:

```python theme={null}
from vision_agents.plugins import gemini

store = gemini.GeminiFilesearchRAG(name="my-knowledge-base")
await store.create()
await store.add_directory("./knowledge")

llm = gemini.LLM(
    model="gemini-3-flash-preview",
    tools=[gemini.tools.FileSearch(store)]
)
```

See the [RAG guide](/guides/rag) for more details.
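
The snippet above uses `await` at the top level, so it has to run inside a coroutine. A minimal sketch of one way to wrap it, using only the calls shown above; the `build_rag_llm` helper name, its `knowledge_dir` parameter, and the deferred import are illustrative choices, not part of the plugin:

```python
async def build_rag_llm(knowledge_dir: str = "./knowledge"):
    """Create a File Search store, add a directory, and return an LLM that uses it.

    `build_rag_llm` is a hypothetical helper name used for illustration.
    """
    # Imported lazily so this module can be loaded without the plugin installed.
    from vision_agents.plugins import gemini

    store = gemini.GeminiFilesearchRAG(name="my-knowledge-base")
    await store.create()                      # create the store
    await store.add_directory(knowledge_dir)  # add the files in the directory

    return gemini.LLM(
        model="gemini-3-flash-preview",
        tools=[gemini.tools.FileSearch(store)],
    )


# Usage, from within your own async setup code:
# llm = await build_rag_llm()
```

Keeping the store setup in one coroutine makes it easy to call once at startup and pass the resulting LLM to the `Agent` constructor.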

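## Function Calling

Register a Python function as a tool the model can call:

```python theme={null}
@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> dict:
    # Placeholder values; replace with a real weather lookup.
    return {"temperature": "22°C", "condition": "Sunny"}
```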

See the [Function Calling guide](/guides/mcp-tool-calling) for details.
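
A tool function is an ordinary coroutine, so it can be written and checked on its own before it is registered. A minimal sketch with a slightly more defensive body; the `_FAKE_WEATHER` table is made-up sample data standing in for a real weather API, and the error-dict shape is an assumption rather than something the plugin requires:

```python
import asyncio

# Made-up sample data; a real tool would call a weather service here.
_FAKE_WEATHER = {
    "amsterdam": {"temperature": "22°C", "condition": "Sunny"},
    "boulder": {"temperature": "15°C", "condition": "Cloudy"},
}


async def get_weather(location: str) -> dict:
    """Return weather for a known location, or an error message the model can relay."""
    key = location.strip().lower()
    if key not in _FAKE_WEATHER:
        # Assumed shape: a structured error the model can explain to the user.
        return {"error": f"No weather data for {location!r}"}
    return _FAKE_WEATHER[key]


# Registering is the same call as the decorator form shown above:
# agent.llm.register_function(description="Get weather for a location")(get_weather)

# Quick local check without an agent.
print(asyncio.run(get_weather("Amsterdam")))
```

Normalizing the input and returning a structured error, rather than raising, is one design choice among several; it keeps the tool's output predictable for the model.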

## Events

The Gemini plugin emits its own events for connection state, errors, and responses. To keep code provider-agnostic, prefer the core events ([LLMResponseCompletedEvent](/reference/events-reference#llmresponsecompletedevent), etc.) and subscribe to the Gemini-specific events only when you need provider details.

```python theme={null}
from vision_agents.plugins.gemini.events import (
    GeminiConnectedEvent,
    GeminiErrorEvent,
)

@agent.events.subscribe
async def on_gemini_connected(event: GeminiConnectedEvent):
    print(f"Connected to Gemini model: {event.model}")

@agent.events.subscribe
async def on_gemini_error(event: GeminiErrorEvent):
    print(f"Gemini error: {event.error}")
```

| Event                  | Description                     |
| ---------------------- | ------------------------------- |
| `GeminiConnectedEvent` | Realtime connection established |
| `GeminiErrorEvent`     | Error occurred                  |
| `GeminiAudioEvent`     | Audio output received           |
| `GeminiTextEvent`      | Text output received            |
| `GeminiResponseEvent`  | Response chunk received         |
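
In a long-running agent it is often more useful to log errors than to print them. A minimal sketch, assuming only the `error` attribute used in the handler above; `describe_gemini_error` and `register_error_logging` are hypothetical helper names, not part of the plugin:

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("gemini")


def describe_gemini_error(event) -> str:
    """Build a log line from an error event; only `event.error` is read."""
    return f"Gemini error: {event.error}"


def register_error_logging(agent) -> None:
    """Hypothetical helper: subscribe a handler that logs Gemini errors."""
    # Imported lazily so this module loads without the plugin installed.
    from vision_agents.plugins.gemini.events import GeminiErrorEvent

    @agent.events.subscribe
    async def on_gemini_error(event: GeminiErrorEvent):
        logger.error(describe_gemini_error(event))


# Stand-in object with the same `error` attribute, for a quick local check.
fake_event = SimpleNamespace(error="quota exceeded")
print(describe_gemini_error(fake_event))
```

Separating the message formatting from the subscription keeps the formatting testable without a live connection.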

## Next Steps

<CardGroup cols={2}>
  <Card title="Gemini Realtime" icon="bolt" href="/integrations/realtime/gemini">
    Speech-to-speech with optional video
  </Card>

  <Card title="Build a Voice Agent" icon="microphone" href="/introduction/voice-agents">
    Get started with voice
  </Card>
</CardGroup>
