Give your agents access to documents, URLs, and knowledge bases using Retrieval-Augmented Generation (RAG).
Vision Agents requires a Stream account for real-time transport.

Options

| Option | Best For | Complexity |
| --- | --- | --- |
| Gemini File Search | Quick setup, automatic chunking/embedding | Simple |
| TurboPuffer | Full control, hybrid search, production | More setup |

Gemini File Search

Gemini’s File Search handles chunking, embedding, and retrieval automatically.
from vision_agents.plugins import gemini

# Create and populate a file search store (run inside an async function)
store = gemini.GeminiFilesearchRAG(name="my-knowledge-base")
await store.create()  # Reuses existing store if found
await store.add_directory("./knowledge")  # Skips duplicates via content hash

# Use with Gemini LLM
llm = gemini.LLM(
    model="gemini-2.5-flash",
    tools=[gemini.tools.FileSearch(store)]
)
Features:
  • Store reuse (finds existing stores by name)
  • Content deduplication via SHA-256 hash
  • Concurrent batch uploads
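
To make the deduplication idea concrete, here is a minimal sketch of skipping duplicates by content hash. It is not the plugin's actual implementation; `content_hash` and `filter_new_documents` are hypothetical names used only for illustration.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def filter_new_documents(docs: dict[str, bytes], seen: set[str]) -> list[str]:
    """Return the names of documents whose content has not been seen yet.

    `seen` holds hashes of already-uploaded content and is updated in place,
    so identical bytes under a different file name are uploaded only once.
    (Hypothetical helper, not part of the Vision Agents API.)
    """
    fresh = []
    for name, data in docs.items():
        digest = content_hash(data)
        if digest in seen:
            continue  # same content already uploaded, skip it
        seen.add(digest)
        fresh.append(name)
    return fresh
```

Because the hash is computed over the file contents rather than the file name, renaming a file does not trigger a second upload, while any edit to the contents does.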

TurboPuffer

TurboPuffer provides hybrid search combining vector (semantic) and BM25 (keyword) search with Reciprocal Rank Fusion.
from vision_agents.plugins import turbopuffer, gemini

# Initialize with hybrid search (run inside an async function)
rag = turbopuffer.TurboPufferRAG(
    namespace="my-knowledge",
    chunk_size=10000,
    chunk_overlap=200,
)
await rag.add_directory("./knowledge")

# Register as function for LLM
llm = gemini.LLM("gemini-2.5-flash")

@llm.register_function(description="Search the knowledge base")
async def search_knowledge(query: str) -> str:
    return await rag.search(query, top_k=5, mode="hybrid")
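
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique: each document scores the sum of `1 / (k + rank)` over every ranked list it appears in. This illustrates the algorithm only, not TurboPuffer's or the plugin's internals; the function name and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives sum(1 / (k + rank)) over the lists that contain it,
    where rank starts at 1. `k` dampens the influence of top positions;
    60 is an assumed default, not a value taken from TurboPuffer.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Example: fuse a vector-search ranking with a BM25 ranking.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Documents that appear in both lists (`b` and `c` here) rise above documents that rank highly in only one, which is the main reason to fuse semantic and keyword results.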

RAG Pipeline Overview

For custom implementations, a typical RAG pipeline involves:
  1. Document gathering: URLs, folders, PDFs, external APIs
  2. Parsing: Convert to text (markdownify, BeautifulSoup, OCR)
  3. Chunking: Split into retrievable pieces (fixed size, semantic, recursive)
  4. Embedding: Convert text to vectors (the MTEB leaderboard compares embedding models)
  5. Vector storage: Store embeddings for similarity search
  6. Hybrid search: Combine vector + full-text search (see the TurboPuffer guide)
  7. Reranking: Score and filter results before passing to LLM
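
The chunking step can be sketched as a fixed-size splitter with overlap, the simplest of the strategies listed. This is a minimal illustration under stated assumptions: `chunk_text` is a hypothetical helper, and sizes are measured in characters, which may differ from how a given library counts them.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Consecutive chunks share `chunk_overlap` characters so that a sentence
    cut at a boundary still appears whole in at least one chunk.
    (Hypothetical helper for illustration; units are characters.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Semantic and recursive chunkers follow the same contract (text in, list of pieces out) but choose boundaries from document structure rather than a fixed offset.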

Comparison

| Feature | Gemini File Search | TurboPuffer |
| --- | --- | --- |
| Setup | Simple | More setup |
| Chunking | Automatic | Configurable |
| Search | Managed | Hybrid (vector + BM25) |
| Control | Less | Full control |
| Cost | Included with Gemini | Separate service |
| Best for | Prototypes | Production with custom needs |

Next Steps