MiniMax

MiniMax provides powerful large language models, including the latest MiniMax-M3 agentic reasoning model and the M-series lineup. Use them with Vision Agents via the dedicated minimax plugin, which wraps MiniMax’s OpenAI-compatible Chat Completions API.

Vision Agents uses Stream Video for real-time WebRTC transport by default. External WebRTC transports are supported as well. Most AI providers offer free tiers to get started.

Get your MiniMax API key from the MiniMax Platform.

Installation

uv add vision-agents["minimax"]

Quick Start

from vision_agents.core import Agent, User
from vision_agents.plugins import minimax, getstream, deepgram, elevenlabs

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=minimax.LLM(),  # defaults to MiniMax-M3
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(),
)

Set MINIMAX_API_KEY in your environment or pass api_key directly. The plugin also honors MINIMAX_BASE_URL if you proxy the API.When deploying to Asia, pair MiniMax with the Tencent RTC edge transport for the lowest end-to-end latency, and point at the in-region MiniMax endpoint https://api.minimaxi.com/v1 using a key from platform.minimaxi.com.

from vision_agents.core import Agent, User
from vision_agents.plugins import minimax, tencent, deepgram, elevenlabs

agent = Agent(
    edge=tencent.Edge(),  # low-latency edge in mainland China and Asia
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=minimax.LLM(
        model="MiniMax-M3",
        base_url="https://api.minimaxi.com/v1",
    ),
    stt=deepgram.STT(),
    tts=elevenlabs.TTS(),
)

Parameters

Name	Type	Default	Description
`model`	`str`	`"MiniMax-M3"`	Model identifier (see available models below)
`api_key`	`str`	`None`	API key (defaults to `MINIMAX_API_KEY` env var)
`base_url`	`str`	`"https://api.minimax.io/v1"`	MiniMax API endpoint (overridable via `MINIMAX_BASE_URL`)
`client`	`AsyncOpenAI`	`None`	Optional pre-configured `AsyncOpenAI` client for dependency injection
`max_tokens`	`int`	`None`	Upper limit for response tokens
`tools_max_rounds`	`int`	`3`	Max calling rounds for multi-hop tool calls (must be `>= 1`)

Available Models

These models are supported through the OpenAI-compatible API:

Model	Context	Description
`MiniMax-M3` (default)	512K	Latest flagship for agentic reasoning, tool use, coding, and long context; supports image input
`MiniMax-M2.7`	205K	Previous-generation flagship; ~60 tps output
`MiniMax-M2.7-highspeed`	205K	Same as M2.7 with faster output (~100 tps)

For MiniMax-M3, MiniMax recommends temperature=1.0 (the API rejects 0.0, so the plugin defaults to 1.0) and top_p=0.95. M3 also supports multimodal input (images and videos) and optional deep thinking via the thinking parameter (adaptive by default). The response_format field is not supported by MiniMax and is intentionally not exposed. See the MiniMax OpenAI API reference for reasoning_split, streaming usage, and other supported parameters.

Function Calling

MiniMax models support function calling with automatic tool invocation:

@agent.llm.register_function(description="Get weather for a location")
async def get_weather(location: str) -> str:
    return f"The weather in {location} is sunny and 72°F"

In multi-turn tool conversations, preserve the full assistant message (including tool_calls and any reasoning content) in the conversation history so the reasoning chain stays intact. Vision Agents handles this when using registered functions on the agent. See the Function Calling guide for details.

Installation

Quick Start

Parameters

Available Models

Function Calling

Next Steps

Build a Voice Agent

Build a Video Agent

​Installation

​Quick Start

​Parameters

​Available Models

​Function Calling

​Next Steps

Build a Voice Agent

Build a Video Agent

Installation

Quick Start

Parameters

Available Models

Function Calling

Next Steps