Building with Persistent Conversations
For production applications, the Agent uses `StreamConversation` automatically. This uses our Chat API under the hood, which calls an ephemeral endpoint that collects the events from the LLM responses before persisting them. This ensures that LLM responses are still streamed to the user in real time, allowing for smooth UI updates without hurting performance or rate limits by writing to the database on every token.
Since this is the default strategy, no additional setup is required to use conversations with Stream. The following example automatically creates a chat channel under the hood, linked to the call ID, and persists the conversation as both the user and the bot speak. No additional accounts or API keys are required.
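For instance, constructing an `Agent` without passing any conversation is enough to get persistence. This is a minimal sketch; the plugin names (`getstream.Edge`, `gemini.Realtime`) and constructor arguments shown here are assumptions based on the framework's typical examples, so adapt them to your installed version:

```python
from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, gemini

# No conversation argument is passed, so StreamConversation is the default:
# joining a call creates a linked chat channel and persists messages
# automatically as both sides speak.
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="AI Assistant", id="agent"),
    instructions="You're a helpful voice assistant.",
    llm=gemini.Realtime(),
)
```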
Building with In-Memory Conversations
For development and testing, you can use `InMemoryConversation` to store messages locally. This approach is perfect for prototyping and doesn’t require any external services.
Let’s build a simple example using in-memory conversations. First, we’ll need to install the required dependencies:
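For example, with uv (the package spec below, including the plugin extras, is an assumption; adjust it to the plugins you actually use):

```bash
uv init voice-agent && cd voice-agent
uv add "vision-agents[getstream,gemini]" python-dotenv
```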
Once the dependencies are installed, in our `main.py` file, we can start by importing the packages required for our project:
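A plausible set of imports for this walkthrough; the `InMemoryConversation` import path is an assumption, so adjust it to wherever your installed version exposes the class:

```python
import asyncio
from uuid import uuid4

from dotenv import load_dotenv

from vision_agents.core import Agent, User
from vision_agents.core.agents.conversation import InMemoryConversation  # path assumed
from vision_agents.plugins import getstream, gemini

# Load the Stream and Gemini credentials from .env
load_dotenv()
```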
Next, we can add the `.env` variables required for our sample. Since we are running the Gemini model in this example, you will need to have the following in your `.env`:
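The variable names below are the conventional defaults for the Stream and Gemini SDKs (some Gemini SDK versions read `GEMINI_API_KEY` instead); confirm them against your plugin documentation:

```
STREAM_API_KEY=your_stream_api_key
STREAM_API_SECRET=your_stream_api_secret
GOOGLE_API_KEY=your_gemini_api_key
```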
Both Stream and Google offer free API keys. For Gemini, developers can get a free API key on Google’s AI Studio, while Stream developers can get theirs on the Stream Dashboard.
Next, we can create the `start_agent` function where most of our code will live. In this method, we can set up the `Agent` with conversation support:
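A minimal sketch, assuming the `Agent` constructor accepts a `conversation` argument and the call/join pattern shown here; check your version's examples for the exact signatures:

```python
async def start_agent() -> None:
    agent = Agent(
        edge=getstream.Edge(),
        agent_user=User(name="My AI friend", id="agent"),
        instructions="Keep answers short and conversational.",
        llm=gemini.Realtime(),
        # Assumed kwarg: store the history locally instead of in Stream Chat.
        conversation=InMemoryConversation(),
    )

    await agent.create_user()

    # Create a call with a random ID and join it as the agent.
    call = agent.edge.client.video.call("default", str(uuid4()))
    with await agent.join(call):
        # Keep the agent running until the call ends.
        await agent.finish()


if __name__ == "__main__":
    asyncio.run(start_agent())
```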
Finally, we can run our example with `uv run main.py`, which kicks off the agent with conversation memory and automatically opens the Stream Video demo app as the UI.
Advanced Conversation Features
Both conversation types support advanced features like streaming messages and real-time updates. The system automatically handles:

- Message Structure: Each message includes content, role, user_id, timestamp, and a unique ID
- Streaming Support: Real-time message updates as the LLM generates responses
- Event Integration: Integration with the framework’s event system
- Thread Safety: Background workers handle API calls asynchronously
These features work automatically when conversations are managed through the `Agent` class. Messages are stored and retrieved transparently, providing context to your LLM for more natural conversations.
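For illustration, here is a hypothetical way to inspect the stored history after a session. The `messages` accessor is an assumption; the field names mirror the message structure listed above:

```python
# Hypothetical: print the in-memory history in insertion order.
for message in agent.conversation.messages:
    print(f"[{message.timestamp}] {message.role} ({message.user_id}): {message.content}")
```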
Custom Conversation Implementations
If you need custom conversation behavior, you can implement the `Conversation` abstract base class:
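As a sketch of what that might look like: the import path, the `Message` type, and the method names (`add_message`, `get_messages`) below are assumptions for illustration; override whichever abstract methods the base class actually declares:

```python
from vision_agents.core.agents.conversation import Conversation, Message  # path assumed


class ListBackedConversation(Conversation):
    """Toy Conversation that keeps messages in a plain Python list."""

    def __init__(self):
        self._messages: list[Message] = []

    async def add_message(self, message: Message) -> None:
        # Swap this out for Redis, Postgres, a file, or any other store.
        self._messages.append(message)

    async def get_messages(self) -> list[Message]:
        # Return the history in insertion order so the LLM sees full context.
        return list(self._messages)
```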