Installation
Install the Moondream plugin with your Python package manager.
Choosing the Right Processor
CloudDetectionProcessor (Recommended for Most Users)
- Use when: You want a simple setup with no infrastructure management
- Pros: No model download, no GPU required, automatic updates
- Cons: Requires an API key; 2 RPS rate limit by default (can be increased)
- Best for: Development, testing, low-to-medium volume applications
 
LocalDetectionProcessor (For Advanced Users)
- Use when: You need higher throughput, have your own GPU infrastructure, or want to avoid rate limits
- Pros: No rate limits, no API costs, full control over hardware
- Cons: Requires a GPU for best performance, downloads the model on first use, requires infrastructure management
- Best for: Production deployments, high-volume applications, custom infrastructure
 
Quick Start
Using CloudDetectionProcessor (Hosted)
The CloudDetectionProcessor uses Moondream’s hosted API. It requires an API key and is limited to 2 RPS (requests per second) by default. To raise the limit, contact the Moondream team.
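Callers that issue requests faster than the default 2 RPS limit may want to throttle on the client side. The following is a minimal sketch of one way to do that, not part of the plugin: `MinIntervalLimiter` is a hypothetical helper that spaces calls at least 1/rate seconds apart. Whether the processor already throttles internally is not stated here.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: spaces calls at least 1/rate seconds apart."""

    def __init__(self, rate_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        if rate_per_second <= 0:
            raise ValueError("rate_per_second must be positive")
        self.min_interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot relative to when this call was released.
        self._next_allowed = now + self.min_interval
        return delay
```

Calling `limiter.wait()` immediately before each hosted request keeps a single-threaded caller at or below the configured rate. The `clock` and `sleep` arguments exist only so the behavior can be checked without real delays.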
To initialize without passing in the API key, make sure MOONDREAM_API_KEY is available as an environment variable.
You can do this either by defining it in a .env file or by exporting it directly in your terminal.
Using LocalDetectionProcessor (On-Device)
If you are running on your own infrastructure or using a service like DigitalOcean’s Gradient AI GPUs, you can use the LocalDetectionProcessor, which downloads the model from Hugging Face and runs it on device.
The moondream3-preview model is gated and requires Hugging Face authentication:
- Request access at https://huggingface.co/moondream/moondream3-preview
- Set the HF_TOKEN environment variable: export HF_TOKEN=your_token_here
- Or run: huggingface-cli login
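Before starting a long-running process, it can help to confirm that the expected variables are actually visible to Python. The sketch below uses only the standard library; `missing_env_vars` is a hypothetical helper name, not something the plugin provides.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or blank in the environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]
```

For example, `missing_env_vars(["HF_TOKEN"])` returns an empty list when the token is exported. Note that a login performed with huggingface-cli login is stored outside the environment, so a missing HF_TOKEN does not by itself mean authentication will fail. The same check can be applied to MOONDREAM_API_KEY for the hosted processor.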
Detect Multiple Objects
Both processors support zero-shot detection of multiple object types simultaneously. Pass a list of object names as detect_objects.
Configuration
CloudDetectionProcessor Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| api_key | str or None | None | API key for Moondream Cloud API. If not provided, will attempt to read from MOONDREAM_API_KEY environment variable. |
| detect_objects | str or List[str] | "person" | Object(s) to detect using zero-shot detection. Can be any object name like “person”, “car”, “basketball”. |
| conf_threshold | float | 0.3 | Confidence threshold for detections. |
| fps | int | 30 | Frame processing rate. |
| interval | int | 0 | Processing interval in seconds. |
| max_workers | int | 10 | Thread pool size for CPU-intensive operations. |
By default, the Moondream Cloud API has a 2 RPS (requests per second) rate limit. Contact the Moondream team to request a higher limit.
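To illustrate how the parameters above fit together, here is a small sketch that assembles keyword arguments using the names and defaults from the table. `cloud_processor_kwargs` is a hypothetical helper, and the check that conf_threshold lies in [0, 1] is an assumption; the table does not state a valid range.

```python
def cloud_processor_kwargs(detect_objects="person", conf_threshold=0.3, api_key=None):
    """Build keyword arguments using names and defaults from the parameter table."""
    # The table allows either a single label or a list of labels.
    if isinstance(detect_objects, str):
        labels = [detect_objects]
    else:
        labels = list(detect_objects)
    if not labels or not all(isinstance(label, str) and label.strip() for label in labels):
        raise ValueError("detect_objects must contain at least one non-empty label")
    # Assumption: confidence scores are probabilities in [0, 1].
    if not 0.0 <= conf_threshold <= 1.0:
        raise ValueError("conf_threshold is assumed to lie in [0, 1]")
    kwargs = {"detect_objects": labels, "conf_threshold": conf_threshold}
    # Omit api_key so the processor can fall back to MOONDREAM_API_KEY.
    if api_key is not None:
        kwargs["api_key"] = api_key
    return kwargs
```

With the processor class imported as described in the plugin's installation instructions, the result could be passed along as `CloudDetectionProcessor(**cloud_processor_kwargs(["person", "car"]))`. Leaving api_key out relies on the documented environment-variable fallback.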
LocalDetectionProcessor Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| detect_objects | str or List[str] | "person" | Object(s) to detect using zero-shot detection. Can be any object name like “person”, “car”, “basketball”. |
| conf_threshold | float | 0.3 | Confidence threshold for detections. |
| fps | int | 30 | Frame processing rate. |
| interval | int | 0 | Processing interval in seconds. |
| max_workers | int | 10 | Thread pool size for CPU-intensive operations. |
| device | str or None | None | Device to run inference on (‘cuda’, ‘mps’, or ‘cpu’). Auto-detects CUDA, then MPS (Apple Silicon), then defaults to CPU. |
| model_name | str | "moondream/moondream3-preview" | Hugging Face model identifier. |
| options | AgentOptions or None | None | Model directory configuration. If not provided, the default options are used, which place the model directory under tempfile.gettempdir(). |
Performance will vary depending on your hardware configuration. CUDA is recommended for best performance on NVIDIA GPUs. The model will be downloaded from Hugging Face on first use.
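The device fallback order described in the table (CUDA, then MPS, then CPU) can be sketched as follows. This is an illustration rather than the plugin's actual code: `pick_device` is a hypothetical name, and it assumes PyTorch is the inference backend, falling back to CPU when PyTorch is not installed.

```python
def pick_device(preferred=None):
    """Return 'cuda', 'mps', or 'cpu', following the documented fallback order."""
    if preferred is not None:
        return preferred  # an explicit choice skips auto-detection
    try:
        import torch  # assumption: PyTorch is the inference backend
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may not expose the MPS backend at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

Running `pick_device()` on the target machine before deployment is a quick way to see which device auto-detection would likely select when device is left as None.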
Video Publishing
Both processors publish annotated video frames with bounding boxes drawn on detected objects.
Use Cases
The Moondream plugin enables a wide range of computer vision applications:
- Retail Analytics: Track customer movement and product interactions
- Security & Surveillance: Detect specific objects or people in real-time
- Sports Analysis: Track players, balls, and equipment
- Warehouse Management: Monitor inventory and equipment
- Accessibility: Describe surroundings for visually impaired users
- Smart Home: Detect pets, packages, or specific objects
 

