Introduction
ModelCub is a local-first MLOps platform for computer vision. It provides everything you need to manage datasets, annotate images, train models, and deploy them, all on your own infrastructure.
What is ModelCub?
ModelCub is an open-source alternative to cloud platforms like Roboflow. It is designed to give you:
- Dataset Management: Import from YOLO, Roboflow, COCO, or raw images
- Annotation Tools: Built-in canvas-based labeling interface
- Version Control: Git-like workflows for datasets
- Training Integration: YOLO v8/v11 with auto-configuration
- Model Deployment: Export to ONNX, TensorRT, CoreML
- Three Interfaces: CLI, Python SDK, and Web UI
All while keeping your data 100% local and 100% private.
Why ModelCub?
Problem: Cloud Lock-In
Traditional platforms like Roboflow:
- Charge $500 to $8,000 per month
- Store your data on their servers
- Make it difficult to switch providers
- Are often unsuitable for sensitive data (medical, defense) that must stay on-premise
Problem: DIY Fragmentation
Building your own stack:
- Label Studio + Ultralytics + custom scripts
- No integration between tools
- No version control for datasets
- Hard to reproduce experiments
Solution: ModelCub
A complete, integrated platform that:
- Runs entirely on your infrastructure
- Costs $0 (free and open source)
- Provides professional tooling
- Enables reproducible workflows
Core Features
1. Privacy-First
Your data never leaves your machine:
- ✅ Works 100% offline
- ✅ No telemetry or tracking
- ✅ No account required
- ✅ HIPAA/GDPR friendly
- ✅ Perfect for air-gapped environments
2. Complete Workflow
Everything in one tool:
# Import dataset
modelcub dataset add --source ./data --name v1
# Annotate
modelcub ui # Opens annotation interface
# Train
modelcub train v1 --model yolov11n
# Deploy
modelcub export --format onnx
3. Version Control
Git-like semantics for datasets:
# Commit changes
modelcub commit "Added 100 new samples"
# View history
modelcub history
# Compare versions
modelcub diff v1 v2
# Rollback
modelcub checkout v1
4. Three Interfaces
CLI for automation:
modelcub dataset add --source ./data --name v1
modelcub dataset info v1
Python SDK for notebooks:
from modelcub import Project, Dataset
project = Project.init("my-project")
dataset = Dataset.from_yolo("./data", name="v1")
stats = dataset.stats()
Web UI for visual work:
modelcub ui # Opens at localhost:8000
All three interfaces use the same underlying API.
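To make the "same underlying API" idea concrete, here is a minimal sketch of two thin entry points sharing one core service. It is not ModelCub's actual code: `DatasetRegistry`, `sdk_add_dataset`, and `cli_main` are hypothetical names, and the CLI part uses the standard library's `argparse` instead of Click so the example runs without extra installs.

```python
import argparse
from dataclasses import dataclass, field


@dataclass
class DatasetRegistry:
    """Hypothetical core service: the single place where dataset logic lives."""

    datasets: dict = field(default_factory=dict)

    def add(self, name: str, source: str) -> dict:
        # Validation happens once, here, no matter which interface called us.
        if name in self.datasets:
            raise ValueError(f"dataset {name!r} already exists")
        record = {"name": name, "source": source}
        self.datasets[name] = record
        return record


def sdk_add_dataset(registry: DatasetRegistry, source: str, name: str) -> dict:
    """SDK-style entry point: a thin wrapper over the core service."""
    return registry.add(name=name, source=source)


def cli_main(registry: DatasetRegistry, argv: list) -> dict:
    """CLI-style entry point: parse arguments, then call the same core method."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--source", required=True)
    parser.add_argument("--name", required=True)
    args = parser.parse_args(argv)
    return registry.add(name=args.name, source=args.source)


registry = DatasetRegistry()
from_cli = cli_main(registry, ["--source", "./data", "--name", "v1"])
from_sdk = sdk_add_dataset(registry, source="./more-data", name="v2")
print(sorted(registry.datasets))
```

Because both wrappers delegate to `DatasetRegistry.add`, a rule such as "names must be unique" is enforced identically from the command line and from a notebook.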
Architecture
ModelCub is built on clean, layered architecture:
User Interfaces
├── CLI (Click)
├── Python SDK (Public API)
└── Web UI (React + TypeScript)
│
▼
FastAPI Backend
(REST + WebSocket)
│
▼
Core Services
(Business Logic)
│
▼
File System State
(.modelcub directory)
Design Principles
1. API-First
Everything is composable. Use any interface interchangeably.
2. Stateless Backend
No hidden database. All state lives in human-readable YAML/JSON files.
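As an illustration of file-based state, the sketch below stores project state as JSON on disk and reloads it on every call, so no process has to keep anything in memory. The file name `datasets.json` and the shape of the state dictionary are assumptions made for this example, not ModelCub's real layout.

```python
import json
import tempfile
from pathlib import Path


def load_state(state_file: Path) -> dict:
    """Read state from disk; an absent file means empty state."""
    if not state_file.exists():
        return {"datasets": {}}
    return json.loads(state_file.read_text(encoding="utf-8"))


def save_state(state_file: Path, state: dict) -> None:
    """Write indented, key-sorted JSON so the file stays readable and diffs stay small."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(state, indent=2, sort_keys=True) + "\n"
    state_file.write_text(text, encoding="utf-8")


with tempfile.TemporaryDirectory() as project_dir:
    # Hypothetical location inside the project's .modelcub directory.
    state_file = Path(project_dir) / ".modelcub" / "datasets.json"

    state = load_state(state_file)  # nothing on disk yet, so this is empty
    state["datasets"]["v1"] = {"classes": ["cat", "dog"], "images": 100}
    save_state(state_file, state)

    reloaded = load_state(state_file)  # a fresh read sees the saved state
    print(state_file.read_text(encoding="utf-8"))
```

Sorting keys and using a fixed indent keeps the output deterministic, which is what makes such files pleasant to review in a version control diff.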
3. Format-Agnostic
Datasets are stored in YOLO format internally for simplicity, and can be imported from or exported to other formats.
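Converting between formats is mostly coordinate arithmetic. As a sketch (the function names are illustrative and not part of ModelCub's SDK), a COCO bounding box is `[x_min, y_min, width, height]` in pixels, while a YOLO box is `[x_center, y_center, width, height]` normalized by the image size:

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a pixel-space COCO box to a normalized, center-based YOLO box."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # x_center as a fraction of image width
        (y_min + h / 2) / img_h,  # y_center as a fraction of image height
        w / img_w,
        h / img_h,
    )


def yolo_to_coco(bbox, img_w, img_h):
    """Inverse conversion: normalized YOLO box back to a pixel-space COCO box."""
    x_center, y_center, w, h = bbox
    box_w, box_h = w * img_w, h * img_h
    return (
        x_center * img_w - box_w / 2,  # x_min in pixels
        y_center * img_h - box_h / 2,  # y_min in pixels
        box_w,
        box_h,
    )


coco_box = (100, 50, 200, 100)  # x_min, y_min, width, height in pixels
yolo_box = coco_to_yolo(coco_box, img_w=640, img_h=480)
round_trip = yolo_to_coco(yolo_box, img_w=640, img_h=480)
print(yolo_box)
print(round_trip)
```

The round trip recovers the original box up to floating-point error, which is a useful sanity check when writing importers and exporters.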
4. Git-Friendly
Version control everything like code. Diffs, commits, rollbacks.
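One common way to get diffs between dataset versions is to compare content hashes of the files in each snapshot. The following is a minimal sketch of that idea, not a description of how `modelcub diff` is implemented; `build_manifest` and `diff_manifests` are hypothetical helpers.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(old: dict, new: dict) -> dict:
    """Classify paths as added, removed, or modified between two manifests."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("0 0.5 0.5 0.2 0.2\n")
    (root / "b.txt").write_text("1 0.1 0.1 0.1 0.1\n")
    v1 = build_manifest(root)  # snapshot of the first version

    (root / "a.txt").write_text("0 0.5 0.5 0.3 0.3\n")  # edit a label
    (root / "b.txt").unlink()  # remove a sample
    (root / "c.txt").write_text("2 0.9 0.9 0.1 0.1\n")  # add a sample
    v2 = build_manifest(root)  # snapshot of the second version

changes = diff_manifests(v1, v2)
print(changes)
# {'added': ['c.txt'], 'removed': ['b.txt'], 'modified': ['a.txt']}
```

A manifest like this is small enough to store as a human-readable file alongside each commit, so a rollback only needs to restore the files whose hashes differ.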
Use Cases
Medical Imaging
Hospital deploying tumor detection:
- Patient data stays on-premise (supports HIPAA compliance)
- Air-gapped training environment
- Full audit trail for regulatory compliance
Startup
E-commerce company building product recognition:
- Save up to $96k/year vs Roboflow (12 months at the $8,000/month upper price cited above)
- Use savings to hire engineers
- Own your data and tools
Research Lab
University running CV experiments:
- Reproducible workflows
- Version datasets alongside code
- Easy collaboration within team
Defense/Government
Classified project requirements:
- Zero dependencies on external services
- Air-gapped deployment
- Complete data sovereignty
What's Included
Phase 1 (Current - Complete)
- ✅ Project initialization and configuration
- ✅ Dataset import (YOLO, Roboflow, COCO, images)
- ✅ Class management (add, rename, remove)
- ✅ CLI with 20+ commands
- ✅ Python SDK
- ✅ FastAPI backend
- ✅ React frontend
- ✅ Basic annotation system
Phase 2 (In Progress)
- 🚧 Dataset validation with health scoring
- 🚧 Auto-fix system with backups
- 🚧 Version control (commit, diff, history)
- 🚧 Visual diff UI
- 🚧 Multi-format export
Phase 3 (Planned)
- 📅 Advanced annotation (polygon, segmentation)
- 📅 Keyboard shortcuts
- 📅 Auto-save and undo/redo
- 📅 Review and consensus mode
Phase 4 (Planned)
- 📅 YOLO training integration
- 📅 Real-time progress updates
- 📅 Model evaluation and comparison
- 📅 Multi-GPU support
Getting Started
Ready to try ModelCub?
- Install ModelCub - Get up and running in 2 minutes
- Quick Start - Create your first project
- API Reference - Learn the Python SDK
- CLI Reference - Master the command line
Philosophy
ModelCub is built on these principles:
Local-First
Your data is yours. No cloud dependencies, no vendor lock-in.
Developer-Friendly
Clean APIs, good documentation, sensible defaults. Built by engineers who felt the pain.
Reproducible
Version everything. Generate reports. Make experiments repeatable.
Transparent
No black boxes. All state in human-readable files. Open source code.
Composable
Use what you need. Ignore what you don't. Everything works standalone.
Community
ModelCub is open source and community-driven:
- 🐛 Report Bugs: GitHub Issues
- 💡 Feature Requests: GitHub Discussions
- 🤝 Contributing: Contributing Guide
- 📚 Documentation: You're reading it!