Introduction

ModelCub is a local-first MLOps platform for computer vision. It covers the full workflow of managing datasets, annotating images, training models, and deploying them, all on your own infrastructure.

What is ModelCub?

ModelCub is an open-source alternative to cloud platforms like Roboflow. Its feature set (some of it still in progress, see What's Included) covers:

Dataset Management: Import from YOLO, Roboflow, COCO, or raw images
Annotation Tools: Built-in canvas-based labeling interface
Version Control: Git-like workflows for datasets
Training Integration: YOLO v8/v11 with auto-configuration
Model Deployment: Export to ONNX, TensorRT, CoreML
Three Interfaces: CLI, Python SDK, and Web UI

All while keeping your data 100% local and 100% private.

Why ModelCub?

Problem: Cloud Lock-In

Traditional platforms like Roboflow:

Charge $500 to $8,000 per month
Store your data on their servers
Make it difficult to switch providers
Are often ruled out for sensitive data (medical, defense)

Problem: DIY Fragmentation

Building your own stack usually means:

Stitching together Label Studio, Ultralytics, and custom scripts
No integration between tools
No version control for datasets
Hard to reproduce experiments

Solution: ModelCub

A complete, integrated platform that:

Runs entirely on your infrastructure
Costs $0 (free and open source)
Provides professional tooling
Enables reproducible workflows

Core Features

1. Privacy-First

Your data never leaves your machine:

✅ Works 100% offline
✅ No telemetry or tracking
✅ No account required
✅ HIPAA/GDPR friendly
✅ Perfect for air-gapped environments

2. Complete Workflow

Everything in one tool:

```bash
# Import dataset
modelcub dataset add --source ./data --name v1

# Annotate
modelcub ui  # Opens annotation interface

# Train
modelcub train v1 --model yolov11n

# Deploy
modelcub export --format onnx
```

3. Version Control

Git-like semantics for datasets:

```bash
# Commit changes
modelcub commit "Added 100 new samples"

# View history
modelcub history

# Compare versions
modelcub diff v1 v2

# Rollback
modelcub checkout v1
```
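
One way to picture Git-like semantics for a dataset is a content-addressed manifest: hash every file, store the mapping, and compare two mappings to get a diff. The sketch below is illustrative only. `build_manifest` and `diff_manifests` are hypothetical helpers, not ModelCub's actual implementation or on-disk format.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 digest of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, removed, or modified between two manifests."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

Under this assumption, a "commit" is a saved manifest plus a message, and a "diff" between two versions is a call to `diff_manifests` on their manifests.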

4. Three Interfaces

CLI for automation:

```bash
modelcub dataset add --source ./data --name v1
modelcub dataset info v1
```

Python SDK for notebooks:

```python
from modelcub import Project, Dataset

project = Project.init("my-project")
dataset = Dataset.from_yolo("./data", name="v1")
stats = dataset.stats()
```

Web UI for visual work:

```bash
modelcub ui  # Opens at localhost:8000
```

All three interfaces use the same underlying API.

Architecture

ModelCub is built on a clean, layered architecture:

User Interfaces
├── CLI (Click)
├── Python SDK (Public API)
└── Web UI (React + TypeScript)
        │
        ▼
   FastAPI Backend
   (REST + WebSocket)
        │
        ▼
    Core Services
   (Business Logic)
        │
        ▼
   File System State
  (.modelcub directory)
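
The layering above can be sketched in a few lines: the business logic lives in one plain function, and each interface is a thin wrapper around it. This is a simplified illustration, not ModelCub's code. `dataset_info` and `main` are hypothetical names, and the standard library's `argparse` stands in for Click so the example runs without extra packages.

```python
import argparse
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def dataset_info(root: str) -> dict:
    """Core service (hypothetical): summarize a dataset directory."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    images = [p for p in files if p.suffix.lower() in IMAGE_SUFFIXES]
    return {"root": str(root), "images": len(images)}


def main(argv=None) -> int:
    """Thin CLI wrapper: parse arguments, call the core, print the result."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("root", help="dataset directory")
    args = parser.parse_args(argv)
    info = dataset_info(args.root)
    print(f"{info['images']} images in {info['root']}")
    return 0
```

A notebook would call `dataset_info("./data")` directly, while a shell script would go through `main`. Because both paths reach the same function, they cannot drift apart.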

Design Principles

1. API-First

Everything is composable. Use any interface interchangeably.

2. Stateless Backend

No hidden database. All state lives in human-readable YAML/JSON files.
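
As a rough illustration of file-based state, the sketch below stores one piece of state as pretty-printed JSON under a `.modelcub` directory and reads it back. The helper names and the `<name>.json` file layout are assumptions made for this example. ModelCub's real file names and schema may differ.

```python
import json
from pathlib import Path


def save_state(project_dir: Path, name: str, state: dict) -> Path:
    """Write one piece of state as pretty-printed JSON under .modelcub/."""
    state_dir = project_dir / ".modelcub"
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / f"{name}.json"
    # indent and sort_keys keep the file readable and stable in text diffs
    text = json.dumps(state, indent=2, sort_keys=True) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


def load_state(project_dir: Path, name: str) -> dict:
    """Read the state back; the file on disk is the only source of truth."""
    path = project_dir / ".modelcub" / f"{name}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

Keeping keys sorted and indentation fixed means that two saves of the same state produce identical files, which is what makes such state friendly to ordinary diff tools.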

3. Format-Agnostic

ModelCub uses the YOLO format internally for simplicity, and converts to and from other formats on import and export.
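
To make the format conversion concrete: COCO stores a box as `[x_min, y_min, width, height]` in pixels, while YOLO stores the box center and size normalized by the image dimensions. A minimal sketch of the conversion in both directions follows. The function names are illustrative and are not part of the ModelCub SDK.

```python
def coco_to_yolo(bbox, image_width, image_height):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height), normalized to [0, 1]."""
    x_min, y_min, width, height = bbox
    return (
        (x_min + width / 2) / image_width,
        (y_min + height / 2) / image_height,
        width / image_width,
        height / image_height,
    )


def yolo_to_coco(box, image_width, image_height):
    """Inverse conversion, back to pixel coordinates."""
    x_center, y_center, width, height = box
    w = width * image_width
    h = height * image_height
    return [x_center * image_width - w / 2, y_center * image_height - h / 2, w, h]
```

For a 400x200 image, `coco_to_yolo([100, 50, 200, 100], 400, 200)` returns `(0.5, 0.5, 0.5, 0.5)`, and `yolo_to_coco` maps that result back to `[100.0, 50.0, 200.0, 100.0]`.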

4. Git-Friendly

Version control everything like code. Diffs, commits, rollbacks.

Use Cases

Medical Imaging

Hospital deploying tumor detection:

Patient data stays on-premise (supports HIPAA compliance)
Air-gapped training environment
Full audit trail for regulatory compliance

Startup

E-commerce company building product recognition:

Save up to $96k/year vs Roboflow (12 months at the $8k/month upper figure above)
Use savings to hire engineers
Own your data and tools

Research Lab

University running CV experiments:

Reproducible workflows
Version datasets alongside code
Easy collaboration within team

Defense/Government

Classified project requirements:

Zero external dependencies
Air-gapped deployment
Complete data sovereignty

What's Included

Phase 1 (Current - Complete)

✅ Project initialization and configuration
✅ Dataset import (YOLO, Roboflow, COCO, images)
✅ Class management (add, rename, remove)
✅ CLI with 20+ commands
✅ Python SDK
✅ FastAPI backend
✅ React frontend
✅ Basic annotation system

Phase 2 (In Progress)

🚧 Dataset validation with health scoring
🚧 Auto-fix system with backups
🚧 Version control (commit, diff, history)
🚧 Visual diff UI
🚧 Multi-format export
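
As an example of the kind of check dataset validation might run, the sketch below validates a single YOLO detection label line (`class_id x_center y_center width height`, with coordinates normalized to [0, 1]). ModelCub's actual validation rules and health score are not described here, so `validate_yolo_label_line` and its messages are purely hypothetical.

```python
def validate_yolo_label_line(line: str, num_classes: int) -> list[str]:
    """Return a list of problems found in one YOLO detection label line."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    problems = []
    try:
        class_id = int(parts[0])
        if not 0 <= class_id < num_classes:
            problems.append(f"class id {class_id} out of range")
    except ValueError:
        problems.append(f"class id {parts[0]!r} is not an integer")
    try:
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return problems + ["coordinates must be numbers"]
    if any(not 0.0 <= c <= 1.0 for c in coords):
        problems.append("coordinates must be normalized to [0, 1]")
    return problems
```

An empty list means the line passed every check. A dataset-level score could then be derived from the share of lines with no problems, though that scoring rule is an assumption as well.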

Phase 3 (Planned)

📅 Advanced annotation (polygon, segmentation)
📅 Keyboard shortcuts
📅 Auto-save and undo/redo
📅 Review and consensus mode

Phase 4 (Planned)

📅 YOLO training integration
📅 Real-time progress updates
📅 Model evaluation and comparison
📅 Multi-GPU support

Getting Started

Ready to try ModelCub?

Install ModelCub - Get up and running in 2 minutes
Quick Start - Create your first project
API Reference - Learn the Python SDK
CLI Reference - Master the command line

Philosophy

ModelCub is built on these principles:

Local-First

Your data is yours. No cloud dependencies, no vendor lock-in.

Developer-Friendly

Clean APIs, good documentation, sensible defaults. Built by engineers who felt the pain.

Reproducible

Version everything. Generate reports. Make experiments repeatable.

Transparent

No black boxes. All state in human-readable files. Open source code.

Composable

Use what you need. Ignore what you don't. Everything works standalone.

Community

ModelCub is open source and community-driven:

🐛 Report Bugs: GitHub Issues
💡 Feature Requests: GitHub Discussions
🤝 Contributing: Contributing Guide
📚 Documentation: You're reading it!
