shipped · mlops · platform

model-registry — lineage, artifacts, and the deploy contract

A self-hosted model registry that is the source of truth for what trains, what ships, and what is running. Each model carries its full lineage (code commit, feature graph hash, data window, owner, evaluation report), and the production platform refuses to load any model that is not in the registry.

Role

Research Engineer

Stack

Python · FastAPI · PostgreSQL · MinIO/S3 · OpenTelemetry

models tracked: 320+
audit traceability: 100%
MTTR rollback: <2 min

Why not MLflow

MLflow works well for general-purpose ML at scale, but trading imposes different constraints. We needed three guarantees that MLflow doesn’t enforce by default:

  • The feature graph is part of the artefact. A model’s hash is derived from (code, feature_graph, data_window, hyperparams) — not just from training output. Two models with the same weights but different feature graphs are different models.
  • No unsigned promotion. Moving a model from candidate to production requires a signed evaluation report from alpha-bench and a human approver listed on a small ACL. The API enforces both.
  • Strong owners. Every model has exactly one owning team. If the owner team is empty (because a quant left), the platform surfaces it as an orphan and won’t let you promote new versions until it’s reassigned. Compliance loves this.
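
To make the first point concrete, here is a minimal sketch of an identity hash built from the four lineage inputs rather than from the trained weights. The function name, the canonical-JSON encoding, the choice of SHA-256, and the sample values are all assumptions for illustration; the registry's actual hashing scheme is not described here.

```python
import hashlib
import json


def model_hash(code_commit: str, feature_graph_hash: str,
               data_window: tuple[str, str], hyperparams: dict) -> str:
    """Derive a model identity hash from its lineage inputs (illustrative)."""
    payload = {
        "code": code_commit,
        "feature_graph": feature_graph_hash,
        "data_window": list(data_window),
        "hyperparams": hyperparams,
    }
    # Canonical JSON (sorted keys, fixed separators) so that dict ordering
    # never changes the digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical inputs: same lineage with reordered hyperparams, then a
# different feature graph with everything else held equal.
window = ("2018-01-01", "2026-04-30")
base = model_hash("commit-A", "graph-1", window, {"lr": 0.01, "depth": 6})
same = model_hash("commit-A", "graph-1", window, {"depth": 6, "lr": 0.01})
other_graph = model_hash("commit-A", "graph-2", window, {"lr": 0.01, "depth": 6})

print(base == same)         # identical lineage, identical hash
print(base == other_graph)  # different feature graph, different model
```

Because the weights never enter the digest, two training runs that happen to produce identical weights still register as different models whenever their feature graphs differ.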

What the API looks like

POST /models/xsmom/versions
Content-Type: application/json

{
  "code_commit": "a13fce…",
  "feature_graph_hash": "8b2e…",
  "data_window": ["2018-01-01", "2026-04-30"],
  "evaluation": { "report_id": "rep_4f2…", "objective": "rank_ic" },
  "owner": "team-mid-frequency-equities"
}

201 Created

{
  "model_id": "xsmom@2026.05.15-r3",
  "status": "candidate",
  "ready_at": null
}

A version reaches production only through the promote endpoint:

POST /models/xsmom/versions/xsmom@2026.05.15-r3:promote
{ "approver": "<acl-name>", "canary_pct": 5 }
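
The checks behind that endpoint could look roughly like the sketch below. It only restates the three rules from the "Why not MLflow" section (signed report, approver on the ACL, non-empty owning team); the `Candidate` type, its fields, and `promotion_blockers` are hypothetical names, not the registry's real code.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    model_id: str
    owner_team_members: list[str]  # empty list means the model is orphaned
    report_signed: bool            # assumed flag: alpha-bench signature verified


def promotion_blockers(candidate: Candidate, approver: str,
                       approver_acl: set[str]) -> list[str]:
    """Return the reasons a promotion must be refused; empty means allowed."""
    blockers = []
    if not candidate.report_signed:
        blockers.append("evaluation report is not signed")
    if approver not in approver_acl:
        blockers.append("approver is not on the ACL")
    if not candidate.owner_team_members:
        blockers.append("owning team is empty (orphaned model)")
    return blockers


# Hypothetical data for illustration.
acl = {"alice"}
ok = Candidate("xsmom@2026.05.15-r3", ["bob"], report_signed=True)
orphan = Candidate("xsmom@2026.05.15-r3", [], report_signed=True)

print(promotion_blockers(ok, "alice", acl))        # no blockers
print(promotion_blockers(orphan, "mallory", acl))  # two blockers
```

Returning every blocker at once, rather than failing on the first, lets the API tell the caller everything that needs fixing in a single response.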

Lineage in practice

Every signal emitted by signal-stream includes its model_id. Every fill from execution includes the signal_id that drove it. The registry stitches these together so the answer to “what model produced this trade, and what did its training distribution look like?” is one query, not a forensic exercise.
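
As a rough illustration of the "one query" claim, the sketch below joins fills to signals to model versions. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table names, column names, and sample rows are assumptions rather than the registry's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model_versions (
        model_id TEXT PRIMARY KEY,
        feature_graph_hash TEXT,
        data_window_start TEXT,
        data_window_end TEXT
    );
    CREATE TABLE signals (
        signal_id TEXT PRIMARY KEY,
        model_id TEXT REFERENCES model_versions(model_id)
    );
    CREATE TABLE fills (
        fill_id TEXT PRIMARY KEY,
        signal_id TEXT REFERENCES signals(signal_id)
    );
    -- Hypothetical sample rows.
    INSERT INTO model_versions VALUES
        ('xsmom@2026.05.15-r3', 'graph-1', '2018-01-01', '2026-04-30');
    INSERT INTO signals VALUES ('sig-1', 'xsmom@2026.05.15-r3');
    INSERT INTO fills VALUES ('fill-1', 'sig-1');
""")


def lineage_for_fill(conn: sqlite3.Connection, fill_id: str):
    """Trace a fill back to the model version and training window behind it."""
    return conn.execute(
        """
        SELECT m.model_id, m.feature_graph_hash,
               m.data_window_start, m.data_window_end
        FROM fills f
        JOIN signals s ON s.signal_id = f.signal_id
        JOIN model_versions m ON m.model_id = s.model_id
        WHERE f.fill_id = ?
        """,
        (fill_id,),
    ).fetchone()


result = lineage_for_fill(conn, "fill-1")
print(result)
```

The join only works because each hop carries the upstream key: fills store `signal_id`, and signals store `model_id`.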

What I’d ship next

  • Native support for ensemble lineage — when a meta-model consumes three children, the registry should model that as a DAG, not a flat list
  • Automatic deprecation pressure — a model whose owner team hasn’t pushed a new version in N months gets a warning, then a hard expiry, before becoming an orphan
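
Neither item exists yet, so the following is only a sketch of what they might involve. `lineage_closure` walks a parent map to collect every model an ensemble transitively depends on, and `deprecation_status` maps the time since the last push to a state. All names, the day-based thresholds (standing in for "N months"), and the sample data are hypothetical.

```python
from datetime import date


def lineage_closure(parents: dict[str, list[str]], model_id: str) -> set[str]:
    """Return every model that `model_id` transitively depends on."""
    seen: set[str] = set()
    stack = list(parents.get(model_id, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path through the DAG
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def deprecation_status(last_push: date, today: date,
                       warn_after_days: int, expire_after_days: int) -> str:
    """Classify a model by how long ago its owner last pushed a version."""
    age_days = (today - last_push).days
    if age_days >= expire_after_days:
        return "expired"
    if age_days >= warn_after_days:
        return "warning"
    return "active"


# Hypothetical ensemble: a meta-model with three children, one of which
# has its own upstream model.
parents = {
    "meta@v1": ["child-a@v3", "child-b@v1", "child-c@v2"],
    "child-c@v2": ["base@v1"],
}
deps = lineage_closure(parents, "meta@v1")
print(sorted(deps))

print(deprecation_status(date(2026, 1, 1), date(2026, 5, 1), 90, 180))
```

A flat list of children would miss `base@v1` here, which is the case the DAG model is meant to cover.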