in-flight · mlops · infra

signal-stream — online feature serving for live alpha models

The production half of the loop: the same feature graph that ran in research, now consuming a market-data stream and emitting signals into the trading stack — with point-in-time guarantees, lineage, and a kill switch.

Role

Research Engineer

Stack

Python · Rust · Kafka · Redis · gRPC · Prometheus

p99 latency: 2.4 ms
feature freshness: ≤ 1 bar
schema drift checks: every emit

What it does

Models registered in alpha-bench come with a deployment manifest: the feature graph they consume, their refit cadence, their schema, and the signal they emit.
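For concreteness, here is roughly the shape of that manifest as a Python sketch. The class and field names are illustrative, not the actual alpha-bench schema:

    # Hypothetical shape of a deployment manifest; field names are
    # illustrative, the real schema lives in alpha-bench.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class DeploymentManifest:
        model_id: str            # registered model identifier
        feature_graph: str       # feature-forge graph the model consumes
        refit_cadence: str       # e.g. "weekly"
        input_schema_hash: str   # byte-exact hash of the training schema
        signal: str              # name of the signal the model emits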

signal-stream takes that manifest and runs the same graph live. No rewrites — the feature definitions are imported directly from feature-forge. The only difference is the executor: a streaming runtime that materialises each node as new bars arrive, instead of a batch runtime that reads parquet.
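A toy version of that split, with graph and executor names of my own invention (the real feature-forge API differs): one shared evaluation function, wrapped by a batch runtime for research and a streaming one for production.

    # Toy feature graph: each node names its inputs and a pure function.
    # Node names and both executors are illustrative, not feature-forge.
    graph = {
        "mid":    (("bid", "ask"), lambda bid, ask: (bid + ask) / 2),
        "spread": (("bid", "ask"), lambda bid, ask: ask - bid),
        "rel":    (("mid", "spread"), lambda mid, spread: spread / mid),
    }

    def materialise(graph, inputs):
        """Evaluate every node in dependency order (shared by both runtimes)."""
        values, pending = dict(inputs), dict(graph)
        while pending:
            for name, (deps, fn) in list(pending.items()):
                if all(d in values for d in deps):
                    values[name] = fn(*(values[d] for d in deps))
                    del pending[name]
        return values

    def run_batch(graph, rows):
        # Research executor: map the evaluation over historical rows
        # (parquet in the real system).
        return [materialise(graph, row) for row in rows]

    def run_streaming(graph, bars):
        # Production executor: evaluate once per arriving bar, emit as we go.
        for bar in bars:
            yield materialise(graph, bar)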

Why this matters

The fastest way to lose money is to deploy a research model whose production features have drifted from training. So every emit is guarded (a sketch of the first two checks follows the list):

  • schema hash check — the live feature schema must match the model’s training schema, byte-exact
  • distribution canary — a rolling window of feature values is compared against the training distribution; a 4-sigma shift trips a soft kill
  • PnL canary — paper-traded shadow PnL is logged alongside live PnL; divergence over N days routes the model to triage
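
A sketch of the schema and distribution guards, under assumed names; the byte-exact hash and the 4-sigma threshold come from the list above, everything else is illustrative:

    import hashlib
    import statistics
    from collections import deque

    def schema_hash(schema_bytes: bytes) -> str:
        # Byte-exact: any change to the serialised schema changes the hash.
        return hashlib.sha256(schema_bytes).hexdigest()

    class DistributionCanary:
        """Rolling window of a feature vs. its training distribution."""

        def __init__(self, train_mean, train_std, window=1024, sigmas=4.0):
            self.train_mean, self.train_std = train_mean, train_std
            self.values = deque(maxlen=window)
            self.sigmas = sigmas

        def tripped(self, value: float) -> bool:
            self.values.append(value)
            rolling_mean = statistics.fmean(self.values)
            # Soft kill once the rolling mean drifts past the sigma budget.
            return abs(rolling_mean - self.train_mean) > self.sigmas * self.train_std

    def guard_emit(live_schema: bytes, manifest_hash: str, canary, value):
        if schema_hash(live_schema) != manifest_hash:
            raise RuntimeError("schema drift: hard block")   # never emit
        if canary.tripped(value):
            return "soft_kill"   # stop emitting, route to triage
        return "emit"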

Architecture (sketch)

        ┌────────────┐    ┌──────────────┐    ┌────────────┐
md ──►  │  ingestor  │──► │ feature-forge│──► │  models    │──► signal
        │  (rust)    │    │  (online)    │    │  (python)  │      bus
        └────────────┘    └──────┬───────┘    └────────────┘
                                 │
                                 ▼
                           point-in-time
                           feature store
                          (Redis + Arrow)

The Rust ingest tier handles fan-in and timestamp normalisation; the Python online runtime owns feature compute and model inference. The boundary is a typed Arrow channel — same schema as the research lake, zero copy on the hot path.
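The Python side of that boundary, sketched with pyarrow; the endpoint and the handoff are assumptions, but ipc.open_stream is the standard way to consume an Arrow IPC stream:

    # Consume the ingestor's Arrow IPC stream. Host/port are illustrative;
    # pyarrow maps the incoming buffers rather than deserialising field by field.
    import socket
    import pyarrow as pa
    import pyarrow.ipc as ipc

    sock = socket.create_connection(("ingestor", 9090))   # hypothetical endpoint
    reader = ipc.open_stream(sock.makefile("rb"))

    # reader.schema is the same Arrow schema as the research lake, so the
    # schema-hash guard can run against it directly.
    for batch in reader:                        # one RecordBatch per window
        table = pa.Table.from_batches([batch])  # zero-copy view over the batch
        # hand the Arrow columns to feature compute / model inference here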

Operability

  • one Grafana board per model, owned by the team that registered it
  • pager only routes on freshness or schema failures — alpha drift goes to a dashboard, not a phone, because that’s research triage, not infra triage
  • every signal carries a model_id, a feature_graph_hash, and a data_window_end, so a downstream trade can be replayed against the exact feature snapshot that produced it (sketched below)
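
The envelope is small; something like this, where the three lineage fields are the real ones and the rest is illustrative:

    # Illustrative signal envelope; model_id, feature_graph_hash and
    # data_window_end are the real lineage fields, the rest is assumed.
    from dataclasses import dataclass
    import datetime as dt

    @dataclass(frozen=True)
    class Signal:
        model_id: str
        feature_graph_hash: str
        data_window_end: dt.datetime   # last bar the features saw
        value: float                   # the signal itself

    def replay_key(sig: Signal) -> str:
        """Enough to rebuild the exact feature snapshot behind a trade."""
        return f"{sig.feature_graph_hash}@{sig.data_window_end.isoformat()}"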

What I’d ship next

  • A shadow-to-canary harness that automates the “5% capital under monitoring” rollout pattern and surfaces “promote / hold / rollback” as a single decision
  • WASM-compiled feature kernels so the same code runs in the pre-trade strategy host as well as the streaming tier