signal-stream — online feature serving for live alpha models
The production half of the loop: the same feature graph that ran in research, now consuming a market-data stream and emitting signals into the trading stack — with point-in-time guarantees, lineage, and a kill switch.
Research Engineer
Python · Rust · Kafka · Redis · gRPC · Prometheus
What it does
Models registered in alpha-bench come with a deployment manifest: the feature graph they consume, their refit cadence, their schema, and the signal they emit.
signal-stream takes that manifest and runs the same graph live. No rewrites: the feature definitions are imported directly from feature-forge. The only difference is the executor — a streaming runtime that materialises each node as new bars arrive, instead of a batch runtime that reads parquet.
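As a rough sketch, a deployment manifest could be modelled like this — field names and values here are illustrative assumptions, not the actual alpha-bench schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentManifest:
    """Hypothetical shape of what alpha-bench registers per model."""
    model_id: str
    feature_graph_hash: str      # content hash of the feature graph definition
    refit_cadence_days: int      # how often the model is refit
    schema: dict                 # feature name -> dtype, byte-exact vs training
    signal_name: str             # the signal this model emits


manifest = DeploymentManifest(
    model_id="momo_v3",
    feature_graph_hash="a1b2c3",
    refit_cadence_days=7,
    schema={"ret_5m": "float64", "vol_1h": "float64"},
    signal_name="momo_score",
)
```

The streaming executor would read this manifest instead of taking ad-hoc configuration, so research and production agree on one source of truth.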
Why this matters
The fastest way to lose money is to deploy a research model whose production features have drifted from training. So every emit is guarded:
- schema hash check — the live feature schema must match the model’s training schema, byte-exact
- distribution canary — rolling window of feature values is compared against the training distribution; a 4-sigma shift trips a soft kill
- PnL canary — paper-traded shadow PnL is logged alongside live PnL, and divergence over N days routes the model to triage
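The first two guards can be sketched in a few lines — a minimal, stdlib-only illustration of the idea, not the production implementation (hash choice, window handling, and thresholds are assumptions):

```python
import hashlib
import json
import statistics


def schema_hash(schema: dict) -> str:
    # Byte-exact: canonical JSON of {feature name -> dtype}, then SHA-256.
    blob = json.dumps(schema, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()


def schema_guard(live_schema: dict, training_hash: str) -> bool:
    # Emit is allowed only if the live schema hashes to the training hash.
    return schema_hash(live_schema) == training_hash


def distribution_canary(window: list[float], train_mean: float,
                        train_std: float, k: float = 4.0) -> bool:
    # Soft kill trips when the rolling-window mean drifts more than
    # k sigmas (k=4 here) away from the training distribution.
    return abs(statistics.fmean(window) - train_mean) > k * train_std
```

In production the canary would run per feature on a rolling window; the sketch only shows the trip condition.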
Architecture (sketch)
┌────────────┐ ┌──────────────┐ ┌────────────┐
md ──► │ ingestor │──► │ feature-forge│──► │ models │──► signal
│ (rust) │ │ (online) │ │ (python) │ bus
└────────────┘ └──────┬───────┘ └────────────┘
│
▼
point-in-time
feature store
(Redis + Arrow)
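The point-in-time property of the feature store is the part worth spelling out: reads are "as of t", so a replay never sees a value that arrived after the signal was emitted. A toy in-memory stand-in for the Redis + Arrow store (the real store's API is not shown here; this only illustrates the semantics):

```python
import bisect
from collections import defaultdict


class PointInTimeStore:
    """Toy stand-in for the Redis+Arrow feature store: every write is
    timestamped, and reads return the latest value at or before t."""

    def __init__(self):
        # feature name -> sorted list of (timestamp, value)
        self._series = defaultdict(list)

    def put(self, feature: str, ts: int, value: float) -> None:
        bisect.insort(self._series[feature], (ts, value))

    def as_of(self, feature: str, ts: int):
        rows = self._series[feature]
        # Rightmost entry with timestamp <= ts; None if nothing existed yet.
        i = bisect.bisect_right(rows, (ts, float("inf")))
        return rows[i - 1][1] if i else None


store = PointInTimeStore()
store.put("ret_5m", 1, 0.5)
store.put("ret_5m", 3, 0.7)
```

Querying `store.as_of("ret_5m", 2)` returns the value written at t=1, not the later one — exactly the guarantee a trade replay relies on.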
The Rust ingest tier handles fan-in and timestamp normalisation; the Python online runtime owns feature compute and model inference. The boundary is a typed Arrow channel — same schema as the research lake, zero copy on the hot path.
Operability
- one Grafana board per model, owned by the team that registered it
- pager only routes on freshness or schema failures — alpha drift goes to a dashboard, not a phone, because that’s research triage, not infra triage
- every signal carries a model_id, feature_graph_hash, and data_window_end — so a downstream trade can be replayed against the exact feature snapshot that produced it
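A signal envelope carrying those three lineage fields might look like the following — a hedged sketch with an assumed wire format (JSON here; the real bus encoding is not specified above):

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Signal:
    model_id: str
    feature_graph_hash: str    # pins the exact feature graph version
    data_window_end: str       # ISO timestamp of the last bar consumed
    value: float


sig = Signal(
    model_id="momo_v3",
    feature_graph_hash="a1b2c3",
    data_window_end="2024-01-05T16:00:00Z",
    value=0.42,
)
wire = json.dumps(asdict(sig))  # what goes onto the signal bus
```

Given `feature_graph_hash` and `data_window_end`, a replay is a lookup against the point-in-time store, not a reconstruction.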
What I’d ship next
- A shadow-to-canary harness that automates the “5% capital under monitoring” rollout pattern and surfaces “promote / hold / rollback” as a single decision
- WASM-compiled feature kernels so the same code runs in the pre-trade strategy host as well as the streaming tier