DLFT is a closed-loop distributed control system. It happens to run a consumer brand. The shape is the same as a robotic milker, a self-driving stack, or a satellite ground station — sensors, estimators, actuators, and a safety envelope around the parts that move atoms.
If you build robotics, perception, or any other system where the world bites back, you'll recognise everything in here. Replace the herd of cattle with a herd of campaigns. Replace the LIDAR with a Facebook Ad Library scrape. Replace the actuator arm with a Meta Graph API call. The control problem is structurally the same.
The hardness is structural too: noisy sensors, partially observable state, drifting objectives, irreversible actions, hard latency budgets, and a long tail of failure modes that have to be handled deterministically. The model is the easy part. Everything around it is the engineering.
Sensors → estimator → policy → actuator → world. Feedback closes the loop. Telemetry observes everything. A hard envelope wraps the parts that move money.
Same shapes, different vocabulary.
Five blocks. Standard shape.
The state of the world is partially observable, noisy, and adversarial. Competitor ad libraries are rate-limited and bot-fingerprinted. Shopify storefronts mutate without notice. Customer messages arrive in seven languages. We treat each ingest path as a sensor — with its own sampling rate, calibration, fault model, and dropout handling.
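A minimal sketch of one ingest path treated as a sensor. Every name here (`Sensor`, `Reading`, `period_s`, `gain`, `offset`, `max_failures`) is hypothetical, and the linear calibration and consecutive-failure fault rule are assumptions, not a description of the production code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Reading:
    value: float   # calibrated value
    t: float       # time the underlying sample was taken
    held: bool     # True when this is a held-over value, not a fresh sample


class Sensor:
    """Wraps a raw read function with a sampling period, calibration and dropout handling."""

    def __init__(self, read: Callable[[], float], period_s: float,
                 gain: float = 1.0, offset: float = 0.0, max_failures: int = 3):
        self._read = read
        self.period_s = period_s          # sampling rate: at most one read per period
        self.gain = gain                  # calibration (assumed linear)
        self.offset = offset
        self.max_failures = max_failures  # fault model: N consecutive failures
        self.failures = 0
        self._last_good: Optional[Reading] = None
        self._next_poll_t = float("-inf")

    @property
    def faulted(self) -> bool:
        return self.failures >= self.max_failures

    def poll(self, now: float) -> Optional[Reading]:
        if now >= self._next_poll_t:
            self._next_poll_t = now + self.period_s
            try:
                raw = self._read()
            except Exception:
                self.failures += 1        # dropout: fall through to the held value
            else:
                self.failures = 0
                self._last_good = Reading(self.gain * raw + self.offset, now, held=False)
                return self._last_good
        if self._last_good is None:
            return None                   # nothing observed yet
        # Between samples, or after a failed read: hold the last good value, flagged.
        return Reading(self._last_good.value, self._last_good.t, held=True)
```

Downstream code can then decide what a held or faulted reading is worth, instead of mistaking silence for a measurement.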
Raw sensor data is not the world. The estimator turns observations into a current belief — what's working, what's decaying, what's worth more attention. Closed-loop, online, prior-driven. A keyword's reward distribution is a Beta-Bernoulli arm. A competitor's relevance is a decaying activity score. Both update continuously.
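Both estimators fit in a few lines. The sketch below assumes a uniform Beta(1, 1) prior and an exponential half-life decay; the class names, the `half_life_s` parameter and the Thompson-style `sample` method are illustrative choices, not confirmed details of the system.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaArm:
    """Beta-Bernoulli belief over one keyword's success probability."""
    alpha: float = 1.0  # prior pseudo-count of successes (uniform prior assumed)
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, success: bool) -> None:
        # Conjugate update: one observation moves one pseudo-count.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def sample(self, rng: random.Random) -> float:
        # One posterior draw, usable for Thompson sampling across arms.
        return rng.betavariate(self.alpha, self.beta)


@dataclass
class DecayingScore:
    """Activity score that halves every `half_life_s` seconds without new activity."""
    half_life_s: float
    value: float = 0.0
    last_t: float = 0.0

    def at(self, t: float) -> float:
        dt = max(0.0, t - self.last_t)
        return self.value * 0.5 ** (dt / self.half_life_s)

    def observe(self, t: float, weight: float = 1.0) -> None:
        # Decay the old score to time t, then add the new activity.
        self.value = self.at(t) + weight
        self.last_t = max(self.last_t, t)
```

Three successes and one failure move the arm from Beta(1, 1) to Beta(4, 2), a posterior mean of 2/3. A score of 1.0 observed at t = 0 with a 10-second half-life reads 0.5 at t = 10.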
The estimator hands the policy block a current state and a decision to make. The policy synthesizes language: a refund offer, ad copy, an image prompt, a brand DNA spec. It is a probabilistic actuator — high-quality output, non-zero failure rate. Its job ends at proposing. It never decides what hits the world.
This is where the world bites back. The actuator is the only block that moves money or makes irreversible changes. It is wrapped in a deterministic safety envelope: budget caps, fraud-aware refund ceilings, three-phase WAL commits, and idempotent retries. The policy's output is sanitized, capped, and logged before any byte leaves our network.
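A sketch of the envelope around one actuator call. The phase names (`intent`, `sent`, `committed`) are an assumption about what a three-phase commit could look like, the in-memory list stands in for a durable write-ahead log, and `Proposal`, `Envelope` and `send` are hypothetical names. Fraud-aware ceilings are reduced here to a flat per-action cap.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class BudgetExceeded(Exception):
    pass


@dataclass(frozen=True)
class Proposal:
    key: str       # idempotency key, stable across retries
    amount: float  # spend requested by the policy block


@dataclass
class Envelope:
    budget_cap: float      # total spend allowed
    per_action_cap: float  # ceiling for any single action
    spent: float = 0.0
    wal: List[Tuple[str, str, float]] = field(default_factory=list)
    done: Dict[str, float] = field(default_factory=dict)

    def execute(self, p: Proposal, send: Callable[[str, float], None]) -> float:
        if p.key in self.done:
            return self.done[p.key]           # retry of a committed action: no second send
        amount = min(max(p.amount, 0.0), self.per_action_cap)  # clamp the proposal
        if self.spent + amount > self.budget_cap:
            raise BudgetExceeded(p.key)       # refuse before anything leaves
        self.wal.append(("intent", p.key, amount))
        # If send raises, the intent row survives for recovery; a retry re-sends
        # with the same key, so the remote side is assumed to deduplicate on it.
        send(p.key, amount)
        self.wal.append(("sent", p.key, amount))
        self.spent += amount
        self.done[p.key] = amount
        self.wal.append(("committed", p.key, amount))
        return amount
```

The policy only ever produces a `Proposal`. Clamping, refusal and logging all happen in deterministic code the model cannot reach.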
Every block writes structured events to a shared bus. State transitions, model calls, costs, retries, recoveries — each one row, each one queryable. We can rebuild any incident from the log alone: which sensor fired, which estimate updated, which policy proposed, which actuator commit landed, what it cost. Replayable, not rememberable.
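Rebuilding an incident from the log can be as plain as filtering rows in order. In this sketch the event fields (`seq`, `incident`, `block`, `kind`, `cost`) and the in-memory list of JSON rows are assumptions standing in for the real bus and schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    seq: int       # position in the log, assigned on append
    incident: str  # correlation id tying the events of one incident together
    block: str     # "sensor", "estimator", "policy" or "actuator"
    kind: str      # what happened, e.g. "proposed" or "commit"
    cost: float = 0.0


class EventLog:
    """Append-only log: one event, one serialized row."""

    def __init__(self) -> None:
        self._rows: List[str] = []

    def append(self, incident: str, block: str, kind: str, cost: float = 0.0) -> Event:
        ev = Event(len(self._rows), incident, block, kind, cost)
        self._rows.append(json.dumps(asdict(ev), sort_keys=True))
        return ev

    def replay(self, incident: str) -> List[Event]:
        # Rebuild the incident from stored rows alone, in original order.
        out = []
        for row in self._rows:
            data = json.loads(row)
            if data["incident"] == incident:
                out.append(Event(**data))
        return out


def total_cost(events: Iterable[Event]) -> float:
    return sum(e.cost for e in events)
```

`replay` reads nothing but the serialized rows, which is the point: if a fact is not in the log, it is not part of the reconstruction.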
Containers die. APIs time out. Webhooks lie. Networks partition. The scheduler is the supervisor: it claims work atomically, checkpoints multi-step state machines, and amnesties orphaned rows on every boot. A job that crashes deterministically gets its schedule auto-disabled before it can burn through a budget. Bounded autonomy is not optional.
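The supervisor duties map onto a handful of SQL statements. This sketch uses SQLite, with a hypothetical `jobs` table and an assumed rule that the same error three times in a row counts as deterministic; the real schema, thresholds and database are not specified here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending',   -- pending | running
    owner TEXT,
    step INTEGER NOT NULL DEFAULT 0,          -- last checkpointed step
    last_error TEXT,
    same_error_count INTEGER NOT NULL DEFAULT 0,
    enabled INTEGER NOT NULL DEFAULT 1
);
"""


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def recover_orphans(db: sqlite3.Connection) -> int:
    """On boot: rows left 'running' by a dead worker go back to 'pending'."""
    cur = db.execute("UPDATE jobs SET status='pending', owner=NULL WHERE status='running'")
    db.commit()
    return cur.rowcount


def claim(db: sqlite3.Connection, job_id: int, worker: str) -> bool:
    """Atomic claim: the conditional UPDATE succeeds for at most one worker."""
    cur = db.execute(
        "UPDATE jobs SET status='running', owner=? "
        "WHERE id=? AND status='pending' AND enabled=1",
        (worker, job_id),
    )
    db.commit()
    return cur.rowcount == 1


def checkpoint(db: sqlite3.Connection, job_id: int, step: int) -> None:
    db.execute("UPDATE jobs SET step=? WHERE id=?", (step, job_id))
    db.commit()


def record_failure(db: sqlite3.Connection, job_id: int, error: str, limit: int = 3) -> bool:
    """Release the job; disable it once the same error repeats `limit` times in a row.
    Returns True while the job is still enabled."""
    last_error, count = db.execute(
        "SELECT last_error, same_error_count FROM jobs WHERE id=?", (job_id,)
    ).fetchone()
    count = count + 1 if last_error == error else 1
    enabled = 0 if count >= limit else 1
    db.execute(
        "UPDATE jobs SET status='pending', owner=NULL, last_error=?, "
        "same_error_count=?, enabled=? WHERE id=?",
        (error, count, enabled, job_id),
    )
    db.commit()
    return enabled == 1
```

Because the claim is a single conditional `UPDATE`, two workers racing for the same row cannot both win, and a disabled job cannot be claimed at all.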
A herd of campaigns is harder than a herd of cattle in exactly one way: the cattle don't change shape every night.