Field study 05

DLFT.AI / CONTROL 2026 — Q2

A control loop, in disguise.

DLFT is a closed-loop distributed control system. It happens to run a consumer brand. The shape is the same as a robotic milker, a self-driving stack, or a satellite ground station — sensors, estimators, actuators, and a safety envelope around the parts that move atoms.

If you build robotics, perception, or any other system where the world bites back, you'll recognise everything in here. Replace the herd of cattle with a herd of campaigns. Replace the LIDAR with a Facebook Ad Library scrape. Replace the actuator arm with a Meta Graph API call. The control problem is structurally the same.

The hardness is structural too: noisy sensors, partially observable state, drifting objectives, irreversible actions, hard latency budgets, and a long tail of failure modes that have to be handled deterministically. The model is the easy part. Everything around it is the engineering.

The control loop

Sensors → estimator → policy → actuator → world. Feedback closes the loop. Telemetry observes everything. A hard envelope wraps the parts that move money.

Sample rate

Per-second job claim. Per-minute schedule tick. Per-hour bandit re-roll. Per-store cadence (1d → 1w).

Latency budget

Sub-second decisions for CS replies. Multi-second for ad evaluations. Bounded retry windows.

State estimation

Beta-Bernoulli arms · denormalized signal columns · feedback-driven priors.

Fault model

Container loss · API timeout · partial pause · stale read. All handled, none routed to humans.

If we spoke your language

Same shapes, different vocabulary.

Postgres job queue with SELECT … FOR UPDATE SKIP LOCKED

→

Fault-tolerant message bus with atomic claim semantics

4.2-second crash-to-recovery

→

Sub-five-second MTTR; automatic state reconciliation

Phase-checkpointed pipeline resume

→

Idempotent state machine; recover-from-arbitrary-fault

3-phase atomic ad kill (intent · paused · commit)

→

Staged commit with a write-ahead log; safety interlock

Thompson Sampling bandit over keywords

→

Online optimisation under uncertainty; bounded exploration

Adaptive scan cadence (activity → daily/3d/weekly)

→

Closed-loop control with activity-driven sampling

Multi-vendor image routing (six providers)

→

Heterogeneous model ensemble; vendor failover

Anti-manipulation gate above the LLM

→

Hard safety envelope around a probabilistic actuator

Per-call cost ledger (model, tokens, cents, brand)

→

First-class telemetry; replayable, queryable observability

MAX_RESUME_ATTEMPTS = 3

→

Circuit breaker; bounded autonomy with auto-disable
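
The first row of the list above, the Postgres job queue, can be sketched as a single claim query. This is a minimal illustration, not the production schema: the `jobs` table, its columns, and the `claim_next_job` helper are hypothetical names. Only the `FOR UPDATE SKIP LOCKED` clause comes from the text.

```python
from typing import Any, Optional

# Hypothetical table and column names; the source only names the locking clause.
CLAIM_SQL = """
UPDATE jobs
   SET status = 'running', claimed_by = %s, claimed_at = now()
 WHERE id = (
        SELECT id
          FROM jobs
         WHERE status = 'queued'
         ORDER BY created_at
         LIMIT 1
           FOR UPDATE SKIP LOCKED
       )
RETURNING id, kind
"""


def claim_next_job(cursor: Any, worker_id: str) -> Optional[tuple]:
    """Claim at most one queued job; return (id, kind), or None if nothing is claimable.

    `cursor` is any DB-API cursor on a Postgres connection. Rows already locked
    by another worker are skipped rather than waited on, so two workers never
    receive the same job. The caller commits the transaction.
    """
    cursor.execute(CLAIM_SQL, (worker_id,))
    return cursor.fetchone()
```

The claim is one statement, so there is no window between "find a job" and "mark it taken" in which a second worker could pick up the same row.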

Subsystems

Six blocks. Standard shape.

i. Sensors · ingest

Reading the world.

The state of the world is partially observable, noisy, and adversarial. Competitor ad libraries are rate-limited and bot-fingerprinted. Shopify storefronts mutate without notice. Customer messages arrive in seven languages. We treat each ingest path as a sensor — with its own sampling rate, calibration, fault model, and dropout handling.

What runs: stealth-aware browser scraping (proxy-class adaptive) · Shopify GraphQL mirror · Meta Marketing API · 17track shipment polling · agentmail CS inbox sync.

ii. Estimator · online learning

Belief over state.

Raw sensor data is not the world. The estimator turns observations into a current belief — what's working, what's decaying, what's worth more attention. Closed-loop, online, prior-driven. A keyword's reward distribution is a Beta-Bernoulli arm. A competitor's relevance is a decaying activity score. Both update continuously.
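
A decaying activity score and the cadence it drives might look like the sketch below. The half-life and the two thresholds are assumptions chosen for illustration. The text only says that the score decays and that cadence runs from daily through every three days to weekly.

```python
HALF_LIFE_DAYS = 7.0   # assumed: the score halves after a week without activity
DAILY_THRESHOLD = 5.0  # assumed cut-offs; not taken from the source
THREE_DAY_THRESHOLD = 1.0


def decayed_score(score: float, days_elapsed: float, new_events: int = 0) -> float:
    """Exponentially decay the previous score, then add newly observed events."""
    decay = 0.5 ** (days_elapsed / HALF_LIFE_DAYS)
    return score * decay + new_events


def scan_interval_days(score: float) -> int:
    """Map an activity score to a scan cadence: 1 (daily), 3, or 7 (weekly)."""
    if score >= DAILY_THRESHOLD:
        return 1
    if score >= THREE_DAY_THRESHOLD:
        return 3
    return 7
```

Under these assumed constants, a competitor that goes quiet drifts from daily to weekly scans on its own, and a burst of new ads pulls it back to daily.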

What runs: Thompson Sampling bandit (Jöhnk's algorithm, Beta-Bernoulli) · adaptive scan cadence scorer · cross-store fraud profile aggregator · denormalized signal columns for fast read.
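
For readers who have not met the technique, Thompson Sampling over Beta-Bernoulli arms fits in a few lines. This sketch assumes a uniform Beta(1, 1) prior and uses Python's standard `random.betavariate` as a stand-in for the Jöhnk sampler named above. The `Arm` class and `choose` function are illustrative names, not the production code.

```python
import random
from dataclasses import dataclass


@dataclass
class Arm:
    """One keyword: a count of observed successes and failures."""
    successes: int = 0
    failures: int = 0

    def sample(self, rng: random.Random) -> float:
        # Posterior is Beta(1 + successes, 1 + failures) under a uniform prior.
        return rng.betavariate(1 + self.successes, 1 + self.failures)

    def update(self, reward: bool) -> None:
        if reward:
            self.successes += 1
        else:
            self.failures += 1


def choose(arms: dict[str, Arm], rng: random.Random) -> str:
    """Draw one sample from each arm's posterior and play the largest draw."""
    return max(arms, key=lambda name: arms[name].sample(rng))
```

Exploration comes from the sampling itself: an arm with few observations has a wide posterior and still wins a draw now and then, while a well-measured weak arm almost never does.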

iii. Policy · the model layer

Where the LLM lives.

The estimator hands the policy block a current state and a decision to make. The policy synthesizes language: a refund offer, ad copy, an image prompt, a brand DNA spec. It is a probabilistic actuator — high-quality output, non-zero failure rate. Its job ends at proposing. It never decides what hits the world.

What runs: Claude Haiku/Sonnet/Opus (text) · Gemini 2.5 Flash (creative) · Leonardo, FASHN, Kling, OpenAI image (heterogeneous ensemble) · single-shot calls with structured output, no agentic loops.

iv. Actuator · safety envelope

The hard limit.

This is where the world bites back. The actuator is the only block that moves money or makes irreversible changes. It is wrapped in a deterministic safety envelope: budget caps, fraud-aware refund ceilings, three-phase WAL commits, and idempotent retries. The policy's output is sanitized, capped, and logged before any byte leaves our network.

What runs: 3-phase atomic Meta auto-kill · Shopify mutation with optimistic concurrency · CS reply through anti-manipulation cap · Revolut payout with daily/monthly/lifetime budgets.
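
One plausible reading of the three-phase kill (intent · paused · commit) is a write-ahead log that is replayed on resume. The sketch below assumes that reading. The in-memory `log` list stands in for a durable table, and `kill_ad` and `pause_ad` are hypothetical names rather than the Meta API.

```python
from typing import Callable, List, Tuple

PHASES = ("intent", "paused", "commit")


def kill_ad(
    ad_id: str,
    log: List[Tuple[str, str]],
    pause_ad: Callable[[str], None],
) -> None:
    """Drive one ad through intent -> paused -> commit, skipping logged phases."""
    done = {phase for (logged_id, phase) in log if logged_id == ad_id}
    if "intent" not in done:
        # Record the intent before any side effect, so a crash leaves a trace.
        log.append((ad_id, "intent"))
    if "paused" not in done:
        # Assumed idempotent: pausing an already paused ad must be harmless.
        pause_ad(ad_id)
        log.append((ad_id, "paused"))
    if "commit" not in done:
        # Final marker: the kill is complete and needs no further work.
        log.append((ad_id, "commit"))
```

If `pause_ad` raises after the intent row is written, calling `kill_ad` again with the same log picks up at the pause step. A call on a fully committed ad does nothing.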

v. Observability · telemetry bus

Replay or it didn't happen.

Every block writes structured events to a shared bus. State transitions, model calls, costs, retries, recoveries — each one row, each one queryable. We can rebuild any incident from the log alone: which sensor fired, which estimate updated, which policy proposed, which actuator commit landed, what it cost. Replayable, not rememberable.

What runs: per-call cost ledger (ApiUsage) · BackgroundJobLog · PipelineStepLog · MetaAutoKillLog · ActivityLog · CsConversationLog · structured incident replay.
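
A ledger row with the four fields named earlier (model, tokens, cents, brand) could be modelled as below. The `ApiUsageRow` dataclass and the `cents_by_brand` helper are illustrative assumptions. The real `ApiUsage` table may carry more columns.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ApiUsageRow:
    """One model call: which model, how many tokens, what it cost, for whom."""
    model: str
    tokens: int
    cents: int
    brand: str


def cents_by_brand(rows: Iterable[ApiUsageRow]) -> Dict[str, int]:
    """Sum the cost in cents per brand across a set of ledger rows."""
    totals: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row.brand] += row.cents
    return dict(totals)
```

Storing cost as integer cents keeps the sums exact. Because each call is its own row, a per-brand or per-model total is a plain aggregation over the log.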

vi. Fault tolerance · scheduler

When the world bites back.

Containers die. APIs time out. Webhooks lie. Networks partition. The scheduler is the supervisor: it claims work atomically, checkpoints multi-step state machines, and releases orphaned rows on every boot. A job that crashes deterministically has its schedule auto-disabled before it can burn through a budget. Bounded autonomy is not optional.

What runs: Postgres-claimed jobs · phase-checkpointed pipelines · boot-time ghost amnesty (>4h stale) · MAX_RESUME_ATTEMPTS=3 circuit breaker · per-kind worker isolation.
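
The circuit breaker and the boot-time sweep reduce to two small predicates. In this sketch only `MAX_RESUME_ATTEMPTS = 3` and the four-hour staleness bound come from the text. The `Schedule` shape and the choice to disable on the third crash, rather than after it, are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_RESUME_ATTEMPTS = 3            # from the text
GHOST_AFTER = timedelta(hours=4)   # from the text: claims stale for more than 4h


@dataclass
class Schedule:
    enabled: bool = True
    resume_attempts: int = 0


def record_crash(schedule: Schedule) -> Schedule:
    """Count a failed resume; disable the schedule once the cap is reached."""
    schedule.resume_attempts += 1
    if schedule.resume_attempts >= MAX_RESUME_ATTEMPTS:
        schedule.enabled = False   # stop retrying instead of burning budget
    return schedule


def is_ghost(claimed_at: datetime, now: datetime) -> bool:
    """True if a claimed row has sat untouched past the staleness bound."""
    return now - claimed_at > GHOST_AFTER
```

On boot, a sweep would release every row for which `is_ghost` holds, so that work claimed by a dead container becomes claimable again.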

In your units

MTTR

4.2s

Mean time to recovery from container loss. Idempotent retries; no human in the loop.

Decision cadence

12k/h

Per active brand. Pipeline runs, ad evaluations, scrape rounds, model calls.

Ensemble cardinality

Heterogeneous models in production. Three text tiers. Six image / video providers. Routed per task.

Safety interlocks

Deterministic envelopes wrapping probabilistic actuators. Refund cap, kill threshold, dispute history, daily budget, monthly budget, attempt cap, idempotency key.

State reconciliation

≤60s

Cron tick interval. Boot-time sweep. Webhook reconcile. Drift bounded by design.

Autonomy boundary

Self-disable threshold for any deterministic crash loop. Circuit breaker over open-ended retry.

Observability rows

1:1

Every state transition is exactly one row in a structured log. Replayable, not rememberable.

Money leaked to faults

€0

No silent ad-spend gaps, no orphaned receipts, no missed pauses. Verifiable in the audit log.

A herd of campaigns is harder than a herd of cattle in exactly one way: the cattle don't change shape every night.