Architecture Portfolio

Systems I Design

Three architectural patterns that recur across every large-scale infrastructure programme I have led. Abstracted from specific implementations, these diagrams represent the structural thinking behind lifecycle automation, Digital Twins, and observability at scale.

01 / Architecture

Closed-Loop Lifecycle Automation

The engine behind Zero Touch Networks and Digital Twins. Infrastructure converges toward its intended state through a continuous cycle: declare intent, strategize the path, execute, observe the outcome, and assess the drift. Each phase is a distinct software system; the architecture is in how they compose.

[Diagram: Closed Loop, continuous convergence]
Phases: 01 Intent (what should be) → 02 Strategize (how to get there) → 03 Execute (execute safely) → 04 Observe (what actually is) → 05 Assess (measure the gap). Drift detected → re-converge.
Sub-systems: Domain Model (source of truth) feeds Intent + Strategy · Simulation Engine (Monte Carlo · solvers) feeds Strategy + Assess · Workflow Engine (guided · Zero Touch) executes strategy.


The forward path drives convergence; the feedback loop from Assess back to Intent ensures drift is detected and corrected. Three foundational sub-systems (the Domain Model, the Simulation Engine, and the Workflow Engine) underpin the entire cycle.
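A minimal sketch of the loop in Python can make the composition concrete. It assumes infrastructure state is a flat dictionary of settings and that every step succeeds; the function names, the `Step` type, and the example keys are hypothetical and are not taken from any specific implementation.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One action in a strategy (hypothetical shape)."""
    key: str
    value: int


def assess(intent: dict, observed: dict) -> dict:
    """Assess: return the intended settings that the observed state lacks."""
    return {k: v for k, v in intent.items() if observed.get(k) != v}


def strategize(drift: dict) -> list:
    """Strategize: turn drift into an ordered list of steps."""
    return [Step(k, v) for k, v in sorted(drift.items())]


def execute(steps: list, infrastructure: dict) -> None:
    """Execute: apply each step to the (simulated) infrastructure."""
    for step in steps:
        infrastructure[step.key] = step.value


def observe(infrastructure: dict) -> dict:
    """Observe: take a snapshot of what actually is."""
    return dict(infrastructure)


def converge(intent: dict, infrastructure: dict, max_cycles: int = 5) -> int:
    """Run the loop until no drift remains; return the cycles that made changes."""
    for cycle in range(max_cycles):
        drift = assess(intent, observe(infrastructure))
        if not drift:
            return cycle
        execute(strategize(drift), infrastructure)
    raise RuntimeError("did not converge within max_cycles")


# Hypothetical example: one spine link is missing relative to intent.
intent = {"spine_links": 4, "psu_count": 2}
infra = {"spine_links": 3, "psu_count": 2}
cycles = converge(intent, infra)
```

In this toy, each phase is a plain function so the composition stays visible. In the architecture described above, each phase would be its own software system, and Execute would need to handle partial failure and rollback.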

This pattern is explored in depth in The Automation Ladder and powers the Fragility Index feedback loop.

02 / Architecture

Digital Twin Architecture Layers

A Digital Twin is not a dashboard — it is a high-fidelity, computable model of physical infrastructure that supports simulation, prediction, and autonomous decision-making. The architecture is a layered stack, each layer building on the one below.

L5
Autonomous Action
Closed-loop automation driven by AI agents. Intent is expressed, plans are generated, actions are executed, outcomes are observed. The twin drives the infrastructure, not just mirrors it.
Intent engines · RL agents · Strategy refinement · Zero Touch workflows
L4
Prediction & Simulation
Forward-looking capabilities: Monte Carlo failure simulations, capacity forecasting, what-if scenario planning. The twin models not just present state, but possible futures.
Monte Carlo · MTBF / MTTR · Capacity solvers · What-if scenarios
L3
Analytics & Insight
Fragility assessment, SLO tracking, anomaly detection, and trend analysis. This layer transforms raw state into actionable signals: where is risk accumulating, and what demands attention?
Fragility Index · P(SLO breach) · Anomaly detection · Trend analysis
L2
Domain Model
The computable representation of infrastructure: topology, capacity, redundancy (N+k), dependencies, and demand. Three temporal views — observed state, intended state, and planned state — form the foundation for every layer above.
Space · Power · Cooling · Network · N+k model · 3 temporal views
L1
Data Ingestion
Telemetry collection, configuration parsing, inventory reconciliation, and demand feeds. Sensors, SNMP, streaming APIs, and batch imports converge into a normalized data lake that feeds the domain model.
Telemetry · Config parsing · Inventory sync · Demand feeds
L0
Physical Infrastructure
The real world. Data centers, network fabrics, power distribution, cooling systems, and the interconnection ecosystem. Everything above is a representation of what exists here.
Data centers · Network fabrics · Power distribution · Cooling plants
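To illustrate the Domain Model layer (L2), here is a small sketch of a redundant resource pool that carries the three temporal views and answers an N+k question for each. The class name, field names, units (kW), and example figures are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from enum import Enum


class View(Enum):
    OBSERVED = "observed"   # what telemetry says exists now
    INTENDED = "intended"   # what the design says should exist
    PLANNED = "planned"     # what will exist after scheduled changes


@dataclass
class ResourcePool:
    """A group of identical units, e.g. chillers or UPS modules (hypothetical)."""
    name: str
    unit_capacity_kw: float
    units: dict = field(default_factory=dict)      # View -> installed unit count
    demand_kw: dict = field(default_factory=dict)  # View -> demand in kW

    def spare_units(self, view: View) -> int:
        """Return k in N+k: units beyond the N needed to carry demand."""
        needed = math.ceil(self.demand_kw[view] / self.unit_capacity_kw)
        return self.units[view] - needed

    def meets_redundancy(self, view: View, k: int) -> bool:
        """True if the pool keeps at least k spare units in this view."""
        return self.spare_units(view) >= k


# Hypothetical figures: demand grows, and the plan does not yet add a unit.
cooling = ResourcePool(
    name="hall-a-chillers",
    unit_capacity_kw=300.0,
    units={View.OBSERVED: 4, View.INTENDED: 5, View.PLANNED: 4},
    demand_kw={View.OBSERVED: 850.0, View.INTENDED: 1000.0, View.PLANNED: 1000.0},
)
```

With these figures the observed and intended views hold N+1, while the planned view drops to N+0. A gap like that between views is the kind of signal the layers above could act on.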

Layer 3 implements the Fragility Index methodology. Layer 5 implements the lifecycle automation pattern above. The Domain Model (Layer 2) is the foundational concept discussed in The Automation Ladder.
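As a rough illustration of the Monte Carlo simulation named in Layer 4, the sketch below estimates how often an N+k pool has fewer than N units available. It is not the Fragility Index methodology. It assumes units fail independently and that each unit is up with the steady-state probability MTBF / (MTBF + MTTR). The function names and parameters are hypothetical.

```python
import random


def unit_availability(mtbf_h: float, mttr_h: float) -> float:
    """Steady-state probability that a single unit is up."""
    return mtbf_h / (mtbf_h + mttr_h)


def shortfall_probability(units: int, needed: int, mtbf_h: float,
                          mttr_h: float, trials: int = 50_000,
                          seed: int = 0) -> float:
    """Estimate P(fewer than `needed` of `units` are up) by sampling.

    Assumes independent failures, which ignores common-cause events
    such as a shared upstream feed.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    availability = unit_availability(mtbf_h, mttr_h)
    shortfalls = 0
    for _ in range(trials):
        up = sum(rng.random() < availability for _ in range(units))
        if up < needed:
            shortfalls += 1
    return shortfalls / trials


# Hypothetical pool: 3 units needed, MTBF 900 h, MTTR 100 h (availability 0.9).
p_n_plus_1 = shortfall_probability(units=4, needed=3, mtbf_h=900, mttr_h=100)
p_n_plus_2 = shortfall_probability(units=5, needed=3, mtbf_h=900, mttr_h=100)
```

For the N+1 case the exact binomial value is about 0.052, so the estimate can be checked against a closed form. Sampling becomes worthwhile once dependencies, maintenance windows, or demand variation make a closed form impractical.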

03 / Architecture

Observability Pipeline

You cannot control what you cannot observe. This pipeline transforms raw infrastructure telemetry into the signals that feed the lifecycle loop and the Digital Twin. Built for Google-scale volumes, the pattern generalises to any infrastructure at any scale.

[Diagram: Observability Pipeline]
Sources: routers · switches · power · cooling · sensors.
Stages: Collection (ingest · normalize · buffer) → Enrichment (topology · ownership · context) → Correlation (group · deduplicate · root cause) → Assessment (health · fragility · SLO impact) → Signal (alert · prioritize · dispatch) → Twin Update (state sync · model refresh).
Outputs: Domain Model (observe state) · Lifecycle Engine (trigger actions) · Dashboards & APIs (human insight).


Raw telemetry from heterogeneous infrastructure sources flows through collection, enrichment with topology context, correlation for root-cause analysis, health assessment, and finally actionable signal generation. The output feeds the Domain Model, the Lifecycle Engine, and human operators.
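The stages can be sketched as a chain of small functions. The version below is a toy: the event shape, the `TOPOLOGY` table, the device names, and the root-cause heuristic (blame the upstream device if it also reported) are all assumptions for illustration, and the Twin Update stage is omitted.

```python
from collections import defaultdict

# Hypothetical topology context: device -> upstream device and owning team.
TOPOLOGY = {
    "sw-1": {"upstream": "rtr-1", "owner": "net-ops"},
    "sw-2": {"upstream": "rtr-1", "owner": "net-ops"},
    "rtr-1": {"upstream": None, "owner": "net-ops"},
}
UNKNOWN = {"upstream": None, "owner": "unknown"}


def collect(raw_events):
    """Collection: normalize heterogeneous records to one shape."""
    return [{"device": e["src"].lower(), "kind": e["type"]} for e in raw_events]


def enrich(events):
    """Enrichment: attach topology and ownership context."""
    return [{**e, **TOPOLOGY.get(e["device"], UNKNOWN)} for e in events]


def correlate(events):
    """Correlation: deduplicate, then group events under a probable root."""
    unique = list({(e["device"], e["kind"]): e for e in events}.values())
    reporting = {e["device"] for e in unique}
    groups = defaultdict(list)
    for e in unique:
        root = e["upstream"] if e["upstream"] in reporting else e["device"]
        groups[root].append(e)
    return dict(groups)


def assess(groups):
    """Assessment: summarize the impact of each group."""
    return [
        {"root": root,
         "impacted": len(members),
         "devices": sorted({m["device"] for m in members})}
        for root, members in groups.items()
    ]


def signal(assessments):
    """Signal: prioritize by number of impacted events, largest first."""
    return sorted(assessments, key=lambda a: a["impacted"], reverse=True)


raw = [
    {"src": "SW-1", "type": "link_down"},
    {"src": "SW-1", "type": "link_down"},   # duplicate report
    {"src": "sw-2", "type": "link_down"},
    {"src": "RTR-1", "type": "link_down"},
    {"src": "pdu-9", "type": "overcurrent"},
]
alerts = signal(assess(correlate(enrich(collect(raw)))))
```

Here five raw events collapse into two prioritized signals, with the router group ranked first. A production pipeline would add buffering, time windows for correlation, and a far richer topology model.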

This pipeline populates the Observe phase in the lifecycle loop and feeds the Fragility Index with the empirical MTBF/MTTR data needed for probabilistic simulation.
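As an example of how empirical MTBF and MTTR might be derived from observed outages, the sketch below uses the common definitions: total uptime divided by the number of failures, and total downtime divided by the number of failures. The function name and the record format (start and end hours within an observation window) are assumptions.

```python
def mtbf_mttr(outages, window_hours):
    """Return (MTBF, MTTR) in hours for one component.

    `outages` is a list of (start_h, end_h) pairs, assumed to be
    non-overlapping and to lie inside the observation window.
    """
    if not outages:
        raise ValueError("no failures observed; MTBF is undefined for this window")
    downtime = sum(end - start for start, end in outages)
    uptime = window_hours - downtime
    failures = len(outages)
    return uptime / failures, downtime / failures


# Hypothetical record: three outages over a 1000-hour window.
mtbf_h, mttr_h = mtbf_mttr([(100, 104), (500, 502), (900, 906)], window_hours=1000)
```

With only a handful of failures the estimates carry wide uncertainty, so a real feed would likely pool data across similar components before using it in simulation.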