Essay

The Automation Ladder

Every infrastructure organization sits on a rung. Most think they are higher than they are. The rungs are not about technology. They are about how much you trust your systems to act without you.

I have spent nearly three decades climbing this ladder, at each rung discovering that the hard part is never the code. It is convincing an organization that the next rung is not only possible but necessary, and that the engineers currently doing the work manually are not being replaced but freed to solve harder problems.

What follows are the five rungs as I climbed them, shaped by 16 years at Google spent transforming how the world's largest network is managed and operated. No rung is merely a technology upgrade. Each is a shift in how humans relate to the infrastructure they build.

The starting point

Manual Operations

The engineer SSHs into a router, reads a Method of Procedure (MOP) document, and types commands. Knowledge lives in people's heads and in wiki pages that are perpetually out of date. Every change is a unique event. Every outage is a surprise. Rollback means "the engineer who made the change remembers what they typed."
This is where most of the Internet was built in the 1990s and early 2000s. I lived it firsthand building early Internet infrastructure at Liberty Global, Optus, and in Google's early networking days. It works until it does not, and what breaks it is always scale.
Humans do everything. The system has no memory.
The first shift

Scripted Assistance

Someone writes a Python script. Then another. Then a library. Then a framework. The scripts encode what the best engineers know, but the human is still in the loop for every decision. Config templates replace freeform typing. Version control replaces "I think I backed it up somewhere."
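To make "config templates replace freeform typing" concrete, here is a minimal sketch using only Python's standard library. The template text, the `render_interface` helper, and the MTU bounds are illustrative assumptions, not any vendor's exact syntax or any particular team's tooling. The human still reviews and applies the result.

```python
from string import Template

# Hypothetical interface stanza; the syntax is illustrative, not a real vendor CLI.
INTERFACE_TEMPLATE = Template(
    "interface $name\n"
    " description $description\n"
    " mtu $mtu\n"
)


def render_interface(name: str, description: str, mtu: int = 1500) -> str:
    """Render one interface stanza from validated inputs."""
    # Assumed sanity bounds; a real deployment would take these from its own standards.
    if not 576 <= mtu <= 9216:
        raise ValueError(f"mtu {mtu} is outside the assumed range 576-9216")
    # substitute() raises KeyError if a placeholder is left unfilled.
    return INTERFACE_TEMPLATE.substitute(name=name, description=description, mtu=mtu)


config = render_interface("et-0/0/1", "uplink to spine-1", mtu=9000)
print(config)
```

The script encodes one engineer's knowledge (the stanza layout, the valid MTU range), but it still has no idea what the network as a whole should look like.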
This is the rung where most organizations get stuck, sometimes permanently. The scripts multiply, the templates diverge, and the tooling becomes its own maintenance burden. The question that unlocks the next rung: what if the system knew what the network should look like, not just what to type?
Humans decide. Scripts execute. Knowledge is in code, but scattered.
The domain shift

Guided Workflows

The text-based MOP disappears. In its place: a structured workflow that guides the operator step by step, with pre-checks, validation, and safety gates built in. The operator no longer needs to be a subject-matter expert. They follow the workflow, and the system ensures correctness.
This is where domain modeling becomes critical. The system must understand entities, relationships, and constraints, not just config syntax. At Google, this was the shift that let us scale: deployment engineers could bring new capacity into production through a web UI, without deep networking expertise. The 40-step MOP became a guided workflow executable by anyone.
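A guided workflow of this kind can be sketched as a list of steps, each wrapped in a pre-check, an operator confirmation, and a post-check. This is a simplified illustration under stated assumptions: the `Step` and `Workflow` classes, the dictionary used as network state, and the two example steps are all hypothetical, not a description of Google's actual tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    name: str
    precheck: Callable[[dict], bool]   # safety gate: must pass before the action runs
    action: Callable[[dict], None]     # changes the (modelled) state
    postcheck: Callable[[dict], bool]  # validates that the action had the intended effect


@dataclass
class Workflow:
    steps: List[Step]
    log: List[str] = field(default_factory=list)

    def run(self, state: dict, confirm: Callable[[str], bool]) -> bool:
        """Run steps in order; stop at the first failed gate or declined confirmation."""
        for step in self.steps:
            if not step.precheck(state):
                self.log.append(f"{step.name}: precheck failed, stopping")
                return False
            if not confirm(step.name):
                self.log.append(f"{step.name}: operator declined")
                return False
            step.action(state)
            if not step.postcheck(state):
                self.log.append(f"{step.name}: postcheck failed, stopping")
                return False
            self.log.append(f"{step.name}: ok")
        return True


# Hypothetical two-step turn-up: enable a link only while drained, then undrain it.
steps = [
    Step("enable link",
         precheck=lambda s: s["drained"],
         action=lambda s: s.update(link="up"),
         postcheck=lambda s: s["link"] == "up"),
    Step("undrain",
         precheck=lambda s: s["link"] == "up",
         action=lambda s: s.update(drained=False),
         postcheck=lambda s: not s["drained"]),
]

state = {"drained": True, "link": "down"}
workflow = Workflow(steps)
ok = workflow.run(state, confirm=lambda step_name: True)
```

The operator's only job here is the `confirm` callback. The expertise lives in the checks, which is what lets a non-expert run the procedure.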
The system guides. Humans confirm. Non-SMEs can operate.
The autonomous shift

Zero Touch

The human is removed from the operational path entirely. The system receives intent (what the network should look like), computes the plan (what needs to change), validates safety (is this change safe right now), executes the workflow, and observes the outcome. A closed-loop control system.
This is the hardest rung to reach because it requires trust. Trust in the model, trust in the safety checks, trust in the rollback mechanisms. At Google, the Zero Touch Network initiative proved it was possible at hyperscale: daily network operations with no human in the loop. The key insight is that removing humans does not mean removing judgment. It means encoding judgment into the system itself.
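The closed loop described above (receive intent, compute a plan, validate safety, execute, observe) can be sketched in a few lines. This is a toy model under loud assumptions: state is a plain mapping of device name to config version, and `compute_plan`, `reconcile`, the `is_safe` callback, and the batch-size limit are hypothetical names chosen for illustration. Real systems add rollback, richer safety checks, and far more detailed models.

```python
from typing import Callable, Dict

State = Dict[str, str]  # assumed model: device name -> config version


def compute_plan(intent: State, observed: State) -> State:
    """The plan is every device whose observed value differs from the intent."""
    return {dev: want for dev, want in intent.items() if observed.get(dev) != want}


def reconcile(
    intent: State,
    observe: Callable[[], State],
    apply: Callable[[str, str], None],
    is_safe: Callable[[str, State], bool],
    batch_size: int = 1,
    max_rounds: int = 10,
) -> bool:
    """Drive observed state toward intent; return True once they match."""
    for _ in range(max_rounds):
        observed = observe()                   # observe the outcome of earlier rounds
        plan = compute_plan(intent, observed)  # what still needs to change
        if not plan:
            return True                        # converged: nothing left to do
        # Validate safety per device, then limit how much changes in one round.
        safe = [dev for dev in sorted(plan) if is_safe(dev, observed)]
        for dev in safe[:batch_size]:
            apply(dev, plan[dev])              # execute
    return False                               # did not converge within max_rounds


network = {"r1": "v1", "r2": "v1", "r3": "v2"}
intent = {"r1": "v2", "r2": "v2", "r3": "v2"}
converged = reconcile(
    intent,
    observe=lambda: dict(network),
    apply=network.__setitem__,
    is_safe=lambda dev, observed: True,
)
```

Note where the judgment went: the batch limit and the `is_safe` callback are the encoded equivalent of an engineer deciding "not all at once, and not right now."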
The system acts. Humans set intent. Operations are autonomous.
The intelligence shift

AI-Driven Autonomous Systems

The system does not just execute intent; it generates intent. The foundation is a data model that captures not only the infrastructure (past, present, and projected future) but also the business: demand forecasts, supply chain constraints, customer commitments, and market dynamics. The Digital Twin is this complete model, a living representation where the physical and the commercial converge.
On top of the model sit the oracles: simulations (Monte Carlo, what-if scenarios), solvers (optimization, constraint satisfaction), reinforcement learning (adaptive decision-making), and generative AI (proposing novel configurations). These oracles derive insights no human could compute, measuring fragility across every layer, predicting capacity exhaustion before it happens, and generating plans that balance cost, risk, and performance.
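As a small illustration of one such oracle, the sketch below uses Monte Carlo simulation to estimate the probability that demand exceeds capacity within a given number of months. The growth model (independent, normally distributed monthly growth), the function names, and every number in the example are assumptions made for illustration, not figures from any real network.

```python
import random
from typing import Optional


def months_until_exhaustion(
    current: float,
    capacity: float,
    mean_growth: float,
    stddev: float,
    rng: random.Random,
    horizon: int = 60,
) -> Optional[int]:
    """Simulate one demand trajectory; return the first month demand exceeds capacity."""
    demand = current
    for month in range(1, horizon + 1):
        # Assumed model: each month's growth rate is drawn independently.
        demand *= 1 + rng.gauss(mean_growth, stddev)
        if demand > capacity:
            return month
    return None  # capacity was not exhausted within the horizon


def exhaustion_risk(
    current: float,
    capacity: float,
    mean_growth: float,
    stddev: float,
    within_months: int,
    trials: int = 10_000,
    seed: int = 0,
) -> float:
    """Estimate the fraction of simulated futures that exhaust capacity in time."""
    rng = random.Random(seed)  # seeded so a run is reproducible
    hits = sum(
        months_until_exhaustion(current, capacity, mean_growth, stddev, rng, within_months)
        is not None
        for _ in range(trials)
    )
    return hits / trials


# Hypothetical link: 100 units of demand, 150 of capacity, 3% mean monthly growth.
risk = exhaustion_risk(100, 150, mean_growth=0.03, stddev=0.02, within_months=12)
print(f"estimated risk of exhaustion within 12 months: {risk:.1%}")
```

A planner would run this kind of estimate across every link and layer, then hand the results to a solver that decides where to add capacity first.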
The plans themselves follow a refinement lifecycle. They start coarse: high-level capacity intentions spanning quarters and regions. As they mature and become more committed, they sharpen in resolution, becoming concrete in space (which facility, which rack, which port), in time (which maintenance window, which quarter), and in action (which workflows, which dependencies, which rollback strategy). A plan is never a single artifact. It is a living thing that gains precision as commitment grows.
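One way to picture this refinement lifecycle is as an immutable record whose unresolved fields shrink as commitment grows. The sketch below is a hypothetical data structure, not a description of any production planning system; the `Plan` fields, the `refine` helper, and all example values are invented for illustration.

```python
from dataclasses import dataclass, fields, replace
from typing import List, Optional


@dataclass(frozen=True)
class Plan:
    # Coarse intent: known from the start.
    region: str
    quarter: str
    # Space: resolved as the plan matures.
    facility: Optional[str] = None
    rack: Optional[str] = None
    port: Optional[str] = None
    # Time and action: resolved last.
    maintenance_window: Optional[str] = None
    workflow: Optional[str] = None
    rollback: Optional[str] = None

    def unresolved(self) -> List[str]:
        """Names of the details that have not been decided yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def is_executable(self) -> bool:
        """A plan can only be executed once every detail is concrete."""
        return not self.unresolved()


def refine(plan: Plan, **details: str) -> Plan:
    """Return a sharper copy; refuse to overwrite a detail that is already committed."""
    for name in details:
        if getattr(plan, name) is not None:
            raise ValueError(f"{name} is already committed")
    return replace(plan, **details)


coarse = Plan(region="emea", quarter="2026-Q3")
scoped = refine(coarse, facility="fac-1", rack="r12")
final = refine(scoped, port="et-0/0/1", maintenance_window="2026-08-14T02:00Z",
               workflow="turn-up", rollback="drain-and-disable")
```

Each refinement produces a new version rather than mutating the old one, so the history of how a plan gained precision stays available for review.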
This is where infrastructure meets intelligence. We are still climbing this rung. The technology exists. The challenge now is organizational: teaching enterprises to trust systems that see further than any individual operator.
The system reasons. Humans govern. Infrastructure thinks.

The temptation at every rung is to believe you have arrived. Organizations that scripted their operations in 2010 often feel automated in 2026. They are not. The gap between scripted assistance and true Zero Touch is the same as the gap between cruise control and a self-driving car: one executes instructions, the other understands context.

The ladder is not just a technology progression. It is a trust progression. Each rung requires an organization to let go of something: manual control, then expert gatekeeping, then human-in-the-loop approval, then deterministic-only reasoning. What you gain at each step is the decoupling of growth from cost: capacity, services, and output scale while the teams, tooling, and operational overhead grow sub-linearly. That is the economic argument for every rung on this ladder.

The question is not whether your infrastructure will climb this ladder. It is whether you will climb it deliberately, learning from those who have already made the ascent, or be forced up it by the competition.
