This is Episode 9 of the Autonomous Dev Org series — an honest account of building a development organization where AI handles implementation and humans handle direction.

The Coordination Tax

Episode 8 gave us the Global Architect: a system-wide view that catches boundary blindness before implementation begins. Episode 7 gave us Hook Shields: sub-2-second enforcement at the point of write. Both worked. Neither was sustainable at scale.

The problem was coordination. Every task required a judgment call: is this local enough for Hook Shields only, or complex enough to warrant a full Global Architect pass? We were making that call manually, task by task. In a loop processing dozens of tasks per day, the coordination overhead had become its own full-time job, exactly the kind of work we’d built the loop to eliminate.

We had two powerful verification systems and were manually routing between them. That’s not an autonomous organization. That’s a human-in-the-loop wrapper around automation. We needed the loop itself to understand what kind of task it was handling and route accordingly. The Intelligence Router was the missing piece.

What We Built

We built an Intelligence Router: a lightweight classification layer that sits in front of both verification systems and decides, for each task, which tier of verification is appropriate. Local tasks stay in the inner loop. Cross-repo tasks escalate to the Global Architect. Persistent violations trigger a new process: the Correction Cycle.

The Correction Cycle is what makes the loop genuinely self-healing. When a violation recurs despite Hook Shields (which shouldn’t happen but occasionally does), the cycle doesn’t just fix the code. It analyzes the failure pattern, codifies a new or updated shield, and deploys it across all repositories simultaneously. The organization learns from every bug, at the speed of a git push.

The War Room

Architect: Okay, we’ve got local Hook Shields for the inner loop and a Global Architect for the boundaries. But the coordination is starting to feel like a full-time job. How do we scale this without manually routing every task?

Director: That’s the Master Agent Fallacy. We tried building one agent that knew everything about every repo. It got slow and overconfident. I’m moving to a Hierarchy of Verification, managed by an Intelligence Router.

Architect: Small fixes stay local, big changes go global?

Director: Precisely. But the router decides, not us. Trust becomes a routing decision based on context weight and risk, made automatically from the task specification.

Builder: What does that look like in practice?

Director: The router reads the incoming task. Single repository, no shared types touched? Stays local. Hook Shields run. Touches the Core SDK, or modifies a type used across repo boundaries? The Global Architect runs first and produces a Sync Plan. Domain agents work from the plan.

Architect: And when something still fails after all that?

Director: That’s the interesting case. A recurring violation, something that slips through existing shields, isn’t just a bug. It’s a gap in the governance model. The Correction Cycle kicks in: analyze the pattern, write a new shield, deploy it everywhere.

Builder: So the organization gets smarter every time I mess up?

Director: That’s exactly right. In a traditional team, a lesson learned by one person stays with that person. In this system, a violation caught anywhere becomes enforcement applied everywhere. The IQ of the loop increases with every error it encounters.

The Routing Decision

The Intelligence Router classifies tasks along two dimensions: scope and risk. Scope answers: how many repositories does this task touch?
  • Single-repo: the task is bounded. Local verification is sufficient.
  • Cross-repo: the task has boundary effects. Global verification is required.
Risk answers: what is the blast radius if this task produces an error?
  • Low: isolated behavior, easily reversed, no downstream consumers affected.
  • High: shared types, public contracts, or infrastructure changes with wide impact.
The routing matrix:
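| Scope       | Risk | Verification Tier                                                   |
|-------------|------|---------------------------------------------------------------------|
| Single-repo | Low  | Inner loop only (Hook Shields + compile + tests)                    |
| Single-repo | High | Inner loop + targeted boundary check                                |
| Cross-repo  | Low  | Global Architect scan + local execution                             |
| Cross-repo  | High | Full Global Architect pass + SDK-First gate + coordinated execution |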
The router reads the task specification — which repositories it mentions, which types it modifies, which services it touches — and assigns a tier. This classification adds seconds to task setup and saves hours of incident response.
[Figure: Intelligence Router architecture. Incoming tasks are classified by scope and risk, then routed to Hook Shields for local tasks or to the Global Architect for cross-repo tasks, with the Correction Cycle closing the loop on new violation patterns.]
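To make the routing matrix concrete, here is a minimal Python sketch of a two-dimensional classifier. It is an illustration under stated assumptions, not the router described in this series: `TaskSpec`, its fields, the `shared_types` set, and the rule that a multi-repository task counts as cross-repo are all hypothetical choices made for the example. Only the four tier descriptions come from the matrix itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Tuple


class Scope(Enum):
    SINGLE_REPO = "single-repo"
    CROSS_REPO = "cross-repo"


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# The four cells of the routing matrix, keyed by (scope, risk).
ROUTING_MATRIX = {
    (Scope.SINGLE_REPO, Risk.LOW): "Inner loop only (Hook Shields + compile + tests)",
    (Scope.SINGLE_REPO, Risk.HIGH): "Inner loop + targeted boundary check",
    (Scope.CROSS_REPO, Risk.LOW): "Global Architect scan + local execution",
    (Scope.CROSS_REPO, Risk.HIGH): (
        "Full Global Architect pass + SDK-First gate + coordinated execution"
    ),
}


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical structured task specification (field names are assumptions)."""
    repositories: FrozenSet[str]
    modified_types: FrozenSet[str]
    touches_infrastructure: bool = False


def classify(task: TaskSpec, shared_types: FrozenSet[str]) -> Tuple[Scope, Risk]:
    """Derive scope from the repository count and risk from what the task changes."""
    scope = Scope.CROSS_REPO if len(task.repositories) > 1 else Scope.SINGLE_REPO
    # Assumed risk rule: shared types or infrastructure changes mean wide impact.
    high = bool(task.modified_types & shared_types) or task.touches_infrastructure
    return scope, (Risk.HIGH if high else Risk.LOW)


def route(task: TaskSpec, shared_types: FrozenSet[str]) -> str:
    """Look up the verification tier for a task."""
    return ROUTING_MATRIX[classify(task, shared_types)]


shared = frozenset({"Money"})
local_fix = TaskSpec(frozenset({"billing"}), frozenset({"InvoiceRow"}))
sdk_change = TaskSpec(frozenset({"core-sdk", "billing"}), frozenset({"Money"}))
print(route(local_fix, shared))
print(route(sdk_change, shared))
```

Keeping the matrix as a plain dictionary means that changing a tier is a data edit, not a logic change, which suits a classifier whose job is consistent application of a fixed table.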

The Correction Cycle

Hook Shields have a coverage problem: they catch what you’ve already thought of. A new violation class — one that wasn’t anticipated when the shields were written — can slip through the inner loop undetected. The Correction Cycle is how the loop responds when this happens.
  1. Violation Detected. The Verifier agent catches a violation in a merged PR that existing shields didn’t prevent. This is the signal that governance has a gap.
  2. Pattern Analysis. The Global Architect analyzes the failure: what rule was violated, why the existing shield didn’t catch it, and whether this represents a new pattern or a gap in existing coverage. The analysis produces a structured description of the violation class.
  3. Shield Codification. A new Hook Shield is written, or an existing one updated, to cover the gap. The shield is specific: it targets the exact pattern that slipped through, with an error message that names the rule and explains the fix.
  4. Organization-Wide Deployment. The new shield is pushed to all repositories simultaneously. The next task in any repository runs with the updated enforcement. The violation class is now blocked everywhere, not just in the repository where it was caught.
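The four steps can be sketched as a small in-memory model. This is a hedged illustration, not the actual tooling: `Shield`, `Organization`, `correction_cycle`, the repository names, and the regex are all hypothetical, and the floating-point currency rule is borrowed from the example later in this article. The `approved` flag stands in for the human review of new shields that the article says was added before deployment.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Shield:
    rule_id: str
    message: str                     # names the rule and explains the fix
    violates: Callable[[str], bool]  # True when the source text breaks the rule


@dataclass
class Organization:
    # Hypothetical stand-in for "all repositories": repo name -> active shields.
    repos: Dict[str, List[Shield]] = field(default_factory=dict)

    def check(self, repo: str, source: str) -> List[str]:
        """Return the message of every shield in `repo` that `source` violates."""
        return [s.message for s in self.repos[repo] if s.violates(source)]

    def deploy(self, shield: Shield) -> None:
        """Install `shield` in every repository, replacing any older version."""
        for shields in self.repos.values():
            shields[:] = [s for s in shields if s.rule_id != shield.rule_id]
            shields.append(shield)


def correction_cycle(org: Organization, shield: Shield, approved: bool) -> bool:
    """Steps 3 and 4: deploy a codified shield everywhere, but only if approved."""
    if not approved:
        return False
    org.deploy(shield)
    return True


# Step 1: a violation that no existing shield caught.
org = Organization(repos={"billing": [], "checkout": []})
merged_line = "total = float(unit_price) * quantity"
slipped_through = org.check("checkout", merged_line) == []

# Step 2: the analysis yields a pattern; this regex is an illustrative guess.
FLOAT_MONEY = re.compile(r"\bfloat\s*\(\s*\w*(price|amount)\w*")

# Step 3: codify the pattern as a shield with an actionable message.
shield = Shield(
    rule_id="no-float-currency",
    message="no-float-currency: use Decimal for money, not float()",
    violates=lambda src: FLOAT_MONEY.search(src) is not None,
)

# Step 4: deploy everywhere; the same line is now flagged in every repository.
correction_cycle(org, shield, approved=True)
blocked_everywhere = all(org.check(repo, merged_line) for repo in org.repos)
```

Replacing by `rule_id` on deploy is one way to make "a new shield, or an existing one updated" a single operation, so an updated rule never runs alongside its older version.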

The Self-Healing Insight

Traditional teams fix bugs without systematically converting them into prevention. A developer catches a floating-point currency bug, fixes it, leaves a comment, maybe adds a test. Six months later, a new developer makes the same mistake — unfamiliar with the history, unseen by any enforcement. The knowledge stays with the individual who encountered it. The team’s aggregate experience doesn’t automatically become the team’s aggregate enforcement. The Correction Cycle changes this. Every violation is:
  1. Fixed (immediate)
  2. Analyzed (understanding the pattern)
  3. Encoded (shield written)
  4. Deployed (enforcement applied everywhere)
The loop’s governance capability strictly increases over time. Each bug makes the environment harder to violate. The organization’s IQ, measured as the fraction of all violations that are caught before PR rather than after, trends toward 1.0 as the shield library grows. We’re not there yet, and we’re probably never fully there, because requirements change and new violation classes emerge. But the direction is correct and the mechanism is real.
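For a metric like this to have 1.0 as its ceiling, it has to be read as a fraction of all violations, not as a before-to-after quotient (which would grow without bound). A minimal sketch of that reading follows. The function name and the counts in the usage line are made up for illustration and are not measurements from this project.

```python
def pre_pr_catch_rate(caught_before_pr: int, caught_after_pr: int) -> float:
    """Fraction of all recorded violations that were caught before PR.

    Returns a value in [0.0, 1.0]; 1.0 means nothing slipped past the shields.
    """
    if caught_before_pr < 0 or caught_after_pr < 0:
        raise ValueError("counts must be non-negative")
    total = caught_before_pr + caught_after_pr
    if total == 0:
        raise ValueError("no violations recorded, rate is undefined")
    return caught_before_pr / total


# Hypothetical counts: 39 violations caught pre-PR, 1 caught after merge.
rate = pre_pr_catch_rate(39, 1)
print(f"{rate:.3f}")
```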

What Governance Looks Like in Practice

Six months into running the full system (Router + inner loop + Global Architect + Correction Cycle), here’s what the daily operation looks like:
Morning task queue: 12–20 tasks, classified by the router. Roughly 70% stay local (single-repo, low-risk). 25% go through the Global Architect pass (cross-repo changes). 5% are flagged as requiring human judgment before proceeding (ambiguous scope, high architectural risk).

During execution: Hook Shields catch ~40 violations per day across all repositories. All are caught pre-write and self-corrected by the agent in under 30 seconds. None reach PR review.

Weekly Correction Cycle runs: typically 1–3 new shield additions per week, driven by verifier catches or human review notes. Each shield addition is deployed in under 5 minutes.

Human review involvement: reading the Sync Plans for cross-repo changes (~15 minutes per day), approving architectural decisions flagged by the router, and reviewing the weekly shield additions for correctness.

The coordination overhead that was threatening to consume the human operating capacity is now under 30 minutes per day. The loop runs at a pace no human team could sustain, with governance quality that improves instead of degrading over time.

What Didn’t Work

A monolithic router. Our first router design tried to classify tasks in a single pass: read the task, output a tier. For ambiguous tasks (the 5% that genuinely require judgment), the router was wrong too often. We added a confidence score: if confidence falls below threshold, the task is flagged for human routing before proceeding.

Correction Cycle without human shield review. Early on, we let the Correction Cycle deploy new shields automatically after analysis. Two of the first five shields had false-positive patterns: they blocked valid code. We added a mandatory human review step before shield deployment. The review takes two minutes and prevents the kind of feedback loop failure where the cure is worse than the disease.

Treating all violations equally. Some violations are Class A (critical, block immediately) and some are Class B (advisory, worth flagging but not blocking). We initially ran everything as Class A. It caused friction on edge cases where the violation class had legitimate exceptions. Tiering the shields resolved this: Class A exits with code 1, Class B exits with 0 but writes to stderr. The agent is informed but not halted.
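The Class A / Class B exit-code convention can be sketched in a few lines of Python. This is an assumption-laden illustration, not the shield code itself: `ShieldClass`, `Finding`, `report`, and the message format are hypothetical names. What it preserves from the description above is the contract: every finding is written to stderr, and only a Class A finding produces exit code 1.

```python
import sys
from enum import Enum
from typing import List, NamedTuple, TextIO


class ShieldClass(Enum):
    A = "A"  # critical: block the write
    B = "B"  # advisory: inform the agent, do not block


class Finding(NamedTuple):
    shield_class: ShieldClass
    rule_id: str
    message: str


def report(findings: List[Finding], err: TextIO = sys.stderr) -> int:
    """Write every finding to `err` and return the exit code for the hook.

    Returns 1 if any finding is Class A, otherwise 0 (including when the
    only findings are Class B advisories).
    """
    for f in findings:
        print(f"[Class {f.shield_class.value}] {f.rule_id}: {f.message}", file=err)
    blocking = any(f.shield_class is ShieldClass.A for f in findings)
    return 1 if blocking else 0


if __name__ == "__main__":
    # Hypothetical advisory finding: the agent sees the warning, the write proceeds.
    advisory = [Finding(ShieldClass.B, "prefer-named-constants", "magic number 86400")]
    sys.exit(report(advisory))
```

Returning the exit code instead of calling `sys.exit` inside `report` keeps the decision testable without terminating the process.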

AI Collaboration in This Episode

The Intelligence Router is itself an agent: a lightweight classifier that reads task specifications and outputs routing decisions. A smaller, faster model works well for this role. The classification task doesn’t require deep reasoning; it requires consistent application of a routing matrix against a structured input.

The irony of the self-healing loop is that agents participate in their own governance improvement. When the Correction Cycle produces a new shield, we describe the failure pattern to Claude and ask it to write the detection logic. It produces more comprehensive coverage than our manual sketches, catching edge cases in the pattern we hadn’t anticipated. The agent writes the enforcement that governs itself. We review what it wrote. This division of labor is stable and efficient: the model is better at exhaustive pattern coverage, humans are better at judging where to draw the boundary.

The Principle Behind the System

We set out to build a development organization where AI handles implementation and humans handle direction. What we built is closer to: AI handles execution, AI handles enforcement, and humans handle judgment at boundary conditions. The boundary conditions are the interesting part. They’re where the system’s rules are ambiguous, where context is genuinely hard to encode mechanically, where a wrong call cascades into something expensive. Those are the moments that require a human who understands the intent behind the rules, not just the rules themselves. Everything else — the execution, the verification, the enforcement, the correction — runs autonomously at a pace and consistency no human team could match. Governance isn’t the bottleneck anymore. It’s the engine.

What’s Next

The loop executes, verifies, enforces, and corrects. It doesn’t get tired, and it doesn’t forget. But it still only does what it’s told. The remaining question — and it’s a harder one than any of the engineering problems we’ve solved — is whether the direction itself can be shaped by what the loop learns. Whether the human-authored strategy at the top can start to feel pressure from the autonomous execution at the bottom. Whether leadership changes when the execution layer stops being a constraint. That’s Episode 10: how the loop changed what it means to lead a development organization.



All content represents personal learning from personal and side projects. Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.