Executive Summary

Autonomous AI agents accelerate software delivery by executing implementation work continuously and without fatigue. This acceleration is a liability when the work being executed references incomplete, ambiguous, or internally contradictory requirements. The speed advantage compounds the directional error: an agent that builds rapidly against a flawed specification produces a larger volume of work that must be discarded or rearchitected. This paper examines a formal requirement lifecycle designed for autonomous development organizations, comprising two components: an eight-section working-backwards template that enforces completeness before review, and a mandatory four-stage review chain that serves as a hard gate before any builder agent touches the work. Analysis of this lifecycle reveals that the Press Release section of the template and the Non-Goals section carry disproportionate preventive value — the former by forcing customer-value framing before technical specification, the latter by eliminating the scope expansion that autonomous agents will otherwise pursue systematically.

Key Findings

  • Requirement debt in autonomous organizations compounds differently than in human-staffed teams. A human engineer encountering a vague requirement typically pauses to seek clarification. An AI agent encountering the same ambiguity resolves it unilaterally and proceeds. The resulting implementation is structurally complete and directionally wrong, and the error surfaces only after substantial work has been committed.
  • The Press Release section is the highest-value component of the working-backwards template. By requiring the author to write a customer-facing announcement as if the feature is already shipped, it surfaces “technically correct but wrong product” failures before any architecture work begins. No other section of the template performs this function.
  • Non-Goals are as structurally necessary as goals for autonomous agent contexts. Human engineers use professional judgment to avoid gold-plating features they were not asked to build. Autonomous agents do not apply this judgment; they implement everything not explicitly excluded. Explicit Non-Goals are the mechanism by which scope boundaries become machine-interpretable.
  • A single-reviewer approval model creates a single point of failure in review coverage. Completeness, feasibility, and safety are distinct analytical concerns. A reviewer capable of assessing all three simultaneously is rare; a process that assigns each concern to a separate reviewer with a distinct mandate is structurally more reliable.
  • The first architectural correction after build begins is a requirement audit signal. A correction that touches architecture — not merely a bug in implementation — indicates that a gap in the requirement survived all review stages. Treating this correction as an isolated fix rather than a systemic signal causes the same gap to surface again in adjacent work.
  • Draft-status hard gates prevent the most expensive class of autonomous development error. Requiring agents to verify requirement status before beginning execution is a near-zero-cost constraint that eliminates the entire category of “agent built against an unapproved specification” failures.

1. The Requirement Debt Problem: Fast Execution Against Incomplete Specifications

1.1 Why Speed Amplifies Directional Error

In a human-staffed engineering organization, a developer encountering an ambiguous requirement has several natural corrective mechanisms: conversation with the product owner, hesitation before writing code, and the tacit knowledge that informs how gaps are filled. None of these mechanisms is a reliable property of autonomous AI agents. An autonomous agent operating against an incomplete requirement applies heuristics to resolve ambiguity and proceeds. The heuristics may be sophisticated, but they are not informed by organizational context, product strategy, or the unstated constraints that the product owner would have communicated in a thirty-second conversation.
The agent builds what the specification implies, not what the product owner intended. When this divergence is discovered, the work product is not a partially correct feature that requires adjustment; it is an architecturally complete implementation of the wrong thing.
The critical distinction from human-staffed teams is not that agents make more errors. It is that agents execute faster and without pause. A human developer who misreads a requirement and builds incorrectly for two hours produces two hours of rework. An agent operating continuously against the same flawed specification for two days produces two days of rework, including integrated tests, documentation, and downstream dependencies that reference the incorrect behavior as a contract. Requirement debt in autonomous organizations does not accumulate gradually; it compounds at the agent’s execution rate.

1.2 The Incompleteness Patterns That Produce Rework

Analysis of autonomous agent rework incidents reveals four recurring incompleteness patterns in requirements that reach builder agents without formal lifecycle gating:
| Incompleteness Pattern | Agent Behavior | Rework Type |
| --- | --- | --- |
| Ambiguous success criteria | Agent defines its own acceptance threshold | Implementation passes agent’s test, fails product owner’s intent |
| Missing Non-Goals | Agent implements all plausible adjacent features | Over-built surface requiring architectural reduction |
| Undocumented dependencies | Agent assumes dependency availability or substitutes | Integration failures discovered at wiring stage |
| No customer-facing articulation | Agent optimizes for technical elegance over user value | Correct implementation of an undesirable feature |
Each pattern produces a distinct class of rework. The missing Non-Goals pattern is particularly expensive in autonomous contexts because agent gold-plating adds surface area that must be removed — which requires the same care and testing as the original implementation.

2. Architecture: Working-Backwards Template Plus Review Chain as a Hard Gate

2.1 Two-Component Design

The requirement lifecycle described in this paper comprises two components that operate in sequence:
Component 1: The Working-Backwards Template. An eight-section document structure that forces completeness before review. The template cannot be submitted for review in a partially complete state; all eight sections are required. The working-backwards approach originated in Amazon’s product development process, where it was designed to prevent technical teams from specifying implementation before customer value had been clearly articulated. The same property that makes it valuable for human product teams makes it valuable for autonomous agent contexts: it forces the right ordering of questions.
Component 2: The Four-Stage Review Chain. A mandatory sequence of review roles (Skeptic, Architect, CISO, and Approval) that must be traversed in order before a requirement transitions to approved status. Each stage has a distinct mandate and can return the requirement to draft status with documented rationale. A later stage may begin its assessment before an earlier stage has recorded its approval, but each stage assesses only its designated concern, and approvals are recorded in chain order.
Together, these components define a requirement state machine with five states:
draft → in_review → approved → implemented → archived
No builder agent may begin execution against a requirement that is not in approved state. This is not a convention or a guideline — it is a hard gate enforced before task execution begins.
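The five states and the build gate can be expressed as a small transition table. The following is a minimal Python sketch under stated assumptions: the names `Status`, `ALLOWED`, `can_transition`, and `may_build` are hypothetical, and the only backward edge modeled is the return from in_review to draft, because that is the only one the text describes.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    IMPLEMENTED = "implemented"
    ARCHIVED = "archived"


# Forward edges follow the chain in the text. The single backward edge
# (in_review -> draft) models a reviewer returning the requirement.
# Any other backward edge would be an assumption beyond the source.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.APPROVED, Status.DRAFT},
    Status.APPROVED: {Status.IMPLEMENTED},
    Status.IMPLEMENTED: {Status.ARCHIVED},
    Status.ARCHIVED: set(),
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if moving from current to target is a permitted edge."""
    return target in ALLOWED[current]


def may_build(status: Status) -> bool:
    """Builder agents may begin work only against approved requirements."""
    return status is Status.APPROVED
```

Keeping the permitted edges in one table means a shortcut such as draft to approved is rejected by construction rather than by convention.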

2.2 Why Template-First, Then Review

The template and the review chain address different failure modes. The template prevents incompleteness from reaching reviewers. The review chain prevents flawed-but-complete requirements from reaching builders. Omitting the template and relying solely on reviewers to identify incompleteness degrades review quality: reviewers spend attention identifying missing sections rather than critically assessing the sections that exist. The template is a precondition for productive review, not a substitute for it.

3. The Eight-Section Template: What Each Section Prevents

3.1 Template Overview

The eight sections of the working-backwards template are not arbitrary organizational conventions. Each section exists because a specific class of requirement failure cannot be identified without it. The table below maps each section to the failure mode it prevents.
| Section | Core Question Forced | Failure Mode Prevented |
| --- | --- | --- |
| Press Release | What customer value does this deliver, stated as if already shipped? | Building technically correct implementations of features customers do not want |
| FAQ | What assumptions are embedded in this requirement? | Assumptions surviving review unexamined and surfacing as bugs |
| User Stories / Use Cases | What observable behavior does this require? | Abstract requirements that cannot be tested or verified |
| Technical Context | What is the architecture, dependency surface, and constraint set? | Integration failures discovered at implementation time |
| Success Metrics | How will DONE be determined? | Implementations that never complete because completion is undefined |
| Non-Goals | What will this explicitly NOT do? | Autonomous agent scope expansion beyond the intended surface |
| Dependencies & Risks | What must exist first, and what could go wrong? | Blocked work and unmitigated risks discovered at execution time |
| Timeline & Milestones | What are the key dates and delivery phases? | Scheduling conflicts and undefined delivery expectations |
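The rule that a template cannot enter review until all eight sections are present lends itself to a mechanical check. The sketch below is one possible form, assuming a requirement is held as a mapping from section name to text; `SECTIONS`, `missing_sections`, and `ready_for_review` are hypothetical names. It checks presence only. The quality of each section remains a matter for the review chain.

```python
SECTIONS = (
    "Press Release",
    "FAQ",
    "User Stories / Use Cases",
    "Technical Context",
    "Success Metrics",
    "Non-Goals",
    "Dependencies & Risks",
    "Timeline & Milestones",
)


def missing_sections(document: dict[str, str]) -> list[str]:
    """Return template sections that are absent or contain only whitespace."""
    return [name for name in SECTIONS if not document.get(name, "").strip()]


def ready_for_review(document: dict[str, str]) -> bool:
    """A document may enter review only when all eight sections have content."""
    return not missing_sections(document)
```

A submission for which `missing_sections` returns a non-empty list would be returned to the author rather than passed to a reviewer.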

3.2 The Press Release Section

The Press Release section requires the author to write a customer-facing announcement as if the feature has already been shipped. This is the most frequently underestimated section and the most valuable. The framing constraint (past tense, customer audience, announcement format) acts as a cognitive forcing function. It is not possible to write a credible press release for a feature whose customer value has not been clearly articulated. Authors who attempt the section and find they cannot write more than two sentences without retreating to technical language have identified a requirement that is not ready for review. The section fails loudly, at the authorship stage, before any reviewer has been engaged.
The failure mode this section prevents, building a technically correct implementation of a feature that does not deliver customer value, is the most expensive class of requirement error because it is invisible to technical reviewers. An Architect reviewing technical context cannot identify this failure. A CISO reviewing security implications cannot identify it. Only the Press Release section, with its forced customer-value framing, creates a point in the process where this class of error must be confronted.

3.3 The Non-Goals Section

The Non-Goals section explicitly enumerates what the requirement will not deliver. In human-staffed teams, experienced engineers apply professional judgment to avoid building features they were not asked to build. This judgment is informal, inconsistent across individuals, and entirely absent in autonomous agents. An autonomous agent operating against a requirement that specifies what to build, but does not specify what not to build, will implement everything it determines to be plausibly related to the stated goal. This is not a reasoning error — it is correct behavior given the information provided. The agent cannot distinguish between “adjacent feature the product owner wants eventually” and “adjacent feature the product owner explicitly does not want in this release.” Non-Goals make that distinction machine-interpretable.
The absence of Non-Goals in a requirement that reaches a builder agent is not a minor completeness gap. It is an open scope boundary. Autonomous agents will fill open scope boundaries with implementations. Those implementations require the same removal effort as any other code: review, testing, and coordination. A requirement without Non-Goals is structurally incomplete for autonomous agent contexts, regardless of how thoroughly the other seven sections have been completed.
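One way to picture what “machine-interpretable” means here is a scope lookup that separates three cases: explicitly wanted, explicitly excluded, and unspecified. The sketch below is illustrative only. It assumes goals and Non-Goals can be reduced to feature labels, which real requirements may not allow, and the names `Scope` and `scope_decision` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    goals: frozenset[str]
    non_goals: frozenset[str]


def scope_decision(scope: Scope, feature: str) -> str:
    """Classify a feature the agent is considering building.

    'excluded'    -> listed in Non-Goals; must not be built
    'build'       -> listed in goals
    'unspecified' -> the open scope boundary the text warns about
    """
    if feature in scope.non_goals:
        return "excluded"
    if feature in scope.goals:
        return "build"
    return "unspecified"
```

Without a Non-Goals list, every adjacent feature falls into the `unspecified` case, which is the gap an agent will otherwise fill with implementation.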

3.4 User Stories and Success Metrics as Verification Anchors

User Stories and Success Metrics serve a function beyond requirement completeness: they define the termination condition for builder agent execution. An agent without a clear termination condition — defined by observable user behavior (User Stories) and measurable outcomes (Success Metrics) — has no reliable signal for when the task is complete. Agents in this state either over-build (continuing to refine and extend because no metric has been satisfied) or under-build (stopping at a technically functional state that does not satisfy the product owner’s unstated expectations). Success Metrics should be specific enough to be evaluated against a running system: not “the feature should be fast” but “the p99 latency for this operation should be below 200ms under the expected load profile.” User Stories should describe observable behavior, not internal mechanism: not “the system processes events asynchronously” but “as a user, I see updated results within five seconds of submitting a request.”
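A metric phrased as “p99 latency below 200ms” can be evaluated directly against measurements from a running system. The sketch below shows one way to do so using the nearest-rank percentile definition. The choice of nearest-rank, the function names, and the default threshold parameter are assumptions for illustration; other percentile definitions give slightly different values on small samples.

```python
import math


def percentile_nearest_rank(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_metric(latencies_ms: list[float], threshold_ms: float = 200.0) -> bool:
    """True when the p99 of the observed latencies is below the threshold."""
    return percentile_nearest_rank(latencies_ms, 99) < threshold_ms
```

A check of this shape gives a builder agent an unambiguous termination signal: the metric either holds for the sampled load or it does not.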

4. The Review Chain: Separating Completeness, Feasibility, and Safety Reviews

4.1 Why Role Separation Matters

The four-stage review chain assigns a distinct mandate to each stage. The separation is not administrative ceremony; it is a structural defense against the single-reviewer failure mode, in which one reviewer is expected to simultaneously assess completeness, technical feasibility, and security implications. No reviewer is equally equipped for all three concerns, and the cognitive load of assessing all three simultaneously degrades performance on each. The review chain resolves this by giving each stage its own mandate and return condition:
| Stage | Mandate | Return Condition |
| --- | --- | --- |
| Skeptic | Challenge completeness, feasibility, and surface-level assumptions | Insufficient evidence that the requirement is ready for technical review |
| Architect | Validate technical context, integration points, and architectural fit | Technical context is incomplete, integration assumptions are incorrect, or architectural approach is incompatible with existing system |
| CISO | Assess security implications, data handling requirements, and compliance constraints | Security risk is unmitigated, data handling is non-compliant, or compliance requirements are unaddressed |
| Approval | Final gate: confirm all prior stages completed and documented | Any prior stage finding is unresolved |
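The chain in the table can be sketched as an ordered loop over stage reviewers that stops at the first rejection. This is a simplified, strictly sequential rendering under stated assumptions: each reviewer is modeled as a callable, and the names `Finding`, `Reviewer`, and `run_review_chain` are hypothetical rather than part of any described system.

```python
from dataclasses import dataclass
from typing import Callable

STAGES = ("Skeptic", "Architect", "CISO", "Approval")


@dataclass
class Finding:
    stage: str
    approved: bool
    rationale: str


# A reviewer receives the requirement document and returns its Finding.
Reviewer = Callable[[dict], Finding]


def run_review_chain(
    requirement: dict, reviewers: dict[str, Reviewer]
) -> tuple[str, list[Finding]]:
    """Run the stages in order; return to draft at the first rejection."""
    findings: list[Finding] = []
    for stage in STAGES:
        finding = reviewers[stage](requirement)
        if not finding.approved and not finding.rationale.strip():
            raise ValueError(f"{stage} rejected without documented rationale")
        findings.append(finding)
        if not finding.approved:
            return "draft", findings
    return "approved", findings
```

Requiring one entry per stage in `reviewers` makes it awkward to skip a stage, and the rationale check reflects the rule that every return to draft is documented.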

4.2 The Skeptic Stage: Preventing Weak Requirements from Entering Technical Review

The Skeptic stage is the first gate and the one most frequently treated as administrative formality. This is a mistake. The Skeptic’s mandate is to challenge the requirement before technical reviewers invest time assessing something that is fundamentally incomplete or implausible. Effective Skeptic review asks:
  • Can the Press Release section be read to a customer without explanation? If it requires context to be understood, the requirement has not been sufficiently distilled.
  • Are the User Stories written in terms of observable user behavior, or do they describe implementation details?
  • Are the Non-Goals specific enough to give a builder agent an unambiguous boundary?
  • Is there at least one measurable Success Metric that could be evaluated against a running system?
A requirement that cannot satisfy these challenges is returned to draft with documented rationale. The Skeptic does not fix the requirement; the Skeptic identifies that it requires additional authorship work.

4.3 The Architect Stage: Validating Technical Context Before Commitment

The Architect stage is the only stage that evaluates the Technical Context section in depth. Its function is to verify that the architectural approach described in the requirement is consistent with the existing system, that dependencies identified are accurate and available, and that integration points are correctly characterized. Architect review failures typically take one of three forms: the requirement assumes a dependency that does not exist or is not available at the expected stage; the integration approach conflicts with existing architectural decisions; or the requirement’s scope, as implied by the Technical Context section, is substantially larger than what the other seven sections suggest. The last form is the most important. A requirement whose Technical Context section reveals a larger implementation scope than the Timeline and Success Metrics sections anticipate is a scheduling failure waiting to materialize. The Architect stage is the correct point to surface this discrepancy, before the requirement is approved and a builder agent begins execution.

4.4 The CISO Stage: Assessing Safety Before Approval

The CISO stage reviews the requirement for security implications, data handling requirements, and compliance constraints. This stage is positioned after Architect review because security analysis is more productive when the architectural approach has been validated. Reviewing the security implications of an approach that the Architect goes on to reject is wasted effort. CISO review covers: data classification for any new data types introduced; access control implications of new surfaces; audit logging requirements; compliance constraints relevant to the feature (regulatory, contractual, or organizational); and residual risk that must be documented and accepted by an accountable owner.
A requirement approved through CISO review carries a documented security risk posture. Builder agents operating against approved requirements can proceed without performing their own security analysis of the approach, because that analysis has already been completed and documented.

5. The Hard Gate: Draft Status Stops Execution Before It Starts

5.1 Implementation of the Draft-Status Check

The hard gate is the mechanism by which the review lifecycle becomes enforceable rather than advisory. Before executing any task that references a requirement, the executing agent performs the following check:
1. Read the requirement status from the requirement record.
2. If status is `draft` or `in_review`:
     Post comment: "REQ-NNN is in [status] status.
     Requirements must complete the lifecycle review chain
     before build execution begins."
     Halt task execution.
     Route to the requirement-lifecycle review process.
3. If status is `approved`:
     Proceed with execution.
This check adds negligible latency to task execution and eliminates the entire category of “agent built against an unapproved specification” failures. The cost of the check is constant and minimal. The cost of its absence is variable and potentially substantial.
An agent that skips the draft-status check because the requirement “looks complete” or “has been discussed in previous context” is not applying good judgment — it is bypassing a governance control. The status check must be implemented as a pre-execution step that runs unconditionally, regardless of the agent’s assessment of the requirement’s apparent quality. The purpose of the hard gate is precisely to prevent agent reasoning from substituting for formal review.
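The check described above might look like the following in Python. This is a sketch, not a prescribed implementation: the names `RequirementNotApproved`, `check_gate`, and `execute_task` are hypothetical, and posting the comment and routing to the review process are left to the caller that catches the exception. The sketch fails closed, treating every status other than `approved` as blocking, which follows the rule that no builder agent may begin against a requirement not in approved state.

```python
class RequirementNotApproved(Exception):
    """Raised when a task references a requirement that is not approved."""


def gate_message(req_id: str, status: str) -> str:
    return (
        f"{req_id} is in {status} status. Requirements must complete the "
        "lifecycle review chain before build execution begins."
    )


def check_gate(req_id: str, status: str) -> None:
    """Pre-execution check; raises unless the status is exactly 'approved'."""
    if status != "approved":
        raise RequirementNotApproved(gate_message(req_id, status))


def execute_task(req_id: str, status: str, task):
    """Run task() only after the gate passes. The check is unconditional:
    there is no parameter that lets the caller skip it."""
    check_gate(req_id, status)
    return task()
```

Placing the check inside `execute_task`, rather than leaving it to each task, keeps the decision out of the agent's own reasoning about whether a requirement “looks complete.”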

5.2 The State Machine as Authoritative Record

The requirement state machine (draft → in_review → approved → implemented → archived) is the authoritative source of truth for execution authorization. All transitions are recorded with timestamp, responsible reviewer, and rationale for the transition. A transition from in_review back to draft carries the reviewer’s documented rationale, creating an auditable record of why the requirement required additional work. This audit trail serves two functions. First, it provides accountability for the review chain: each stage produces a documented decision, not an implicit approval by inaction. Second, it creates a historical record of requirement quality issues that can be analyzed over time to identify systemic weaknesses in the authorship process.
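A transition record with timestamp, reviewer, and rationale can be modeled as an append-only history on the requirement. The sketch below assumes an in-memory record and hypothetical names (`Transition`, `RequirementRecord`); a real system would persist the history and would also validate that the requested edge is permitted, which is omitted here to keep the focus on attribution.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Transition:
    from_status: str
    to_status: str
    reviewer: str
    rationale: str
    timestamp: datetime


@dataclass
class RequirementRecord:
    req_id: str
    status: str = "draft"
    history: list[Transition] = field(default_factory=list)

    def transition(self, to_status: str, reviewer: str, rationale: str) -> Transition:
        """Change status only when the change is attributed and explained."""
        if not reviewer.strip():
            raise ValueError("transition requires an attributed reviewer")
        if not rationale.strip():
            raise ValueError("transition requires a documented rationale")
        entry = Transition(
            self.status, to_status, reviewer, rationale,
            datetime.now(timezone.utc),
        )
        self.history.append(entry)
        self.status = to_status
        return entry
```

Because the status can only change through `transition`, every state the requirement has held is accompanied by who moved it there and why.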

6. The First-Architectural-Correction Rule: Corrections as Requirement Audits

6.1 The Signal That a Reviewed Requirement Has Gaps

Even a well-authored requirement that passes the full four-stage review chain may contain gaps that only become visible during implementation. The First-Architectural-Correction Rule governs how the builder agent responds when such a gap surfaces. The rule distinguishes between two classes of correction:
Implementation bug: A correction that addresses incorrect behavior within the scope of the approved requirement. The fix is local. Execution continues.
Architectural correction: A correction that touches the structural approach, the scope of the implementation, or the integration model. This correction indicates that a gap in the requirement survived all review stages.
When any correction is architectural in nature, the agent must stop, read the full requirement document, identify all similar gaps (not just the one surfaced by the correction), and document the findings before writing more code.
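The two classes of correction and the required response can be written down as explicit configuration instead of being left to situational judgment. The sketch below assumes the agent can label which areas a correction touches; how that labeling is produced is the hard part and is not addressed here. The area labels and the names `CorrectionKind`, `classify`, and `required_steps` are hypothetical.

```python
from enum import Enum


class CorrectionKind(Enum):
    IMPLEMENTATION_BUG = "implementation_bug"
    ARCHITECTURAL = "architectural"


# Areas the rule treats as architectural: structural approach, scope of the
# implementation, and integration model. The string labels are hypothetical.
ARCHITECTURAL_AREAS = frozenset({"structure", "scope", "integration"})

AUDIT_STEPS = (
    "stop writing code",
    "read the full requirement document",
    "identify all similar gaps",
    "document the findings",
)


def classify(touched_areas: set[str]) -> CorrectionKind:
    """A correction is architectural if it touches any architectural area."""
    if ARCHITECTURAL_AREAS & touched_areas:
        return CorrectionKind.ARCHITECTURAL
    return CorrectionKind.IMPLEMENTATION_BUG


def required_steps(touched_areas: set[str]) -> tuple[str, ...]:
    """Return the audit steps for an architectural correction, else none."""
    if classify(touched_areas) is CorrectionKind.ARCHITECTURAL:
        return AUDIT_STEPS
    return ()
```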

6.2 Why Stopping Matters

The instinct to treat an architectural correction as a local fix and continue is understandable: the correction is clear, the fix is obvious, and the path forward is visible. This instinct is incorrect. An architectural gap that survived four-stage review is structural, not incidental. It reflects either an ambiguity in the requirement that reviewers did not surface, or a complexity in the implementation space that the requirement did not anticipate. In either case, the same type of gap is likely to exist in adjacent sections of the requirement. An agent that fixes the visible gap and continues will encounter the next gap after producing more work that may require further rearchitecting.
The First-Architectural-Correction Rule is a requirement audit trigger, not a permanent halt to implementation. Code changes pause while the agent reads the requirement, identifies all similar gaps, and documents them. If additional gaps are found, they are surfaced before more code is written. If no additional gaps are found, the agent has confirmed that the requirement is sufficient for the remaining work and may proceed with higher confidence.

6.3 Feeding Corrections Back to the Lifecycle

Architectural corrections identified during implementation should be documented in the requirement record, not just in the code change. This documentation serves the review chain: future reviewers of similar requirements can identify the pattern that was missed and add it to their review mandate. A review chain that does not incorporate feedback from implementation experience does not improve over time.

7. Three Capture Modes for Different Input Contexts

7.1 Not All Requirements Begin as Structured Documents

The eight-section template assumes that the author begins with a blank document and sufficient context to populate all sections from scratch. This assumption holds for planned features in an organized product backlog. It does not hold for requirements that originate as verbal descriptions, existing specification documents, or user feedback that has been informally discussed. Three capture modes address these contexts:
| Mode | Input Form | Mechanism | Output |
| --- | --- | --- | --- |
| Template mode | New requirement with sufficient context | Author populates all eight sections directly | Completed eight-section document |
| Guided Q&A mode | Informal description or verbal summary | Agent asks structured questions and populates template from answers | Completed eight-section document with gaps marked for human review |
| Import mode | Existing specification document | Agent maps existing document sections to the eight-section structure, identifies unmapped sections | Partially completed document with identified gaps |

7.2 Template Mode

Template mode is the baseline: the author has the context to complete all eight sections and does so. The template structure enforces completeness; the review chain enforces quality. No additional mechanism is needed. The primary failure mode in template mode is authors treating the Press Release and Non-Goals sections as lower-priority than the Technical Context section. Authors with engineering backgrounds frequently front-load technical specification and treat the customer-facing sections as administrative. This inversion defeats the purpose of the working-backwards structure. Template completion must be assessed by the Skeptic with specific attention to the quality of the customer-facing sections, not just their presence.

7.3 Guided Q&A Mode

Guided Q&A mode is appropriate when a requirement has been discussed informally — in a planning meeting, in a chat thread, or in a user interview — but has not been structured. The agent conducts a structured interview, asking questions mapped to the eight sections, and populates the template from the answers. The questions for each section follow the structure of the section itself:
  • For the Press Release: “Describe this feature as if you were announcing it to customers. What would the headline say? What customer problem does it solve?”
  • For Non-Goals: “What is explicitly out of scope for this requirement? What adjacent features are you deliberately not building?”
  • For Success Metrics: “How will you know this feature is complete? What does success look like as a measurable outcome?”
Guided Q&A mode produces a template with gaps explicitly marked. The author reviews the populated template, fills any gaps, and submits for the review chain. The gaps marked by the agent during Q&A are noted in the review record, giving the Skeptic specific sections to examine first.

7.4 Import Mode

Import mode addresses the common case in which a specification document already exists — a product brief, a design document, a functional specification — and the requirement lifecycle is being adopted after the document has been authored. In import mode, the agent maps the content of the existing document to the eight sections of the working-backwards template. Sections of the existing document that have clear template analogues are mapped directly. Sections of the template that have no analogue in the existing document are marked as gaps requiring authorship.
Import mode frequently reveals that existing specification documents are strong on Technical Context and weak on Press Release, Non-Goals, and Success Metrics. This is a consistent pattern, not an exception. Technical authors write technical specifications. The import process surfaces the customer-value and scope-boundary gaps that technical specifications routinely omit.
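The mapping step in import mode can be illustrated with a simple heading lookup. This is a deliberately naive sketch: it assumes the existing document is already split into headed sections, and the alias table `HEADING_ALIASES` is entirely hypothetical. An agent performing a real import would judge content, not just heading text.

```python
TEMPLATE_SECTIONS = (
    "Press Release", "FAQ", "User Stories / Use Cases", "Technical Context",
    "Success Metrics", "Non-Goals", "Dependencies & Risks",
    "Timeline & Milestones",
)

# Hypothetical heading aliases, for illustration only.
HEADING_ALIASES = {
    "architecture": "Technical Context",
    "design": "Technical Context",
    "use cases": "User Stories / Use Cases",
    "out of scope": "Non-Goals",
    "risks": "Dependencies & Risks",
    "schedule": "Timeline & Milestones",
}


def import_spec(existing: dict[str, str]) -> tuple[dict[str, str], list[str], list[str]]:
    """Map an existing spec onto the template.

    Returns (mapped, gaps, unmapped): template sections that received
    content, template sections with no source, and source headings that
    matched no template section.
    """
    mapped: dict[str, str] = {}
    unmapped: list[str] = []
    for heading, body in existing.items():
        target = HEADING_ALIASES.get(heading.strip().lower())
        if target is None:
            unmapped.append(heading)
        elif target in mapped:
            mapped[target] += "\n\n" + body  # merge multiple sources
        else:
            mapped[target] = body
    gaps = [s for s in TEMPLATE_SECTIONS if s not in mapped]
    return mapped, gaps, unmapped
```

For a typical technical specification, the `gaps` list would be expected to contain Press Release, Non-Goals, and Success Metrics, matching the pattern described above.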

8. Implementation Constraints

8.1 Review Chain Latency

The four-stage review chain introduces latency between requirement authorship and build execution. For organizations accustomed to informal requirement approval processes, this latency may initially appear as a bottleneck. It is not a bottleneck; it is a deliberate constraint whose cost is lower than the cost of its absence. The latency is also partially controllable. The review chain does not require each stage to complete before the next stage begins its own assessment — it requires that each stage’s approval is recorded before the requirement transitions to approved. Reviewer availability, not the process structure, is the primary determinant of review chain duration. Organizations should resist the temptation to compress the review chain by combining stages or designating a single reviewer as both Skeptic and Architect. The value of the chain derives from its separation of concerns. Collapsing stages recreates the single-reviewer failure mode.

8.2 Template Adoption Resistance

Working-backwards templates are more demanding than informal requirement formats. Authors accustomed to writing one-paragraph feature descriptions will find the eight-section structure burdensome initially. This resistance is most acute for the Press Release section, which requires a mode of thinking — customer-value framing, announcement language, past-tense articulation of shipped benefit — that is unfamiliar to most engineers. The appropriate response to this resistance is not to make the template easier. The difficulty of the Press Release section is diagnostic: if an author cannot write a credible customer-facing announcement, the feature has not been sufficiently thought through. The friction is the signal.

8.3 State Machine Integrity

The requirement state machine is only as reliable as the processes that enforce transitions. A draft requirement that is marked approved without traversing the review chain provides no governance value — it provides false assurance, which is actively harmful. State transitions must be recorded with the identity of the reviewer who authorized the transition and the timestamp of the transition. Any system that permits transition without reviewer attribution is not a state machine — it is a label attached to a document.

9. Recommendations

  1. Adopt the working-backwards template as the mandatory format for all new requirements before review begins. Do not accept requirements for review until all eight sections have been completed. Incomplete submissions are returned to the author, not reviewed with gaps noted by the reviewer. The template’s purpose is to move completeness responsibility to the author, not to the review chain.
  2. Implement the draft-status hard gate as a pre-execution check in every builder agent. The check must run unconditionally before any task that references a requirement is executed. Do not allow agent assessment of apparent requirement quality to substitute for the status check. Verify the check is in place before deploying any new builder agent configuration.
  3. Treat the Press Release and Non-Goals sections as primary review targets. When training reviewers or evaluating the effectiveness of the Skeptic stage, direct specific attention to these two sections. They represent the highest-value and most frequently underwritten parts of the template.
  4. Enforce strict stage separation in the review chain. Do not combine the Skeptic, Architect, and CISO stages, and do not permit one reviewer to hold multiple stage responsibilities. The value of the review chain is proportional to the independence of each stage’s mandate.
  5. Record all review chain decisions with attributed rationale. A transition from in_review back to draft that does not carry documented rationale provides no learning value and no accountability. Require reviewers to produce a written finding for every rejection, however brief.
  6. Activate the First-Architectural-Correction Rule by default in all builder agents. When a correction touches architecture rather than implementation detail, require the agent to read the full requirement, identify similar gaps, and document findings before continuing. Make this behavior explicit in agent configuration, not a judgment call the agent applies situationally.
  7. Select the appropriate capture mode based on the origin of the requirement. For planned backlog items, use template mode. For informally described requirements, use guided Q&A mode. For organizations migrating existing specification documents into the lifecycle, use import mode. Do not require authors to use template mode for requirements that already exist in a different format — convert first, then review.
  8. Analyze requirement rejection patterns at regular intervals. The rationale records from rejected review stages contain diagnostic information about systemic weaknesses in the authorship process. Review these records every ten requirements to identify whether specific sections are consistently failing review, and adjust either the authorship guidance or the review criteria accordingly.
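The periodic analysis in the final recommendation can be as simple as counting which template sections are cited in rejections across a batch of requirements. The sketch below assumes each rejection record carries `req_id`, `stage`, and `section` fields, which is an assumption about how rationale is recorded, and that the caller has already selected the batch to examine. The function names and the default threshold are hypothetical.

```python
from collections import Counter


def section_rejection_counts(rejections: list[dict]) -> Counter:
    """Count how often each template section is cited in a rejection."""
    return Counter(record["section"] for record in rejections)


def recurring_sections(rejections: list[dict], min_count: int = 3) -> list[str]:
    """Sections rejected at least min_count times, most frequent first."""
    counts = section_rejection_counts(rejections)
    return [section for section, n in counts.most_common() if n >= min_count]
```

A section that appears in `recurring_sections` batch after batch points to a weakness in authorship guidance or review criteria, not to a one-off authoring mistake.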

Forward-Looking Statement

The requirement lifecycle described in this paper addresses the problem as it exists today: autonomous agents that begin execution rapidly against specifications that have not been formally reviewed. As autonomous development organizations mature, the requirement lifecycle will need to evolve in two directions. First, the review chain will need to incorporate feedback loops that allow reviewer criteria to update based on implementation outcomes — the current model captures rationale but does not systematically apply it to future reviews. Second, as agent capabilities expand to include multi-requirement planning, the state machine will need to model requirement dependencies: a requirement that cannot be approved because a prerequisite requirement has not yet been implemented. Neither evolution is feasible without the foundational lifecycle described here. The working-backwards template and the four-stage review chain are not the final architecture for autonomous requirement management — they are the minimum viable governance structure that makes future evolution possible.
All content represents personal learning from personal projects. Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.