Executive Summary
The Plan → Implement → Verify pattern is a structured three-phase AI-assisted development methodology that enforces cognitive separation between architectural planning, code generation, and quality assurance. Applied to the construction of a complete CRM domain layer—comprising seven domain models, a custom DynamoDB derive macro, a full event-sourcing integration, and a 35-route API layer—the pattern intercepted 18 defects before production deployment, including five that would have caused critical production failures. Fresh Verifier sessions operating without knowledge of Builder implementation decisions consistently identified defect classes that Builder-authored tests cannot detect: requirement gaps, cross-entity consistency violations, and architectural flaws such as non-atomic event and state writes. At an AI cost of $12.68, the pattern delivered an estimated 4.4x velocity improvement against manual development benchmarks.

Key Findings
- Requirement-level gaps are a distinct defect class from code-level bugs. Builder sessions optimize for passing tests; Verifier sessions operating from a fresh context optimize for requirement adherence. These are structurally different evaluations requiring separate execution contexts.
- AI models hallucinate features, not only code. Builder sessions will infer and implement functionality that seems logically consistent with specified requirements but is not actually required. Independent verification is the primary mechanism for detecting and preventing scope creep of this kind.
- Fresh Verifier sessions detect defects that reused sessions miss entirely. In a controlled comparison, reusing a Builder session for subsequent verification produced zero defect detections on a codebase that contained four confirmed defects.
- Cross-entity consistency verification requires explicit prompting. Individual entity verification does not automatically detect foreign key type mismatches, naming convention inconsistencies, or protocol divergence across related entities. A dedicated cross-entity verification pass is required.
- Atomic event-and-state write patterns must be enforced architecturally, not assumed. Builder sessions without explicit guidance on event-sourcing atomicity will produce non-atomic implementations that appear correct under test but create state divergence under partial failure conditions.
1. A Seven-Model CRM Domain Layer Provided Sufficient Complexity to Surface Structural Limitations of Single-Session AI Development
The scope of the implementation under analysis was the complete initialization of a CRM module for a multi-tenant platform:

- 7 core domain models: Account, Contact, Lead, Opportunity, Activity, Product, Address
- Event-sourcing integration via the `EventStore` trait
- DynamoDB single-table design with a custom derive macro
- Financial configuration system with tenant-scoped settings
- ISO reference data (ISO 3166 countries, ISO 4217 currencies)
- Full API layer with 35 routes and permission-based access control
- Integration tests against a local DynamoDB emulator
2. Cognitive Separation Across Three Distinct Sessions — Evaluator, Builder, and Verifier — Is the Structural Mechanism That Makes Independent Review Possible
2.1 The Evaluator Session Produces a Human-Approved Architecture Document That Governs All Subsequent Implementation Decisions
The planning phase is conducted in an Evaluator session using a reasoning-optimized model. The Evaluator explores the existing codebase to understand established patterns, then produces a written plan document that is approved by a human engineer before any implementation begins. The approved plan specified:

- One table for all CRM entities
- Partition key pattern: `TENANT#{tenant_id}#ACCOUNT#{account_id}`
- GSI patterns for cross-entity queries
- Each domain model emits events via the `EventStore` trait
- Event naming convention: `{Entity}{Action}` (e.g., `AccountCreated`)
- Repository pattern wraps both state storage and event publishing
- Custom `#[derive(DynamoDbEntity)]` macro
  - Auto-generates partition key, sort key, and GSI attributes
  - Eliminates boilerplate across all seven domain models
- Four test levels:
  - L1: Domain model validation (unit tests)
  - L2: Repository CRUD operations (integration tests against local emulator)
  - L3: Event publishing flow (EventBridge to SQS verification)
  - L4: End-to-end CRM workflows
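The key and event conventions above can be sketched as plain functions. In the real implementation the key layout is generated by the `#[derive(DynamoDbEntity)]` macro, which is not reproduced in this document, so the function names here are illustrative only:

```rust
// Illustrative sketch of the conventions the plan specifies; in practice
// these strings are produced by macro-generated code, not hand-written helpers.

// Partition key pattern from the plan: TENANT#{tenant_id}#ACCOUNT#{account_id}
fn account_partition_key(tenant_id: &str, account_id: &str) -> String {
    format!("TENANT#{tenant_id}#ACCOUNT#{account_id}")
}

// Event naming convention from the plan: {Entity}{Action}, e.g. AccountCreated.
fn event_name(entity: &str, action: &str) -> String {
    format!("{entity}{action}")
}

fn main() {
    assert_eq!(account_partition_key("t-42", "a-7"), "TENANT#t-42#ACCOUNT#a-7");
    assert_eq!(event_name("Account", "Created"), "AccountCreated");
}
```

Centralizing these formats in one place (whether a macro or a helper) is what later makes cross-entity consistency checkable at all.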
2.2 The Builder Session Executes Against the Approved Plan in Isolation, With No Knowledge of Prior or Subsequent Verification Activity
The Builder session uses a throughput-optimized model to implement against the approved plan. Each of the seven sub-tasks was issued as a separate Builder session with a specific plan reference. For the DynamoDB macro sub-task, the Builder reported:

- Created the `dynamodb-derive` crate
- Implemented the proc macro using `syn` and `quote`
- Generated 15 unit tests covering attribute pattern combinations
- All tests passing on local execution
2.3 The Verifier Session, Initialized Without Builder Context, Evaluates Implementation Against Requirements Rather Than Against the Builder’s Own Tests
The Verifier session is initialized without any prior context from the Builder. This session independently reads the original requirements and the plan, then assesses the implementation against both.

3. The Most Consequential Defect Class — Event-Sourcing Atomicity Violation — Is Undetectable by Builder-Authored Tests and Requires Independent Architectural Review
The most consequential defect detected during this engagement occurred during the event-sourcing integration sub-task and illustrates the value of independent verification most clearly.

3.1 A State-First Write Order Creates Irrecoverable Divergence Between the Database and the Event Log Under Partial Failure
The Builder produced a state-first write pattern for account creation: persist the account to DynamoDB first, then append the event to the EventStore.

3.2 The Verifier, Reading Requirements Independently, Identified the Partial Failure Scenario That Builder Tests Cannot Reach
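The Builder's original snippet is not preserved in this text; a minimal sketch of the state-first anti-pattern it describes, with hypothetical in-memory stores standing in for DynamoDB and the EventStore, looks like this:

```rust
use std::collections::HashMap;

// In-memory stand-ins for DynamoDB and the EventStore (hypothetical names).
struct Db { accounts: HashMap<String, String> }
struct EventLog { events: Vec<String>, fail_next_append: bool }

impl EventLog {
    fn append(&mut self, event: String) -> Result<(), String> {
        if self.fail_next_append {
            return Err("event store unavailable".into());
        }
        self.events.push(event);
        Ok(())
    }
}

// ANTI-PATTERN: state is persisted first, the event appended second.
// If the append fails, the account exists with no audit trail.
fn create_account(db: &mut Db, log: &mut EventLog, id: &str, name: &str) -> Result<(), String> {
    db.accounts.insert(id.to_string(), name.to_string()); // state write succeeds
    log.append(format!("AccountCreated:{id}"))            // event write may fail
}

fn main() {
    let mut db = Db { accounts: HashMap::new() };
    let mut log = EventLog { events: Vec::new(), fail_next_append: true };
    let result = create_account(&mut db, &mut log, "a-1", "Acme");
    // Partial failure: the operation errored, yet state diverged from the log.
    assert!(result.is_err());
    assert!(db.accounts.contains_key("a-1")); // state exists...
    assert!(log.events.is_empty());           // ...but no event was recorded
}
```

The happy path of this code passes every test the Builder wrote; the divergence only appears when the second write fails.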
The Verifier, reading requirements independently, raised the following finding:

> "What happens if DynamoDB save succeeds but EventStore append fails? You now have state divergence between the database and the event log."

This is a correct architectural observation. DynamoDB and the EventStore are separate systems. A failure between the two write operations leaves the account in the database without an audit trail in the event log, violating the event-sourcing guarantee that the event log is the source of truth.
3.3 The Two-Phase Commit Pattern — Event Append First, State Persist Second With Rollback — Became the Standard Across All Seven Domain Models
A planning session with the Evaluator produced the two-phase commit pattern (append the event first, persist state second, roll the event back if persistence fails) that became the standard across all seven domain models.

Builder-authored tests cannot detect this class of defect. Unit and integration tests exercise the happy path; they do not test the behavioral contract between two separate storage systems under partial failure. Independent architectural review through a fresh Verifier session is the only mechanism that consistently identifies this pattern.
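The codified pattern itself is not reproduced above; a minimal sketch of the event-first order with rollback, using the same hypothetical in-memory stand-ins for DynamoDB and the EventStore, might look like this:

```rust
use std::collections::HashMap;

// In-memory stand-ins for DynamoDB and the EventStore (hypothetical names).
struct Db { accounts: HashMap<String, String>, fail_next_save: bool }
struct EventLog { events: Vec<String> }

impl Db {
    fn save(&mut self, id: &str, name: &str) -> Result<(), String> {
        if self.fail_next_save {
            return Err("dynamodb unavailable".into());
        }
        self.accounts.insert(id.to_string(), name.to_string());
        Ok(())
    }
}

// Two-phase order: append the event first, persist state second,
// and roll the event back if state persistence fails.
fn create_account(db: &mut Db, log: &mut EventLog, id: &str, name: &str) -> Result<(), String> {
    log.events.push(format!("AccountCreated:{id}")); // phase 1: event append
    if let Err(e) = db.save(id, name) {              // phase 2: state persist
        log.events.pop();                            // rollback: remove the event
        return Err(e);
    }
    Ok(())
}

fn main() {
    let mut db = Db { accounts: HashMap::new(), fail_next_save: true };
    let mut log = EventLog { events: Vec::new() };
    assert!(create_account(&mut db, &mut log, "a-1", "Acme").is_err());
    // After the failed write, log and state agree: neither records the account.
    assert!(log.events.is_empty());
    assert!(db.accounts.is_empty());
}
```

Under this ordering, a failure in either phase leaves the event log and the database consistent with each other, which is the property the Verifier's finding demanded.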
4. Builder Tests and Verifier Findings Address Non-Overlapping Defect Classes — They Are Structurally Different Evaluations, Not Redundant Ones
A consistent pattern emerged across all seven sub-tasks in this implementation.

| Defect Class | Detected By Builder Tests | Detected By Fresh Verifier |
|---|---|---|
| Compilation errors and type mismatches | Yes | Not applicable |
| Happy-path functional correctness | Yes | Redundant |
| Missing negative test cases | No | Yes — 8 instances |
| Requirement gaps (plan said X, code did Y) | No | Yes — 6 instances |
| Hallucinated features not in requirements | No | Yes — noted below |
| Cross-entity consistency violations | No | Yes — 4 instances |
| Architectural protocol violations | No | Yes — 2 instances (including atomicity) |
5. Three Defect Case Studies Illustrate the Requirement-Gap, Hallucination, and Consistency-Violation Classes That Fresh Verifier Sessions Consistently Detect
5.1 AI Requirement Hallucination: The `preferred_name` Field
During Contact domain model implementation, the Builder added a `preferred_name: Option<String>` field not present in the requirements. The Builder’s commit message cited a specific section of the product requirements document; that section contained no such reference. The Builder inferred that a system with `first_name` and `last_name` fields would benefit from a preferred name. The inference is reasonable; the implementation is out of scope.
The Verifier identified this as a hallucinated feature and it was removed. A GitHub issue was created to evaluate it formally in a future iteration.
Implication: AI models hallucinate features, not only code. Independent verification controls scope creep, not only technical defects.
5.2 Cross-Entity Consistency: Foreign Key Type Mismatch
Following individual verification of each entity, integration testing revealed a failure in the relationship between Opportunity and Account: a foreign key type mismatch that individual entity verification had not surfaced.

5.3 Permission Naming Inconsistency Across API Routes
During API integration, the Builder applied inconsistent permission naming patterns across entity routes:

- Account routes: `"crm.account.read"` (period delimiter)
- Contact routes: `"crm:contact:read"` (colon delimiter, different from Account)
- Lead routes: `"crm-lead-read"` (hyphen delimiter, an entirely different pattern)
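One compile-time remedy, later codified as Principle 3 in Section 6, is to derive every permission string from a single typed constructor so the delimiter and ordering cannot diverge between routes. A minimal sketch with hypothetical enum and function names:

```rust
// Hypothetical typed permission scheme: entity and action are enums, and the
// string form is produced in exactly one place, so the delimiter ("." here)
// cannot vary between routes the way the string literals above did.
#[derive(Clone, Copy)]
enum Entity { Account, Contact, Lead }

#[derive(Clone, Copy)]
enum Action { Read, Write }

fn permission(entity: Entity, action: Action) -> String {
    let e = match entity {
        Entity::Account => "account",
        Entity::Contact => "contact",
        Entity::Lead => "lead",
    };
    let a = match action {
        Action::Read => "read",
        Action::Write => "write",
    };
    format!("crm.{e}.{a}")
}

fn main() {
    // All three entities now share one naming pattern.
    assert_eq!(permission(Entity::Account, Action::Read), "crm.account.read");
    assert_eq!(permission(Entity::Contact, Action::Read), "crm.contact.read");
    assert_eq!(permission(Entity::Lead, Action::Read), "crm.lead.read");
}
```

A misspelled entity or a stray delimiter is now a compile error rather than a runtime authorization failure.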
6. Four Principles Codified From This Engagement Govern Event Ordering, Macro Review, Type Safety, and Requirement Traceability in All Subsequent Implementations
The following principles were codified from the defects identified and resolved during this implementation:

Principle 1: Events Are the Source of Truth
When integrating event sourcing with a persistent store, the event must be appended before state is updated. Event append failure causes the operation to fail. State persistence failure causes event rollback. The inverse order—state first, event second—creates irrecoverable divergence under partial failure.

Principle 2: Macro-Generated Code Requires Independent Review
Code generated by derive macros compiles and executes but may violate conventions not encoded in the macro itself, such as DynamoDB projection type requirements. All macro-generated code must be inspected via macro expansion tooling as part of the verification process.

Principle 3: Type Safety Supersedes Runtime Validation for Cross-Cutting Concerns
String-based permissions, event names, and table names are defect vectors. Compile-time enforcement through typed constants eliminates the class of runtime errors caused by typos and naming inconsistencies.

Principle 4: Requirement Traceability Must Be Encoded in Tests
Tests that cannot be traced to a specific requirement provide coverage without verification. Test naming conventions should make explicit which requirement each test verifies.

7. Single-Session Approaches Produce No Planning Artifacts, No Cross-Entity Verification, and No Architectural Flaw Detection — Deficiencies Absent From the Three-Phase Pattern
| Dimension | Single-Session Approach | Plan → Implement → Verify |
|---|---|---|
| Defects reaching production | Present | Zero in observed deployment |
| Requirement coverage | Variable | Systematically verified |
| Feature scope control | None — AI may add unrequested features | Controlled — Verifier flags hallucinated requirements |
| Cross-entity consistency | Not verified | Explicit verification pass |
| Architectural flaw detection | Absent | Present via independent review |
| Planning artifacts | None | Plan document serves as governance record |
| Verification cost | Negligible | 180k tokens per engagement |
8. Observed Metrics: 18 Defects Intercepted, Zero Reaching Production, 92 Percent Test Coverage, and 4.4x Velocity Improvement at $12.68 AI Cost
| Metric | Value |
|---|---|
| Sub-tasks completed | 7 |
| Defects identified in verification | 18 (5 critical, 8 moderate, 5 minor) |
| Defects reaching production | 0 |
| Test coverage | 92% — 85 unit, 42 integration, 12 event flow, 3 E2E |
| Production code | 6,800 lines |
| Test code | 2,400 lines |
| Velocity vs. manual estimate | 4.4x |
| Total AI cost | ~$12.68 (145k Opus + 700k Sonnet tokens) |
| Average rework cycles per sub-task | 2.3 |
9. The Complete CRM Domain Layer — 6,800 Lines of Production Code, 142 Tests Across Four Levels — Was Delivered Through This Methodology With Zero Production Defects
The complete CRM domain layer delivered through this methodology comprised seven domain models (Account, Contact, Lead, Opportunity, Activity, Product, Address), a custom `#[derive(DynamoDbEntity)]` macro with PK/SK/GSI auto-generation and PII field encryption, trait-based repositories with in-memory and DynamoDB implementations, 21 domain events integrated with DynamoDB Streams and EventBridge, a 35-route API layer with typed permission constants, tenant-scoped financial configuration, and ISO 3166/4217 reference data. Total scope: 6,800 lines of production code, 2,400 lines of tests, 142 tests across four levels.
10. Recommendations
- Treat Verifier session isolation as a non-negotiable engineering standard. The evidence is unambiguous: reused sessions detect zero defects where fresh sessions detect multiple. This is not a cost optimization opportunity—it is the mechanism that makes independent verification meaningful.
- Require cross-entity verification as a mandatory step in multi-model domain implementations. Individual entity verification is necessary but insufficient. The prompt template in Section 5.2 provides a repeatable starting point.
- Establish typed constants for all cross-cutting concerns before beginning API implementation. Permission names, event type identifiers, and table names must be compile-time constants before any routes are written. Retrofitting from string-based patterns is expensive.
- Encode event-sourcing atomicity requirements explicitly in Builder prompts. Do not assume Builder sessions will infer the correct write ordering. The two-phase commit pattern in Section 3.3 should be provided as a reference.
- Require requirement traceability in test naming conventions. Tests not traceable to a requirement provide coverage without verification. Enforce this through code review policy.
- Document AI cost and return on investment per major feature. Measurable ROI data provides the organizational evidence base for workflow adoption and continuous refinement.
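The traceability recommendation above can be sketched as a test-naming convention. The requirement ID `REQ-CRM-012` and the validation rule below are hypothetical, chosen only to illustrate the mapping from test name to requirement:

```rust
// Hypothetical validation rule tied to a hypothetical requirement ID.
fn validate_account_name(name: &str) -> Result<(), String> {
    if name.trim().is_empty() {
        Err("account name must not be empty".into())
    } else {
        Ok(())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // The test name encodes the requirement it verifies (req_crm_012),
    // so a reviewer can map every test back to the requirements document
    // and flag any test with no requirement behind it.
    #[test]
    fn req_crm_012__account_name_must_not_be_empty() {
        assert!(validate_account_name("").is_err());
        assert!(validate_account_name("Acme Corp").is_ok());
    }
}

fn main() {
    // Direct check mirroring the named test above.
    assert!(validate_account_name("").is_err());
    assert!(validate_account_name("Acme Corp").is_ok());
}
```

A code review policy can then reject any test whose name carries no requirement ID, which operationalizes the "coverage without verification" check.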
11. Conclusion and Forward Outlook
The Plan → Implement → Verify pattern is a production-validated methodology for AI-assisted software development that addresses structural limitations of single-session approaches. Its core insight—that independent Verifier sessions detect defect classes that Builder sessions cannot by design—is not a limitation of current AI capabilities but a property of any cognitive system that evaluates its own output.

As AI model capabilities advance, the return on structured workflows will increase. More capable models will produce higher-quality implementations and more thorough verifications, but they will not eliminate the value of cognitive separation between planning, implementation, and review. Organizations that establish workflow discipline now will be positioned to systematically capture the value of future model improvements rather than experiencing them as unpredictable quality variations.

The five critical defects prevented by this methodology in a single engagement—including an event-sourcing atomicity violation that would have produced irrecoverable state divergence in production—represent the category of failure that structured AI-assisted development is specifically designed to address.

All content represents personal learning from personal projects. Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.