Executive Summary

Prompt engineering for production software development remains predominantly an ad hoc practice, with most published guidance confined to generic principles rather than domain-specific templates validated against real outcomes. This reference compiles prompt structures derived from seven weeks of AI-assisted production infrastructure development, covering infrastructure design, debugging, code review, and refactoring workflows. Each template is accompanied by observed success rates, failure mode analysis, and anti-patterns drawn from approximately 200 prompt executions. The aggregate success rate across all categories is 87%, with the highest rates in documentation (100%) and API contract design (95%), and the lowest in performance debugging (71%) and breaking change migration (67%). The primary finding is that prompt effectiveness is determined less by linguistic style than by the structural completeness of four components: current state context, measurable target outcome, explicit solution space constraints, and specified output format.

Key Findings

  • Prompt structure predicts outcome more reliably than prompt length or specificity of language. Templates that include all four structural components (context, task, constraints, output format) consistently outperform unconstrained equivalents across all task categories.
  • Explicit constraints on what the AI should not do reduce hallucination rates more effectively than positive instructions alone. Anti-pattern constraints (e.g., “Do NOT recommend approaches requiring new infrastructure”) prevent the AI from optimizing for idealized solutions that are inapplicable to the actual system.
  • Session isolation between implementation and verification is critical for defect detection. Verification sessions that inherit Builder context miss defects that fresh sessions catch. The token cost of session isolation is negligible relative to the cost of production defects.
  • Performance optimization prompts require instrumentation data as input. Prompts issued without profiling results produce optimization suggestions based on heuristic assumptions, which are incorrect approximately 50% of the time.
  • Breaking change migrations fail at higher rates when executed reactively rather than planned proactively. The 67% success rate for breaking change migration reflects the one case where changes were executed without a migration plan; all planned migrations succeeded.
  • Documentation prompts achieve 100% success when the output format enforces machine-parseable structure. ADRs written with imperative language, code examples, and no narrative prose are reliably extracted and enforced by downstream AI agents without disambiguation.

1. Prompt Structure Fundamentals

Each template in this library conforms to a four-component structure. Deviating from this structure is the primary source of prompt failure across all categories.
| Component | Purpose | Failure Mode When Absent |
|---|---|---|
| Context | Provides current system state | AI generates idealized solutions inapplicable to actual architecture |
| Task | Specifies measurable target outcome | AI produces output that addresses a different problem |
| Constraints | Defines solution space boundaries | AI hallucinates infrastructure, invents requirements, over-engineers |
| Output | Specifies required format and deliverables | AI produces narrative where structured data was needed, or vice versa |
Reference Template:

```
Context:
[Current state of the system]

Task:
[Specific action with measurable outcome]

Constraints:
[What to avoid, limits, boundaries]

Output:
[Expected format and deliverables]
```
The Constraints component is the most frequently omitted and the most consequential. Without explicit constraints, AI systems optimize for correctness in the abstract rather than applicability to the specific system. The result is technically valid but operationally inapplicable output.

2. Infrastructure Design Templates

2.1 ADR-Driven Architectural Design

Use case: Architectural decisions affecting multiple components, with a requirement to compare approaches and document trade-offs.

Observed success rate: 92% (23 of 25 ADRs led to successful implementations).

Why this structure works: Requiring the AI to enumerate consequences before proposing solutions prevents the common failure mode of receiving a recommendation without understanding its systemic implications.

```
Plan an architectural approach for [FEATURE].

Context:
- Current architecture: [DESCRIBE EXISTING SYSTEM]
- Problem: [SPECIFIC ISSUE YOU'RE SOLVING]
- Constraints: [TECHNICAL LIMITATIONS]
- Related decisions: [EXISTING ADRs IF ANY]

Task:
Design 2-3 approaches with trade-offs. For each approach:
1. How it works (technical approach)
2. What changes (affected components)
3. Pros/cons (specific to our system)
4. Migration effort (hours estimate)

Constraints:
- Do NOT recommend approaches requiring new infrastructure
- Do NOT suggest "best practices" without context
- Do NOT hallucinate features we don't have

Output:
Structured comparison table + recommendation with justification.
```

Representative result: Used to design multi-tenant capsule isolation (ADR-0010). The AI compared three approaches: table-per-capsule, partition key isolation, and separate databases. The partition key approach was selected and implemented across 48 files with zero cross-environment data leaks.

Primary failure modes: Vague constraint specification results in the AI recommending ideal-state solutions that require infrastructure not present in the system. Vague problem statements result in the AI solving a related but distinct problem.

2.2 DynamoDB Single-Table Schema Design

Use case: Designing DynamoDB single-table schemas with explicit access pattern requirements.

Observed success rate: 88% (7 of 8 schemas required no major refactoring post-deployment).

Why this structure works: Forcing enumeration of access patterns upfront prevents the common failure mode of designing for generic CRUD operations rather than actual query requirements.

```
Design a DynamoDB single-table schema for [DOMAIN].

Context:
- Entities: [LIST ENTITIES WITH KEY ATTRIBUTES]
- Relationships: [HOW ENTITIES RELATE]
- Multi-tenancy: Tenant + Capsule isolation required
- Existing table: [TABLE NAME AND CURRENT SCHEMA IF ANY]

Access patterns (priority order):
1. [SPECIFIC QUERY WITH EXPECTED VOLUME]
2. [SPECIFIC QUERY WITH EXPECTED VOLUME]
3. [etc.]

Task:
Design partition key, sort key, and GSI patterns that support all access patterns.
Include:
- PK/SK patterns for each entity
- GSI definitions with projection types
- Example items for each entity
- Migration strategy if modifying existing table

Constraints:
- Maximum 5 GSIs (cost control)
- All queries must be efficient (no scans)
- Maintain tenant+capsule isolation in all keys

Output:
Table structure + example items + query patterns + cost estimate.
```

Representative result: CRM schema supporting seven entities (Account, Contact, Lead, and others) with 12 access patterns. The schema handled all patterns with three GSIs, achieved 92% test coverage, and required no refactoring after deployment.

Primary failure modes: Omitting access patterns causes the AI to design for generic CRUD operations that do not match real query requirements. Including "support future queries" as a requirement causes over-engineering with 10+ GSIs.
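
As a concrete illustration of the isolation constraint, here is a minimal Rust sketch of tenant+capsule key construction. The TenantId/CapsuleId newtypes and the TENANT#/CAPSULE# key format are hypothetical stand-ins, not the production schema; the sketch assumes the `uuid` crate.

```rust
use uuid::Uuid; // assumes the `uuid` crate

// Hypothetical newtypes; the real system's ID types may differ.
struct TenantId(Uuid);
struct CapsuleId(Uuid);

/// Every item's partition key embeds tenant + capsule, so no query can
/// cross an isolation boundary (constraint: isolation in all keys).
fn partition_key(tenant: &TenantId, capsule: &CapsuleId) -> String {
    format!("TENANT#{}#CAPSULE#{}", tenant.0, capsule.0)
}

/// Sort key encodes entity type + id, enabling begins_with Query
/// access patterns instead of Scans (constraint: no scans).
fn sort_key(entity_type: &str, entity_id: &Uuid) -> String {
    format!("{entity_type}#{entity_id}")
}

fn main() {
    let pk = partition_key(&TenantId(Uuid::new_v4()), &CapsuleId(Uuid::new_v4()));
    let sk = sort_key("CONTACT", &Uuid::new_v4());
    println!("PK = {pk}");
    println!("SK = {sk}");
}
```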

2.3 API Contract Design

Use case: Defining REST API routes with permission requirements and validation rules.

Observed success rate: 95% (38 of 40 API designs implemented without breaking changes).

Why this structure works: Specifying the authorization model and validation rules upfront prevents the most expensive form of rework: discovering permission inconsistencies or validation gaps after implementation.

```
Design API contract for [FEATURE].

Context:
- Domain model: [DESCRIBE ENTITIES]
- Existing routes: [RELATED API ENDPOINTS]
- Permission model: [PERMISSION NAMING PATTERN]
- Validation rules: [CONSTRAINTS FROM DOMAIN]

Task:
Define REST API endpoints with:
1. Route paths and HTTP methods
2. Request/response schemas (JSON)
3. Permission requirements per endpoint
4. Validation rules (request validation)
5. Error responses (what can fail and why)

Constraints:
- Follow RESTful conventions (no RPC-style endpoints)
- Use typed permission constants (no string literals)
- All IDs are UUIDs (no sequential integers)
- Pagination for list endpoints (max 100 items)

Output:
OpenAPI 3.0 spec + permission constants + validation schemas.
```

Representative result: Design of 35 CRM API routes. The AI identified permission naming inconsistencies across entities (mixed delimiter conventions: dots versus colons). Resolving them via typed constants produced zero permission defects in production.

Primary failure modes: Omitting the permission pattern specification causes the AI to invent inconsistent naming conventions. Omitting error response design causes the AI to generate generic 500 responses for all failure conditions.
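
To make the "no string literals" constraint concrete, here is a minimal Rust sketch of typed permission constants. The `Permission` type and the `crm:entity:action` names are hypothetical illustrations, not the project's actual constants.

```rust
/// Each permission is a value of a dedicated type, so the delimiter
/// convention is fixed in exactly one place instead of scattered
/// across string literals.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Permission(&'static str);

impl Permission {
    pub const CONTACT_READ: Permission = Permission("crm:contact:read");
    pub const CONTACT_WRITE: Permission = Permission("crm:contact:write");
    pub const ACCOUNT_DELETE: Permission = Permission("crm:account:delete");

    pub fn as_str(&self) -> &'static str {
        self.0
    }
}

/// Route handlers take a `Permission`, not a `&str`, so a typo like
/// "crm.contact.read" (dots instead of colons) cannot reach production.
fn require(permission: Permission) {
    println!("checking {}", permission.as_str());
}

fn main() {
    require(Permission::CONTACT_READ);
}
```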

3. Debugging Templates

3.1 Root Cause Analysis

Use case: Systematic investigation of defects to identify the root cause rather than symptoms.

Observed success rate: 78% (14 of 18 bugs traced to the actual root cause, not a symptom).

Why this structure works: Explicitly prohibiting fix suggestions before root cause identification prevents the most common AI debugging failure: proposing a symptom-addressing fix that leaves the underlying cause unresolved.

```
Debug [BUG DESCRIPTION].

Context:
- Observed behavior: [WHAT'S ACTUALLY HAPPENING]
- Expected behavior: [WHAT SHOULD HAPPEN]
- Environment: [PRODUCTION/STAGING/LOCAL]
- Recent changes: [COMMITS IN LAST 24H]

Investigation steps:
1. Read error logs: [LOG FILE PATHS]
2. Check related code: [FILES LIKELY INVOLVED]
3. Review recent changes: git diff [COMMIT RANGE]
4. Identify divergence point: where expected != observed

Task:
Find the root cause by:
1. Reproducing the issue (minimal reproduction case)
2. Tracing execution flow (where does it break?)
3. Identifying the change that introduced it (git bisect if needed)
4. Explaining WHY it fails (not just WHERE)

Constraints:
- Do NOT suggest fixes yet (root cause first)
- Do NOT assume infrastructure issues without evidence
- Do NOT blame "race conditions" without proof

Output:
Root cause analysis with:
- Exact line of code causing the issue
- Why it fails (logic error, wrong assumption, etc.)
- When it was introduced (commit hash)
- Suggested fix approach (not implementation yet)
```

Representative result: Authentication failure investigation. The initial AI analysis attributed the failure to the OAuth library. The root-cause constraint forced re-examination, revealing that middleware was evaluating permissions before JWT validation. Reordering the middleware resolved the defect.

Primary failure modes: Permitting fix suggestions before root cause identification results in symptom-addressing fixes. Cryptic error messages cause the AI to attribute cause based on pattern-matched "similar issues" rather than evidence.
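
The middleware defect generalizes beyond any one framework. Below is a framework-agnostic Rust sketch of the corrected ordering; `validate_jwt` and `check_permission` are hypothetical stand-ins for the real middleware stack, which the source does not show.

```rust
struct Request {
    jwt: Option<String>,
    required_permission: &'static str,
}

/// Authentication: establishes WHO is calling. Must run first.
fn validate_jwt(req: &Request) -> Result<String, &'static str> {
    // A real implementation verifies signature and expiry.
    req.jwt.clone().ok_or("401: missing or invalid token")
}

/// Authorization: decides WHAT the subject may do. Needs a subject.
fn check_permission(subject: &str, req: &Request) -> Result<(), &'static str> {
    // Hypothetical lookup against the subject's granted permissions.
    if subject == "admin" || req.required_permission == "public" {
        Ok(())
    } else {
        Err("403: permission denied")
    }
}

fn handle(req: &Request) -> Result<(), &'static str> {
    // Correct order. The defective stack evaluated permissions before the
    // JWT had established a subject, producing misleading failures that
    // looked like an OAuth library bug.
    let subject = validate_jwt(req)?;
    check_permission(&subject, req)
}

fn main() {
    let req = Request { jwt: None, required_permission: "crm:contact:read" };
    println!("{:?}", handle(&req)); // Err("401: ..."), not a spurious 403
}
```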

3.2 Performance Bottleneck Identification

Use case: Identifying performance bottlenecks through measurement rather than assumption.

Observed success rate: 71% (5 of 7 performance issues correctly identified).

Why this structure works: The measurement-before-optimization constraint prevents the most expensive performance debugging failure: optimizing a component that is not the actual bottleneck.

```
Investigate performance issue: [DESCRIPTION].

Context:
- Slow operation: [WHAT'S SLOW]
- Current performance: [METRICS - LATENCY, THROUGHPUT]
- Expected performance: [TARGET METRICS]
- Environment: [PRODUCTION/STAGING]

Measurement strategy:
1. Identify measurement points (where to add instrumentation)
2. Collect baseline metrics (before optimization)
3. Profile execution (CPU, memory, I/O, network)
4. Identify bottleneck (slowest component)

Task:
Find the bottleneck by:
1. Adding instrumentation code (tracing, metrics)
2. Running profiler on representative workload
3. Analyzing results to find hot paths
4. Quantifying impact (% of total time per component)

Constraints:
- Measure first, optimize later (no premature optimization)
- Use profiler data, not guesses
- Focus on p95 latency, not averages
- Ignore micro-optimizations (< 5% impact)

Output:
Performance profile showing:
- Time breakdown by component (%)
- Bottleneck identification (specific function/query)
- Optimization opportunity ranking
- Expected improvement from fixing top bottleneck
```

Representative result: API responses averaging 800ms. The AI suggested caching as the optimization. The profiling constraint forced measurement first, revealing 750ms attributable to a DynamoDB query missing a GSI. Adding the GSI reduced latency to 45ms. Caching would have masked the root cause while providing a fraction of the available improvement.

Performance optimization prompts issued without profiling data as input generate heuristic recommendations with approximately 50% accuracy. The investment in measurement infrastructure required to use this template correctly is non-negotiable.
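
For the instrumentation step, here is a minimal Rust sketch using only the standard library. The pipeline stages (`auth`, `dynamodb_query`, `serialize`) and their timings are hypothetical; a real system would use a profiler or tracing library rather than hand-rolled timers, but the shape of the required output (% of total time per component) is the same.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Time one component of a request and record its label + duration,
/// per step 4 of the template ("quantify % of total time per component").
fn timed<T>(label: &str, spans: &mut Vec<(String, u128)>, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let result = f();
    spans.push((label.to_string(), start.elapsed().as_micros()));
    result
}

fn main() {
    let mut spans = Vec::new();

    // Hypothetical request pipeline; sleeps stand in for real work.
    timed("auth", &mut spans, || sleep(Duration::from_millis(5)));
    timed("dynamodb_query", &mut spans, || sleep(Duration::from_millis(40)));
    timed("serialize", &mut spans, || sleep(Duration::from_millis(2)));

    // Report each component's share of total latency.
    let total: u128 = spans.iter().map(|(_, us)| us).sum();
    for (label, us) in &spans {
        let pct = *us as f64 / total as f64 * 100.0;
        println!("{label:<16} {us:>8} us  ({pct:.1}% of total)");
    }
}
```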

4. Code Review Templates

4.1 Verification Checklist

Use case: Systematic review of AI-generated implementations before merge.

Observed success rate: 94% (18 of 19 critical defects caught before production).

Why this structure works: A structured checklist prevents the "looks correct" assessment that misses systematic issues. The checklist enforces review of categories that ad hoc inspection reliably omits.

```
Verify implementation of [FEATURE].

Context:
- Requirements: [LINK TO PLAN/PRD]
- Implementation: git diff main...[BRANCH]
- Tests: [TEST FILE PATHS]

Verification checklist:

1. Requirement Coverage
   - Does code implement ALL requirements from plan?
   - Are there hallucinated features (not in requirements)?
   - Are edge cases from requirements tested?

2. Test Quality
   - Do tests map to specific requirements?
   - Are negative tests included (what should fail)?
   - Is test coverage >= 85%?
   - Do tests run in CI?

3. Cross-Cutting Concerns
   - Authorization: Are permission checks present?
   - Multi-tenancy: Are tenant boundaries enforced?
   - Error handling: Are errors logged with context?
   - Observability: Are metrics/traces added?

4. Integration Assumptions
   - Are external service calls mocked in tests?
   - Are database transactions atomic?
   - Are event sourcing patterns followed?

5. Code Quality
   - Are naming conventions consistent?
   - Is the code readable (no clever tricks)?
   - Are magic numbers extracted to constants?
   - Is documentation present where needed?

Task:
Review code against checklist. For each item:
- PASS: Requirement met
- CONDITIONAL: Met with minor issues (list them)
- FAIL: Not met (blocking issue)

Constraints:
- Do NOT pass if ANY cross-cutting concern is untested
- Do NOT approve hallucinated features
- Do NOT accept < 85% test coverage without justification

Output:
Verification report with:
- Overall decision: PASS / CONDITIONAL / FAIL
- Issues found (category, severity, location)
- Required fixes for CONDITIONAL/FAIL
- Suggestions for improvement (optional)
```

Representative result: CRM implementation verification identified five critical defects the Builder had not flagged: an event/database atomicity violation (HIGH), missing PII field encryption (HIGH), a permission naming inconsistency (MEDIUM), a hallucinated preferred_name field (MEDIUM), and a foreign key type mismatch (HIGH). All were caught before production deployment.

Primary failure modes: Verification sessions that reuse the Builder session context miss defects because the session inherits the Builder's assumptions. Fresh sessions are required for effective verification.
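
The atomicity violation in that list is worth making concrete. Below is a sketch with a hypothetical `TransactionalStore` trait (not the project's real repository API) showing the difference between the passing and failing shapes of an entity-plus-event write.

```rust
/// Hypothetical store abstraction: all writes in one commit succeed
/// or fail together.
trait TransactionalStore {
    type Error;
    fn commit(&self, writes: Vec<Write>) -> Result<(), Self::Error>;
}

#[allow(dead_code)]
enum Write {
    PutEntity { key: String, body: String },
    AppendEvent { stream: String, event: String },
}

/// PASS shape: the entity write and its domain event commit atomically.
/// The defect found in review was the FAIL shape: two separate commits,
/// where a crash between them orphans the entity from its event.
fn create_contact(store: &impl TransactionalStore<Error = String>) -> Result<(), String> {
    store.commit(vec![
        Write::PutEntity { key: "CONTACT#123".into(), body: "{ /* fields */ }".into() },
        Write::AppendEvent { stream: "crm-events".into(), event: "ContactCreated".into() },
    ])
}

struct InMemory;

impl TransactionalStore for InMemory {
    type Error = String;
    fn commit(&self, writes: Vec<Write>) -> Result<(), String> {
        println!("committed {} writes atomically", writes.len());
        Ok(())
    }
}

fn main() {
    create_contact(&InMemory).unwrap();
}
```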

4.2 Cross-Entity Consistency Review

Use case: Reviewing changes that affect multiple related entities for consistency violations.

Observed success rate: 100% (6 of 6 consistency violations detected).

Why this structure works: Individual entity review does not surface cross-entity inconsistencies. This template explicitly directs attention to the relationship layer where integration defects originate.

```
Verify cross-entity consistency for [DOMAIN].

Context:
- Entities involved: [LIST ENTITIES]
- Relationships: [HOW THEY RELATE]
- Recent changes: [WHAT WAS MODIFIED]

Consistency checks:

1. Foreign Keys
   - Do foreign key types match primary key types?
   - Are ID formats consistent (all UUID vs mixed)?
   - Are cascade rules defined (what happens on delete)?

2. Event Patterns
   - Do all {Entity}Created events have same structure?
   - Are event names consistent ({Entity}{Action})?
   - Is event versioning applied uniformly?

3. Repository Patterns
   - Do all repositories implement same trait?
   - Are CRUD method signatures consistent?
   - Are error types uniform across repositories?

4. API Patterns
   - Are route paths consistent (/entities/:id pattern)?
   - Are HTTP methods consistent across entities?
   - Are permission names following same pattern?

5. Validation Rules
   - Are similar fields validated the same way?
   - Are error messages consistent in format?

Task:
Check each entity against the patterns from other entities.
Flag any inconsistencies as BLOCKING.

Output:
Consistency report with:
- Inconsistencies found (entity, pattern, deviation)
- Impact assessment (breaking change?)
- Remediation steps
```

Representative result: Detected that the Opportunity entity defined account_id: String while the Account entity used id: AccountId(Uuid). Integration tests would have failed at runtime. The defect was corrected before merge.
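
The typed-ID pattern that turns this class of mismatch into a compile error can be sketched in a few lines of Rust. The struct layouts are illustrative and assume the `uuid` crate; only the AccountId(Uuid) newtype comes from the source.

```rust
use uuid::Uuid; // assumes the `uuid` crate

// Newtype ID: once Account exposes `AccountId` and Opportunity stores the
// same type, an `account_id: String` field is a compile error rather than
// a runtime integration failure.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct AccountId(Uuid);

struct Account {
    id: AccountId,
}

struct Opportunity {
    id: Uuid,
    account_id: AccountId, // was `String` in the defective version
}

fn link(opportunity: &mut Opportunity, account: &Account) {
    opportunity.account_id = account.id; // types must match exactly
}

fn main() {
    let account = Account { id: AccountId(Uuid::new_v4()) };
    let mut opp = Opportunity { id: Uuid::new_v4(), account_id: account.id };
    link(&mut opp, &account);
    println!("opportunity {} -> account {:?}", opp.id, opp.account_id);
}
```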

5. Refactoring Templates

5.1 Pattern Extraction

Use case: Extracting a reusable abstraction from repeated code patterns.

Observed success rate: 83% (10 of 12 refactorings improved maintainability metrics).

Why this structure works: Requiring definition of the abstraction before extraction prevents the failure mode of extracting a pattern that does not accommodate all usage sites.

```
Extract reusable pattern from [CODE LOCATION].

Context:
- Repeated code: [DESCRIBE DUPLICATION]
- Locations: [FILE PATHS WHERE PATTERN APPEARS]
- Current pain: [WHY IT'S PROBLEMATIC]

Pattern analysis:
1. Identify common structure (what's the same?)
2. Identify variation points (what differs?)
3. Define abstraction (trait, macro, function?)
4. Estimate usage sites (how many places use this?)

Task:
Extract pattern by:
1. Defining the abstraction (trait signature, macro syntax)
2. Implementing the abstraction (generic logic)
3. Migrating 1-2 usage sites (prove it works)
4. Creating migration plan for remaining sites

Constraints:
- Do NOT extract if < 3 usage sites (not worth it)
- Do NOT make abstraction more complex than original code
- Do NOT break existing tests during migration
- Migrate incrementally (not all at once)

Output:
- Abstraction implementation
- Migration guide
- Before/after comparison (lines of code saved)
- Risk assessment (what could break?)
```

Representative result: DynamoDB entity conversion code was duplicated across seven entities (600 lines total). Extracting a #[derive(DynamoDbEntity)] macro reduced it to 80 lines with zero defects introduced.

Primary failure modes: Extracting from two usage sites produces an abstraction that does not accommodate the third site. Abstractions more complex than the original duplication decrease rather than improve maintainability.
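
The derive macro itself is a proc-macro crate and too long to reproduce here, but step 1 of the template (define the abstraction before extracting) can be sketched as the trait such a derive would implement. The signatures and the `Attr` stand-in type are hypothetical, not the project's actual trait.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the AWS SDK's AttributeValue type.
#[derive(Debug, Clone)]
enum Attr {
    S(String),
}

/// The abstraction to define before extraction: the trait a
/// `#[derive(DynamoDbEntity)]` macro would implement per entity.
trait DynamoDbEntity: Sized {
    fn to_item(&self) -> HashMap<String, Attr>;
    fn from_item(item: &HashMap<String, Attr>) -> Option<Self>;
}

fn get_s(item: &HashMap<String, Attr>, key: &str) -> Option<String> {
    match item.get(key)? {
        Attr::S(s) => Some(s.clone()),
    }
}

struct Contact {
    id: String,
    email: String,
}

// What the derive macro would generate for each entity, replacing the
// hand-written conversion code that was duplicated seven times.
impl DynamoDbEntity for Contact {
    fn to_item(&self) -> HashMap<String, Attr> {
        HashMap::from([
            ("id".to_string(), Attr::S(self.id.clone())),
            ("email".to_string(), Attr::S(self.email.clone())),
        ])
    }
    fn from_item(item: &HashMap<String, Attr>) -> Option<Self> {
        Some(Contact {
            id: get_s(item, "id")?,
            email: get_s(item, "email")?,
        })
    }
}

fn main() {
    let c = Contact { id: "123".into(), email: "a@b.example".into() };
    let round_trip = Contact::from_item(&c.to_item()).unwrap();
    println!("{} {}", round_trip.id, round_trip.email);
}
```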

5.2 Breaking Change Migration Planning

Use case: Planning and executing API or schema changes that break existing callers.

Observed success rate: 67% (2 of 3 migrations completed without incident; the one failure is attributable to reactive rather than planned execution).

Why this structure works: Requiring a migration plan before execution prevents the reactive error-fixing cascade that characterizes unplanned breaking change execution.

```
Plan migration for breaking change: [CHANGE DESCRIPTION].

Context:
- Current implementation: [WHAT EXISTS NOW]
- Desired implementation: [WHAT YOU WANT]
- Reason for change: [WHY IT'S NECESSARY]
- Affected components: [WHAT USES CURRENT IMPLEMENTATION]

Impact analysis:
1. Find all usage sites (grep, LSP references)
2. Categorize by impact (breaking vs compatible)
3. Estimate migration effort per site
4. Identify blockers (can't migrate until X is done)

Migration plan:
1. Preparation (new implementation alongside old)
2. Migration sequence (order matters - dependencies first)
3. Cutover strategy (atomic vs gradual)
4. Rollback plan (if migration fails)

Task:
Create migration plan with:
- Pre-change checklist (what to prepare)
- Migration steps (ordered, specific)
- Validation steps (how to verify each step)
- Rollback procedure (if things go wrong)

Constraints:
- Do NOT break main branch during migration
- Do NOT migrate more than 10 files per commit
- Do NOT skip validation between steps
- Maximum 5 commits for entire migration

Output:
Migration plan + estimated time + risk rating.
```

Representative result (failure case): A CRUD method addition to a macro (a breaking change) was executed without a migration plan, producing 30 incremental error-fixing commits over an extended period. Subsequent human intervention batched the fixes into three commits, completing the migration in 90 minutes. The operational lesson: breaking changes must be planned before execution, not fixed reactively.

6. Documentation Templates

6.1 Architecture Decision Record

Use case: Documenting architectural decisions in a format that AI agents can parse and enforce.

Observed success rate: 100% (all ADRs produced with this template were successfully parsed by downstream AI agents).

Why this structure works: ADRs written in narrative prose are not reliably parsed by AI agents. Machine-parseable structure (imperative language, code examples, no ambiguity) enables AI agents to extract and enforce constraints without human disambiguation.

```
Write ADR for [DECISION].

Context:
- Problem: [WHAT WE'RE SOLVING]
- Current state: [HOW IT WORKS NOW]
- Pain points: [WHY CURRENT STATE IS BAD]

Decision process:
1. Options considered (2-3 approaches)
2. Trade-offs per option (specific pros/cons)
3. Decision rationale (why we chose option X)

ADR structure:

## Context
[Problem statement with metrics if available]

## Decision

### 1. [Constraint Name]
[Exact pattern with code examples]

### 2. [Constraint Name]
[Exact pattern with code examples]

## Consequences

### Positive
- [Specific benefit 1]
- [Specific benefit 2]

### Negative
- [Specific drawback 1]
- [Migration effort estimate]

### Migration Strategy
1. [Concrete step 1]
2. [Concrete step 2]

Task:
Write ADR following this structure.
Use code blocks, tables, bullet lists (not prose paragraphs).
Include examples for each constraint.

Constraints:
- Do NOT write philosophical discussions
- Do NOT skip code examples
- Do NOT use vague language ("should", "might", "consider")
- Use imperative language ("must", "required", "pattern is")

Output:
ADR document in Markdown, ready for AI to parse and enforce.
```

Representative result: ADR-0010 (Capsule Isolation). Defined table naming conventions, partition key patterns, and required fields. AI agents reading the ADR generated compliant code across 48 files with zero constraint violations.

Primary failure modes: Narrative prose ADRs are interpreted inconsistently by AI agents. Omitting code examples causes AI agents to misinterpret patterns. Vague language ("should", "might") is treated as optional by AI agents.

7. Success Rate Summary

The following table consolidates observed success rates across seven weeks of production use, covering approximately 200 prompt executions.
| Category | Template | Success Rate | Primary Failure Mode |
|---|---|---|---|
| Infrastructure Design (90%) | ADR-Driven Design | 92% | Vague constraint specification |
|  | Database Schema Design | 88% | Missing access pattern enumeration |
|  | API Contract Design | 95% | Missing permission model specification |
| Debugging (75%) | Root Cause Analysis | 78% | Premature fix suggestion |
|  | Performance Debugging | 71% | Missing profiling data |
| Code Review (94%) | Verification Checklist | 94% | Builder session reuse for verification |
|  | Cross-Entity Consistency | 100% | N/A (no failures observed) |
| Refactoring (83%) | Pattern Extraction | 83% | Insufficient usage site coverage |
|  | Breaking Change Migration | 67% | Reactive execution without plan |
| Documentation (100%) | ADR Writing | 100% | N/A (structured format always parseable) |
| Overall |  | 87% |  |
“Success” is defined as task completion without major rework or human intervention beyond review and approval.

8. Anti-Patterns That Consistently Degrade Performance

8.1 The Vague Requirement

Non-performant:

```
Make the API better.
```

Performant:

```
Add rate limiting to API routes.

Current: No rate limiting, users can make unlimited requests.
Target: 100 requests/minute per tenant.
Implementation: Use AWS API Gateway throttling.
```
Vague outcome specifications cause AI systems to optimize for generic “best practices” that may be inapplicable to the specific system.

8.2 The Unconstrained Error Fix

Non-performant:

```
Fix all compilation errors.
```

Performant:

```
Fix compilation errors in [CRATE].

Constraints:
- Maximum 5 commits total
- Group related fixes in single commit
- Do NOT add new methods/types
- Only update call sites to match new signatures
- If stuck after 3 commits, report and stop
```
Unconstrained error fixing produces sequential single-error resolution that generates new errors at each step. The documented failure case produced 30 commits over 24 hours before human intervention.

8.3 The Missing Context

Non-performant:

```
Implement feature X.
```

Performant:

```
Implement feature X.

Context:
- Architecture: Multi-tenant SaaS with capsule isolation
- Conventions: Follow ADR-0010 for table naming
- Constraints: All queries must include tenant+capsule in partition key
- Related code: See [FILE] for similar pattern

Follow existing patterns, do not invent new approaches.
```
Without system context, AI generates technically correct code that does not conform to existing architectural patterns, requiring rework to integrate.

8.4 Session Reuse for Verification

Non-performant: Using the Builder's session for verification to reduce token consumption.

Performant: Always initialize a fresh session for verification.

The Builder's session contains assumptions, intermediate reasoning, and context that bias defect detection. Fresh verification sessions identify defects that Builder sessions suppress. The token cost differential is negligible relative to the cost of production defects.

8.5 Intuition-Based Performance Optimization

Non-performant:

```
Optimize this function.
```

Performant:

```
Profile [OPERATION] to find bottleneck.

Use profiler to measure time breakdown.
Only optimize if component is >10% of total time.
Report findings before optimizing.
```
Performance intuition is incorrect approximately 50% of the time in practice. Measurement-before-optimization is not optional.

9. Copy-Paste Reference Templates

9.1 Planning Template

```
Plan [FEATURE].

Context:
- Current system: [DESCRIBE]
- Problem: [WHAT'S BROKEN/MISSING]
- Constraints: [TECHNICAL LIMITS]
- Related work: [ADRs, EXISTING FEATURES]

Design requirements:
1. [REQUIREMENT 1]
2. [REQUIREMENT 2]
3. [REQUIREMENT 3]

Task:
Design approach with:
- Architecture (how components interact)
- Data model (schema, entities)
- API contract (if applicable)
- Testing strategy (4-level pyramid)
- Migration plan (if changing existing system)

Constraints:
- Follow ADR-[XXX] for [PATTERN]
- No new infrastructure dependencies
- Must maintain backward compatibility

Output:
Design document with diagrams, examples, and migration steps.
```

9.2 Implementation Template

```
Implement [FEATURE] per plan: [PLAN FILE PATH]

Context:
- Plan section: [SPECIFIC SECTION]
- Affected files: [FILES TO MODIFY]
- Testing requirements: [FROM PLAN]

Task:
Implement feature with:
1. Core logic (domain layer)
2. Integration code (repository/API layer)
3. Tests (4 levels: unit, integration, E2E, contract)
4. Documentation (inline comments where needed)

Constraints:
- Follow plan exactly (no creative deviations)
- All tests must pass before requesting review
- Test coverage >= 85%
- No compiler warnings

Output:
Implementation + tests + verification readiness checklist.
```

9.3 Verification Template

```
Verify [FEATURE] implementation.

Context:
- Requirements: [PLAN/PRD PATH]
- Implementation: git diff main...[BRANCH]
- Tests: [TEST FILE PATHS]

Checklist:
1. Requirement coverage (all requirements implemented?)
2. Test quality (requirements mapped to tests?)
3. Cross-cutting concerns (auth, multi-tenancy, errors, observability)
4. Integration assumptions (are they tested?)
5. Code quality (naming, readability, documentation)

Task:
Review against checklist.
For each item: PASS / CONDITIONAL / FAIL
List specific issues with file paths and line numbers.

Constraints:
- FAIL if any cross-cutting concern is untested
- FAIL if hallucinated features exist
- CONDITIONAL if minor issues can be fixed quickly

Output:
Verification report with overall decision and required fixes.
```

9.4 Debugging Template

```
Debug [ISSUE].

Context:
- Observed: [WHAT'S HAPPENING]
- Expected: [WHAT SHOULD HAPPEN]
- Environment: [PROD/STAGING/LOCAL]
- Recent changes: [COMMITS IN LAST 24H]

Investigation:
1. Reproduce issue (minimal reproduction)
2. Check logs: [LOG PATHS]
3. Review code: [LIKELY FILES]
4. Trace execution: where does it break?
5. Identify root cause: WHY does it fail?

Constraints:
- Find root cause before suggesting fixes
- Use evidence (logs, traces), not assumptions
- Do NOT blame infrastructure without proof

Output:
Root cause analysis with exact failure point and suggested fix approach.
```

10. Template Selection Guide

```
What are you doing?
├─ Designing something new?
│  └─ Use Planning Template + ADR-Driven Design
├─ Implementing a design?
│  └─ Use Implementation Template
├─ Reviewing code?
│  └─ Use Verification Template
├─ Fixing a bug?
│  └─ Use Debugging Template
├─ Making a breaking change?
│  └─ Use Breaking Change Migration
├─ Optimizing performance?
│  └─ Use Performance Debugging
└─ Extracting common patterns?
   └─ Use Pattern Extraction
```

11. Quantitative Outcomes from Seven Weeks of Production Use

| Metric | Value |
|---|---|
| Total prompts executed | ~200 |
| Overall success rate | 87% |
| Planning token consumption | ~150k tokens/week |
| Implementation token consumption | ~500k tokens/week |
| Verification token consumption | ~180k tokens/week |
| Debugging token consumption | ~100k tokens/week (variable) |
| Planning time reduction | 75% (2–4 hours to 30 minutes) |
| Implementation time reduction | 80% (3–5 days to 1 day) |
| Verification time reduction | 75% (2 hours to 30 minutes) |
| Critical defects caught in verification | 18 |
| Defects reaching production | 0 |
| ADR compliance rate | 100% |
| Average cost per feature | $12–15 |
All token and cost figures reflect specific model versions (Claude Opus 4 for evaluation, Claude Sonnet 3.5 for building and verification) over a specific time period. Model pricing changes frequently. Use these figures for relative comparison purposes, not absolute cost projection.

12. Recommendations

Recommendation 1: Adopt the four-component structure as an organizational standard for all AI task prompts. Context, task, constraints, and output format should be required components of any prompt issued to AI development systems. This structure reduces hallucination rates and improves output applicability more reliably than any other single intervention.

Recommendation 2: Establish session isolation as a policy requirement for verification workflows. Verification sessions must be initialized fresh, independent of Builder sessions. Encode this requirement in workflow documentation and tooling configurations.

Recommendation 3: Require profiling data as a prerequisite for performance optimization prompts. Performance optimization prompts without measurement data should be rejected at the workflow level. The measurement investment is justified by the cost of optimizing non-bottleneck components.

Recommendation 4: Treat breaking change execution without a migration plan as a policy violation. The failure case documented at 67% success rate is entirely attributable to reactive execution. Planned migrations succeeded at 100%. Migration planning is the intervention, not post-hoc error correction.

Recommendation 5: Write ADRs in machine-parseable format from the outset. ADRs written in narrative prose cannot be reliably enforced by downstream AI agents. Imperative language, code examples, and structured sections should be organizational standards for ADR authorship.

Recommendation 6: Treat this library as a baseline, not a ceiling. The templates presented here reflect specific technology choices (Rust, DynamoDB, AWS Lambda) and a specific system architecture. Adaptation to different technology stacks and organizational contexts is expected. The structural patterns (four-component prompts, constraint specification, session isolation) are transferable. The specific content is a starting point.
When adapting templates from this library, begin by modifying only the Constraints component for your specific system context. Constraints are the highest-leverage adaptation point: the same structural prompt with system-specific constraints outperforms a generalized prompt adapted to your domain through the Task or Context components alone.

License: Use freely. No attribution required. If it helps, great. If not, ignore it.