
Executive Summary

This paper presents quantitative findings from seven weeks of AI-assisted production SaaS platform development, analyzing 8,000 or more lines of production code across 524 commits. The central finding challenges the dominant “10x productivity” narrative: AI does not reduce total engineering time by 90 percent. It redistributes engineering effort from implementation toward design, context preparation, and output verification, while making previously uneconomical work — comprehensive documentation, exhaustive test coverage, complete architectural decision records — financially viable. Aggregate time savings across all task types measured at approximately 28 percent, not 90 percent. The genuine productivity gains were concentrated in systematic, well-specified implementation tasks (8–10x speedup) and entirely offset in large-scale refactoring tasks involving breaking changes (16x slowdown). The practical implication: organizations optimizing for lines of code per day will not achieve the promised returns; organizations optimizing for architectural clarity and design quality will.

Key Findings

  • AI-assisted development produces an aggregate 28 percent time reduction, not the 90 percent implied by “10x productivity” marketing claims, measured across all task types over seven weeks.
  • The speedup is highly non-uniform: systematic implementation of well-defined patterns yields 8–10x acceleration; large breaking changes yield a 16x slowdown relative to manual remediation.
  • The primary bottleneck shifts from implementation to design clarity: with AI, time allocation moves from 80 percent implementation / 20 percent design to 30 percent implementation / 70 percent design plus verification.
  • AI makes previously uneconomical work viable: comprehensive E2E test suites, complete architectural decision records, and detailed documentation all became economically feasible only under AI-assisted development economics.
  • AI accelerates pattern matching, not judgment: engineers who relied on implementation complexity to conceal weak architectural thinking will find AI amplifies the rate at which that debt compounds.
  • Quality improvements substantially exceed the time savings metric: pre-merge bug detection, documentation completeness, and architectural clarity improved materially, even where raw speed gains were modest.

1. Introduction

The productivity claims attached to AI-assisted software development tools are pervasive and largely unvalidated by controlled empirical data. “10x productivity,” “80 percent time savings,” and “ship in days what used to take weeks” are marketing propositions, not measured outcomes. This paper reports measured outcomes. Over seven weeks of production platform development, every hour was tracked by activity type, all commits were logged, and task time estimates were compared against AI-assisted actuals. The methodology is not rigorous by research standards — it is a single practitioner’s data from a single project — but it is specific, and it produces findings that diverge substantially from the marketing narrative. The findings are presented not to argue that AI is overvalued, but to provide a more accurate model for where AI creates value, where it does not, and how engineering organizations should measure and manage the transition.

2. The Productivity Myth: Examining the Premise

The standard productivity narrative rests on a flawed premise: that typing code is the primary bottleneck in software engineering. The stated model:
  • AI writes code instantly
  • Engineers save 80–90 percent of development time
  • Lines of code per hour is the primary productivity metric
The structural problem with this model: if typing speed were the primary bottleneck, productivity tools that address typing speed — keyboard layouts, faster editors, autocomplete — would already have produced the claimed gains. They have not, because the primary bottlenecks in software engineering are not at the implementation layer.

The actual bottlenecks in software engineering:
  • Understanding requirements with sufficient precision to implement them correctly
  • Making the right architectural and design decisions
  • Catching bugs before production deployment
  • Maintaining system coherence as the codebase grows
AI eliminates typing as a bottleneck and exposes the actual bottlenecks. This is genuinely valuable. It is not 10x faster.

3. Time Distribution Analysis

3.1 Pre-AI Time Allocation

| Activity | Time Share | Primary Bottleneck |
|---|---|---|
| Implementation (writing code, debugging syntax) | 80% | Typing speed, context switching |
| Design and architecture | 15% | Frequently abbreviated due to implementation pressure |
| Code review and verification | 5% | Frequently abbreviated due to time pressure |

3.2 AI-Assisted Time Allocation

| Activity | Time Share | Primary Bottleneck |
|---|---|---|
| Design and architecture | 30% | Clarity of requirements and constraints |
| Output verification and review | 40% | Rigor of AI output checking |
| Prompt crafting and context preparation | 30% | Precision of specification |
Net aggregate time reduction: approximately 20–28 percent, not 90 percent.

The redistribution is significant independent of the aggregate time savings. Work that previously occupied 80 percent of engineering time now occupies 30 percent; work that previously occupied 15 percent now occupies 30 percent. The character of engineering work changes fundamentally; the volume of work does not decrease proportionally.

4. Task-Level Performance Data

4.1 Week 2: Systematic Implementation (High-Value Case)

Task: CRM domain implementation — Account, Contact, Lead, and Opportunity entities.
| Metric | Value |
|---|---|
| Production code generated | 6,800+ lines |
| Test code generated | 2,400+ lines |
| Commits | 216 |
| Pre-merge bugs caught by independent review | 18 |
| Manual estimate | 120 hours (3 weeks) |
| Actual time with AI | 28 hours (3.5 days) |
| Speedup | 4.3x |
Conditions that enabled this result: The task was well-scoped and systematic. Each entity followed a consistent pattern. Requirements were explicit. The planning phase (Evaluator) consumed 20 percent of total time; implementation 60 percent; verification 20 percent.

4.2 Week 4: Feature Development (Representative Case)

The surface metric: 4,000 lines of code in 3 days. This appears consistent with 10x productivity claims. The complete timeline:
| Phase | Time |
|---|---|
| ADR-0020 architecture documentation | 16 hours (2 days) |
| Implementation | 24 hours (3 days) |
| Verification and bug correction | 8 hours (1 day) |
| Total | 48 hours (6 days) |
Manual estimate for equivalent feature: 10–12 days. Actual speedup: Approximately 2x, not 10x. The line count metric is misleading because it excludes the planning time that made AI implementation possible.

4.3 Week 5: Breaking Change Remediation (Failure Case)

Task: Update macro signature for InMemoryRepository (breaking change affecting 95 files).
| Metric | AI-Assisted | Manual |
|---|---|---|
| Time to completion | 24 hours (failed) | 90 minutes |
| Commits | 31 | 3 |
| Final state | 63 errors remaining | Clean build |
| Relative performance | 16x slower | Baseline |
Structural explanation: The macro signature change produced a consistent pattern that needed to be applied globally across all 95 affected files. AI’s local error correction approach could not identify or apply the global pattern. Each fix introduced new errors in dependent files. Manual batch remediation resolved the pattern in a single pass.
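The "single pass" that succeeded manually can be sketched as a mechanical rewrite applied to every affected file at once, rather than error-by-error correction. The old and new call shapes below are illustrative stand-ins, since the article does not show the real signatures; in practice the same function would be mapped over all 95 files via std::fs.

```rust
// Sketch of single-pass batch remediation: one known textual rewrite,
// applied uniformly. The call shapes are hypothetical examples.
fn remediate(source: &str) -> String {
    // Breaking change (illustrative): call sites must now pass `id` by reference.
    // Old call shape:  repo.pk_for_id(id)
    // New call shape:  repo.pk_for_id(&id)
    source.replace(".pk_for_id(id)", ".pk_for_id(&id)")
}

fn main() {
    let before = "let key = repo.pk_for_id(id);";
    // Files not containing the pattern pass through unchanged,
    // so the rewrite is safe to apply to the whole tree in one pass.
    println!("{}", remediate(before));
}
```

The point is not the string replacement itself but the shape of the fix: the pattern is identified once, globally, and applied everywhere, which is exactly the step the AI's local error-correction loop never took.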

4.4 Aggregate Performance (Weeks 1–7)

| Task Category | Speedup | Conditions |
|---|---|---|
| Systematic implementation | 8–10x | Well-defined patterns, explicit requirements |
| Feature development | 2–4x | Normal features with adequate design |
| Breaking changes | 0.1x (slower) | System-wide refactoring, cascading errors |
| Novel problems | ~1x (no gain) | Requires creative architectural decisions |
| Documentation | Effectively infinite | Work that would not occur manually |
| Verification and testing | 2–3x | Comprehensive coverage, edge case discovery |
| Overall average | ~3x | Across all task types |
Summary metrics (weeks 1–7):
  • Total production code: 8,000+ lines
  • Total commits: 524
  • System scope: 15 entities, 37 routes, 8 workers
  • Test coverage: 92 percent (8,464 of 9,200 lines tested)
  • Estimated time without AI: approximately 280 hours
  • Actual time with AI: approximately 200 hours
  • Net time savings: 28 percent
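The gap between the table's ~3x average speedup and the 28 percent net saving is a weighting effect: aggregate savings are driven by wall-clock hours, and the slow categories dominate elapsed time. The hours and speedups below are an illustrative mix chosen to be consistent with the reported 280-to-200-hour totals, not the measured per-task data.

```rust
// Aggregate wall-clock saving from a mix of (manual_hours, speedup) tasks.
// A task at speedup s takes manual_hours / s with AI assistance.
fn aggregate_savings(tasks: &[(f64, f64)]) -> f64 {
    let manual: f64 = tasks.iter().map(|(h, _)| *h).sum();
    let actual: f64 = tasks.iter().map(|(h, s)| *h / *s).sum();
    1.0 - actual / manual
}

fn main() {
    // Hypothetical mix: systematic work at 9x, features at 3x,
    // one breaking change at 0.1x (10x slower), novel work at 1x.
    // Unweighted mean speedup: (9 + 3 + 0.1 + 1) / 4 ≈ 3.3x,
    // yet the wall-clock saving is far smaller.
    let tasks = [(100.0, 9.0), (120.0, 3.0), (10.0, 0.1), (50.0, 1.0)];
    println!("aggregate saving: {:.0}%", aggregate_savings(&tasks) * 100.0);
    // prints: aggregate saving: 28%
}
```

A single 10x-slower task consumes the hours that the 9x tasks freed up, which is why the breaking-change failure mode matters far more to the aggregate than its share of the task count suggests.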

5. The Structural Reframe: What AI Actually Changes

5.1 AI Amplifies Design Decisions — In Both Directions

Positive amplification: Good architectural decisions, applied consistently at scale.
```rust
#[derive(DomainAggregate, DomainEvent)]
#[capsule_isolated]
pub struct Lead { ... }
```

Once the isolation pattern was established, AI applied it consistently across 15 entities, generating 4,702 lines of boilerplate with zero isolation violations. The architectural decision was made once; AI executed it systematically.

Negative amplification: Poorly scoped breaking changes, compounded by local error correction.
```rust
// Breaking change to macro signature
fn pk_for_id(self, id) -> String
```

A macro signature change with undefined global impact produced 24 hours of failed remediation attempts. The architectural decision to make a breaking change without a migration plan was amplified by AI’s inability to resolve the resulting cascade.

Implication: The quality of outcomes under AI-assisted development is more tightly coupled to the quality of architectural decisions than under manual development, because AI executes design decisions at higher velocity and scale.
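The "decide once, apply everywhere" character of the positive case can be sketched without macros. The article does not show what `#[capsule_isolated]` expands to, so the trait and key convention below are hypothetical; they illustrate the shape of a rule defined once and mechanically applied per entity.

```rust
// Hypothetical sketch of an isolation pattern: the storage-key rule
// lives in one trait, and each entity only declares its own capsule.
trait CapsuleIsolated {
    const CAPSULE: &'static str;
    fn id(&self) -> &str;
    // The convention is defined exactly once, here, so no entity
    // can accidentally write a key outside its own capsule.
    fn storage_key(&self) -> String {
        format!("{}:{}", Self::CAPSULE, self.id())
    }
}

struct Lead { id: String }
struct Contact { id: String }

impl CapsuleIsolated for Lead {
    const CAPSULE: &'static str = "lead";
    fn id(&self) -> &str { &self.id }
}
impl CapsuleIsolated for Contact {
    const CAPSULE: &'static str = "contact";
    fn id(&self) -> &str { &self.id }
}

fn main() {
    let lead = Lead { id: "42".into() };
    println!("{}", lead.storage_key()); // prints: lead:42
}
```

Replicating this boilerplate across 15 entities is exactly the kind of systematic, fully specified work where the 8–10x speedup materializes: the design decision lives in one place, and the per-entity code is pure pattern application.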

5.2 AI Eliminates Typing as a Bottleneck; It Exposes Thinking as the Binding Constraint

Before AI: Vague requirements could be resolved through progressive refinement during implementation. The act of writing code revealed misunderstandings and forced clarification incrementally. With AI: Vague requirements produce hallucinated implementations. The specification must be complete before AI begins; incompleteness is exposed immediately as incorrect output. To obtain high-quality AI output, the following preconditions must be satisfied:
  1. Requirements must be unambiguous and complete
  2. Architectural decisions must be documented (ADR format)
  3. Success criteria must be explicit
  4. Constraints must be stated, not assumed
This discipline improves design quality independent of AI. Engineers who internalize it will produce better architectures than they would through incremental refinement during implementation. However, engineers who do not invest in this discipline will find AI amplifies the rate at which vague requirements compound into technical debt. For a detailed treatment of deliberate pre-implementation thinking, see The Lost Art of Mental Compilation.

5.3 AI Changes the Economics of Engineering Work

Work with poor effort-to-value ratio under manual economics becomes viable under AI-assisted economics:
| Work Category | Manual Effort | AI-Assisted Effort | Outcome |
|---|---|---|---|
| Comprehensive E2E test suite (21 scenarios) | 4–6 hours per scenario (~100 hours) | 45–60 minutes per scenario (~20 hours) | Exists; would not have otherwise |
| 35-page organization model | Would not have been written | 8 hours | Written |
| 127 API routes tagged with visibility | Mind-numbing and error-prone | 3 hours | Complete and consistent |
| Complete ADR documentation | Written inconsistently, skipped under pressure | Template-based, generated during planning | Full decision history |
The aggregate effect: work that provides genuine engineering value but exceeded the cost threshold under manual economics is now economically viable. This is where the genuine transformation resides — not in raw speed, but in the categories of work that can now be executed at all.
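For concreteness, a scenario of the kind the table prices at 45–60 minutes looks something like the following. The in-memory `App` and its operations are hypothetical stand-ins, not the platform's real API; the point is the fixed shape (setup, action, assertion) that makes each additional scenario cheap to specify.

```rust
use std::collections::HashMap;

// Hypothetical in-memory stand-in for the system under test.
struct App { leads: HashMap<String, String> }

impl App {
    fn new() -> Self { App { leads: HashMap::new() } }
    // Creating a lead succeeds only if the id is not already taken.
    fn create_lead(&mut self, id: &str, name: &str) -> bool {
        if self.leads.contains_key(id) { return false; }
        self.leads.insert(id.to_string(), name.to_string());
        true
    }
    fn lead_name(&self, id: &str) -> Option<&str> {
        self.leads.get(id).map(|s| s.as_str())
    }
}

// Scenario: create a lead, read it back, reject a duplicate id.
fn scenario_lead_lifecycle() -> bool {
    let mut app = App::new();
    let created = app.create_lead("L-1", "Acme");
    let read_back = app.lead_name("L-1") == Some("Acme");
    let dup_rejected = !app.create_lead("L-1", "Duplicate");
    created && read_back && dup_rejected
}

fn main() {
    println!("scenario passed: {}", scenario_lead_lifecycle());
}
```

Twenty-one such scenarios at 4–6 manual hours each was never going to survive a schedule negotiation; at under an hour each, the suite gets written.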

6. Implications for Engineering Practice

6.1 The Measurement Problem

Teams measuring AI productivity by lines of code per hour or features shipped per sprint will capture a narrow and misleading subset of the actual impact. The correct measurement framework:
| Metric | Before AI | With AI | Direction |
|---|---|---|---|
| Time spent on design before any code | ~15% | ~30% | ↑ (valuable) |
| Pre-merge bug detection | Baseline | 18+ bugs/sprint caught pre-merge | |
| Documentation completeness | Frequently incomplete | Consistently complete | |
| Architectural decision documentation | Inconsistent | Systematic | |
| Raw time saved | Baseline | ~28% | Modest improvement |
Organizations that measure only the last row will consistently undervalue AI adoption. Organizations that measure all rows will see a more complete picture.

6.2 The Differentiation Effect

AI differentiates engineering capability by surface area, not absolute skill level:
  • Engineers who relied on implementation speed as a primary differentiator will find that advantage neutralized. Implementation speed is now supplied by tooling.
  • Engineers with strong architectural instincts will find their leverage multiplied. The quality of their design decisions now scales to the execution capacity of AI.
  • Engineers with weak architectural foundations will find AI accelerates their rate of technical debt production. Code that looks correct but is architecturally unsound will be generated faster than it can be reviewed.
The uncomfortable implication: AI does not raise the floor of engineering output quality uniformly. It raises the ceiling for engineers with strong design foundations and accelerates the degradation curve for those without.

6.3 Documentation as a First-Class Output

Under manual development economics, documentation is a secondary output produced inconsistently when time permits. Under AI-assisted economics, the marginal cost of documentation approaches zero. This changes the professional standard. Complete documentation — architectural decision records for every significant design decision, API documentation with examples, workflow guides for every process — is now achievable at reasonable cost. Organizations that do not capture this value are choosing to maintain documentation debt rather than accepting a structural constraint.

7. Recommendations

  1. Replace lines-of-code and features-shipped metrics with a composite measurement framework that includes time invested in pre-implementation design, pre-merge bug detection rate, documentation completeness, and architectural decision record coverage. Speed metrics alone will consistently misrepresent AI adoption value.
  2. Invest in prompt specification quality as a core engineering skill. The ability to write unambiguous, constraint-complete specifications is the binding constraint on AI output quality. Organizations should treat this skill with the same investment priority as code review, testing practice, and architectural design.
  3. Establish documentation as a required deliverable, not an optional enhancement. AI-assisted economics make complete documentation viable. Teams that do not require it are accepting preventable future onboarding and maintenance costs.
  4. Identify and protect human expertise in the categories where AI does not accelerate: novel architectural design, security threat modeling, breaking change remediation, and domain-specific business logic validation. These categories require human expertise; AI adoption does not reduce that requirement.
  5. Set expectations with stakeholders based on task-type-specific speedup data, not aggregate marketing claims. Systematic implementation tasks will show 8–10x improvement; architectural work will show 1–2x improvement; breaking changes will show degraded performance. Aggregate expectations should reflect the actual task mix.

8. Conclusion

The “10x productivity” framing is both inaccurate and misleading as a guide to AI adoption strategy. The accurate framing, supported by the data presented here, is: AI redistributes engineering effort from implementation toward design and verification, makes previously uneconomical work viable, and amplifies the consequences of architectural decisions in both directions. The net aggregate time savings — 28 percent in this engagement — are material and real. They are not transformative on their own.

The transformation lies in what becomes possible at the margins: exhaustive test coverage, complete architectural documentation, consistent application of design patterns at scale, and the shift of engineering effort toward the work that most determines long-term system quality.

As AI tooling continues to mature, the categories where AI performs reliably will expand. The systematic implementation speedup of 8–10x that characterizes the current generation of tools may extend to more complex tasks as context window capacity grows and instruction-following precision improves. The categories where human judgment is structurally required — architectural design, security reasoning, domain validation — will remain human responsibilities regardless of model capability. Engineering organizations that internalize this distinction now will be better positioned to capture future capability improvements as they become available.

Discussion

What is your experience with AI productivity claims? Are you actually 10x faster, or are you doing different work? Share your data: