
Executive Summary

Procedural macro code generation offers a high-leverage response to the structural uniformity problem inherent in event-sourced domain repositories. In a SaaS platform with 20 domain entities, manual repository implementations totaled 13,600 lines; after macro refactoring, the equivalent functionality required 2,400 lines of domain-specific code plus 2,917 lines of reusable macro infrastructure—a 61% total reduction and an 80% reduction in per-entity boilerplate. Time-to-add-new-entity decreased from 4–6 hours to approximately 30 minutes. AI pattern analysis accurately identified the five macro abstractions from examination of existing implementations; human judgment was required for API boundary design and compile-time validation specification. A GSI query defect in the initial implementation was identified and resolved before production deployment through compile-time attribute validation. This analysis documents the design decisions, implementation outcomes, and actionable principles derived from this work.

Key Findings

  • Five derive macros eliminated 80% of per-entity boilerplate across 20 domain entities. The macro infrastructure investment of 2,917 lines reached break-even at the fourth entity migration and yields compounding savings for every subsequent entity.
  • AI accurately extracts repeating patterns from a corpus of existing implementations. Given five representative entity implementations, the evaluator agent produced a macro design recommendation that identified the correct abstraction boundaries and required minimal revision.
  • Compile-time attribute validation is the primary mechanism for preventing macro misconfiguration. The GSI defect in the initial implementation was caused by missing compile-time validation; adding attribute dependency checks eliminated the class of misconfiguration error.
  • Test helpers generated in test-only compilation contexts provide measurably improved test authoring ergonomics. Macro-generated fixtures, event constructors, and assertion helpers reduce per-test setup overhead and enforce consistent test patterns across all entities.
  • AI does not validate macro configurations against architectural intent without explicit specification. The initial DynamoDB macro generated query operations without specifying the GSI name because the functional specification did not enumerate GSI configuration as a required attribute; the macro API must be designed by a human who understands the architectural intent.
  • The macro API design phase is the highest-leverage human contribution. Implementation of macro logic is a systematic task where AI performs well; the decisions about which invariants to enforce, which attributes to require, and what error messages to produce require architectural judgment that AI agents do not apply by default.

1. Introduction: The Scale of the Problem

By January 2025, a SaaS platform had accumulated 20 or more domain entities (Leads, Accounts, Opportunities, Products, and others). Each entity required five categories of surrounding infrastructure:
  • Event sourcing boilerplate: approximately 100 lines (version tracking, event replay, uncommitted event accumulation)
  • DynamoDB repository: approximately 300 lines (CRUD operations, entity-domain conversion, GSI queries)
  • In-memory test repository: approximately 80 lines (HashMap-backed CRUD for unit test isolation)
  • Caching decorator: approximately 150 lines (Moka-backed read-through cache with write-invalidate)
  • Event helper methods: approximately 50 lines (aggregate ID access, event type string, metadata)
At 680 lines per entity across 20 entities, this infrastructure accounted for 13,600 lines of near-identical code. The practical consequences included: multi-location manual propagation of every bug fix, copy-paste drift that introduced subtle behavioral differences across entity implementations, and 4–6 hours of ramp time for each new entity.

2. Pattern Analysis

An AI evaluator agent was provided access to five representative entity implementations and directed to identify common patterns and recommend a macro abstraction strategy. The agent’s analysis:
Analyzing Lead, Account, Opportunity, Product, Contact...

Patterns found:
1. Event sourcing: All have version, uncommitted_events, replay()
2. Repositories: All implement save(), find_by_id(), delete()
3. Test helpers: All have test_fixture(), assert_events()
4. Event enums: All need aggregate_id(), event_type()
5. Caching: All use Moka with same TTL and write-invalidate

Recommendation: Create 5 derive macros to eliminate this.
The agent correctly identified that differences between entities (Account versus Contact versus Lead) were parameter values applied to a uniform structure, not fundamentally distinct implementations. This insight—that the correct abstraction is a parameterized macro rather than entity-specific code generation—is the central design decision, and it was produced accurately by pattern analysis alone. Human contribution at this phase consisted of specifying which naming conventions to enforce (snake_case event type strings, METADATA/LIST_ITEM as DynamoDB sort key discriminators) and which configuration attributes to make explicit versus inferred.
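As an illustration of the naming convention, a hedged sketch of how the DomainEvent helper derive (covered quantitatively in Section 6) might be applied; the #[event(...)] attribute names and the exact generated signatures are assumptions, while aggregate_id() and event_type() come from the pattern analysis above:

#[derive(DomainEvent)]
#[event(aggregate = "Lead", id_type = "LeadId")]
pub enum LeadEvent {
    Created(LeadCreated),
    Updated(LeadUpdated),
}

// Assumed expansion: aggregate_id() extracts the LeadId from each variant's
// payload, and event_type() returns the snake_case discriminator used as the
// stored event type string:
//   LeadEvent::Created(_).event_type() -> "lead_created"
//   LeadEvent::Updated(_).event_type() -> "lead_updated"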

3. The Five Macros

3.1 DomainAggregate (Event Sourcing)

Before (100 lines of boilerplate):
pub struct Lead {
    id: LeadId,
    tenant_id: String,
    name: String,
    version: u64,
    uncommitted_events: Vec<LeadEvent>,
}

impl Lead {
    pub fn replay(events: Vec<LeadEvent>) -> Option<Self> {
        if events.is_empty() {
            return None;
        }

        let first = events.first()?;
        let created = match first {
            LeadEvent::Created(e) => e,
            _ => return None,
        };

        let mut aggregate = Self::from_created_event(created);

        for event in events.into_iter().skip(1) {
            aggregate.apply_event(&event);
            aggregate.version += 1;
        }

        Some(aggregate)
    }

    pub fn uncommitted_events(&self) -> &[LeadEvent] {
        &self.uncommitted_events
    }

    pub fn take_uncommitted_events(&mut self) -> Vec<LeadEvent> {
        std::mem::take(&mut self.uncommitted_events)
    }

    pub fn version(&self) -> u64 {
        self.version
    }

    fn record_event(&mut self, event: LeadEvent) {
        self.apply_event(&event);
        self.version += 1;
        self.uncommitted_events.push(event);
    }
}
After (5 lines + macro):
#[derive(DomainAggregate)]
#[aggregate(
    event = "LeadEvent",
    id_field = "id",
    id_type = "LeadId",
)]
pub struct Lead {
    id: LeadId,
    tenant_id: String,
    name: String,
    version: u64,
    #[serde(skip)]
    uncommitted_events: Vec<LeadEvent>,
}

// Only need to implement domain-specific logic:
impl Lead {
    fn apply_event(&mut self, event: &LeadEvent) {
        match event {
            LeadEvent::Created(e) => self.name = e.name.clone(),
            LeadEvent::Updated(e) => self.name = e.name.clone(),
        }
    }

    fn from_created_event(event: &LeadCreated) -> Self {
        Self {
            id: event.id.clone(),
            tenant_id: event.tenant_id.clone(),
            name: event.name.clone(),
            version: 0,
            uncommitted_events: vec![],
        }
    }
}
Generated methods: replay(events), uncommitted_events(), take_uncommitted_events(), version(), record_event(event). Test helpers (generated only in #[cfg(test)] builds): test_fixture(), assert_uncommitted_events(count), assert_last_event_type(type).
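To show how the generated methods are consumed, a minimal sketch of a domain command method, assuming a LeadUpdated payload with a single name field; update_name appears in the tests in Section 5, but its body is domain code, not macro output:

impl Lead {
    // Domain command: validation and payload construction live here; the
    // macro-generated record_event() applies the event via apply_event(),
    // increments the version, and appends it to uncommitted_events.
    pub fn update_name(&mut self, name: String) {
        self.record_event(LeadEvent::Updated(LeadUpdated { name }));
    }
}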

3.2 DynamoDbRepository (Infrastructure)

Before (300 lines per repository):
pub struct DynamoDbLeadRepository {
    client: DynamoDbClient,
    table_name: String,
}

impl DynamoDbLeadRepository {
    pub fn new(client: DynamoDbClient) -> Self { /* ... */ }

    async fn save(&self, aggregate: &Lead) -> Result<()> {
        let entity = LeadEntity::from_domain(aggregate);

        // METADATA item
        let metadata_item = entity.to_item_with_sk("METADATA")?;

        // LIST_ITEM for queries
        let list_item = entity.to_list_item()?;

        // TransactWriteItems with both
        self.client
            .transact_write_items()
            .transact_items(/* METADATA */)
            .transact_items(/* LIST_ITEM */)
            .send()
            .await?;

        Ok(())
    }

    async fn find_by_id(&self, id: &LeadId) -> Result<Option<Lead>> {
        let pk = format!("LEAD#{}", id.0);

        let result = self.client
            .get_item()
            .table_name(&self.table_name)
            .key("PK", AttributeValue::S(pk))
            .key("SK", AttributeValue::S("METADATA".to_string()))
            .send()
            .await?;

        match result.item {
            Some(item) => {
                let entity: LeadEntity = serde_dynamo::from_item(item)?;
                Ok(Some(entity.to_domain()?))
            }
            None => Ok(None),
        }
    }

    // ... 250 more lines for list(), delete(), update(), etc.
}
After (5 lines + macro):
#[derive(DynamoDbRepository)]
#[repository(
    entity = "LeadEntity",
    domain = "Lead",
    event = "LeadEvent",
    table = "crm_leads",
    gsi_name_lookup = "name",           // Optional: enables find_by_name()
    gsi_index_name = "gsi1_by_name",    // Required when gsi_name_lookup is set (see Section 4)
    multi_item = true,  // METADATA + LIST_ITEM pattern
)]
pub struct DynamoDbLeadRepository;
Generated methods: new(client), save(aggregate), find_by_id(id), list(tenant_id, capsule_id), delete(id), and find_by_name(name) when gsi_name_lookup is specified.
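A hedged usage sketch of the generated repository; the async wrapper function, the Result alias, and the example name are assumptions for illustration, while the method names come from the list above:

async fn example(client: DynamoDbClient, lead: Lead) -> Result<()> {
    // Generated constructor and CRUD methods from the derive above.
    let repo = DynamoDbLeadRepository::new(client);

    repo.save(&lead).await?;

    // find_by_name() is generated only because gsi_name_lookup is specified.
    if let Some(found) = repo.find_by_name("Acme Corp").await? {
        assert_eq!(found.version(), lead.version());
    }

    Ok(())
}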

3.3 CachedRepository (Performance)

Before (150 lines per cached repository):
pub struct CachedLeadRepository<R: LeadRepository> {
    inner: Arc<R>,
    cache: Arc<Cache<LeadId, Arc<Lead>>>,
}

impl<R: LeadRepository> CachedLeadRepository<R> {
    pub fn new(inner: R) -> Self {
        let cache = Cache::builder()
            .max_capacity(10_000)
            .time_to_live(Duration::from_secs(300))
            .build();

        Self {
            inner: Arc::new(inner),
            cache: Arc::new(cache),
        }
    }
}

#[async_trait]
impl<R: LeadRepository + Send + Sync> LeadRepository for CachedLeadRepository<R> {
    async fn save(&self, aggregate: &Lead) -> Result<()> {
        self.inner.save(aggregate).await?;
        self.cache.invalidate(&aggregate.id);  // Write-invalidate
        Ok(())
    }

    async fn find_by_id(&self, id: &LeadId) -> Result<Option<Lead>> {
        if let Some(cached) = self.cache.get(id) {
            return Ok(Some((*cached).clone()));
        }

        let result = self.inner.find_by_id(id).await?;

        if let Some(ref aggregate) = result {
            self.cache.insert(id.clone(), Arc::new(aggregate.clone()));
        }

        Ok(result)
    }

    // ... 100 more lines for list(), delete(), invalidate(), etc.
}
After (3 lines + macro):
#[derive(CachedRepository)]
#[cache(
    repository = "DynamoDbLeadRepository",
    domain = "Lead",
    id_type = "LeadId",
    max_capacity = 10_000,
    ttl_secs = 300,
)]
pub struct CachedLeadRepository;
Generated features: read-through caching on find_by_id(), write-invalidate on save() and delete(), TTL-based expiration, and capacity-bounded eviction via Moka.
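A hedged sketch of how the two generated repositories compose; the constructor shapes mirror the hand-written code shown above, and the wiring function itself is an assumption:

async fn build_and_read(client: DynamoDbClient, lead_id: LeadId) -> Result<Option<Lead>> {
    // The caching decorator wraps the DynamoDB repository; capacity and TTL
    // come from the #[cache(...)] attributes (10,000 entries, 300 seconds).
    let dynamo = DynamoDbLeadRepository::new(client);
    let repo = CachedLeadRepository::new(dynamo);

    // First read hits DynamoDB and populates the cache; repeated reads
    // within the TTL are served from Moka. save() and delete() invalidate.
    repo.find_by_id(&lead_id).await
}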

4. Defect: GSI Query Generation

Defect: The initial DynamoDbRepository macro generated code that did not handle GSI queries correctly.
Root cause: The macro generated Query operations without specifying the GSI name via .index_name(). Queries were directed at the primary table instead of the GSI.
How it was identified: Attribute validation added to the macro during review:
use proc_macro::TokenStream;
use syn::{parse_macro_input, DeriveInput};

#[proc_macro_derive(DynamoDbRepository, attributes(repository))]
pub fn derive_dynamodb_repository(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);

    let attrs = match parse_attrs(&input) {
        Ok(attrs) => attrs,
        Err(err) => return err.to_compile_error().into(),
    };

    // Validate: gsi_name_lookup is only meaningful together with gsi_index_name
    if attrs.gsi_name_lookup.is_some() && attrs.gsi_index_name.is_none() {
        return syn::Error::new_spanned(
            &input,
            "gsi_name_lookup requires gsi_index_name attribute",
        )
        .to_compile_error()
        .into();
    }

    // Generate code...
}
Principle: AI generates working code for the common path but does not validate that configuration attributes are mutually consistent. Proc macros require explicit compile-time checks for every configuration dependency.
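For contrast with the defect, a hedged sketch of what the corrected find_by_name() generation might expand to once gsi_index_name is supplied; the key and attribute names are assumptions, and the builder calls follow the aws-sdk-dynamodb query API:

// Inside the generated find_by_name(): the fix is the .index_name() call,
// which routes the query to the configured GSI instead of the base table.
let result = self.client
    .query()
    .table_name(&self.table_name)
    .index_name("gsi1_by_name") // from the gsi_index_name attribute
    .key_condition_expression("GSI1PK = :name")
    .expression_attribute_values(":name", AttributeValue::S(name.to_string()))
    .send()
    .await?;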

5. Test Helper Generation

The DomainAggregate macro generates test helper methods exclusively in test compilation contexts (#[cfg(test)]), ensuring zero runtime overhead in production binaries:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_lead_creation() {
        // Macro-generated test fixture
        let mut lead = Lead::test_fixture("tenant-123", "PRODUS");

        lead.update_name("New Name".to_string());

        // Macro-generated assertions
        lead.assert_uncommitted_events(1);
        lead.assert_last_event_type("lead_updated");
    }

    #[test]
    fn test_event_replay() {
        let events = vec![
            LeadEvent::test_created(/* ... */),
            LeadEvent::test_updated(/* ... */),
        ];

        let lead = Lead::replay(events).unwrap();

        assert_eq!(lead.name, "Updated Name");
        assert_eq!(lead.version, 1);
    }
}
Generated test helpers: test_fixture(), test_created() / test_updated() event constructors, assert_uncommitted_events(), assert_last_event_type(), and assert_roundtrip() for serialization verification.
Use cargo expand to inspect macro-generated code during development. It shows the exact expansion of each macro invocation, making it possible to verify that security invariants (tenant isolation checks, encryption calls, audit event emissions) are present in the generated output. This inspection step should be mandatory for macros that touch security-critical paths.

6. Impact Quantification

Before macros:
20 entities × 680 lines boilerplate = 13,600 lines

Repository.rs files:
- lead_repository.rs: 380 lines
- account_repository.rs: 410 lines
- opportunity_repository.rs: 450 lines
(... 17 more)
After macros:
20 entities × 120 lines (domain logic only) = 2,400 lines
+ macro infrastructure: 2,917 lines
= 5,317 lines total

Savings: 13,600 - 5,317 = 8,283 lines (61% reduction)

Repository.rs files:
- lead_repository.rs: 45 lines (90% reduction)
- account_repository.rs: 52 lines (87% reduction)
- opportunity_repository.rs: 68 lines (85% reduction)
Per-macro reduction:
  • DomainAggregate: 100 lines → 5 lines (95% reduction)
  • DynamoDbRepository: 300 lines → 5 lines (98% reduction)
  • CachedRepository: 150 lines → 3 lines (98% reduction)
  • InMemoryRepository: 80 lines → 5 lines (94% reduction; sketched below)
  • DomainEvent helpers: 50 lines → 4 lines (92% reduction)
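The InMemoryRepository derive listed above is not shown in Section 3; a hedged sketch of how it might be applied, with attribute names assumed by analogy with the DynamoDB derive:

// Hypothetical application: generates a HashMap-backed implementation of the
// same repository trait as the DynamoDB version, for unit-test isolation.
#[derive(InMemoryRepository)]
#[repository(
    domain = "Lead",
    id_type = "LeadId",
)]
pub struct InMemoryLeadRepository;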

7. Comparative Analysis: Development Economics

Metric | Before Macros | After Macros | Change
Lines per entity (total) | 680 | 120 | −82%
Time to add new entity | 4–6 hours | 30 minutes | −88%
Bug fix propagation time | 3.5 hours (7 entities, manual) | 20 minutes (macro update) | −90%
Risk of propagation error | Present (manual multi-site edit) | Eliminated (single source) | Eliminated
Architectural consistency | Enforced by convention | Enforced by compiler | Structurally guaranteed
New entity test setup | 15–20 tests, manual fixture creation | 5–8 tests, macro-generated fixtures | −60% test authoring
Break-even analysis: macro infrastructure required approximately 26.5 hours of total investment (design, implementation, verification, defect resolution). At 4 hours saved per entity, break-even occurs at the fourth entity migration. All subsequent entities yield pure savings.

8. AI-Human Division of Labor

This engagement provides evidence for a specific division of labor between AI agents and human practitioners in macro development. AI performed well at:
  • Pattern extraction from a corpus of five existing implementations
  • Implementation of 2,917 lines of macro logic from a specified API
  • Generation of 387 lines of reference documentation from the implementation
  • Diagnosis of compilation errors and root cause identification during migration
Human judgment was required for:
  • Abstraction boundary design (which patterns to expose as macros, which to leave manual)
  • Macro attribute API design (which configuration to require versus infer)
  • Security invariant specification (tenant isolation check as a generated, unconditional constraint)
  • Error message quality (actionable versus technically correct)
  • Architecture decisions triggered by defects (DynamoDB versus search service for flexible query workloads)
The pattern that emerges is: AI is well-suited for implementing a specified abstraction; humans are required for designing the abstraction and specifying its invariants. Conflating these roles—asking an AI agent to both design and implement a macro—produces implementations that are technically functional but architecturally under-specified.

9. Macro Infrastructure Structure

platform-macros/
├── Cargo.toml
├── README.md (387 lines of documentation)
├── src/
│   ├── lib.rs (405 lines - macro exports)
│   ├── domain_aggregate.rs (272 lines)
│   ├── domain_event.rs (311 lines)
│   ├── inmemory_repository.rs (284 lines)
│   ├── dynamodb_repository.rs (539 lines)
│   └── cached_repository.rs (499 lines)
└── tests/
    └── domain_aggregate_test.rs (212 lines)
16 integration tests, all passing. Zero entity test regressions across 20+ refactored entities.
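A hedged sketch of how the attribute-dependency validation from Section 4 could be covered in this test suite with a compile-fail test using the trybuild crate; the ui test paths are assumptions:

// tests/compile_fail.rs
// Each file under tests/ui/ is expected to fail compilation with the error
// recorded in its matching .stderr file, e.g. a derive that sets
// gsi_name_lookup without gsi_index_name.
#[test]
fn macro_misconfiguration_is_rejected_at_compile_time() {
    let t = trybuild::TestCases::new();
    t.compile_fail("tests/ui/*.rs");
}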

10. Recommendations

  1. Require five or more pattern instances before initiating macro development. Pattern extraction requires sufficient examples to distinguish structural uniformity from incidental similarity. Premature abstraction based on two or three examples produces macros that are poorly generalized and require frequent revision.
  2. Design the macro attribute API before engaging an implementation agent. Specify which attributes are required, which are optional, what their permitted values are, what compile-time dependencies exist between them, and what error messages should be produced when constraints are violated. This specification is the input to macro implementation; its quality determines the quality of the output (a sketch of such a specification follows this list).
  3. Add compile-time attribute dependency validation for all configuration attributes. Every attribute that depends on another attribute for correctness (for example, gsi_name_lookup depending on gsi_index_name) must be enforced at compile time. Runtime failures from misconfigured macros impose debugging costs that negate the productivity benefit.
  4. Inspect generated code with cargo expand before deploying any macro to production. Verify that security invariants are present in the generated output. This step is mandatory and cannot be replaced by functional testing alone.
  5. Establish a deprecation and backward-compatibility policy for macro APIs before the first consumer is added. Macro changes propagate simultaneously to all consumers. A breaking attribute rename is a simultaneous breaking change across the entire codebase. Treating macro APIs with the same versioning discipline as public library APIs prevents cascading rework.
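As referenced in recommendation 2, a hedged sketch of what such an attribute specification can look like once captured as the macro's parsed-attributes struct; the field names mirror the #[repository(...)] example in Section 3.2, and the struct name is an assumption:

/// Parsed #[repository(...)] attributes. A human designs this specification,
/// including the dependency rules in the comments, before handing
/// implementation to an agent. Type-naming fields are kept as strings here
/// and resolved to paths during code generation.
struct RepositoryAttrs {
    entity: String,                   // required: entity (storage) type
    domain: String,                   // required: domain aggregate type
    event: String,                    // required: event enum type
    table: String,                    // required: DynamoDB table name
    multi_item: bool,                 // optional: METADATA + LIST_ITEM pattern
    gsi_name_lookup: Option<String>,  // optional: field used for name lookups
    gsi_index_name: Option<String>,   // required if gsi_name_lookup is set
}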

11. Conclusion

Procedural macro code generation is a high-leverage technique for eliminating structurally uniform boilerplate in event-sourced domain architectures. The evidence from this analysis establishes that five macros can eliminate 80% of per-entity boilerplate across 20 or more entities, reduce time-to-new-entity from hours to minutes, and convert bug fix propagation from a multi-entity manual task into a single macro change. The technique is not without risk. Generated code can omit security invariants that were not explicitly specified. Macro API changes have amplified blast radius. AI agents will not apply security or architectural awareness to generated code without explicit specification. Each of these risks has a known mitigation: explicit invariant specification, API versioning discipline, and mandatory cargo expand inspection. The AI-human division of labor that this work demonstrates—AI for pattern extraction and systematic implementation, humans for abstraction design and invariant specification—is likely to generalize across other code generation domains. As AI-assisted development tools mature, practitioners who develop discipline around abstraction design and invariant specification will be positioned to extract maximum leverage from AI implementation capability without incurring the associated security and architectural risks.

Disclaimer: This content represents my personal learning journey using AI for a personal project. It does not represent my employer’s views, technologies, or approaches.
All code examples are generic patterns or pseudocode for educational purposes.