Mocked E2E suite vs real-site suite: five real bugs exposed on first run

Executive Summary

A frontend application’s E2E test suite underwent a complete architectural rewrite: all page.route() API interception and sessionStorage-based auth injection were eliminated, and the 52-spec Playwright suite was redirected to run against a permanently deployed dev environment. The motivation was not test philosophy — it was a specific observation that mocked tests were passing while real integration failures existed in production-bound code. The first full run after the rewrite produced five failures, each corresponding to a genuine application defect that the mocked suite had been concealing. This paper documents the rewrite architecture, the specific failures exposed, and the assertion design changes required to test correctly against a live environment.

Key Findings

  • API mocking in E2E tests creates a category of invisible bugs: failures in the real data layer, permission enforcement, navigation state, and CMS slug registration are all suppressed when test responses are fabricated, producing passing test suites that do not reflect application correctness.
  • The first run of a real-site E2E suite is a bug disclosure event: five application defects — concealed across months of passing mocked tests — became immediately visible when the suite ran against the actual dev environment for the first time.
  • Auth injection via sessionStorage is not equivalent to real authentication: the mocked suite bypassed the login flow entirely, preventing tests from catching UI discrepancies in the auth system itself, including button label mismatches and persona landing route divergence.
  • Assertion design must change when mocking is removed: body-text assertions that worked against fabricated responses produce false positives against real pages. URL-pattern assertions are more robust indicators of navigation success than page content checks.
  • Live-data tests require accepting valid outcome sets, not specific values: tests against a real environment must assert that a valid state rendered, not that a specific mock-injected value appeared — a structural change to test semantics that simplifies maintenance under real data churn.
  • A permanently deployed dev environment is the prerequisite: the no-mock architecture depends on a stable, continuously updated environment that the CI pipeline can target. Without it, the only alternative is mocking.

1. The Mocking Pattern and Its Failure Mode

The original test suite used two complementary mechanisms to isolate tests from the real application backend.
page.route() API interception: Playwright’s route API intercepts outgoing network requests matching a pattern and fulfills them with static JSON supplied by the test. A representative fixture:
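import { test } from '@playwright/test';

// mockData is a static fixture object supplied by each spec.
test.beforeEach(async ({ page }) => {
  // Fulfill every matching request with canned JSON; the real backend is never called.
  await page.route('**/api/operator/**', (route) =>
    route.fulfill({ status: 200, contentType: 'application/json', body: JSON.stringify(mockData) })
  );
});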
Every module’s tests carried equivalent patterns. The effect was that no test ever exercised a real backend endpoint.
sessionStorage injection: rather than navigating through the application’s login flow, the suite injected auth state directly into browser storage:
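import type { Page } from '@playwright/test';

// Writes the token straight into sessionStorage; the login UI is never exercised.
async function seedAuth(page: Page, token: string): Promise<void> {
  await page.evaluate((t) => sessionStorage.setItem('auth_token', t), token);
}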
The consequence of both mechanisms combined: tests verified that UI components rendered correctly when handed fabricated data through fabricated auth. That is a useful but incomplete property. What they could not verify was whether the application worked correctly against its real backend, real permission model, or real authentication system.
The specific observation that triggered the rewrite: a data service was returning HTTP 404 on dashboard load in the dev environment, while the mocked E2E suite showed all tests passing. The mocking had fabricated the 200 response that the real service was failing to produce, making the test suite actively misleading.
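The masking effect can be reproduced without a browser. The following is a minimal sketch, not the suite's code: the `Fetcher` type, the two stand-in backends, and the `/api/operator/dashboard` path are assumptions made for illustration. It shows the same check passing against a fabricated response and failing against a backend that returns 404.

```typescript
// Simplified stand-in for an HTTP response.
type ApiResponse = { status: number; body: string };
type Fetcher = (url: string) => ApiResponse;

// Stand-in for the real dev backend: the dashboard data service is broken.
const realBackend: Fetcher = () => ({ status: 404, body: 'Not Found' });

// Stand-in for page.route(): every request is fulfilled with static JSON.
const mockedBackend: Fetcher = () => ({
  status: 200,
  body: JSON.stringify({ widgets: [] }),
});

// The "test": does the dashboard data request succeed?
// The URL is hypothetical; the source only shows the '**/api/operator/**' pattern.
function dashboardLoads(fetcher: Fetcher): boolean {
  return fetcher('/api/operator/dashboard').status === 200;
}

const passesWithMock = dashboardLoads(mockedBackend);
const passesAgainstReal = dashboardLoads(realBackend);
```

The check itself is identical in both cases; only the source of the response differs, which is why the mocked result says nothing about the real service.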

2. Real-Site Architecture

2.1 The Dev Environment

The rewritten suite targets a permanently deployed environment that receives automatic updates on every push to the development branch via a deploy pipeline: code push → Docker image build → registry push → Kubernetes rollout. The environment is always running and always reflects the current state of the development branch. The playwright.config.ts encodes this as policy:
use: {
  baseURL: process.env.PLAYWRIGHT_BASE_URL ?? 'https://<dev-environment-url>',
  trace: 'retain-on-failure',
  screenshot: 'only-on-failure',
  video: 'retain-on-failure',
},
// NO API MOCKING — tests run against real endpoints.
// A failing test means something is genuinely broken on the dev site.
The webServer block — which previously started a local server before running tests — was removed entirely. The retries count is set to zero. There is no automatic retry on failure; a failure is a genuine signal.
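As a rough sketch of how these settings fit together, the function below builds the relevant part of the configuration from an environment map. The `E2EConfig` interface and `buildConfig` name are assumptions introduced so the snippet runs without Playwright installed; in a real project the returned object would be the default export of playwright.config.ts, with `process.env` passed in.

```typescript
// Local, simplified shape of the settings discussed above (not Playwright's own type).
interface E2EConfig {
  retries: number;
  use: { baseURL: string; trace: string; screenshot: string; video: string };
}

// Placeholder kept exactly as in the source; the real dev URL is not disclosed.
const DEFAULT_BASE_URL = 'https://<dev-environment-url>';

function buildConfig(env: Record<string, string | undefined>): E2EConfig {
  return {
    // No automatic retry: a failure is treated as a genuine signal.
    retries: 0,
    // Note the absence of a webServer entry: nothing is started locally.
    use: {
      baseURL: env['PLAYWRIGHT_BASE_URL'] ?? DEFAULT_BASE_URL,
      trace: 'retain-on-failure',
      screenshot: 'only-on-failure',
      video: 'retain-on-failure',
    },
  };
}
```

Passing a different PLAYWRIGHT_BASE_URL retargets the whole suite at another environment without touching any spec.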

2.2 Authentication Without Injection

The dev environment runs with a persona-picker login page enabled, presenting named persona buttons corresponding to each role in the application’s permission model. The loginAs() fixture replaces all sessionStorage injection:
export async function loginAs(page: Page, persona: Persona): Promise<void> {
  await page.goto('/dev-login');
  await page.getByRole('button', { name: PERSONA_BUTTON[persona] }).click();
  await page.waitForURL(`**${PERSONA_LANDING[persona]}**`, { timeout: 15_000 });
}
The waitForURL call functions as an implicit assertion. If login fails — network error, missing persona, auth system failure — the test fails at the auth step with a clear timeout, not at a downstream assertion with a cryptic element-not-found error. Each persona maps to a distinct landing route, so the wait also verifies that the application’s routing responded correctly to a successful login.
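The two lookup tables that loginAs() relies on are not shown in the source. A minimal sketch of what they might look like follows; apart from crm_user and its 'Product User' label, every persona name, button label, and route here is hypothetical.

```typescript
// Hypothetical persona set; only 'crm_user' is named in the source.
type Persona = 'crm_user' | 'finance_user' | 'admin';

// Accessible name of each persona button on the dev login page.
const PERSONA_BUTTON: Record<Persona, string> = {
  crm_user: 'Product User',
  finance_user: 'Finance User',
  admin: 'Administrator',
};

// Route each persona is expected to land on after login (illustrative values).
const PERSONA_LANDING: Record<Persona, string> = {
  crm_user: '/crm/dashboard',
  finance_user: '/finance',
  admin: '/admin/users',
};

// Builds the glob handed to page.waitForURL() inside loginAs().
function landingGlob(persona: Persona): string {
  return `**${PERSONA_LANDING[persona]}**`;
}
```

Typing both tables as `Record<Persona, string>` makes the compiler reject a persona that is missing a label or a landing route, so an incomplete mapping fails at build time rather than as a login timeout.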

2.3 Assertion Design for Live Data

Tests against a real environment cannot assert specific data values, because the data changes. The suite adopted two complementary assertion patterns.
URL-pattern assertions over body-text assertions. Rather than checking whether the page body contains specific text, tests check that the URL does not match an error-route pattern:
// Before (fragile against real pages):
await expect(page.locator('body')).not.toContainText('404');

// After (robust):
expect(page.url()).not.toMatch(/\/(404|error|not-found)/);
This distinction matters: a page can legitimately display “HTTP 404” in a data-fetch error within an otherwise correctly rendered component, without the navigation itself failing. The URL is the correct indicator of navigation success.
Valid outcome sets over specific values. When a page’s content depends on the current user’s permissions, tests assert that one of the valid outcomes rendered:
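// locator.isVisible() does not wait: it reports the state at the moment of the call,
// so the page must have finished rendering before these checks run.
const hasDashboard = await page.locator('[data-testid="finance-dashboard"]').isVisible();
const hasAccessDenied = await page.locator('[data-testid="access-denied"]').isVisible();
expect(hasDashboard || hasAccessDenied).toBe(true);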
This pattern acknowledges that both outcomes are correct depending on the user’s role configuration, without requiring the test to know which outcome applies to the current dev environment data state.
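When more than two outcomes are acceptable, the boolean expression becomes hard to read and its failure message says little. One possible generalization is sketched below; the helper names are assumptions, not part of the suite described here.

```typescript
// Returns the names of the outcomes that actually rendered.
function renderedOutcomes(observed: Record<string, boolean>): string[] {
  return Object.keys(observed).filter((name) => observed[name] === true);
}

// Passes when at least one accepted outcome rendered; otherwise throws
// with a message listing every outcome that was checked.
function assertValidOutcome(observed: Record<string, boolean>): string[] {
  const rendered = renderedOutcomes(observed);
  if (rendered.length === 0) {
    const checked = Object.keys(observed).join(', ');
    throw new Error(`None of the valid outcomes rendered: ${checked}`);
  }
  return rendered;
}
```

In a spec this could be called as `assertValidOutcome({ dashboard: hasDashboard, accessDenied: hasAccessDenied })`. The returned list also records which outcome the dev environment produced, which is useful when reviewing a run.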

3. The Five Bugs Exposed on First Run

3.1 Data Service HTTP 404

The trigger for the entire rewrite. A data service was returning HTTP 404 on the primary dashboard load. The mocked suite had been fulfilling this request with a synthesized 200 response. On first real run, AUTH-03 failed: the test navigated successfully to the target route, but the page body contained the application’s data-fetch error message. The fix applied was not to the test — it was to the backend service. The test correctly identified a real defect. The assertion adjustment (URL-pattern check replacing body-text check) was a separate correctness improvement, not a workaround for the backend failure.
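One caveat with the URL-pattern check: the expression /\/(404|error|not-found)/ also matches legitimate routes that merely begin with one of those words, such as a path ending in /error-log. A slightly stricter helper is sketched here under the assumption that error routes are exactly the segments /404, /error, or /not-found; the helper name and the example hosts in the usage note are hypothetical.

```typescript
// Matches /404, /error or /not-found only as a complete path segment.
const ERROR_ROUTE = /\/(404|error|not-found)(\/|$)/;

function isErrorRoute(url: string): boolean {
  // Drop the query string and fragment so values such as "?next=/404" cannot match.
  const path = url.split(/[?#]/)[0] ?? '';
  return ERROR_ROUTE.test(path);
}
```

In a spec the assertion would then read `expect(isErrorRoute(page.url())).toBe(false)`. For example, `isErrorRoute('https://dev.example.test/not-found')` is true, while `isErrorRoute('https://dev.example.test/reports/error-log')` is false.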

3.2 Navigation isActive Routing Bug

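SHELL-05 and SHELL-06 revealed that two navigation items were simultaneously showing aria-current="page" when only one should have been active. The root cause: the isActive function truncated each item’s path before calling startsWith. For an absolute path, path.split('/').slice(0, 2).join('/') keeps only the first path segment, because the first of the two retained elements is the empty string before the leading slash. Two routes sharing that first segment therefore both matched, activating both navigation items simultaneously.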
// Before (incorrect for multi-segment paths):
isActive: (path) => location.pathname.startsWith(path.split('/').slice(0, 2).join('/'))

// After:
isActive: (path) => location.pathname.startsWith(path)
The mocked tests had never exercised live React Router navigation state at this granularity. The test fixture injected auth and loaded pages statically; it did not navigate between routes through the application’s actual routing layer.
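The difference between the two versions is easy to check in isolation. The sketch below restates both as pure functions and adds a third, segment-aware variant; that third variant is an assumption rather than the fix described here, included because a plain startsWith(path) still treats a sibling route such as /crm/accounts-archive as a match for /crm/accounts. All route names are hypothetical.

```typescript
// Behavior of the "Before" snippet: the item path is truncated first.
function isActiveTruncated(pathname: string, itemPath: string): boolean {
  const prefix = itemPath.split('/').slice(0, 2).join('/'); // '/crm/accounts' -> '/crm'
  return pathname.startsWith(prefix);
}

// Behavior of the "After" snippet: prefix match on the full item path.
function isActivePrefix(pathname: string, itemPath: string): boolean {
  return pathname.startsWith(itemPath);
}

// Assumed stricter variant: exact match, or a child route below a '/' boundary.
function isActiveSegmentAware(pathname: string, itemPath: string): boolean {
  return pathname === itemPath || pathname.startsWith(itemPath + '/');
}
```

With the current location at /crm/accounts, isActiveTruncated reports both the /crm/accounts and /crm/contacts items as active, which is the double aria-current="page" symptom, while the other two variants report only /crm/accounts.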

3.3 Three Missing CMS Slug Registrations

Three application pages — in the finance, partner, and operator modules — returned HTTP 404 from the CMS slug API endpoint (GET /api/shell/cms/:slug). The mocked suite had been fabricating 200 responses for these endpoints. The real API call revealed that these pages had never been registered in the CMS page store. Fix: three slug registrations added to the backend CMS configuration. The tests exposed a gap between what the UI expected to exist and what the backend had actually been configured to serve.
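A gap of this kind can also be surfaced directly, before any page-level test runs. The sketch below assumes a caller-supplied `statusOf` function that returns the HTTP status for a slug; in a real spec it would wrap a request to GET /api/shell/cms/:slug. The helper name and the slug names in the usage note are hypothetical.

```typescript
// Returns the slugs whose lookup did not come back as HTTP 200.
function unregisteredSlugs(
  expected: string[],
  statusOf: (slug: string) => number,
): string[] {
  return expected.filter((slug) => statusOf(slug) !== 200);
}

// Stand-in for the CMS endpoint: only slugs listed in the table are registered.
function statusFromTable(table: Record<string, number>): (slug: string) => number {
  return (slug) => table[slug] ?? 404;
}
```

For example, `unregisteredSlugs(['crm-dashboard', 'finance-overview'], statusFromTable({ 'crm-dashboard': 200 }))` returns `['finance-overview']`, naming the page the UI expects but the backend does not serve.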

3.4 Permission Gate Behavior

The mocked finance test expected a specific dashboard component to be visible. Against the real dev environment, the test user lacked the finance:read permission, causing the application router to render an AccessDenied component instead. The mocked test had fabricated a valid finance API response, bypassing the permission gate entirely and making the gate invisible to the test. This is a particularly significant finding: the mocked suite was not testing the application as users actually experienced it. Users without the required permission would see AccessDenied. The test with fabricated data showed them a finance dashboard.

3.5 Auth UI Label Mismatch

The login fixture used a persona identifier (crm_user) that resolved to a display label ('CRM User') in the test’s assertion, but rendered as 'Product User' in the actual dev login UI. The mocked suite bypassed the login page entirely and never navigated to it, making this label discrepancy invisible. The fix was a one-line change in the test; the significance is that the discrepancy existed and went undetected until real navigation was required.

4. Scope and Coverage

The rewrite covered the complete application surface across 52 spec files:
| Module Group | Spec Files | Approximate Tests |
|---|---|---|
| Auth and shell navigation | 4 | 20 |
| Application forge (feed, inbox, tasks, requirements, agents, analytics, admin) | 8 | 34 |
| Admin (users, roles, capsules, tenant info, audit log) | 7 | 28 |
| CRM (dashboard, accounts, contacts, leads, activities, invoices) | 11 | ~30 |
| CMS (editor, drag-drop, publish, preview, version history, assets) | 7 | ~18 |
| Finance, operator, partner, capsule admin, ITSM | 7 | ~25 |
| Personas (legacy mocked specs, not yet rewritten) | 4 | |
Total: approximately 160–170 individual test cases. All rewritten modules achieved full pass rates after the five bugs identified above were resolved.
E2E tests in this architecture do not run in the standard CI pipeline, which executes type-checking and unit tests only. The E2E suite requires network access to the live dev environment. Running it as a gate on the development branch is a natural next step, but requires CI-to-dev-environment connectivity and a stable environment uptime SLA.

5. Trade-offs and Implementation Constraints

| Property | Mocked Suite | Real-Site Suite |
|---|---|---|
| Isolation | Complete: no network dependency | None: depends on live environment |
| Bug detection (integration) | Blind | Comprehensive |
| Bug detection (UI rendering) | Good | Good |
| Test data determinism | Full: fabricated values | Partial: seed data + real state |
| Auth coverage | None: injection bypasses login | Full: exercises real auth flow |
| Permission model coverage | None: mock bypasses gate | Full: real permission enforcement |
| CI dependency | None | Requires dev environment access |
| Failure signal quality | Low: failures rarely indicate real defects | High: failures indicate real defects |
The real-site architecture has one significant constraint: test data is not controlled. Tests assert render validity rather than data correctness, which is appropriate for integration-level testing but does not replace unit tests that verify data handling logic. The two test layers are complementary, not substitutable.
A permanently deployed dev environment introduces a dependency: if the dev environment is down or lagging behind the development branch, E2E tests produce false failures. The no-mock architecture requires infrastructure investment in environment stability. For pre-release validation, the recommended approach is to point the suite at a staging environment by setting the PLAYWRIGHT_BASE_URL environment variable.

6. Recommendations

  1. Treat the first real-site E2E run as a bug disclosure exercise, not a test validation. The failures are findings. Each failure should be triaged as a potential real defect before assuming the test is incorrect.
  2. Replace body-text assertions with URL-pattern assertions for navigation verification. Page content is ambiguous — a correctly rendered page can display error text from a data fetch failure. The URL is the authoritative indicator of routing success.
  3. Adopt valid-outcome-set assertions for permission-gated routes. Tests that assert one specific outcome will produce false failures when the current user’s permissions differ from the test’s assumption. Assert that one of the valid outcomes rendered.
  4. Remove the webServer block and retries configuration simultaneously. Automatic retries mask intermittent real failures. If the environment is stable enough to test against, it is stable enough to require a clean pass.
  5. Invest in the dev environment before investing in mock fidelity. The effort required to keep complex mock fixtures synchronized with real API behavior exceeds the infrastructure cost of maintaining a permanently deployed dev environment. The mocks degrade silently; the environment fails loudly.
  6. Run loginAs() through the real UI, not through storage injection, even for tests that do not test auth. Real navigation through the login flow catches discrepancies between the auth system’s behavior and the test’s assumptions about it. The cost is a few hundred milliseconds per test; the coverage gain is the entire auth layer.

Conclusion

A mocked E2E test suite does not test the application — it tests the application’s behavior when given fabricated inputs through a fabricated auth layer. For detecting regressions in rendering logic against stable, controlled data, this may be sufficient. For detecting integration failures, permission-model correctness, and routing state bugs, it is structurally inadequate. The five bugs disclosed on first real-site run had all existed in the codebase during the period when the mocked suite was passing. The passing tests were not evidence of application correctness; they were evidence of mock fidelity. The distinction matters for any team relying on E2E pass rates as a quality gate. As dev environments become easier to maintain through containerization and GitOps pipelines, the cost argument for mocking weakens. Teams that can deploy a real environment can afford to test against it.
Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.