
Executive Summary

Migrating a project management system while preserving operational continuity is a well-understood problem with a frequently over-engineered solution: the bidirectional synchronization daemon. This analysis examines an alternative approach — dual-write at the agent skill layer — applied to migrate an active engineering environment from a legacy issue tracking system to a new project management platform without interruption to ongoing work. The central finding is that organizations operating AI agent workflows have a structural advantage in ITSM migration: agent skills already constitute the single write point for all project management operations, and routing those writes to both systems costs one additional HTTP call per operation. No new infrastructure is required, no synchronization service is introduced, and cutover reduces to a configuration change rather than a migration event.

Key Findings

  • The synchronization daemon pattern introduces more operational complexity than it resolves. Bidirectional sync between two ITSM systems requires conflict resolution logic, eventual consistency tolerances, and a new service to provision, monitor, and maintain — all of which add risk during the window when the migration is most likely to encounter problems.
  • Agent skill layers constitute a natural single write plane for ITSM operations. In environments where AI agents execute project management tasks — creating tasks, updating statuses, recording observations — the skill layer already serializes all writes. That serialization point can absorb dual-write behavior without architectural change.
  • Best-effort secondary writes, with logging and no blocking, provide sufficient durability guarantees for a migration validation period. The secondary system is not a production system during dual-write; it is a validation environment. Occasional secondary write failures are informative, not catastrophic, and do not justify blocking primary operations.
  • Schema alignment before data movement prevents silent field mapping errors at import time. Custom fields on the new platform must mirror the canonical field vocabulary used by the agent layer before any records are moved. Discovering field mismatches after import produces inconsistently migrated data that is difficult to audit.
  • A one-time import of active records against a quiesced or sequenced dataset produces a clean migration baseline. The import is a bounded, repeatable operation — not a live synchronization problem — and should be treated as such.
  • Cutover in the dual-write pattern is a configuration swap, not a migration event. Promoting the secondary system to primary requires changing two values in the skill configuration: the primary endpoint and the secondary endpoint. No data movement, no downtime, no synchronization cutover risk.

1. Introduction: The Sync Daemon Trap in ITSM Migration

ITSM migrations are reliably underestimated. The appeal of a phased migration — running both systems in parallel, synchronizing state between them until the new system is validated — is real. The cost of implementing that synchronization correctly is almost always higher than anticipated.

The synchronization daemon pattern emerges from a reasonable premise: both systems should reflect the same state during the transition period, so changes to either system are propagated to the other. In practice, this premise produces a set of hard problems that are not inherent to the migration but are introduced by the synchronization approach itself.

Bidirectional synchronization requires a deterministic answer to conflict resolution: when the same record is modified in both systems within the same time window, which write wins? This question is straightforward to ask and difficult to answer without detailed knowledge of both systems’ data models, API semantics, and consistency guarantees. Most migration synchronization implementations resolve conflicts by accepting one system as authoritative — which implicitly eliminates the need for bidirectional synchronization in the first place. Unidirectional synchronization is simpler, but it forecloses the “use either system during migration” operational model that motivated the architecture.

The synchronization daemon is also a new service: it must be provisioned, configured, monitored, and kept running for the duration of the migration. It introduces failure modes — sync lag, sync failure, partial sync — that did not exist before the migration began. A migration intended to reduce operational complexity introduces operational complexity as a precondition for completion.

The dual-write pattern described in this analysis avoids these costs by eliminating the synchronization service entirely. It does so by exploiting a structural property of AI agent workflows that is not present in traditional software environments: the agent skill layer is already the exclusive write path for all ITSM operations. There is no human operator creating tickets directly in the legacy system in parallel with agent writes. There is no legacy integration producing writes outside agent control. The agent skill layer is the single write plane, and that property makes dual-write at the skill layer both sufficient and safe.

2. Migration Architecture: Three Phases Without Downtime

The migration was structured as three sequential phases, each of which could be executed and validated independently. No phase required system downtime, and no phase introduced irreversible state that would complicate rollback.

Phase 1: Schema Alignment. Custom field definitions in the new project management platform were established to mirror the canonical field vocabulary used by the agent layer. This phase produced no data movement; its output was a verified field mapping between the two systems’ schemas.

Phase 2: One-Time Import. Active records from the legacy system were imported into the new platform using a bounded, repeatable import script. The import was executed against a snapshot of active records, transforming legacy identifiers and field names into the new platform’s schema. The legacy system remained the operational primary throughout this phase.

Phase 3: Dual-Write Validation. Agent skills were updated to write to both systems on every operation. The legacy system remained primary (write failures block the operation); the new platform received secondary writes on a best-effort basis (write failures are logged but do not block). This phase validated the new platform under production write volume before any cutover commitment.

Each phase is a gate. Phase 2 does not begin until Phase 1 is validated. Phase 3 does not begin until Phase 2 is validated. Cutover — designating the new platform as primary — does not occur until Phase 3 has produced sufficient confidence.
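The phase-gating discipline described above can be sketched as a small state machine. This is an illustrative sketch, not project code; the enum and function names are assumptions:

```rust
// Hypothetical sketch of the phase-gate sequencing: each phase must be
// explicitly validated before the next may begin, and cutover is the
// terminal state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    SchemaAlignment,
    OneTimeImport,
    DualWriteValidation,
    Cutover,
}

fn next_phase(current: Phase, validated: bool) -> Result<Phase, String> {
    if !validated {
        return Err(format!("{:?} not validated; cannot advance", current));
    }
    match current {
        Phase::SchemaAlignment => Ok(Phase::OneTimeImport),
        Phase::OneTimeImport => Ok(Phase::DualWriteValidation),
        Phase::DualWriteValidation => Ok(Phase::Cutover),
        Phase::Cutover => Err("already at cutover; no further phase".to_string()),
    }
}
```

The point of encoding the gates explicitly is that an unvalidated phase cannot silently advance: the error path is the default.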

3. The Dual-Write Protocol: Agent Skills Eliminate the Need for a Dedicated Sync Service

The dual-write protocol is implemented directly in the agent skill functions responsible for ITSM operations. Every skill that creates, updates, or closes a task in the legacy system is extended to perform the same operation against the new platform as a secondary write. The following implementation demonstrates the pattern for a task creation operation:
async fn create_task(payload: &CreateTaskPayload) -> Result<TaskId, SkillError> {
    // Primary write — must succeed; failure propagates to caller
    let task_id = primary_itsm.create_task(payload).await?;

    // Secondary write — best-effort; failure is logged, not propagated
    if let Err(e) = secondary_itsm.create_task(payload).await {
        tracing::warn!("[dual-write] secondary ITSM failed: {}", e);
    }

    Ok(task_id)
}
The protocol has three properties worth examining:

Asymmetric failure handling. The primary write is required. If it fails, the error propagates to the caller and the operation fails as it would without dual-write. The secondary write is advisory. If it fails, the failure is recorded in structured logs and the operation succeeds. This asymmetry is deliberate: the secondary system is under validation, not under production commitment. A failed secondary write is a data point about the new platform’s reliability, not a production incident.

No compensating transactions. The protocol does not attempt to roll back the primary write if the secondary write fails, nor does it retry the secondary write. Either of these behaviors would introduce the coordination complexity that the synchronization daemon pattern already carries. Best-effort secondary writes with persistent logging provide sufficient information to identify systematic secondary failures and address them before cutover.

Identical payload to both systems. The same CreateTaskPayload is submitted to both systems. Field translation between the two schemas occurs at payload construction time (see Section 4), not at write time. This keeps the dual-write protocol itself simple and makes it easy to verify that both systems are receiving the same data.
The dual-write protocol assumes that the agent skill layer is the exclusive write path for ITSM operations. If human operators or external integrations also write to the legacy system, those writes will not propagate to the new platform during the dual-write phase. Before implementing this pattern, audit all write sources and confirm that agent skills account for all production writes.

4. Phase 1: Schema Alignment Before Data Movement

Schema alignment is the least visible and most consequential phase of the migration. Field mapping errors discovered after data import produce inconsistently migrated records that are difficult to identify and expensive to correct. The agent skill layer operates against a canonical vocabulary of task fields: identifiers, status values, relationship fields, and domain-specific metadata. The new project management platform may use different field names, different value types, or may not have equivalents for certain fields at all. Phase 1 resolves this before any records move. The following example illustrates a custom field alignment mapping for domain-specific metadata fields:
# Field alignment mapping: canonical vocabulary → new platform custom fields
field_mappings:
  # Requirement traceability
  req_id:
    legacy_field: "cf_requirement_id"
    new_field: "custom_fields.req_id"
    type: string
    required: false

  # Pull request linkage
  pr_url:
    legacy_field: "cf_pr_url"
    new_field: "custom_fields.pr_url"
    type: url
    required: false

  pr_number:
    legacy_field: "cf_pr_number"
    new_field: "custom_fields.pr_number"
    type: integer
    required: false

  # Agent telemetry fields
  agent_session_id:
    legacy_field: "cf_agent_session"
    new_field: "custom_fields.agent_session_id"
    type: string
    required: false

  agent_skill_name:
    legacy_field: "cf_skill_name"
    new_field: "custom_fields.agent_skill_name"
    type: string
    required: false
Validation of the schema alignment phase requires more than verifying that fields exist in the new platform. Each custom field must be verified to accept the same value range as the corresponding legacy field. An integer field that accepts values up to 32,767 will silently truncate pull request numbers above that range in some platforms. A URL field that enforces a maximum length may reject long pull request URLs. These behaviors are not surfaced by field existence checks; they require test record submissions with representative values.
Do not proceed to Phase 2 until test record submissions against the new platform’s schema have validated all custom field mappings with representative values from the production dataset. A silent truncation or type coercion in a custom field will propagate through the import and produce a data quality problem that is difficult to detect post-cutover.
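The range checks described above can be expressed as a small validation pass run against each mapped field before the import. This is a sketch under assumptions — the FieldSpec shape and the 32,767 ceiling are illustrative, not facts about any particular platform:

```rust
// Hypothetical sketch of a Phase 1 value-range check: verify that each
// mapped custom field accepts representative production values before
// any records move. Field names and limits are illustrative assumptions.
struct FieldSpec {
    name: &'static str,
    max_int: Option<i64>,   // e.g. a platform integer ceiling such as 32_767
    max_len: Option<usize>, // e.g. a platform URL/string length cap
}

fn accepts_int(spec: &FieldSpec, value: i64) -> bool {
    spec.max_int.map_or(true, |max| value <= max)
}

fn accepts_str(spec: &FieldSpec, value: &str) -> bool {
    spec.max_len.map_or(true, |max| value.len() <= max)
}
```

Running checks like these with the largest pull request numbers and longest URLs in the production dataset surfaces silent truncation before it can contaminate the import.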

5. Phase 2: One-Time Import Without Drift

The one-time import moves active records from the legacy system to the new platform. It is explicitly not a synchronization operation — it is a bounded snapshot import. The distinction matters. A synchronization-framed import attempts to maintain consistency between two live systems, which requires change detection, conflict resolution, and incremental update logic. A snapshot-framed import takes a defined set of records from the legacy system at a defined point in time, transforms them, and inserts them into the new platform. It is repeatable: if the import fails or produces incorrect results, it can be dropped and re-run against the same or an updated snapshot.

The import targets active records only — issues that are open and under active management. Closed records, historical records, and archive items are excluded. This scoping decision is deliberate: closed records carry no ongoing operational dependency, and importing them would expand the import scope, increase import complexity, and provide no benefit to the migration’s primary goal of validating the new platform under operational conditions.

The import script performs the following transformation steps for each record:
  1. Extract the legacy record’s identifier, subject, description, status, priority, and custom field values.
  2. Map status and priority values to their equivalents in the new platform’s vocabulary.
  3. Translate custom field names using the alignment mapping established in Phase 1.
  4. Construct a creation payload in the new platform’s API format.
  5. Submit the creation request and record the new platform’s assigned identifier alongside the legacy identifier in a migration manifest.
The migration manifest — a persistent mapping from legacy identifiers to new platform identifiers — is an output of Phase 2. It is required for Phase 3: when the dual-write layer creates a new record on the new platform, it uses the new platform’s identifier for that record. For records that existed before dual-write was enabled, cross-reference operations require the manifest.
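The per-record transformation and manifest recording described in steps 1–5 can be sketched as follows. The record shapes, the status vocabulary, and the placeholder identifier are all illustrative assumptions — a real import would use the identifier returned by the new platform’s API in step 5:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the import transformation for a single record.
struct LegacyRecord {
    id: String,
    subject: String,
    status: String, // legacy vocabulary, e.g. "In Progress"
}

struct NewTaskPayload {
    title: String,
    status: String, // new platform vocabulary
}

// Step 2: map legacy status values to the new platform's vocabulary.
fn map_status(legacy: &str) -> String {
    match legacy {
        "In Progress" => "in_progress".to_string(),
        "New" => "todo".to_string(),
        other => other.to_lowercase().replace(' ', "_"), // fallback assumption
    }
}

// Steps 1, 4, 5: extract, construct the creation payload, and record the
// legacy id → new platform id pair in the migration manifest.
fn import_record(
    record: &LegacyRecord,
    manifest: &mut HashMap<String, String>,
) -> NewTaskPayload {
    let payload = NewTaskPayload {
        title: record.subject.clone(),
        status: map_status(&record.status),
    };
    // A derived placeholder stands in for the platform-assigned identifier.
    manifest.insert(record.id.clone(), format!("NEW-{}", record.id));
    payload
}
```

Because the transformation is a pure function of the snapshot plus the Phase 1 mapping, the whole import is repeatable: dropping the target data and re-running produces the same result.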

6. Phase 3: Dual-Write and Validation Before Cutover

Phase 3 begins when the import is verified and the agent skill layer is updated to implement the dual-write protocol. The legacy system remains primary. Every new write from the agent layer — task creation, status updates, comment additions, closure — goes to both systems. Validation during Phase 3 proceeds on two dimensions:

Write success rate. Structured [dual-write] log entries provide a continuous measure of the secondary write success rate. A healthy secondary write rate — above 99% across a representative volume period — is a prerequisite for cutover. Secondary write failures that are systematic (affecting a particular operation type or field value) indicate schema or API issues on the new platform that must be resolved before cutover.

Record fidelity. Periodic spot-checks compare records between the two systems. A record created via the dual-write path should be present in both systems with identical field values. Discrepancies in field values indicate translation errors in the payload construction logic.

Phase 3 should run for a period sufficient to cover the full range of operation types in normal use. If the engineering workflow produces task creation, status updates, and closures within a typical week, a one-week dual-write validation window is a reasonable minimum.
Query the structured log output for [dual-write] entries at the end of each day during the validation period. A zero-failure day is not sufficient evidence on its own, but a pattern of zero failures across a full operational week provides strong confidence for cutover. Surface the failure rate as a metric — do not rely on absence of alerts.
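The daily check described above reduces to a rate computation over counted log events. A minimal sketch, with the 99% threshold taken from the text and the counts purely illustrative:

```rust
// Hypothetical sketch of the Phase 3 validation metric: secondary write
// success rate over a counted window of [dual-write] log events.
fn secondary_success_rate(total_writes: u64, secondary_failures: u64) -> f64 {
    if total_writes == 0 {
        return 0.0; // no evidence is not the same as success
    }
    (total_writes - secondary_failures) as f64 / total_writes as f64
}

// Cutover prerequisite from the text: above 99% across a representative
// volume period.
fn meets_cutover_threshold(rate: f64) -> bool {
    rate > 0.99
}
```

Surfacing the rate as a number, rather than relying on the absence of warnings, makes the cutover criterion checkable and auditable.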

7. Cutover: Configuration Change, Not Migration Event

Cutover in the dual-write pattern is a configuration change applied to the agent skill layer. The two systems — previously designated primary and secondary — exchange roles:
  • The new project management platform becomes primary: write failures propagate to the caller.
  • The legacy system becomes secondary: write failures are logged but do not block.
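The role swap above can be sketched as a pure function over the skill configuration. The struct and field names are illustrative, and the endpoints are placeholders:

```rust
// Hypothetical sketch of the cutover configuration swap. Cutover touches
// only this struct: no data movement, no synchronization step.
#[derive(Debug, Clone, PartialEq)]
struct DualWriteConfig {
    primary_endpoint: String,   // writes must succeed
    secondary_endpoint: String, // writes are best-effort
}

fn cutover(config: &DualWriteConfig) -> DualWriteConfig {
    // Promote the secondary to primary and retain the old primary as the
    // best-effort rollback target.
    DualWriteConfig {
        primary_endpoint: config.secondary_endpoint.clone(),
        secondary_endpoint: config.primary_endpoint.clone(),
    }
}
```

Note that applying the swap twice returns the original configuration, which is exactly the rollback property the pattern relies on.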
The new platform does not need to be “promoted” in any structural sense. All active records have been present on the new platform since the Phase 2 import. All writes since Phase 3 began have been applied to both systems. The new platform is already current. The following table compares the operational properties of the sync daemon approach and the agent dual-write approach across key migration dimensions:
| Dimension | Sync Daemon Approach | Agent Dual-Write Approach |
| --- | --- | --- |
| New infrastructure required | Yes — sync service must be provisioned and maintained | No — dual-write runs inside existing skill functions |
| Conflict resolution required | Yes — bidirectional sync produces write conflicts | No — writes are unidirectional from the agent layer |
| Cutover mechanism | Sync cutover with potential for in-flight conflict | Configuration swap: promote secondary to primary |
| Failure surface during migration | Primary system + sync daemon + secondary system | Primary system + best-effort secondary writes |
| Secondary write failure handling | Requires retry logic and conflict resolution | Log and continue; no blocking |
| Rollback mechanism | Stop sync daemon; revert client routing | Revert configuration: swap primary and secondary back |
| Applicability | Any system with shared write access | Systems where agent skills are the exclusive write path |
| Human writes to legacy system during migration | Supported (sync propagates to new system) | Not supported without extending dual-write to human write paths |
The constraint in the final row is the meaningful limitation of the agent dual-write pattern. It applies cleanly when the agent skill layer is the exclusive write path. Environments where human operators write directly to the legacy system in parallel with agent writes require either extending dual-write to cover those write paths or accepting that human writes during the migration window will not propagate to the new platform automatically.
After cutover, maintain the legacy system as a best-effort secondary write target for a post-cutover validation window — typically one to two weeks. This provides a rollback path at low cost: if the new platform exhibits reliability issues post-cutover, the legacy system has a near-current copy of all records and can be restored to primary with another configuration swap.

8. Implementation Constraints

8.1 Exclusive Write Path Requirement

The dual-write pattern’s primary constraint is architectural: it requires that the agent skill layer be the exclusive write path for ITSM operations during the migration period. If parallel write paths exist — human operators creating issues directly in the legacy system, external integrations triggering writes outside the agent layer, or webhook-driven updates from third-party tools — those writes will not be mirrored to the new platform. Auditing write sources before Phase 3 begins is not optional. A write path discovered after cutover that was not covered by dual-write produces a divergence between the two systems that must be reconciled manually.

8.2 Secondary Write Latency

Each agent operation that writes to ITSM now incurs an additional network round-trip for the secondary write. In the best-effort implementation described here, the secondary write is not on the critical path — the primary write completes before the secondary write begins, and the agent does not block on the secondary result. However, if the secondary system exhibits high latency or timeout behavior, the secondary write may extend the total operation duration. Implementing the secondary write as a non-awaited background task eliminates this latency concern at the cost of removing the synchronous log signal. The appropriate choice depends on whether operation latency or immediate failure visibility is the higher priority during the validation period.
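The non-awaited variant mentioned above can be sketched with a background thread standing in for an async runtime’s spawn. This is an illustrative sketch, not the article’s implementation; the channel replaces the synchronous log line as the failure signal:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical sketch of a fire-and-forget secondary write: the caller
// returns immediately after dispatch, and secondary failures arrive on a
// channel where a structured [dual-write] log event would be emitted.
fn spawn_secondary_write<F>(
    write: F,
    failures: mpsc::Sender<String>,
) -> thread::JoinHandle<()>
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    thread::spawn(move || {
        if let Err(e) = write() {
            // Best-effort: the failure is reported, never propagated.
            let _ = failures.send(e);
        }
    })
}
```

The trade-off is exactly the one the text names: the caller no longer observes secondary latency, but it also loses the synchronous failure signal, so failure visibility moves entirely into the log pipeline.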

8.3 Import Record Fidelity

The Phase 2 import produces a static snapshot. Any records created or modified in the legacy system between the import snapshot time and Phase 3 activation will not be present on the new platform with current state. The gap window — the time between import and dual-write activation — must be minimized and accounted for in the migration manifest. For low-volume environments, the gap window may be small enough to address with a manual re-import of records modified during the gap. For high-volume environments, the import script should be designed to support incremental re-runs against a filtered set of recently-modified records.
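An incremental re-run over the gap window reduces to filtering on modification time. A minimal sketch, with plain integer timestamps (e.g. epoch seconds) as an assumption to keep it dependency-free:

```rust
// Hypothetical sketch of the gap-window filter: select only records
// modified after the original import snapshot for an incremental re-run.
struct RecordMeta {
    id: String,
    updated_at: u64, // epoch seconds; illustrative
}

fn gap_window_records(records: &[RecordMeta], snapshot_at: u64) -> Vec<&RecordMeta> {
    records.iter().filter(|r| r.updated_at > snapshot_at).collect()
}
```

Records selected by this filter are re-imported through the same transformation path as Phase 2, and their manifest entries are updated rather than duplicated.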

9. Recommendations

  1. Audit all ITSM write paths before implementing dual-write. Run a write source audit that identifies every mechanism — agent skills, human operator interfaces, external integrations, webhook consumers — that creates or modifies records in the legacy system. Dual-write at the agent skill layer is sufficient only when agent skills account for all production writes. Document the audit result before Phase 3 begins.
  2. Establish the schema alignment mapping as a versioned artifact, not a runtime assumption. Write the field mapping explicitly (as shown in Section 4) and commit it to the repository before any import or dual-write execution. A runtime field mapping that lives only in script logic is opaque to review and difficult to audit after migration.
  3. Structure secondary write failures as queryable log events, not plain-text messages. Emit secondary write failures as structured log entries with consistent field names — dual_write_target, operation_type, http_status, error_message — so they can be aggregated and queried during the validation period. A count of secondary write failures by operation type is a more useful signal than a list of error strings.
  4. Treat the migration manifest as a first-class artifact. The mapping from legacy identifiers to new platform identifiers is required for cross-reference operations, audit trails, and rollback planning. Store it durably — not only as script output — and confirm its completeness before Phase 3 activation.
  5. Define explicit cutover criteria before Phase 3 begins. Establish the secondary write success rate threshold, the minimum dual-write validation window, and the record fidelity spot-check requirements that constitute readiness for cutover. Cutover criteria defined before validation begins are more reliable than criteria formed during the validation period under operational pressure to complete the migration.
  6. Retain the legacy system as a best-effort secondary for at least two weeks post-cutover. The cost is one additional HTTP call per ITSM write during the post-cutover window. The benefit is a near-current rollback target. Remove the legacy secondary writes only after the new platform has demonstrated reliability across a representative operational period.
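Recommendation 3’s structured failure event might look like the following sketch. The field names come from the recommendation itself; the hand-rolled JSON rendering is an assumption made to avoid a serialization dependency:

```rust
// Hypothetical sketch of a secondary write failure as a queryable
// structured event rather than a plain-text message.
struct DualWriteFailure {
    dual_write_target: String,
    operation_type: String,
    http_status: u16,
    error_message: String,
}

fn to_log_line(event: &DualWriteFailure) -> String {
    // In production a structured logging library would emit these fields;
    // the manual formatting here only illustrates the event shape.
    format!(
        "{{\"dual_write_target\":\"{}\",\"operation_type\":\"{}\",\"http_status\":{},\"error_message\":\"{}\"}}",
        event.dual_write_target, event.operation_type, event.http_status, event.error_message
    )
}
```

Events in this shape can be aggregated by operation_type or http_status during the validation period, which is the queryability the recommendation calls for.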

Conclusion

The dual-write migration pattern, applied at the agent skill layer, demonstrates that a class of problems typically addressed with dedicated synchronization infrastructure can be resolved more simply when the write plane is already centralized. The pattern is not universally applicable — it requires that agent skills constitute the exclusive write path for the systems being migrated — but in environments where that precondition holds, it reduces a multi-service migration architecture to a configuration change bookended by two well-defined data operations.

As AI agent adoption in engineering workflows matures, the patterns that emerge from agent-centric architectures will increasingly diverge from those inherited from human-operator-centric systems. The dual-write migration pattern is one instance of a broader principle: operational transitions that are complex when write access is distributed become tractable when write access is centralized in a programmable layer. Organizations building agent workflows should recognize that centralization as a first-class architectural property, not an implementation detail, and design for the operational leverage it provides.
All content represents personal learning from personal projects. Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.