MVP Approach Comparison — NexGen Backend
Three architectural approaches for building the NexGen MVP backend. Each diagram shows what gets built, how it relates to the legacy platform, and the data flow.
Option A: Clean Build in NexGen
Build consolidated services fresh in the nexgen repo. Port business logic from legacy repos as needed. Legacy services continue running until NexGen replacements are validated and traffic is switched.
graph TB
subgraph "Clients"
FAN[Fan App]
ADMIN[Admin/Expert App]
end
subgraph "Edge Layer"
LB[GCP Load Balancer]
ISTIO[Istio IngressGateway<br/>Path-Based Routing]
end
subgraph NexGen["NexGen Services — Clean Build"]
direction TB
style NexGen fill:#d4edda,stroke:#28a745
subgraph "Identity Domain"
ID_SVC[identity-service<br/>Java 21 / Spring Boot 3.x<br/>celebrity + fan + users]
end
subgraph "Content Domain"
CONT_SVC[content-service<br/>content + media]
WEB_SVC[webinar-service<br/>Zoom integration]
end
subgraph "Commerce Domain"
PAY_SVC[payment-service<br/>stripe + subscriptions + wallet + txn]
PW_SVC[purchase-workflow<br/>Spring State Machine]
INV_SVC[inventory-service<br/>product catalog hub]
SHO_SVC[shoutout-service<br/>offers + fulfillment]
end
subgraph "Communication Domain"
NOT_SVC[notification-service<br/>email + SMS + push]
CHAT_SVC[chat-service]
SSE_SVC[sse-service]
end
subgraph "Platform Domain"
PLT_SVC[platform-services<br/>tags + tracking + org]
SRCH_SVC[search-service]
end
subgraph "Learning Domain"
CLS_SVC[class-catalog-service<br/>courses + journeys]
EVT_SVC[event-service]
end
end
subgraph Legacy["Legacy Services — Running Until Cutover"]
direction TB
style Legacy fill:#fff3cd,stroke:#ffc107
LEG_CEL[celebrity]
LEG_FAN[fan]
LEG_CONT[content]
LEG_MEDIA[media]
LEG_STRIPE[stripe]
LEG_SUB[subscriptions]
LEG_INV[inventory]
LEG_OTHER[... 20+ more services]
end
subgraph NexDB["NexGen Consolidated Database — 6 Domains"]
direction LR
style NexDB fill:#d4edda,stroke:#28a745
DB_ID[(identity_db<br/>celebrity.* / fan.* / users.*)]
DB_COM[(commerce_db<br/>payments.* / shoutout.* / inventory.*)]
DB_CONT[(content_db<br/>content.* / media.* / learning.*)]
DB_COMM[(communication_db<br/>notifications.* / chat.* / sse.*)]
DB_PLT[(platform_db<br/>tags.* / tracking.* / org.*)]
DB_KC[(keycloak_db)]
end
subgraph LegDB["Legacy Databases — 35 per tenant"]
style LegDB fill:#fff3cd,stroke:#ffc107
LEG_DB[(35 separate PostgreSQL databases)]
end
subgraph "External APIs"
STRIPE_API[Stripe]
MUX_API[Mux]
ZOOM_API[Zoom]
STREAM_API[Stream Chat]
end
FAN --> LB --> ISTIO
ADMIN --> LB
ISTIO -->|new paths| ID_SVC
ISTIO -->|new paths| CONT_SVC
ISTIO -->|new paths| PAY_SVC
ISTIO -->|legacy paths| LEG_CEL
ISTIO -->|legacy paths| LEG_CONT
ISTIO -->|legacy paths| LEG_STRIPE
ID_SVC --> DB_ID
CONT_SVC --> DB_CONT
PAY_SVC --> DB_COM
NOT_SVC --> DB_COMM
PLT_SVC --> DB_PLT
LEG_CEL --> LEG_DB
LEG_FAN --> LEG_DB
LEG_CONT --> LEG_DB
PAY_SVC --> STRIPE_API
CONT_SVC --> MUX_API
WEB_SVC --> ZOOM_API
CHAT_SVC --> STREAM_API
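The "new paths vs. legacy paths" split at the Istio layer boils down to longest-prefix routing. A minimal sketch, assuming hypothetical path prefixes and upstream names (not taken from the actual route table):

```python
# Sketch of strangler-fig path routing: requests whose path matches a
# migrated prefix go to the NexGen service; everything else falls through
# to the legacy backend. Prefixes and upstream names are illustrative only.

MIGRATED_PREFIXES = {
    "/api/identity": "identity-service",
    "/api/content": "content-service",
    "/api/payments": "payment-service",
}

LEGACY_UPSTREAM = "legacy-cluster"

def route(path: str) -> str:
    """Return the upstream for a request path; longest prefix wins."""
    for prefix in sorted(MIGRATED_PREFIXES, key=len, reverse=True):
        if path.startswith(prefix):
            return MIGRATED_PREFIXES[prefix]
    return LEGACY_UPSTREAM
```

In the real deployment this table lives in Istio VirtualService rules, so switching an endpoint to NexGen is a routing change, not a code change.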
Pros
- Clean codebase from day one — no legacy baggage
- Consolidated schema designed correctly upfront
- Can adopt latest patterns (OpenTelemetry, Spring State Machine) without refactoring
- Clear separation: legacy runs until NexGen is ready
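As an illustration of the Spring State Machine point: a purchase workflow reduces to an explicit transition table, which the library expresses declaratively. The states and events below are hypothetical, not the actual purchase-workflow model:

```python
# Hypothetical purchase-workflow states and events, sketched as a plain
# transition table -- the same idea Spring State Machine encodes in config.

TRANSITIONS = {
    ("CART", "CHECKOUT"): "PENDING_PAYMENT",
    ("PENDING_PAYMENT", "PAYMENT_OK"): "PAID",
    ("PENDING_PAYMENT", "PAYMENT_FAILED"): "CART",
    ("PAID", "FULFILLED"): "COMPLETE",
}

def fire(state: str, event: str) -> str:
    """Apply an event; undefined transitions leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```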
Cons
- Must re-implement ~24 GraphQL endpoints, ~75 RabbitMQ contracts, 11 API integrations
- Data migration from 35 legacy DBs into 6 consolidated DBs is a big lift
- Dual running costs (legacy + NexGen) during transition
- Risk of divergence — legacy gets hotfixes that NexGen must track
Effort Profile
- Upfront: High (build everything from scratch)
- Ongoing: Low (clean codebase, no tech debt)
- Risk: Medium-high (data migration, feature parity gap)
Option B: Consolidate In-Place
Follow H14’s recommendation — merge existing Gen 2 services into consolidated modules within their existing repos, then migrate into nexgen. Refactor working code rather than rewriting it.
graph TB
subgraph "Clients"
FAN[Fan App]
ADMIN[Admin/Expert App]
end
subgraph "Edge Layer"
LB[GCP Load Balancer]
ISTIO[Istio IngressGateway]
end
subgraph Phase1["Phase 1 — Merge Services In Existing Repos"]
direction TB
style Phase1 fill:#cce5ff,stroke:#004085
subgraph "identity-service repo"
ID_MOD_CEL[celebrity module<br/>from: celebrity repo]
ID_MOD_FAN[fan module<br/>from: fan repo]
ID_MOD_USR[users module<br/>from: users repo]
end
subgraph "payment-service repo"
PAY_MOD_STR[stripe module<br/>from: stripe repo]
PAY_MOD_SUB[subscriptions module<br/>from: subscriptions repo]
PAY_MOD_WAL[wallet module<br/>from: wallet repo]
PAY_MOD_TXN[transaction module<br/>from: transaction repo]
end
subgraph "content-service repo"
CON_MOD_CON[content module<br/>from: content repo]
CON_MOD_MED[media module<br/>from: media repo]
end
subgraph "Unchanged Services"
INV[inventory-service<br/>stays separate]
WEB[webinar-service<br/>stays separate]
SSE[sse-service<br/>stays separate]
SRCH[search-service<br/>stays separate]
OTHER[... 8 more unchanged]
end
end
subgraph Phase2["Phase 2 — Database Consolidation"]
direction LR
style Phase2 fill:#cce5ff,stroke:#004085
DB_ID[(identity_db<br/>ALTER TABLE SET SCHEMA<br/>celebrity.* / fan.* / users.*)]
DB_COM[(commerce_db<br/>payments.* / shoutout.* / inventory.*)]
DB_CONT[(content_db<br/>content.* / media.* / learning.*)]
end
subgraph LegDB["Legacy Databases — Migrating"]
style LegDB fill:#fff3cd,stroke:#ffc107
CEL_DB[(celebrity-db)] -->|schema move| DB_ID
FAN_DB[(fan-db)] -->|schema move| DB_ID
STR_DB[(stripe-db)] -->|schema move| DB_COM
WAL_DB[(wallet-db)] -->|schema move| DB_COM
CONT_DB[(content-db)] -->|schema move| DB_CONT
MED_DB[(media-db)] -->|schema move| DB_CONT
end
subgraph Phase3["Phase 3 — Move to NexGen Monorepo"]
direction TB
style Phase3 fill:#d4edda,stroke:#28a745
NX[nexgen/ monorepo<br/>All consolidated services<br/>moved here as modules]
end
FAN --> LB --> ISTIO
ADMIN --> LB
ISTIO --> ID_MOD_CEL
ISTIO --> PAY_MOD_STR
ISTIO --> CON_MOD_CON
ISTIO --> INV
Phase1 -->|migrate code| Phase3
Phase2 -->|migrate data| Phase3
Pros
- Preserves all existing business logic, integrations, and contracts
- Database migration is mechanical (ALTER TABLE SET SCHEMA)
- No feature parity gap — production code moves, not rewrites
- Lower risk — each service merges independently
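The "mechanical" claim can be made concrete: the schema move is one ALTER TABLE ... SET SCHEMA statement per legacy table. A sketch that generates the DDL, with hypothetical table and schema names:

```python
# Sketch: generate the DDL that moves tables from a legacy database's
# public schema into a named schema of the consolidated database.
# Table and schema names here are hypothetical examples.

def schema_move_ddl(tables: list[str], target_schema: str) -> list[str]:
    stmts = [f"CREATE SCHEMA IF NOT EXISTS {target_schema};"]
    stmts += [
        f"ALTER TABLE public.{t} SET SCHEMA {target_schema};" for t in tables
    ]
    return stmts
```

In PostgreSQL the SET SCHEMA move is a catalog update, not a data copy, which is what keeps this phase low-risk.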
Cons
- Inherited tech debt (zero test coverage, Gen 1 patterns in some services)
- Messy intermediate state — partially merged repos
- Three-phase process adds coordination overhead
- Core-lib coupling carries forward unchanged
Effort Profile
- Upfront: Medium (merge existing code, refactor structure)
- Ongoing: Medium (tech debt cleanup continues post-merge)
- Risk: Low-medium (preserving working code)
Option C: Hybrid — Schema First, Strangler Fig API
Build the consolidated database schema fresh in NexGen. Stand up thin API services that coexist with legacy via strangler fig routing. Migrate traffic endpoint-by-endpoint.
graph TB
subgraph "Clients"
FAN[Fan App]
ADMIN[Admin/Expert App]
end
subgraph "Edge Layer"
LB[GCP Load Balancer]
ISTIO[Istio IngressGateway<br/>Strangler Fig Router]
end
subgraph NexAPI["NexGen API Layer — Thin Services"]
direction TB
style NexAPI fill:#d4edda,stroke:#28a745
GW[GraphQL Gateway<br/>Unified schema]
subgraph "Migrated Endpoints"
ID_API[identity-service<br/>GET /profiles ✅<br/>GET /follows ✅<br/>POST /profiles ✅]
CONT_API[content-service<br/>GET /content ✅<br/>GET /media ✅]
end
subgraph "Not Yet Migrated"
PAY_STUB[payment endpoints<br/>→ routes to legacy]
INV_STUB[inventory endpoints<br/>→ routes to legacy]
SHO_STUB[shoutout endpoints<br/>→ routes to legacy]
end
end
subgraph Legacy["Legacy Services — Still Serving Some Endpoints"]
direction TB
style Legacy fill:#fff3cd,stroke:#ffc107
LEG_STRIPE[stripe service]
LEG_INV[inventory service]
LEG_SHOUT[shoutout service]
LEG_SUB[subscriptions service]
LEG_OTHER[... remaining legacy]
end
subgraph Sync["Data Sync Layer"]
direction TB
style Sync fill:#e2d5f1,stroke:#6f42c1
CDC[Change Data Capture<br/>Airbyte / Debezium]
SYNC_ID[Identity sync<br/>legacy → nexgen]
SYNC_CONT[Content sync<br/>legacy → nexgen]
end
subgraph NexDB["NexGen Consolidated Database"]
direction LR
style NexDB fill:#d4edda,stroke:#28a745
DB_ID[(identity_db<br/>celebrity.* / fan.*)]
DB_CONT[(content_db<br/>content.* / media.*)]
DB_COM[(commerce_db<br/>payments.* / inventory.*)]
DB_COMM[(communication_db)]
DB_PLT[(platform_db)]
DB_KC[(keycloak_db)]
end
subgraph LegDB["Legacy Databases"]
style LegDB fill:#fff3cd,stroke:#ffc107
LEG_DB[(35 separate databases<br/>source of truth until<br/>endpoint migrated)]
end
FAN --> LB --> ISTIO
ADMIN --> LB
ISTIO -->|migrated endpoints| GW
ISTIO -->|not-yet-migrated| LEG_STRIPE
ISTIO -->|not-yet-migrated| LEG_INV
GW --> ID_API
GW --> CONT_API
GW --> PAY_STUB -->|proxy| LEG_STRIPE
GW --> INV_STUB -->|proxy| LEG_INV
ID_API --> DB_ID
CONT_API --> DB_CONT
LEG_STRIPE --> LEG_DB
LEG_INV --> LEG_DB
CDC --> SYNC_ID --> DB_ID
CDC --> SYNC_CONT --> DB_CONT
LEG_DB --> CDC
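The sync layer's core requirement is that applying legacy change events is idempotent and ordered per row. A minimal sketch of that apply step, using a simplified event shape (real Debezium envelopes differ):

```python
# Sketch of an idempotent CDC apply step: each change event upserts a row,
# but only when the event is newer than what is already stored. The event
# shape is a simplification, not the actual Debezium envelope.

def apply_event(store: dict, event: dict) -> None:
    key = event["pk"]
    current = store.get(key)
    if current is None or event["version"] > current["version"]:
        store[key] = {"version": event["version"], "row": event["row"]}
```

Because stale or replayed events are discarded by the version check, the pipeline can safely re-deliver events after a failure, which is the property the "verify + remove CDC" step in Wave 2 depends on.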
Migration Sequence Detail
graph LR
subgraph "Wave 1 — Foundation"
direction TB
W1A[Deploy consolidated schema<br/>6 empty domain databases]
W1B[Set up CDC pipeline<br/>legacy → nexgen sync]
W1C[Deploy GraphQL gateway<br/>proxy all to legacy]
W1A --> W1B --> W1C
end
subgraph "Wave 2 — Identity + Content"
direction TB
W2A[Build identity-service<br/>port profile CRUD]
W2B[Build content-service<br/>port content/media CRUD]
W2C[Switch Istio routes<br/>identity + content → nexgen]
W2D[Verify + remove CDC<br/>for migrated domains]
W2A --> W2C
W2B --> W2C
W2C --> W2D
end
subgraph "Wave 3 — Commerce"
direction TB
W3A[Build payment-service<br/>port Stripe + subscriptions]
W3B[Build inventory-service<br/>port product catalog]
W3C[Switch Istio routes<br/>commerce → nexgen]
W3A --> W3C
W3B --> W3C
end
subgraph "Wave 4 — Remaining"
direction TB
W4A[Communication services]
W4B[Platform services]
W4C[Decommission legacy]
W4A --> W4C
W4B --> W4C
end
W1C --> W2A
W1C --> W2B
W2D --> W3A
W2D --> W3B
W3C --> W4A
W3C --> W4B
Pros
- Schema designed correctly from day one (no inherited debt)
- Zero-downtime migration — endpoint by endpoint via Istio routing
- Legacy stays live as fallback for every endpoint
- Can validate NexGen against legacy (shadow traffic, comparison testing)
- Each wave is independently deployable and reversible
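The shadow-traffic validation in the list above can be sketched as serving the legacy response while also calling the NexGen implementation and recording any divergence. The two handler functions are hypothetical stand-ins for real HTTP clients:

```python
# Sketch of shadow-traffic comparison testing: legacy stays authoritative,
# NexGen is called in the shadow, and mismatches are logged for review.
# Handler functions are hypothetical stand-ins for real HTTP calls.

def shadow_compare(path, legacy_handler, nexgen_handler, mismatches):
    legacy = legacy_handler(path)
    try:
        candidate = nexgen_handler(path)
        if candidate != legacy:
            mismatches.append((path, legacy, candidate))
    except Exception as exc:  # a NexGen failure must never affect the user
        mismatches.append((path, legacy, repr(exc)))
    return legacy  # legacy response is returned until parity is proven
```

Once the mismatch log stays empty for a given endpoint, the Istio route for that endpoint can be switched to NexGen with confidence.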
Cons
- CDC pipeline adds complexity (Airbyte/Debezium setup and maintenance)
- Dual-write / sync consistency challenges during transition
- Longer total timeline — migration is incremental
- Must maintain two codebases until last endpoint migrates
Effort Profile
- Upfront: Medium (schema + gateway + CDC infrastructure)
- Ongoing: Medium (per-endpoint migration cycles)
- Risk: Low (every step is reversible, legacy is always fallback)
Side-by-Side Comparison
graph LR
subgraph "Option A: Clean Build"
direction TB
A1[Build all 18 services fresh]
A2[Build consolidated schema]
A3[Migrate all data at once]
A4[Big-bang cutover]
A1 --> A2 --> A3 --> A4
end
subgraph "Option B: In-Place Consolidation"
direction TB
B1[Merge services in existing repos]
B2[ALTER TABLE SET SCHEMA]
B3[Move merged code to nexgen]
B4[Already running in prod]
B1 --> B2 --> B3 --> B4
end
subgraph "Option C: Hybrid Strangler Fig"
direction TB
C1[Build schema + gateway + CDC]
C2[Port endpoints one at a time]
C3[Switch traffic per endpoint]
C4[Decommission legacy per domain]
C1 --> C2 --> C3 --> C4
end
| Dimension | A: Clean Build | B: In-Place | C: Hybrid |
|---|---|---|---|
| Schema quality | Best — designed fresh | Good — mechanical move | Best — designed fresh |
| Code quality | Best — no legacy debt | Worst — carries tech debt | Good — port selectively |
| Migration risk | High — big-bang data migration | Low — incremental moves | Low — CDC + rollback |
| Feature parity risk | High — must re-implement all | None — same code | Low — legacy fallback |
| Upfront effort | Highest | Lowest | Medium |
| Time to first value | Longest | Shortest | Medium |
| Reversibility | Hard (once cutover) | Easy (undo merges) | Easy (revert routes) |
| Dual-run cost | High | None | Medium |
| Test coverage | Can enforce TDD from start | Must retrofit | Can enforce TDD for new |
| Recommended for | Greenfield teams | Risk-averse orgs | Most balanced approach |
Agentic Coding Re-Evaluation
The scoring above assumes a human engineering team. This section re-evaluates all three options assuming Claude Code (agentic AI) does the implementation work while a human architect reviews and approves.
What Changes with Agentic Coding
graph TB
subgraph HE["Hard for Humans, Easy for Agents"]
style HE fill:#d4edda,stroke:#28a745
H1[Read 191 repos and synthesize<br/>business logic into 18 services]
H2[Re-implement 24 GraphQL endpoints<br/>by reading legacy source]
H3[Reconstruct 75 RabbitMQ contracts<br/>from code analysis across repos]
H4[Write consolidated schema<br/>from 280+ Flyway migrations]
H5[Port 11 API integrations<br/>by reading existing integration code]
H6[Enforce TDD from day one<br/>writing tests before implementation]
H7[Maintain perfect consistency<br/>same patterns across all 18 services]
H8[Generate complete Flyway migrations<br/>for consolidated schema]
end
subgraph HR["Hard Regardless — Needs Human + Cloud Access"]
style HR fill:#f8d7da,stroke:#dc3545
R1[GKE cluster provisioning<br/>Terraform apply, IAM, networking]
R2[Cloud SQL setup<br/>regional HA, private IP, backups]
R3[CDC pipeline<br/>Airbyte/Debezium runtime config]
R4[Istio routing rules<br/>live traffic switching]
R5[External API sandbox testing<br/>Stripe/Mux/Zoom credentials]
R6[Production data migration<br/>validation against real data]
R7[DNS + TLS + Load Balancer<br/>live infrastructure cutover]
end
subgraph AE["Actually Easier for Agents Than Humans"]
style AE fill:#cce5ff,stroke:#004085
E1[Reading ALL legacy code<br/>not just the parts you remember]
E2[Finding every edge case<br/>in 280+ migrations]
E3[Cross-referencing contracts<br/>between 35 services simultaneously]
E4[Writing exhaustive tests<br/>from BDD scenarios]
E5[Maintaining naming conventions<br/>across entire codebase perfectly]
end
Re-Scored Comparison — Agentic Coding
| Dimension | A: Clean Build | B: In-Place | C: Hybrid |
|---|---|---|---|
| Schema quality | Best — designed fresh | Good — mechanical move | Best — designed fresh |
| Code quality | Best — no legacy debt | Worst — carries tech debt | Good — port selectively |
| Migration risk | Medium — agent validates thoroughly | Low — same code | Low — CDC + rollback |
| Feature parity risk | Low — agent reads ALL legacy code | None — same code | Low — legacy fallback |
| Agentic effort | Medium — agent excels at this | Medium — refactoring messy code is hard even for agents | High — CDC/infra work is agent-unfriendly |
| Time to first value | Medium — agent works fast | Shortest | Longest — infrastructure bottleneck |
| Reversibility | Hard (once cutover) | Easy (undo merges) | Easy (revert routes) |
| Dual-run cost | High | None | Medium |
| Test coverage | Best — TDD from scratch, agent writes all tests | Worst — must retrofit onto legacy | Good — TDD for new code |
| Agent-friendliness | Highest — pure code generation | Medium — needs to understand messy legacy | Lowest — infrastructure-heavy |
| Human review burden | Medium — review new code | High — review messy merges | High — review infra + sync logic |
Why Option A Wins with Agentic Coding
graph TB
subgraph OptA["Option A: Clean Build — Agent Perspective"]
direction TB
style OptA fill:#d4edda,stroke:#28a745
A1["Step 1: Read all 280+ Flyway migrations<br/>Synthesize into 6 consolidated schemas<br/>🤖 Agent strength: full codebase context"]
A2["Step 2: Generate Spring Boot project scaffold<br/>18 services, shared core-lib, unified patterns<br/>🤖 Agent strength: perfect consistency"]
A3["Step 3: Write BDD scenarios from legacy behavior<br/>Then write tests FIRST (TDD)<br/>🤖 Agent strength: exhaustive coverage"]
A4["Step 4: Port business logic service-by-service<br/>Read legacy → write clean equivalent<br/>🤖 Agent strength: reads ALL source code"]
A5["Step 5: Port GraphQL endpoints + RabbitMQ contracts<br/>Read existing contracts → implement on new schema<br/>🤖 Agent strength: mechanical precision"]
A6["Step 6: Port external API integrations<br/>Stripe, Mux, Zoom, Stream Chat, Twilio, Mandrill<br/>🤖 Agent strength: reads existing integration code"]
A1 --> A2 --> A3 --> A4 --> A5 --> A6
end
subgraph Human["Human Responsibilities"]
direction TB
style Human fill:#fff3cd,stroke:#ffc107
H1[Review + approve generated code]
H2[Provision cloud infrastructure<br/>GKE, Cloud SQL, networking]
H3[Configure external API credentials<br/>Stripe sandbox, Mux, Zoom]
H4[Run data migration against prod<br/>validate real data]
H5[Perform traffic cutover<br/>Istio route switching]
end
A6 --> H1
H1 --> H2 --> H3 --> H4 --> H5
Why Option B Gets Worse with Agentic Coding
Option B’s main advantage was “preserving working code” — but that advantage assumes code is expensive to write. With agentic coding, writing code is cheap. What’s expensive is dealing with messy code:
- Merging services means understanding implicit coupling, undocumented behavior, and Gen 1 patterns mixed with Gen 2
- Retrofitting tests onto zero-coverage legacy code is harder than writing tests alongside new code
- Refactoring inconsistent patterns across 191 repos is tedious even for agents — more context switching, more edge cases
- The “messy intermediate state” of partially-merged repos creates confusing context for the agent
Why Option C Gets Worse with Agentic Coding
Option C’s main advantage was risk mitigation through incremental migration. But:
- The CDC pipeline (Airbyte/Debezium) is infrastructure work — the hardest category for agentic coding
- Dual-write consistency is a runtime problem that requires live debugging with real data
- The strangler fig router adds a permanent coordination layer that must be maintained
- Each migration wave requires infrastructure changes (Istio routes, CDC config) — the agent-unfriendly parts
- The incremental approach means the agent is context-switching between “build new” and “maintain sync” constantly
Revised Recommendation
graph LR
subgraph "Human Team"
HT[Option C: Hybrid<br/>Most balanced for humans<br/>Lower risk per step]
end
subgraph "Agentic Coding"
AC[Option A: Clean Build<br/>Plays to agent strengths<br/>Best long-term outcome]
end
subgraph "Key Insight"
KI["The 'hardest' option for humans<br/>is the 'most natural' for agents<br/><br/>Reading 191 repos and writing<br/>18 clean services is exactly<br/>what agentic coding does best"]
end
HT ~~~ KI ~~~ AC
Option A is the recommended approach for agentic coding because:
- Pure code generation — the agent’s core strength. No infrastructure provisioning, no CDC pipelines, no live traffic management.
- Full codebase context — the agent can read ALL 191 repos simultaneously, something no human team can do. This eliminates the “feature parity risk” that makes Option A scary for humans.
- TDD from scratch — writing tests alongside new code is natural. Retrofitting tests onto legacy (Option B) or writing tests for code that also needs sync logic (Option C) is harder.
- Clean architecture — no compromises for backward compatibility within the codebase. Every service follows identical patterns because one agent writes them all.
- Human effort concentrated on high-value work — architects review generated code, provision infrastructure, manage credentials, and make cutover decisions. The mechanical coding work is delegated entirely.
Remaining Risks (Option A + Agentic Coding)
| Risk | Mitigation |
|---|---|
| Agent misses subtle legacy behavior | Human reviews + comparison testing against legacy endpoints |
| Data migration correctness | Generate migration scripts, run against staging copy of prod DB |
| External API contract drift | Port integration tests that run against sandbox APIs |
| Schema consolidation misses edge cases | Agent reads ALL 280+ migrations; human validates with production queries |
| Big-bang cutover risk | Stage: deploy NexGen alongside legacy, shadow traffic, then switch |
Generated: 2026-02-01