MVP Approach Comparison — NexGen Backend
Three architectural approaches for building the NexGen MVP backend. Each diagram shows what gets built, how it relates to the legacy platform, and the data flow.
Option A: Clean Build in NexGen
Build consolidated services fresh in the nexgen repo. Port business logic from legacy repos as needed. Legacy services continue running until NexGen replacements are validated and traffic is switched.
graph TB
subgraph "Clients"
FAN[Fan App]
ADMIN[Admin/Expert App]
end
subgraph "Edge Layer"
LB[GCP Load Balancer]
ISTIO[Istio IngressGateway<br/>Path-Based Routing]
end
subgraph NexGen["NexGen Services — Clean Build"]
direction TB
style NexGen fill:#d4edda,stroke:#28a745
subgraph "Identity Domain"
ID_SVC[identity-service<br/>Java 21 / Spring Boot 3.x<br/>celebrity + fan + users]
end
subgraph "Content Domain"
CONT_SVC[content-service<br/>content + media]
WEB_SVC[webinar-service<br/>Zoom integration]
end
subgraph "Commerce Domain"
PAY_SVC[payment-service<br/>stripe + subscriptions + wallet + txn]
PW_SVC[purchase-workflow<br/>Spring State Machine]
INV_SVC[inventory-service<br/>product catalog hub]
SHO_SVC[shoutout-service<br/>offers + fulfillment]
end
subgraph "Communication Domain"
NOT_SVC[notification-service<br/>email + SMS + push]
CHAT_SVC[chat-service]
SSE_SVC[sse-service]
end
subgraph "Platform Domain"
PLT_SVC[platform-services<br/>tags + tracking + org]
SRCH_SVC[search-service]
end
subgraph "Learning Domain"
CLS_SVC[class-catalog-service<br/>courses + journeys]
EVT_SVC[event-service]
end
end
subgraph Legacy["Legacy Services — Running Until Cutover"]
direction TB
style Legacy fill:#fff3cd,stroke:#ffc107
LEG_CEL[celebrity]
LEG_FAN[fan]
LEG_CONT[content]
LEG_MEDIA[media]
LEG_STRIPE[stripe]
LEG_SUB[subscriptions]
LEG_INV[inventory]
LEG_OTHER[... 20+ more services]
end
subgraph NexDB["NexGen Consolidated Database — 6 Domains"]
direction LR
style NexDB fill:#d4edda,stroke:#28a745
DB_ID[(identity_db<br/>celebrity.* / fan.* / users.*)]
DB_COM[(commerce_db<br/>payments.* / shoutout.* / inventory.*)]
DB_CONT[(content_db<br/>content.* / media.* / learning.*)]
DB_COMM[(communication_db<br/>notifications.* / chat.* / sse.*)]
DB_PLT[(platform_db<br/>tags.* / tracking.* / org.*)]
DB_KC[(keycloak_db)]
end
subgraph LegDB["Legacy Databases — 35 per tenant"]
style LegDB fill:#fff3cd,stroke:#ffc107
LEG_DB[(35 separate PostgreSQL databases)]
end
subgraph "External APIs"
STRIPE_API[Stripe]
MUX_API[Mux]
ZOOM_API[Zoom]
STREAM_API[Stream Chat]
end
FAN --> LB --> ISTIO
ADMIN --> LB
ISTIO -->|new paths| ID_SVC
ISTIO -->|new paths| CONT_SVC
ISTIO -->|new paths| PAY_SVC
ISTIO -->|legacy paths| LEG_CEL
ISTIO -->|legacy paths| LEG_CONT
ISTIO -->|legacy paths| LEG_STRIPE
ID_SVC --> DB_ID
CONT_SVC --> DB_CONT
PAY_SVC --> DB_COM
NOT_SVC --> DB_COMM
PLT_SVC --> DB_PLT
LEG_CEL --> LEG_DB
LEG_FAN --> LEG_DB
LEG_CONT --> LEG_DB
PAY_SVC --> STRIPE_API
CONT_SVC --> MUX_API
WEB_SVC --> ZOOM_API
CHAT_SVC --> STREAM_API
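The "new paths vs. legacy paths" split at the Istio layer boils down to longest-prefix routing. A minimal sketch, assuming hypothetical path prefixes and upstream names (not taken from the actual route table):

```python
# Sketch of strangler-fig path routing: requests whose path matches a
# migrated prefix go to the NexGen service; everything else falls through
# to the legacy backend. Prefixes and upstream names are illustrative only.

MIGRATED_PREFIXES = {
    "/api/identity": "identity-service",
    "/api/content": "content-service",
    "/api/payments": "payment-service",
}

LEGACY_UPSTREAM = "legacy-cluster"

def route(path: str) -> str:
    """Return the upstream for a request path; longest prefix wins."""
    for prefix in sorted(MIGRATED_PREFIXES, key=len, reverse=True):
        if path.startswith(prefix):
            return MIGRATED_PREFIXES[prefix]
    return LEGACY_UPSTREAM
```

In the real deployment this table lives in Istio VirtualService rules, so switching an endpoint to NexGen is a routing change, not a code change.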
Pros
- Clean codebase from day one — no legacy baggage
- Consolidated schema designed correctly upfront
- Can adopt latest patterns (OpenTelemetry, Spring State Machine) without refactoring
- Clear separation: legacy runs until NexGen is ready
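As an illustration of the Spring State Machine point: a purchase workflow reduces to an explicit transition table, which the library expresses declaratively. The states and events below are hypothetical, not the actual purchase-workflow model:

```python
# Hypothetical purchase-workflow states and events, sketched as a plain
# transition table -- the same idea Spring State Machine encodes in config.

TRANSITIONS = {
    ("CART", "CHECKOUT"): "PENDING_PAYMENT",
    ("PENDING_PAYMENT", "PAYMENT_OK"): "PAID",
    ("PENDING_PAYMENT", "PAYMENT_FAILED"): "CART",
    ("PAID", "FULFILLED"): "COMPLETE",
}

def fire(state: str, event: str) -> str:
    """Apply an event; undefined transitions leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```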
Cons
- Must re-implement ~24 GraphQL endpoints, ~75 RabbitMQ contracts, 11 API integrations
- Data migration from 35 legacy DBs into 6 consolidated DBs is a big lift
- Dual running costs (legacy + NexGen) during transition
- Risk of divergence — legacy gets hotfixes that NexGen must track
Effort Profile
- Upfront: High (build everything from scratch)
- Ongoing: Low (clean codebase, no tech debt)
- Risk: Medium-high (data migration, feature parity gap)
Option B: Consolidate In-Place
Follow H14’s recommendation — merge existing Gen 2 services into consolidated modules within their existing repos, then migrate into nexgen. Refactor working code rather than rewriting it.
graph TB
subgraph "Clients"
FAN[Fan App]
ADMIN[Admin/Expert App]
end
subgraph "Edge Layer"
LB[GCP Load Balancer]
ISTIO[Istio IngressGateway]
end
subgraph Phase1["Phase 1 — Merge Services In Existing Repos"]
direction TB
style Phase1 fill:#cce5ff,stroke:#004085
subgraph "identity-service repo"
ID_MOD_CEL[celebrity module<br/>from: celebrity repo]
ID_MOD_FAN[fan module<br/>from: fan repo]
ID_MOD_USR[users module<br/>from: users repo]
end
subgraph "payment-service repo"
PAY_MOD_STR[stripe module<br/>from: stripe repo]
PAY_MOD_SUB[subscriptions module<br/>from: subscriptions repo]
PAY_MOD_WAL[wallet module<br/>from: wallet repo]
PAY_MOD_TXN[transaction module<br/>from: transaction repo]
end
subgraph "content-service repo"
CON_MOD_CON[content module<br/>from: content repo]
CON_MOD_MED[media module<br/>from: media repo]
end
subgraph "Unchanged Services"
INV[inventory-service<br/>stays separate]
WEB[webinar-service<br/>stays separate]
SSE[sse-service<br/>stays separate]
SRCH[search-service<br/>stays separate]
OTHER[... 8 more unchanged]
end
end
subgraph Phase2["Phase 2 — Database Consolidation"]
direction LR
style Phase2 fill:#cce5ff,stroke:#004085
DB_ID[(identity_db<br/>ALTER TABLE SET SCHEMA<br/>celebrity.* / fan.* / users.*)]
DB_COM[(commerce_db<br/>payments.* / shoutout.* / inventory.*)]
DB_CONT[(content_db<br/>content.* / media.* / learning.*)]
end
subgraph LegDB["Legacy Databases — Migrating"]
style LegDB fill:#fff3cd,stroke:#ffc107
CEL_DB[(celebrity-db)] -->|schema move| DB_ID
FAN_DB[(fan-db)] -->|schema move| DB_ID
STR_DB[(stripe-db)] -->|schema move| DB_COM
WAL_DB[(wallet-db)] -->|schema move| DB_COM
CONT_DB[(content-db)] -->|schema move| DB_CONT
MED_DB[(media-db)] -->|schema move| DB_CONT
end
subgraph Phase3["Phase 3 — Move to NexGen Monorepo"]
direction TB
style Phase3 fill:#d4edda,stroke:#28a745
NX[nexgen/ monorepo<br/>All consolidated services<br/>moved here as modules]
end
FAN --> LB --> ISTIO
ADMIN --> LB
ISTIO --> ID_MOD_CEL
ISTIO --> PAY_MOD_STR
ISTIO --> CON_MOD_CON
ISTIO --> INV
Phase1 -->|migrate code| Phase3
Phase2 -->|migrate data| Phase3
Pros
- Preserves all existing business logic, integrations, and contracts
- Database migration is mechanical (ALTER TABLE SET SCHEMA)
- No feature parity gap — production code moves, not rewrites
- Lower risk — each service merges independently
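The "mechanical" claim can be made concrete: the schema move is one ALTER TABLE ... SET SCHEMA statement per legacy table. A sketch that generates the DDL, with hypothetical table and schema names:

```python
# Sketch: generate the DDL that moves tables from a legacy database's
# public schema into a named schema of the consolidated database.
# Table and schema names here are hypothetical examples.

def schema_move_ddl(tables: list[str], target_schema: str) -> list[str]:
    stmts = [f"CREATE SCHEMA IF NOT EXISTS {target_schema};"]
    stmts += [
        f"ALTER TABLE public.{t} SET SCHEMA {target_schema};" for t in tables
    ]
    return stmts
```

In PostgreSQL the SET SCHEMA move is a catalog update, not a data copy, which is what keeps this phase low-risk.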
Cons
- Inherited tech debt (zero test coverage, Gen 1 patterns in some services)
- Messy intermediate state — partially merged repos
- Three-phase process adds coordination overhead
- Core-lib coupling carries forward unchanged
Effort Profile
- Upfront: Medium (merge existing code, refactor structure)
- Ongoing: Medium (tech debt cleanup continues post-merge)
- Risk: Low-medium (preserving working code)
Option C: Hybrid — Schema First, Strangler Fig API
Build the consolidated database schema fresh in NexGen. Stand up thin API services that coexist with legacy via strangler fig routing. Migrate traffic endpoint-by-endpoint.
graph TB
subgraph "Clients"
FAN[Fan App]
ADMIN[Admin/Expert App]
end
subgraph "Edge Layer"
LB[GCP Load Balancer]
ISTIO[Istio IngressGateway<br/>Strangler Fig Router]
end
subgraph NexAPI["NexGen API Layer — Thin Services"]
direction TB
style NexAPI fill:#d4edda,stroke:#28a745
GW[GraphQL Gateway<br/>Unified schema]
subgraph "Migrated Endpoints"
ID_API[identity-service<br/>GET /profiles ✅<br/>GET /follows ✅<br/>POST /profiles ✅]
CONT_API[content-service<br/>GET /content ✅<br/>GET /media ✅]
end
subgraph "Not Yet Migrated"
PAY_STUB[payment endpoints<br/>→ routes to legacy]
INV_STUB[inventory endpoints<br/>→ routes to legacy]
SHO_STUB[shoutout endpoints<br/>→ routes to legacy]
end
end
subgraph Legacy["Legacy Services — Still Serving Some Endpoints"]
direction TB
style Legacy fill:#fff3cd,stroke:#ffc107
LEG_STRIPE[stripe service]
LEG_INV[inventory service]
LEG_SHOUT[shoutout service]
LEG_SUB[subscriptions service]
LEG_OTHER[... remaining legacy]
end
subgraph Sync["Data Sync Layer"]
direction TB
style Sync fill:#e2d5f1,stroke:#6f42c1
CDC[Change Data Capture<br/>Airbyte / Debezium]
SYNC_ID[Identity sync<br/>legacy → nexgen]
SYNC_CONT[Content sync<br/>legacy → nexgen]
end
subgraph NexDB["NexGen Consolidated Database"]
direction LR
style NexDB fill:#d4edda,stroke:#28a745
DB_ID[(identity_db<br/>celebrity.* / fan.*)]
DB_CONT[(content_db<br/>content.* / media.*)]
DB_COM[(commerce_db<br/>payments.* / inventory.*)]
DB_COMM[(communication_db)]
DB_PLT[(platform_db)]
DB_KC[(keycloak_db)]
end
subgraph LegDB["Legacy Databases"]
style LegDB fill:#fff3cd,stroke:#ffc107
LEG_DB[(35 separate databases<br/>source of truth until<br/>endpoint migrated)]
end
FAN --> LB --> ISTIO
ADMIN --> LB
ISTIO -->|migrated endpoints| GW
ISTIO -->|not-yet-migrated| LEG_STRIPE
ISTIO -->|not-yet-migrated| LEG_INV
GW --> ID_API
GW --> CONT_API
GW --> PAY_STUB -->|proxy| LEG_STRIPE
GW --> INV_STUB -->|proxy| LEG_INV
ID_API --> DB_ID
CONT_API --> DB_CONT
LEG_STRIPE --> LEG_DB
LEG_INV --> LEG_DB
CDC --> SYNC_ID --> DB_ID
CDC --> SYNC_CONT --> DB_CONT
LEG_DB --> CDC
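The sync layer's core requirement is that applying legacy change events is idempotent and ordered per row. A minimal sketch of that apply step, using a simplified event shape (real Debezium envelopes differ):

```python
# Sketch of an idempotent CDC apply step: each change event upserts a row,
# but only when the event is newer than what is already stored. The event
# shape is a simplification, not the actual Debezium envelope.

def apply_event(store: dict, event: dict) -> None:
    key = event["pk"]
    current = store.get(key)
    if current is None or event["version"] > current["version"]:
        store[key] = {"version": event["version"], "row": event["row"]}
```

Because stale or replayed events are discarded by the version check, the pipeline can safely re-deliver events after a failure, which is the property the "verify + remove CDC" step in Wave 2 depends on.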
Migration Sequence Detail
graph LR
subgraph "Wave 1 — Foundation"
direction TB
W1A[Deploy consolidated schema<br/>6 empty domain databases]
W1B[Set up CDC pipeline<br/>legacy → nexgen sync]
W1C[Deploy GraphQL gateway<br/>proxy all to legacy]
W1A --> W1B --> W1C
end
subgraph "Wave 2 — Identity + Content"
direction TB
W2A[Build identity-service<br/>port profile CRUD]
W2B[Build content-service<br/>port content/media CRUD]
W2C[Switch Istio routes<br/>identity + content → nexgen]
W2D[Verify + remove CDC<br/>for migrated domains]
W2A --> W2C
W2B --> W2C
W2C --> W2D
end
subgraph "Wave 3 — Commerce"
direction TB
W3A[Build payment-service<br/>port Stripe + subscriptions]
W3B[Build inventory-service<br/>port product catalog]
W3C[Switch Istio routes<br/>commerce → nexgen]
W3A --> W3C
W3B --> W3C
end
subgraph "Wave 4 — Remaining"
direction TB
W4A[Communication services]
W4B[Platform services]
W4C[Decommission legacy]
W4A --> W4C
W4B --> W4C
end
W1C --> W2A
W1C --> W2B
W2D --> W3A
W2D --> W3B
W3C --> W4A
W3C --> W4B
Pros
- Schema designed correctly from day one (no inherited debt)
- Zero-downtime migration — endpoint by endpoint via Istio routing
- Legacy stays live as fallback for every endpoint
- Can validate NexGen against legacy (shadow traffic, comparison testing)
- Each wave is independently deployable and reversible
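The shadow-traffic validation in the list above can be sketched as serving the legacy response while also calling the NexGen implementation and recording any divergence. The two handler functions are hypothetical stand-ins for real HTTP clients:

```python
# Sketch of shadow-traffic comparison testing: legacy stays authoritative,
# NexGen is called in the shadow, and mismatches are logged for review.
# Handler functions are hypothetical stand-ins for real HTTP calls.

def shadow_compare(path, legacy_handler, nexgen_handler, mismatches):
    legacy = legacy_handler(path)
    try:
        candidate = nexgen_handler(path)
        if candidate != legacy:
            mismatches.append((path, legacy, candidate))
    except Exception as exc:  # a NexGen failure must never affect the user
        mismatches.append((path, legacy, repr(exc)))
    return legacy  # legacy response is returned until parity is proven
```

Once the mismatch log stays empty for a given endpoint, the Istio route for that endpoint can be switched to NexGen with confidence.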
Cons
- CDC pipeline adds complexity (Airbyte/Debezium setup and maintenance)
- Dual-write / sync consistency challenges during transition
- Longer total timeline — migration is incremental
- Must maintain two codebases until last endpoint migrates
Effort Profile
- Upfront: Medium (schema + gateway + CDC infrastructure)
- Ongoing: Medium (per-endpoint migration cycles)
- Risk: Low (every step is reversible, legacy is always fallback)
Side-by-Side Comparison
graph LR
subgraph "Option A: Clean Build"
direction TB
A1[Build all 18 services fresh]
A2[Build consolidated schema]
A3[Migrate all data at once]
A4[Big-bang cutover]
A1 --> A2 --> A3 --> A4
end
subgraph "Option B: In-Place Consolidation"
direction TB
B1[Merge services in existing repos]
B2[ALTER TABLE SET SCHEMA]
B3[Move merged code to nexgen]
B4[Already running in prod]
B1 --> B2 --> B3 --> B4
end
subgraph "Option C: Hybrid Strangler Fig"
direction TB
C1[Build schema + gateway + CDC]
C2[Port endpoints one at a time]
C3[Switch traffic per endpoint]
C4[Decommission legacy per domain]
C1 --> C2 --> C3 --> C4
end
| Dimension | A: Clean Build | B: In-Place | C: Hybrid |
|---|---|---|---|
| Schema quality | Best — designed fresh | Good — mechanical move | Best — designed fresh |
| Code quality | Best — no legacy debt | Worst — carries tech debt | Good — port selectively |
| Migration risk | High — big-bang data migration | Low — incremental moves | Low — CDC + rollback |
| Feature parity risk | High — must re-implement all | None — same code | Low — legacy fallback |
| Upfront effort | Highest | Lowest | Medium |
| Time to first value | Longest | Shortest | Medium |
| Reversibility | Hard (once cutover) | Easy (undo merges) | Easy (revert routes) |
| Dual-run cost | High | None | Medium |
| Test coverage | Can enforce TDD from start | Must retrofit | Can enforce TDD for new |
| Recommended for | Greenfield teams | Risk-averse orgs | Most balanced approach |
Agentic Coding Re-Evaluation
The scoring above assumes a human engineering team. This section re-evaluates all three options assuming Claude Code (agentic AI) does the implementation work while a human architect reviews and approves.
What Changes with Agentic Coding
graph TB
subgraph HE["Hard for Humans, Easy for Agents"]
style HE fill:#d4edda,stroke:#28a745
H1[Read 191 repos and synthesize<br/>business logic into 18 services]
H2[Re-implement 24 GraphQL endpoints<br/>by reading legacy source]
H3[Reconstruct 75 RabbitMQ contracts<br/>from code analysis across repos]
H4[Write consolidated schema<br/>from 280+ Flyway migrations]
H5[Port 11 API integrations<br/>by reading existing integration code]
H6[Enforce TDD from day one<br/>writing tests before implementation]
H7[Maintain perfect consistency<br/>same patterns across all 18 services]
H8[Generate complete Flyway migrations<br/>for consolidated schema]
end
subgraph HR["Hard Regardless — Needs Human + Cloud Access"]
style HR fill:#f8d7da,stroke:#dc3545
R1[GKE cluster provisioning<br/>Terraform apply, IAM, networking]
R2[Cloud SQL setup<br/>regional HA, private IP, backups]
R3[CDC pipeline<br/>Airbyte/Debezium runtime config]
R4[Istio routing rules<br/>live traffic switching]
R5[External API sandbox testing<br/>Stripe/Mux/Zoom credentials]
R6[Production data migration<br/>validation against real data]
R7[DNS + TLS + Load Balancer<br/>live infrastructure cutover]
end
subgraph AE["Actually Easier for Agents Than Humans"]
style AE fill:#cce5ff,stroke:#004085
E1[Reading ALL legacy code<br/>not just the parts you remember]
E2[Finding every edge case<br/>in 280+ migrations]
E3[Cross-referencing contracts<br/>between 35 services simultaneously]
E4[Writing exhaustive tests<br/>from BDD scenarios]
E5[Maintaining naming conventions<br/>across entire codebase perfectly]
end
Re-Scored Comparison — Agentic Coding
| Dimension | A: Clean Build | B: In-Place | C: Hybrid |
|---|---|---|---|
| Schema quality | Best — designed fresh | Good — mechanical move | Best — designed fresh |
| Code quality | Best — no legacy debt | Worst — carries tech debt | Good — port selectively |
| Migration risk | Medium — agent validates thoroughly | Low — same code | Low — CDC + rollback |
| Feature parity risk | Low — agent reads ALL legacy code | None — same code | Low — legacy fallback |
| Agentic effort | Medium — agent excels at this | Medium — refactoring messy code is hard even for agents | High — CDC/infra work is agent-unfriendly |
| Time to first value | Medium — agent works fast | Shortest | Longest — infrastructure bottleneck |
| Reversibility | Hard (once cutover) | Easy (undo merges) | Easy (revert routes) |
| Dual-run cost | High | None | Medium |
| Test coverage | Best — TDD from scratch, agent writes all tests | Worst — must retrofit onto legacy | Good — TDD for new code |
| Agent-friendliness | Highest — pure code generation | Medium — needs to understand messy legacy | Lowest — infrastructure-heavy |
| Human review burden | Medium — review new code | High — review messy merges | High — review infra + sync logic |
Why Option A Wins with Agentic Coding
graph TB
subgraph OptA["Option A: Clean Build — Agent Perspective"]
direction TB
style OptA fill:#d4edda,stroke:#28a745
A1["Step 1: Read all 280+ Flyway migrations<br/>Synthesize into 6 consolidated schemas<br/>🤖 Agent strength: full codebase context"]
A2["Step 2: Generate Spring Boot project scaffold<br/>18 services, shared core-lib, unified patterns<br/>🤖 Agent strength: perfect consistency"]
A3["Step 3: Write BDD scenarios from legacy behavior<br/>Then write tests FIRST (TDD)<br/>🤖 Agent strength: exhaustive coverage"]
A4["Step 4: Port business logic service-by-service<br/>Read legacy → write clean equivalent<br/>🤖 Agent strength: reads ALL source code"]
A5["Step 5: Port GraphQL endpoints + RabbitMQ contracts<br/>Read existing contracts → implement on new schema<br/>🤖 Agent strength: mechanical precision"]
A6["Step 6: Port external API integrations<br/>Stripe, Mux, Zoom, Stream Chat, Twilio, Mandrill<br/>🤖 Agent strength: reads existing integration code"]
A1 --> A2 --> A3 --> A4 --> A5 --> A6
end
subgraph Human["Human Responsibilities"]
direction TB
style Human fill:#fff3cd,stroke:#ffc107
H1[Review + approve generated code]
H2[Provision cloud infrastructure<br/>GKE, Cloud SQL, networking]
H3[Configure external API credentials<br/>Stripe sandbox, Mux, Zoom]
H4[Run data migration against prod<br/>validate real data]
H5[Perform traffic cutover<br/>Istio route switching]
end
A6 --> H1
H1 --> H2 --> H3 --> H4 --> H5
Why Option B Gets Worse with Agentic Coding
Option B’s main advantage was “preserving working code” — but that advantage assumes code is expensive to write. With agentic coding, writing code is cheap. What’s expensive is dealing with messy code:
- Merging services means understanding implicit coupling, undocumented behavior, and Gen 1 patterns mixed with Gen 2
- Retrofitting tests onto zero-coverage legacy code is harder than writing tests alongside new code
- Refactoring inconsistent patterns across 191 repos is tedious even for agents — more context switching, more edge cases
- The “messy intermediate state” of partially-merged repos creates confusing context for the agent
Why Option C Gets Worse with Agentic Coding
Option C’s main advantage was risk mitigation through incremental migration. But:
- The CDC pipeline (Airbyte/Debezium) is infrastructure work — the hardest category for agentic coding
- Dual-write consistency is a runtime problem that requires live debugging with real data
- The strangler fig router adds a permanent coordination layer that must be maintained
- Each migration wave requires infrastructure changes (Istio routes, CDC config) — the agent-unfriendly parts
- The incremental approach means the agent is context-switching between “build new” and “maintain sync” constantly
Revised Recommendation
graph LR
subgraph "Human Team"
HT[Option C: Hybrid<br/>Most balanced for humans<br/>Lower risk per step]
end
subgraph "Agentic Coding"
AC[Option A: Clean Build<br/>Plays to agent strengths<br/>Best long-term outcome]
end
subgraph "Key Insight"
KI["The 'hardest' option for humans<br/>is the 'most natural' for agents<br/><br/>Reading 191 repos and writing<br/>18 clean services is exactly<br/>what agentic coding does best"]
end
HT ~~~ KI ~~~ AC
Option A is the recommended approach for agentic coding because:
- Pure code generation — the agent’s core strength. No infrastructure provisioning, no CDC pipelines, no live traffic management.
- Full codebase context — the agent can read ALL 191 repos simultaneously, something no human team can do. This eliminates the “feature parity risk” that makes Option A scary for humans.
- TDD from scratch — writing tests alongside new code is natural. Retrofitting tests onto legacy (Option B) or writing tests for code that also needs sync logic (Option C) is harder.
- Clean architecture — no compromises for backward compatibility within the codebase. Every service follows identical patterns because one agent writes them all.
- Human effort concentrated on high-value work — architects review generated code, provision infrastructure, manage credentials, and make cutover decisions. The mechanical coding work is delegated entirely.
Remaining Risks (Option A + Agentic Coding)
| Risk | Mitigation |
|---|---|
| Agent misses subtle legacy behavior | Human reviews + comparison testing against legacy endpoints |
| Data migration correctness | Generate migration scripts, run against staging copy of prod DB |
| External API contract drift | Port integration tests that run against sandbox APIs |
| Schema consolidation misses edge cases | Agent reads ALL 280+ migrations; human validates with production queries |
| Big-bang cutover risk | Stage: deploy NexGen alongside legacy, shadow traffic, then switch |
Generated: 2026-02-01