ADR-001: Service Consolidation Strategy

Last updated: 2026-02-01 | Decisions

Status

Proposed — Pending engineering team review

Context

The platform runs ~28 backend application services per tenant in production, with ~40 total ArgoCD applications per tenant when including infrastructure and frontends. The platform serves 3 production tenants (agilenetwork, nilgameplan, vtnil), each deploying a near-identical set of services with config-only differentiation (H11 L2).

ArgoCD Production App Breakdown (per tenant, verified from agilenetwork)

| Category | Count | Examples |
| --- | --- | --- |
| Backend services | ~28 | celebrity, chat, class-catalog, content, email, fan, group-profile, identityx-26, inventory, journey, media, message-board, notifications, org-manager, purchase-request-bpm, search, shoutout, shoutout-bpm, sms, sse, stripe, subscriptions, tags, tracking, transaction, users, wallet, webinar |
| Frontend apps | ~3 | admin-fe, celeb-fe, mono-web |
| Infrastructure/platform | ~9 | pgbouncer, rabbitmq, redis, pvcs, secrets, cdn, kube-fledged, site-maintenance, superset |
| Total ArgoCD apps | ~40 | |

Tenant-Specific Variations

Not all services deploy to all tenants:

| Service | agilenetwork | nilgameplan | vtnil | Notes |
| --- | --- | --- | --- | --- |
| chat | Yes | No | No | agilenetwork only |
| search | Yes | No | No | agilenetwork only |
| message-board | Yes | No | Yes | Not in nilgameplan |
| nilgp-partnerportal-be/fe | No | Yes | No | Partner portal (nilgameplan only) |
| investordeck | No | Yes | No | Investor-facing app (nilgameplan only) |
| onsite-event | No | No | Yes | In-person event mgmt (vtnil only) |
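
The deployment matrix above can be captured as data, which makes per-tenant differences easy to query. The following is a minimal sketch, not the platform's actual configuration: the names `TENANT_SPECIFIC`, `deploys_to`, and `exceptions_for` are hypothetical, and it assumes that any service without a row in the table deploys to all three tenants.

```python
from typing import Dict, FrozenSet

TENANTS = ("agilenetwork", "nilgameplan", "vtnil")

# Rows of the Tenant-Specific Variations table: service -> tenants that deploy it.
TENANT_SPECIFIC: Dict[str, FrozenSet[str]] = {
    "chat": frozenset({"agilenetwork"}),
    "search": frozenset({"agilenetwork"}),
    "message-board": frozenset({"agilenetwork", "vtnil"}),
    "nilgp-partnerportal-be/fe": frozenset({"nilgameplan"}),
    "investordeck": frozenset({"nilgameplan"}),
    "onsite-event": frozenset({"vtnil"}),
}


def deploys_to(service: str, tenant: str) -> bool:
    """Return True if `service` is deployed for `tenant`.

    Assumption (not stated in the ADR): a service with no row in the
    table is deployed to every tenant.
    """
    if tenant not in TENANTS:
        raise ValueError(f"unknown tenant: {tenant}")
    return tenant in TENANT_SPECIFIC.get(service, frozenset(TENANTS))


def exceptions_for(tenant: str) -> Dict[str, bool]:
    """Map each tenant-specific service to whether this tenant deploys it."""
    return {svc: deploys_to(svc, tenant) for svc in TENANT_SPECIFIC}


if __name__ == "__main__":
    for tenant in TENANTS:
        deployed = sorted(s for s, on in exceptions_for(tenant).items() if on)
        print(f"{tenant}: {', '.join(deployed)}")
```

Keeping only the exceptions as data, rather than a full service list per tenant, mirrors the config-only differentiation described in the Context section.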

Analysis (Sessions 0-10) identified over-decomposed services, consistent patterns across all services, and clean domain boundaries. The question is how to reduce operational complexity while preserving functionality.

Decision

Consolidate from ~28 backend services to ~18 services per tenant along domain-driven boundaries, using incremental in-place merges rather than rewrites. This consolidation applies across all 3 production tenants (agilenetwork, nilgameplan, vtnil).

Consolidation Map

| Target | Source Services | Rationale |
| --- | --- | --- |
| identity-service | celebrity, fan, users | Same domain, shared Keycloak |
| content-service | content, media | Shared Mux, overlapping video |
| payment-service | stripe, subscriptions, wallet, transaction | Same financial domain |
| purchase-workflow | purchase-request-bpm (state machine) | CIB Seven replacement |
| shoutout-service | shoutout, shoutout-bpm | Absorb BPM into service |
| class-catalog-service | class-catalog, journey | Same learning domain |
| notification-service | email, sms, notifications | Shared DB, delivery pipeline |
| platform-services | tags, tracking, group-profile, org-manager | Small supporting services |
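
As a sanity check on the map above, the merges can be written down as data and verified so that no source service is assigned to two targets. This is an illustrative sketch only; `CONSOLIDATION_MAP` and `build_source_index` are hypothetical names, and the entries are copied from the table.

```python
from typing import Dict, List

# Target service -> source services it absorbs (copied from the Consolidation Map).
CONSOLIDATION_MAP: Dict[str, List[str]] = {
    "identity-service": ["celebrity", "fan", "users"],
    "content-service": ["content", "media"],
    "payment-service": ["stripe", "subscriptions", "wallet", "transaction"],
    "purchase-workflow": ["purchase-request-bpm"],
    "shoutout-service": ["shoutout", "shoutout-bpm"],
    "class-catalog-service": ["class-catalog", "journey"],
    "notification-service": ["email", "sms", "notifications"],
    "platform-services": ["tags", "tracking", "group-profile", "org-manager"],
}


def build_source_index(mapping: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the map to source -> target, rejecting double assignments."""
    index: Dict[str, str] = {}
    for target, sources in mapping.items():
        for source in sources:
            if source in index:
                raise ValueError(
                    f"{source} is assigned to both {index[source]} and {target}"
                )
            index[source] = target
    return index


if __name__ == "__main__":
    index = build_source_index(CONSOLIDATION_MAP)
    print(f"{len(index)} source services -> {len(CONSOLIDATION_MAP)} targets")
```

The inverted index is also a convenient lookup when planning an incremental merge, for example to find which target a given source service's RabbitMQ consumers will move into.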

Services Kept Separate

inventory (cross-cutting hub), webinar (Zoom lifecycle), chat (Stream SaaS), message-board (Redis SSE), sse (platform infrastructure), search (Elasticsearch), event (distinct domain), keycloak, identityx-26 (identity provider integration).

Note: Tenant-specific services (nilgp-partnerportal-be/fe, investordeck, onsite-event) will be evaluated separately as part of the multi-brand consolidation (see ADR-004).

Hypothesis Background

Primary: Consolidation reduces operational overhead without losing functionality.
- Evidence: All services use identical patterns (core-lib, GraphQL, RabbitMQ), so merging is additive code organization, not rewriting (H13 L1).
- Database-per-service boundaries are clean (H6 L1), with no hidden data coupling.

Alternative 1: Keep all ~28 backend services, upgrade each independently.
- Rejected: The operational overhead of ~28 services × 3 production tenants × CI/CD × monitoring is disproportionate to service complexity; 6 services have fewer than 5 endpoints.

Alternative 2: Full rewrite to fewer services with new patterns.
- Rejected: H14 was falsified. Existing patterns are sound, so a rewrite would duplicate work without benefit.

Falsifiability Criteria

Evidence Quality

| Evidence | Assurance |
| --- | --- |
| Pattern consistency across services | L1 (H13) |
| Clean service boundaries | L1 (H6) |
| Over-decomposed services identified | L1 (tech debt inventory) |
| RabbitMQ contracts discoverable | L2 (H12) |
| Multi-brand is config-only | L2 (H11) |
| ArgoCD production manifests reviewed | L2 (IaC verification: 40 apps enumerated per tenant) |
| Terraform production config reviewed | L2 (IaC verification: 3 production tenants confirmed) |
| Data volumes | L0 (H8, partial) |

Overall: L1 (WLNK capped by data volume uncertainty)

Bounded Validity

Consequences

Positive: Roughly a third fewer backend deployments (~28 to ~18 services per tenant), simpler CI/CD, reduced monitoring surface, lower infrastructure cost.

Negative: Larger deployment units per service, more complex codebase per repo, risk of coupling within consolidated services.

Mitigated by: Module boundaries within consolidated services (Maven multi-module), separate Flyway migration folders.


Decision date: 2026-01-30 | Review by: 2026-07-30