ADR-015: Testing Strategy

Last updated: 2026-02-01 | Decisions

Status

Proposed — Pending engineering team review

Context

H7 falsified: test coverage across all Gen 2 services is near-zero. Most services have 2-3 test files with minimal assertions. There are no CI test gates — builds pass regardless of test results. This is the single biggest risk to any migration or consolidation effort.

Current State

Metric	Value
Average test files per service	2-3
Test coverage	Near-zero (no coverage tooling configured)
CI test gates	None — tests don’t fail builds
Integration test framework	None
E2E test framework	None
Test database strategy	None standardized
BDD/Gherkin scenarios	None

Impact

Every service touched during consolidation (ADR-001) is a regression gamble
Cannot validate database migration (ADR-005) without integration tests
Cannot verify BPM migration (ADR-013) without workflow tests
Cannot validate API compatibility after service merges
Agentic coding amplifies this: agents can generate code fast, but without tests, they generate regressions equally fast

Decision

Adopt a tiered testing strategy prioritized by blast radius, using Testcontainers for integration tests and BDD/Gherkin for acceptance criteria. Leverage agentic coding to generate test suites from existing code and API contracts.

Testing Pyramid

Layer	Framework	Scope	Target Coverage
Unit	JUnit 5 + Mockito	Individual classes, business logic	80% line coverage for new/modified code
Integration	Spring Boot Test + Testcontainers	Service + database + RabbitMQ	All repository methods, message handlers
Contract	Spring Cloud Contract or Pact	GraphQL schema compatibility	All inter-service API contracts
E2E/BDD	Cucumber + REST Assured (backend), Playwright (frontend)	Full user flows	Critical paths: payment, auth, shoutout lifecycle

Priority Order (by blast radius)

Priority	Service(s)	Why First	Test Focus
P0	payment-service (stripe, subscriptions, wallet, transaction)	Financial transactions. Errors = money loss.	Payment flows, refund logic, wallet balance, idempotency
P0	purchase-workflow	BPM migration (ADR-013). Most complex state transitions.	All state transitions, timer events, error recovery
P1	identity-service (celebrity, fan, users)	28+ services depend on identity data.	Profile CRUD, Keycloak token validation, role mappings
P1	shoutout-service + shoutout-bpm	Revenue-generating workflow with external integrations (Mux, FFmpeg).	Full shoutout lifecycle, video processing callbacks
P1	inventory-service	Cross-cutting hub — 5 domain dependencies.	Product CRUD, availability checks, domain event publishing
P2	notification-service (email, sms, notifications)	Delivery pipeline.	Message routing, template rendering, delivery status
P2	content-service, webinar-service	Content management, Mux/Zoom integrations.	CRUD, external API interactions
P3	All remaining services	Lower blast radius.	Basic CRUD, message handlers

Testcontainers Strategy

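// Shared test infrastructure: PostgreSQL + RabbitMQ + Redis
import org.springframework.boot.test.context.SpringBootTest;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.containers.RabbitMQContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers
@SpringBootTest
class PaymentServiceIntegrationTest {
    // Static @Container fields start once and are shared by every test method in this class.
    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16");
    @Container
    static RabbitMQContainer rabbit = new RabbitMQContainer("rabbitmq:3.12-management");
    // GenericContainer exposes no ports by default, so the Redis port is declared explicitly.
    @Container
    static GenericContainer<?> redis = new GenericContainer<>("redis:7-alpine").withExposedPorts(6379);
    // Connection properties still need wiring, e.g. @DynamicPropertySource or @ServiceConnection (Spring Boot 3.1+).
}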

Each service’s integration tests spin up real PostgreSQL, RabbitMQ, and Redis containers. Flyway migrations run against the test database. No mocking of infrastructure — tests validate actual behavior.

Agentic Test Generation Strategy

Agentic coding changes the test generation calculus:

Agent reads existing service code — GraphQL schemas, entity models, message handlers, business logic
Agent generates test scaffolding — unit tests for all public methods, integration tests for all repository/message handler methods
Agent generates BDD scenarios — from GraphQL schema + business rules → Given/When/Then
Human reviews — validates test assertions match actual business requirements
Agent iterates — fixes failing tests, adds edge cases, improves coverage

This is the highest-ROI use of agentic coding: generating comprehensive test suites from existing code is mechanical work that agents excel at.
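
The human review step is where tests that pass without validating behavior get caught. The sketch below contrasts such a check with one that pins down a business rule, using the idempotency focus listed for payment-service. `Wallet`, `debit`, and the idempotency-key parameter are hypothetical names invented for illustration, not taken from the codebase, and plain boolean checks stand in for JUnit assertions to keep the example self-contained.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative review example; every name here is a hypothetical stand-in. */
class WalletReviewExample {
    /** Vacuous check a generator might emit: it passes even if debit() does nothing. */
    static boolean weakCheck() {
        Wallet w = new Wallet(1_000);
        w.debit("k1", 300);
        return w != null;
    }

    /** Check a reviewer would accept: verifies the balance and the replay rule. */
    static boolean strongCheck() {
        Wallet w = new Wallet(1_000);
        w.debit("k1", 300);
        w.debit("k1", 300);                 // same key replayed, must not debit twice
        return w.balanceCents() == 700;
    }

    public static void main(String[] args) {
        System.out.println("weak=" + weakCheck() + " strong=" + strongCheck());
    }
}

/** Hypothetical wallet, not the real payment-service model. */
class Wallet {
    private long balanceCents;
    private final Set<String> appliedKeys = new HashSet<>();

    Wallet(long openingBalanceCents) { this.balanceCents = openingBalanceCents; }

    /** Debits at most once per idempotency key; a replayed key is a no-op. */
    void debit(String idempotencyKey, long amountCents) {
        if (appliedKeys.contains(idempotencyKey)) return;
        if (amountCents > balanceCents) throw new IllegalStateException("insufficient funds");
        balanceCents -= amountCents;
        appliedKeys.add(idempotencyKey);    // record the key only after the debit succeeds
    }

    long balanceCents() { return balanceCents; }
}
```

Both checks pass against this implementation, but only `strongCheck` would fail if the replay guard were removed. That difference is what the review is looking for.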

CI Gate Enforcement

Gate	Threshold	When
Unit test pass	100%	Every PR
Integration test pass	100%	Every PR
Line coverage (new code)	80%	Every PR
Line coverage (overall)	60% initial → 80% target	Gradual enforcement
BDD scenario pass	100%	Every PR touching covered features
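
The gate table can be read as a single pass/fail rule per PR. A minimal sketch of that rule follows, assuming the test counts and coverage figures are supplied by whichever reporting tool is chosen. `CiGate`, `PrMetrics`, and the field names are hypothetical, and real enforcement would more likely live in build or CI configuration than in application code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the PR gates in the table above; not an existing tool. */
class CiGate {
    /** Metrics for one PR. All field names are illustrative assumptions. */
    record PrMetrics(int unitFailed, int integrationFailed, int bddFailed,
                     boolean touchesBddCoveredFeature,
                     double newCodeLineCoverage, double overallLineCoverage) {}

    static final double NEW_CODE_THRESHOLD = 0.80; // 80% line coverage on new code

    /**
     * Returns the violated gates; an empty list means the PR may merge.
     * overallThreshold is a parameter because the ADR raises it gradually (0.60 to 0.80).
     */
    static List<String> violations(PrMetrics m, double overallThreshold) {
        List<String> out = new ArrayList<>();
        if (m.unitFailed() > 0) out.add("unit tests");
        if (m.integrationFailed() > 0) out.add("integration tests");
        // BDD scenarios gate only PRs that touch features covered by scenarios.
        if (m.touchesBddCoveredFeature() && m.bddFailed() > 0) out.add("BDD scenarios");
        if (m.newCodeLineCoverage() < NEW_CODE_THRESHOLD) out.add("new-code coverage");
        if (m.overallLineCoverage() < overallThreshold) out.add("overall coverage");
        return out;
    }

    public static void main(String[] args) {
        // BDD failures are ignored here because the PR does not touch a covered feature.
        PrMetrics pr = new PrMetrics(0, 0, 2, false, 0.85, 0.62);
        System.out.println(violations(pr, 0.60)); // prints []
    }
}
```

Passing the overall threshold as an argument keeps the "gradual enforcement" row adjustable without touching the rule itself.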

Coverage Tooling

JaCoCo for Java code coverage (already in Maven ecosystem)
SonarQube or Codecov for coverage reporting and PR checks
GitHub Actions reusable workflow for test execution and coverage enforcement

Hypothesis Background

Primary: A tiered testing strategy prioritized by blast radius, combined with agentic test generation, can achieve 80% coverage on critical services before migration begins.

Evidence: Near-zero test coverage confirmed across all services (L2 — H7 falsified)
Evidence: All services use consistent patterns (core-lib, GraphQL, RabbitMQ) making test generation systematic (L1 — H13)
Evidence: Testcontainers is the de facto standard for Spring Boot integration testing (L1)
Evidence: Agentic coding can generate test scaffolding from existing code (L1 — demonstrated in other projects, not tested on this codebase)

Alternative 1: Write tests only during migration (test as you touch).
- Partially accepted: this is the long-tail strategy for P2/P3 services. But P0 services (payment, purchase-workflow) need tests BEFORE migration.

Alternative 2: E2E tests only (skip unit/integration).
- Rejected: E2E tests are slow, flaky, and don’t pinpoint failures. The pyramid exists for a reason.

Falsifiability Criteria

If agentic test generation produces >30% false-positive tests (tests that pass but don’t actually validate behavior) → manual test writing required for critical paths
If Testcontainers startup time exceeds 60s per test class → evaluate a shared container strategy or JUnit 5 parallel test execution
If 80% coverage target on payment services takes >2 sprints → reduce target to 60% and focus on critical path coverage only
If CI gate enforcement blocks >50% of PRs in the first month → start with warning-only mode and gradually enforce
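
These criteria are concrete enough to evaluate mechanically once the measurements exist. A minimal sketch follows, assuming the four figures are gathered from review notes and CI data. `FalsifiabilityCheck`, `Observations`, and the field names are hypothetical; only the thresholds come from the list above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical evaluation of the four falsifiability criteria; thresholds are the ADR's. */
class FalsifiabilityCheck {
    record Observations(double falsePositiveTestRate,   // share of generated tests judged vacuous in review
                        double containerStartupSeconds, // Testcontainers startup per test class
                        double sprintsToPaymentTarget,  // sprints spent reaching 80% on payment services
                        double blockedPrRate) {}        // share of PRs blocked in the first month

    /** Returns the fallback actions whose criterion has been met (all comparisons are strict). */
    static List<String> fallbacks(Observations o) {
        List<String> actions = new ArrayList<>();
        if (o.falsePositiveTestRate() > 0.30) actions.add("write critical-path tests manually");
        if (o.containerStartupSeconds() > 60) actions.add("evaluate shared containers or parallel execution");
        if (o.sprintsToPaymentTarget() > 2) actions.add("reduce payment target to 60%, critical paths only");
        if (o.blockedPrRate() > 0.50) actions.add("switch CI gates to warning-only");
        return actions;
    }

    public static void main(String[] args) {
        // Example: only container startup (75s) exceeds its threshold.
        System.out.println(fallbacks(new Observations(0.10, 75, 1.5, 0.20)));
    }
}
```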

Evidence Quality

Evidence	Assurance
Near-zero test coverage	L2 (H7 falsified — verified across multiple services)
Consistent service patterns	L1 (H13 — core-lib, GraphQL, RabbitMQ)
Testcontainers works with Spring Boot 3.x	L1 (documented, widely adopted)
JaCoCo integrates with Maven	L2 (standard Java tooling)
Agentic test generation effectiveness	L0 (unproven on this codebase)

Overall: L0 (weakest link, WLNK: capped by agentic test generation effectiveness at L0)
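
The overall rating follows a weakest-link rule: the decision is only as well supported as its least-assured piece of evidence. A worked sketch follows, assuming the levels order as L0 < L1 < L2 (inferred from how the table uses them). `WeakestLink`, `Assurance`, and `overall` are hypothetical names.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Worked weakest-link calculation over the Evidence Quality table. */
class WeakestLink {
    /** Declared lowest to highest so enum ordering gives L0 < L1 < L2 (assumed ordering). */
    enum Assurance { L0, L1, L2 }

    /** Overall assurance is the minimum level across all evidence. */
    static Assurance overall(Map<String, Assurance> evidence) {
        return Collections.min(evidence.values());
    }

    public static void main(String[] args) {
        Map<String, Assurance> evidence = new LinkedHashMap<>();
        evidence.put("Near-zero test coverage", Assurance.L2);
        evidence.put("Consistent service patterns", Assurance.L1);
        evidence.put("Testcontainers works with Spring Boot 3.x", Assurance.L1);
        evidence.put("JaCoCo integrates with Maven", Assurance.L2);
        evidence.put("Agentic test generation effectiveness", Assurance.L0);
        System.out.println(overall(evidence)); // prints L0
    }
}
```

Under this rule, raising the overall rating means validating agentic test generation on this codebase; improving any other row leaves the result at L0.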

Bounded Validity

Scope: All Gen 2 backend services. Frontend testing strategy is separate (Playwright for E2E, Jest/Vitest for unit).
Expiry: Re-evaluate coverage targets after 6 months of data on actual defect rates.
Review trigger: Agentic test generation proves ineffective, or Testcontainers adds unacceptable CI time.
Monitoring: Track coverage % per service, CI pipeline duration, defect escape rate (bugs found in staging/production).

Consequences

Positive:
- Safety net for all migration and consolidation work
- CI gates prevent regression from shipping
- BDD scenarios create living documentation of business rules
- Agentic test generation leverages AI for highest-ROI mechanical work
- Testcontainers validates real infrastructure behavior (not mocks)

Negative:
- Significant upfront investment before migration work begins
- CI pipeline time increases (Testcontainers startup)
- Coverage targets may feel blocking initially
- Test maintenance overhead as services evolve

Decision date: 2026-02-01
Review by: 2026-08-01