ADR-024: Feature Flag Strategy

Last updated: 2026-02-01 | Decisions

Status

Proposed — Pending engineering team review

Context

The platform has no feature flag system. The only runtime configuration mechanism is tenant-level toggles in values-globals.yaml (Helm values), which require a full deployment to change. This blocks gradual rollouts, canary deployments, A/B testing, and safe migration rollbacks.

Current State

| Component | Current | Gap |
|---|---|---|
| Feature flags | None | Cannot toggle features at runtime |
| Tenant config | values-globals.yaml per tenant | Requires Helm deployment to change any value |
| Gradual rollout | Not possible | All-or-nothing deployments |
| A/B testing | Not possible | No user segmentation for features |
| Migration safety | No kill switch | Cannot disable new code paths without rollback |
| Dark launching | Not possible | Cannot deploy code without exposing to users |

Why Now

Service consolidation (ADR-001) and BPM migration (ADR-013) introduce significant risk. Feature flags enable:

- Migration rollback without deployment: if Operaton BPM has issues, flag back to the old path
- Gradual consolidation: merge services but flag individual features on/off
- Tenant-specific rollout: test changes on one brand before all four
- Safe deployment: deploy code behind a flag, enable when validated

Decision

Adopt Unleash (self-hosted) as the feature flag platform, integrated into both backend (Spring Boot) and frontend (Angular) applications. Use tenant-aware evaluation for multi-brand rollouts.

Why Unleash (Not Alternatives)

| Option | Assessment |
|---|---|
| Unleash (Recommended) | Open source, self-hosted (data stays in our infrastructure). Server-side + client-side SDKs. Supports custom strategies including tenant-based targeting. Free for self-hosted. |
| LaunchDarkly | Best-in-class SaaS but per-seat + per-MAU pricing. At 4 tenants with unknown user counts, cost is unpredictable. Evaluate if Unleash proves insufficient. |
| Flagsmith | Open source alternative. Fewer strategy types than Unleash. Smaller community. |
| Custom (Redis/DB-backed) | No SDK support, no audit trail, no gradual rollout strategies. Reinventing existing tools. |
| GCP Feature Flags (Firebase) | Firebase Remote Config is mobile-focused. Not designed for backend flag evaluation. |

Architecture

Unleash Server (platform namespace)
    │
    ├── PostgreSQL (flag definitions, audit log)
    │
    ├── Backend SDK (Unleash Java Client)
    │   └── Spring Boot services evaluate flags server-side
    │
    └── Frontend SDK (Unleash Edge + JavaScript client for Angular)
        └── Unleash Edge proxy for client-side evaluation
            └── Angular apps evaluate flags client-side

Flag Taxonomy

| Flag Type | Purpose | Lifecycle | Example |
|---|---|---|---|
| Release | Gate new features | Removed after full rollout | operaton-bpm-enabled |
| Migration | Toggle between old/new code paths | Removed after migration complete | use-consolidated-payment-service |
| Experiment | A/B test variants | Removed after experiment concludes | new-checkout-flow |
| Ops | Runtime operational control | Permanent | maintenance-mode |
| Tenant | Per-brand feature availability | Permanent | nil-game-plan-exclusive-feature |
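
The taxonomy can be carried into code so that temporary flags are distinguishable from permanent ones when auditing for stale flags. The following is a minimal sketch and not part of any Unleash API: the `FlagType` enum, the `FlagAudit` helper, and the 90-day threshold are illustrative assumptions.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical encoding of the flag taxonomy; these are not Unleash types.
enum FlagType {
    RELEASE(false),     // removed after full rollout
    MIGRATION(false),   // removed after migration completes
    EXPERIMENT(false),  // removed after the experiment concludes
    OPS(true),          // permanent operational control
    TENANT(true);       // permanent per-brand availability

    private final boolean permanent;

    FlagType(boolean permanent) {
        this.permanent = permanent;
    }

    boolean isPermanent() {
        return permanent;
    }
}

final class FlagAudit {
    // Assumed review threshold for temporary flags; tune to the team's release cadence.
    static final Duration MAX_TEMPORARY_AGE = Duration.ofDays(90);

    private FlagAudit() {}

    // A temporary flag older than the threshold is a candidate for cleanup.
    static boolean isCleanupCandidate(FlagType type, Instant createdAt, Instant now) {
        if (type.isPermanent()) {
            return false;
        }
        return Duration.between(createdAt, now).compareTo(MAX_TEMPORARY_AGE) > 0;
    }
}
```

A check like this could run in a scheduled job or CI step that lists flags with their creation dates; how those dates are obtained is left open here.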

Evaluation Strategies

| Strategy | Description | Use Case |
|---|---|---|
| Tenant-based | Flag enabled for specific tenants | Roll out to VT NIL first, then others |
| Percentage rollout | Gradual % of users | Enable new checkout for 10% → 25% → 50% → 100% |
| User ID | Specific users | Beta testing with known users |
| Environment | Per environment | Enabled in staging, disabled in production |
| Date-based | Scheduled activation | Enable feature at product launch date |
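
A percentage rollout is only useful if it is sticky: a user who sees the feature at 10% should still see it at 25%. The sketch below illustrates the general idea with a stable hash bucket. It is not Unleash's own algorithm; CRC32 is a stand-in hash chosen because it ships with the JDK, and `PercentageRollout` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class PercentageRollout {
    private PercentageRollout() {}

    // Maps (flagName, userId) to a stable bucket in the range 1..100.
    // Including the flag name keeps the same users from always being first across flags.
    static int bucket(String flagName, String userId) {
        CRC32 crc = new CRC32();
        crc.update((flagName + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100) + 1;
    }

    // A user is included when their bucket falls at or below the rollout percentage,
    // so raising the percentage only ever adds users.
    static boolean isEnabled(String flagName, String userId, int percentage) {
        if (percentage <= 0) {
            return false;
        }
        if (percentage >= 100) {
            return true;
        }
        return bucket(flagName, userId) <= percentage;
    }
}
```

Because the bucket depends only on the flag name and user ID, moving from 10% to 25% keeps every previously enabled user enabled.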

Tenant-Aware Flag Context

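// Spring Boot flag evaluation with tenant context
import io.getunleash.Unleash;
import io.getunleash.UnleashContext;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class FeatureFlagService {

    private final Unleash unleash;
    private final String activeProfile;  // dev, staging, prod

    // The Unleash bean is assumed to be configured elsewhere in the application.
    public FeatureFlagService(Unleash unleash,
                              @Value("${spring.profiles.active:dev}") String activeProfile) {
        this.unleash = unleash;
        this.activeProfile = activeProfile;
    }

    public boolean isEnabled(String flagName, String tenantId, String userId) {
        UnleashContext context = UnleashContext.builder()
            .appName("payment-service")
            .environment(activeProfile)
            .userId(userId)
            .addProperty("tenantId", tenantId)  // custom context field used for tenant targeting
            .build();
        return unleash.isEnabled(flagName, context);
    }
}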
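
For tenant targeting to work, something on the evaluation side has to compare the `tenantId` context property with the tenants a flag is enabled for. The sketch below shows only that matching logic in plain Java, without the Unleash SDK. `TenantMatcher`, the comma-separated `tenants` parameter, and the tenant IDs in the usage note are hypothetical; in practice this comparison would be configured in Unleash rather than hand-written.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class TenantMatcher {
    private TenantMatcher() {}

    // Parses a comma-separated tenant list such as "vt-nil, brand-b" into a set.
    static Set<String> parseTenants(String csv) {
        if (csv == null || csv.isBlank()) {
            return Set.of();
        }
        return Arrays.stream(csv.split(","))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toSet());
    }

    // Enabled only when the context carries a tenantId that appears in the configured list.
    // A missing tenantId or a missing "tenants" parameter evaluates to false.
    static boolean isEnabled(Map<String, String> strategyParameters,
                             Map<String, String> contextProperties) {
        String tenantId = contextProperties.get("tenantId");
        if (tenantId == null) {
            return false;
        }
        return parseTenants(strategyParameters.get("tenants")).contains(tenantId);
    }
}
```

With parameters `{"tenants": "vt-nil, brand-b"}`, a request carrying `tenantId=vt-nil` matches, while a request with no tenant in its context does not.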

Migration Flag Pattern

For service consolidation (ADR-001) and BPM migration (ADR-013):

// Example: payment service consolidation
if (featureFlags.isEnabled("use-consolidated-payment-service", tenantId, userId)) {
    // New consolidated payment logic
    return consolidatedPaymentService.processPayment(request);
} else {
    // Legacy separate service logic
    return legacyStripeService.processPayment(request);
}

Flag lifecycle:

1. Deploy code behind flag (default: OFF)
2. Enable in staging environment
3. Canary: enable for 5% of production users
4. Gradual rollout: 25% → 50% → 100%
5. Remove flag: clean up branching code after full rollout is confirmed
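
Since migration flags default to OFF, it is reasonable for the branching code to fall back to the legacy path whenever the flag cannot be evaluated at all. The following is a minimal sketch of that fail-safe routing under that assumption; `MigrationToggle` and `route` are hypothetical names, not part of the Unleash SDK.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

final class MigrationToggle {
    private MigrationToggle() {}

    // Runs newPath when the flag check returns true, otherwise legacyPath.
    // If the flag check itself throws, the flag is treated as OFF.
    static <T> T route(BooleanSupplier flagCheck, Supplier<T> newPath, Supplier<T> legacyPath) {
        boolean useNewPath;
        try {
            useNewPath = flagCheck.getAsBoolean();
        } catch (RuntimeException e) {
            // Fail safe: an unevaluable flag must not send traffic to the new code path.
            useNewPath = false;
        }
        return useNewPath ? newPath.get() : legacyPath.get();
    }
}
```

The payment example above would then read `MigrationToggle.route(() -> featureFlags.isEnabled("use-consolidated-payment-service", tenantId, userId), () -> consolidatedPaymentService.processPayment(request), () -> legacyStripeService.processPayment(request))`. Keeping the branch in one helper also makes step 5 easier, because removing the flag means replacing one call with the new path.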

Unleash Deployment

| Component | Deployment | Resource |
|---|---|---|
| Unleash Server | Kubernetes (platform namespace) | 1 pod, 256Mi RAM |
| Unleash PostgreSQL | Sidecar or shared Cloud SQL | Minimal (flag definitions are small) |
| Unleash Edge | Kubernetes (platform namespace) | 1 pod, 128Mi RAM (for frontend proxying) |

Integration Points

| Layer | SDK | Evaluation |
|---|---|---|
| Spring Boot services | unleash-client-java | Server-side, cached (10s refresh) |
| Angular frontend | unleash-proxy-client (framework-agnostic JavaScript SDK, wrapped in an Angular service) | Client-side via Unleash Edge |
| BPM (Operaton) | Custom service task delegates | Server-side via FeatureFlagService |
| CI/CD pipeline | Unleash API | Pre-deployment checks (“is flag ready?”) |
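
The "cached (10s refresh)" entry has a practical consequence: a flag change, including a kill switch, can take up to one refresh interval to reach a service. The sketch below illustrates that behavior with a hand-written snapshot cache. It is not how the Unleash Java client is implemented; this version refreshes lazily on read for simplicity, and `FlagSnapshotCache`, its fetcher, and its injectable clock are hypothetical.

```java
import java.time.Duration;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class FlagSnapshotCache {
    private final Supplier<Map<String, Boolean>> fetcher;  // stands in for a call to the flag server
    private final LongSupplier clockMillis;                // injectable clock, e.g. System::currentTimeMillis
    private final long refreshMillis;

    private Map<String, Boolean> snapshot = Map.of();
    private boolean loaded = false;
    private long lastFetchMillis;

    FlagSnapshotCache(Supplier<Map<String, Boolean>> fetcher,
                      LongSupplier clockMillis,
                      Duration refresh) {
        this.fetcher = fetcher;
        this.clockMillis = clockMillis;
        this.refreshMillis = refresh.toMillis();
    }

    // Evaluates against the local snapshot, refetching once the refresh interval has elapsed.
    // Unknown flags evaluate to false.
    synchronized boolean isEnabled(String flagName) {
        long now = clockMillis.getAsLong();
        if (!loaded || now - lastFetchMillis >= refreshMillis) {
            try {
                snapshot = Map.copyOf(fetcher.get());
                loaded = true;
                lastFetchMillis = now;
            } catch (RuntimeException e) {
                // Fetch failed: keep serving the previous snapshot and retry on the next call.
            }
        }
        return snapshot.getOrDefault(flagName, false);
    }
}
```

Serving from a local snapshot keeps per-request overhead low and lets services keep evaluating flags during a short flag-server outage, at the cost of bounded staleness.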

Hypothesis Background

Primary: Self-hosted Unleash provides the feature flag capabilities needed for safe migration and gradual rollout without SaaS cost uncertainty.

Alternative 1: LaunchDarkly (managed SaaS).

- Not rejected permanently: if Unleash operational overhead proves burdensome, LaunchDarkly is the best SaaS option. Evaluate after 6 months of Unleash operation.

Alternative 2: No feature flags, canary deployments only.

- Rejected: canary deployments control only traffic splitting, not feature branching within code. They cannot toggle specific features independently of deployment.

Falsifiability Criteria

Evidence Quality

| Evidence | Assurance |
|---|---|
| No feature flags in platform | L2 (verified from code analysis) |
| Tenant config via Helm values | L2 (verified from values-globals.yaml) |
| Unleash supports tenant-based strategies | L1 (documented, not tested) |
| Unleash Java SDK works with Spring Boot | L1 (documented, widely used) |
| Unleash operational overhead | L0 (unknown; no experience running it) |

Overall: L1 (weakest link; capped by the unknown Unleash operational overhead)

Bounded Validity

Consequences

Positive:

- Safe migration rollback without deployment
- Gradual rollout reduces blast radius of changes
- Tenant-specific feature control
- A/B testing capability for product decisions
- Operational kill switches for incident response
- Self-hosted: no per-user SaaS cost

Negative:

- New infrastructure to operate (Unleash server)
- Flag branching code adds complexity (must be cleaned up after rollout)
- Risk of “flag debt”: old flags never removed
- Frontend SDK adds a small amount of bundle size
- Flag evaluation adds per-request overhead (mitigated by SDK caching)


Decision date: 2026-02-01 | Review by: 2026-08-01