ADR-011: Video Platform Strategy — Mux Evaluation & AI-Enhanced Video Pipeline

Last updated: 2026-02-01 | Decisions

Status

Under Investigation — Market research complete, AI capability analysis added, pending POC validation

Context

Mux is the exclusive video backbone for the entire platform. Both the content service and media service depend on Mux for video upload, transcoding, playback, and thumbnail generation. Shoutouts (personalized video messages from experts to fans) are stored as Mux assets. The frontend embeds the Mux player SDK for all video playback.

Current Mux Usage

| Capability | Service | How Mux is Used |
|---|---|---|
| Video upload | Content, Media | POST /mux/upload generates signed Mux upload URLs; frontend uploads directly to Mux |
| Transcoding | Content, Media | Mux transcodes uploaded video into multiple formats/bitrates automatically |
| Playback | Content, Media | mux_playback_id stored in DB; frontend uses Mux player SDK |
| Thumbnails | Media | Mux auto-generates thumbnails from video; GET /thumbnail/{id} serves them |
| Livestream webhooks | Content | POST /api/content/mux/webhook receives async video processing events |
| Asset lifecycle | Media | Full CRUD: create, read, update, delete Mux assets |
| Shoutout videos | Media | Fan-requested personalized videos stored as Mux assets |
| Mux Data (analytics) | mux-sync (ETL) | Python/dlt pipeline syncing Mux analytics to Snowflake |

What Mux Actually Does

Breaking down the Mux dependency into discrete capabilities:

| Capability | Complexity | Self-Hostable? | Alternatives |
|---|---|---|---|
| Video upload (signed URLs) | Low | Yes: GCS/S3 signed URLs | Any cloud storage |
| Transcoding (multi-bitrate) | High (CPU intensive) | Yes: FFmpeg, but requires compute infrastructure | AWS MediaConvert, GCP Transcoder API, Coconut, self-hosted FFmpeg |
| Adaptive bitrate playback (HLS/DASH) | Medium | Yes: serve HLS from CDN | Video.js + self-hosted HLS, Cloudflare Stream |
| Player SDK | Low | Yes: Video.js, Plyr, hls.js | Open source players |
| Thumbnail generation | Low | Yes: FFmpeg one-liner | Any video processing tool |
| Livestreaming | High | Complex: requires ingest servers + CDN | Cloudflare Stream, AWS IVS, self-hosted (Nginx RTMP) |
| Webhook events | Low | Yes: build own event system | Standard pub/sub |
| Analytics (Mux Data) | Medium | Build own | GCP BigQuery, custom analytics |

Cost Concern

Mux pricing is per-minute for encoding and per-minute for streaming delivery. For a content-heavy platform with video shoutouts, on-demand content, and potential live streaming, costs scale linearly with usage. The platform does not currently track Mux costs separately.
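
The linear scaling described above can be made concrete with a small cost model. This is a rough sketch only: the two rates are the per-minute figures quoted in this ADR's cost comparison, and the usage numbers in the example are hypothetical placeholders, since no billing or volume data is available yet.

```python
from dataclasses import dataclass

# Per-minute rates quoted in this ADR's cost comparison. Treat them as
# estimates until real Mux billing data is available.
MUX_ENCODING_USD_PER_MIN = 0.015
MUX_STREAMING_USD_PER_MIN = 0.007


@dataclass(frozen=True)
class MonthlyUsage:
    encoded_minutes: float   # minutes of video uploaded and transcoded
    streamed_minutes: float  # minutes of video delivered to viewers


def estimate_mux_monthly_cost(usage: MonthlyUsage) -> float:
    """Linear per-minute model: encoding cost plus streaming cost, in USD."""
    encoding = usage.encoded_minutes * MUX_ENCODING_USD_PER_MIN
    streaming = usage.streamed_minutes * MUX_STREAMING_USD_PER_MIN
    return round(encoding + streaming, 2)


if __name__ == "__main__":
    # Hypothetical volume, not measured data.
    usage = MonthlyUsage(encoded_minutes=2_000, streamed_minutes=100_000)
    print(estimate_mux_monthly_cost(usage))
```

Because both terms are linear, doubling the streamed minutes doubles the delivery part of the bill, which is why the streaming volume matters far more than the upload volume in this model.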

Rival / CortexOne as Compute Platform

Rival (cortexone.rival.io) is an AI function marketplace and serverless execution platform that runs on GCP. While Rival doesn’t have native video transcoding today, its CortexOne function platform supports:

A CortexOne function wrapping FFmpeg for video transcoding is architecturally feasible — it follows the same pattern as existing compute-heavy functions (graphrag-model-trainer, predictive-engine, document-chunker).

Decision

Replace Mux with a self-managed video pipeline using GCS for storage, a CortexOne/FFmpeg function for transcoding, and Cloudflare CDN for delivery. Keep the option to use a managed transcoding API (GCP Transcoder API) as a fallback if CortexOne FFmpeg proves insufficient for scale.

Target Architecture

graph TB
    subgraph "Clients"
        WEB[Next.js Web]
        MOB[React Native Mobile]
    end

    subgraph "Video Upload"
        CMS[Content/Media Service<br/>Java 21 / Spring Boot]
        GCS[Google Cloud Storage<br/>Raw video uploads]
    end

    subgraph "Transcoding (Replace Mux)"
        QUEUE[RabbitMQ / Cloud Tasks<br/>Transcode job queue]
        CX[CortexOne Function<br/>FFmpeg transcoding<br/>Multi-bitrate HLS output]
        GCS_HLS[GCS Bucket<br/>HLS segments + manifests]
    end

    subgraph "Delivery"
        CDN[Cloudflare CDN<br/>or GCP CDN]
        PLAYER[Video.js / hls.js<br/>Open source player]
    end

    subgraph "Fallback Option"
        GCPT[GCP Transcoder API<br/>Managed transcoding]
    end

    WEB -->|Upload video| CMS
    MOB -->|Upload video| CMS
    CMS -->|Store raw video| GCS
    CMS -->|Queue transcode job| QUEUE
    QUEUE -->|Invoke| CX
    CX -->|Read raw video| GCS
    CX -->|Write HLS output| GCS_HLS
    CX -->|Notify complete| CMS
    GCS_HLS -->|Serve via| CDN
    WEB -->|Stream HLS| CDN
    MOB -->|Stream HLS| CDN
    CDN --> PLAYER

    GCS -.->|Alternative| GCPT
    GCPT -.->|HLS output| GCS_HLS

Video Pipeline: Detailed Flow

sequenceDiagram
    participant User
    participant App as Web/Mobile App
    participant CMS as Content/Media Service
    participant GCS as GCS (Raw)
    participant Queue as RabbitMQ
    participant CX as CortexOne FFmpeg
    participant HLS as GCS (HLS)
    participant CDN as Cloudflare CDN

    User->>App: Upload video
    App->>CMS: POST /video/upload
    CMS->>GCS: Generate signed upload URL
    CMS->>App: Return signed URL
    App->>GCS: Upload raw video (direct)
    GCS->>CMS: Upload complete notification
    CMS->>Queue: Publish TranscodeJob {videoId, gcsPath, presets}
    Queue->>CX: Invoke CortexOne function
    CX->>GCS: Download raw video
    CX->>CX: FFmpeg transcode (360p, 720p, 1080p HLS)
    CX->>CX: Generate thumbnail (frame extraction)
    CX->>HLS: Upload HLS segments + manifest
    CX->>CMS: POST /video/transcode-complete {videoId, hlsUrl, thumbnailUrl}
    CMS->>CMS: Store HLS playback URL in DB
    User->>App: Play video
    App->>CDN: Request HLS manifest
    CDN->>HLS: Fetch segments (cached at edge)
    CDN->>App: Stream video (adaptive bitrate)
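
The TranscodeJob message published to the queue in the flow above could look like the following sketch. The field names videoId, gcsPath, and presets come from the diagram; the JSON encoding, the default preset list, and the example bucket path are assumptions made for illustration, not the service's actual contract.

```python
import json
from dataclasses import asdict, dataclass

# Assumed default; the ADR lists these three renditions for on-demand video.
DEFAULT_PRESETS = ("360p", "720p", "1080p")


@dataclass(frozen=True)
class TranscodeJob:
    videoId: str
    gcsPath: str
    presets: tuple = DEFAULT_PRESETS

    def to_message(self) -> bytes:
        """Serialize the job as a UTF-8 JSON message body."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_message(cls, body: bytes) -> "TranscodeJob":
        """Parse a message body, rejecting payloads without required fields."""
        data = json.loads(body.decode("utf-8"))
        missing = {"videoId", "gcsPath"} - data.keys()
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        return cls(
            videoId=data["videoId"],
            gcsPath=data["gcsPath"],
            presets=tuple(data.get("presets", DEFAULT_PRESETS)),
        )


if __name__ == "__main__":
    # Hypothetical bucket and object names.
    job = TranscodeJob(videoId="v1", gcsPath="gs://raw-uploads/v1.mp4")
    print(job.to_message().decode("utf-8"))
```

Validating required fields on the consumer side lets malformed messages be rejected immediately instead of failing midway through a transcode.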

CortexOne FFmpeg Function Design

| Aspect | Detail |
|---|---|
| Runtime | Python (matches existing CortexOne patterns) |
| Core dependency | FFmpeg (installed in Docker image) |
| Input | GCS path to raw video file |
| Output | HLS segments + m3u8 manifest at multiple bitrates, written to GCS |
| Presets | 360p (800 kbps), 720p (2.5 Mbps), 1080p (5 Mbps) |
| Thumbnails | FFmpeg frame extraction at configurable timestamp |
| Duration limit | Configurable per job type (shoutout: 5 min max, content: 60 min max) |
| Concurrency | Scale via CortexOne; each invocation is a separate container |
| Monitoring | Progress callbacks to content/media service |
| Error handling | Dead letter queue for failed transcodes; retry with exponential backoff |
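
Two of the rows above, the per-job-type duration limit and the retry policy, can be sketched in a few lines of Python. The limits are the ones stated in the table; the 30-second base delay, the 15-minute cap, and the function names are assumptions chosen for illustration.

```python
# Duration limits from the design table, in seconds.
MAX_DURATION_SECONDS = {"shoutout": 5 * 60, "content": 60 * 60}


def check_duration(job_type: str, duration_seconds: float) -> None:
    """Raise ValueError if the video exceeds the limit for its job type."""
    if job_type not in MAX_DURATION_SECONDS:
        raise ValueError(f"unknown job type: {job_type!r}")
    limit = MAX_DURATION_SECONDS[job_type]
    if duration_seconds > limit:
        raise ValueError(
            f"{job_type} video is {duration_seconds:.0f}s, limit is {limit}s"
        )


def backoff_delays(max_attempts: int,
                   base_seconds: float = 30.0,
                   cap_seconds: float = 900.0) -> list:
    """Delay before each retry: base * 2**n, capped at cap_seconds.

    base_seconds and cap_seconds are assumed values, not agreed policy.
    """
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(max_attempts)]


if __name__ == "__main__":
    check_duration("shoutout", 240)  # within the 5 minute limit
    print(backoff_delays(5))
```

Once the attempts are exhausted, the job would go to the dead letter queue rather than being retried indefinitely.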

Transcoding Presets

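# Create the output directories so they exist before FFmpeg writes to them
mkdir -p 360p 720p 1080p output

# Standard on-demand video: three HLS renditions from a single decode.
# The fixed scale sizes assume 16:9 landscape input.
# "0:a?" makes the audio stream optional so silent uploads do not fail.
ffmpeg -i input.mp4 \
  -filter_complex "[0:v]split=3[v1][v2][v3]; \
    [v1]scale=640:360[v360]; \
    [v2]scale=1280:720[v720]; \
    [v3]scale=1920:1080[v1080]" \
  -map "[v360]" -map "0:a?" -c:v libx264 -b:v 800k -c:a aac -b:a 96k \
    -hls_time 6 -hls_list_size 0 -f hls 360p/index.m3u8 \
  -map "[v720]" -map "0:a?" -c:v libx264 -b:v 2500k -c:a aac -b:a 128k \
    -hls_time 6 -hls_list_size 0 -f hls 720p/index.m3u8 \
  -map "[v1080]" -map "0:a?" -c:v libx264 -b:v 5000k -c:a aac -b:a 192k \
    -hls_time 6 -hls_list_size 0 -f hls 1080p/index.m3u8
# This writes one playlist per rendition. Adaptive switching additionally
# needs a master playlist that references all three.

# Shoutout video (single quality, fast encode)
ffmpeg -i input.mp4 \
  -c:v libx264 -preset fast -b:v 2500k -c:a aac -b:a 128k \
  -hls_time 4 -hls_list_size 0 -f hls output/index.m3u8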
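
The on-demand command writes a separate playlist for each rendition, so a player still needs a master playlist to switch between them. The sketch below generates one from the preset bitrates above. The Rendition type and function name are illustrative, and BANDWIDTH is approximated as the nominal video plus audio bitrate, whereas the HLS specification defines it as the peak rate, so a real pipeline may want to measure it from the encoded segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str        # directory that holds index.m3u8
    width: int
    height: int
    video_kbps: int
    audio_kbps: int

    @property
    def bandwidth_bps(self) -> int:
        # Nominal total bitrate; an approximation of peak bandwidth.
        return (self.video_kbps + self.audio_kbps) * 1000


# Values taken from the on-demand transcoding preset.
RENDITIONS = [
    Rendition("360p", 640, 360, 800, 96),
    Rendition("720p", 1280, 720, 2500, 128),
    Rendition("1080p", 1920, 1080, 5000, 192),
]


def build_master_playlist(renditions) -> str:
    """Return an HLS master playlist listing each rendition, lowest first."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for r in sorted(renditions, key=lambda r: r.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(f"{r.name}/index.m3u8")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_master_playlist(RENDITIONS), end="")
```

The resulting file would be uploaded next to the 360p, 720p, and 1080p directories, and its URL is what the service would store as the playback URL.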

Player Migration

| Current | Target | Change |
|---|---|---|
| Mux Player SDK (@mux/mux-player) | Video.js or hls.js | Open source, no vendor lock-in |
| Mux React Native SDK | react-native-video (plays HLS through the native platform players; hls.js is browser-only) | Native player, free |
| Mux playback URLs (stream.mux.com) | Cloudflare CDN URLs (cdn.theagilenetwork.com/video/) | Self-managed CDN |
| Mux thumbnail URLs (image.mux.com) | GCS-hosted thumbnails via CDN | FFmpeg-generated |

Migration Strategy

| Phase | Action | Risk |
|---|---|---|
| Phase 1 | Build CortexOne FFmpeg function, test with sample videos | Low: no production impact |
| Phase 2 | New uploads go through self-managed pipeline; existing Mux assets unchanged | Medium: dual pipeline |
| Phase 3 | Migrate existing Mux assets: download, re-transcode, store in GCS | Medium: bulk download from Mux |
| Phase 4 | Update frontend player (Mux SDK to Video.js/hls.js) | Medium: player regression testing |
| Phase 5 | Decommission Mux account | Low: after all assets migrated |
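
During Phases 2 and 3 both pipelines are live, so playback URL resolution has to handle assets on either side. A minimal sketch of that lookup follows. The mux_playback_id field and the CDN host come from this ADR; the hls_path field, the path layout under the CDN, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# CDN host from the player migration table; the path layout is assumed.
CDN_BASE_URL = "https://cdn.theagilenetwork.com/video"


@dataclass(frozen=True)
class VideoRecord:
    video_id: str
    mux_playback_id: Optional[str] = None  # set for assets still on Mux
    hls_path: Optional[str] = None         # hypothetical: set once self-hosted


def resolve_playback_url(video: VideoRecord) -> str:
    """Prefer the self-managed HLS manifest, fall back to Mux."""
    if video.hls_path:
        return f"{CDN_BASE_URL}/{video.hls_path.lstrip('/')}"
    if video.mux_playback_id:
        return f"https://stream.mux.com/{video.mux_playback_id}.m3u8"
    raise LookupError(f"video {video.video_id} has no playable source")


if __name__ == "__main__":
    legacy = VideoRecord("v1", mux_playback_id="abc123")
    migrated = VideoRecord("v1", mux_playback_id="abc123", hls_path="v1/master.m3u8")
    print(resolve_playback_url(legacy))
    print(resolve_playback_url(migrated))
```

Keeping mux_playback_id populated after an asset is migrated leaves a rollback path open until the Mux account is decommissioned in Phase 5.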

What About Livestreaming?

Broadcast is confirmed inactive (H1 L2). If live streaming is needed in the future:

| Option | Cost | Complexity |
|---|---|---|
| Cloudflare Stream Live | Per-minute delivery | Low: managed service |
| AWS IVS (Interactive Video Service) | Per-hour ingest + delivery | Low: managed service |
| Self-hosted (Nginx RTMP + FFmpeg) | Infrastructure only | High: custom build |
| Re-evaluate Mux for live only | Mux live pricing | Low: familiar integration |

Recommendation: Defer this choice until live streaming is actually needed. On-demand video (shoutouts, content, articles) is the current use case.

Cost Comparison (Estimated)

| Component | Mux | Self-Managed | Savings |
|---|---|---|---|
| Encoding | $0.015/min (Mux encoding) | CortexOne compute (GCP Cloud Run) | ~70-80% for moderate volume |
| Storage | Included in Mux | GCS ($0.02/GB/month) | Comparable |
| Delivery | $0.007/min (Mux streaming) | Cloudflare CDN (free tier generous, then per-GB) | 50-70% |
| Player | Free (Mux player) | Free (Video.js) | Same |
| Thumbnails | Included | FFmpeg (free) | Same |
| Analytics | Mux Data | Custom (BigQuery or self-built) | Varies |
| Ops overhead | Zero (managed) | CortexOne function + GCS + CDN management | Increased |

Key trade-off: Self-managing is expected to cost less but requires building and maintaining the transcoding pipeline. CortexOne makes this feasible by providing the compute platform, so the pipeline is not built from scratch.

Hypothesis Background

Primary: Replacing Mux with a self-managed video pipeline (GCS + CortexOne FFmpeg + Cloudflare CDN) reduces video infrastructure costs by 50-70% while maintaining video quality, and CortexOne’s serverless compute model can handle transcoding workloads at the platform’s current scale.

Alternative 1: Keep Mux, optimize usage.
- Not fully rejected. If video volume is low (<1000 videos/month), Mux’s operational simplicity may outweigh cost savings. Need actual Mux billing data to make this call.

Alternative 2: Replace Mux with Cloudflare Stream (managed).
- Viable for delivery but Cloudflare Stream has limited transcoding customization. Good for simple use cases but less control than self-managed FFmpeg. Could be a middle ground.

Alternative 3: Replace Mux with GCP Transcoder API (managed).
- Viable as a fallback. GCP Transcoder API is pay-per-minute, cheaper than Mux, but still a managed service cost. Could be used instead of CortexOne FFmpeg if self-managed transcoding proves unreliable.

Alternative 4: Replace Mux with AWS MediaConvert.
- Rejected: Platform is on GCP. Adding AWS cross-cloud dependency for video transcoding adds complexity. GCP Transcoder API or CortexOne FFmpeg are GCP-native.

Falsifiability Criteria

Evidence Quality

| Evidence | Assurance |
|---|---|
| Mux is exclusive video backbone | L2 (verified: content + media services, all tenants) |
| Broadcast is inactive | L2 (verified: H1, no ArgoCD app) |
| Media service already uses FFmpeg | L1 (FFmpeg in media service codebase) |
| CortexOne runs CPU-intensive functions | L1 (68+ functions, ML workloads verified) |
| CortexOne can run FFmpeg in container | L0 (architecturally feasible, not tested) |
| Mux actual monthly costs | L0 (no billing data available) |
| Video volume (uploads/month) | L0 (no metrics available) |
| Mux asset count for migration | L0 (not inventoried) |
| HLS delivery quality self-managed vs Mux | L0 (not benchmarked) |

Overall: L0 (WLNK capped by unknown Mux costs and untested CortexOne FFmpeg capability)

WLNK Warning: This decision is capped at L0. Before committing to migration:
1. Get Mux billing data (promotes cost evidence to L2)
2. Build proof-of-concept CortexOne FFmpeg function (promotes feasibility to L1)
3. Inventory existing Mux assets (promotes migration scope to L1)
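
The cap can be read as a weakest-link rule: the overall assurance equals the lowest level among the evidence items. The sketch below applies that reading to the evidence listed in this section. Treating WLNK as "weakest link" and ordering the levels as L0 < L1 < L2 are interpretations, not definitions stated in this ADR.

```python
from enum import IntEnum


class Assurance(IntEnum):
    # Assumed ordering: higher means better verified.
    L0 = 0
    L1 = 1
    L2 = 2


# Evidence items and levels as listed in the Evidence Quality section.
EVIDENCE = {
    "Mux is exclusive video backbone": Assurance.L2,
    "Broadcast is inactive": Assurance.L2,
    "Media service already uses FFmpeg": Assurance.L1,
    "CortexOne runs CPU-intensive functions": Assurance.L1,
    "CortexOne can run FFmpeg in container": Assurance.L0,
    "Mux actual monthly costs": Assurance.L0,
    "Video volume (uploads/month)": Assurance.L0,
    "Mux asset count for migration": Assurance.L0,
    "HLS delivery quality self-managed vs Mux": Assurance.L0,
}


def overall_assurance(evidence: dict) -> Assurance:
    """Weakest-link rule: the decision is only as strong as its weakest item."""
    return min(evidence.values())


def weakest_links(evidence: dict) -> list:
    """Evidence items that currently hold the overall level down."""
    floor = overall_assurance(evidence)
    return [name for name, level in evidence.items() if level == floor]


if __name__ == "__main__":
    print(overall_assurance(EVIDENCE).name)
    for name in weakest_links(EVIDENCE):
        print("-", name)
```

Under this rule, promoting a single item does not raise the overall level while any other item remains at L0, so the three actions above are necessary but may not be sufficient on their own.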

Bounded Validity

Consequences

Positive:
- Eliminates Mux vendor dependency and per-minute pricing
- Video infrastructure on GCP (same cloud as everything else)
- CortexOne provides elastic compute for transcoding without managing servers
- Open source player (Video.js), no SDK vendor lock-in
- Full control over transcoding presets, quality, and output formats
- Thumbnails generated as part of transcode pipeline (not separate Mux feature)
- Path to custom video features (watermarking, clips, previews) without Mux API limitations

Negative:
- Must build and maintain transcoding pipeline (CortexOne function + job queue + error handling)
- Mux provides zero-ops video; self-managing adds operational burden
- Existing Mux assets must be migrated (bulk download + re-transcode)
- Player migration (Mux SDK to Video.js) requires frontend testing
- No Mux Data analytics; must build or buy alternative
- CortexOne FFmpeg is unproven for video transcoding at scale

Mitigated by:
- CortexOne’s existing production track record with CPU-intensive workloads
- GCP Transcoder API as managed fallback
- Phased migration with dual pipeline
- Video.js is battle-tested (millions of deployments)
- Mux asset download API enables bulk migration


Decision date: 2026-01-31
Review by: 2026-07-31