ADR-011: Video Platform Strategy — Mux Evaluation & AI-Enhanced Video Pipeline
Status
Under Investigation — Market research complete, AI capability analysis added, pending POC validation
Context
Mux is the exclusive video backbone for the entire platform. Both the content service and media service depend on Mux for video upload, transcoding, playback, and thumbnail generation. Shoutouts (personalized video messages from experts to fans) are stored as Mux assets. The frontend embeds the Mux player SDK for all video playback.
Current Mux Usage
| Capability | Service | How Mux is Used |
|---|---|---|
| Video upload | Content, Media | POST /mux/upload generates signed Mux upload URLs; frontend uploads directly to Mux |
| Transcoding | Content, Media | Mux transcodes uploaded video into multiple formats/bitrates automatically |
| Playback | Content, Media | mux_playback_id stored in DB; frontend uses Mux player SDK |
| Thumbnails | Media | Mux auto-generates thumbnails from video; GET /thumbnail/{id} serves them |
| Livestream webhooks | Content | POST /api/content/mux/webhook receives async video processing events |
| Asset lifecycle | Media | Full CRUD: create, read, update, delete Mux assets |
| Shoutout videos | Media | Fan-requested personalized videos stored as Mux assets |
| Mux Data (analytics) | mux-sync (ETL) | Python/dlt pipeline syncing Mux analytics to Snowflake |
What Mux Actually Does
Breaking down the Mux dependency into discrete capabilities:
| Capability | Complexity | Self-Hostable? | Alternatives |
|---|---|---|---|
| Video upload (signed URLs) | Low | Yes — GCS/S3 signed URLs | Any cloud storage |
| Transcoding (multi-bitrate) | High — CPU intensive | Yes — FFmpeg, but requires compute infrastructure | AWS MediaConvert, GCP Transcoder API, Coconut, self-hosted FFmpeg |
| Adaptive bitrate playback (HLS/DASH) | Medium | Yes — serve HLS from CDN | Video.js + self-hosted HLS, Cloudflare Stream |
| Player SDK | Low | Yes — Video.js, Plyr, hls.js | Open source players |
| Thumbnail generation | Low | Yes — FFmpeg one-liner | Any video processing tool |
| Livestreaming | High | Complex — requires ingest servers + CDN | Cloudflare Stream, AWS IVS, self-hosted (Nginx RTMP) |
| Webhook events | Low | Yes — build own event system | Standard pub/sub |
| Analytics (Mux Data) | Medium | Build own | GCP BigQuery, custom analytics |
Cost Concern
Mux pricing is per-minute for encoding and per-minute for streaming delivery. For a content-heavy platform with video shoutouts, on-demand content, and potential live streaming, costs scale linearly with usage. The platform does not currently track Mux costs separately.
Rival / CortexOne as Compute Platform
Rival (cortexone.rival.io) is an AI function marketplace and serverless execution platform that runs on GCP. While Rival doesn’t have native video transcoding today, its CortexOne function platform supports:
- Python functions — 68+ production functions running on GCP Cloud Run
- CPU-intensive workloads — ML model training, prediction engines, document processing
- Docker containers — Custom runtimes with arbitrary dependencies (FFmpeg could be added)
- gRPC execution — Low-latency function invocation
- Multi-runtime — Python, JavaScript, Lua support
A CortexOne function wrapping FFmpeg for video transcoding is architecturally feasible — it follows the same pattern as existing compute-heavy functions (graphrag-model-trainer, predictive-engine, document-chunker).
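A minimal sketch of what such a wrapper might look like, assuming FFmpeg is on the container's PATH and the raw file has already been copied to local disk. The names `build_shoutout_args` and `transcode_shoutout` are hypothetical and not part of any existing CortexOne API. The encoder settings mirror the shoutout preset listed under Transcoding Presets, and GCS download/upload is left out.

```python
import subprocess
from pathlib import Path


def build_shoutout_args(src: str, out_dir: str) -> list[str]:
    """Build the FFmpeg argv for a single-rendition HLS encode (hypothetical helper)."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-preset", "fast", "-b:v", "2500k",
        "-c:a", "aac", "-b:a", "128k",
        "-hls_time", "4", "-hls_list_size", "0",
        "-f", "hls", str(Path(out_dir) / "index.m3u8"),
    ]


def transcode_shoutout(src: str, out_dir: str) -> Path:
    """Run FFmpeg; raises subprocess.CalledProcessError if the encode fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_shoutout_args(src, out_dir), check=True, capture_output=True)
    return Path(out_dir) / "index.m3u8"
```

Passing the arguments as a list rather than a shell string avoids quoting problems with user-supplied file names.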
Decision
Proposed, pending POC validation: replace Mux with a self-managed video pipeline using GCS for storage, a CortexOne/FFmpeg function for transcoding, and Cloudflare CDN for delivery. Keep the GCP Transcoder API available as a managed transcoding fallback if CortexOne FFmpeg proves insufficient at scale.
Target Architecture
graph TB
subgraph "Clients"
WEB[Next.js Web]
MOB[React Native Mobile]
end
subgraph "Video Upload"
CMS[Content/Media Service<br/>Java 21 / Spring Boot]
GCS[Google Cloud Storage<br/>Raw video uploads]
end
subgraph "Transcoding (Replace Mux)"
QUEUE[RabbitMQ / Cloud Tasks<br/>Transcode job queue]
CX[CortexOne Function<br/>FFmpeg transcoding<br/>Multi-bitrate HLS output]
GCS_HLS[GCS Bucket<br/>HLS segments + manifests]
end
subgraph "Delivery"
CDN[Cloudflare CDN<br/>or GCP CDN]
PLAYER[Video.js / hls.js<br/>Open source player]
end
subgraph "Fallback Option"
GCPT[GCP Transcoder API<br/>Managed transcoding]
end
WEB -->|Upload video| CMS
MOB -->|Upload video| CMS
CMS -->|Store raw video| GCS
CMS -->|Queue transcode job| QUEUE
QUEUE -->|Invoke| CX
CX -->|Read raw video| GCS
CX -->|Write HLS output| GCS_HLS
CX -->|Notify complete| CMS
GCS_HLS -->|Serve via| CDN
WEB -->|Stream HLS| CDN
MOB -->|Stream HLS| CDN
CDN --> PLAYER
GCS -.->|Alternative| GCPT
GCPT -.->|HLS output| GCS_HLS
Video Pipeline: Detailed Flow
sequenceDiagram
participant User
participant App as Web/Mobile App
participant CMS as Content/Media Service
participant GCS as GCS (Raw)
participant Queue as RabbitMQ
participant CX as CortexOne FFmpeg
participant HLS as GCS (HLS)
participant CDN as Cloudflare CDN
User->>App: Upload video
App->>CMS: POST /video/upload
CMS->>GCS: Generate signed upload URL
CMS->>App: Return signed URL
App->>GCS: Upload raw video (direct)
GCS->>CMS: Upload complete notification
CMS->>Queue: Publish TranscodeJob {videoId, gcsPath, presets}
Queue->>CX: Invoke CortexOne function
CX->>GCS: Download raw video
CX->>CX: FFmpeg transcode (360p, 720p, 1080p HLS)
CX->>CX: Generate thumbnail (frame extraction)
CX->>HLS: Upload HLS segments + manifest
CX->>CMS: POST /video/transcode-complete {videoId, hlsUrl, thumbnailUrl}
CMS->>CMS: Store HLS playback URL in DB
User->>App: Play video
App->>CDN: Request HLS manifest
CDN->>HLS: Fetch segments (cached at edge)
CDN->>App: Stream video (adaptive bitrate)
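The two messages in this flow, `TranscodeJob {videoId, gcsPath, presets}` and the `transcode-complete` callback body, could be modelled roughly as below. The field names come from the diagram; the JSON encoding, the default preset list, and the class and method names are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class TranscodeJob:
    """Message published by the content/media service to the job queue."""
    videoId: str
    gcsPath: str
    presets: list[str] = field(default_factory=lambda: ["360p", "720p", "1080p"])

    def to_message(self) -> bytes:
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_message(cls, body: bytes) -> "TranscodeJob":
        return cls(**json.loads(body))


@dataclass(frozen=True)
class TranscodeComplete:
    """Body of POST /video/transcode-complete sent back by the function."""
    videoId: str
    hlsUrl: str
    thumbnailUrl: str
```

A round trip such as `TranscodeJob.from_message(job.to_message())` should return an equal object, which is a cheap contract test for both sides of the queue.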
CortexOne FFmpeg Function Design
| Aspect | Detail |
|---|---|
| Runtime | Python (matches existing CortexOne patterns) |
| Core dependency | FFmpeg (installed in Docker image) |
| Input | GCS path to raw video file |
| Output | HLS segments + m3u8 manifest at multiple bitrates → GCS |
| Presets | 360p (800kbps), 720p (2.5Mbps), 1080p (5Mbps) |
| Thumbnails | FFmpeg frame extraction at configurable timestamp |
| Duration limit | Configurable per job type (shoutout: 5min max, content: 60min max) |
| Concurrency | Scale via CortexOne — each invocation is a separate container |
| Monitoring | Progress callbacks to content/media service |
| Error handling | Dead letter queue for failed transcodes; retry with exponential backoff |
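The duration limits and the retry policy in the table can be sketched as two small helpers. The limits are the ones stated above; the base delay, cap, and function names are placeholder assumptions, since the ADR does not fix a retry schedule.

```python
# Per-job-type duration limits from the design table, in seconds.
DURATION_LIMITS_SEC = {"shoutout": 5 * 60, "content": 60 * 60}


def within_duration_limit(job_type: str, duration_sec: float) -> bool:
    """Reject inputs longer than the limit for their job type."""
    return duration_sec <= DURATION_LIMITS_SEC[job_type]


def backoff_delays(max_attempts: int, base_sec: float = 30.0, cap_sec: float = 900.0) -> list[float]:
    """Delay before each retry: base * 2^n, capped (values are assumptions)."""
    return [min(cap_sec, base_sec * 2 ** n) for n in range(max_attempts)]
```

A job that still fails after the last delay would be routed to the dead letter queue rather than retried again.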
Transcoding Presets
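# Standard on-demand video: three HLS renditions from a single decode pass.
# Assumes a 16:9 landscape source; fixed scale sizes would distort portrait video.
# "0:a?" makes the audio mapping optional so silent uploads do not fail.
# -force_key_frames aligns keyframes with the 6 s segment boundaries in every rendition.
mkdir -p 360p 720p 1080p
ffmpeg -i input.mp4 \
-filter_complex "[0:v]split=3[v1][v2][v3]; \
[v1]scale=640:360[v360]; \
[v2]scale=1280:720[v720]; \
[v3]scale=1920:1080[v1080]" \
-map "[v360]" -map "0:a?" -c:v libx264 -b:v 800k -force_key_frames "expr:gte(t,n_forced*6)" -c:a aac -b:a 96k \
-hls_time 6 -hls_list_size 0 -hls_playlist_type vod -f hls 360p/index.m3u8 \
-map "[v720]" -map "0:a?" -c:v libx264 -b:v 2500k -force_key_frames "expr:gte(t,n_forced*6)" -c:a aac -b:a 128k \
-hls_time 6 -hls_list_size 0 -hls_playlist_type vod -f hls 720p/index.m3u8 \
-map "[v1080]" -map "0:a?" -c:v libx264 -b:v 5000k -force_key_frames "expr:gte(t,n_forced*6)" -c:a aac -b:a 192k \
-hls_time 6 -hls_list_size 0 -hls_playlist_type vod -f hls 1080p/index.m3u8
# Shoutout video (single quality, fast encode)
mkdir -p output
ffmpeg -i input.mp4 \
-c:v libx264 -preset fast -b:v 2500k -c:a aac -b:a 128k \
-hls_time 4 -hls_list_size 0 -hls_playlist_type vod -f hls output/index.m3u8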
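The on-demand command writes three independent media playlists. For adaptive bitrate switching, a player also needs a master playlist that lists them. The sketch below generates one from the preset values; the `RENDITIONS` table and `master_playlist` name are illustrative, and `BANDWIDTH` uses the nominal video plus audio bitrate as an approximation of the peak rate the HLS specification asks for.

```python
# (directory, width, height, video bits/s, audio bits/s) taken from the presets.
RENDITIONS = [
    ("360p", 640, 360, 800_000, 96_000),
    ("720p", 1280, 720, 2_500_000, 128_000),
    ("1080p", 1920, 1080, 5_000_000, 192_000),
]


def master_playlist(renditions=RENDITIONS) -> str:
    """Return the text of a master.m3u8 that references each rendition playlist."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for name, width, height, video_bps, audio_bps in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={video_bps + audio_bps},RESOLUTION={width}x{height}"
        )
        lines.append(f"{name}/index.m3u8")
    return "\n".join(lines) + "\n"
```

The resulting file would sit next to the `360p/`, `720p/`, and `1080p/` directories, and its URL is what the service would store as the playback URL.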
Player Migration
| Current | Target | Change |
|---|---|---|
| Mux Player SDK (@mux/mux-player) | Video.js + hls.js | Open source, no vendor lock-in |
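| Mux React Native SDK | react-native-video (native HLS playback via AVPlayer on iOS and ExoPlayer on Android) | Native player, free; hls.js is browser-only and not needed on mobile |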
| Mux playback URLs (stream.mux.com) | Cloudflare CDN URLs (cdn.theagilenetwork.com/video/) | Self-managed CDN |
| Mux thumbnail URLs (image.mux.com) | GCS-hosted thumbnails via CDN | FFmpeg-generated |
Migration Strategy
| Phase | Action | Risk |
|---|---|---|
| Phase 1 | Build CortexOne FFmpeg function, test with sample videos | Low — no production impact |
| Phase 2 | New uploads go through self-managed pipeline; existing Mux assets unchanged | Medium — dual pipeline |
| Phase 3 | Migrate existing Mux assets: download → re-transcode → store in GCS | Medium — bulk download from Mux |
| Phase 4 | Update frontend player (Mux SDK → Video.js/hls.js) | Medium — player regression testing |
| Phase 5 | Decommission Mux account | Low — after all assets migrated |
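During Phases 2 and 3 both pipelines are live, so playback has to resolve each video to whichever source it currently has. A rough sketch of that lookup follows. The CDN host comes from the Player Migration table; the path layout under `/video/`, the parameter names, and the function name are assumptions.

```python
from typing import Optional

CDN_BASE = "https://cdn.theagilenetwork.com/video"  # host from the Player Migration table
MUX_BASE = "https://stream.mux.com"


def playback_url(video_id: str, hls_path: Optional[str], mux_playback_id: Optional[str]) -> str:
    """Prefer the self-managed HLS manifest; fall back to the legacy Mux asset."""
    if hls_path:
        return f"{CDN_BASE}/{hls_path.lstrip('/')}"
    if mux_playback_id:
        return f"{MUX_BASE}/{mux_playback_id}.m3u8"
    raise LookupError(f"video {video_id} has no playable source")
```

Because both branches return an HLS manifest URL, the same player can serve migrated and unmigrated assets, which would let Phase 4 proceed independently of Phase 3.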
What About Livestreaming?
Broadcast is confirmed inactive (H1 L2). If live streaming is needed in the future:
| Option | Cost | Complexity |
|---|---|---|
| Cloudflare Stream Live | Per-minute delivery | Low — managed service |
| AWS IVS (Interactive Video Service) | Per-hour ingest + delivery | Low — managed service |
| Self-hosted (Nginx RTMP + FFmpeg) | Infrastructure only | High — custom build |
| Re-evaluate Mux for live only | Mux live pricing | Low — familiar integration |
Recommendation: Cross that bridge when live streaming is actually needed. On-demand video (shoutouts, content, articles) is the current use case.
Cost Comparison (Estimated)
| Component | Mux | Self-Managed | Savings |
|---|---|---|---|
| Encoding | $0.015/min (Mux encoding) | CortexOne compute (GCP Cloud Run) | ~70-80% for moderate volume |
| Storage | Included in Mux | GCS ($0.02/GB/month) | Comparable |
| Delivery | $0.007/min (Mux streaming) | Cloudflare CDN (generous free tier, then per-GB) | ~50-70% |
| Player | Free (Mux player) | Free (Video.js) | Same |
| Thumbnails | Included | FFmpeg (free) | Same |
| Analytics | Mux Data | Custom (BigQuery or self-built) | Varies |
| Ops overhead | Zero (managed) | CortexOne function + GCS + CDN management | Increased |
Key trade-off: Self-managing costs less but requires building and maintaining the transcoding pipeline. CortexOne makes this feasible by providing the compute platform — you’re not building from scratch.
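Since Mux cost is linear in minutes, the comparison reduces to simple arithmetic once real volumes are known. The sketch below uses the per-minute rates quoted in the table; the volumes passed in are placeholders, because no billing or usage data is available yet.

```python
MUX_ENCODE_PER_MIN = 0.015    # USD per encoded minute, from the table above
MUX_DELIVERY_PER_MIN = 0.007  # USD per delivered minute, from the table above


def mux_monthly_cost(encoded_minutes: float, delivered_minutes: float) -> float:
    """Linear Mux cost model: encoding plus streaming delivery."""
    return encoded_minutes * MUX_ENCODE_PER_MIN + delivered_minutes * MUX_DELIVERY_PER_MIN


def savings_fraction(mux_cost: float, self_managed_cost: float) -> float:
    """Share of the Mux bill saved by the self-managed pipeline."""
    return 1.0 - self_managed_cost / mux_cost
```

For example, 1,000 encoded minutes and 10,000 delivered minutes come to 1,000 × 0.015 + 10,000 × 0.007 = 85 USD per month under this model. The self-managed figure passed to `savings_fraction` should include ops time, not only compute, storage, and CDN.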
Hypothesis Background
Primary: Replacing Mux with a self-managed video pipeline (GCS + CortexOne FFmpeg + Cloudflare CDN) reduces video infrastructure costs by 50-70% while maintaining video quality, and CortexOne’s serverless compute model can handle transcoding workloads at the platform’s current scale.
- Evidence: Mux charges per-minute for encoding and delivery. For a platform with shoutout videos (short, high volume) and on-demand content, costs scale linearly with usage.
- Evidence: CortexOne already runs 68+ production functions including CPU-intensive workloads (ML model training, document processing). FFmpeg transcoding is architecturally similar: containerized, stateless, and CPU-bound.
- Evidence: The media service already uses FFmpeg for video processing internally. The knowledge exists in the codebase.
- Evidence: Broadcast/livestreaming is inactive (H1 L2) — the use case is entirely on-demand video, which is simpler than live.
Alternative 1: Keep Mux, optimize usage.
- Not fully rejected. If video volume is low (<1000 videos/month), Mux’s operational simplicity may outweigh cost savings. Need actual Mux billing data to make this call.
Alternative 2: Replace Mux with Cloudflare Stream (managed).
- Viable for delivery but Cloudflare Stream has limited transcoding customization. Good for simple use cases but less control than self-managed FFmpeg. Could be a middle ground.
Alternative 3: Replace Mux with GCP Transcoder API (managed).
- Viable as a fallback. GCP Transcoder API is pay-per-minute, cheaper than Mux, but still a managed service cost. Could be used instead of CortexOne FFmpeg if self-managed transcoding proves unreliable.
Alternative 4: Replace Mux with AWS MediaConvert.
- Rejected: Platform is on GCP. Adding AWS cross-cloud dependency for video transcoding adds complexity. GCP Transcoder API or CortexOne FFmpeg are GCP-native.
Falsifiability Criteria
- If CortexOne FFmpeg function cannot transcode a 1080p 30-minute video within 10 minutes → use GCP Transcoder API instead
- If self-managed HLS delivery quality (buffering, startup time) is measurably worse than Mux → investigate CDN configuration or reconsider managed delivery
- If total self-managed costs (compute + storage + CDN + ops time) exceed 80% of Mux costs → the operational overhead doesn’t justify the switch
- If Mux monthly bill is <$200/month → the savings don’t justify migration effort; keep Mux
- If video upload volume exceeds 500 videos/day → CortexOne scaling may need evaluation
- If actual Mux asset count exceeds 50,000 → bulk migration timeline needs careful planning
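The quantitative criteria above can be checked mechanically once POC numbers exist. The sketch below encodes the stated thresholds; the `PocMeasurements` fields and the function name are hypothetical, and the qualitative delivery-quality criterion is left out because it has no numeric threshold here.

```python
from dataclasses import dataclass


@dataclass
class PocMeasurements:
    transcode_minutes_1080p_30min: float
    self_managed_monthly_cost: float
    mux_monthly_cost: float
    uploads_per_day: int
    mux_asset_count: int


def triggered_criteria(m: PocMeasurements) -> list[str]:
    """Return the action for every falsifiability criterion that fires."""
    hits = []
    if m.transcode_minutes_1080p_30min > 10:
        hits.append("use GCP Transcoder API instead of CortexOne FFmpeg")
    if m.mux_monthly_cost > 0 and m.self_managed_monthly_cost / m.mux_monthly_cost > 0.8:
        hits.append("operational overhead does not justify the switch")
    if m.mux_monthly_cost < 200:
        hits.append("keep Mux; savings do not justify migration effort")
    if m.uploads_per_day > 500:
        hits.append("evaluate CortexOne scaling")
    if m.mux_asset_count > 50_000:
        hits.append("plan bulk migration timeline carefully")
    return hits
```

An empty result means none of the numeric criteria argue against proceeding.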
Evidence Quality
| Evidence | Assurance |
|---|---|
| Mux is exclusive video backbone | L2 (verified — content + media services, all tenants) |
| Broadcast is inactive | L2 (verified — H1, no ArgoCD app) |
| Media service already uses FFmpeg | L1 (FFmpeg in media service codebase) |
| CortexOne runs CPU-intensive functions | L1 (68+ functions, ML workloads verified) |
| CortexOne can run FFmpeg in container | L0 (architecturally feasible, not tested) |
| Mux actual monthly costs | L0 (no billing data available) |
| Video volume (uploads/month) | L0 (no metrics available) |
| Mux asset count for migration | L0 (not inventoried) |
| HLS delivery quality self-managed vs Mux | L0 (not benchmarked) |
Overall: L0 (WLNK capped by unknown Mux costs and untested CortexOne FFmpeg capability)
WLNK Warning: This decision is capped at L0. Before committing to migration:
1. Get Mux billing data (promotes cost evidence to L2)
2. Build proof-of-concept CortexOne FFmpeg function (promotes feasibility to L1)
3. Inventory existing Mux assets (promotes migration scope to L1)
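Assuming WLNK denotes a weakest-link rule, where the overall assurance equals the lowest level among the evidence items, the cap can be reproduced from the table as follows. The dictionary keys are shortened labels and the function name is illustrative.

```python
# Assurance levels copied from the Evidence Quality table.
EVIDENCE = {
    "mux_exclusive_backbone": 2,
    "broadcast_inactive": 2,
    "media_service_uses_ffmpeg": 1,
    "cortexone_cpu_workloads": 1,
    "cortexone_runs_ffmpeg": 0,
    "mux_monthly_costs": 0,
    "video_volume": 0,
    "mux_asset_count": 0,
    "hls_delivery_quality": 0,
}


def overall_assurance(levels: dict[str, int]) -> int:
    """Weakest-link rule (assumed): the overall level is the minimum."""
    return min(levels.values())


# Apply the three promotions listed in the warning.
after_steps = {**EVIDENCE, "mux_monthly_costs": 2, "cortexone_runs_ffmpeg": 1, "mux_asset_count": 1}
```

Under this reading, `overall_assurance(after_steps)` is still 0, because video volume and HLS delivery quality would remain at L0 after the three listed steps. Lifting the cap would also require upload metrics and a delivery benchmark.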
Bounded Validity
- Scope: All video upload, transcoding, and playback across all tenants and services. Affects content service, media service, and all frontend video components.
- Expiry: Re-evaluate if Mux introduces significantly cheaper pricing tiers, or if video volume grows beyond CortexOne’s comfortable scaling range.
- Review trigger: If self-managed video quality complaints exceed Mux-era baselines, or if CortexOne compute costs approach Mux costs.
- Monitoring: Transcode job duration (p50, p95), video startup time, buffering rate, CDN cache hit ratio, monthly video infrastructure cost.
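The p50 and p95 transcode durations named under Monitoring could be computed from job logs with a nearest-rank percentile, as in the sketch below. The sample durations are made up for illustration; a metrics backend would normally do this calculation.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical transcode durations in seconds.
durations = [40, 55, 60, 62, 70, 75, 80, 95, 120, 300]
p50 = percentile(durations, 50)
p95 = percentile(durations, 95)
```

With these samples p50 is 70 s while p95 is 300 s, which shows why the tail is tracked separately: a single slow job dominates p95 without moving the median.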
Consequences
Positive:
- Eliminates Mux vendor dependency and per-minute pricing
- Video infrastructure on GCP (same cloud as everything else)
- CortexOne provides elastic compute for transcoding without managing servers
- Open source player (Video.js), so no SDK vendor lock-in
- Full control over transcoding presets, quality, and output formats
- Thumbnails generated as part of transcode pipeline (not separate Mux feature)
- Path to custom video features (watermarking, clips, previews) without Mux API limitations
Negative:
- Must build and maintain transcoding pipeline (CortexOne function + job queue + error handling)
- Mux provides zero-ops video; self-managing adds operational burden
- Existing Mux assets must be migrated (bulk download + re-transcode)
- Player migration (Mux SDK → Video.js) requires frontend testing
- No Mux Data analytics; must build or buy alternative
- CortexOne FFmpeg is unproven for video transcoding at scale
Mitigated by: CortexOne’s existing production track record with CPU-intensive workloads. GCP Transcoder API as managed fallback. Phased migration with dual pipeline. Video.js is battle-tested (millions of deployments). Mux asset download API enables bulk migration.
Decision date: 2026-01-31
Review by: 2026-07-31