Architecture

Data Models — Cross-Cutting Synthesis

Last updated: 2026-02-01 | Architecture

Data Models — Cross-Cutting Synthesis

Key Takeaways

  1. 35 PostgreSQL databases per tenant — Database-per-service pattern strictly enforced. No cross-database foreign keys. All inter-service data references use UUIDs or external IDs. Single shared DB exception (email/sms/notifications).
  2. ~280 Flyway migrations across analyzed services — Inventory (36) and Class-Catalog (32) are the most complex. Wallet (3) and SSE (3) are the simplest. All use additive-only migration patterns (no destructive schema changes found).
  3. All user IDs are Keycloak UUIDs — No local identity generation. Every service references users by Keycloak UUID. This simplifies data migration (no ID mapping needed) but creates universal Keycloak dependency.
  4. External ID coupling in 4 services — Stripe (customer/product/subscription IDs), Mux (asset/playback IDs), Zoom (meeting/registrant IDs), and Stream Chat (channel IDs) store external platform references. These must be preserved during migration.
  5. No double-entry bookkeeping — Transaction service uses a simple payment log (single table, JSON purchases column). If Gen 3 needs financial reporting, the data model must be redesigned.

Data Lineage: Gen 1 → Gen 2

Schema Evolution Pattern

All Gen 2 services follow the same schema evolution approach: - Flyway manages migrations (V1, V2, V3… pattern) - Additive only — new columns are nullable, no column drops observed - UUID primary keys throughout (no auto-increment integers) - Keycloak UUID foreign keys for all user references - created_on / updated_on timestamps on most entities - Soft deletes where needed (is_deleted + deleted_on)

Gen 1 → Gen 2 Schema Differences

Domain Gen 1 Schema Gen 2 Schema Transformation
Identity (Celebrity) Basic (bio, name, tags) Extended (+type, company, title, video, soft delete) Superset — Gen 2 adds columns
Identity (Fan) 4 fields 4 fields (identical) No change
Content Single table 7 tables (categories, ratings, files, SEU/PDU) Major expansion
Media Part of content 13 tables (blogs, news, thumbnails) Split + expansion
Payment (Stripe) Unknown 7 tables (exchange rates, discount codes, checkout data) New domain
Payment (Wallet) Unknown 3 tables (balance, transactions) Simplified in Gen 2
Events (Inventory) Unknown 12+ tables, 36 migrations Most complex
Communication Unknown Shared DB with 14+ tables New consolidated DB

Data Migration Complexity by Domain

Domain Complexity Rationale
Events (Inventory) High 36 migrations, 12+ tables, cross-cutting product catalog
Events (Class-Catalog) High 32 migrations, 15+ entities, learning credits
Content & Streaming High 62 migrations across 3 services, Mux/Zoom external IDs
Events (Shoutout) Medium 24 migrations, BPM state, Mux video references
Payment (Billing) Medium 27 migrations across 3 services, Stripe external IDs
Identity Medium 33 migrations, Keycloak UUID coupling
Communication Medium 44 migrations across 6 services, shared DB
Payment (Wallet/Txn) Low 9 migrations, simple schemas

Complete Entity-Relationship Overview

Identity Domain

celebrity_profiles (PK: celebrity_user_id → Keycloak UUID)
  ├── celebrity_profile_image
  ├── celebrity_profile_banner
  ├── celebrity_profile_shortlinks
  ├── profile_status_history
  └── referral_codes

fan_profiles (PK: fan_user_id → Keycloak UUID)
  ├── fan_follows (PK: MD5 composite)
  ├── fan_follow_changes (audit)
  ├── fan_media
  └── global_preferences

Content & Streaming Domain

contents (PK: UUID, FK: creator_id → Keycloak)
  ├── content_categories → categories
  ├── content_ratings
  ├── content_files (NFS paths)
  ├── content_tags
  └── seu_pdu_completions

media (PK: UUID, FK: creator_id → Keycloak)
  ├── media_thumbnails
  ├── media_tags
  ├── blogs
  └── news

webinars (PK: UUID, FK: host_id → Keycloak)
  ├── webinar_registrations (zoom_registrant_id)
  ├── webinar_recordings (zoom_recording_id)
  ├── webinar_calendar_events (google_event_id)
  ├── webinar_attendees
  └── webinar_series

Payment Domain

stripe_customers (PK: peeq_user_id → Keycloak, stripe_customer_id)
exchange_rates (virtual currency pricing)
discount_codes → discount_codes_usage
checkout_purchase_products (stripe_product_id, inventory_id)
checkout_purchase_recipients (PII: name, email, phone, LinkedIn)
promo_code_usage
one_off_subscriptions_to_cancel (stripe_subscription_id)

subscriptions (PK: UUID, inventory_info JSON)
  ├── subscription_associated_tags
  └── series → series_published_notification_log

balance_transaction (PK: UUID, user_id, int balance)
  └── create_transaction (debit/credit log)
      └── create_transaction_related_transactions

transactions (single table, JSON purchases column)

Events Domain

shoutout (PK: UUID)
  ├── shoutout_video (mux_asset_id, mux_playback_id)
  ├── shoutout_recipient
  └── shoutout_fulfillment_state

inventory_items (PK: UUID, 36 migrations, 12+ tables)
  ├── inventory_prices
  ├── inventory_entitlements
  ├── inventory_purchases (user_id, state machine)
  ├── inventory_tags
  └── [complex product catalog schema]

class_catalog (32 migrations, 15+ entities)
  ├── courses
  ├── sections (learning units)
  ├── journeys (learning paths)
  ├── office_hours
  ├── ceu_pdu_credits
  └── [learning management schema]

Communication Domain

-- Shared DB: peeq-notification-service-db --
notification_preferences (user_id, channel, enabled)
notification_subscriptions (user_id, topic)
notification_log (sent notifications)
notification_digests (batched notifications)

email_templates (template definitions)
email_log (sent emails with Mandrill IDs)

sms_templates (template definitions)
sms_log (sent messages with Twilio SIDs)

-- Separate DBs --
chat: stream_channels, channel_access (Stream Chat channel IDs)
message_board: boards, posts, donations, leaderboard
sse: sse_groups, sse_group_members

Data Volume Estimates

What We Know (from Infrastructure — Session 8)

Metric Value
Cloud SQL tier db-custom-2-6656 (2 vCPU, 6.5 GB RAM)
Max connections 750 (shared across 35 databases)
Databases per tenant 35 PostgreSQL
Production tenants 4 (×35 = 140 total databases)
Backup Point-in-time recovery enabled

What We Don’t Know (H8 — Still L0)

Proxy Estimates (from migration counts)

Service Migrations Estimated Maturity Likely Volume
Inventory 36 Very mature High (product catalog)
Class-Catalog 32 Very mature Medium-High (courses)
Media 26 Mature Medium (video metadata)
Webinar 24 Mature Medium (events + registrations)
Shoutout 24 Mature Medium (video fulfillment)
Celebrity 22 Mature Medium (profiles)
Email 14 Moderate High (email logs)
Stripe 13 Moderate Medium (checkout data)
Fan 11 Moderate High (follows)
Wallet 3 Simple Medium (transactions)

PII Data Locations

Identified across all analyzed domains:

Table Service PII Fields Session
checkout_purchase_recipients Stripe first_name, last_name, email, phone_number, linked_in_address 4
celebrity_profiles Celebrity first_name, last_name, display_name, bio, company_name 2
fan_profiles Fan (metadata only — minimal PII) 2
shoutout_recipient Shoutout recipient name, email (for personalized videos) 6
webinar_registrations Webinar user_id → Keycloak (PII in Keycloak, not here) 3
Keycloak DB Keycloak email, phone, name, auth credentials 2

Note: Most services store only Keycloak UUIDs, not PII directly. PII is concentrated in Keycloak, checkout recipients, celebrity profiles, and shoutout recipients.

Stateful Data Requiring Special Migration Handling

Active Process Instances (BPM)

BPM Engine Database Active State
purchase-request-bpm purchase_request_bpm Running purchase workflows with timer events
shoutout-bpm shoutout_bpm Running shoutout fulfillment workflows

Migration strategy: Drain all active instances before BPM engine replacement. Both workflows are simple enough (~10 states) to complete within hours.

Session/Token State (Redis + Keycloak)

State Storage TTL Impact
Keycloak user sessions Keycloak DB Configurable Must be preserved or gracefully expired
Magic Link tokens Redis 24h Accept brief auth interruption
OAuth2 refresh tokens Keycloak DB Configurable Realm export/import preserves these
SSE group memberships PostgreSQL + Redis Persistent Must rebuild SSE group state

Queued Messages (RabbitMQ)

75+ message types across all domains. During migration: - Consumer side: Drain queues before stopping services - Publisher side: Stop publishers before draining - Order: Stop publishers → drain queues → migrate → restart

NFS File Storage

PVC Size Service Contents
pvc-content 50Gi Content Articles, resources, uploads
pvc-media 50Gi Media Video/image files
pvc-shoutout 50Gi Shoutout Fulfillment videos
pvc-stream 50Gi Streaming Stream assets

Migration: Must copy NFS data to new storage (GCS recommended) or maintain NFS mount during transition.

Elasticsearch Index Migration

Current State

Index Types (Inferred)

Index Category Purpose Source
Application logs Service logs peeq-logging pipeline
Search indices Content, media, celebrity search Search service
Kibana metadata .kibana index Kibana internals

Migration Considerations

Data Migration Strategy Recommendations

Phase 1: Low-Risk (Can migrate independently)

Phase 2: Medium-Risk (External ID coupling)

Phase 3: High-Risk (Complex schemas + external IDs)

Phase 4: Infrastructure (Must be last)


Last updated: 2026-01-30 — Session 9 Review by: 2026-04-30 Staleness risk: Medium — schema evolution continues with development