System Architecture

Refreshed 2026-05-06 (SUR-284). All known post-SUR-215 / post-SUR-218 inaccuracies resolved. The “Shared marketing/help route analytics” section has been replaced with accurate content. See the CHANGE SUMMARY below for the full update history.

SUR-261 (2026-04-26) restructured src/AuthScreen.jsx into a three-zone layout with a “Use a different account” bottom sheet and added an email-OTP sign-in path alongside Google OAuth. At that point email-OTP pinned shouldCreateUser: false (sign-in only, no self-signup); SUR-364 (2026-05-10) later removed the pin to open self-service signup. Hero markup was extracted into src/components/HomeHero.jsx, shared between AuthScreen and HomeScreen. Platform detection lives in src/lib/platform.js (isStandaloneOrTwa()). The branded magic-link template is at supabase/email-templates/magic-link.html. The “Auth boundary” subsection below covers the new helpers.

CHANGE SUMMARY

  • Updated: Reflected the live managed Anthropic proxy path, quota enforcement, and rate-limit surfacing (src/api.js, src/supabase.js, supabase/functions/anthropic-proxy/index.ts).
  • Updated: Captured the ai_usage_daily ledger and its impact on runtime flows and risks (supabase/migrations/0004_ai_usage_tracking.sql, 0005_upsert_ai_usage_fn.sql).
  • Updated (2026-04-23/24): SUR-215 stripped LandingPage.jsx, WaitlistScreen.jsx, HowItWorksPage.jsx; SUR-238 reworked capture screen nav. Staleness banner updated accordingly.
  • Updated (2026-04-26, SUR-261): AuthScreen restructured into three-zone layout; new email-OTP sign-in path (requestEmailOtp / verifyEmailOtp in src/supabase.js); platform helper at src/lib/platform.js; shared HomeHero component; branded supabase/email-templates/magic-link.html (manual dashboard upload).
  • Updated (2026-04-29, SUR-92): Per-user managed-AI quota: getResolvedMonthlyLimit reads from user_profiles.month_limit (default 50) + optional allocation_override; getMonthlyUsage now sums request_count across all action types (no per-action carve-out); emitTelemetry is fire-and-forget via fireAndForget() so the 429 response is not delayed by PostHog network behaviour.
  • Updated (2026-04-30): public/.well-known/assetlinks.json now declares delegate_permission/common.get_login_creds so the Android TWA can share passkeys/credentials with the browser via Google Smart Lock. A second SHA-256 fingerprint (debug/test keystore) was also registered.
  • Updated (2026-05-01, SUR-242): Azure AI Content Safety guardrails added to anthropic-proxy (guardrail.ts): Prompt Shields (direct injection), Spotlighting (indirect injection in transcribed text), and harm classifiers (output moderation, threshold ≥5). All fail-open. Prompt fencing added to system prompts. Client-side PII detection module (src/safety/) added — structured-regex warn-not-block, stub seams for SUR-246.
  • Updated (2026-05-01, SUR-256): surfc-web/ now has a founder blog at /blog/ — Astro content collections (MDX), paginated index, [slug] renderer, RSS feed, reading time, rehype autolink headings.
  • Updated (2026-05-02, SUR-233): LinkedDevicesModal added as the UI surface for useKeyManagement device list + removal. deviceLabel.js (getDeviceLabel) captures a "Platform · Browser" label at passkey-enrolment time and stores it in wrapped_key_blobs.device_label. Lockout guards prevent removing the current device or last wrapper.
  • Updated (2026-05-02): AddIdeaSheet, AddIdeaBanner, and createCustomIdea.js added for the note-form idea-tagging flow. useFocusRestore hook added for accessible focus management in modals and bottom sheets.
  • Updated (2026-05-02, SUR-237): Transfer-v1 TTL enforcement moved server-side. Migration 0014 adds DEFAULT ((extract(epoch from now()) * 1000)::bigint) to wrapped_key_blobs.created_at (server-stamp) and introduces select_fresh_transfer_blob SECURITY DEFINER RPC. redeemDeviceTransfer now calls the RPC; client-side TRANSFER_MAX_AGE_MS / TRANSFER_MAX_FUTURE_SKEW_MS removed. TRANSFER_SANITY_FUTURE_SKEW_MS (5 min) retained as client defense-in-depth.
  • Updated (2026-05-04, SUR-303 refactor/sur-303-extract): anthropic-proxy prompt extraction: TRANSCRIBE_SYSTEM, TRANSCRIBE_SYSTEM_ASCII_NOTE, and buildDiscoverSystem moved from index.ts into prompts.ts; GREAT_IDEAS now sourced from src/constants.js (Deno cross-tree import — no content change). prompts.test.ts added under __tests__/. The SUR-300 eval harness imports from prompts.ts to prevent prod/harness drift.
  • Updated (2026-05-04, SUR-303): App.jsx partial refactor landed. Extracted: AppGates (gate ladder), ShellNavigation (ShellTabRow + ShellBottomNav), NoteActionOverlay (note long-press flow), IdeaActionOverlay (idea long-press flow), UnsyncedChangesModal (unsynced-changes sign-out confirmation). New hooks: useUserProfile (fetches user_profiles from Supabase), useMediaQuery (SSR-safe matchMedia). In-app help system (SUR-209 Phase 2) shipped: HelpCenterScreen at /help, HelpArticle at /help/:slug, HelpArticleBody (markdown renderer), src/help/manifest.js (reads docs/getting-started/*.md at build time). PolicyPage now serves /policies/:kind in-app via Termly embed. New production deps: react-markdown, remark-gfm, remark-directive.
  • Updated (2026-05-03, SUR-233 follow-up): fetchDeviceList in useKeyManagement.js wrapped in useCallback([session?.user?.id]) to give it a stable identity across renders, stopping an unbounded Supabase refetch loop that fired each time Settings opened. fetchDeviceList now also sets activeWrapperCount from rows.length so a single SELECT covers both the trigger-count and the modal data. App.jsx fires fetchDeviceList via useEffect whenever Settings opens for an enrolled, online user. SettingsModal drops the activeWrapperCount > 0 gate — the “X devices linked” trigger now renders for any enrolled user (including returning users whose unlock went through tryEagerKeyRestore). Regression test: src/test/useKeyManagement-stability.test.jsx.
  • Updated (2026-05-03, chore): VitePress srcExclude in docs/.vitepress/config.mjs extended with outreach/** and spikes/** to prevent VitePress from trying to render internal research docs during the Cloudflare Pages help-site build. .gitignore tightened — adds supabase/.temp/, deno.lock, temp_*.js, .claude/scheduled_tasks.lock, .claude/settings*.json, and local-only directories ref/, design/, test-inputs/.
  • Updated (2026-05-06, SUR-327): Entitlement model + non-blocking gate instrumentation (Phase A of SUR-235). New shared resolver supabase/functions/_shared/entitlements.ts → getResolvedEntitlements is the SSoT for tier-based capabilities; me-entitlements Edge Function exposes it; src/hooks/useEntitlements.js consumes it. Image uploads now go through the new image-upload Edge Function so the storage write and the user_profiles.image_storage_bytes_used counter increment share a transaction (atomic via the adjust_image_storage_bytes SECURITY DEFINER RPC in migration 0017). Four non-quota gate sites (re_discover / custom_ideas / max_devices / image_storage) emit would_have_blocked PostHog events but do not alter behaviour — SUR-262 observes; SUR-328 flips to enforcing. Full data-architecture coverage in Data Architecture → Entitlements.
  • Updated (2026-05-10, SUR-360 + SUR-361): Auth dispatch hardening landed in two parts. SUR-360 turned on email confirmation ([auth.email] enable_confirmations = true) and wired Resend SMTP ([auth.email.smtp] → smtp.resend.com:587, sender hello@surfc.app, RESEND_API_KEY env var; the same key the approve-waitlist Edge Function already uses). Branded confirmation template added at supabase/email-templates/confirm.html; templates are NOT auto-synced to the dashboard, so re-uploading via Studio → Authentication → Templates is a manual step on every change. SUR-361 added Cloudflare Turnstile via a new useTurnstile hook (src/hooks/useTurnstile.js): a script-tag loader with module-level dedup + 5 s deadline poll, appearance: 'interaction-only' (invisible unless Cloudflare needs a challenge). Server-side wired in [auth.captcha] (provider turnstile, secret from SUPABASE_AUTH_CAPTCHA_SECRET); the client bundle reads the site key from VITE_TURNSTILE_SITE_KEY. The widget lives inside EmailSignInFlow only, not at the AuthScreen top level, because GoTrue only enforces [auth.captcha] on dispatch endpoints (signup / password / OTP / recover) and supabase-js v2.100.0’s signInWithOAuth silently drops options.captchaToken (it forwards only redirectTo / scopes / queryParams / skipBrowserRedirect to _handleProviderSignIn; see node_modules/@supabase/auth-js/dist/main/GoTrueClient.js:669-677 and :2030-2042). Threading a token into Google/Apple sign-in would be cargo-cult: UX latency for no security gain. Local-dev secrets template added at supabase/.env.example (Cloudflare’s documented always-pass test secret 1x0000000000000000000000000000000AA is fine for the captcha secret; Inbucket on port 54324 intercepts auth email regardless so the Resend key is unused locally). CI: .github/workflows/db-test.yml now exports both env vars before supabase start so the local stack boots cleanly with captcha enabled.
  • Updated (2026-05-10, SUR-365): Marketing CTA flip to direct signup. Every surfc.app “Request invitation” CTA in Nav.astro / Hero.astro / ClosingCta.astro is now “Sign up free” → app.surfc.app; data-cta values renamed to nav_signup / hero_signup / closing_signup (these are the first event in SUR-367’s rebuilt funnel). src/components/WaitlistForm.astro and tests/waitlist.spec.ts + tests/fixtures/cors-server.mjs deleted; src/pages/waitlist.astro replaced with a noindex “we’ve opened up” sunset page (3 s meta http-equiv="refresh" to app.surfc.app). astro.config.mjs filters /waitlist/ out of the sitemap. Dead .wl-* rules dropped from marketing.css. Hero helper text rewritten to “New to Surfc? Create a free account. Already have an account? Sign in.” ClosingCta lede rewritten to “Surfc is free to use. Pro is for heavier readers when you’re ready.” PUBLIC_SUPABASE_URL / PUBLIC_SUPABASE_ANON_KEY retained in .env.example because src/lib/checkout.ts still uses them for /pricing (only PUBLIC_WAITLIST_ENDPOINT was dropped). The waitlist-signup Edge Function and PostHog funnel rebuild are SUR-367.
  • Updated (2026-05-10, SUR-364): Self-service signup cutover. The client-side waitlist gate (PendingApprovalScreen, fetchWaitlistStatus in src/supabase.js, waitlistStatus state in useAuth.js, the waitlistStatus !== 'approved' block in AppGates.jsx) was removed and requestEmailOtp flipped from shouldCreateUser: false to the supabase-js default (true) so unknown emails create a new auth.users row on first OTP request. The dashboard “Allow new users to sign up” toggle is the live cutover lever (SUR-359, manual). The marketing-side waitlist surface (Astro /waitlist route + CTAs) lands separately under SUR-365; the waitlist-signup Edge Function sunset and PostHog funnel rebuild are SUR-367. The waitlist_requests table, match_waitlist_on_signup trigger, and approve-waitlist Edge Function remain alive (admin UI dependency).
  • Updated (2026-05-07, SUR-85): Stripe billing infrastructure landed — three new Edge Functions (create-checkout-session, stripe-webhook, create-billing-portal-session) plus migration 0019_stripe_billing.sql (additive user_profiles columns + stripe_webhook_events ledger). Client wrappers createCheckoutSession / createBillingPortalSession added in src/supabase.js. Stripe SDK pinned at npm:stripe@^22, API version pinned at 2026-04-22.dahlia. Lapsed-Pro 30-day grace window enforced at read time in getResolvedEntitlements rather than via a daily cron flip — user_tier stays 'pro' after subscription.deleted and the resolver returns 'free' only once subscription_current_period_end + 30 days has passed. Webhook idempotency via INSERT … ON CONFLICT DO NOTHING against stripe_webhook_events.event_id. Full coverage in the Stripe billing flows section below.
  • Updated (2026-05-11, SUR-370): Intent-aware auth-landing surface. AuthScreen.jsx reads ?intent=signup from window.location.search and renders signup-framed UI (overlay heading “Create your account” + “Free forever. No credit card.” reassurance, pre-opened email sheet, secondary CTA “Already have an account? Sign in”, bottom-sheet title swap to “Sign up with email”). Default landing (?intent=signin or no query) renders existing framing. Both intent and open_email are stripped via history.replaceState post-consumption — only when the value was meaningful (intent ∈ {signup,signin}, open_email === '1') so unrecognised values are not silently consumed. Shared OAuth + email-OTP primitives extracted from AuthScreen into src/components/AuthControls.jsx (props: intent, autoOpenEmail, onError) so SUR-357 can compose them into UpgradeAuthGate later. Unified telemetry event: the previously-shipped upgrade_gate_viewed { intent: 'upgrade', interval, ref } from SUR-352 is renamed to auth_landing_viewed with an added surface: 'upgrade_gate' prop; AuthScreen fires the same event with surface: 'authscreen' and intent ∈ {signup,signin}. Both surfaces spread seven canonical UTM/click-ID keys (utm_source/medium/campaign/term/content, gclid, fbclid) from the URL via src/lib/utmParams.js → readUtmParams(). Marketing-side preservation via surfc-web/src/scripts/preserveUtm.ts (wired in BaseLayout.astro) — forwards UTMs from the marketing page’s URL onto every <a data-cta> whose origin matches app.surfc.app, re-attaches on astro:page-load so view-transitions don’t silently break attribution. App.jsx’s catch-all unauth redirect (<Navigate to="/signin" replace />) now preserves window.location.search so a stray landing at /?intent=signup carries the param across to AuthScreen. 
Marketing CTAs (Nav.astro, Hero.astro, ClosingCta.astro, waitlist.astro) now compose signupUrl() from a shared surfc-web/src/lib/appUrl.ts helper — signupUrl() returns ${appUrl}/signin?intent=signup (deep-links past the redirect). data-cta values from SUR-365 (nav_signup, hero_signup, closing_signup, waitlist_legacy_signup) preserved for SUR-367 funnel parity.
  • Updated (2026-05-14, SUR-351): Stripe billing — silent-failure fix on the customer-id race + invoice deprecation. A user paid for Pro and their user_profiles row never flipped to 'pro' because four parallel create-checkout-session invocations each created their own Stripe customer (race in the unconditional SELECT → CREATE → UPDATE); last UPDATE won; the user clicked a Checkout URL pointing at a different customer id; the webhook arrived with a subscription.customer that no longer matched any profile and silently no-op’d with a 200. Three layers landed in lockstep: (L1) ensureStripeCustomer now uses a conditional UPDATE filtered on stripe_customer_id IS NULL so parallel callers converge on a single customer id; race-loser issues a best-effort stripe.customers.del for its leaked customer and re-reads the survivor. (L2) stripe-webhook/handler.ts adds resolveProfileForSubscription — when findProfileByCustomer misses, it falls back to sub.metadata.user_id (always set via subscription_data.metadata.user_id at Checkout), self-heals stripe_customer_id on the resolved profile, and emits a structured [stripe-webhook] self_healed_stripe_customer_id log. The fallback is guarded against stale at-least-once deliveries: if the profile’s stripe_subscription_id is non-null and doesn’t match the event’s sub.id, the path no-ops with a skip_self_heal_subscription_id_mismatch log so a delayed customer.subscription.deleted for an old sub cannot clobber a user who is currently active on a newer one. The unrecoverable miss now also logs profile_not_found_for_customer (previously silent — that silence was the property the original outage exploited). Applied to both handleSubscriptionUpsert and handleSubscriptionDeleted. (L3) New getSubscriptionIdFromInvoice resolves invoice.parent.subscription_details.subscription (the post-2024-10-28 location for the pinned 2026-04-22.dahlia API version) → legacy top-level → lines[0].subscription. 
Each level handles both string id and expanded-object form. Without L3 every invoice.paid / invoice.payment_failed had been returning no_op: invoice_without_subscription in production — dunning recovery and past_due mirroring were both broken. Fix is in supabase/functions/create-checkout-session/index.ts and supabase/functions/stripe-webhook/handler.ts; covered by 14 new Deno tests (race-winner / race-loser, metadata self-heal across created/updated/deleted, the subscription-id mismatch guard with three branches, four invoice payload shapes). Open question / out of scope: why create-checkout-session was called 4× per click (StrictMode? hydration script in surfc-web/src/pages/pricing.astro? missing debounce on the Upgrade button?) is tracked separately. CI gap noted: .github/workflows/edge-functions-test.yml does not currently type-check or test create-checkout-session or stripe-webhook, so these tests live outside the CI gate until the workflow is broadened.
  • Updated (2026-05-28, SUR-501): Stripe billing: create-checkout-session self-heals a stale stripe_customer_id. Follow-up to the SUR-500 incident (a live STRIPE_SECRET_KEY against a test-mode cus_… → stripe.checkout.sessions.create throws resource_missing → caught as internal_error 500, an unrecoverable upgrade dead-end). When sessions.create throws a Stripe resource_missing naming the customer (deleted, or a test↔live mode switch), the function now clears the stored id, recreates via ensureStripeCustomer, and retries the session once; a resource_missing on the price (a config error recreation can’t fix) or a second failure surfaces unchanged as internal_error (single retry, no loop). The clear is conditional on the observed stale id (.eq('stripe_customer_id', staleId), mirroring the SUR-351 race-safe write) so two concurrent healers cannot clobber a freshly-recreated id or mint a second customer; ensureStripeCustomer then re-reads the winner. The heal emits a loud [create-checkout-session] customer self-heal: <old> -> <new> log so a deploy-wide mode mismatch (a test key wrongly deployed to prod would 404 every live customer) is detectable rather than silently absorbed. Pairs with the SUR-351 webhook self-heal: the retried session preserves subscription_data.metadata.user_id, so resolveProfileForSubscription maps the resulting subscription back to the user even though the customer id changed, with no Surfc/Stripe divergence. Self-heal deliberately does not reconcile entitlements (that was SUR-500 step 4). In supabase/functions/create-checkout-session/index.ts; 5 new Deno tests (recreate+retry success, price-miss passthrough, no-loop guard, clear-failure bubbles, concurrent-healer no-clobber). Open assumption: relies on the pinned stripe@^22 setting param: 'customer' (vs line_items[0][price] for a bad price) on these errors; to be confirmed by the Stripe test-mode E2E (the open billing-reviewer HOLD on the code PR).
  • Updated (2026-05-20, SUR-371): Auth dispatch hardening — defensive guard on VITE_TURNSTILE_SITE_KEY. Closes the silent-failure mode that surfaced on 2026-05-10 immediately after the SUR-364 cutover, where a Netlify production deploy ran with VITE_TURNSTILE_SITE_KEY unset (the env var was added to Netlify after the build). Vite bundled undefined, the client captchaReady = !TURNSTILE_SITE_KEY || Boolean(turnstile.token) short-circuited to true, every email-OTP submission fired requestEmailOtp(email, undefined), and GoTrue rejected each one with captcha protection: request disallowed (no captcha_token found) — 100% of email signups broken; the form looked perfectly functional. Three coordinated guards now make recurrence impossible: (1) Explicit dev/prod branching in EmailSignInFlow (the env read moved from module scope to component body, so the prod path is testable via vi.stubEnv): const captchaConfigured = Boolean(siteKey); const captchaReady = captchaConfigured ? Boolean(turnstile.token) : import.meta.env.DEV — dev convenience preserved, prod never silently bypasses. (2) Visible runtime banner scoped to the email form only (Google OAuth stays functional — captcha-exempt by design per SUR-361): “Sign-in is temporarily unavailable — captcha is not configured. Please contact support.” with role="alert" for screen-reader parity, plus a Strict-Mode-guarded console.error once on mount. Deliberately not a PostHog event — per GATING.md §5 Q3 that would have escalated the whole change off the CE surface. (3) Build-time hard fail via a new Vite plugin in vite-plugins/sur-371-turnstile-key-guard.js that throws when command === 'build' && mode === 'production' && !env.VITE_TURNSTILE_SITE_KEY. Uses Vite’s loadEnv() so the check mirrors what import.meta.env will see in the bundle (shell env first, then .env.production, then .env.local, then .env); reading process.env directly would have false-failed when the key is set only in .env.production for local prod-mode builds. 
Plugin extracted to its own module so the regression test in src/test/vite-config-turnstile-guard.test.js can import it without side-effecting through vite.config.js. Secondary UX polish: while the Turnstile widget is loading, the Send button now shows “Verifying your browser…” with a spinner rather than a silent disabled state (the secondary annoyance documented in the incident report). CI workflow patched (.github/workflows/build-test.yml) to set VITE_TURNSTILE_SITE_KEY: 1x00000000000000000000AA (Cloudflare’s documented always-pass test site key, the same pattern as SUPABASE_AUTH_CAPTCHA_SECRET=1x0000000000000000000000000000000AA in supabase/.env.example) so the new guard doesn’t break CI, while the real key continues to be set only on Netlify. Documentation: .env.example gained a VITE_TURNSTILE_SITE_KEY section calling out the Netlify-prod requirement (the original ticket asked to update supabase/.env.example, but that file holds the server secret SUPABASE_AUTH_CAPTCHA_SECRET; VITE_ vars live in the PWA root .env, so the doc target was corrected); CLAUDE.md Auth dispatch hardening bullet declares the var a client-side concern that MUST be set in Netlify prod env vars. CE persona pass (security/regression/ux personas at SHA 0db06f5): 0 BLOCKERs, 5 CONCERNs + 1 NIT all addressed (added role="alert", removed paternalistic email-input disable, added error-state truth-table test, extracted plugin + regression test, added load-bearing test-order comment). Test coverage: 18/18 in auth-screen-email.test.jsx (13 existing + 5 new SUR-371 cases: prod-misconfig with handler-level guard via fireEvent.submit, dev no-key, “Verifying your browser…” pending state, configured+token render, and the (captchaConfigured && !token && error) cell flagged by the regression-reviewer); 6/6 in the new vite-config-turnstile-guard.test.js; full surfc suite 836 tests pass, 0 failures.
  • Updated (2026-05-12, SUR-357 + SUR-368): Open-signup cutover completed. SUR-357 replaced the link-CTA bounce at UpgradeAuthGate.jsx with inline <AuthControls /> composition: cold visitors clicking “Get Pro” on surfc.app/pricing now reach OAuth redirect / OTP issuance in one click without losing the price-echo + Pro-upgrade chrome. AuthControls.jsx gained three optional host-override props: sheetTitle (BottomSheet header override), secondaryLabel (email-opener button label override), and onCtaClick(method) (synchronous callback fired at the CTA commit moment, i.e. Google click or sheet open; AuthScreen leaves it unset). UpgradeAuthGate wires onCtaClick to preserve the frozen SUR-367 upgrade_gate_auth_started { interval, ref, method } contract verbatim across the refactor: the fire-site moved from inline handler to callback but the payload shape is identical. The upgrade gate also gained a “← Back to pricing” escape hatch that round-trips non-null UTM keys to surfc.app/pricing (inverse of the marketing-side preserveUtm.ts forward path). SUR-368 closed out the documentation: new docs/getting-started/account-setup.md help-center article (signup → email verification → passkey enrolment → first capture), new docs/runbooks/open-signup-rollback.md rollback runbook (five-step ladder from “close the door in Supabase Studio” to full revert), and a CLAUDE.md Monetisation paragraph consolidating the open-signup model across SUR-360 (Resend SMTP email verification), SUR-361 (Cloudflare Turnstile on dispatch endpoints), SUR-362 (handle_new_auth_user() trigger, replacing the waitlist-bound bootstrap), SUR-363 (relaxed RLS: migration 0021 dropped the waitlist-EXISTS predicate from books/notes/custom_ideas/storage.objects, restoring 0001 ownership-only access), SUR-364 (PWA waitlist gate removal), SUR-365 (marketing CTA flip), and SUR-367 (waitlist-signup Edge Function sunset + conversion-funnel rebuild).
New users hit the on_auth_user_created trigger → handle_new_auth_user() (migration 0020), which upserts a user_profiles row with month_limit = 50 default, user_tier = 'free', and a name derived from raw_user_meta_data.full_name (fallback: split_part(email, '@', 1)). The bespoke admin-side spend panel that would consolidate cost monitoring is in flight under SUR-230 at intranet.surfc.app/admin/spend; until it lands, abuse-spend detection is manual (Supabase usage panel + Anthropic console).
  • Updated (2026-05-27, SUR-308): Re-discover now gates through the shared client-side PII review. New usePiiReview controller (src/hooks/usePiiReview.js) owns the single PiiReviewSheet + guardrail telemetry (path: transcribe | discover | rediscover); constructed once in App.jsx and injected into useNoteForm (capture transcribe/discover paths) and useNoteActions.rediscoverIdeas. The latter previously sent an existing note’s text to the discover endpoint with no PII review (the FUNCTIONAL.md §10 gap). On re-discover: Cancel aborts (no AI call), Redact sends asterisked text to the API only (stored note unchanged), Send proceeds.
  • Updated (2026-05-30, SUR-316): Prompt versioning v1. The three managed-AI system prompts (and their model / max_tokens) move out of prompts.ts code constants into a new service-role-only prompts table (migration 0027), loaded per call by anthropic-proxy via getPrompt() (promptLoader.ts) with a ~5-min in-memory cache that fails open to the SUR-303 constants on any DB read error — fallback rows record prompt_version = 0 and surface _promptFallback: true to the client (mirroring the _failOpen guardrail contract; the client emits a prompt_fallback PostHog event alongside guardrail_fail_open). Each successful managed call records prompt_name + prompt_version to a new per-call ai_usage_events table (migration 0029, written best-effort beside the unchanged ai_usage_daily quota upsert) and to a new server-side PostHog event managed_ai_call_succeeded (prompt_name / prompt_version / fail_open / prompt_fallback). prompts.ts was refactored to extract DISCOVER_CANON / DISCOVER_WITH_CUSTOM_TEMPLATE + renderDiscoverWithCustom, keeping buildDiscoverSystem byte-identical (snapshot tests guard it; the seed in 0028 is byte-identical to the constants). Migration 0030 additionally locks down EXECUTE on the pre-existing upsert_ai_usage RPC (previously callable by any authenticated user via PostgREST). Full data-architecture coverage in Data Architecture → Prompt versioning; operations in the Prompt versioning runbook.
  • Updated (2026-06-28, SUR-711): Retired the ?intent=signup signup framing on AuthScreen.jsx: there is now one signed-out landing for every entry point (no “Create your account” overlay; the email sheet no longer auto-opens from a marketing landing, which had covered the SUR-706 Terms/Privacy consent notice). AuthScreen still consumes ?open_email=1 (UpgradeAuthGate deep-link). Telemetry: auth_landing_viewed from authscreen now carries the constant intent: 'signin' (shape unchanged) and the app_signup_started anchor was deleted (replaced by separate instrumentation). Cross-repo: surfc-web/src/lib/appUrl.ts signupUrl() → /signin. The AuthControls intent="signup" prop is unchanged (still used by UpgradeAuthGate).
  • Updated (2026-06-28, SUR-673): Rebranded the Supabase auth-email templates (supabase/email-templates/{magic-link,confirm}.html) to the braird identity — wordmark, cool-paper/forest-ink palette, braird.app links/footer; Go-template vars ({{ .ConfirmationURL }} / {{ .Token }} / {{ .Email }}) and the SUR-705 link+code structure preserved (no {{ if }} branch, no braces in comments). [auth.email.smtp] sender flips to admin_email = "hello@braird.app" / sender_name = "Deji @ braird" for local-CLI parity; the prod From identity is owned by SUR-674 and prod ships only via the manual dashboard paste, gated on braird.app Resend domain verification + warmup (SUR-669). Reset/recovery template is out of scope (braird auth is passwordless OTP/magic-link).
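
The read-time grace window from the SUR-85 entry above can be sketched as follows. This is a minimal illustration, not the shipped getResolvedEntitlements: the helper name resolveEffectiveTier is hypothetical, subscription_current_period_end is assumed to be an ISO-8601 timestamp string, and a missing period end is assumed to mean there is nothing to lapse.

```javascript
// 30-day grace window applied when entitlements are read (no cron flip).
const GRACE_WINDOW_MS = 30 * 24 * 60 * 60 * 1000;

// Hypothetical helper: derives the effective tier from a user_profiles-like row.
// Assumption: `subscription_current_period_end` is an ISO-8601 string or null.
function resolveEffectiveTier(profile, nowMs = Date.now()) {
  // user_tier stays 'pro' after subscription.deleted, so only 'pro' rows need the check.
  if (profile.user_tier !== 'pro') return 'free';

  const periodEnd = profile.subscription_current_period_end;
  // Assumption: no recorded period end means there is no lapse to evaluate.
  if (periodEnd == null) return 'pro';

  // 'free' only once period end + 30 days has strictly passed.
  const graceEndsAtMs = Date.parse(periodEnd) + GRACE_WINDOW_MS;
  return nowMs > graceEndsAtMs ? 'free' : 'pro';
}
```

Because the decision is recomputed on every read, the stored user_tier never needs a scheduled job to downgrade it; an active subscriber always has a period end in the future and therefore resolves to 'pro'.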

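The three-level invoice lookup from the SUR-351 entry (L3) can be sketched like this. It is an illustration under assumptions rather than the code in stripe-webhook/handler.ts: the toId helper is hypothetical, “legacy top-level” is assumed to mean invoice.subscription, and “lines[0]” is assumed to mean invoice.lines.data[0].

```javascript
// Hypothetical helper: accepts either a string id or an expanded object
// carrying an `id`, and returns the id (or null when neither form is present).
function toId(ref) {
  if (typeof ref === 'string' && ref.length > 0) return ref;
  if (ref && typeof ref === 'object' && typeof ref.id === 'string') return ref.id;
  return null;
}

// Sketch of the fallback order described in the change summary:
// 1. invoice.parent.subscription_details.subscription (newer API versions)
// 2. invoice.subscription (assumed legacy top-level field)
// 3. invoice.lines.data[0].subscription (assumed shape of "lines[0]")
function getSubscriptionIdFromInvoice(invoice) {
  return (
    toId(invoice?.parent?.subscription_details?.subscription) ??
    toId(invoice?.subscription) ??
    toId(invoice?.lines?.data?.[0]?.subscription) ??
    null
  );
}
```

Returning null, rather than throwing, lets the caller decide whether a missing subscription is a legitimate no-op (a one-off invoice) or something worth logging.
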
Evidence gathered from source files only, per AGENTS.md.
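
The SUR-371 captcha branching in the change summary can be read as a small truth table. The sketch below restates the two expressions quoted there as pure functions so the difference is testable in isolation; the function names and the showMisconfiguredBanner field are hypothetical, and the banner condition is inferred from the description of the runtime guard rather than copied from EmailSignInFlow.

```javascript
// Pre-SUR-371 expression quoted in the change summary: a missing site key
// short-circuits to "ready", which silently bypassed captcha in production.
function legacyCaptchaReady({ siteKey, token }) {
  return !siteKey || Boolean(token);
}

// Post-SUR-371 branching: only a dev build may proceed without a site key.
// `isDev` stands in for import.meta.env.DEV (passed in so this runs anywhere).
function resolveCaptchaState({ siteKey, token, isDev }) {
  const captchaConfigured = Boolean(siteKey);
  const captchaReady = captchaConfigured ? Boolean(token) : Boolean(isDev);
  // Assumption: the role="alert" banner shows when a non-dev build has no key.
  const showMisconfiguredBanner = !captchaConfigured && !isDev;
  return { captchaConfigured, captchaReady, showMisconfiguredBanner };
}
```

With no site key in a production build, legacyCaptchaReady returns true (the outage), while resolveCaptchaState reports captchaReady: false and raises the banner flag.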

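The SUR-316 prompt loader (a ~5-minute in-memory cache that fails open to the SUR-303 constants) can be sketched as below. This is an assumption-laden illustration, not promptLoader.ts: createPromptLoader, fetchRow, and fallbacks are hypothetical names, a missing row is treated like a read error, and fallback results are deliberately not cached so the next call retries the database.

```javascript
const PROMPT_CACHE_TTL_MS = 5 * 60 * 1000; // ~5 minutes, per the change summary

// Hypothetical factory. `fetchRow(name)` is an injected async DB read and
// `fallbacks` maps prompt names to the in-code constants.
function createPromptLoader({ fetchRow, fallbacks, ttlMs = PROMPT_CACHE_TTL_MS, now = Date.now }) {
  const cache = new Map(); // name -> { row, expiresAt }

  return async function getPrompt(name) {
    const hit = cache.get(name);
    if (hit && hit.expiresAt > now()) return hit.row; // fresh cache hit

    try {
      const row = await fetchRow(name);
      // Assumption: a missing row is handled the same way as a read error.
      if (!row) throw new Error(`prompt row missing: ${name}`);
      cache.set(name, { row, expiresAt: now() + ttlMs });
      return row;
    } catch {
      // Fail open: serve the constant, marked with version 0 and the
      // _promptFallback flag that is surfaced to the client.
      return { ...fallbacks[name], prompt_version: 0, _promptFallback: true };
    }
  };
}
```

Injecting fetchRow and now keeps the sketch testable without a database or real clock; whether the production loader serves stale cache entries on error is not stated in the source, so this version always falls back to the constants.
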
Component overview

  • Presentation layer: src/App.jsx wires state hooks, owns the route tree, and renders the shell layout. A partial refactor (SUR-303, 2026-05-04) has extracted: AppGates (sequential gate ladder: encryption check → enrollment → unlock → migration → device-add/transfer-redeem), ShellNavigation (ShellTabRow for desktop/tablet top-nav + ShellBottomNav for mobile), NoteActionOverlay (note long-press flow via forwardRef), IdeaActionOverlay (idea long-press flow via forwardRef), and UnsyncedChangesModal (replaces window.confirm for unsynced sign-out). Other notable components: AddIdeaSheet (idea-tagging bottom sheet in the note form), AddIdeaBanner (post-creation description prompt), LinkedDevicesModal (SUR-233, E2EE device list + removal), PolicyPage (Termly embed for /policies/:kind), HelpCenterScreen (in-app help index at /help), HelpArticle (in-app article at /help/:slug), and HelpArticleBody (markdown renderer with VitePress-link rewriting and callout support).
  • Local services: Hooks in src/hooks/ couple Dexie persistence and UI state: useAuth (auth + sync), useUI (navigation/view state), useNoteForm (note creation + AI ingestion), useSettings (custom ideas/import/export), useNoteActions (mutations + rediscovery), useKeyManagement (E2EE key lifecycle + device management — fetchDeviceList is memoized via useCallback([session?.user?.id]) so the App.jsx Settings-open effect does not trigger an unbounded refetch loop [SUR-233]), useUserProfile (fetches user_profiles Supabase row), useMediaQuery (SSR-safe matchMedia subscriptions), useFocusRestore (accessible focus return on modal close), and useToast.
  • Local store: Dexie (src/db.js) contains entity tables, schema migrations, CRUD helpers, outbox queue, and merge/import logic; it is the single source of truth when offline.
  • Cloud boundary: src/supabase.js wraps supabase-js auth (Google OAuth + email-OTP sign-in/sign-up via signInWithGoogle / requestEmailOtp / verifyEmailOtp), CRUD, and storage; versioned SQL under supabase/migrations/*.sql codifies tables/RLS/storage, while scripts/schema-contract.js + scripts/check-schema.js verify drift. Self-service signup is open post-SUR-364: requestEmailOtp omits shouldCreateUser so an unknown email creates a new auth.users row alongside the OTP send, matching the dashboard “Allow new users to sign up” toggle (manual flip in SUR-359). Auth dispatch hardening (SUR-360 + SUR-361, 2026-05-10): Resend SMTP (smtp.resend.com:587, sender hello@surfc.app) is wired via [auth.email.smtp] in supabase/config.toml and depends on the RESEND_API_KEY env var. Cloudflare Turnstile is wired via [auth.captcha] (provider turnstile, SUPABASE_AUTH_CAPTCHA_SECRET server-side, VITE_TURNSTILE_SITE_KEY client-side); the useTurnstile hook (src/hooks/useTurnstile.js) is mounted inside the email-OTP flow only because GoTrue only enforces [auth.captcha] on dispatch endpoints and supabase-js v2.100.0’s signInWithOAuth silently drops options.captchaToken (node_modules/@supabase/auth-js/dist/main/GoTrueClient.js:669-677). Post-SUR-371 (2026-05-20): VITE_TURNSTILE_SITE_KEY is now defended on two surfaces — vite-plugins/sur-371-turnstile-key-guard.js hard-fails npm run build when the var is unset in production (using loadEnv so the check matches what import.meta.env sees), and EmailSignInFlow in src/components/AuthControls.jsx shows a visible role="alert" banner + console.error if a production build somehow still reaches the runtime without it. The old !TURNSTILE_SITE_KEY short-circuit that silently bypassed captcha in production is gone; the dev convenience (skip captcha when the key is absent and import.meta.env.DEV is true) is preserved by explicit branching. 
Auth-landing surfaces (SUR-370 + SUR-357, 2026-05-11/12; SUR-711 retired the signup framing, 2026-06-28): AuthScreen.jsx is the single signed-out landing. SUR-711 removed the ?intent=signup framing (no signup overlay; the email sheet no longer auto-opens from a marketing landing, which had been sitting over the Terms/Privacy consent notice). It still consumes ?open_email=1 to pre-open the sheet for UpgradeAuthGate’s “Use a different email” deep-link. UpgradeAuthGate.jsx composes the same <AuthControls /> inline (SUR-357), with three optional host-override props (sheetTitle, secondaryLabel, onCtaClick) that let it render upgrade-specific copy (“Sign up to continue to Pro” / “Continue with email”) and own auth-funnel telemetry without the shared primitive growing host-specific event knowledge. The OAuth + email-OTP primitives live in src/components/AuthControls.jsx; the SUR-367 funnel event upgrade_gate_auth_started { interval, ref, method } is fired by UpgradeAuthGate via the onCtaClick callback (host-owned contract, AuthScreen leaves the prop unset). A single auth_landing_viewed PostHog event covers both landing surfaces: { surface: 'authscreen', intent: 'signin', ...utm } from AuthScreen (SUR-711 fixed intent to the constant 'signin' and removed the app_signup_started anchor) and { surface: 'upgrade_gate', intent: 'upgrade', interval, ref, ...utm } from UpgradeAuthGate (replaces the previously-shipped upgrade_gate_viewed). UTM/click-ID dimensions come from src/lib/utmParams.js → readUtmParams() on the app side and are forwarded across the cross-domain hop by surfc-web/src/scripts/preserveUtm.ts (wired in BaseLayout.astro); the upgrade gate also round-trips non-null UTM keys back to surfc.app/pricing via a subtle “← Back to pricing” link, so a cold visitor who clicks away from the gate retains campaign attribution on re-entry.
  • AI ingestion: src/api.js performs Anthropic calls; src/ingest/*.js convert raw manual/photo input into normalized notes consumed by useNoteForm. All managed requests go through invokeAnthropicProxy (src/supabase.js), which invokes the supabase/functions/anthropic-proxy/ Edge Function to run the call server-side and record usage.
  • Client-side safety: src/safety/index.js runs checkStructuredPii against note text before managed AI submission and returns PiiMatch[] for the BottomSheet review UI. Policy is warn-not-block at v1.4. Stub seams exist for SUR-246 on-device prompt injection and NER PII detection. Every managed-AI entry point gates through the shared usePiiReview controller (reviewBeforeSend / awaitPiiReview) before submission: transcription and discovery in the capture flow (useNoteForm), and re-discovery of an existing note (useNoteActions.rediscoverIdeas). One review sheet and one telemetry contract (path: transcribe | discover | rediscover) serve all paths (SUR-308). On re-discover, Redact sends the asterisked text to the API only; the stored note is unchanged.
  • Public help page: HowItWorksPage.jsx and LandingPage.jsx were removed by SUR-215 (2026-04-23). Marketing content lives in surfc-web/.
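
The warn-not-block PII review described above can be pictured with a small sketch. This is illustrative only: the patterns, the match shape, and the names findStructuredPii and redactForSend are hypothetical, and the real checkStructuredPii rules are not shown in this document.

```javascript
// Hypothetical patterns for illustration; the real rule set may differ.
const PII_PATTERNS = [
  { kind: 'email', regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { kind: 'phone', regex: /\+?\d[\d\s().-]{7,}\d/g },
];

// Returns one match per hit: { kind, start, end, value }, ordered by position.
function findStructuredPii(text) {
  const matches = [];
  for (const { kind, regex } of PII_PATTERNS) {
    for (const m of text.matchAll(regex)) {
      matches.push({ kind, start: m.index, end: m.index + m[0].length, value: m[0] });
    }
  }
  return matches.sort((a, b) => a.start - b.start);
}

// Builds the copy that is sent to the API when the user picks Redact.
// Masking preserves length, so match offsets stay valid while replacing.
// The input string (the stored note text) is never modified.
function redactForSend(text, matches) {
  let out = text;
  for (const m of matches) {
    out = out.slice(0, m.start) + '*'.repeat(m.end - m.start) + out.slice(m.end);
  }
  return out;
}
```

An empty result from findStructuredPii would let the request proceed without a review sheet; a non-empty one would feed the Edit / Redact / Send choice.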

Runtime flows

sequenceDiagram
participant UI as React UI (App + components)
participant Hooks as Hooks (useAuth/useNoteForm)
participant Dexie as Dexie (src/db.js)
participant Outbox as Outbox Queue
participant Supabase as Supabase (DB+Storage)
participant Proxy as anthropic-proxy (Edge Fn, SUR-10)
participant Anthropic as Anthropic API
UI->>Hooks: user actions (capture, edit, sync)
Note over Hooks,Anthropic: BYOK path
Hooks->>Anthropic: callTranscribeImage / callDiscoverIdeas (if BYOK key)
Anthropic-->>Hooks: JSON transcription/tags
Note over Hooks,Proxy: Managed path
Hooks->>Proxy: POST /anthropic-proxy {action, payload}
Proxy->>Anthropic: managed Anthropic call
Anthropic-->>Proxy: response + usage tokens
Proxy->>Supabase: upsert ai_usage_daily (service role)
Proxy-->>Hooks: JSON transcription/tags
Hooks->>Dexie: saveBook/saveNote/saveCustomIdea
Hooks->>Outbox: enqueue(table,payload) when offline/failure
Hooks->>Supabase: upsert entities + uploadImage when online
Supabase-->>Hooks: merged datasets (fetchAllCloud)
Hooks->>Dexie: mergeCloudRecords + downloadImage()
Dexie-->>UI: loadAll() for rendering

Analytics events

Both surfaces share a single PostHog project so the landing → app sign-up funnel is visible in one stream. Funnel rebuild is tracked in SUR-367 (the SUR-365 marketing CTA flip is the first event: data-cta="*_signup" → app_cta_clicked).

  • Marketing surface (surfc-web/): pageview and app_cta_clicked events fire from src/layouts/BaseLayout.astro (the global data-cta click handler).
  • App surface (surfc/): product-level events fire from hooks and components. Notable: help_index_opened / help_article_viewed (in-app help), guardrail_fail_open (safety pipeline), note_created / ideas_discovered (core capture loop). PostHog is initialized via posthog-js with the project token from VITE_PUBLIC_POSTHOG_PROJECT_TOKEN.
  • /policies/:kind is served in-app by PolicyPage (Termly embed). A Netlify 301 from old app.surfc.app/policies/* bookmarks also exists in netlify.toml as a fallback.

State orchestration (local state vs. sync vs. AI)

  • useAuth initializes on app load: retrieves the Supabase session, hydrates Dexie via loadAll, subscribes to auth state changes, tracks online/offline events, flushes the outbox, merges Supabase data, and exposes books, notes, customIdeas, apiKey, cloudWrite, and syncFromCloud (src/hooks/useAuth.js).
  • useUI holds presentation-only signals (mobileView, selectedIdea, search text, modal/lightbox booleans) plus derived collections like ideaCounts (src/hooks/useUI.js).
  • useNoteForm bridges all three layers: it stores capture/AI state locally, persists notes/books to Dexie, kicks off Supabase uploads via cloudWrite + uploadImage, and calls Anthropic through callTranscribeImage / callDiscoverIdeas (src/hooks/useNoteForm.js).
  • useSettings manages Dexie metadata (API key, custom ideas), triggers Supabase writes via cloudWrite, and coordinates UI state for imports and tag renames (src/hooks/useSettings.js).
  • useNoteActions edits/deletes notes in Dexie, mirrors the mutations to Supabase, and re-tags notes with Anthropic when rediscoverIdeas fires (src/hooks/useNoteActions.js). Since SUR-308, rediscoverIdeas runs the client-side PII review (via the injected usePiiReview controller) before the Anthropic call; Edit aborts, Redact sends asterisked text to the API only, Send proceeds.

Offline sync pipeline

  • Schema probe: syncFromCloud runs probeCloudSchema once per session before touching Supabase; errors set syncStatus and block the rest of the pipeline until fixed (src/hooks/useAuth.js, src/supabase.js).
  • Write path: Hooks call saveBook/saveNote/saveCustomIdea to persist locally. useAuth.cloudWrite attempts Supabase upserts immediately; failures or offline states result in an outbox entry via enqueue (src/db.js, src/hooks/useAuth.js).
  • Flush path: On login or reconnect, syncFromCloud loads queued entries (getOutbox), collapses multiple edits via collapseOutboxItems, and retries Supabase upserts; success deletes IDs (src/supabase.js).
  • Merge path: After flushing, the client downloads every table via fetchAllCloud, runs mergeCloudRecords inside a Dexie transaction, then reloads UI state with loadAll. Missing note images are fetched via downloadImage and stored as imageDataUrl for offline lightbox support (src/hooks/useAuth.js).
  • Conflict model: mergeCloudRecords uses updated_at vs. updatedAt to enforce last-write-wins, respecting tombstones, and preserving local image previews when overwriting metadata (src/db.js). Tests src/test/outbox.test.js and src/test/sync.test.js assert these semantics.
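
The flush and merge steps above can be sketched in a few lines. This is a minimal illustration, not the code in src/db.js: the function names, the record fields (text, deleted), and the tie-breaking rule (local wins on equal timestamps) are assumptions.

```javascript
// Flush path: keep only the newest queued write per (table, id).
// Items are assumed to be in enqueue order, so later entries win.
function collapseQueuedWrites(items) {
  const latest = new Map();
  for (const item of items) {
    latest.set(`${item.table}:${item.payload.id}`, item);
  }
  return [...latest.values()];
}

// Merge path: last-write-wins for a single record.
// Cloud rows use updated_at, local rows use updatedAt.
function mergeRecord(local, cloud) {
  const incoming = {
    id: cloud.id,
    text: cloud.text,
    deleted: Boolean(cloud.deleted), // hypothetical tombstone flag
    updatedAt: cloud.updated_at,
  };
  if (!local) return incoming;
  // Assumption: on a tie the local copy is kept.
  if (Date.parse(incoming.updatedAt) <= Date.parse(local.updatedAt)) return local;
  // Cloud is newer: take its metadata but keep the local image preview.
  return { ...incoming, imageDataUrl: local.imageDataUrl };
}
```

Because a newer tombstone replaces the local record like any other newer write, a delete on one device is not undone by stale data on another.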

Concentration zones & coupling

  • src/App.jsx: Houses all layout logic, mobile/desktop views, modal toggles, and long-press orchestration. Any new screen or state change must flow through this file, increasing fragility.
  • src/hooks/useNoteForm.js: Blends UX state, Dexie writes, Supabase uploads, AI calls (BYOK + managed), and navigation callbacks, making it the de facto domain service layer.
  • src/db.js: Centralizes schema defs, migrations, CRUD helpers, outbox, merge logic, and import/export, so subtle changes ripple across persistence, sync, and settings flows.
  • src/supabase.js: Acts as the sole cloud boundary; any change to Supabase schema or auth must be reflected here plus the migrations/contract (supabase/migrations/*.sql, scripts/schema-contract.js).

Managed AI Proxy

Managed Anthropic calls now flow through the live supabase/functions/anthropic-proxy/ Edge Function. The function is split across four source files:

  • index.ts — request routing, quota enforcement, guardrail orchestration, usage recording.
  • prompts.ts (SUR-303) — single source of truth for all system prompts: TRANSCRIBE_SYSTEM, TRANSCRIBE_SYSTEM_ASCII_NOTE, and buildDiscoverSystem(customIdeas). Imports GREAT_IDEAS from src/constants.js via Deno cross-tree import. The SUR-300 eval harness also imports from here so prompt edits are tested in the same file they ship from — no copy-paste drift.
  • guardrail.ts — Azure AI Content Safety wrapper (shield, moderate). See Safety guardrail pipeline below.
  • parseJson.ts — JSON extraction utilities for Anthropic response parsing.

Notable behaviors backed by code:

  • callTranscribeImage / callDiscoverIdeas pass the Supabase session to invokeAnthropicProxy when no API key is saved (src/api.js, src/supabase.js).
  • The Edge Function validates the caller via supabase.auth.getUser() using the anon key, then resolves the caller’s monthly cap from user_profiles.month_limit (single source of truth, default 50) plus a valid allocation_override via getResolvedMonthlyLimit, compares it to the live ai_usage_daily sum from getMonthlyUsage (cross-action total — no per-action-type carve-out, so transcribe and discover calls count toward the same shared cap), and records successful usage (request count + tokens) through the upsert_ai_usage RPC (supabase/functions/anthropic-proxy/index.ts, supabase/migrations/0004_ai_usage_tracking.sql, 0005_upsert_ai_usage_fn.sql, 0009_user_profiles.sql, 0012_per_user_quota_limits.sql — SUR-92).
  • HTTP 429 responses include { error: 'rate_limit' } and emit a server-side managed_ai_rate_limit_hit PostHog event via fireAndForget() — truly non-blocking, so PostHog network latency does not pad the 429 response time (SUR-92 fix, 2026-04-29); invokeAnthropicProxy maps the 429 to an Error with isRateLimit = true so useNoteForm/useNoteActions can show the upgrade message (src/supabase.js, src/hooks/useNoteForm.js).
  • A missing user_profiles row fails closed with 500 profile_missing rather than silently using a fallback constant — the trigger and approval-Edge-Function insert paths cover every authorised user, so this is treated as an unrecoverable invariant break.
  • The managed and BYOK paths share the same UI/state flows; canonicalisation and Dexie persistence happen identically after the proxy returns.
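
The quota decision described above reduces to a small pure function. The sketch below is an approximation under stated assumptions: quotaDecision is a hypothetical name, allocation_override handling is omitted because its exact semantics are not spelled out here, and the boundary (blocking once usage reaches the limit) is a guess.

```javascript
// Assumed fallback matching the documented column default of 50.
const DEFAULT_MONTH_LIMIT = 50;

// profile: the caller's user_profiles row, or null when it is missing.
// usageRows: this month's ai_usage_daily rows, across all action types.
function quotaDecision(profile, usageRows) {
  // A missing profile fails closed rather than using a fallback cap.
  if (!profile) {
    return { status: 500, body: { error: 'profile_missing' } };
  }
  const limit = profile.month_limit ?? DEFAULT_MONTH_LIMIT;
  // Cross-action total: transcribe and discover share one cap.
  const used = usageRows.reduce((sum, row) => sum + row.request_count, 0);
  if (used >= limit) {
    return { status: 429, body: { error: 'rate_limit' } };
  }
  return { status: 200, remaining: limit - used };
}
```

Summing before comparing is what makes the cap shared: one transcribe call and one discover call consume two units of the same monthly allowance.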

Safety guardrail pipeline (SUR-242)

Every managed request passes through a multi-stage safety pipeline in anthropic-proxy before usage is recorded. Stages run in order and an early BLOCK returns HTTP 422 without incrementing the user’s quota:

  1. Prompt fencing: untrusted delimiters wrap user-supplied content inside the system prompt, preventing injected text from escaping the data plane.
  2. Input shield: shield(userPrompt, []) calls Azure AI Content Safety Prompt Shields (text:shieldPrompt). BLOCK on attackDetected → HTTP 422 { error: 'guardrail_blocked', detector: 'prompt_shield', leg: 'input' }.
  3. Anthropic call: proceeds only if input is clean.
  4. Transcription-post Spotlighting: after a transcribe response, the transcribed text is passed back to shield() as a document to catch indirect injection embedded in the scanned page. BLOCK on attackDetected → HTTP 422 { detector: 'spotlight', leg: 'transcription_post' }.
  5. Output harm moderation: moderate(transcribedText) calls Azure text:analyze. Categories: Hate, Violence, Sexual, SelfHarm. Severity threshold ≥5 (loosened from default 4 after false-positives on literary/technical content in the SUR-242 spike). BLOCK → HTTP 422 { detector: 'harm', leg: 'output' }. Skipped on discover, because Discover output is a constrained allow-list of Idea names with no free-form text.
  6. Usage recording: only fires on a full clean pass.

All Azure calls fail-open on 5xx, network error, timeout, or missing config — a guardrail outage never blocks the core loop. _failOpen: true in the response body signals the client to emit a guardrail_fail_open PostHog event. Required env vars: AZURE_CONTENT_SAFETY_ENDPOINT, AZURE_CONTENT_SAFETY_KEY, AZURE_CONTENT_SAFETY_API_VERSION.
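
As a rough illustration of the fail-open rule combined with the severity threshold, the output-moderation verdict could be expressed as below. The outcome shape and the name harmVerdict are assumptions made for this sketch; guardrail.ts may structure its results differently.

```javascript
const HARM_SEVERITY_THRESHOLD = 5;

// outcome is assumed to be:
//   { ok: true, categories: [{ category, severity }] } when Azure answered, or
//   { ok: false } on 5xx, network error, timeout, or missing config.
function harmVerdict(outcome, threshold = HARM_SEVERITY_THRESHOLD) {
  // Fail open: a guardrail outage must never block the core loop.
  if (!outcome.ok) {
    return { blocked: false, failOpen: true };
  }
  const hit = outcome.categories.find((c) => c.severity >= threshold);
  if (hit) {
    return {
      blocked: true,
      failOpen: false,
      status: 422,
      body: { detector: 'harm', leg: 'output' },
    };
  }
  return { blocked: false, failOpen: false };
}
```

A caller would skip usage recording when blocked is true and copy failOpen into the response as _failOpen so the client can emit guardrail_fail_open.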

Stripe billing flows (SUR-85)

Three Supabase Edge Functions implement paid Pro subscriptions. Data-plane columns + the idempotency ledger are documented in Data Architecture → Stripe billing; this section covers the request flows.

| Function | Auth | Purpose |
| --- | --- | --- |
| create-checkout-session | JWT | Resolves / lazy-creates user_profiles.stripe_customer_id, then opens a Stripe Checkout Session and returns { url }. The persist step is a conditional UPDATE filtered on stripe_customer_id IS NULL (SUR-351) so N parallel callers converge on a single customer id; the race-loser deletes its own leaked Stripe customer (best-effort) and re-reads the survivor. Self-heal (SUR-501): if sessions.create throws resource_missing on the customer, the stored id is cleared (conditionally, on the observed stale id, so a concurrent healer’s fresh id is never clobbered), recreated via ensureStripeCustomer, and the session is retried once; a resource_missing on the price, or a second failure, surfaces as internal_error. |
| stripe-webhook | Stripe signature only (no JWT) | Verifies signature, dedupes via stripe_webhook_events, dispatches to per-event-type handlers in handler.ts. Requires verify_jwt = false in supabase/config.toml under [functions.stripe-webhook]: Stripe deliveries don’t carry a Supabase JWT, so the gateway’s default JWT check would 401 every event before the handler runs. Same pattern as approve-waitlist and waitlist-signup. |
| create-billing-portal-session | JWT | Looks up stripe_customer_id, opens a Stripe Billing Portal session, returns { url }. 404 no_customer if the user has never started a checkout. |
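
To make the conditional-UPDATE convergence concrete, here is an in-memory stand-in for the persist step. It is a sketch only: claimCustomerId and settleCustomer are hypothetical names, a plain object plays the role of the user_profiles row, and deleteLeaked stands in for the best-effort Stripe customer deletion.

```javascript
// Stand-in for:
//   UPDATE user_profiles SET stripe_customer_id = $1
//   WHERE user_id = $2 AND stripe_customer_id IS NULL
//   RETURNING stripe_customer_id
// Returns the id when the row was updated, or null when no row matched.
function claimCustomerId(profile, candidateId) {
  if (profile.stripe_customer_id !== null) return null;
  profile.stripe_customer_id = candidateId;
  return candidateId;
}

// Every caller ends up with the same customer id, whoever wins the race.
function settleCustomer(profile, candidateId, deleteLeaked) {
  const claimed = claimCustomerId(profile, candidateId);
  if (claimed) return claimed; // won the race
  try {
    deleteLeaked(candidateId); // best-effort cleanup of the unused customer
  } catch {
    // Cleanup failures are ignored; the survivor is still authoritative.
  }
  return profile.stripe_customer_id; // re-read the survivor
}
```

In the real database the atomicity comes from the single UPDATE statement; the in-memory version only mirrors its observable outcome.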

Stripe SDK pinned at npm:stripe@^22; API version pinned in each function constructor at 2026-04-22.dahlia. Stripe API versions are deliberately pinned so a Stripe-side default change cannot shift the wire shape under us — when we want a newer one we bump it explicitly.

Webhook subscriptions registered on the Stripe endpoint (exactly five): customer.subscription.created, customer.subscription.updated, customer.subscription.deleted, invoice.paid, invoice.payment_failed.

sequenceDiagram
participant UI as React app
participant CCS as create-checkout-session
participant Stripe as Stripe Checkout
participant Hook as stripe-webhook
participant DB as Supabase DB
participant Resolver as getResolvedEntitlements
UI->>CCS: { interval, successUrl, cancelUrl } + JWT
CCS->>DB: SELECT stripe_customer_id from user_profiles
alt no customer yet
CCS->>Stripe: customers.create({ email, metadata.user_id })
CCS->>DB: UPDATE … WHERE stripe_customer_id IS NULL RETURNING stripe_customer_id
alt won the race
Note over CCS,DB: returning row → use the new customer
else lost the race
CCS->>Stripe: customers.del(leaked) (best-effort)
CCS->>DB: SELECT stripe_customer_id (survivor)
end
end
CCS->>Stripe: checkout.sessions.create(...)
Stripe-->>CCS: { url }
CCS-->>UI: { url }
UI->>Stripe: redirect to checkout
Stripe-->>UI: success/cancel redirect
Stripe->>Hook: subscription.created (signed)
Hook->>Hook: stripe.webhooks.constructEventAsync(rawBody)
Hook->>DB: INSERT stripe_webhook_events ON CONFLICT DO NOTHING
alt fresh event
Hook->>DB: UPDATE user_profiles SET user_tier='pro', tier_started_at=now(), subscription_status, current_period_end
else duplicate
Hook-->>Stripe: 200 deduplicated
end
Hook-->>Stripe: 200 received
UI->>Resolver: refresh useEntitlements()
Resolver->>DB: SELECT user_tier, subscription_status, current_period_end, ...
Resolver-->>UI: { tier: 'pro', capabilities: PRO_DEFAULTS }

Webhook handler — idempotency, signature, dispatch

stripe-webhook/index.ts is the entry point and handler.ts is the pure dispatcher (split for testability — dispatchEvent is covered by Deno tests with synthetic events).

Two non-obvious Deno specifics:

  • Raw body before parsing. req.text() is read before any JSON handling. Stripe’s signature is over the exact bytes Stripe sent; re-serialising via req.json() would break the HMAC.
  • constructEventAsync, not constructEvent. The sync variant uses Node’s crypto module, which Deno does not provide. The async variant uses WebCrypto.
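
The raw-body point is easy to demonstrate without Stripe at all. In this small sketch (the payload is made up), parsing and re-serialising a body changes its bytes, which is enough to invalidate a signature computed over the original.

```javascript
// Hypothetical payload with the kind of spacing a sender might emit.
const rawBody = '{"id": "evt_123",  "type": "invoice.paid"}';

// What the handler would hold if it parsed first and re-serialised later.
const reserialised = JSON.stringify(JSON.parse(rawBody));

// An HMAC is computed over exact bytes, so any difference breaks verification.
const bytesMatch = rawBody === reserialised;
```

Here bytesMatch is false even though both strings describe the same JSON value, which is why the handler keeps the text from req.text() for verification.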

The dispatch table:

| Event type | DB writes | user_tier impact |
| --- | --- | --- |
| customer.subscription.created / customer.subscription.updated | stripe_subscription_id, subscription_status, subscription_current_period_end | Flip to 'pro' (and stamp tier_started_at) on first transition into active / trialing for a 'free' user. Never demote. |
| customer.subscription.deleted | subscription_status='canceled', subscription_current_period_end | Never written. Demotion is owned by the resolver at read time (deferred 30-day grace). |
| invoice.paid | subscription_status='active' unless current status is canceled | None. Re-confirms active after an invoice.payment_failed → past_due resolution. The canceled-skip guard prevents a delayed retry from resurrecting a cancelled subscription past the grace cutoff. |
| invoice.payment_failed | subscription_status='past_due' unless current status is canceled | None. Stripe handles dunning before subscription.deleted. Same canceled-skip rationale as invoice.paid. |

After dispatch, processed_at is stamped on the ledger row. Failures during dispatch (DB errors, an unrecognised subscription_status from a future Stripe API version that violates the CHECK constraint) return 500 to Stripe, which retries. The retry’s upsert hits the existing row; ledgerInsertEvent then re-reads processed_at to decide:

  • processed_at IS NULL → prior delivery’s dispatch crashed before the marker was written. Re-run dispatch on this retry — the user_profiles state change from the original event never landed, and skipping would silently strand a paid subscription event.
  • processed_at IS NOT NULL → the original delivery already applied fully. Short-circuit to a 200 deduplicated; Stripe stops retrying.

The dispatch handlers are individually idempotent (UPDATEs with stable inputs derived from the Stripe event), so a re-run on a row that partially applied before crashing converges to the correct end state without compensating logic. The corollary: never rely on side effects that aren’t expressed via the user_profiles update payload, since they won’t be re-run on reclaim.
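
The reclaim behaviour can be modelled with an in-memory ledger. This is a sketch under assumptions: handleDelivery is a hypothetical name, a Map stands in for stripe_webhook_events, and the response bodies are simplified strings.

```javascript
// ledger: Map of event id -> { processedAt }, standing in for the ledger table.
// dispatch: the idempotent handler for this event; it may throw.
function handleDelivery(ledger, eventId, dispatch) {
  const existing = ledger.get(eventId);
  // The original delivery fully applied: short-circuit.
  if (existing && existing.processedAt !== null) {
    return { status: 200, body: 'deduplicated' };
  }
  // Fresh event: insert the row with no processed marker yet.
  if (!existing) {
    ledger.set(eventId, { processedAt: null });
  }
  try {
    dispatch(); // also runs on a retry whose first attempt crashed
  } catch (err) {
    return { status: 500, body: 'dispatch_failed' }; // Stripe will retry
  }
  ledger.get(eventId).processedAt = new Date().toISOString();
  return { status: 200, body: 'received' };
}
```

Because the marker is written only after dispatch succeeds, a crash leaves processedAt null and the next retry re-runs the handler instead of being swallowed as a duplicate.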

Customer-id race + self-heal (SUR-351)

Two failure-mode hardenings on the data plane between Checkout and the webhook, both responding to a 2026-05-08 incident where a paid Pro upgrade silently no-op’d the user_profiles flip:

  1. Race-safe customer creation in create-checkout-session. The original SELECT → CREATE → UPDATE sequence let N parallel invocations (e.g. a double-click that fired four create-checkout-session POSTs in one second per the production logs) each see stripe_customer_id IS NULL, each create their own Stripe customer, and each persist; last UPDATE wins. The user clicked whichever Checkout URL they got, paid on a customer the DB no longer pointed at, and the webhook arrived with a subscription.customer that didn’t match any profile. The fix replaces the persist with a conditional UPDATE … WHERE user_id = ? AND stripe_customer_id IS NULL plus a RETURNING stripe_customer_id. Race-winners get their own customer back; race-losers see no row, best-effort stripe.customers.del their leaked customer (no charges attached, but keeps the customer list tidy), re-read the survivor, and use it. Every parallel call converges on the same customer id, so whichever Checkout URL the user clicks settles the subscription on the customer that’s actually linked in user_profiles.

  2. metadata.user_id fallback + self-heal in the webhook. Even with L1, any future regression (or any other source of stripe_customer_id drift) would silently no-op via the defensive profile_not_found_for_customer → 200 path. handler.ts now exposes resolveProfileForSubscription: when findProfileByCustomer misses, it falls back to sub.metadata.user_id — which create-checkout-session already populates via subscription_data.metadata.user_id for every subscription opened through our flow. On hit, it writes the correct stripe_customer_id back to the resolved profile and emits a structured [stripe-webhook] self_healed_stripe_customer_id log so any L1 regression is loud, not silent. Applied to both handleSubscriptionUpsert and handleSubscriptionDeleted.

    The fallback is guarded against stale at-least-once deliveries. If the metadata lookup resolves a profile whose stripe_subscription_id is non-null and doesn’t match event.data.object.id, the fallback no-ops with a skip_self_heal_subscription_id_mismatch log instead of self-healing. Without this guard, a delayed customer.subscription.deleted (or duplicate created/updated) for an older sub_id carrying the same metadata.user_id would resolve the live profile, the caller would apply the stale event, and a user active on a newer subscription would have their stripe_customer_id overwritten and their subscription_status flipped to 'canceled' — silently and permanently. The guard scope is narrow on purpose: it protects the metadata-fallback path only. Direct customer-id matches carry stronger evidence (the live customer link points at this profile) and remain unguarded; a separate hardening of handleSubscriptionDeleted against delayed cancels on a matched customer-id is tracked outside SUR-351.

    The unrecoverable miss (both lookups fail) now also logs profile_not_found_for_customer with the full event context, converting the previously silent failure mode into a searchable signal in Supabase Function logs.

  3. Invoice subscription-id extraction across API versions. The invoice.subscription field was deprecated in Stripe API version 2024-10-28 in favour of invoice.parent.subscription_details.subscription. This codebase pins 2026-04-22.dahlia, so the legacy field is no longer populated and every invoice.paid / invoice.payment_failed had been returning no_op: invoice_without_subscription in production — silently breaking the dunning recovery (active re-confirmation) and past_due mirroring paths. The new getSubscriptionIdFromInvoice helper resolves the new path first (string id or expanded object), falls back to the legacy top-level field as defence-in-depth for any older payload variant, and finally to lines[0].subscription as a last resort.
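
The three-step lookup order in item 3 might look roughly like the following. This is not the getSubscriptionIdFromInvoice source: the function names are hypothetical, and reading the line items through lines.data assumes the usual Stripe list shape.

```javascript
// Accepts either a string id or an expanded object carrying an id.
function subscriptionIdOf(value) {
  if (typeof value === 'string') return value;
  if (value && typeof value.id === 'string') return value.id;
  return null;
}

function extractSubscriptionId(invoice) {
  return (
    // 1. Current location of the subscription reference.
    subscriptionIdOf(invoice.parent?.subscription_details?.subscription) ??
    // 2. Legacy top-level field, kept as defence-in-depth.
    subscriptionIdOf(invoice.subscription) ??
    // 3. Last resort: the first line item (assumed list shape: lines.data).
    subscriptionIdOf(invoice.lines?.data?.[0]?.subscription) ??
    null
  );
}
```

A null result corresponds to the invoice_without_subscription no-op, which should now be limited to invoices that are not tied to a subscription.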

Verification at the line of fire: the self_healed_stripe_customer_id log should be absent under normal operation. Its presence in production logs means L1 has regressed somewhere we didn’t anticipate and the metadata fallback has carried the system. Likewise, skip_self_heal_subscription_id_mismatch should be rare and worth auditing if it appears more than incidentally — it usually means the profile holds a stale stripe_subscription_id (a previous subscription that should have been cleared on cancel-and-resubscribe) or that Stripe is replaying very old events for a subscription the user no longer has.
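
The lookup order and the stale-delivery guard can be summarised as one decision function. The sketch is illustrative: resolveForSubscription and its argument shape are hypothetical and only approximate what resolveProfileForSubscription does in handler.ts.

```javascript
// byCustomer: profile found via stripe_customer_id, or null.
// byMetadataUserId: profile found via sub.metadata.user_id, or null.
// eventSubscriptionId: event.data.object.id of the incoming event.
function resolveForSubscription({ byCustomer, byMetadataUserId, eventSubscriptionId }) {
  // Direct customer-id matches are applied without the guard.
  if (byCustomer) {
    return { action: 'apply', profile: byCustomer, selfHealed: false };
  }
  if (!byMetadataUserId) {
    return { action: 'skip', log: 'profile_not_found_for_customer' };
  }
  const linked = byMetadataUserId.stripe_subscription_id;
  // Guard: never let an event for an older subscription rewrite a live profile.
  if (linked != null && linked !== eventSubscriptionId) {
    return { action: 'skip', log: 'skip_self_heal_subscription_id_mismatch' };
  }
  return {
    action: 'apply',
    profile: byMetadataUserId,
    selfHealed: true,
    log: 'self_healed_stripe_customer_id',
  };
}
```

Only the selfHealed branch would write stripe_customer_id back to the profile, which is why its log line doubles as a regression signal.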

Client integration

src/supabase.js exposes thin wrappers that match the invokeAnthropicProxy shape — fresh-session refresh, JWT in Authorization, typed errors:

export async function createCheckoutSession({ interval, successUrl, cancelUrl })
export async function createBillingPortalSession({ returnUrl })

createBillingPortalSession surfaces a 404 no_customer response as a typed error.isNoCustomer = true so the UI can route the user to the upgrade flow rather than show a generic error toast. SUR-86 / SUR-88 / SUR-89 wire these into the Plans page, Settings deliberate-upgrade, and the quota-exhausted popup respectively — those issues do not redefine the contract.
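
A typed error of this kind could be built as in the sketch below. The helper names toBillingError and nextStepFor are hypothetical, as is the response body shape ({ error: 'no_customer' }); the wrappers in src/supabase.js may construct their errors differently.

```javascript
// Turns a non-2xx Edge Function response into an Error with typed flags.
function toBillingError(status, body) {
  const err = new Error(body?.error ?? `billing_error_${status}`);
  err.status = status;
  err.isNoCustomer = status === 404 && body?.error === 'no_customer';
  return err;
}

// UI-side routing: send users without a customer to the upgrade flow.
function nextStepFor(err) {
  return err.isNoCustomer ? 'upgrade' : 'toast';
}
```

Checking a boolean flag keeps UI code free of status-code and string comparisons, so the wire format can change without touching every call site.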

Required Edge Function secrets

STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET, STRIPE_PRICE_ID_PRO_MONTHLY, STRIPE_PRICE_ID_PRO_ANNUAL. Documented under docs/supabase-setup.md → Edge Function secrets / Stripe. The shared loader supabase/functions/_shared/env.ts → requireEnv throws on missing values so a misconfigured deploy fails on the first request rather than silently mid-request.

Out of scope for SUR-85

  • UI surfaces. SUR-86 (Plans page on surfc-web/), SUR-88 (Settings deliberate upgrade), SUR-89 (quota-exhausted popup) own the upgrade / manage-subscription buttons. SUR-85 stops at the function + wrapper boundary.
  • Trial mechanics. Deferred to SUR-262 / SUR-235. The webhook treats trialing exactly like active, so a future trial wiring needs no webhook change.
  • PostHog billing events. checkout_started, subscription_created, subscription_canceled are the relevant CTA-side events; they fire from the UI surfaces above, not from the Edge Functions.
  • CORS / body-parsing helper extraction. Tracked in SUR-339 — the three new functions still define their own CORS_HEADERS constants alongside anthropic-proxy, waitlist-signup, delete-account, me-entitlements, and image-upload. SUR-339 unifies them into _shared/cors.ts + _shared/http.ts post-SUR-85.

In-app help system (SUR-209 Phase 2, SUR-303, 2026-05-04)

The help system is built directly into the React app and reads the same markdown sources as the public VitePress site at help.surfc.app:

  • src/help/manifest.js — reads docs/getting-started/*.md at Vite build time via import.meta.glob({ query: '?raw', eager: true }). Parses frontmatter, orders articles per ARTICLE_ORDER, and exports articles, indexArticle, getArticle(slug), and isForwardReference(slug). FORWARD_REFERENCES is an allow-list of slugs that don’t have articles yet — links to them render as disabled <span> rather than broken routes.
  • /help → HelpCenterScreen: searchable article index; fires the help_index_opened PostHog event. Public route (no auth required).
  • /help/:slug → HelpArticle → HelpArticleBody: renders article markdown with react-markdown + remark-gfm + remark-directive; rewrites VitePress-style absolute links (/getting-started/<slug>) and relative links to SPA /help/<slug> <Link> elements; callouts (:::note, :::tip, :::warning, :::danger) are rendered via HelpCallout through src/help/calloutPlugin.js.
  • Both routes are outside AppGates — accessible to unauthenticated sessions (e.g. from the AuthScreen or a cold URL share).
  • Help articles bundled at build time: changes to docs/getting-started/*.md require a new Vite build to appear in the app.
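
The link-rewriting rule can be sketched as a pure function. Everything here is an assumption for illustration: rewriteHelpLink is a hypothetical name, the accepted relative forms (./slug and slug.md) are guesses, and the result objects merely hint at what a renderer would turn into a Link or a disabled span.

```javascript
// forwardRefs: Set of slugs that have no article yet.
function rewriteHelpLink(href, forwardRefs) {
  // Anything with a URL scheme or protocol-relative prefix is left alone.
  if (/^[a-z][a-z0-9+.-]*:/i.test(href) || href.startsWith('//')) {
    return { kind: 'external', href };
  }
  // Accept /getting-started/<slug>, ./<slug>, <slug>, each with optional .md.
  const match = href.match(/^(?:\/getting-started\/|\.\/)?([a-z0-9-]+)(?:\.md)?\/?$/i);
  if (!match) return { kind: 'external', href };
  const slug = match[1];
  // Forward references render as disabled text instead of a broken route.
  if (forwardRefs.has(slug)) return { kind: 'disabled', slug };
  return { kind: 'internal', to: `/help/${slug}` };
}
```

Keeping the rewrite pure makes it straightforward to unit-test against the same markdown the VitePress site consumes.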

Confirmed / Assumption / Unknown

  • Confirmed:
    • Offline-first architecture with Dexie + outbox queue, last-write-wins merge, and Supabase sync, as evidenced by src/hooks/useAuth.js, src/db.js, and tests.
    • All managed AI calls flow through anthropic-proxy; direct browser-to-Anthropic communication was removed (SUR-91). The only remaining BYOK trace is a one-time apiKey cleanup in db.js.
    • Supabase serves as both the operational datastore and auth boundary per supabase/migrations/0001_initial_schema.sql and the schema contract tooling.
    • Azure AI Content Safety guardrail pipeline (guardrail.ts) is wired into anthropic-proxy with fail-open semantics (SUR-242, 2026-05-01).
    • Client-side PII detection via src/safety/ (SUR-242); warn-not-block; SUR-246 NER stubs in place.
    • In-app help system at /help and /help/:slug (SUR-303, 2026-05-04); reads docs/getting-started/*.md at build time. Public routes — no auth required.
    • /how-it-works client-side route — removed by SUR-215 (2026-04-23); marketing lives in surfc-web/.
  • Assumption:
    • Future ingest adapters (Readwise/Kindle) will plug into the existing adapter interface; comments exist but no code.
    • Multi-device concurrency is limited to eventual consistency; there is no mention of CRDTs or per-field merges.
    • Background sync relies entirely on the user opening the app; no service worker sync hooks are implemented.
    • Azure Content Safety severity threshold 5 was validated against Surfc-shaped content in the SUR-242 spike; revisit if harm-category false-positive rate changes at higher usage scale.
  • Unknown:
    • No evidence of push notifications, share targets, or OS integrations beyond the PWA manifest.
    • Error telemetry/monitoring stack is not referenced.
    • How image storage is cleaned up across deletes remains unspecified.