System Architecture
Refreshed 2026-05-06 (SUR-284). All known post-SUR-215 / post-SUR-218 inaccuracies resolved. The “Shared marketing/help route analytics” section has been replaced with accurate content. See the CHANGE SUMMARY below for the full update history.
SUR-261 (2026-04-26) restructured src/AuthScreen.jsx into a three-zone layout with a “Use a different account” bottom sheet and added an email-OTP sign-in path alongside Google OAuth. At that point email-OTP pinned shouldCreateUser: false (sign-in only, no self-signup); SUR-364 later reopened self-signup, as recorded in the CHANGE SUMMARY. Hero markup was extracted into src/components/HomeHero.jsx, shared between AuthScreen and HomeScreen. Platform detection lives in src/lib/platform.js (isStandaloneOrTwa()). The branded magic-link template is at supabase/email-templates/magic-link.html. The “Auth boundary” subsection below covers the new helpers.
CHANGE SUMMARY
- Updated: Reflected the live managed Anthropic proxy path, quota enforcement, and rate-limit surfacing (src/api.js, src/supabase.js, supabase/functions/anthropic-proxy/index.ts).
- Updated: Captured the ai_usage_daily ledger and its impact on runtime flows and risks (supabase/migrations/0004_ai_usage_tracking.sql, 0005_upsert_ai_usage_fn.sql).
- Updated (2026-04-23/24): SUR-215 stripped LandingPage.jsx, WaitlistScreen.jsx, HowItWorksPage.jsx; SUR-238 reworked capture screen nav. Staleness banner updated accordingly.
- Updated (2026-04-26, SUR-261): AuthScreen restructured into three-zone layout; new email-OTP sign-in path (requestEmailOtp / verifyEmailOtp in src/supabase.js); platform helper at src/lib/platform.js; shared HomeHero component; branded supabase/email-templates/magic-link.html (manual dashboard upload).
- Updated (2026-04-29, SUR-92): Per-user managed-AI quota: getResolvedMonthlyLimit reads from user_profiles.month_limit (default 50) + optional allocation_override; getMonthlyUsage now sums request_count across all action types (no per-action carve-out); emitTelemetry is fire-and-forget via fireAndForget() so the 429 response is not delayed by PostHog network behaviour.
- Updated (2026-04-30): public/.well-known/assetlinks.json now declares delegate_permission/common.get_login_creds so the Android TWA can share passkeys/credentials with the browser via Google Smart Lock. A second SHA-256 fingerprint (debug/test keystore) was also registered.
- Updated (2026-05-01, SUR-242): Azure AI Content Safety guardrails added to anthropic-proxy (guardrail.ts): Prompt Shields (direct injection), Spotlighting (indirect injection in transcribed text), and harm classifiers (output moderation, threshold ≥5). All fail-open. Prompt fencing added to system prompts. Client-side PII detection module (src/safety/) added: structured-regex warn-not-block, stub seams for SUR-246.
- Updated (2026-05-01, SUR-256): surfc-web/ now has a founder blog at /blog/: Astro content collections (MDX), paginated index, [slug] renderer, RSS feed, reading time, rehype autolink headings.
- Updated (2026-05-02, SUR-233): LinkedDevicesModal added as the UI surface for useKeyManagement device list + removal. deviceLabel.js (getDeviceLabel) captures a "Platform · Browser" label at passkey-enrolment time and stores it in wrapped_key_blobs.device_label. Lockout guards prevent removing the current device or last wrapper.
- Updated (2026-05-02): AddIdeaSheet, AddIdeaBanner, and createCustomIdea.js added for the note-form idea-tagging flow. useFocusRestore hook added for accessible focus management in modals and bottom sheets.
- Updated (2026-05-02, SUR-237): Transfer-v1 TTL enforcement moved server-side. Migration 0014 adds DEFAULT ((extract(epoch from now()) * 1000)::bigint) to wrapped_key_blobs.created_at (server-stamp) and introduces the select_fresh_transfer_blob SECURITY DEFINER RPC. redeemDeviceTransfer now calls the RPC; client-side TRANSFER_MAX_AGE_MS / TRANSFER_MAX_FUTURE_SKEW_MS removed. TRANSFER_SANITY_FUTURE_SKEW_MS (5 min) retained as client defense-in-depth.
- Updated (2026-05-04, SUR-303 refactor/sur-303-extract): anthropic-proxy prompt extraction: TRANSCRIBE_SYSTEM, TRANSCRIBE_SYSTEM_ASCII_NOTE, and buildDiscoverSystem moved from index.ts into prompts.ts; GREAT_IDEAS now sourced from src/constants.js (Deno cross-tree import, no content change). prompts.test.ts added under __tests__/. The SUR-300 eval harness imports from prompts.ts to prevent prod/harness drift.
- Updated (2026-05-04, SUR-303): App.jsx partial refactor landed. Extracted: AppGates (gate ladder), ShellNavigation (ShellTabRow + ShellBottomNav), NoteActionOverlay (note long-press flow), IdeaActionOverlay (idea long-press flow), UnsyncedChangesModal (unsynced-changes sign-out confirmation). New hooks: useUserProfile (fetches user_profiles from Supabase), useMediaQuery (SSR-safe matchMedia). In-app help system (SUR-209 Phase 2) shipped: HelpCenterScreen at /help, HelpArticle at /help/:slug, HelpArticleBody (markdown renderer), src/help/manifest.js (reads docs/getting-started/*.md at build time). PolicyPage now serves /policies/:kind in-app via Termly embed. New production deps: react-markdown, remark-gfm, remark-directive.
- Updated (2026-05-03, SUR-233 follow-up): fetchDeviceList in useKeyManagement.js wrapped in useCallback([session?.user?.id]) to give it a stable identity across renders, stopping an unbounded Supabase refetch loop that fired each time Settings opened. fetchDeviceList now also sets activeWrapperCount from rows.length so a single SELECT covers both the trigger-count and the modal data. App.jsx fires fetchDeviceList via useEffect whenever Settings opens for an enrolled, online user. SettingsModal drops the activeWrapperCount > 0 gate: the “X devices linked” trigger now renders for any enrolled user (including returning users whose unlock went through tryEagerKeyRestore). Regression test: src/test/useKeyManagement-stability.test.jsx.
- Updated (2026-05-03, chore): VitePress srcExclude in docs/.vitepress/config.mjs extended with outreach/** and spikes/** to prevent VitePress from trying to render internal research docs during the Cloudflare Pages help-site build. .gitignore tightened: adds supabase/.temp/, deno.lock, temp_*.js, .claude/scheduled_tasks.lock, .claude/settings*.json, and local-only directories ref/, design/, test-inputs/.
- Updated (2026-05-06, SUR-327): Entitlement model + non-blocking gate instrumentation (Phase A of SUR-235). New shared resolver supabase/functions/_shared/entitlements.ts → getResolvedEntitlements is the SSoT for tier-based capabilities; the me-entitlements Edge Function exposes it; src/hooks/useEntitlements.js consumes it. Image uploads now go through the new image-upload Edge Function so the storage write and the user_profiles.image_storage_bytes_used counter increment share a transaction (atomic via the adjust_image_storage_bytes SECURITY DEFINER RPC in migration 0017). Four non-quota gate sites (re_discover / custom_ideas / max_devices / image_storage) emit would_have_blocked PostHog events but do not alter behaviour: SUR-262 observes; SUR-328 flips to enforcing. Full data-architecture coverage in Data Architecture → Entitlements.
- Updated (2026-05-10, SUR-360 + SUR-361): Auth dispatch hardening landed in two parts. SUR-360 turned on email confirmation (
[auth.email] enable_confirmations = true) and wired Resend SMTP ([auth.email.smtp]→smtp.resend.com:587, senderhello@surfc.app,RESEND_API_KEYenv var; the same key theapprove-waitlistEdge Function already uses). Branded confirmation template added atsupabase/email-templates/confirm.html; templates are NOT auto-synced to the dashboard — re-uploading via Studio → Authentication → Templates is a manual step on every change. SUR-361 added Cloudflare Turnstile via a newuseTurnstilehook (src/hooks/useTurnstile.js) — script-tag loader with module-level dedup + 5 s deadline poll,appearance: 'interaction-only'(invisible unless Cloudflare needs a challenge). Server-side wired in[auth.captcha](providerturnstile, secret fromSUPABASE_AUTH_CAPTCHA_SECRET); client bundles viaVITE_TURNSTILE_SITE_KEY. The widget lives insideEmailSignInFlowonly — not at the AuthScreen top level — because GoTrue only enforces[auth.captcha]on dispatch endpoints (signup / password / OTP / recover) and supabase-js v2.100.0’ssignInWithOAuthsilently dropsoptions.captchaToken(forwards onlyredirectTo / scopes / queryParams / skipBrowserRedirectto_handleProviderSignIn— seenode_modules/@supabase/auth-js/dist/main/GoTrueClient.js:669-677and:2030-2042). Threading a token into Google/Apple sign-in would be cargo-cult — UX latency for no security gain. Local-dev secrets template added atsupabase/.env.example(Cloudflare’s documented always-pass test secret1x0000000000000000000000000000000AAis fine for the captcha secret; Inbucket on port 54324 intercepts auth email regardless so the Resend key is unused locally). CI:.github/workflows/db-test.ymlnow exports both env vars beforesupabase startso the local stack boots cleanly with captcha enabled.- Updated (2026-05-10, SUR-365): Marketing CTA flip to direct signup. Every
surfc.app“Request invitation” CTA inNav.astro/Hero.astro/ClosingCta.astrois now “Sign up free” →app.surfc.app;data-ctavalues renamed tonav_signup/hero_signup/closing_signup(these are the first event in SUR-367’s rebuilt funnel).src/components/WaitlistForm.astroandtests/waitlist.spec.ts+tests/fixtures/cors-server.mjsdeleted;src/pages/waitlist.astroreplaced with anoindex“we’ve opened up” sunset page (3 smeta http-equiv="refresh"toapp.surfc.app).astro.config.mjsfilters/waitlist/out of the sitemap. Dead.wl-*rules dropped frommarketing.css. Hero helper text rewritten to “New to Surfc? Create a free account. Already have an account? Sign in.” ClosingCta lede rewritten to “Surfc is free to use. Pro is for heavier readers when you’re ready.”PUBLIC_SUPABASE_URL/PUBLIC_SUPABASE_ANON_KEYretained in.env.examplebecausesrc/lib/checkout.tsstill uses them for/pricing(onlyPUBLIC_WAITLIST_ENDPOINTwas dropped). Thewaitlist-signupEdge Function and PostHog funnel rebuild are SUR-367.- Updated (2026-05-10, SUR-364): Self-service signup cutover. The client-side waitlist gate (
PendingApprovalScreen, fetchWaitlistStatus in src/supabase.js, waitlistStatus state in useAuth.js, the waitlistStatus !== 'approved' block in AppGates.jsx) was removed and requestEmailOtp flipped from shouldCreateUser: false to the supabase-js default (true) so unknown emails create a new auth.users row on first OTP request. The dashboard “Allow new users to sign up” toggle is the live cutover lever (SUR-359, manual). The marketing-side waitlist surface (Astro /waitlist route + CTAs) lands separately under SUR-365; the waitlist-signup Edge Function sunset and PostHog funnel rebuild are SUR-367. The waitlist_requests table, match_waitlist_on_signup trigger, and approve-waitlist Edge Function remain alive (admin UI dependency).
- Updated (2026-05-07, SUR-85): Stripe billing infrastructure landed: three new Edge Functions (create-checkout-session, stripe-webhook, create-billing-portal-session) plus migration 0019_stripe_billing.sql (additive user_profiles columns + stripe_webhook_events ledger). Client wrappers createCheckoutSession / createBillingPortalSession added in src/supabase.js. Stripe SDK pinned at npm:stripe@^22, API version pinned at 2026-04-22.dahlia. Lapsed-Pro 30-day grace window enforced at read time in getResolvedEntitlements rather than via a daily cron flip: user_tier stays 'pro' after subscription.deleted and the resolver returns 'free' only once subscription_current_period_end + 30 days has passed. Webhook idempotency via INSERT … ON CONFLICT DO NOTHING against stripe_webhook_events.event_id. Full coverage in the Stripe billing flows section below.
- Updated (2026-05-11, SUR-370): Intent-aware auth-landing surface.
AuthScreen.jsxreads?intent=signupfromwindow.location.searchand renders signup-framed UI (overlay heading “Create your account” + “Free forever. No credit card.” reassurance, pre-opened email sheet, secondary CTA “Already have an account? Sign in”, bottom-sheet title swap to “Sign up with email”). Default landing (?intent=signinor no query) renders existing framing. Bothintentandopen_emailare stripped viahistory.replaceStatepost-consumption — only when the value was meaningful (intent ∈ {signup,signin},open_email === '1') so unrecognised values are not silently consumed. Shared OAuth + email-OTP primitives extracted fromAuthScreenintosrc/components/AuthControls.jsx(props:intent,autoOpenEmail,onError) so SUR-357 can compose them intoUpgradeAuthGatelater. Unified telemetry event: the previously-shippedupgrade_gate_viewed { intent: 'upgrade', interval, ref }from SUR-352 is renamed toauth_landing_viewedwith an addedsurface: 'upgrade_gate'prop;AuthScreenfires the same event withsurface: 'authscreen'andintent ∈ {signup,signin}. Both surfaces spread seven canonical UTM/click-ID keys (utm_source/medium/campaign/term/content,gclid,fbclid) from the URL viasrc/lib/utmParams.js → readUtmParams(). Marketing-side preservation viasurfc-web/src/scripts/preserveUtm.ts(wired inBaseLayout.astro) — forwards UTMs from the marketing page’s URL onto every<a data-cta>whose origin matchesapp.surfc.app, re-attaches onastro:page-loadso view-transitions don’t silently break attribution.App.jsx’s catch-all unauth redirect (<Navigate to="/signin" replace />) now preserveswindow.location.searchso a stray landing at/?intent=signupcarries the param across to AuthScreen. 
Marketing CTAs (Nav.astro, Hero.astro, ClosingCta.astro, waitlist.astro) now compose signupUrl() from a shared surfc-web/src/lib/appUrl.ts helper; signupUrl() returns ${appUrl}/signin?intent=signup (deep-links past the redirect). data-cta values from SUR-365 (nav_signup, hero_signup, closing_signup, waitlist_legacy_signup) preserved for SUR-367 funnel parity.
- Updated (2026-05-14, SUR-351): Stripe billing: silent-failure fix on the customer-id race + invoice deprecation. A user paid for Pro and their
user_profilesrow never flipped to'pro'because four parallelcreate-checkout-sessioninvocations each created their own Stripe customer (race in the unconditionalSELECT → CREATE → UPDATE); last UPDATE won; the user clicked a Checkout URL pointing at a different customer id; the webhook arrived with asubscription.customerthat no longer matched any profile and silently no-op’d with a 200. Three layers landed in lockstep: (L1)ensureStripeCustomernow uses a conditional UPDATE filtered onstripe_customer_id IS NULLso parallel callers converge on a single customer id; race-loser issues a best-effortstripe.customers.delfor its leaked customer and re-reads the survivor. (L2)stripe-webhook/handler.tsaddsresolveProfileForSubscription— whenfindProfileByCustomermisses, it falls back tosub.metadata.user_id(always set viasubscription_data.metadata.user_idat Checkout), self-healsstripe_customer_idon the resolved profile, and emits a structured[stripe-webhook] self_healed_stripe_customer_idlog. The fallback is guarded against stale at-least-once deliveries: if the profile’sstripe_subscription_idis non-null and doesn’t match the event’ssub.id, the path no-ops with askip_self_heal_subscription_id_mismatchlog so a delayedcustomer.subscription.deletedfor an old sub cannot clobber a user who is currently active on a newer one. The unrecoverable miss now also logsprofile_not_found_for_customer(previously silent — that silence was the property the original outage exploited). Applied to bothhandleSubscriptionUpsertandhandleSubscriptionDeleted. (L3) NewgetSubscriptionIdFromInvoiceresolvesinvoice.parent.subscription_details.subscription(the post-2024-10-28 location for the pinned2026-04-22.dahliaAPI version) → legacy top-level →lines[0].subscription. Each level handles both string id and expanded-object form. Without L3 everyinvoice.paid/invoice.payment_failedhad been returningno_op: invoice_without_subscriptionin production — dunning recovery andpast_duemirroring were both broken. 
Fix is in supabase/functions/create-checkout-session/index.ts and supabase/functions/stripe-webhook/handler.ts; covered by 14 new Deno tests (race-winner / race-loser, metadata self-heal across created/updated/deleted, the subscription-id mismatch guard with three branches, four invoice payload shapes). Open question / out of scope: why create-checkout-session was called 4× per click (StrictMode? hydration script in surfc-web/src/pages/pricing.astro? missing debounce on the Upgrade button?) is tracked separately. CI gap noted: .github/workflows/edge-functions-test.yml does not currently type-check or test create-checkout-session or stripe-webhook, so these tests live outside the CI gate until the workflow is broadened.
- Updated (2026-05-28, SUR-501): Stripe billing:
create-checkout-sessionself-heals a stalestripe_customer_id. Follow-up to the SUR-500 incident (a liveSTRIPE_SECRET_KEYagainst a test-modecus_…→stripe.checkout.sessions.createthrowsresource_missing→ caught asinternal_error500, an unrecoverable upgrade dead-end). Whensessions.createthrows a Striperesource_missingnaming the customer (deleted, or a test↔live mode switch), the function now clears the stored id, recreates viaensureStripeCustomer, and retries the session once; aresource_missingon the price (a config error recreation can’t fix) or a second failure surfaces unchanged asinternal_error(single retry, no loop). The clear is conditional on the observed stale id (.eq('stripe_customer_id', staleId), mirroring the SUR-351 race-safe write) so two concurrent healers cannot clobber a freshly-recreated id or mint a second customer;ensureStripeCustomerthen re-reads the winner. The heal emits a loud[create-checkout-session] customer self-heal: <old> -> <new>log so a deploy-wide mode mismatch (a test key wrongly deployed to prod would 404 every live customer) is detectable rather than silently absorbed. Pairs with the SUR-351 webhook self-heal: the retried session preservessubscription_data.metadata.user_id, soresolveProfileForSubscriptionmaps the resulting subscription back to the user even though the customer id changed — no Surfc/Stripe divergence. Self-heal deliberately does not reconcile entitlements (that was SUR-500 step 4). Insupabase/functions/create-checkout-session/index.ts; 5 new Deno tests (recreate+retry success, price-miss passthrough, no-loop guard, clear-failure bubbles, concurrent-healer no-clobber). Open assumption: relies on the pinnedstripe@^22settingparam: 'customer'(vsline_items[0][price]for a bad price) on these errors — to be confirmed by the Stripe test-mode E2E (the openbilling-reviewerHOLD on the code PR).- Updated (2026-05-20, SUR-371): Auth dispatch hardening — defensive guard on
VITE_TURNSTILE_SITE_KEY. Closes the silent-failure mode that surfaced on 2026-05-10 immediately after the SUR-364 cutover, where a Netlify production deploy ran withVITE_TURNSTILE_SITE_KEYunset (the env var was added to Netlify after the build). Vite bundledundefined, the clientcaptchaReady = !TURNSTILE_SITE_KEY || Boolean(turnstile.token)short-circuited totrue, every email-OTP submission firedrequestEmailOtp(email, undefined), and GoTrue rejected each one withcaptcha protection: request disallowed (no captcha_token found)— 100% of email signups broken; the form looked perfectly functional. Three coordinated guards now make recurrence impossible: (1) Explicit dev/prod branching inEmailSignInFlow(the env read moved from module scope to component body, so the prod path is testable viavi.stubEnv):const captchaConfigured = Boolean(siteKey); const captchaReady = captchaConfigured ? Boolean(turnstile.token) : import.meta.env.DEV— dev convenience preserved, prod never silently bypasses. (2) Visible runtime banner scoped to the email form only (Google OAuth stays functional — captcha-exempt by design per SUR-361): “Sign-in is temporarily unavailable — captcha is not configured. Please contact support.” withrole="alert"for screen-reader parity, plus a Strict-Mode-guardedconsole.erroronce on mount. Deliberately not a PostHog event — per GATING.md §5 Q3 that would have escalated the whole change off the CE surface. (3) Build-time hard fail via a new Vite plugin invite-plugins/sur-371-turnstile-key-guard.jsthat throws whencommand === 'build' && mode === 'production' && !env.VITE_TURNSTILE_SITE_KEY. Uses Vite’sloadEnv()so the check mirrors whatimport.meta.envwill see in the bundle (shell env first, then.env.production, then.env.local, then.env); readingprocess.envdirectly would have false-failed when the key is set only in.env.productionfor local prod-mode builds. 
Plugin extracted to its own module so the regression test insrc/test/vite-config-turnstile-guard.test.jscan import it without side-effecting throughvite.config.js. Secondary UX polish: while the Turnstile widget is loading, the Send button now shows “Verifying your browser…” with a spinner rather than a silent disabled state (the secondary annoyance documented in the incident report). CI workflow patched (.github/workflows/build-test.yml) to setVITE_TURNSTILE_SITE_KEY: 1x00000000000000000000AA(Cloudflare’s documented always-pass test site key — same pattern asSUPABASE_AUTH_CAPTCHA_SECRET=1x0000000000000000000000000000000AAinsupabase/.env.example) so the new guard doesn’t break CI, while the real key continues to be set only on Netlify. Documentation:.env.examplegained aVITE_TURNSTILE_SITE_KEYsection calling out the Netlify-prod requirement (the original ticket asked to updatesupabase/.env.example, but that file holds the server secretSUPABASE_AUTH_CAPTCHA_SECRET—VITE_vars live in the PWA root.env; doc target corrected);CLAUDE.mdAuth dispatch hardening bullet declares the var a client-side concern that MUST be set in Netlify prod env vars. CE persona pass (security/regression/ux personas at SHA0db06f5): 0 BLOCKERs, 5 CONCERNs + 1 NIT all addressed (addedrole="alert", removed paternalistic email-input disable, added error-state truth-table test, extracted plugin + regression test, added load-bearing test-order comment). Test coverage: 18/18 inauth-screen-email.test.jsx(13 existing + 5 new SUR-371 cases — prod-misconfig with handler-level guard viafireEvent.submit, dev no-key, “Verifying your browser…” pending state, configured+token render, and the(captchaConfigured && !token && error)cell flagged by the regression-reviewer); 6/6 in the newvite-config-turnstile-guard.test.js; full surfc suite 836 tests pass, 0 failures.- Updated (2026-05-12, SUR-357 + SUR-368): Open-signup cutover completed. SUR-357 replaced the link-CTA bounce at
UpgradeAuthGate.jsxwith inline<AuthControls />composition — cold visitors clicking “Get Pro” onsurfc.app/pricingnow reach OAuth redirect / OTP issuance in one click without losing the price-echo + Pro-upgrade chrome.AuthControls.jsxgained three optional host-override props:sheetTitle(BottomSheet header override),secondaryLabel(email-opener button label override), andonCtaClick(method)(synchronous callback fired at the CTA commit moment — Google click or sheet open; AuthScreen leaves it unset). UpgradeAuthGate wiresonCtaClickto preserve the frozen SUR-367upgrade_gate_auth_started { interval, ref, method }contract verbatim across the refactor — the fire-site moved from inline handler to callback but the payload shape is identical. The upgrade gate also gained a ”← Back to pricing” escape hatch that round-trips non-null UTM keys tosurfc.app/pricing(inverse of the marketing-sidepreserveUtm.tsforward path). SUR-368 closed out the documentation: newdocs/getting-started/account-setup.mdhelp-center article (signup → email verification → passkey enrolment → first capture), newdocs/runbooks/open-signup-rollback.mdrollback runbook (five-step ladder from “close the door in Supabase Studio” to full revert), and a CLAUDE.md Monetisation paragraph consolidating the open-signup model across SUR-360 (Resend SMTP email verification), SUR-361 (Cloudflare Turnstile on dispatch endpoints), SUR-362 (handle_new_auth_user()trigger, replacing the waitlist-bound bootstrap), SUR-363 (relaxed RLS — migration0021dropped the waitlist-EXISTS predicate from books/notes/custom_ideas/storage.objects, restoring 0001 ownership-only access), SUR-364 (PWA waitlist gate removal), SUR-365 (marketing CTA flip), and SUR-367 (waitlist-signup Edge Function sunset + conversion-funnel rebuild). 
New users hit the on_auth_user_created trigger → handle_new_auth_user() (migration 0020), which upserts a user_profiles row with month_limit = 50 default, user_tier = 'free', and a name derived from raw_user_meta_data.full_name (fallback: split_part(email, '@', 1)). The bespoke admin-side spend panel that would consolidate cost monitoring is in flight under SUR-230 at intranet.surfc.app/admin/spend; until it lands, abuse-spend detection is manual (Supabase usage panel + Anthropic console).
- Updated (2026-05-27, SUR-308): Re-discover now gates through the shared client-side PII review. New usePiiReview controller (src/hooks/usePiiReview.js) owns the single PiiReviewSheet + guardrail telemetry (path ∈ transcribe|discover|rediscover); constructed once in App.jsx and injected into useNoteForm (capture transcribe/discover paths) and useNoteActions.rediscoverIdeas; the latter previously sent an existing note’s text to the discover endpoint with no PII review (the FUNCTIONAL.md §10 gap). On re-discover: Cancel aborts (no AI call), Redact sends asterisked text to the API only (stored note unchanged), Send proceeds.
- Updated (2026-05-30, SUR-316): Prompt versioning v1. The three managed-AI system prompts (and their
model/max_tokens) move out ofprompts.tscode constants into a new service-role-onlypromptstable (migration0027), loaded per call byanthropic-proxyviagetPrompt()(promptLoader.ts) with a ~5-min in-memory cache that fails open to the SUR-303 constants on any DB read error — fallback rows recordprompt_version = 0and surface_promptFallback: trueto the client (mirroring the_failOpenguardrail contract; the client emits aprompt_fallbackPostHog event alongsideguardrail_fail_open). Each successful managed call recordsprompt_name+prompt_versionto a new per-callai_usage_eventstable (migration0029, written best-effort beside the unchangedai_usage_dailyquota upsert) and to a new server-side PostHog eventmanaged_ai_call_succeeded(prompt_name/prompt_version/fail_open/prompt_fallback).prompts.tswas refactored to extractDISCOVER_CANON/DISCOVER_WITH_CUSTOM_TEMPLATE+renderDiscoverWithCustom, keepingbuildDiscoverSystembyte-identical (snapshot tests guard it; the seed in0028is byte-identical to the constants). Migration0030additionally locks downEXECUTEon the pre-existingupsert_ai_usageRPC (previously callable by any authenticated user via PostgREST). Full data-architecture coverage in Data Architecture → Prompt versioning; operations in the Prompt versioning runbook.
- Updated (2026-06-28, SUR-711): Retired the ?intent=signup signup framing on AuthScreen.jsx: one signed-out landing for every entry point (no “Create your account” overlay; the email sheet no longer auto-opens from a marketing landing, which had covered the SUR-706 Terms/Privacy consent notice). AuthScreen still consumes ?open_email=1 (UpgradeAuthGate deep-link). Telemetry: auth_landing_viewed from authscreen now carries the constant intent: 'signin' (shape unchanged) and the app_signup_started anchor was deleted (replaced by separate instrumentation). Cross-repo: surfc-web/src/lib/appUrl.ts signupUrl() → /signin. AuthControls’ intent="signup" prop is unchanged (still used by UpgradeAuthGate).
- Updated (2026-06-28, SUR-673): Rebranded the Supabase auth-email templates (supabase/email-templates/{magic-link,confirm}.html) to the braird identity: wordmark, cool-paper/forest-ink palette, braird.app links/footer; Go-template vars ({{ .ConfirmationURL }} / {{ .Token }} / {{ .Email }}) and the SUR-705 link+code structure preserved (no {{ if }} branch, no braces in comments). [auth.email.smtp] sender flips to admin_email = "hello@braird.app" / sender_name = "Deji @ braird" for local-CLI parity; the prod From identity is owned by SUR-674 and prod ships only via the manual dashboard paste, gated on braird.app Resend domain verification + warmup (SUR-669). Reset/recovery template is out of scope (braird auth is passwordless OTP/magic-link).
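The SUR-351 entry above describes a three-level lookup for the subscription id on an invoice, where each level may hold either a string id or an expanded object. A minimal sketch of that lookup order follows. It is not the code in stripe-webhook/handler.ts: the helper idOf is hypothetical, and treating lines as either a Stripe list object ({ data: [...] }) or a plain array is an assumption.

```javascript
// Hypothetical helper: normalise an "expandable" field, which may be a
// string id or an expanded object carrying an `id` property.
function idOf(value) {
  if (typeof value === "string" && value.length > 0) return value;
  if (value && typeof value === "object" && typeof value.id === "string") {
    return value.id;
  }
  return null;
}

// Sketch of the lookup order described for SUR-351 (L3).
function getSubscriptionIdFromInvoice(invoice) {
  // 1. Newer payloads: invoice.parent.subscription_details.subscription
  const fromParent = idOf(invoice?.parent?.subscription_details?.subscription);
  if (fromParent) return fromParent;

  // 2. Legacy payloads: top-level invoice.subscription
  const fromTopLevel = idOf(invoice?.subscription);
  if (fromTopLevel) return fromTopLevel;

  // 3. Last resort: the first line item. Assumption: lines arrive either as
  //    a list object ({ data: [...] }) or as a plain array.
  const lines = Array.isArray(invoice?.lines)
    ? invoice.lines
    : invoice?.lines?.data;
  return idOf(lines?.[0]?.subscription);
}
```

Returning null rather than throwing leaves the caller free to log an invoice_without_subscription style no-op explicitly, so a genuine miss stays visible.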
Evidence gathered from source files only, per AGENTS.md.
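The SUR-85 entry in the change summary says the lapsed-Pro grace window is enforced at read time rather than by a cron job. The sketch below shows one way such a read-time check could look. It is an illustration only: resolveEffectiveTier is a hypothetical name, the timestamp is assumed to be an ISO string, and the rule for a missing period end is an assumption rather than documented behaviour.

```javascript
const GRACE_WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // 30-day grace window

// Sketch: derive the effective tier when entitlements are read.
// Assumptions: `profile.user_tier` is a string such as 'pro' or 'free', and
// `profile.subscription_current_period_end` is an ISO timestamp or null.
function resolveEffectiveTier(profile, now = new Date()) {
  if (profile.user_tier !== "pro") return profile.user_tier;

  const periodEnd = profile.subscription_current_period_end;
  // Assumption: without a recorded period end there is nothing to lapse.
  if (!periodEnd) return "pro";

  const graceEndsAt = new Date(periodEnd).getTime() + GRACE_WINDOW_MS;
  // The stored tier stays 'pro'; only the resolved value drops to 'free'
  // once the grace window has fully passed.
  return now.getTime() > graceEndsAt ? "free" : "pro";
}
```

Because nothing is written back, a renewed subscription that moves the period end forward restores Pro on the next read with no cleanup step.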
Component overview
- Presentation layer:
src/App.jsxwires state hooks, owns the route tree, and renders the shell layout. A partial refactor (SUR-303, 2026-05-04) has extracted:AppGates(sequential gate ladder: encryption check → enrollment → unlock → migration → device-add/transfer-redeem),ShellNavigation(ShellTabRowfor desktop/tablet top-nav +ShellBottomNavfor mobile),NoteActionOverlay(note long-press flow viaforwardRef),IdeaActionOverlay(idea long-press flow viaforwardRef), andUnsyncedChangesModal(replaceswindow.confirmfor unsynced sign-out). Other notable components:AddIdeaSheet(idea-tagging bottom sheet in the note form),AddIdeaBanner(post-creation description prompt),LinkedDevicesModal(SUR-233, E2EE device list + removal),PolicyPage(Termly embed for/policies/:kind),HelpCenterScreen(in-app help index at/help),HelpArticle(in-app article at/help/:slug), andHelpArticleBody(markdown renderer with VitePress-link rewriting and callout support). - Local services: Hooks in
src/hooks/couple Dexie persistence and UI state:useAuth(auth + sync),useUI(navigation/view state),useNoteForm(note creation + AI ingestion),useSettings(custom ideas/import/export),useNoteActions(mutations + rediscovery),useKeyManagement(E2EE key lifecycle + device management —fetchDeviceListis memoized viauseCallback([session?.user?.id])so the App.jsx Settings-open effect does not trigger an unbounded refetch loop [SUR-233]),useUserProfile(fetchesuser_profilesSupabase row),useMediaQuery(SSR-safematchMediasubscriptions),useFocusRestore(accessible focus return on modal close), anduseToast. - Local store: Dexie (
src/db.js) contains entity tables, schema migrations, CRUD helpers, outbox queue, and merge/import logic; it is the single source of truth when offline. - Cloud boundary:
src/supabase.jswraps supabase-js auth (Google OAuth + email-OTP sign-in/sign-up viasignInWithGoogle/requestEmailOtp/verifyEmailOtp), CRUD, and storage; versioned SQL undersupabase/migrations/*.sqlcodifies tables/RLS/storage, whilescripts/schema-contract.js+scripts/check-schema.jsverify drift. Self-service signup is open post-SUR-364:requestEmailOtpomitsshouldCreateUserso an unknown email creates a newauth.usersrow alongside the OTP send, matching the dashboard “Allow new users to sign up” toggle (manual flip in SUR-359). Auth dispatch hardening (SUR-360 + SUR-361, 2026-05-10): Resend SMTP (smtp.resend.com:587, senderhello@surfc.app) is wired via[auth.email.smtp]insupabase/config.tomland depends on theRESEND_API_KEYenv var. Cloudflare Turnstile is wired via[auth.captcha](providerturnstile,SUPABASE_AUTH_CAPTCHA_SECRETserver-side,VITE_TURNSTILE_SITE_KEYclient-side); theuseTurnstilehook (src/hooks/useTurnstile.js) is mounted inside the email-OTP flow only because GoTrue only enforces[auth.captcha]on dispatch endpoints and supabase-js v2.100.0’ssignInWithOAuthsilently dropsoptions.captchaToken(node_modules/@supabase/auth-js/dist/main/GoTrueClient.js:669-677). Post-SUR-371 (2026-05-20):VITE_TURNSTILE_SITE_KEYis now defended on two surfaces —vite-plugins/sur-371-turnstile-key-guard.jshard-failsnpm run buildwhen the var is unset in production (usingloadEnvso the check matches whatimport.meta.envsees), andEmailSignInFlowinsrc/components/AuthControls.jsxshows a visiblerole="alert"banner +console.errorif a production build somehow still reaches the runtime without it. The old!TURNSTILE_SITE_KEYshort-circuit that silently bypassed captcha in production is gone; the dev convenience (skip captcha when the key is absent andimport.meta.env.DEVis true) is preserved by explicit branching. 
Auth-landing surfaces (SUR-370 + SUR-357, 2026-05-11/12; SUR-711 retired the signup framing, 2026-06-28): AuthScreen.jsx is the single signed-out landing. SUR-711 removed the ?intent=signup framing (no signup overlay; the email sheet no longer auto-opens from a marketing landing, which had been sitting over the Terms/Privacy consent notice). It still consumes ?open_email=1 to pre-open the sheet for UpgradeAuthGate’s “Use a different email” deep-link. UpgradeAuthGate.jsx composes the same <AuthControls /> inline (SUR-357), with three optional host-override props (sheetTitle, secondaryLabel, onCtaClick) that let it render upgrade-specific copy (“Sign up to continue to Pro” / “Continue with email”) and own auth-funnel telemetry without the shared primitive growing host-specific event knowledge. The OAuth + email-OTP primitives live in src/components/AuthControls.jsx; the SUR-367 funnel event upgrade_gate_auth_started { interval, ref, method } is fired by UpgradeAuthGate via the onCtaClick callback (host-owned contract; AuthScreen leaves the prop unset). A single auth_landing_viewed PostHog event covers both landing surfaces: { surface: 'authscreen', intent: 'signin', ...utm } from AuthScreen (SUR-711 fixed intent to the constant 'signin' and removed the app_signup_started anchor) and { surface: 'upgrade_gate', intent: 'upgrade', interval, ref, ...utm } from UpgradeAuthGate (replaces the previously shipped upgrade_gate_viewed). UTM/click-ID dimensions come from src/lib/utmParams.js → readUtmParams() on the app side and are forwarded across the cross-domain hop by surfc-web/src/scripts/preserveUtm.ts (wired in BaseLayout.astro); the upgrade gate also round-trips non-null UTM keys back to surfc.app/pricing via a subtle “← Back to pricing” link, so a cold visitor who clicks away from the gate retains campaign attribution on re-entry.
- AI ingestion: src/api.js performs Anthropic calls; src/ingest/*.js convert raw manual/photo input into normalized notes consumed by useNoteForm. All managed users post to invokeAnthropicProxy, which invokes supabase/functions/anthropic-proxy/ to run the call server-side and record usage (src/supabase.js).
- Client-side safety: src/safety/index.js runs checkStructuredPii against note text before managed AI submission and returns PiiMatch[] for the BottomSheet review UI. Policy is warn-not-block at v1.4. Stub seams exist for SUR-246 on-device prompt injection and NER PII detection. Every managed-AI entry point gates through the shared usePiiReview controller (reviewBeforeSend / awaitPiiReview) before submission: transcription and discovery in the capture flow (useNoteForm) and re-discovery of an existing note (useNoteActions.rediscoverIdeas); one review sheet and one telemetry contract (path ∈ transcribe|discover|rediscover) serve all paths (SUR-308). On re-discover, Redact sends the asterisked text to the API only; the stored note is unchanged.
- Public help page: HowItWorksPage.jsx and LandingPage.jsx were removed by SUR-215 (2026-04-23). Marketing content lives in surfc-web/.
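The warn-not-block policy from the client-side safety bullet can be sketched as a pure function that returns matches and leaves the send decision to the review UI. This is a minimal illustration, not the contents of src/safety/index.js: the two regex patterns, the match fields (type, start, end, value), and the redactMatches helper are assumptions made for the example.

```javascript
// Hypothetical patterns: the real checkStructuredPii rule set is not shown in this document.
const PATTERNS = [
  { type: 'email', regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { type: 'us_ssn', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Returns every structured-PII hit; an empty array means there is nothing to review.
function checkStructuredPii(text) {
  const matches = [];
  for (const { type, regex } of PATTERNS) {
    for (const m of text.matchAll(regex)) {
      matches.push({ type, start: m.index, end: m.index + m[0].length, value: m[0] });
    }
  }
  return matches.sort((a, b) => a.start - b.start);
}

// "Redact" replaces each match with asterisks in the outgoing copy only;
// the caller keeps the stored note untouched.
function redactMatches(text, matches) {
  let out = text;
  for (const { start, end } of matches) {
    // Same-length replacement, so later offsets stay valid.
    out = out.slice(0, start) + '*'.repeat(end - start) + out.slice(end);
  }
  return out;
}

const note = 'Ask jane@example.com about 123-45-6789.';
const found = checkStructuredPii(note);
const outgoing = redactMatches(note, found);
```

Because the function only reports matches, the three review choices (Edit, Redact, Send) stay in the UI layer, which is what lets one review sheet serve the transcribe, discover, and rediscover paths.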
Runtime flows
```mermaid
sequenceDiagram
    participant UI as React UI (App + components)
    participant Hooks as Hooks (useAuth/useNoteForm)
    participant Dexie as Dexie (src/db.js)
    participant Outbox as Outbox Queue
    participant Supabase as Supabase (DB+Storage)
    participant Proxy as anthropic-proxy (Edge Fn, SUR-10)
    participant Anthropic as Anthropic API

    UI->>Hooks: user actions (capture, edit, sync)
    Note over Hooks,Anthropic: BYOK path
    Hooks->>Anthropic: callTranscribeImage / callDiscoverIdeas (if BYOK key)
    Anthropic-->>Hooks: JSON transcription/tags
    Note over Hooks,Proxy: Managed path
    Hooks->>Proxy: POST /anthropic-proxy {action, payload}
    Proxy->>Anthropic: managed Anthropic call
    Anthropic-->>Proxy: response + usage tokens
    Proxy->>Supabase: upsert ai_usage_daily (service role)
    Proxy-->>Hooks: JSON transcription/tags
    Hooks->>Dexie: saveBook/saveNote/saveCustomIdea
    Hooks->>Outbox: enqueue(table,payload) when offline/failure
    Hooks->>Supabase: upsert entities + uploadImage when online
    Supabase-->>Hooks: merged datasets (fetchAllCloud)
    Hooks->>Dexie: mergeCloudRecords + downloadImage()
    Dexie-->>UI: loadAll() for rendering
```

Analytics events
Both surfaces share a single PostHog project so the landing → app sign-up funnel is
visible in one stream. Funnel rebuild is tracked in SUR-367 (the SUR-365 marketing
CTA flip is the first event — data-cta="*_signup" → app_cta_clicked).
- Marketing surface (surfc-web/): pageview and app_cta_clicked events fire from src/layouts/BaseLayout.astro (the global data-cta click handler).
- App surface (surfc/): product-level events fire from hooks and components. Notable: help_index_opened / help_article_viewed (in-app help), guardrail_fail_open (safety pipeline), note_created / ideas_discovered (core capture loop). PostHog is initialized via posthog-js with the project token from VITE_PUBLIC_POSTHOG_PROJECT_TOKEN.
- /policies/:kind is served in-app by PolicyPage (Termly embed). A Netlify 301 from old app.surfc.app/policies/* bookmarks also exists in netlify.toml as a fallback.
State orchestration (local state vs. sync vs. AI)
- useAuth initializes on app load: retrieves the Supabase session, hydrates Dexie via loadAll, subscribes to auth state changes, tracks online/offline events, flushes the outbox, merges Supabase data, and exposes books, notes, customIdeas, apiKey, cloudWrite, and syncFromCloud (src/hooks/useAuth.js).
- useUI holds presentation-only signals (mobileView, selectedIdea, search text, modal/lightbox booleans) plus derived collections like ideaCounts (src/hooks/useUI.js).
- useNoteForm bridges all three layers: it stores capture/AI state locally, persists notes/books to Dexie, kicks off Supabase uploads via cloudWrite + uploadImage, and calls Anthropic through callTranscribeImage / callDiscoverIdeas (src/hooks/useNoteForm.js).
- useSettings manages Dexie metadata (API key, custom ideas), triggers Supabase writes via cloudWrite, and coordinates UI state for imports and tag renames (src/hooks/useSettings.js).
- useNoteActions edits/deletes notes in Dexie, mirrors the mutations to Supabase, and re-tags notes with Anthropic when rediscoverIdeas fires (src/hooks/useNoteActions.js). Since SUR-308, rediscoverIdeas runs the client-side PII review (via the injected usePiiReview controller) before the Anthropic call; Edit aborts, Redact sends asterisked text to the API only, Send proceeds.
Offline sync pipeline
- Schema probe: syncFromCloud runs probeCloudSchema once per session before touching Supabase; errors set syncStatus and block the rest of the pipeline until fixed (src/hooks/useAuth.js, src/supabase.js).
- Write path: Hooks call saveBook / saveNote / saveCustomIdea to persist locally. useAuth.cloudWrite attempts Supabase upserts immediately; failures or offline states result in an outbox entry via enqueue (src/db.js, src/hooks/useAuth.js).
- Flush path: On login or reconnect, syncFromCloud loads queued entries (getOutbox), collapses multiple edits via collapseOutboxItems, and retries Supabase upserts; success deletes IDs (src/supabase.js).
- Merge path: After flushing, the client downloads every table via fetchAllCloud, runs mergeCloudRecords inside a Dexie transaction, then reloads UI state with loadAll. Missing note images are fetched via downloadImage and stored as imageDataUrl for offline lightbox support (src/hooks/useAuth.js).
- Conflict model: mergeCloudRecords uses updated_at vs. updatedAt to enforce last-write-wins, respecting tombstones, and preserving local image previews when overwriting metadata (src/db.js). Tests src/test/outbox.test.js and src/test/sync.test.js assert these semantics.
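The flush and conflict rules above can be illustrated with a small in-memory sketch. It is not the code in src/db.js or src/supabase.js: the outbox item shape ({ id, table, payload }), the deleted tombstone flag, the text field, and the fromCloud helper are assumptions chosen to keep the example self-contained.

```javascript
// Keep only the newest queued write per (table, record id).
function collapseOutboxItems(items) {
  const latest = new Map();
  for (const item of items) {
    const key = `${item.table}:${item.payload.id}`;
    const prev = latest.get(key);
    const isNewer =
      !prev || Date.parse(item.payload.updatedAt) >= Date.parse(prev.payload.updatedAt);
    if (isNewer) latest.set(key, item);
  }
  return [...latest.values()];
}

// Convert an assumed cloud row (snake_case) into the assumed local shape (camelCase).
function fromCloud(cloud) {
  return { id: cloud.id, text: cloud.text, updatedAt: cloud.updated_at };
}

// Last-write-wins merge of one cloud row into the local copy.
// Returns the record to keep, or null when the row should be removed locally.
function mergeRecord(local, cloud) {
  if (!local) return cloud.deleted ? null : fromCloud(cloud);
  if (Date.parse(cloud.updated_at) <= Date.parse(local.updatedAt)) return local; // local wins ties
  if (cloud.deleted) return null; // a newer tombstone removes the local row
  return { ...fromCloud(cloud), imageDataUrl: local.imageDataUrl }; // keep the local preview
}
```

Collapsing before the retry means a note edited five times offline costs one upsert, and comparing timestamps on a single axis is what makes the merge order-independent across devices.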
Concentration zones & coupling
- src/App.jsx: Houses all layout logic, mobile/desktop views, modal toggles, and long-press orchestration. Any new screen or state change must flow through this file, increasing fragility.
- src/hooks/useNoteForm.js: Blends UX state, Dexie writes, Supabase uploads, AI calls (BYOK + managed), and navigation callbacks, making it the de facto domain service layer.
- src/db.js: Centralizes schema defs, migrations, CRUD helpers, outbox, merge logic, and import/export, so subtle changes ripple across persistence, sync, and settings flows.
- src/supabase.js: Acts as the sole cloud boundary; any change to Supabase schema or auth must be reflected here plus the migrations/contract (supabase/migrations/*.sql, scripts/schema-contract.js).
Managed AI Proxy
Managed Anthropic calls now flow through the live supabase/functions/anthropic-proxy/
Edge Function. The function is split across four source files:
- index.ts: request routing, quota enforcement, guardrail orchestration, usage recording.
- prompts.ts (SUR-303): single source of truth for all system prompts (TRANSCRIBE_SYSTEM, TRANSCRIBE_SYSTEM_ASCII_NOTE, and buildDiscoverSystem(customIdeas)). Imports GREAT_IDEAS from src/constants.js via Deno cross-tree import. The SUR-300 eval harness also imports from here so prompt edits are tested in the same file they ship from, with no copy-paste drift.
- guardrail.ts: Azure AI Content Safety wrapper (shield, moderate). See Safety guardrail pipeline below.
- parseJson.ts: JSON extraction utilities for Anthropic response parsing.
Notable behaviors backed by code:
- callTranscribeImage / callDiscoverIdeas pass the Supabase session to invokeAnthropicProxy when no API key is saved (src/api.js, src/supabase.js).
- The Edge Function validates the caller via supabase.auth.getUser() using the anon key, then resolves the caller’s monthly cap from user_profiles.month_limit (single source of truth, default 50) plus a valid allocation_override via getResolvedMonthlyLimit, compares it to the live ai_usage_daily sum from getMonthlyUsage (cross-action total with no per-action-type carve-out, so transcribe and discover calls count toward the same shared cap), and records successful usage (request count + tokens) through the upsert_ai_usage RPC (supabase/functions/anthropic-proxy/index.ts, supabase/migrations/0004_ai_usage_tracking.sql, 0005_upsert_ai_usage_fn.sql, 0009_user_profiles.sql, 0012_per_user_quota_limits.sql; SUR-92).
- HTTP 429 responses include { error: 'rate_limit' } and emit a server-side managed_ai_rate_limit_hit PostHog event via fireAndForget(), which is truly non-blocking, so PostHog network latency does not pad the 429 response time (SUR-92 fix, 2026-04-29); invokeAnthropicProxy maps the 429 to an Error with isRateLimit = true so useNoteForm / useNoteActions can show the upgrade message (src/supabase.js, src/hooks/useNoteForm.js).
- A missing user_profiles row fails closed with 500 profile_missing rather than silently using a fallback constant: the trigger and approval-Edge-Function insert paths cover every authorised user, so this is treated as an unrecoverable invariant break.
- The managed and BYOK paths share the same UI/state flows; canonicalisation and Dexie persistence happen identically after the proxy returns.
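The quota decision and the client-side 429 mapping described above can be sketched as two small functions. This is an illustration under assumptions, not the code in index.ts or src/supabase.js: checkQuota and toProxyError are hypothetical names, the remaining field is invented for the example, and month_limit stands in for the already-resolved cap (the allocation_override step is left out).

```javascript
// Server side (sketch): decide the response from the profile row and this month's usage rows.
function checkQuota(profile, usageRows) {
  if (!profile) return { status: 500, body: { error: 'profile_missing' } }; // fail closed
  // Cross-action total: transcribe and discover rows count toward one cap.
  const used = usageRows.reduce((sum, row) => sum + row.request_count, 0);
  if (used >= profile.month_limit) return { status: 429, body: { error: 'rate_limit' } };
  return { status: 200, body: { remaining: profile.month_limit - used } };
}

// Client side (sketch): turn a non-200 response into a typed error the hooks can branch on.
function toProxyError(response) {
  const err = new Error(response.body.error || 'proxy_error');
  err.isRateLimit = response.status === 429;
  return err;
}

const exhausted = checkQuota({ month_limit: 50 }, [
  { action: 'transcribe', request_count: 30 },
  { action: 'discover', request_count: 20 },
]);
const rateLimitError = toProxyError(exhausted);
```

A hook can then check rateLimitError.isRateLimit to show the upgrade message instead of a generic failure.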
Safety guardrail pipeline (SUR-242)
Every managed request passes through a multi-stage safety pipeline in anthropic-proxy
before usage is recorded. Stages run in order and an early BLOCK returns HTTP 422 without
incrementing the user’s quota:
- Prompt fencing: untrusted delimiters wrap user-supplied content inside the system prompt, preventing injected text from escaping the data plane.
- Input shield: shield(userPrompt, []) calls Azure AI Content Safety Prompt Shields (text:shieldPrompt). BLOCK on attackDetected → HTTP 422 { error: 'guardrail_blocked', detector: 'prompt_shield', leg: 'input' }.
- Anthropic call: proceeds only if input is clean.
- Transcription-post Spotlighting: after a transcribe response, the transcribed text is passed back to shield() as a document to catch indirect injection embedded in the scanned page. BLOCK on attackDetected → HTTP 422 { detector: 'spotlight', leg: 'transcription_post' }.
- Output harm moderation: moderate(transcribedText) calls Azure text:analyze. Categories: Hate, Violence, Sexual, SelfHarm. Severity threshold ≥5 (loosened from default 4 after false-positives on literary/technical content in the SUR-242 spike). BLOCK → HTTP 422 { detector: 'harm', leg: 'output' }. Skipped on discover, because Discover output is a constrained allow-list of Idea names with no free-form text.
- Usage recording: only fires on a full clean pass.
All Azure calls fail-open on 5xx, network error, timeout, or missing config — a guardrail
outage never blocks the core loop. _failOpen: true in the response body signals the client
to emit a guardrail_fail_open PostHog event. Required env vars:
AZURE_CONTENT_SAFETY_ENDPOINT, AZURE_CONTENT_SAFETY_KEY, AZURE_CONTENT_SAFETY_API_VERSION.
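The fail-open rule can be sketched as a classifier over the outcome of one guardrail call. This is a simplified illustration, not guardrail.ts: the outcome summary object ({ configured, networkError, timedOut, status, attackDetected }) and both function names are hypothetical, and treating a non-5xx response without attackDetected as a clean pass is an assumption.

```javascript
// Classify one guardrail call. An unavailable guardrail never blocks (fail-open).
function classifyGuardrail(outcome) {
  const unavailable =
    !outcome.configured || outcome.networkError || outcome.timedOut || outcome.status >= 500;
  if (unavailable) return { blocked: false, _failOpen: true };
  // Assumption: anything else without an explicit attackDetected flag is a clean pass.
  return { blocked: outcome.attackDetected === true, _failOpen: false };
}

// Map the classification onto the proxy response for the input-shield leg.
function inputShieldResponse(outcome) {
  const verdict = classifyGuardrail(outcome);
  if (verdict.blocked) {
    return {
      status: 422,
      body: { error: 'guardrail_blocked', detector: 'prompt_shield', leg: 'input' },
    };
  }
  // _failOpen in the body tells the client to emit guardrail_fail_open.
  return { status: 200, body: verdict._failOpen ? { _failOpen: true } : {} };
}

const outage = inputShieldResponse({ configured: true, status: 503 });
const attack = inputShieldResponse({ configured: true, status: 200, attackDetected: true });
```

Keeping the classification separate from the HTTP mapping lets the same verdict logic serve the spotlight and harm legs with different detector and leg labels.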
Stripe billing flows (SUR-85)
Three Supabase Edge Functions implement paid Pro subscriptions. Data-plane columns + the idempotency ledger are documented in Data Architecture → Stripe billing; this section covers the request flows.
| Function | Auth | Purpose |
|---|---|---|
| create-checkout-session | JWT | Resolves / lazy-creates user_profiles.stripe_customer_id, then opens a Stripe Checkout Session and returns { url }. The persist step is a conditional UPDATE filtered on stripe_customer_id IS NULL (SUR-351) so N parallel callers converge on a single customer id; the race-loser deletes its own leaked Stripe customer (best-effort) and re-reads the survivor. Self-heal (SUR-501): if sessions.create throws resource_missing on the customer, the stored id is cleared (conditionally, on the observed stale id, so a concurrent healer’s fresh id is never clobbered), recreated via ensureStripeCustomer, and the session is retried once; a resource_missing on the price, or a second failure, surfaces as internal_error. |
| stripe-webhook | Stripe signature only (no JWT) | Verifies signature, dedupes via stripe_webhook_events, dispatches to per-event-type handlers in handler.ts. Requires verify_jwt = false in supabase/config.toml under [functions.stripe-webhook]: Stripe deliveries don’t carry a Supabase JWT, so the gateway’s default JWT check would 401 every event before the handler runs. Same pattern as approve-waitlist and waitlist-signup. |
| create-billing-portal-session | JWT | Looks up stripe_customer_id, opens a Stripe Billing Portal session, returns { url }. 404 no_customer if the user has never started a checkout. |
Stripe SDK pinned at npm:stripe@^22; API version pinned in each
function constructor at 2026-04-22.dahlia. Stripe API versions are
deliberately pinned so a Stripe-side default change cannot shift the wire
shape under us — when we want a newer one we bump it explicitly.
Webhook subscriptions registered on the Stripe endpoint (exactly five):
customer.subscription.created, customer.subscription.updated,
customer.subscription.deleted, invoice.paid, invoice.payment_failed.
```mermaid
sequenceDiagram
    participant UI as React app
    participant CCS as create-checkout-session
    participant Stripe as Stripe Checkout
    participant Hook as stripe-webhook
    participant DB as Supabase DB
    participant Resolver as getResolvedEntitlements

    UI->>CCS: { interval, successUrl, cancelUrl } + JWT
    CCS->>DB: SELECT stripe_customer_id from user_profiles
    alt no customer yet
        CCS->>Stripe: customers.create({ email, metadata.user_id })
        CCS->>DB: UPDATE … WHERE stripe_customer_id IS NULL RETURNING stripe_customer_id
        alt won the race
            Note over CCS,DB: returning row → use the new customer
        else lost the race
            CCS->>Stripe: customers.del(leaked) (best-effort)
            CCS->>DB: SELECT stripe_customer_id (survivor)
        end
    end
    CCS->>Stripe: checkout.sessions.create(...)
    Stripe-->>CCS: { url }
    CCS-->>UI: { url }
    UI->>Stripe: redirect to checkout
    Stripe-->>UI: success/cancel redirect

    Stripe->>Hook: subscription.created (signed)
    Hook->>Hook: stripe.webhooks.constructEventAsync(rawBody)
    Hook->>DB: INSERT stripe_webhook_events ON CONFLICT DO NOTHING
    alt fresh event
        Hook->>DB: UPDATE user_profiles SET user_tier='pro', tier_started_at=now(), subscription_status, current_period_end
    else duplicate
        Hook-->>Stripe: 200 deduplicated
    end
    Hook-->>Stripe: 200 received

    UI->>Resolver: refresh useEntitlements()
    Resolver->>DB: SELECT user_tier, subscription_status, current_period_end, ...
    Resolver-->>UI: { tier: 'pro', capabilities: PRO_DEFAULTS }
```

Webhook handler: idempotency, signature, dispatch
stripe-webhook/index.ts is the entry point and handler.ts is the pure dispatcher (split for testability: dispatchEvent is covered by Deno tests with synthetic events).
Two non-obvious Deno specifics:
- Raw body before parsing. req.text() is read before any JSON handling. Stripe’s signature is over the exact bytes Stripe sent; re-serialising via req.json() would break the HMAC.
- constructEventAsync, not constructEvent. The sync variant uses Node’s crypto module, which Deno does not provide. The async variant uses WebCrypto.
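The raw-body point is easy to demonstrate with a plain HMAC. The sketch below is a simplified stand-in, not Stripe’s verification routine (the real scheme also signs a timestamp and is handled by constructEventAsync): the sign and verify helpers and the secret value are assumptions for illustration, and Node’s crypto module is used only so the example runs outside an Edge Function.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Sign the exact payload string with a shared secret (HMAC-SHA256, hex-encoded).
function sign(rawBody, secret) {
  return createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
}

// Constant-time comparison of the expected and received signatures.
function verify(rawBody, signature, secret) {
  const expected = Buffer.from(sign(rawBody, secret), 'hex');
  const given = Buffer.from(signature, 'hex');
  return expected.length === given.length && timingSafeEqual(expected, given);
}

const secret = 'whsec_example'; // placeholder, not a real secret
const rawBody = '{"id": "evt_1",  "type":"invoice.paid"}'; // the sender's own spacing
const signature = sign(rawBody, secret);

// Parsing and re-stringifying normalises whitespace, so the bytes no longer match.
const reserialised = JSON.stringify(JSON.parse(rawBody));
const rawOk = verify(rawBody, signature, secret);
const reserialisedOk = verify(reserialised, signature, secret);
```

Here rawOk is true and reserialisedOk is false even though both strings decode to the same object, which is why the handler reads req.text() first and parses JSON only after verification.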
The dispatch table:
| Event type | DB writes | user_tier impact |
|---|---|---|
| customer.subscription.created / customer.subscription.updated | stripe_subscription_id, subscription_status, subscription_current_period_end | Flip to 'pro' (and stamp tier_started_at) on first transition into active / trialing for a 'free' user. Never demote. |
| customer.subscription.deleted | subscription_status='canceled', subscription_current_period_end | Never written. Demotion is owned by the resolver at read time (deferred 30-day grace). |
| invoice.paid | subscription_status='active' unless current status is canceled | None. Re-confirms active after an invoice.payment_failed → past_due resolution. The canceled-skip guard prevents a delayed retry from resurrecting a cancelled subscription past the grace cutoff. |
| invoice.payment_failed | subscription_status='past_due' unless current status is canceled | None. Stripe handles dunning before subscription.deleted. Same canceled-skip rationale as invoice.paid. |
After dispatch, processed_at is stamped on the ledger row. Failures
during dispatch (DB errors, an unrecognised subscription_status from a
future Stripe API version that violates the CHECK constraint) return 500
to Stripe, which retries. The retry’s upsert hits the existing row;
ledgerInsertEvent then re-reads processed_at to decide:
- processed_at IS NULL → prior delivery’s dispatch crashed before the marker was written. Re-run dispatch on this retry: the user_profiles state change from the original event never landed, and skipping would silently strand a paid subscription event.
- processed_at IS NOT NULL → the original delivery already applied fully. Short-circuit to a 200 deduplicated; Stripe stops retrying.
The dispatch handlers are individually idempotent (UPDATEs with stable inputs derived from the Stripe event), so a re-run on a row that partially applied before crashing converges to the correct end state without compensating logic. The corollary: never rely on side effects that aren’t expressed via the user_profiles update payload, since they won’t be re-run on reclaim.
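The reclaim rule can be sketched with an in-memory ledger. This is a simplified model, not the code in stripe-webhook/index.ts: createLedger, insertEvent, markProcessed, and handleDelivery are hypothetical names, and a Map stands in for the stripe_webhook_events table.

```javascript
// In-memory stand-in for the ledger table (event id -> { processed_at }).
function createLedger() {
  const rows = new Map();
  return {
    // Mirrors INSERT ... ON CONFLICT DO NOTHING followed by a re-read of processed_at.
    insertEvent(eventId) {
      if (!rows.has(eventId)) {
        rows.set(eventId, { processed_at: null });
        return 'fresh';
      }
      return rows.get(eventId).processed_at === null ? 'reclaim' : 'duplicate';
    },
    markProcessed(eventId, now = new Date()) {
      rows.get(eventId).processed_at = now.toISOString();
    },
  };
}

// Returns the HTTP status the sender would see for this delivery.
function handleDelivery(ledger, event, dispatch) {
  if (ledger.insertEvent(event.id) === 'duplicate') return 200; // deduplicated
  try {
    dispatch(event); // must be idempotent: it may run again on reclaim
  } catch {
    return 500; // processed_at stays null, so the retry re-runs dispatch
  }
  ledger.markProcessed(event.id);
  return 200;
}
```

Stamping processed_at only after dispatch succeeds is the design choice that separates "seen" from "applied"; a ledger that treated every existing row as a duplicate would swallow the retry of a crashed delivery.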
Customer-id race + self-heal (SUR-351)
Two failure-mode hardenings on the data plane between Checkout and the
webhook, both responding to a 2026-05-08 incident where a paid Pro
upgrade silently no-op’d the user_profiles flip:
- Race-safe customer creation in create-checkout-session. The original SELECT → CREATE → UPDATE sequence let N parallel invocations (e.g. a double-click that fired four create-checkout-session POSTs in one second per the production logs) each see stripe_customer_id IS NULL, each create their own Stripe customer, and each persist; last UPDATE wins. The user clicked whichever Checkout URL they got, paid on a customer the DB no longer pointed at, and the webhook arrived with a subscription.customer that didn’t match any profile. The fix replaces the persist with a conditional UPDATE … WHERE user_id = ? AND stripe_customer_id IS NULL plus a RETURNING stripe_customer_id. Race-winners get their own customer back; race-losers see no row, best-effort stripe.customers.del their leaked customer (no charges attached, but keeps the customer list tidy), re-read the survivor, and use it. Every parallel call converges on the same customer id, so whichever Checkout URL the user clicks settles the subscription on the customer that’s actually linked in user_profiles.

- metadata.user_id fallback + self-heal in the webhook. Even with L1, any future regression (or any other source of stripe_customer_id drift) would silently no-op via the defensive profile_not_found_for_customer → 200 path. handler.ts now exposes resolveProfileForSubscription: when findProfileByCustomer misses, it falls back to sub.metadata.user_id, which create-checkout-session already populates via subscription_data.metadata.user_id for every subscription opened through our flow. On hit, it writes the correct stripe_customer_id back to the resolved profile and emits a structured [stripe-webhook] self_healed_stripe_customer_id log so any L1 regression is loud, not silent. Applied to both handleSubscriptionUpsert and handleSubscriptionDeleted.

  The fallback is guarded against stale at-least-once deliveries. If the metadata lookup resolves a profile whose stripe_subscription_id is non-null and doesn’t match event.data.object.id, the fallback no-ops with a skip_self_heal_subscription_id_mismatch log instead of self-healing. Without this guard, a delayed customer.subscription.deleted (or duplicate created/updated) for an older sub_id carrying the same metadata.user_id would resolve the live profile, the caller would apply the stale event, and a user active on a newer subscription would have their stripe_customer_id overwritten and their subscription_status flipped to 'canceled', silently and permanently. The guard scope is narrow on purpose: it protects the metadata-fallback path only. Direct customer-id matches carry stronger evidence (the live customer link points at this profile) and remain unguarded; a separate hardening of handleSubscriptionDeleted against delayed cancels on a matched customer-id is tracked outside SUR-351.

  The unrecoverable miss (both lookups fail) now also logs profile_not_found_for_customer with the full event context, converting the previously silent failure mode into a searchable signal in Supabase Function logs.

- Invoice subscription-id extraction across API versions. The invoice.subscription field was deprecated in Stripe API version 2024-10-28 in favour of invoice.parent.subscription_details.subscription. This codebase pins 2026-04-22.dahlia, so the legacy field is no longer populated and every invoice.paid / invoice.payment_failed had been returning no_op: invoice_without_subscription in production, silently breaking the dunning recovery (active re-confirmation) and past_due mirroring paths. The new getSubscriptionIdFromInvoice helper resolves the new path first (string id or expanded object), falls back to the legacy top-level field as defence-in-depth for any older payload variant, and finally to lines[0].subscription as a last resort.
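The three-step lookup order in the last item can be sketched as follows. This is an illustration of the described fallback chain, not the helper in handler.ts: the idOf helper is hypothetical, and reading the first line item through invoice.lines.data[0] is an assumption about the payload shape.

```javascript
// Normalise a field that may be a string id or an expanded object carrying an id.
function idOf(value) {
  if (typeof value === 'string') return value;
  if (value && typeof value.id === 'string') return value.id;
  return null;
}

// Try the current path first, then the legacy field, then the first line item.
function getSubscriptionIdFromInvoice(invoice) {
  return (
    idOf(invoice?.parent?.subscription_details?.subscription) ?? // current path
    idOf(invoice?.subscription) ?? // legacy top-level field
    idOf(invoice?.lines?.data?.[0]?.subscription) ?? // last resort (assumed list shape)
    null
  );
}

const current = getSubscriptionIdFromInvoice({
  parent: { subscription_details: { subscription: { id: 'sub_new' } } },
  subscription: 'sub_legacy',
});
```

Ordering the current path first means a payload that happens to carry both fields resolves to the value the pinned API version actually maintains.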
Verification at the line of fire: the self_healed_stripe_customer_id
log should be absent under normal operation. Its presence in
production logs means L1 has regressed somewhere we didn’t anticipate
and the metadata fallback has carried the system. Likewise,
skip_self_heal_subscription_id_mismatch should be rare and worth
auditing if it appears more than incidentally — it usually means the
profile holds a stale stripe_subscription_id (a previous subscription
that should have been cleared on cancel-and-resubscribe) or that Stripe
is replaying very old events for a subscription the user no longer has.
Client integration
src/supabase.js exposes thin wrappers that match the
invokeAnthropicProxy shape — fresh-session refresh, JWT in
Authorization, typed errors:
```js
export async function createCheckoutSession({ interval, successUrl, cancelUrl })
export async function createBillingPortalSession({ returnUrl })
```

createBillingPortalSession surfaces a 404 no_customer response as a typed error (error.isNoCustomer = true) so the UI can route the user to the upgrade flow rather than show a generic error toast. SUR-86 / SUR-88 / SUR-89 wire these into the Plans page, Settings deliberate-upgrade, and the quota-exhausted popup respectively; those issues do not redefine the contract.
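How a caller might branch on the typed error can be sketched like this. The parsePortalResponse and nextStep helpers, the '/upgrade' route, and the returned action objects are all hypothetical; only the 404 no_customer → isNoCustomer mapping comes from the text above.

```javascript
// Convert a billing-portal function response into a URL or a typed error.
function parsePortalResponse(status, body) {
  if (status === 200) return body.url;
  const err = new Error(body?.error || `http_${status}`);
  err.status = status;
  err.isNoCustomer = status === 404 && body?.error === 'no_customer';
  throw err;
}

// UI-side branch: route to the upgrade flow on no_customer, toast on anything else.
function nextStep(status, body) {
  try {
    return { redirect: parsePortalResponse(status, body) };
  } catch (err) {
    return err.isNoCustomer ? { route: '/upgrade' } : { toast: err.message };
  }
}

const neverPaid = nextStep(404, { error: 'no_customer' });
```

Putting the flag on the error object keeps the UI from parsing status codes itself, which matches the isRateLimit pattern used for the managed AI proxy.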
Required Edge Function secrets
STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET,
STRIPE_PRICE_ID_PRO_MONTHLY, STRIPE_PRICE_ID_PRO_ANNUAL. Documented
under docs/supabase-setup.md → Edge Function secrets / Stripe.
The shared loader supabase/functions/_shared/env.ts → requireEnv
throws on missing values so a misconfigured deploy fails on the first
request rather than silently mid-request.
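A fail-fast loader of this kind can be sketched in a few lines. This is not the contents of _shared/env.ts: the env parameter is injected so the example runs anywhere (in a Deno Edge Function the lookup would typically go through Deno.env.get), treating an empty string as missing is an assumption, and loadStripeConfig is a hypothetical caller.

```javascript
// Return the value of a required variable or throw with the variable name.
function requireEnv(name, env) {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required env var: ${name}`);
  }
  return value;
}

// Hypothetical caller: resolve every secret up front so a bad deploy fails on the first request.
function loadStripeConfig(env) {
  return {
    secretKey: requireEnv('STRIPE_SECRET_KEY', env),
    webhookSecret: requireEnv('STRIPE_WEBHOOK_SECRET', env),
    priceMonthly: requireEnv('STRIPE_PRICE_ID_PRO_MONTHLY', env),
    priceAnnual: requireEnv('STRIPE_PRICE_ID_PRO_ANNUAL', env),
  };
}
```

Resolving all four values together means the error names the first missing secret instead of surfacing later as an opaque Stripe API failure.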
Out of scope for SUR-85
- UI surfaces. SUR-86 (Plans page on surfc-web/), SUR-88 (Settings deliberate upgrade), SUR-89 (quota-exhausted popup) own the upgrade / manage-subscription buttons. SUR-85 stops at the function + wrapper boundary.
- Trial mechanics. Deferred to SUR-262 / SUR-235. The webhook treats trialing exactly like active, so a future trial wiring needs no webhook change.
- PostHog billing events. checkout_started, subscription_created, subscription_canceled are the relevant CTA-side events; they fire from the UI surfaces above, not from the Edge Functions.
- CORS / body-parsing helper extraction. Tracked in SUR-339: the three new functions still define their own CORS_HEADERS constants alongside anthropic-proxy, waitlist-signup, delete-account, me-entitlements, and image-upload. SUR-339 unifies them into _shared/cors.ts + _shared/http.ts post-SUR-85.
In-app help system (SUR-209 Phase 2, SUR-303, 2026-05-04)
The help system is built directly into the React app and reads the same markdown sources as the public VitePress site at help.surfc.app:
- src/help/manifest.js: reads docs/getting-started/*.md at Vite build time via import.meta.glob({ query: '?raw', eager: true }). Parses frontmatter, orders articles per ARTICLE_ORDER, and exports articles, indexArticle, getArticle(slug), and isForwardReference(slug). FORWARD_REFERENCES is an allow-list of slugs that don’t have articles yet; links to them render as disabled <span> rather than broken routes.
- /help → HelpCenterScreen: searchable article index; fires help_index_opened PostHog event. Public route (no auth required).
- /help/:slug → HelpArticle → HelpArticleBody: renders article markdown with react-markdown + remark-gfm + remark-directive; rewrites VitePress-style absolute links (/getting-started/<slug>) and relative links to SPA /help/<slug> <Link> elements; callouts (:::note, :::tip, :::warning, :::danger) rendered via HelpCallout through src/help/calloutPlugin.js.
- Both routes are outside AppGates, so they are accessible to unauthenticated sessions (e.g. from the AuthScreen or a cold URL share).
- Help articles are bundled at build time: changes to docs/getting-started/*.md require a new Vite build to appear in the app.
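The link-rewriting and forward-reference behaviour can be sketched as a single resolver. This is an illustrative model, not the code in HelpArticleBody or manifest.js: the slugs, the ARTICLES set, the resolveHelpLink name, and the returned kind labels are all assumptions.

```javascript
const FORWARD_REFERENCES = new Set(['sharing-notes']); // hypothetical slug without an article yet
const ARTICLES = new Set(['capture-a-note', 'ideas']); // hypothetical published slugs

function isForwardReference(slug) {
  return FORWARD_REFERENCES.has(slug);
}

// Map a markdown href to how the SPA should render it.
function resolveHelpLink(href) {
  // Anything with a URL scheme (https:, mailto:, ...) is left alone.
  if (/^[a-z][a-z0-9+.-]*:/i.test(href)) return { kind: 'external', href };
  const slug = href
    .replace(/^\/getting-started\//, '') // VitePress-style absolute link
    .replace(/^\.\//, '') // relative link
    .replace(/\.md$/, '')
    .replace(/\/$/, '');
  if (isForwardReference(slug)) return { kind: 'disabled', slug }; // rendered as a <span>
  if (ARTICLES.has(slug)) return { kind: 'internal', to: `/help/${slug}` };
  return { kind: 'unknown', href };
}

const rewritten = resolveHelpLink('/getting-started/ideas');
```

Because both the app and the VitePress site read the same markdown, rewriting at render time avoids maintaining two sets of links in the source files.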
Confirmed / Assumption / Unknown
- Confirmed:
  - Offline-first architecture with Dexie + outbox queue, last-write-wins merge, and Supabase sync, as evidenced by src/hooks/useAuth.js, src/db.js, and tests.
  - All managed AI calls flow through anthropic-proxy; direct browser-to-Anthropic communication was removed (SUR-91). The BYOK trace remaining is a one-time apiKey cleanup in db.js.
  - Supabase serves as both the operational datastore and auth boundary per supabase/migrations/0001_initial_schema.sql and the schema contract tooling.
  - Azure AI Content Safety guardrail pipeline (guardrail.ts) is wired into anthropic-proxy with fail-open semantics (SUR-242, 2026-05-01).
  - Client-side PII detection via src/safety/ (SUR-242); warn-not-block; SUR-246 NER stubs in place.
  - In-app help system at /help and /help/:slug (SUR-303, 2026-05-04); reads docs/getting-started/*.md at build time. Public routes, no auth required.
  - The /how-it-works client-side route was removed by SUR-215 (2026-04-23); marketing lives in surfc-web/.
- Assumption:
  - Future ingest adapters (Readwise/Kindle) will plug into the existing adapter interface; comments exist but no code.
  - Multi-device concurrency is limited to eventual consistency; there is no mention of CRDTs or per-field merges.
  - Background sync relies entirely on the user opening the app; no service worker sync hooks are implemented.
  - Azure Content Safety severity threshold 5 was validated against Surfc-shaped content in the SUR-242 spike; revisit if harm-category false-positive rate changes at higher usage scale.
- Unknown:
  - No evidence of push notifications, share targets, or OS integrations beyond the PWA manifest.
  - Error telemetry/monitoring stack is not referenced.
  - How image storage is cleaned up across deletes remains unspecified.