agnes-the-ai-analyst/docs/superpowers/plans/2026-04-22-cloudflare-access-auth.md
ZdenekSrotyr d2c76cb221
User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28)
* fix: redirect unauthenticated HTML routes to /login (#10)

* docs(plan): user mgmt + PAT + CLI distribution implementation plan (#9 #10 #11 #12)

* build(docker): produce wheel artifact for /cli/download (#9)

* feat(db): schema v5 — users.active + deactivated_at/by (#11)

* feat(api): /cli/download wheel + /cli/install.sh with baked server URL (#9)

* feat(users): repository supports active flag + count_admins (#11)

* feat(ui): /install page with per-deployment install instructions (#9)

* feat(api): user PATCH/reset-password/set-password/activate/deactivate (#11)

* fix(cli): da login prompts for password and sends it in body (#9)

* test(api): safeguard tests for self-deactivate and last admin (#11)

* feat(auth): reject requests from deactivated users (#11)

* fixup(#10): propagate next through /login buttons + lock down sanitizer tests

* feat(cli): da admin set-role/activate/deactivate/reset-password/set-password (#11)

* feat(ui): /admin/users management page (#11)

* feat(db): schema v6 — personal_access_tokens (#12)

* feat(users): access_tokens repository (#12)

* feat(auth): JWT carries typ (session|pat) and explicit jti (#12)

* feat(auth): reject revoked/expired PATs; update last_used_at (#12)

* feat(api): /auth/tokens CRUD + admin revoke; session-only guard (#12)

* feat(cli): da auth token create/list/revoke (#12)

* feat(ui): /profile page with PAT create/list/revoke (#12)

* docs: PAT usage and session/PAT TTL clarification (#12)

* feat(auth): PAT first-use-from-new-IP audit + last_used_ip (schema v7) (#12)

Closes remaining acceptance gap from issue #12: audit_log entry on first use
of a PAT from an IP that differs from the recorded last_used_ip.

- schema v7: personal_access_tokens.last_used_ip column
- AccessTokenRepository.mark_used now stores the client IP
- get_current_user extracts client IP (X-Forwarded-For first hop, fallback
  to request.client.host) and emits a token.first_use_new_ip audit when the
  IP changes on a subsequent use (not the very first use)
- tests: new-ip audit, same-ip no-op, first-ever-use no-op, schema v7 column
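
A minimal sketch of the IP handling described above, not the project's actual code. The function names and the lowercase header mapping are assumptions made for illustration; only the rules (first X-Forwarded-For hop, fallback to the peer address, no audit on the very first use) come from the commit text.

```python
from typing import Mapping, Optional


def client_ip(headers: Mapping[str, str], peer_host: Optional[str]) -> Optional[str]:
    """Return the first X-Forwarded-For hop, else the socket peer address.

    Assumes header names in `headers` are already lowercased.
    """
    forwarded = headers.get("x-forwarded-for", "")
    first_hop = forwarded.split(",")[0].strip()
    return first_hop or peer_host


def is_first_use_from_new_ip(last_used_ip: Optional[str], current_ip: Optional[str]) -> bool:
    """True only when a previously used token shows up from a different IP."""
    if last_used_ip is None or current_ip is None:
        # Very first use (or unknown IP): nothing to compare against.
        return False
    return last_used_ip != current_ip
```

A caller would emit the token.first_use_new_ip audit entry only when the second function returns True, then store the current IP as last_used_ip.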

* fix: address Devin review findings on PR #28

- app/main.py: exclude /auth/* from HTML redirect handler so JSON
  endpoints under /auth/ (PAT CRUD used by `da auth token` CLI) keep
  their 401 JSON contract (Devin #1, bug)
- app/api/tokens.py: reject expires_in_days <= 0 explicitly; use
  `is not None` so 0 no longer silently creates a non-expiring token
  (Devin #2)
- app/api/users.py: validate role against Role enum in create_user
  to match update_user and prevent 500 on role-protected requests
  later (Devin #3)
- app/web/templates/admin_users.html: escape user-supplied strings
  before innerHTML; move onclick handlers to addEventListener via
  data attributes so emails with quotes / HTML no longer break the UI
  or enable stored XSS (Devin #4)
- app/auth/router.py, app/auth/providers/{password,google}.py:
  reject deactivated users at login instead of issuing a JWT that
  would then fail on the next request — removes the confusing
  redirect loop (Devin #5)
- CLAUDE.md: document schema v7 instead of stale v4 (Devin #6)
- tests/test_web_ui.py: regression test for the /auth/* JSON 401

* feat(web): add /profile and /admin/users links to dashboard nav

* feat(web): point setup banner at /install page

* chore(web): drop unused setup_instructions context

* fix: address Devin review round 2 on PR #28

- app/api/tokens.py: when expires_in_days is None (the "never" option),
  use a ~100-year JWT expiry so the token doesn't silently die in 24h
  via the session-default fallback in create_access_token. The real
  expiry enforcement stays in verify_token's DB-level check (Devin 🔴)
- app/web/templates/profile.html: escape t.name and other user-supplied
  strings via esc() helper before innerHTML, same pattern as
  admin_users.html. Move revoke onclick to data-attribute +
  addEventListener (Devin 🟡)
- app/api/cli_artifacts.py: use `mktemp -d` with X's at end of template
  for GNU/BSD portability, place wheel inside the temp dir and
  clean up with rm -rf (Devin 🚩)

* feat(web): redesign /install page; make curl one-liner primary, collapse manual

Rebuild the public /install page using the dashboard visual language
(shared header, card layout, gradient hero, design tokens from
style-custom.css). The page is now anchored on the one-liner install
path: curl -fsSL <server>/cli/install.sh | bash is rendered as the
primary, prominent step 1, while the old manual wheel-download flow
is tucked behind a closed-by-default <details> block for users in
restricted/offline environments.

Information architecture:
  hero (server URL + version)
  -> step 1: quick install (one-liner, big Copy button)
  -> step 2: create PAT on /profile + export DA_TOKEN / da auth whoami
  -> step 3: Claude Code / MCP via ~/.config/da/token.json
  -> collapsed "Manual install" details for download-wheel flow
  -> footer link to docs/HEADLESS_USAGE.md

Every shell snippet has a vanilla-JS "Copy" button that confirms
visually ("Copied!" for 1.5s) and falls back to textarea+execCommand
on non-secure contexts. No new dependencies, no bundler.

The route now also pulls an optional user so the header shows the
same nav (Dashboard / Profile / Logout) as dashboard.html when a
session exists, while staying fully public when signed out.

* fix(cli): use real wheel filename in install.sh (broken pip/uv install)

The installer wrote the downloaded wheel as agnes_cli.whl, which lacks a
PEP-427 version component — both pip and uv tool install reject it and
abort the one-liner.

Use curl -OJ so Content-Disposition determines the on-disk filename, then
resolve it via glob. Install an EXIT trap to remove the tmpdir even when
install fails.

* fix(web): correct manual install wheel glob and add PEP 668 / PATH hints

- Wheel glob is agnes_the_ai_analyst-*.whl (not agnes-*.whl) — the old
  pattern never matched the real artefact name from the build.
- Add — or — separator between uv tool install and pip install.
- Warn that pip install --user is blocked on macOS Homebrew / modern
  Debian (PEP 668) and recommend uv tool install as the default path.
- Both flows now show the ~/.local/bin PATH hint so a fresh shell can
  find the da binary after install.

* fix(web): consistent session.user reference in install header

The avatar-letter fallback inside {% if session.user %} was reading
user.name / user.email directly, but the route dependency can pass
user=None — those references resolved to an empty FlexDict and produced
an empty avatar circle. Read everything through session.user to match
the guard and the dashboard pattern.

* fix(web): point headless usage link at GitHub source

/docs/HEADLESS_USAGE.md 404s — no static route serves repo docs. Point
the footer link at the rendered markdown on GitHub instead of adding a
dedicated docs serving route just for one file.

* feat(web): /install hero size, anon sign-in banner, step 2 copy polish

- Bump hero h1 from 26px to 30px to match dashboard primary scale.
- Anonymous visitors see a small sign-in banner above Step 2 (creating
  a token requires auth; without the banner the flow appears stuck).
- Add an 'After generating your token' section label inside Step 2 so
  the /profile CTA button no longer looks wedged mid-sentence between
  adjacent paragraphs.

* chore(web): /install a11y + version pill polish

- aria-live='polite' on copy buttons so screen readers announce the
  'Copied!' state change.
- Replace redundant INSTANCE_NAME eyebrow (already in the header logo)
  with 'Getting started'.
- Hide the version pill when AGNES_VERSION is unset/'dev' — avoids the
  misleading 'vdev' label in local/unbuilt runs.
- Manual summary focus-visible outline-offset +2px (was -2px which
  clipped inside the card), and mark the chevron as decorative.

* fix(web): use session.user in dashboard avatar fallback

Inside {% if session.user %} guard, the avatar fallback referenced
(user.name or user.email). If user is None the block crashes when
the profile picture is absent. Align with the guard variable.

* fix: address Devin review round 3 on PR #28

- app/api/users.py: stop auto-sending email from reset_password. The
  magic-link sender would deliver a "Login Link" that — when clicked —
  consumes the reset_token via verify_magic_link and logs the user in
  WITHOUT prompting for a new password. Admins now share the raw
  reset_token from the API response manually, or use set-password
  directly. email_sent is always False. Documented inline. (Devin 🟡)
- app/api/cli_artifacts.py: harden /cli/install.sh generation against
  shell injection via Host header or AGNES_VERSION. base_url is
  validated against a strict scheme+host+port regex; version against
  an alnum + dot/dash/underscore allowlist. Both values are also
  piped through shlex.quote() as defense in depth. (Devin 🟡)

The shared users.reset_token column between magic-link and password-
reset flows (Devin 🚩) remains an architectural gap; splitting into
separate columns needs schema v8 and is tracked for a follow-up PR.

* docs, chore(grpn): manual-deploy helpers + hackathon deploy learnings

Adds scripts/grpn/ — Makefile + agnes-auto-upgrade.sh + README for
operating Agnes on GRPN's existing foundryai-development VM when the
full Terraform flow is blocked by org policies:

- iam.disableServiceAccountKeyCreation (org constraint) forbids SA
  JSON keys, so GCP_SA_KEY-based CI is unavailable
- No projectIamAdmin delegation → bootstrap-gcp.sh can't grant roles
- Secret Manager IAM bindings require setIamPolicy which editor lacks

Helper targets: deploy, deploy-tag, recreate, restart, stop, start,
status, version, logs, ps, env, ssh, tunnel, open, bootstrap-admin,
set-data-source, install-cron, uninstall-cron.

docs/superpowers/plans/2026-04-22-grpn-deploy-learnings.md — running
log of all org-policy constraints hit during the hackathon deploy,
with workarounds and derived follow-ups (WIF support, external_ip
variable, customer onboarding IAM checklist).

Not a replacement for the TF flow — stopgap until WIF lands.

* fix(web): make header logos clickable links to home

* feat(web): one-click "Setup a new Claude Code" button

Adds a single-button flow on the dashboard and /install page that
generates a fresh personal access token via POST /auth/tokens and
copies a complete, paste-ready setup script (server URL, token,
install/verify commands) to the clipboard. Falls back to a modal
textarea when the clipboard is blocked; redirects to /login on 401;
surfaces backend errors inline.

- dashboard.html: replaces the top "Set up your local environment"
  anchor with a real button wired to setupNewClaude(). Removes the
  duplicate bottom setup banner to keep a single entry point.
- install.html: for signed-in users, Step 1 leads with the one-click
  button and demotes the curl one-liner into a collapsible "Or run
  manually" aside. Anonymous visitors still see the curl flow plus a
  sign-in hint.
- No new deps. Vanilla JS. Token lives in memory/clipboard only —
  never rendered into persistent DOM.

* feat(cli): add "da auth import-token" for non-interactive PAT login

Writes a provided JWT into ~/.config/da/token.json using the canonical
{access_token, email, role} shape expected by save_token(). Decodes the
token locally to pull email/role claims, verifies it against the server
via GET /api/catalog/tables, and refuses to overwrite an existing token
file if the server returns 401. --email / --role overrides exist for
tokens missing those claims; --skip-verify bypasses the server round-trip
for offline / CI scenarios.

* test(cli): cover da auth import-token success + 401 + claim-fallback paths

Three new tests in TestAuthImportToken:
- valid JWT + 200 -> canonical token.json written
- 401 from /api/catalog/tables -> exit 1, existing token file untouched
- JWT without email/role claims -> refused without overrides, accepted
  with --email / --role flags

* feat(web): update one-click Claude setup instructions — explicit uv install, import-token, skills question

Replaces the fragile `cat > token.json <<EOF` clipboard payload with an
explicit, auditable sequence:

  1. `curl -fsSL /cli/download` + `uv tool install --force` (no opaque
     `curl | bash`).
  2. `da auth import-token --token ...` instead of hand-written JSON.
  3. Explicit PATH persistence for zsh/bash.
  4. A required question to the user about whether to copy the bundled
     skills into ~/.claude/skills/agnes/ or pull them on-demand via
     `da skills show`.
  5. A final confirmation step with whoami + version output.

Factored both pages to include a shared partial
(app/web/templates/_claude_setup_instructions.jinja) so dashboard.html
and install.html can never drift apart again. {server_url} and {token}
stay as runtime placeholders substituted by renderSetupInstructions().

* feat(ui): modernize /admin/users + unify header nav across pages

- New shared partial app/web/templates/_app_header.html — single source
  of truth for the top navigation. Used by base.html and dashboard.html
  (which doesn't extend base.html). Active page highlighted via
  request.url.path. Admin "Users" link gated by session.user.role.
- style-custom.css: add .app-header / .app-nav-link / .app-btn-logout /
  .app-avatar styles (mirrors dashboard's previous inline copy under
  app-* prefix). Mobile-friendly fallback at <720px.
- base.html: include the new partial so every page extending base
  (admin_users, profile, login_email, error, …) gets the same chrome
  the dashboard has.
- dashboard.html: replace its inline <header class="header"> markup
  with the shared partial. Inline .header CSS left in place as
  harmless dead code (separate cleanup PR).
- admin_users.html: rewritten with avatars, role pills (color-coded
  per role), toggle switch for active, search/filter input, toast
  notifications, modal dialogs replacing alert/confirm/prompt,
  one-click copy for the reset token, empty / loading states.
  All XSS-safe via the existing esc() helper + data-attribute
  event delegation.
- tests/test_web_ui.py: smoke test that /admin/users renders the new
  shared header chrome and the modernized markup.

* feat(api): serve CLI wheel at /cli/agnes.whl for direct uv install

uv tool install inspects the URL path suffix to recognise a wheel, so
/cli/download (which has no .whl suffix) cannot be installed directly.
Expose a stable /cli/agnes.whl alias over the same wheel lookup so users
can run: uv tool install --force https://<server>/cli/agnes.whl

* test(cli): cover da auth import-token --server persisting to config.yaml

The server persistence was already implemented in the import-token command
(save_config({server}) call) but not covered by tests. Add an explicit test
so the one-step setup contract — single import-token call writes both token
and server — cannot regress.

* feat(web): simpler Claude setup — single uv install URL, single import-token call

User feedback: the prior clipboard payload repeated the server URL and
token across multiple steps (curl + tmpfile + install + rm + separate
seed-config + import-token). Collapse to:

 1. uv tool install --force {server_url}/cli/agnes.whl  (single URL, direct)
 2. da auth import-token --token ... --server ...        (one call, persists both)
 3. da auth whoami
 4. skills (ask user first)
 5. confirm

uv accepts HTTPS URLs that end in .whl and installs them directly, so
the tmpfile dance is unnecessary. import-token --server already persists
the server to config.yaml, so no separate printf > config.yaml step.

* fix(tests): update admin users heading assertion after template rename

The admin_users.html template now uses <h2 class="users-title">Users</h2>
instead of <h2>User management</h2>. Update the assertion to match.

* feat(ui): unify header across remaining 7 standalone pages

These 7 pages render their own full <html> and don't extend base.html,
so the previous unification commit only covered base + dashboard. Each
had its own ad-hoc <header> markup with inconsistent classes
(.top-header / .header / .page-header), inconsistent nav-link sets,
and inconsistent avatar/email styling.

Replace each inline <header>...</header> block with the shared
{% include '_app_header.html' %} so /activity-center, /admin/permissions,
/admin/tables, /catalog, /corporate-memory, /corporate-memory/admin,
and /install all show the same chrome (Dashboard / Install CLI /
Profile / Users / email + avatar / Logout) with the active page
highlighted via request.url.path.

Old inline header CSS (.header, .top-header, .page-header, .nav-link,
etc.) is left in place as harmless dead code; it can be cleaned up in
a follow-up sweep.

* feat(web): add readable preview of Claude setup payload on dashboard + /install

Move the line-by-line setup instructions into app/web/setup_instructions.py
as the single source of truth, then render them in two modes from the
existing _claude_setup_instructions.jinja partial:

- preview_mode=True  → visible, read-only <pre><code> block with the real
  server URL and a clearly-styled placeholder token (never a real one).
- preview_mode=False → the JS SETUP_INSTRUCTIONS_TEMPLATE used by the
  one-click flow (unchanged behaviour).

Both /dashboard (env-setup-cta card) and /install (Step 1 card) now show
the preview directly under the 'Setup a new Claude Code' button so users
can see exactly what will land in their clipboard before they click.

* feat(web): update setup instructions — `da diagnose` step, explicit section titles

Rework the Claude Code setup payload to:

- Give every numbered step an unambiguous verb header ("1) Install the CLI",
  "2) Log in", "3) Verify the login", "4) Run diagnostics", "5) Skills (ask
  the user first)", "6) Confirm").
- Add step 4 `da diagnose` as the post-login health check. The CLI already
  ships this command (cli/commands/diagnose.py); it prints "Overall:
  healthy" and a list of green checks that map cleanly to next actions.
- Ask the skills copy-vs-on-demand question verbatim so Claude Code always
  prompts the user the same way.
- Replace the terse "Confirm" line with a 4-bullet summary (version,
  whoami, skills choice, diagnose status) so the return message is
  structured and comparable across setups.

* chore(web): remove stale MCP card from /install (no MCP server today)

The 'Use with Claude Code / MCP' card (Step 3 on /install) referenced an
MCP integration Agnes does not ship. Remove the whole card. The one-click
'Setup a new Claude Code' flow in Step 1 already covers the long-lived
client use case and is less confusing than dangling persistence tips for
a non-existent integration.

* feat(api): include user_email + last_used_ip + user_id in admin tokens list response

Adds AdminTokenItem response model (superset of TokenListItem) and
AccessTokenRepository.list_all_with_user() joining personal_access_tokens
with users to denormalize user_email. Needed for /admin/tokens UI where
admins triage tokens across all users.

* feat(web): /admin/tokens page — list, filter, search, revoke across all users

Adds a new admin-only page with client-side filtering (status, user email,
last-used window), column sorting, counts bar (active/revoked/expired),
and an inline revoke action. Mirrors the /admin/users visual language.

* feat(web): add Tokens nav link for admins + deep-link from admin/users row

Admin-only nav entry to /admin/tokens, and a per-row Tokens button on
/admin/users that prefills the token page's user filter via ?user=<email>.

* test(admin): cover /admin/tokens rendering, filter state, non-admin denial, revoke

Verifies admin can render the page (title + JS hooks present), a non-admin
is blocked, unauthenticated users are redirected, the admin list response
includes user_email / user_id / last_used_ip, and admin can revoke another
user's token.

* feat(web): modern redesign of /admin/tokens — hero, stat strip, refined table, responsive cards, a11y

* feat(web): ditch the table — /admin/tokens as a card stack, modern GitHub-style list

Replaces the table-based layout with a stack of self-contained token cards
inside a <ul role=list>. Each card is a flex row: avatar + name/meta on the
left, last-used block in the middle, status pill + outlined 'Revoke' button
on the right. Status and sort controls are pill-shaped toggle chips; user
email search has an inline search icon. No <table>/<tr>/<th>/<td> anywhere.
Responsive below 720px (card stacks vertically) and 480px (stat chips 2x2).
Preserves filter IDs (flt-status, flt-user, flt-last-used) and data-revoke
for existing tests.

* feat(web): add /tokens (role-aware) — single page for both user PAT CRUD and admin overview

- Rename admin_tokens.html -> tokens.html with a new is_admin context flag.
- New route GET /tokens: renders the same card-stack UI for everyone.
  * Admins: loads /auth/admin/tokens, shows owner column + stat strip, keeps
    the owner-email search box and sort-by-owner chip.
  * Non-admins: loads /auth/tokens (own tokens only), hides owner column +
    stat chips, adds a 'New token' CTA in the hero that opens a modal
    (name + expires_in_days) calling POST /auth/tokens. The raw token is
    revealed once in a dismissable banner and cleared from the DOM on Hide.
- GET /admin/tokens now 302-redirects to /tokens, preserving query string
  (so the /admin/users deep-link ?user=foo still works).

* feat(web): /tokens full-bleed layout to match dashboard width

The hero, toolbar, and card list used to sit inside base.html's .container
(max-width 800px). Break out with negative horizontal margins so the page
spans the viewport like /dashboard does, capped at 1440px for readability
on very wide screens with a 24px gutter on each side.

- No change to base.html itself. The override is scoped to .tokens-page.
- body { overflow-x: hidden; } guards against rare horizontal scrollbars.
- < 808px viewport: reset to natural flow (mobile already narrower).
- ≥ 1488px viewport: cap to 1440px and re-center.

* chore(web): remove /profile template + nav link (redirect /profile -> /tokens)

The old /profile PAT CRUD page is now redundant — the modern /tokens page
covers both user and admin flows. Delete the template; the router's
/profile handler already 302-redirects to /tokens.

Nav cleanup:
- Remove the 'Profile' link.
- Show a single 'Tokens' link to every signed-in user (previously only
  admins saw it).
- Active-state matches /tokens, /admin/tokens, and /profile so the
  highlight survives the redirect chain.

/install CTA now points at /tokens instead of /profile.

* test: cover /tokens for admin + non-admin flows, /profile redirect, nav update

tests/test_admin_tokens_ui.py
- Point admin rendering test at /tokens directly and tighten assertions
  (admin-only stat strip + owner search, non-admin CTA absent).
- Add test_non_admin_can_render_tokens_page: personal body, New-token CTA,
  create-modal, reveal banner; stat strip + owner search absent.
- Add test_admin_tokens_redirects_to_tokens: 302 to /tokens, query string
  (?user=...) preserved for the /admin/users deep-link.
- Add test_profile_redirects_to_tokens: 302 to /tokens.
- Add test_non_admin_can_create_pat_via_tokens_page_api: exercises the
  POST /auth/tokens call that the non-admin create-modal submits.

tests/test_pat.py
- test_profile_page_renders -> test_profile_page_redirects_to_tokens:
  assert the 302 + that /tokens lands on the unified non-admin body.

tests/test_web_ui.py
- admin_users nav assertion: 'Tokens' link present, 'Profile' link absent.
- Add test_nav_shows_tokens_link_for_non_admin: non-admins see the same
  'Tokens' link (previously only admins did).
- Add test_profile_redirects_to_tokens back-compat check.

* feat(web): collapse 'What Claude Code will receive' by default

The preview block on /dashboard and /install now uses <details>/<summary>
so it is hidden by default. Click the chevron/title to expand and review
the clipboard payload. Markup stays in the DOM so existing tests that
assert on content continue to pass.

* fix(web): /tokens width — override .container to 1280px like dashboard

The negative-margin full-bleed trick was fragile and pushed content past
the right edge on deployed viewports. Replace with a simple max-width
override of base.html's .container on this page only, matching
/dashboard's 1280px center-column layout.

* feat(web): split role-aware /tokens into my_tokens.html + admin_tokens.html

* feat(web): router — separate handlers for /tokens (own) and /admin/tokens (all)

* feat(web): nav — show Tokens for all, add All tokens for admins

* test: cover split token pages (own vs all) + admin access gating

* feat(web): move 'My tokens' into a user dropdown menu

Replaces the separate Tokens/email/Logout nav trio with a rounded
avatar trigger that opens a dropdown containing the user's email,
role, a 'My tokens' link, and Logout. Admin-only 'All tokens' stays
as a top-level nav item since it's an admin function, not a personal
one. Click-outside and Escape close the panel; chevron rotates on
open.

* fix(api): allow PATs to list/get/revoke their own tokens (CLI flow)

The documented 'da auth token list/revoke' CLI flow in
docs/HEADLESS_USAGE.md uses a PAT, but the previous dependency
(require_session_token) returned 403. Only create_token must be
session-only to prevent PAT-spawning-PAT chains; listing and
revoking your own tokens is safe with a PAT.

* fix(api): cap expires_in_days at 3650 to avoid datetime overflow (500 to 400)

Values above ~11 million days overflowed datetime.max in
datetime.now(utc) + timedelta(days=...) and surfaced as an
unhandled OverflowError → 500. Cap at 10 years with a clear
400 instead; the no-expiry code path is unaffected.
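
The validation rules for expires_in_days (reject values <= 0, cap at 3650, keep None as "never") can be sketched as below. The helper name resolve_expires_at and the use of ValueError are assumptions; the real handler in app/api/tokens.py presumably returns an HTTP 400 instead.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_EXPIRES_IN_DAYS = 3650  # 10-year cap taken from the commit message


def resolve_expires_at(expires_in_days: Optional[int]) -> Optional[datetime]:
    """Map the request field to an expiry timestamp, or None for "never"."""
    if expires_in_days is None:
        # `is None` rather than truthiness, so 0 is not mistaken for "never".
        return None
    if expires_in_days <= 0 or expires_in_days > MAX_EXPIRES_IN_DAYS:
        # Hypothetical error type; an API layer would translate this to a 400.
        raise ValueError(f"expires_in_days must be between 1 and {MAX_EXPIRES_IN_DAYS}")
    return datetime.now(timezone.utc) + timedelta(days=expires_in_days)
```

Without the cap, a very large value fails inside the date arithmetic itself: `datetime.now(timezone.utc) + timedelta(days=11_000_000)` raises OverflowError, which is the unhandled error the commit describes.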

* fix(api): relax _SAFE_URL_RE to allow path prefixes, underscores, and IPv6

The previous regex rejected legitimate reverse-proxy base_url values
(https://host/agnes/), underscores in Docker Compose hostnames, and
IPv6 literals (http://[::1]:8000). Widen the charset and allow an
optional trailing path. shlex.quote continues to provide
defense-in-depth against any metacharacter that slips through.

* fix(web): /login/email and Google OAuth propagate next_path

Previously, /login/email silently dropped the ?next=<path> query
param so the hidden form field rendered empty and login always
landed on /dashboard. Google's button was hard-coded to
/auth/google/login, ignoring next entirely.

- /login page now appends ?next to the Google button URL
- /login/email reads + sanitizes next, passes as template context
- google_login stashes sanitized next_path in session['login_next']
- google_callback pops + re-sanitizes and redirects there

Sanitization factored into app/auth/_common.safe_next_path.
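
For illustration, a sanitizer of this kind might look like the sketch below. This is not the body of app/auth/_common.safe_next_path, which is not shown here; the specific checks and the default argument are assumptions, with /dashboard chosen because that is where login landed before the fix.

```python
from typing import Optional
from urllib.parse import urlsplit

DEFAULT_NEXT = "/dashboard"


def safe_next_path(raw: Optional[str], default: str = DEFAULT_NEXT) -> str:
    """Return `raw` only if it is a same-site absolute path, else `default`."""
    if not raw or not raw.startswith("/"):
        return default
    # "//host" and "/\host" are treated by browsers as protocol-relative URLs.
    if raw.startswith("//") or raw.startswith("/\\"):
        return default
    # Reject control characters before any URL parsing.
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in raw):
        return default
    parts = urlsplit(raw)
    if parts.scheme or parts.netloc:
        return default
    return raw
```

Running the same function both when the value is stashed and when it is read back mirrors the "pops + re-sanitizes" step in google_callback.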

* fix(auth): differentiate argon2 VerifyMismatchError from internal errors in web login

The previous except (VerifyMismatchError, Exception) collapsed both
cases into the generic 'invalid credentials' redirect, silently
hiding corrupted-hash / library errors from ops. Split the two:
bad password still gets ?error=invalid; anything else logs via
logger.exception and redirects with ?err=auth_internal so ops have
a visible signal and users don't retry forever against a broken
password_hash column.

* docs: correct CLAUDE.md table name (personal_access_tokens)

v7 note referenced 'access_tokens.last_used_ip' but the real table
is personal_access_tokens (as mentioned two tokens earlier in the
same bullet). Same-file consistency fix.

* chore(web): clarify admin user-reset UI — encourage Set password over the unused reset_token

POST /api/users/{id}/reset-password stores and returns a token
but no endpoint consumes it — the magic-link sender would log the
user in without prompting for a new password, defeating the reset.
- Drop the 'Reset' row action from admin_users so admins aren't
  pointed at a dead end.
- Rewrite the reveal-modal copy to tell admins to use Set password
  and explicitly note that the magic-link flow isn't available
  for reset tokens in this build.
The API endpoint stays for API-level future use.

* test: cover PAT CLI flow, expires_in_days overflow, proxy base_url, next propagation

- tests/test_pat.py: PAT can list own tokens (200, was 403);
  PAT can revoke own tokens (204); create_token returns 400 for
  expires_in_days > 3650 (was 500 via datetime overflow).
- tests/test_cli_artifacts.py: _SAFE_URL_RE accepts reverse-proxy
  path prefixes, underscores, and IPv6 literals; end-to-end check
  of cli_install_script with a stubbed base_url that includes
  a path prefix (Agnes behind /agnes/).
- tests/test_web_ui.py: /login propagates ?next to the Google
  button URL; /login/email renders next in the hidden form field
  and strips hostile values; unit coverage of safe_next_path.

* fix(security): use \Z instead of $ in URL/version allowlists (trailing-\n bypass)

Python regex `$` also matches just before a trailing newline, so a Host
header or AGNES_VERSION value like "good.example.com\n$(rm -rf /)"
would slip past the allowlist. `\Z` anchors to strict end-of-string.

shlex.quote downstream remains as defense-in-depth, but the allowlist
is now the tight gate it claims to be.
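
The difference between the two anchors is easy to reproduce with the standard library. The pattern below is an illustrative allowlist only; the project's real _SAFE_URL_RE and _SAFE_VERSION_RE are not shown here and may differ.

```python
import re
import shlex

# Illustrative version allowlist (alnum plus dot, dash, underscore).
LOOSE = re.compile(r"^[A-Za-z0-9._-]+$")
STRICT = re.compile(r"^[A-Za-z0-9._-]+\Z")

value = "1.2.3\n"  # trailing newline appended by an attacker

# `$` matches just before a final "\n", so the loose pattern accepts the value.
loose_ok = LOOSE.match(value) is not None

# `\Z` matches only at the true end of the string, so the strict pattern rejects it.
strict_ok = STRICT.match(value) is not None

# shlex.quote still wraps the value in single quotes as a second line of defense.
quoted = shlex.quote(value)
```

Here loose_ok is True and strict_ok is False, which is the gap the switch to `\Z` closes.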

* fix(auth): PAT with null expiry omits JWT exp claim (DB is the source of truth)

Previously a PAT created with `expires_in_days=null` (user-requested
"never expires") set the DB `expires_at` to NULL (correct) but still
baked a ~100y `exp` claim into the JWT. That is misleading: the PAT
silently did expire eventually, despite the UI and API promising
"no expiry".

`create_access_token` now accepts `omit_exp=True` to skip the `exp`
claim entirely. `app/api/tokens.py` passes that when `expires_in_days
is None`. The authoritative expiry check lives in
`app/auth/dependencies.py`, which reads `expires_at` from the DB row —
unchanged. PyJWT accepts claim-less JWTs indefinitely.

* test: cover trailing-newline regex bypass + no-exp JWT for unbounded PAT

- test_safe_url_re_rejects_trailing_newline_bypass: asserts both
  `_SAFE_URL_RE` and `_SAFE_VERSION_RE` reject values with a trailing
  `\n` (previously accepted because Python `$` matches before `\n`).
- test_pat_null_expiry_jwt_has_no_exp_claim: POST /auth/tokens with
  `expires_in_days=null`, decode the returned JWT, assert `exp` is
  absent while `typ=pat`, `sub`, and `jti` are still present.
- test_pat_with_null_expiry_is_accepted_by_verify_token: verify_token
  round-trips a claim-less JWT without ExpiredSignatureError.
- test_pat_null_expiry_end_to_end_allows_authenticated_request: use
  the null-expiry PAT against /auth/tokens and confirm it authenticates.

* docs(auth): document X-Forwarded-For trust model in _client_ip

Deployment runs behind Caddy which strips incoming X-Forwarded-For
and sets its own, so the leftmost hop is trustworthy. Clarify that
the stored last_used_ip is audit-only and never used for access
control — if the app is ever exposed directly, this value becomes
client-settable.

* docs: /profile → /tokens in install.sh next-steps, CLI error, HEADLESS_USAGE, security skill

After splitting PAT management to /tokens (with /profile as a back-compat
302), stale references remained in user-facing text. Update them to the
canonical /tokens URL so shell scripts, CLI error hints, docs, and the
bundled security skill are all consistent.
2026-04-22 14:24:28 +02:00


Cloudflare Access Auth Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Add Cloudflare Access as a third authentication method that coexists with existing password and Google OAuth providers — users behind a Cloudflare Zero Trust tunnel are auto-logged-in via signed edge JWT; direct-access users keep password/Google/PAT flows.

Architecture: New provider module app/auth/providers/cloudflare.py exposes is_available() + verify_cf_jwt(). A starlette middleware (app/auth/middleware.py, wired in app/main.py) runs before route handlers, detects the Cf-Access-Jwt-Assertion header, verifies it against the Cloudflare team JWKS with audience check, and — on success — provisions the user and sets the standard access_token cookie. Middleware is a pure pass-through when the header is missing or verification fails, preserving all existing flows.

Tech Stack: PyJWT 2.12 (already in deps) with PyJWKClient for JWKS fetch + 5-min cache; a Starlette BaseHTTPMiddleware subclass registered on the FastAPI app; existing UserRepository + create_access_token() helpers.


Context — What's Already in the Codebase

Existing auth providers follow a common pattern:

  • app/auth/providers/password.py:32, where is_available() always returns True
  • app/auth/providers/google.py:24, where is_available() returns True when GOOGLE_CLIENT_ID + GOOGLE_CLIENT_SECRET are set
  • app/auth/providers/email.py:31, where is_available() returns True when SMTP_HOST or SENDGRID_API_KEY is set

All three self-register via is_available() in app/main.py:138-146. Login page (app/web/router.py:194-232) iterates providers and builds login_buttons dynamically.

Cookie flow: create_access_token() → response.set_cookie(key="access_token", ...) (see app/auth/providers/google.py:97-103 for the reference pattern). Downstream routes read the token via app/auth/dependencies.py:33, checking the Authorization header first and falling back to the cookie.
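
The "header first, cookie fallback" lookup can be sketched as a small pure function. This is an illustration only: `extract_token` is a hypothetical name, the real logic lives in app/auth/dependencies.py and may differ, and plain dicts with lowercase header names stand in for the request object.

```python
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], cookies: Mapping[str, str]) -> Optional[str]:
    """Return the bearer token from the Authorization header, else the cookie value."""
    auth = headers.get("authorization", "")
    scheme, _, value = auth.partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # Cookie fallback: browsers carry the session in the access_token cookie.
    return cookies.get("access_token") or None


# Header takes precedence when both are present.
print(extract_token({"authorization": "Bearer abc"}, {"access_token": "xyz"}))
# With no Authorization header, the cookie is used.
print(extract_token({}, {"access_token": "xyz"}))
```

Because the cookie is only a fallback, a middleware that sets `access_token` does not change how Bearer clients authenticate.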

Cloudflare Access differs from those providers — it's not a clickable login button. It's an edge gate that injects a signed JWT in the Cf-Access-Jwt-Assertion header on every request. The provider therefore lives as a middleware that runs before handlers and transparently exchanges that edge JWT for our session cookie.

Env Vars (New)

| Var | Required | Purpose |
|---|---|---|
| CF_ACCESS_TEAM | yes | Your Cloudflare team domain prefix (e.g. keboola → https://keboola.cloudflareaccess.com/cdn-cgi/access/certs) |
| CF_ACCESS_AUD | yes | Application AUD tag (from CF dashboard → Access → Applications → your app → Overview) |
| CF_ACCESS_DOMAIN_ALLOW | no | Comma-separated email domain allowlist; if unset, falls back to instance.yaml allowed_domains (same as Google) |
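
As a rough illustration of how the two required variables map to the issuer and JWKS URLs, the sketch below loads them from a mapping. `CfAccessSettings` and `load_settings` are hypothetical names that the plan does not define, and stripping whitespace is an assumption made here for robustness.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CfAccessSettings:
    team: str
    aud: str

    @property
    def issuer(self) -> str:
        return f"https://{self.team}.cloudflareaccess.com"

    @property
    def jwks_url(self) -> str:
        return f"{self.issuer}/cdn-cgi/access/certs"


def load_settings(env: Mapping[str, str]) -> Optional[CfAccessSettings]:
    """Return settings only when BOTH required variables are present."""
    team = env.get("CF_ACCESS_TEAM", "").strip()  # .strip() is an assumption
    aud = env.get("CF_ACCESS_AUD", "").strip()
    if not (team and aud):
        return None
    return CfAccessSettings(team=team, aud=aud)


settings = load_settings({"CF_ACCESS_TEAM": "keboola", "CF_ACCESS_AUD": "example-aud"})
print(settings.jwks_url if settings else "provider disabled")
```

Passing `os.environ` as `env` gives the production behavior; passing a plain dict keeps tests free of global state.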

Security Model

  1. Trust gate: middleware only inspects the header when both CF_ACCESS_TEAM and CF_ACCESS_AUD are set. If either is unset, the header is ignored — this prevents header spoofing on deployments that don't sit behind CF.
  2. Audience check: jwt.decode(..., audience=CF_ACCESS_AUD) — PyJWT raises on mismatch.
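  3. Issuer check: issuer=f"https://{CF_ACCESS_TEAM}.cloudflareaccess.com" combined with options={"require": ["exp", "iat", "iss", "aud"]}, so a token that omits any of those claims is rejected rather than silently accepted.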
  4. Signature: JWKS public keys fetched from https://<team>.cloudflareaccess.com/cdn-cgi/access/certs (cached by PyJWKClient with 5-min TTL).
  5. Failure is silent pass-through: invalid/expired/missing JWT → middleware does nothing, request proceeds to route where normal auth (cookie/Bearer/login redirect) kicks in. Never returns 401 from middleware — that would break password/Google login paths.
  6. Domain allowlist: same logic as google.py:68-72 — reject email domains outside the configured allowlist.
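
To make checks 2 and 3 concrete, here is a minimal sketch of the claim checks on an already-decoded payload. It deliberately omits signature verification (item 4) and is not a substitute for `jwt.decode`, which performs these checks in the plan. `claims_acceptable` is a hypothetical helper; the handling of `aud` as either a string or a list follows RFC 7519.

```python
import time
from typing import Any, List, Mapping, Optional


def claims_acceptable(
    claims: Mapping[str, Any],
    expected_aud: str,
    expected_iss: str,
    now: Optional[float] = None,
) -> bool:
    """Check aud, iss and exp on decoded claims. Signature is NOT verified here."""
    now = time.time() if now is None else now

    aud = claims.get("aud")
    if isinstance(aud, str):
        audiences: List[str] = [aud]
    elif isinstance(aud, (list, tuple)):
        audiences = list(aud)
    else:
        audiences = []
    if expected_aud not in audiences:
        return False

    if claims.get("iss") != expected_iss:
        return False

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return False
    return True


good = {"aud": ["test-aud-123"], "iss": "https://testteam.cloudflareaccess.com", "exp": 2000}
print(claims_acceptable(good, "test-aud-123", "https://testteam.cloudflareaccess.com", now=1000))
```

A token that fails any check maps to "pass through" in the middleware, never to a 401.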

File Structure

Create:

  • app/auth/providers/cloudflare.py — provider module (is_available, verify_cf_jwt, get_or_create_user_from_cf)
  • app/auth/middleware.py: CloudflareAccessMiddleware, a Starlette middleware class
  • tests/test_cloudflare_auth.py — unit + integration tests
  • docs/auth-cloudflare.md — ops doc (how to configure the CF tunnel + Access app)

Modify:

  • app/main.py:137-146 — register middleware between SessionMiddleware and route handlers
  • app/web/router.py:194-232 — add optional "Protected by Cloudflare Access" hint on login page when provider is available

Task 1: Test scaffolding — fixtures for JWKS + signed tokens

Before writing any provider code, build the test plumbing. The rest of the plan depends on it.

Files:

  • Create: tests/test_cloudflare_auth.py

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write a fixture that generates an RSA keypair and a mock JWKS

Add to tests/test_cloudflare_auth.py:

"""Tests for Cloudflare Access auth provider and middleware."""

import json
import time
import uuid
from base64 import urlsafe_b64encode
from typing import Callable

import pytest
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.primitives import serialization
import jwt as pyjwt


def _b64url_uint(n: int) -> str:
    """Encode an integer as base64url per RFC 7518 §6.3.1."""
    byte_length = (n.bit_length() + 7) // 8
    return urlsafe_b64encode(n.to_bytes(byte_length, "big")).rstrip(b"=").decode()


@pytest.fixture
def cf_keypair():
    """Generate an RSA keypair for signing test CF Access JWTs."""
    priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    pub = priv.public_key()
    pub_numbers = pub.public_numbers()
    kid = "test-kid-1"
    jwks = {
        "keys": [
            {
                "kty": "RSA",
                "kid": kid,
                "use": "sig",
                "alg": "RS256",
                "n": _b64url_uint(pub_numbers.n),
                "e": _b64url_uint(pub_numbers.e),
            }
        ]
    }
    priv_pem = priv.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
    return {"kid": kid, "jwks": jwks, "private_pem": priv_pem}


@pytest.fixture
def make_cf_jwt(cf_keypair) -> Callable[..., str]:
    """Factory: build a signed CF Access JWT with overridable claims."""
    def _make(
        email: str = "user@example.com",
        aud: str = "test-aud-123",
        iss: str = "https://testteam.cloudflareaccess.com",
        exp_offset: int = 3600,
        name: str = "Test User",
        extra_claims: dict | None = None,
    ) -> str:
        now = int(time.time())
        claims = {
            "email": email,
            "name": name,
            "aud": aud,
            "iss": iss,
            "iat": now,
            "exp": now + exp_offset,
            "sub": str(uuid.uuid4()),
        }
        if extra_claims:
            claims.update(extra_claims)
        return pyjwt.encode(
            claims,
            cf_keypair["private_pem"],
            algorithm="RS256",
            headers={"kid": cf_keypair["kid"]},
        )
    return _make
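
As a quick sanity check on the JWK integer encoding used by the `cf_keypair` fixture, the sketch below round-trips a value through base64url. `b64url_to_uint` is a hypothetical inverse added for illustration; it is not part of the plan's test file.

```python
from base64 import urlsafe_b64decode, urlsafe_b64encode


def b64url_uint(n: int) -> str:
    """Encode an integer as unpadded base64url (same logic as _b64url_uint)."""
    byte_length = (n.bit_length() + 7) // 8
    return urlsafe_b64encode(n.to_bytes(byte_length, "big")).rstrip(b"=").decode()


def b64url_to_uint(value: str) -> int:
    """Decode unpadded base64url back to an integer (hypothetical inverse)."""
    padded = value + "=" * (-len(value) % 4)  # restore the stripped padding
    return int.from_bytes(urlsafe_b64decode(padded), "big")


print(b64url_uint(65537))      # AQAB
print(b64url_to_uint("AQAB"))  # 65537
```

The value `AQAB` is the familiar encoding of the public exponent 65537, so seeing it in the fixture's `e` field is a good sign that the JWKS is well-formed.
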
  • Step 2: Add fixtures — reset cached JWKS client + patch JWKS fetch

Append to tests/test_cloudflare_auth.py:

@pytest.fixture(autouse=True)
def _reset_cf_jwks_cache(monkeypatch):
    """Reset the module-level JWKS client so each test starts fresh.

    Without this, a client built from a previous test's team/URL would persist.
    """
    import sys
    mod = sys.modules.get("app.auth.providers.cloudflare")
    if mod is not None:
        monkeypatch.setattr(mod, "_JWKS_CLIENT", None, raising=False)
        monkeypatch.setattr(mod, "_JWKS_TEAM", None, raising=False)


@pytest.fixture
def patch_jwks(monkeypatch, cf_keypair):
    """Patch PyJWKClient so verify_cf_jwt reads our test key instead of hitting the network."""
    from cryptography.hazmat.primitives import serialization as _ser
    # Build a PyJWK-compatible signing key object from the public key
    pub_pem = _ser.load_pem_private_key(cf_keypair["private_pem"], password=None).public_key().public_bytes(
        encoding=_ser.Encoding.PEM,
        format=_ser.PublicFormat.SubjectPublicKeyInfo,
    )

    class _FakeSigningKey:
        def __init__(self, key_bytes: bytes):
            from cryptography.hazmat.primitives.serialization import load_pem_public_key
            self.key = load_pem_public_key(key_bytes)

    def _fake_get_signing_key_from_jwt(self, token):
        return _FakeSigningKey(pub_pem)

    monkeypatch.setattr(
        "jwt.PyJWKClient.get_signing_key_from_jwt",
        _fake_get_signing_key_from_jwt,
    )
  • Step 3: Add a client fixture with CF env configured

Append to tests/test_cloudflare_auth.py:

@pytest.fixture
def cf_client(tmp_path, monkeypatch, patch_jwks):
    """TestClient with CF_ACCESS_* env vars set so the provider is available."""
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
    monkeypatch.setenv("TESTING", "1")
    monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
    monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")

    from fastapi.testclient import TestClient
    from app.main import create_app

    app = create_app()
    return TestClient(app)


@pytest.fixture
def no_cf_client(tmp_path, monkeypatch):
    """TestClient WITHOUT CF_ACCESS_* env — provider should be unavailable."""
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
    monkeypatch.setenv("TESTING", "1")
    monkeypatch.delenv("CF_ACCESS_TEAM", raising=False)
    monkeypatch.delenv("CF_ACCESS_AUD", raising=False)

    from fastapi.testclient import TestClient
    from app.main import create_app

    app = create_app()
    return TestClient(app)
  • Step 4: Run pytest to confirm fixtures import cleanly (no tests yet — the file should collect with 0 tests)

Run: pytest tests/test_cloudflare_auth.py -v Expected: collected 0 items (no tests defined yet, but no import/collection errors).

  • Step 5: Commit
git add tests/test_cloudflare_auth.py
git commit -m "test(auth): add Cloudflare Access test scaffolding fixtures"

Task 2: Cloudflare provider module — is_available()

Files:

  • Create: app/auth/providers/cloudflare.py

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write the failing is_available() tests

Append to tests/test_cloudflare_auth.py:

class TestCloudflareProviderAvailability:
    def test_unavailable_without_env(self, monkeypatch):
        monkeypatch.delenv("CF_ACCESS_TEAM", raising=False)
        monkeypatch.delenv("CF_ACCESS_AUD", raising=False)
        # Force re-import so module-level env reads are fresh
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)
        assert cf_mod.is_available() is False

    def test_unavailable_with_only_team(self, monkeypatch):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.delenv("CF_ACCESS_AUD", raising=False)
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)
        assert cf_mod.is_available() is False

    def test_unavailable_with_only_aud(self, monkeypatch):
        monkeypatch.delenv("CF_ACCESS_TEAM", raising=False)
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)
        assert cf_mod.is_available() is False

    def test_available_with_both_env(self, monkeypatch):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)
        assert cf_mod.is_available() is True
  • Step 2: Run tests to verify they fail

Run: pytest tests/test_cloudflare_auth.py::TestCloudflareProviderAvailability -v Expected: FAIL with ModuleNotFoundError: No module named 'app.auth.providers.cloudflare'

  • Step 3: Create app/auth/providers/cloudflare.py with minimal is_available()

Create app/auth/providers/cloudflare.py:

"""Cloudflare Access auth provider — verifies edge JWT from Cloudflare Zero Trust.

Unlike password/google/email providers, Cloudflare Access is NOT a clickable
login button. Cloudflare's edge gate injects a signed JWT in the
`Cf-Access-Jwt-Assertion` header on every request. The app trusts that JWT
(after verifying signature + audience) and auto-provisions the user, issuing
our standard `access_token` cookie so downstream route handlers work unchanged.

This module exposes pure functions; the request-interception logic lives in
`app/auth/middleware.py`.
"""

import logging
import os

logger = logging.getLogger(__name__)


def _team() -> str:
    return os.environ.get("CF_ACCESS_TEAM", "")


def _aud() -> str:
    return os.environ.get("CF_ACCESS_AUD", "")


def is_available() -> bool:
    """Provider is active only when BOTH team and aud are configured.

    The two-env-var gate prevents header spoofing on deployments that don't
    sit behind Cloudflare — an attacker could otherwise forge
    `Cf-Access-Jwt-Assertion` and bypass auth.

    Env vars are read at call time (not cached at import) so tests and
    runtime env changes behave predictably.
    """
    return bool(_team() and _aud())
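
The docstring's point about reading env vars at call time can be shown with a small stand-alone comparison. `team_cached` and `team_live` are hypothetical names, and the first line resets the variable only so the demo starts from a known state.

```python
import os

os.environ.pop("CF_ACCESS_TEAM", None)  # known starting state for the demo

# Import-time snapshot: frozen at the moment this line runs.
_TEAM_AT_IMPORT = os.environ.get("CF_ACCESS_TEAM", "")


def team_cached() -> str:
    """Returns the snapshot, ignoring later environment changes."""
    return _TEAM_AT_IMPORT


def team_live() -> str:
    """Reads the environment on every call, like _team() in the provider."""
    return os.environ.get("CF_ACCESS_TEAM", "")


os.environ["CF_ACCESS_TEAM"] = "testteam"
print(repr(team_cached()))  # still the import-time value
print(repr(team_live()))    # reflects the change
```

Under this reading, the importlib.reload calls in the tests act as an extra safeguard rather than a requirement, since monkeypatch.setenv alone is visible to call-time reads.
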
  • Step 4: Run tests to verify they pass

Run: pytest tests/test_cloudflare_auth.py::TestCloudflareProviderAvailability -v Expected: 4 passed.

  • Step 5: Commit
git add app/auth/providers/cloudflare.py tests/test_cloudflare_auth.py
git commit -m "feat(auth): Cloudflare Access provider skeleton with is_available()"

Task 3: JWT verification with JWKS

Files:

  • Modify: app/auth/providers/cloudflare.py

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write the failing verify_cf_jwt tests

Append to tests/test_cloudflare_auth.py:

class TestVerifyCfJwt:
    def test_valid_token_returns_claims(self, monkeypatch, patch_jwks, make_cf_jwt):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        token = make_cf_jwt(email="alice@example.com")
        claims = cf_mod.verify_cf_jwt(token)
        assert claims is not None
        assert claims["email"] == "alice@example.com"

    def test_wrong_audience_rejected(self, monkeypatch, patch_jwks, make_cf_jwt):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        token = make_cf_jwt(aud="wrong-aud")
        assert cf_mod.verify_cf_jwt(token) is None

    def test_wrong_issuer_rejected(self, monkeypatch, patch_jwks, make_cf_jwt):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        token = make_cf_jwt(iss="https://evil.example.com")
        assert cf_mod.verify_cf_jwt(token) is None

    def test_expired_token_rejected(self, monkeypatch, patch_jwks, make_cf_jwt):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        token = make_cf_jwt(exp_offset=-60)  # expired 60s ago
        assert cf_mod.verify_cf_jwt(token) is None

    def test_malformed_token_rejected(self, monkeypatch, patch_jwks):
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        assert cf_mod.verify_cf_jwt("not-a-jwt") is None
        assert cf_mod.verify_cf_jwt("") is None

    def test_verify_returns_none_when_unavailable(self, monkeypatch):
        monkeypatch.delenv("CF_ACCESS_TEAM", raising=False)
        monkeypatch.delenv("CF_ACCESS_AUD", raising=False)
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        assert cf_mod.verify_cf_jwt("anything") is None
  • Step 2: Run tests to verify they fail

Run: pytest tests/test_cloudflare_auth.py::TestVerifyCfJwt -v Expected: FAIL with AttributeError: module 'app.auth.providers.cloudflare' has no attribute 'verify_cf_jwt'

  • Step 3: Implement verify_cf_jwt in app/auth/providers/cloudflare.py

Replace the contents of app/auth/providers/cloudflare.py with:

"""Cloudflare Access auth provider — verifies edge JWT from Cloudflare Zero Trust.

Unlike password/google/email providers, Cloudflare Access is NOT a clickable
login button. Cloudflare's edge gate injects a signed JWT in the
`Cf-Access-Jwt-Assertion` header on every request. The app trusts that JWT
(after verifying signature + audience) and auto-provisions the user, issuing
our standard `access_token` cookie so downstream route handlers work unchanged.

This module exposes pure functions; the request-interception logic lives in
`app/auth/middleware.py`.
"""

import logging
import os
from typing import Optional

import jwt as pyjwt
from jwt import PyJWKClient

logger = logging.getLogger(__name__)

_JWKS_CLIENT: Optional[PyJWKClient] = None
_JWKS_TEAM: Optional[str] = None  # team string the cached client was built for


def _team() -> str:
    return os.environ.get("CF_ACCESS_TEAM", "")


def _aud() -> str:
    return os.environ.get("CF_ACCESS_AUD", "")


def is_available() -> bool:
    """Provider is active only when BOTH team and aud are configured."""
    return bool(_team() and _aud())


def _jwks_url() -> str:
    return f"https://{_team()}.cloudflareaccess.com/cdn-cgi/access/certs"


def _issuer() -> str:
    return f"https://{_team()}.cloudflareaccess.com"


def _get_jwks_client() -> PyJWKClient:
    """Lazy-init JWKS client. PyJWKClient caches keys with 5-min TTL by default.

    If `CF_ACCESS_TEAM` changes (e.g. between tests), rebuild the client.
    """
    global _JWKS_CLIENT, _JWKS_TEAM
    current_team = _team()
    if _JWKS_CLIENT is None or _JWKS_TEAM != current_team:
        _JWKS_CLIENT = PyJWKClient(_jwks_url(), cache_jwk_set=True, lifespan=300)
        _JWKS_TEAM = current_team
    return _JWKS_CLIENT


def verify_cf_jwt(token: str) -> Optional[dict]:
    """Verify a Cloudflare Access JWT. Returns claims dict on success, None on any failure.

    Never raises — all exceptions are logged at debug and mapped to None so the
    middleware can treat them as "pass through to normal auth."
    """
    if not is_available():
        return None
    if not token:
        return None
    try:
        signing_key = _get_jwks_client().get_signing_key_from_jwt(token)
        claims = pyjwt.decode(
            token,
            signing_key.key,
            algorithms=["RS256"],
            audience=_aud(),
            issuer=_issuer(),
            options={"require": ["exp", "iat", "iss", "aud"]},
        )
        return claims
    except pyjwt.InvalidTokenError as e:
        logger.debug("CF Access JWT invalid: %s", e)
        return None
    except Exception as e:
        # JWKS fetch failure, network error, etc. — never propagate
        logger.warning("CF Access JWT verification error: %s", e)
        return None
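
The `_get_jwks_client` helper combines lazy initialisation with a rebuild when the team changes. The pattern in isolation might look like the sketch below, where `KeyedLazy` is a hypothetical class and a string-returning factory stands in for `PyJWKClient`.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class KeyedLazy(Generic[T]):
    """Hold one lazily built object and rebuild it when its key changes."""

    def __init__(self, factory: Callable[[str], T]) -> None:
        self._factory = factory
        self._key: Optional[str] = None
        self._value: Optional[T] = None

    def get(self, key: str) -> T:
        if self._value is None or self._key != key:
            self._value = self._factory(key)
            self._key = key
        return self._value


builds: List[str] = []


def fake_client(team: str) -> str:
    builds.append(team)  # record every construction
    return f"client-for-{team}"


cache = KeyedLazy(fake_client)
cache.get("testteam")
cache.get("testteam")   # reused, no rebuild
cache.get("otherteam")  # key changed, rebuilt
print(builds)
```

Wrapping the two module globals in an object like this is optional; the module-level version in the plan is easier for the test fixture to reset with monkeypatch.
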
  • Step 4: Run tests to verify they pass

Run: pytest tests/test_cloudflare_auth.py::TestVerifyCfJwt -v Expected: 6 passed.

  • Step 5: Commit
git add app/auth/providers/cloudflare.py tests/test_cloudflare_auth.py
git commit -m "feat(auth): verify Cloudflare Access JWT with JWKS + audience + issuer"

Task 4: User provisioning from Cloudflare identity

Files:

  • Modify: app/auth/providers/cloudflare.py

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write the failing user-provisioning tests

Append to tests/test_cloudflare_auth.py:

class TestGetOrCreateUserFromCf:
    def test_creates_new_user(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        from src.db import get_system_db
        conn = get_system_db()
        try:
            user = cf_mod.get_or_create_user_from_cf(
                email="new@example.com", name="New User", conn=conn,
            )
            assert user is not None
            assert user["email"] == "new@example.com"
            assert user["role"] == "analyst"
        finally:
            conn.close()

    def test_returns_existing_user(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        from src.db import get_system_db
        from src.repositories.users import UserRepository
        conn = get_system_db()
        try:
            UserRepository(conn).create(
                id="existing-id", email="existing@example.com",
                name="Existing", role="admin",
            )
            user = cf_mod.get_or_create_user_from_cf(
                email="existing@example.com", name="Existing", conn=conn,
            )
            assert user["id"] == "existing-id"
            assert user["role"] == "admin"  # role preserved
        finally:
            conn.close()

    def test_deactivated_user_rejected(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        from src.db import get_system_db
        from src.repositories.users import UserRepository
        conn = get_system_db()
        try:
            UserRepository(conn).create(
                id="deact-id", email="deact@example.com",
                name="Deact", role="analyst",
            )
            UserRepository(conn).update(id="deact-id", active=False)
            user = cf_mod.get_or_create_user_from_cf(
                email="deact@example.com", name="Deact", conn=conn,
            )
            assert user is None
        finally:
            conn.close()

    def test_domain_allowlist_rejects_outsider(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        monkeypatch.setenv("CF_ACCESS_DOMAIN_ALLOW", "example.com")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        from src.db import get_system_db
        conn = get_system_db()
        try:
            user = cf_mod.get_or_create_user_from_cf(
                email="outsider@evil.com", name="Outsider", conn=conn,
            )
            assert user is None
        finally:
            conn.close()

    def test_domain_allowlist_accepts_insider(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("CF_ACCESS_TEAM", "testteam")
        monkeypatch.setenv("CF_ACCESS_AUD", "test-aud-123")
        monkeypatch.setenv("CF_ACCESS_DOMAIN_ALLOW", "example.com,partner.com")
        import importlib
        from app.auth.providers import cloudflare as cf_mod
        importlib.reload(cf_mod)

        from src.db import get_system_db
        conn = get_system_db()
        try:
            user = cf_mod.get_or_create_user_from_cf(
                email="ok@partner.com", name="Partner", conn=conn,
            )
            assert user is not None
            assert user["email"] == "ok@partner.com"
        finally:
            conn.close()
  • Step 2: Run tests to verify they fail

Run: pytest tests/test_cloudflare_auth.py::TestGetOrCreateUserFromCf -v Expected: FAIL with AttributeError: ... has no attribute 'get_or_create_user_from_cf'

  • Step 3: Add get_or_create_user_from_cf to app/auth/providers/cloudflare.py

Append to app/auth/providers/cloudflare.py (the import lines may instead be merged into the import block at the top of the file):

import uuid
from typing import Any

import duckdb

from src.repositories.users import UserRepository


def _allowed_domains() -> list[str]:
    """Domain allowlist — CF_ACCESS_DOMAIN_ALLOW env wins, else instance.yaml."""
    env = os.environ.get("CF_ACCESS_DOMAIN_ALLOW", "").strip()
    if env:
        return [d.strip().lower() for d in env.split(",") if d.strip()]
    try:
        from app.instance_config import get_allowed_domains
        return [d.lower() for d in (get_allowed_domains() or [])]
    except Exception:
        return []


def get_or_create_user_from_cf(
    email: str,
    name: str,
    conn: duckdb.DuckDBPyConnection,
) -> Optional[dict[str, Any]]:
    """Look up or provision a user from a verified CF Access identity.

    Returns the user dict on success; returns None when:
    - email domain is outside the allowlist
    - user exists but is deactivated

    New users default to `analyst` role (same default as Google OAuth).
    """
    if not email or not isinstance(email, str):
        return None

    allow = _allowed_domains()
    if allow:
        domain = email.split("@")[-1].lower()
        if domain not in allow:
            logger.info("CF Access: rejecting email outside allowlist: %s", email)
            return None

    repo = UserRepository(conn)
    user = repo.get_by_email(email)
    if user is None:
        user_id = str(uuid.uuid4())
        repo.create(
            id=user_id,
            email=email,
            name=name or email.split("@")[0],
            role="analyst",
        )
        user = repo.get_by_email(email)
        logger.info("CF Access: provisioned new user %s", email)

    if not bool(user.get("active", True)):
        logger.info("CF Access: rejecting deactivated user %s", email)
        return None

    return user
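
The allowlist handling can be exercised without a database. The sketch below mirrors the parsing and matching shown above using two hypothetical helpers, `parse_allowlist` and `domain_allowed`; it assumes, as the plan's code does, that matching is exact, so subdomains are not implicitly allowed.

```python
from typing import List


def parse_allowlist(raw: str) -> List[str]:
    """Split a comma-separated list, trimming blanks and lowercasing."""
    return [d.strip().lower() for d in raw.split(",") if d.strip()]


def domain_allowed(email: str, allow: List[str]) -> bool:
    """Exact-match the email's domain against the allowlist."""
    if not allow:
        return True  # empty allowlist means no domain restriction
    domain = email.split("@")[-1].lower()
    return domain in allow


allow = parse_allowlist("example.com, Partner.com ,")
print(allow)
print(domain_allowed("ok@partner.com", allow))
print(domain_allowed("outsider@evil.com", allow))
```

If subdomains should be accepted, that would need an explicit suffix check; the plan does not ask for one.
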
  • Step 4: Run tests to verify they pass

Run: pytest tests/test_cloudflare_auth.py::TestGetOrCreateUserFromCf -v Expected: 5 passed.

  • Step 5: Commit
git add app/auth/providers/cloudflare.py tests/test_cloudflare_auth.py
git commit -m "feat(auth): provision users from verified Cloudflare Access identity"

Task 5: Middleware skeleton — pass-through

Files:

  • Create: app/auth/middleware.py

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write the pass-through tests (baseline behavior; expected to pass even before the middleware exists)

Append to tests/test_cloudflare_auth.py:

class TestMiddlewarePassthrough:
    def test_no_header_no_cookie_redirects_to_login(self, cf_client):
        """Dashboard without any auth → normal 302 to /login (middleware must not interfere)."""
        resp = cf_client.get("/dashboard", follow_redirects=False)
        assert resp.status_code == 302
        assert "/login" in resp.headers.get("location", "")

    def test_invalid_cf_header_passes_through(self, cf_client):
        """Garbage CF header → middleware ignores it → normal 302 to login."""
        resp = cf_client.get(
            "/dashboard",
            headers={"Cf-Access-Jwt-Assertion": "not-a-valid-jwt"},
            follow_redirects=False,
        )
        assert resp.status_code == 302
        assert "/login" in resp.headers.get("location", "")

    def test_middleware_unavailable_when_env_missing(self, no_cf_client, make_cf_jwt):
        """Without CF_ACCESS_* env, middleware must be inert even if header is present."""
        # Note: make_cf_jwt still produces a token but middleware should ignore it.
        token = make_cf_jwt()
        resp = no_cf_client.get(
            "/dashboard",
            headers={"Cf-Access-Jwt-Assertion": token},
            follow_redirects=False,
        )
        # No cookie set, normal redirect to login
        assert resp.status_code == 302
        assert "access_token" not in resp.cookies
  • Step 2: Run tests to record the baseline

Run: pytest tests/test_cloudflare_auth.py::TestMiddlewarePassthrough -v Expected: 3 passed. Unlike the other tasks, there is no red phase here: app/auth/middleware.py does not exist yet, and create_app() does not import it until Task 6 wires it in, so these tests exercise the existing auth flow. If one fails, the cause is in the fixtures or the existing routes, not in the middleware.

(The tests explicitly verify the absence of CF-induced behavior, so they're a safety net against breaking existing flows once middleware is wired up.)

  • Step 3: Create app/auth/middleware.py with a pass-through middleware

Create app/auth/middleware.py:

"""Starlette middleware that transparently exchanges a verified Cloudflare Access
JWT for our standard `access_token` session cookie.

Runs before route handlers. On every request:

1. If the CF provider is not configured, pass through untouched.
2. If the request carries an `Authorization: Bearer` header (API/CLI/PAT
   client), pass through — those clients don't need a cookie, and setting
   one could leak into subsequent requests from shared clients.
3. If the request already has an `access_token` cookie, pass through
   (don't overwrite an active session — user may have logged in manually).
4. If a `Cf-Access-Jwt-Assertion` header is present and verifies, provision
   the user, mint our JWT, set the cookie, continue.
5. On any verification failure, pass through — the route handler will
   apply its normal auth logic (cookie/Bearer/redirect).

Never returns 401 from the middleware itself — that would break password/Google
login flows on deployments that enable CF as *one of several* auth methods.
"""

import logging

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

from app.auth.providers import cloudflare as cf

logger = logging.getLogger(__name__)

CF_HEADER = "Cf-Access-Jwt-Assertion"
COOKIE_NAME = "access_token"
COOKIE_MAX_AGE = 86400  # 24h — matches ACCESS_TOKEN_EXPIRE_HOURS in app/auth/jwt.py


class CloudflareAccessMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next) -> Response:
        if not cf.is_available():
            return await call_next(request)
        # Bearer clients (PATs, API scripts) manage their own auth — don't set a cookie on them.
        auth_header = request.headers.get("authorization", "")
        if auth_header.lower().startswith("bearer "):
            return await call_next(request)
        if request.cookies.get(COOKIE_NAME):
            return await call_next(request)
        token = request.headers.get(CF_HEADER)
        if not token:
            return await call_next(request)

        claims = cf.verify_cf_jwt(token)
        if claims is None:
            return await call_next(request)

        # Import inside dispatch to avoid circular imports at module load time
        from src.db import get_system_db
        from app.auth.jwt import create_access_token

        email = claims.get("email", "")
        name = claims.get("name", "")
        conn = get_system_db()
        try:
            user = cf.get_or_create_user_from_cf(email=email, name=name, conn=conn)
        finally:
            conn.close()

        if user is None:
            # Email outside allowlist or deactivated — pass through so the
            # normal 401 → /login redirect tells the user why.
            return await call_next(request)

        app_jwt = create_access_token(
            user_id=user["id"],
            email=user["email"],
            role=user["role"],
        )

        # Stash the identity on request.state *before* calling downstream:
        # anything set after call_next() returns is invisible to the handlers
        # of this request. (Not strictly needed, since the next request will
        # use the cookie, but it lets this-request handlers see the identity.)
        request.state.cf_user = user

        response = await call_next(request)
        import os
        # Secure cookies everywhere except under TESTING=1 (plain-HTTP test client).
        use_secure = os.environ.get("TESTING", "").lower() not in ("1", "true")
        response.set_cookie(
            key=COOKIE_NAME,
            value=app_jwt,
            httponly=True,
            max_age=COOKIE_MAX_AGE,
            samesite="lax",
            secure=use_secure,
        )
        return response
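
The early-exit order in `dispatch` can be read as a small decision function. The sketch below is a stdlib-only model for reasoning about those checks, not code from the plan: `should_attempt_cf_login` is a hypothetical helper, and the header name is assumed from the Task 6 tests (`CF_HEADER` itself is defined earlier in the module).

```python
from typing import Mapping

# Assumption: mirrors the header name used in the Task 6 tests.
CF_HEADER = "cf-access-jwt-assertion"
COOKIE_NAME = "access_token"


def should_attempt_cf_login(
    cf_available: bool,
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
) -> bool:
    """Return True only when the middleware would go on to verify the CF JWT.

    `headers` must use lowercase keys here: Starlette's header lookup is
    case-insensitive, but a plain dict is not.
    """
    if not cf_available:
        return False  # provider not configured: the header is ignored
    if headers.get("authorization", "").lower().startswith("bearer "):
        return False  # PAT / API clients manage their own auth
    if cookies.get(COOKIE_NAME):
        return False  # an existing session always wins
    return bool(headers.get(CF_HEADER))
```

Every `False` branch corresponds to a plain `return await call_next(request)` in the middleware, which is why the middleware can never produce an error response of its own.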
  • Step 4: Run tests to verify they still pass (middleware not yet wired up — baseline behavior unchanged)

Run: pytest tests/test_cloudflare_auth.py::TestMiddlewarePassthrough -v Expected: 3 passed.

  • Step 5: Commit
git add app/auth/middleware.py tests/test_cloudflare_auth.py
git commit -m "feat(auth): Cloudflare Access middleware skeleton (not yet wired)"

Task 6: Wire the middleware into create_app()

Files:

  • Modify: app/main.py:137

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write the failing integration test (CF header → authenticated)

Append to tests/test_cloudflare_auth.py:

class TestMiddlewareAutoLogin:
    def test_valid_cf_header_auto_logs_in(self, cf_client, make_cf_jwt):
        """Valid CF JWT on /dashboard request → middleware sets cookie → 200 (not 302)."""
        token = make_cf_jwt(email="alice@example.com", name="Alice")
        resp = cf_client.get(
            "/dashboard",
            headers={"Cf-Access-Jwt-Assertion": token},
            follow_redirects=False,
        )
        # Middleware provisioned Alice + set cookie → dashboard renders
        assert resp.status_code == 200, (
            f"Expected 200, got {resp.status_code}: {resp.text[:300]}"
        )
        # Cookie was set on the response
        assert "access_token" in resp.cookies
        # User now exists
        from src.db import get_system_db
        from src.repositories.users import UserRepository
        conn = get_system_db()
        try:
            user = UserRepository(conn).get_by_email("alice@example.com")
            assert user is not None
            assert user["role"] == "analyst"
        finally:
            conn.close()

    def test_bearer_pat_passes_through_without_cookie(self, cf_client, make_cf_jwt):
        """A Bearer-authenticated client (PAT/API) must NOT get a cookie set,
        even if a CF header is also present."""
        from app.auth.jwt import create_access_token
        from src.db import get_system_db
        from src.repositories.users import UserRepository
        import uuid as _uuid

        conn = get_system_db()
        try:
            uid = str(_uuid.uuid4())
            UserRepository(conn).create(
                id=uid, email="pat@example.com", name="PAT User", role="analyst",
            )
            bearer = create_access_token(uid, "pat@example.com", "analyst")
        finally:
            conn.close()

        cf_token = make_cf_jwt(email="spoofed@example.com")
        resp = cf_client.get(
            "/dashboard",
            headers={
                "Authorization": f"Bearer {bearer}",
                "Cf-Access-Jwt-Assertion": cf_token,
            },
            follow_redirects=False,
        )
        # Bearer auth succeeds, middleware skipped → no cookie leaked
        assert resp.status_code == 200
        assert "access_token" not in resp.cookies
        # Spoofed email must not have been provisioned
        from src.db import get_system_db as _gdb
        conn2 = _gdb()
        try:
            spoofed = UserRepository(conn2).get_by_email("spoofed@example.com")
            assert spoofed is None
        finally:
            conn2.close()

    def test_existing_cookie_wins_over_cf_header(self, cf_client, make_cf_jwt):
        """If the user already has an access_token cookie, middleware must not overwrite it."""
        from app.auth.jwt import create_access_token
        from src.db import get_system_db
        from src.repositories.users import UserRepository
        import uuid as _uuid

        conn = get_system_db()
        try:
            uid = str(_uuid.uuid4())
            UserRepository(conn).create(
                id=uid, email="bob@example.com", name="Bob", role="admin",
            )
            existing_token = create_access_token(uid, "bob@example.com", "admin")
        finally:
            conn.close()

        cf_client.cookies.set("access_token", existing_token)
        cf_token = make_cf_jwt(email="carol@example.com", name="Carol")
        resp = cf_client.get(
            "/dashboard",
            headers={"Cf-Access-Jwt-Assertion": cf_token},
            follow_redirects=False,
        )
        assert resp.status_code == 200
        # Carol must NOT have been provisioned — existing cookie session wins
        from src.db import get_system_db as _gdb
        conn2 = _gdb()
        try:
            carol = UserRepository(conn2).get_by_email("carol@example.com")
            assert carol is None
        finally:
            conn2.close()
  • Step 2: Run tests to verify they fail

Run: pytest tests/test_cloudflare_auth.py::TestMiddlewareAutoLogin -v Expected: FAIL — test_valid_cf_header_auto_logs_in gets 302 (login redirect) because middleware isn't registered yet.

  • Step 3: Register the middleware in app/main.py

In app/main.py, locate the SessionMiddleware setup (lines 58-61) and the CORS block that follows it (ending at line 71). Insert the CF middleware registration immediately after the CORS block, between line 71 and line 73.

Edit app/main.py — add this block right before the # Load .env_overlay comment on line 73:

    # Cloudflare Access middleware — runs before route handlers to exchange
    # a verified CF edge JWT for our session cookie. Inert unless
    # CF_ACCESS_TEAM + CF_ACCESS_AUD are both set.
    from app.auth.middleware import CloudflareAccessMiddleware
    app.add_middleware(CloudflareAccessMiddleware)

(Starlette middleware runs in LIFO order; adding CF last means it runs first on inbound requests — exactly what we want.)
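
The ordering claim can be checked with a toy model. The sketch below does not use Starlette; `build_stack` and the middleware names are hypothetical, and it assumes only the behaviour stated above, namely that each newly added middleware wraps everything added before it.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def make_middleware(name: str) -> Callable[[Handler], Handler]:
    """Return a wrapper factory that records when `name` sees the request."""
    def wrap(inner: Handler) -> Handler:
        def handler(trace: List[str]) -> None:
            trace.append(f"{name}:in")   # inbound, before the inner app
            inner(trace)
            trace.append(f"{name}:out")  # outbound, after the inner app
        return handler
    return wrap


def build_stack(added_in_order: List[str]) -> Handler:
    """Wrap a route handler so the last-added middleware ends up outermost."""
    def route(trace: List[str]) -> None:
        trace.append("route")

    app: Handler = route
    for name in added_in_order:  # each addition wraps what exists so far
        app = make_middleware(name)(app)
    return app


trace: List[str] = []
build_stack(["session", "cors", "cloudflare"])(trace)
print(trace[0])  # cloudflare:in
```

Because the CF middleware is added last, it is the first to see the inbound request and the last to touch the outbound response, which is where the cookie gets set.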

  • Step 4: Run tests to verify they pass

Run: pytest tests/test_cloudflare_auth.py::TestMiddlewareAutoLogin tests/test_cloudflare_auth.py::TestMiddlewarePassthrough -v Expected: 6 passed (3 auto-login + 3 passthrough).

  • Step 5: Run the full auth test suite to verify no regressions

Run: pytest tests/test_auth_providers.py tests/test_journey_bootstrap_auth.py tests/test_cloudflare_auth.py -v Expected: all pass (no existing auth test should break — CF is inert without env vars, and the non-CF client fixtures don't set them).

  • Step 6: Commit
git add app/main.py tests/test_cloudflare_auth.py
git commit -m "feat(auth): wire Cloudflare Access middleware into FastAPI app"

Task 7: Login page hint when CF is available

Files:

  • Modify: app/web/router.py:194-232

  • Test: tests/test_cloudflare_auth.py

  • Step 1: Write the failing login-page test

Append to tests/test_cloudflare_auth.py:

class TestLoginPageCfHint:
    def test_login_page_shows_cf_hint_when_available(self, cf_client):
        """When CF provider is available, login page shows an informational hint."""
        resp = cf_client.get("/login")
        assert resp.status_code == 200
        assert "Cloudflare Access" in resp.text

    def test_login_page_no_cf_hint_when_unavailable(self, no_cf_client):
        """Without CF env, no hint on login page."""
        resp = no_cf_client.get("/login")
        assert resp.status_code == 200
        assert "Cloudflare Access" not in resp.text
  • Step 2: Run tests to verify they fail

Run: pytest tests/test_cloudflare_auth.py::TestLoginPageCfHint -v Expected: test_login_page_shows_cf_hint_when_available FAILs (hint not yet rendered).

  • Step 3: Add CF hint to the login page context

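In app/web/router.py, locate the login_page handler (starts around line 194). Replace the whole function, decorator included, with the version below. It relies on quote from urllib.parse, so make sure that import is present at the top of the module: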

@router.get("/login", response_class=HTMLResponse)
async def login_page(request: Request):
    next_path = request.query_params.get("next", "")
    if not next_path.startswith("/") or next_path.startswith("//"):
        next_path = ""

    providers = []
    try:
        from app.auth.providers.google import is_available as google_available
        if google_available():
            providers.append({"name": "google", "display_name": "Google", "icon": "google"})
    except Exception:
        pass
    providers.append({"name": "password", "display_name": "Email & Password", "icon": "key"})
    try:
        from app.auth.providers.email import is_available as email_available
        if email_available():
            providers.append({"name": "email", "display_name": "Email Link", "icon": "mail"})
    except Exception:
        pass

    # Convert to login_buttons format expected by template
    login_buttons = []
    for p in providers:
        if p["name"] == "google":
            login_buttons.append({"url": "/auth/google/login", "text": "Sign in with Google", "css_class": "btn-primary", "icon_html": ""})
        elif p["name"] == "password":
            _url = "/login/password"
            if next_path:
                _url += f"?next={quote(next_path, safe='')}"
            login_buttons.append({"url": _url, "text": "Sign in with Email & Password", "css_class": "btn-secondary", "icon_html": ""})
        elif p["name"] == "email":
            _url = "/login/email"
            if next_path:
                _url += f"?next={quote(next_path, safe='')}"
            login_buttons.append({"url": _url, "text": "Sign in with Email Link", "css_class": "btn-secondary", "icon_html": ""})

    cf_available = False
    try:
        from app.auth.providers.cloudflare import is_available as cf_is_available
        cf_available = cf_is_available()
    except Exception:
        pass

    ctx = _build_context(
        request, providers=providers, login_buttons=login_buttons,
        next_path=next_path, cf_available=cf_available,
    )
    return templates.TemplateResponse(request, "login.html", ctx)
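
The `next` handling in the handler above can be isolated into two small functions, which makes the open-redirect guard easier to test on its own. This is a sketch only: `sanitize_next` and `with_next` are hypothetical names, not helpers that exist in app/web/router.py, and the guard covers exactly the two cases the handler checks (non-path values and protocol-relative `//` URLs).

```python
from urllib.parse import quote


def sanitize_next(raw: str) -> str:
    """Keep only same-site absolute paths; anything else becomes ''."""
    if not raw.startswith("/") or raw.startswith("//"):
        return ""
    return raw


def with_next(base_url: str, next_path: str) -> str:
    """Append ?next=... only when there is a sanitized path to carry."""
    if not next_path:
        return base_url
    # safe='' percent-encodes '/', '?' and '=' so the value stays one parameter
    return f"{base_url}?next={quote(next_path, safe='')}"


print(with_next("/login/password", sanitize_next("/dashboard?tab=1")))
# /login/password?next=%2Fdashboard%3Ftab%3D1
print(with_next("/login/password", sanitize_next("//evil.example")))
# /login/password
```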
  • Step 4: Add the hint to the login template

In app/web/templates/login.html, locate the {% if not login_buttons %} block (around line 117) and insert the CF hint just before it:

                {% if cf_available %}
                <p class="login-note" style="margin-top: 16px; font-size: 12px; opacity: 0.8;">
                    This deployment is protected by Cloudflare Access. If you expected to be
                    signed in automatically, please access via your configured Cloudflare URL.
                </p>
                {% endif %}

                {% if not login_buttons %}
  • Step 5: Run tests to verify they pass

Run: pytest tests/test_cloudflare_auth.py::TestLoginPageCfHint -v Expected: 2 passed.

  • Step 6: Commit
git add app/web/router.py app/web/templates/login.html tests/test_cloudflare_auth.py
git commit -m "feat(auth): show Cloudflare Access hint on login page when enabled"

Task 8: Documentation

Files:

  • Create: docs/auth-cloudflare.md

  • Modify: README.md (add one-line pointer)

  • Step 1: Write the ops documentation

Create docs/auth-cloudflare.md:

# Cloudflare Access Authentication

Agnes can be deployed behind a Cloudflare Zero Trust tunnel with Access
protecting it as an SSO gate. When configured, users who pass CF's
identity check are automatically signed into Agnes — no second login.

This works **alongside** the built-in password and Google OAuth flows:
direct connections (e.g. local dev, CLI with PAT) still use those. Only
the CF-gated path auto-logs-in.

## Prerequisites

- A Cloudflare Zero Trust team (Free tier works for up to 50 users)
- A domain routed to Agnes via Cloudflare Tunnel (`cloudflared`) or CF proxy
- An Access Application configured in front of that domain

## Configure the Access Application

1. In the Cloudflare Zero Trust dashboard → **Access** → **Applications** → **Add an application** → **Self-hosted**
2. Application domain: the hostname routed to your Agnes instance
   (e.g. `agnes.yourco.com`)
3. Identity providers: enable your IdP (Google Workspace, Okta, etc.)
4. Policies: add at least one Allow policy (e.g. email ending in `@yourco.com`)
5. After creation, open the app → **Overview** tab and copy the **Application
   Audience (AUD) Tag**

## Configure Agnes

Set two environment variables in your deployment (`.env` or Secret Manager):

```bash
CF_ACCESS_TEAM=yourteam          # from https://yourteam.cloudflareaccess.com
CF_ACCESS_AUD=abc123...          # AUD Tag from the Application → Overview page
```

Optionally restrict which email domains can auto-provision:

```bash
CF_ACCESS_DOMAIN_ALLOW=yourco.com,partner.com
```

If unset, falls back to `allowed_domains` in `config/instance.yaml` (same
allowlist used by the Google OAuth provider).

Restart Agnes. That's it — requests arriving with a valid
`Cf-Access-Jwt-Assertion` header will auto-provision a new `analyst` user
and issue a session cookie.

## Security Model

- **Both env vars required**: if either `CF_ACCESS_TEAM` or `CF_ACCESS_AUD`
  is unset, the middleware is completely inert and the header is ignored.
  This prevents header spoofing on deployments that don't actually sit
  behind Cloudflare.
- **JWT verification**: signature checked against the team's JWKS
  (`https://<team>.cloudflareaccess.com/cdn-cgi/access/certs`, cached 5 min);
  `aud` and `iss` both validated; expired tokens rejected.
- **Never overwrites an existing session**: if the user already has an
  `access_token` cookie, the middleware passes through — you can always
  sign in explicitly with password/Google on a CF-protected deployment.
- **Never 401s from middleware**: if verification fails for any reason, the
  request continues to the normal auth layer — users see the normal login
  page rather than a confusing middleware error.
- **PAT/API (Bearer) clients are skipped**: requests carrying an
  `Authorization: Bearer <token>` header bypass the middleware entirely —
  no cookie is set. This preserves the clean stateless contract for
  CLI tools, CI, and scripts.

## Logout Semantics

Clicking "log out" in Agnes clears the local `access_token` cookie.
**However, if the user is still behind Cloudflare Access**, the next
request will carry a fresh `Cf-Access-Jwt-Assertion` header and the
middleware will immediately re-issue a session cookie — logout appears
to have no effect.

To fully sign out on a CF-gated deployment, the user must also sign out
of their Cloudflare Access session by visiting:

```
https://<your-agnes-domain>/cdn-cgi/access/logout
```

Consider linking to this URL from Agnes's logout UI on CF-gated
deployments, or document it in your internal user guide.

## Troubleshooting

**Auto-login doesn't happen:**
- Check `CF_ACCESS_TEAM` matches the exact subdomain (no protocol, no path):
  `keboola`, not `https://keboola.cloudflareaccess.com`
- Check `CF_ACCESS_AUD` is the **Application AUD Tag**, not the Access
  Team ID
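- Verify the request reaching Agnes actually carries the header. Cloudflare
  adds `Cf-Access-Jwt-Assertion` on the hop from its edge to your origin, so
  it does not appear in the response headers printed by
  `curl -I https://agnes.yourco.com/dashboard`. Check at the origin instead
  (for example by temporarily logging request headers), and use
  `cloudflared access curl` or a signed-in browser to make the request.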
- Check Agnes logs for `CF Access JWT invalid: ...` or
  `CF Access JWT verification error: ...`

**"User deactivated" redirect:**
- Someone deactivated this user in Agnes's admin panel. CF Access passes
  identity, but Agnes enforces the `active` flag.

**New users arrive with `analyst` role — how do I get admin access?**
- Same as Google OAuth: bootstrap the first admin manually
  (`POST /auth/bootstrap`) or have an existing admin promote via the web UI.
  • Step 2: Add a pointer to README.md

In README.md, locate the "Documentation" section (around line 134) and add one line to the list:

- [Cloudflare Access Auth](docs/auth-cloudflare.md) — SSO via Cloudflare Zero Trust tunnel

Place it between the Onboarding Guide and Deployment Guide entries.

  • Step 3: Commit
git add docs/auth-cloudflare.md README.md
git commit -m "docs(auth): Cloudflare Access setup + troubleshooting guide"

Task 9: Final integration — full test sweep + manual smoke

Files:

  • None (verification only)

  • Step 1: Run the full test suite

Run: pytest tests/ -v --timeout=60 Expected: all tests pass (633+ existing + ~20 new CF tests).

  • Step 2: Start the app locally without CF env and verify unchanged behavior

Run:

unset CF_ACCESS_TEAM CF_ACCESS_AUD
DATA_DIR=./tmp-data SEED_ADMIN_EMAIL=admin@local.test JWT_SECRET_KEY=$(python3 -c 'import secrets;print(secrets.token_urlsafe(32))') uvicorn app.main:app --port 8001 &
sleep 2
curl -s -I http://localhost:8001/login | head -1
curl -s -I http://localhost:8001/dashboard | head -3  # should 302 → /login
kill %1

Expected: HTTP/1.1 200 OK for /login, HTTP/1.1 302 for /dashboard with location: /login?....

  • Step 3: Start the app with CF env and verify CF hint on login page

Run:

DATA_DIR=./tmp-data2 SEED_ADMIN_EMAIL=admin@local.test JWT_SECRET_KEY=$(python3 -c 'import secrets;print(secrets.token_urlsafe(32))') CF_ACCESS_TEAM=example CF_ACCESS_AUD=test-aud uvicorn app.main:app --port 8002 &
sleep 2
curl -s http://localhost:8002/login | grep -c "Cloudflare Access"
kill %1

Expected: 1 (hint present).

  • Step 4: Verify middleware is inert without env even when header is spoofed

Run (start a fresh no-CF server, configured like the one in Step 2, which has already been killed):

unset CF_ACCESS_TEAM CF_ACCESS_AUD
DATA_DIR=./tmp-data3 SEED_ADMIN_EMAIL=admin@local.test JWT_SECRET_KEY=$(python3 -c 'import secrets;print(secrets.token_urlsafe(32))') uvicorn app.main:app --port 8003 &
sleep 2
# Forge a header — should be ignored
curl -s -o /dev/null -w "%{http_code}\n" -H "Cf-Access-Jwt-Assertion: eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhdHRhY2tlciJ9.fake" http://localhost:8003/dashboard
kill %1

Expected: 302 (redirect to login — header ignored because CF_ACCESS_* env is unset).

  • Step 5: Clean up tmp dirs
rm -rf tmp-data tmp-data2 tmp-data3
  • Step 6: Final commit and branch status
git log --oneline -10
git status  # should be clean

Self-Review Notes

Spec coverage:

  • Three auth methods coexist (Task 6 verifies existing providers unchanged; Task 7 verifies login page works with/without CF)
  • CF header verification with JWKS + aud + iss + exp (Task 3)
  • User provisioning with domain allowlist (Task 4)
  • Middleware is pass-through on failure (Task 5)
  • Existing cookie wins over CF header (Task 6, test_existing_cookie_wins_over_cf_header)
  • Header spoofing prevented when env unset (Task 5 test_middleware_unavailable_when_env_missing + Task 9 Step 4)
  • Docs cover setup + security model + troubleshooting (Task 8)
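
As a reading aid for the allowlist item above, here is a minimal sketch of how a comma-separated `CF_ACCESS_DOMAIN_ALLOW` value with a fallback could be evaluated. It is not the Task 4 implementation: `parse_allowlist` and `email_allowed` are hypothetical names, the fallback set stands in for `allowed_domains` from `config/instance.yaml`, and rejecting everyone when the resulting allowlist is empty is an assumption of this sketch rather than documented behaviour.

```python
from typing import FrozenSet, Mapping


def parse_allowlist(env: Mapping[str, str], fallback: FrozenSet[str]) -> FrozenSet[str]:
    """Read CF_ACCESS_DOMAIN_ALLOW at call time; use `fallback` when unset or empty."""
    raw = env.get("CF_ACCESS_DOMAIN_ALLOW", "")
    domains = frozenset(d.strip().lower() for d in raw.split(",") if d.strip())
    return domains or fallback


def email_allowed(email: object, allowed: FrozenSet[str]) -> bool:
    """Type-safe check: non-string or malformed emails are never allowed."""
    if not isinstance(email, str) or email.count("@") != 1:
        return False
    domain = email.rsplit("@", 1)[1].strip().lower()
    # Assumption of this sketch: an empty allowlist rejects everyone.
    return domain in allowed


allowed = parse_allowlist(
    {"CF_ACCESS_DOMAIN_ALLOW": "yourco.com, Partner.com"},
    fallback=frozenset(),
)
print(email_allowed("alice@partner.com", allowed))  # True
print(email_allowed(None, allowed))                 # False
```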

Non-goals (explicit):

  • No group/role mapping from CF claims — new users always get analyst, admins promote manually (same as Google)
  • No CLI/PAT integration with CF — PATs remain for programmatic access
  • No UI to configure CF from the admin panel — env-only

Risks / edge cases:

  • JWKS network fetch fails → verify_cf_jwt returns None → pass-through. Users see login page. Acceptable.
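  • Clock skew → PyJWT applies no leeway by default, so a token is rejected as expired as soon as the server clock passes its exp, even when the server clock is merely running fast. Acceptable: CF tokens typically last 24h, so modest skew only matters in the final moments of a token's life.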
  • TESTING=1 disables cookie secure=True (see middleware use_secure logic) — matches existing pattern in google.py:96.
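
To make the clock-skew risk concrete, the sketch below models an `exp` check with and without leeway. It is a stdlib-only illustration, not PyJWT code: `is_expired` is a hypothetical function, and the exact boundary comparison is an assumption. In PyJWT itself, tolerance would be configured through the `leeway` argument of `jwt.decode`, which this plan deliberately leaves at its default.

```python
def is_expired(exp: int, now: int, leeway: int = 0) -> bool:
    """Model of an `exp` check: expired once (now - leeway) reaches exp."""
    return exp <= now - leeway


# Token expires at t=1000; the server clock reads t=1030 (30s fast or 30s late).
print(is_expired(exp=1000, now=1030))             # True
print(is_expired(exp=1000, now=1030, leeway=60))  # False
```

With zero leeway, a server clock running 30 seconds fast rejects a token 30 seconds before Cloudflare considers it expired, at which point the request falls through to the login page.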

Review-driven refinements applied (pre-execution):

  • Env read at call time, not import time. CF_ACCESS_TEAM / CF_ACCESS_AUD are read via helper functions on each call (Task 3), making tests and runtime env changes predictable. _JWKS_CLIENT cache keyed by team string so it rebuilds when the team env changes.
  • Autouse fixture _reset_cf_jwks_cache resets the module-level JWKS client + team cache between tests, eliminating cross-test leakage (Task 1 Step 2).
  • PAT Bearer pass-through. Middleware skips requests carrying Authorization: Bearer so CLI / PAT clients don't have cookies silently set on them (Task 5 Step 3). Test test_bearer_pat_passes_through_without_cookie verifies (Task 6 Step 1).
  • Email type safety. get_or_create_user_from_cf guards not isinstance(email, str) (Task 4 Step 3).
  • Logout semantics documented. docs/auth-cloudflare.md has a dedicated "Logout Semantics" section explaining the CF-session logout URL required for full sign-out on CF-gated deployments (Task 8 Step 1).
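
The first two refinements (env read at call time, cache keyed by team) follow a pattern that can be sketched without any JWT library. Everything named here is illustrative: `FakeJwksClient`, `get_jwks_client`, and `reset_cache` are hypothetical stand-ins, not the module's real `_JWKS_CLIENT` code; only the `CF_ACCESS_TEAM` variable and the certs URL shape come from this plan.

```python
import os
from typing import Optional, Tuple


class FakeJwksClient:
    """Stand-in for a real JWKS client; it only remembers its URL."""

    def __init__(self, url: str) -> None:
        self.url = url


# Cache entry is (team, client) so a changed team invalidates the client.
_cached: Optional[Tuple[str, FakeJwksClient]] = None


def _team() -> str:
    # Read on every call, never at import time.
    return os.environ.get("CF_ACCESS_TEAM", "")


def get_jwks_client() -> Optional[FakeJwksClient]:
    """Return a client for the current team, rebuilding it if the team changed."""
    global _cached
    team = _team()
    if not team:
        return None
    if _cached is None or _cached[0] != team:
        url = f"https://{team}.cloudflareaccess.com/cdn-cgi/access/certs"
        _cached = (team, FakeJwksClient(url))
    return _cached[1]


def reset_cache() -> None:
    """What an autouse test fixture would call between tests."""
    global _cached
    _cached = None
```

Keying the cache by team means a test that sets a different `CF_ACCESS_TEAM` never sees a client built for a previous value, and the reset hook removes the remaining cross-test state.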