agnes-the-ai-analyst/docs/auth-google-oauth.md
ZdenekSrotyr 1381770057
fix(auth): uvicorn --proxy-headers + Google OAuth doc + vendor-agnostic OSS rule in CLAUDE.md (#39)
* fix(compose): pass --proxy-headers to uvicorn so OAuth callbacks resolve to https

When the app runs behind a reverse proxy (Caddy, nginx, Cloudflare Tunnel),
uvicorn's default policy of trusting X-Forwarded-* only from 127.0.0.1 means
the request the container sees still looks like http://localhost:8000/...,
even when the user is on https://. The OAuth provider then sends Google a
callback URL Google has never seen — Error 400: redirect_uri_mismatch.

--proxy-headers + --forwarded-allow-ips '*' tell uvicorn to honor those
headers from any source. The container only ever sees its own docker network
anyway; trusting it everywhere is safe in this deployment shape.

Adds docs/auth-google-oauth.md with the full operator gotcha list — env
vars that have to be set, instance.yaml fields that silently fall back to
defaults, and the DB workaround for ad-hoc role promotion when
SEED_ADMIN_EMAIL was missed on first boot.

* docs(claude): codify vendor-agnostic OSS rule for AI agents and humans

Adds a "Vendor-agnostic OSS" section to CLAUDE.md spelling out what cannot
land in this repo (specific deployments, internal hostnames/projects, cross-
references to private repos, customer-specific paths) and how to phrase
abstractions instead. Plus a pre-PR grep checklist in the existing "Git
Commits & Pull Requests" section.

This trips up agents and humans alike — the previous version of #39 had
private-deployment references in the body and a customer domain in a doc
example. Surfacing the rule once in the file every Claude/Cursor/Aider
session reads should prevent that on the next PR.

* docs(oauth): cover DOMAIN + SERVER_URL env vars introduced by PR #48

PR #48 (merged) added DOMAIN-gated Secure cookie in google.py and
documented SERVER_URL in .env.template, but this operator doc was
drafted before that merge and didn't reference either variable.
Adding both to the env table and extending the common-failure-modes
table with a sticky-cookie / redirect-URI-mismatch entry that
references SERVER_URL as the host-header-independent fix. Also
aligns the compose command snippet with the `='*'` syntax that
actually ships on main post-PR #48.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vojtech Rysanek <vrysanek@groupon.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:07:33 +00:00


Google OAuth — operator gotchas

The Google OAuth provider (app/auth/providers/google.py) reads GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET straight from environment variables. If either is empty, is_available() returns False and the login page falls back to email / password auth without complaint.
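The availability check described above can be pictured with a minimal sketch. This is not the code in app/auth/providers/google.py; the function name `google_is_available`, the optional `env` parameter, and the whitespace stripping are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def google_is_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when both Google OAuth credentials are non-empty.

    Hypothetical stand-in for the provider's is_available(); the real
    implementation may differ (for example, it may not strip whitespace).
    """
    env = os.environ if env is None else env
    client_id = env.get("GOOGLE_CLIENT_ID", "").strip()
    client_secret = env.get("GOOGLE_CLIENT_SECRET", "").strip()
    return bool(client_id and client_secret)
```

Running a check like this inside the container is a quick way to tell "Google login is hidden" apart from "Google login is broken", since the fallback to email / password raises no error.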

Env vars

| Var | Required for Google | Notes |
| --- | --- | --- |
| `GOOGLE_CLIENT_ID` | yes | From Google Cloud Console OAuth 2.0 Client ID (Web application). |
| `GOOGLE_CLIENT_SECRET` | yes | From the same client. Rotate via "Reset secret" on the client; old value is invalidated immediately. |
| `SESSION_SECRET` | yes | Used by Starlette SessionMiddleware to stash OAuth state/nonce between `/auth/google/login` and `/auth/google/callback`. Auto-generated to `data/state/.session_secret` if unset, but for multi-replica or VM-rebuild scenarios pin it explicitly. |
| `JWT_SECRET_KEY` | yes | Signs the access-token cookie. Same auto-generate-and-persist pattern as `SESSION_SECRET`. |
| `FORWARDED_ALLOW_IPS` | only when behind a reverse proxy | Default `127.0.0.1`: uvicorn ignores X-Forwarded-Proto/Host from any other client IP, which means callbacks come back as `http://localhost:8000/...` instead of `https://your-host/...`. Set to `*` (or the proxy's IP) when terminating TLS at Caddy / nginx / Cloudflare Tunnel. The compose `command:` already passes `--proxy-headers --forwarded-allow-ips='*'`; this env var is the override. |
| `DOMAIN` | recommended behind TLS | Public hostname (`data.example.com`). Gates the Secure flag on the access-token cookie in `google_callback()`: when set, the cookie is only sent over HTTPS; when empty, the cookie works over plain HTTP so local dev is unbroken. Also consumed by the Caddy profile. |
| `SERVER_URL` | optional | Absolute base URL (`https://data.example.com`) used to build OAuth callback URLs and other external links. Set it when you don't trust the incoming Host header (e.g. a misconfigured proxy), so the callback URL is deterministic regardless of what the reverse proxy forwards. Must match the redirect URI registered on the Google OAuth client. |
| `SEED_ADMIN_EMAIL` | recommended on first boot | App startup (`app/main.py`) creates this user with `role="admin"` if missing. Combined with Google OAuth, the first time the matching email signs in, `repo.get_by_email()` finds the seeded record and the user lands as admin. |
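To make the interplay of FORWARDED_ALLOW_IPS and SERVER_URL concrete, here is a deliberately simplified model of how a callback URL could be derived. It is a sketch under assumptions, not uvicorn's or the app's code: `resolve_callback_url`, its parameters, and the client IP `172.18.0.5` are hypothetical, and only the scheme (not the host) is taken from the forwarded headers.

```python
from typing import Mapping, Optional

CALLBACK_PATH = "/auth/google/callback"


def resolve_callback_url(
    headers: Mapping[str, str],
    client_ip: str,
    forwarded_allow_ips: str = "127.0.0.1",
    server_url: Optional[str] = None,
    bind_scheme: str = "http",
) -> str:
    """Simplified model of callback URL resolution (illustrative only)."""
    # A pinned SERVER_URL wins: no header is consulted at all.
    if server_url:
        return server_url.rstrip("/") + CALLBACK_PATH

    # X-Forwarded-Proto is honored only when the peer is a trusted proxy.
    allowed = {ip.strip() for ip in forwarded_allow_ips.split(",")}
    trusted = "*" in allowed or client_ip in allowed

    scheme = bind_scheme
    if trusted:
        scheme = headers.get("x-forwarded-proto", scheme)

    host = headers.get("host", "localhost:8000")
    return f"{scheme}://{host}{CALLBACK_PATH}"


proxy_headers = {"host": "data.example.com", "x-forwarded-proto": "https"}

# Proxy IP not trusted: the scheme stays http, so Google sees an unregistered URI.
untrusted = resolve_callback_url(proxy_headers, "172.18.0.5")

# Proxy trusted via '*': the https scheme from the proxy is honored.
trusted = resolve_callback_url(proxy_headers, "172.18.0.5", forwarded_allow_ips="*")
```

In this model the untrusted case yields an `http://` URL and the trusted case an `https://` one, which is the difference between `redirect_uri_mismatch` and a working login.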

instance.yaml requirements that affect auth

config/loader.py:_validate_config requires:

  • instance.name
  • auth.allowed_domain (CSV — e.g. "example.com, partner.org"; empty allows any verified Google account)
  • auth.webapp_secret_key (typically "${SESSION_SECRET}")
  • server.host
  • server.hostname

If any are missing, app/instance_config.py catches the ValueError, logs Could not load instance.yaml: ... Using defaults, and the app keeps running with empty instance config. That means get_allowed_domains() returns [] and every verified Google account is allowed. Always grep your runtime log for Could not load instance.yaml after a config change — silent fallback is by design (resilience over strictness) but easy to miss.
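The consequence of the silent fallback is easiest to see in a sketch of the domain check. The helpers below are hypothetical (`parse_allowed_domains` and `is_email_allowed` are not names from the codebase), and the lowercasing is an assumption; the point is only that an empty list admits every account.

```python
from typing import List


def parse_allowed_domains(raw: str) -> List[str]:
    """Split a CSV such as "example.com, partner.org" into clean domains."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def is_email_allowed(email: str, allowed_domains: List[str]) -> bool:
    """Return True when the email's domain is allowed.

    An empty list means no restriction, which is also what the app ends up
    with when instance.yaml fails validation and defaults are used.
    """
    if not allowed_domains:
        return True
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in allowed_domains


configured = parse_allowed_domains("example.com, partner.org")
fallback = parse_allowed_domains("")  # what an empty instance config yields

outsider_with_config = is_email_allowed("someone@other.net", configured)
outsider_with_fallback = is_email_allowed("someone@other.net", fallback)
```

Here the outsider is rejected with the configured list but accepted under the fallback, which is why a failed instance.yaml load is worth catching in the log.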

OAuth client setup (Google Cloud Console)

  1. APIs & Services → Credentials → "Create Credentials" → "OAuth client ID" → "Web application".
  2. Authorized redirect URIs — one per public hostname:
    https://<hostname>/auth/google/callback
    
    Add http://localhost:8000/auth/google/callback for local dev.
  3. The Client ID and Client Secret go into GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET.
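For deployments with several public hostnames, the list of redirect URIs to register can be generated rather than typed by hand. A small sketch, where `redirect_uris` and the example hostnames are assumptions:

```python
from typing import Iterable, List

CALLBACK_PATH = "/auth/google/callback"


def redirect_uris(hostnames: Iterable[str], include_local_dev: bool = True) -> List[str]:
    """Build the Authorized redirect URIs to paste into the OAuth client."""
    uris = [f"https://{host}{CALLBACK_PATH}" for host in hostnames]
    if include_local_dev:
        # Local dev runs over plain HTTP on port 8000.
        uris.append(f"http://localhost:8000{CALLBACK_PATH}")
    return uris


for uri in redirect_uris(["data.example.com", "staging.example.com"]):
    print(uri)
```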

Common failure modes

| Symptom | Cause | Fix |
| --- | --- | --- |
| Error 400: redirect_uri_mismatch | Either the URI isn't registered on the OAuth client, or the app generated `http://localhost:8000/...` because `FORWARDED_ALLOW_IPS` wasn't set (or `SERVER_URL` isn't defined and the proxy's Host header is missing / wrong). | Add the URI in Console; verify `FORWARDED_ALLOW_IPS=*` reaches the container; pin `SERVER_URL=https://<host>` to bypass Host-header reliance. |
| Login works but the user keeps getting re-prompted on the next request | Access-token cookie lost between requests. Common causes: `DOMAIN` unset (so `Secure=False`) while the browser reaches the app over `https://` via a proxy and drops the cookie for another reason; or `DOMAIN` set while the browser reaches the app over `http://`. | Set `DOMAIN=<hostname>` to match the terminator's hostname, and always serve over HTTPS to the browser. |
| `/login?error=google_not_configured` | `GOOGLE_CLIENT_ID` or `GOOGLE_CLIENT_SECRET` empty in container env. | Inspect `docker compose exec app env \| grep GOOGLE`. |
| `/login?error=domain_not_allowed` | User's email domain isn't in `auth.allowed_domain`. | Add the domain (CSV) and reload. Note that `allowed_domain` only takes effect when instance.yaml validates (see above). |
| Login succeeds but `/admin/*` returns "Requires role admin or higher" | New user got `role="analyst"` (default for Google-provisioned users). The JWT in the cookie is also stale. | Set `SEED_ADMIN_EMAIL` BEFORE first login, or promote in DB and have the user log out + log back in. |
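The re-prompt failure mode comes down to two booleans: whether the cookie carries the Secure flag, and whether the browser talks HTTPS. The sketch below models that under simplifying assumptions (it ignores browser-specific localhost exceptions and other cookie attributes); `cookie_secure_flag` and `browser_keeps_cookie` are hypothetical names, not functions from google.py.

```python
def cookie_secure_flag(domain: str) -> bool:
    """Model of the DOMAIN gate: a non-empty DOMAIN turns Secure on."""
    return bool(domain.strip())


def browser_keeps_cookie(secure: bool, browser_scheme: str) -> bool:
    """Simplified browser rule: Secure cookies survive only over https."""
    return (not secure) or browser_scheme == "https"


# Every combination of DOMAIN setting and browser scheme.
for domain in ("", "data.example.com"):
    for scheme in ("http", "https"):
        secure = cookie_secure_flag(domain)
        kept = browser_keeps_cookie(secure, scheme)
        print(f"DOMAIN={domain!r:20} scheme={scheme:5} Secure={secure} kept={kept}")
```

In this model the only losing combination is DOMAIN set while the browser uses plain HTTP, which matches the second cause listed in the table.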

DB role promotion (when SEED_ADMIN_EMAIL was missed)

The system DB (/data/state/system.duckdb) is held exclusively by uvicorn (PID 1 in container), so docker compose exec app python ... can't open a second connection. Instead, stop the app, run a throwaway container against the host volume, and restart.

Adjust the install dir, the host path that the data volume maps to, and the image tag for your deployment:

cd <install-dir>                                   # wherever docker-compose.yml lives
COMPOSE='docker compose -f docker-compose.yml -f docker-compose.prod.yml -f docker-compose.host-mount.yml'
$COMPOSE stop app scheduler
docker run --rm -v <data-dir>:/data --entrypoint python ghcr.io/keboola/agnes-the-ai-analyst:${AGNES_TAG:-stable} -c "
import duckdb
c = duckdb.connect('/data/state/system.duckdb')
c.execute(\"UPDATE users SET role = 'admin' WHERE email = 'me@example.com'\")
c.close()
"
$COMPOSE up -d app scheduler

The promoted user must sign out and sign back in — JWTs carry the role at issue time and don't refresh until a new token is issued.
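Why the old cookie keeps failing can be shown with a standard-library-only token sketch. It assumes an HS256-signed token with a `role` claim; the claim names, the `issue_token` / `read_role` helpers, and the secret are illustrative assumptions, not the app's actual token format.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(email: str, role: str, secret: str) -> str:
    """Issue an HS256 token that freezes the role at issue time."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": email, "role": role}).encode())
    signature = _b64url(_sign(f"{header}.{payload}", secret))
    return f"{header}.{payload}.{signature}"


def read_role(token: str, secret: str) -> str:
    """Verify the signature and return the role stored in the token."""
    header, payload, signature = token.split(".")
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload))["role"]


users = {"me@example.com": "analyst"}  # stand-in for the users table
secret = "dev-secret"                  # stand-in for JWT_SECRET_KEY

old_token = issue_token("me@example.com", users["me@example.com"], secret)
users["me@example.com"] = "admin"      # the DB promotion

stale_role = read_role(old_token, secret)
fresh_role = read_role(issue_token("me@example.com", users["me@example.com"], secret), secret)
```

The old token still verifies and still carries the pre-promotion role; only a token issued after the update reflects the new one, hence the sign-out and sign-in.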