fix(auth): uvicorn --proxy-headers + Google OAuth doc + vendor-agnostic OSS rule in CLAUDE.md (#39)

* fix(compose): pass --proxy-headers to uvicorn so OAuth callbacks resolve to https

When the app runs behind a reverse proxy (Caddy, nginx, Cloudflare Tunnel),
uvicorn's default policy of trusting X-Forwarded-* only from 127.0.0.1 means
the request the container sees still looks like http://localhost:8000/...,
even when the user is on https://. The OAuth provider then sends Google a
callback URL Google has never seen — Error 400: redirect_uri_mismatch.

--proxy-headers + --forwarded-allow-ips '*' tell uvicorn to honor those
headers from any source. The container only ever sees its own docker network
anyway; trusting it everywhere is safe in this deployment shape.

Adds docs/auth-google-oauth.md with the full operator gotcha list — env
vars that have to be set, instance.yaml fields that silently fall back to
defaults, and the DB workaround for ad-hoc role promotion when
SEED_ADMIN_EMAIL was missed on first boot.

* docs(claude): codify vendor-agnostic OSS rule for AI agents and humans

Adds a "Vendor-agnostic OSS" section to CLAUDE.md spelling out what cannot
land in this repo (specific deployments, internal hostnames/projects, cross-
references to private repos, customer-specific paths) and how to phrase
abstractions instead. Plus a pre-PR grep checklist in the existing "Git
Commits & Pull Requests" section.

This trips up agents and humans alike — the previous version of #39 had
private-deployment references in the body and a customer domain in a doc
example. Surfacing the rule once in the file every Claude/Cursor/Aider
session reads should prevent that on the next PR.

* docs(oauth): cover DOMAIN + SERVER_URL env vars introduced by PR #48

PR #48 (merged) added DOMAIN-gated Secure cookie in google.py and
documented SERVER_URL in .env.template, but this operator doc was
drafted before that merge and didn't reference either variable.
Adding both to the env table and extending the common-failure-modes
table with a sticky-cookie / redirect-URI-mismatch entry that
references SERVER_URL as the host-header-independent fix. Also
aligns the compose command snippet with the `='*'` syntax that
actually ships on main post-PR #48.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vojtech Rysanek <vrysanek@groupon.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ZdenekSrotyr 2026-04-24 09:07:33 +02:00 committed by GitHub
parent 9e19fb5219
commit 1381770057
3 changed files with 88 additions and 0 deletions


@@ -214,7 +214,20 @@ Auth providers in `app/auth/` (FastAPI-based):
- `connectors/jira/transform.py` - Core Jira transform logic
- `services/ws_gateway/` - WebSocket notification gateway
## Vendor-agnostic OSS — no customer-specific content
This repo is the public OSS distribution. **Nothing customer-specific belongs in code, configuration defaults, comments, docs, commit messages, PR titles, or PR bodies.** That includes:
- Specific deployments or brands (private VM names, internal product brands, organization names that aren't already public sponsors).
- Cloud project IDs, internal hostnames, runbook paths from a particular install (`/opt/<deployment>`, `<host>.<internal-domain>`, `prj-<org>-…`, internal SA emails).
- Cross-references to private repos (`<private-org>/<private-repo>#NN`). Describe the integration in generic terms or link to public examples instead.
When you motivate a change, frame it abstractly ("behind a TLS-terminating reverse proxy", "in containerized deploys") rather than naming a specific operator. When you show examples, use placeholders (`example.com`, `<your-host>`, `<install-dir>`). When config has reasonable defaults pulled from one deployment's habits, generalize them or surface them as documented examples — not hard-coded assumptions.
Customer-specific automation, hostnames, and identities live in private infra repos that *consume* this OSS. The OSS describes capabilities, defaults, and configuration knobs — not how a specific operator wired them up.
## Git Commits & Pull Requests
- Keep commit messages clean and concise
- Do not include AI attribution in commits or PRs
- Before opening a PR, scan the diff and the PR body for the customer-specific tokens listed above (`grep -niE '<token1>|<token2>|...'`). If anything matches, generalize or remove it.
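
The pre-PR scan described in the last bullet can be sketched in Python as a rough equivalent of `grep -niE`. The function name `find_forbidden` and the sample token are hypothetical and not part of this repo. Unlike `grep -E`, this sketch treats each token as a literal string rather than a regular expression.

```python
import re
from typing import List, Tuple


def find_forbidden(text: str, tokens: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs that mention any token, case-insensitively."""
    if not tokens:
        return []
    # Tokens are escaped, so they match literally (an assumption of this sketch).
    pattern = re.compile("|".join(re.escape(t) for t in tokens), re.IGNORECASE)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical usage: scan a PR body for a placeholder token.
pr_body = "Fixes the callback URL.\nTested on ACME-internal staging.\n"
hits = find_forbidden(pr_body, ["acme-internal"])
```

An empty result means nothing matched. Any hit should be generalized or removed before the PR is opened.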


@@ -1,6 +1,12 @@
services:
  app:
    build: .
    # --proxy-headers + --forwarded-allow-ips make uvicorn honor the
    # X-Forwarded-Proto / X-Forwarded-Host headers any reverse proxy (Caddy,
    # nginx, Cloudflare Tunnel) sets. Without it, request.url_for() emits
    # http://localhost:8000/... even when the user is on https://, which
    # breaks OAuth callbacks (redirect_uri_mismatch). Belt-and-suspenders:
    # FORWARDED_ALLOW_IPS=* in .env does the same via env var.
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --proxy-headers --forwarded-allow-ips='*'
    ports:
      - "8000:8000"

docs/auth-google-oauth.md

@@ -0,0 +1,69 @@
# Google OAuth — operator gotchas
The Google OAuth provider (`app/auth/providers/google.py`) reads `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` straight from environment variables. If either is empty, `is_available()` returns `False` and the login page falls back to email / password auth without complaint.
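
As a minimal sketch of the availability check described above: the real `is_available()` lives in `app/auth/providers/google.py` and may differ. The function name `google_is_available` and the whitespace stripping are assumptions made for illustration.

```python
import os
from typing import Mapping


def google_is_available(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when both Google credentials are present and non-empty."""
    # Stripping whitespace is an assumption of this sketch, not confirmed behavior.
    client_id = env.get("GOOGLE_CLIENT_ID", "").strip()
    client_secret = env.get("GOOGLE_CLIENT_SECRET", "").strip()
    return bool(client_id and client_secret)
```

Passing a plain dict instead of `os.environ` makes it easy to check both the configured and the unconfigured case without touching the process environment.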
## Env vars
| Var | Required for Google | Notes |
|---|---|---|
| `GOOGLE_CLIENT_ID` | yes | From Google Cloud Console OAuth 2.0 Client ID (Web application). |
| `GOOGLE_CLIENT_SECRET` | yes | From the same client. Rotate via "Reset secret" on the client; old value is invalidated immediately. |
| `SESSION_SECRET` | yes | Used by Starlette `SessionMiddleware` to stash OAuth `state`/`nonce` between `/auth/google/login` and `/auth/google/callback`. Auto-generated to `data/state/.session_secret` if unset, but for multi-replica or VM-rebuild scenarios pin it explicitly. |
| `JWT_SECRET_KEY` | yes | Signs the access-token cookie. Same auto-generate-and-persist pattern as `SESSION_SECRET`. |
| `FORWARDED_ALLOW_IPS` | only when behind a reverse proxy | Default `127.0.0.1`: uvicorn ignores `X-Forwarded-Proto/Host` from any other client IP, which means callbacks come back as `http://localhost:8000/...` instead of `https://your-host/...`. Set to `*` (or the proxy's IP) when terminating TLS at Caddy / nginx / Cloudflare Tunnel. The compose `command:` already passes `--proxy-headers --forwarded-allow-ips='*'`; uvicorn reads this env var only when that flag is absent, so it serves as the fallback for custom commands that omit the flag. |
| `DOMAIN` | recommended behind TLS | Public hostname (`data.example.com`). Gates the `Secure` flag on the access-token cookie in `google_callback()` — when set, the cookie is only sent over HTTPS, when empty the cookie works over plain HTTP so local dev is unbroken. Also consumed by the Caddy profile. |
| `SERVER_URL` | optional | Absolute base URL (`https://data.example.com`) used to build OAuth callback URLs and other external links. Set it when you don't trust the incoming `Host` header (e.g. a misconfigured proxy), so the callback URL is deterministic regardless of what the reverse proxy forwards. Must match the redirect URI registered on the Google OAuth client. |
| `SEED_ADMIN_EMAIL` | recommended on first boot | App startup (`app/main.py`) creates this user with `role="admin"` if missing. Combined with Google OAuth, the first time the matching email signs in, `repo.get_by_email()` finds the seeded record and the user lands as admin. |
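
The interaction between `FORWARDED_ALLOW_IPS` and `SERVER_URL` can be illustrated with a simplified model. This is not uvicorn's or the app's actual code: `is_trusted_proxy` and `callback_url` are hypothetical names, only the scheme is modelled from the forwarded headers, and header keys are assumed to be lower-cased.

```python
from typing import Mapping, Optional

CALLBACK_PATH = "/auth/google/callback"


def is_trusted_proxy(client_ip: str, forwarded_allow_ips: str) -> bool:
    """True when the peer IP is in the comma-separated allow list, or the list is '*'."""
    allowed = {ip.strip() for ip in forwarded_allow_ips.split(",") if ip.strip()}
    return "*" in allowed or client_ip in allowed


def callback_url(
    client_ip: str,
    headers: Mapping[str, str],
    forwarded_allow_ips: str = "127.0.0.1",
    server_url: Optional[str] = None,
) -> str:
    """Build the OAuth callback URL the way this simplified model sees the request."""
    if server_url:
        # A pinned base URL wins and ignores whatever the proxy forwarded.
        return server_url.rstrip("/") + CALLBACK_PATH
    scheme = "http"  # what the container sees on its own socket
    if is_trusted_proxy(client_ip, forwarded_allow_ips):
        scheme = headers.get("x-forwarded-proto", scheme)
    host = headers.get("host", "localhost:8000")
    return f"{scheme}://{host}{CALLBACK_PATH}"


# Hypothetical request arriving from a proxy on the docker network.
proxied = {"host": "data.example.com", "x-forwarded-proto": "https"}
untrusted = callback_url("172.18.0.5", proxied)
trusted = callback_url("172.18.0.5", proxied, forwarded_allow_ips="*")
pinned = callback_url("172.18.0.5", {}, server_url="https://data.example.com/")
```

With the default allow list the proxy's header is ignored and the URL stays on `http://`, which is the shape that produces `redirect_uri_mismatch`. Trusting the proxy or pinning `SERVER_URL` both yield the `https://` URL.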
## `instance.yaml` requirements that affect auth
`config/loader.py:_validate_config` requires:
- `instance.name`
- `auth.allowed_domain` (CSV — e.g. `"example.com, partner.org"`; empty allows any verified Google account)
- `auth.webapp_secret_key` (typically `"${SESSION_SECRET}"`)
- `server.host`
- `server.hostname`
If any are missing, `app/instance_config.py` catches the `ValueError`, logs `Could not load instance.yaml: ... Using defaults`, and the app keeps running with **empty** instance config. That means `get_allowed_domains()` returns `[]` and **every verified Google account is allowed**. Always grep your runtime log for `Could not load instance.yaml` after a config change — silent fallback is by design (resilience over strictness) but easy to miss.
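
The consequence of an empty allow list is easiest to see in code. The following sketch assumes the CSV semantics described above. `parse_allowed_domains` and `is_email_allowed` are hypothetical helpers, not the app's `get_allowed_domains()`, and lower-casing is an assumption.

```python
from typing import List


def parse_allowed_domains(raw: str) -> List[str]:
    """Split a CSV value such as "example.com, partner.org" into clean domain names."""
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


def is_email_allowed(email: str, allowed: List[str]) -> bool:
    """An empty allow list admits every account, mirroring the fallback described above."""
    if not allowed:
        return True
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in allowed


configured = parse_allowed_domains("example.com, partner.org")
fallback = parse_allowed_domains("")  # what an unloaded instance.yaml amounts to
```

With `configured`, an outside address is rejected. With `fallback`, the same address is accepted, which is why the `Could not load instance.yaml` log line deserves attention.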
## OAuth client setup (Google Cloud Console)
1. APIs & Services → Credentials → "Create Credentials" → "OAuth client ID" → "Web application".
2. Authorized redirect URIs — one per public hostname:
```
https://<hostname>/auth/google/callback
```
Add `http://localhost:8000/auth/google/callback` for local dev.
3. The Client ID and Client Secret go into `GOOGLE_CLIENT_ID` / `GOOGLE_CLIENT_SECRET`.
## Common failure modes
| Symptom | Cause | Fix |
|---|---|---|
| `Error 400: redirect_uri_mismatch` | Either the URI isn't registered on the OAuth client, or the app generated `http://localhost:8000/...` because `FORWARDED_ALLOW_IPS` wasn't set (or `SERVER_URL` isn't defined and the proxy's `Host` header is missing / wrong). | Add the URI in Console; verify `FORWARDED_ALLOW_IPS=*` reaches the container; pin `SERVER_URL=https://<host>` to bypass `Host`-header reliance. |
| Login works but the user keeps getting re-prompted on the next request | Access-token cookie lost between requests. Common cause: `DOMAIN` unset → `Secure=False` but the browser hit the app over `https://` via a proxy and dropped the cookie for another reason; or `DOMAIN` set but the browser hit `http://`. | Set `DOMAIN=<hostname>` to match the terminator's hostname, and always serve over HTTPS to the browser. |
| `/login?error=google_not_configured` | `GOOGLE_CLIENT_ID` or `GOOGLE_CLIENT_SECRET` empty in container env. | Inspect `docker compose exec app env \| grep GOOGLE`. |
| `/login?error=domain_not_allowed` | User's email domain isn't in `auth.allowed_domain`. | Add the domain (CSV) and reload — note that allowed_domain only takes effect when `instance.yaml` validates (see above). |
| Login succeeds but `/admin/*` returns "Requires role admin or higher" | New user got `role="analyst"` (default for Google-provisioned users). The JWT in the cookie is also stale. | Set `SEED_ADMIN_EMAIL` BEFORE first login, or promote in DB and have the user log out + log back in. |
## DB role promotion (when `SEED_ADMIN_EMAIL` was missed)
The system DB (`/data/state/system.duckdb`) is held exclusively by uvicorn (PID 1 in the container), so `docker compose exec app python ...` can't open a second connection. Stop the app, run a throwaway container against the host volume, then restart.
Adjust the install dir, the host data path that the `data` volume maps to, and the image tag for your deployment:
```bash
cd <install-dir> # wherever docker-compose.yml lives
COMPOSE='docker compose -f docker-compose.yml -f docker-compose.prod.yml -f docker-compose.host-mount.yml'
$COMPOSE stop app scheduler
docker run --rm -v <data-dir>:/data --entrypoint python ghcr.io/keboola/agnes-the-ai-analyst:${AGNES_TAG:-stable} -c "
import duckdb
c = duckdb.connect('/data/state/system.duckdb')
c.execute(\"UPDATE users SET role = 'admin' WHERE email = 'me@example.com'\")
c.close()
"
$COMPOSE up -d app scheduler
```
The promoted user must sign out and sign back in — JWTs carry the role at issue time and don't refresh until a new token is issued.
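
To see why the old cookie keeps the old role, here is a stdlib-only sketch of an HS256 token whose payload carries a `role` claim. The claim names, the secret, and the helpers `issue_token` and `read_role` are assumptions for illustration. The app's real token format is not shown in this document.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(email: str, role: str, secret: bytes) -> str:
    """Freeze the role known at issue time into a signed token."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": email, "role": role}).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def read_role(token: str, secret: bytes) -> str:
    """Verify the signature and return the role stored inside the token."""
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, _sign(f"{header}.{payload}", secret)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload))["role"]


# Simulated flow: log in as analyst, get promoted in the DB, then log in again.
secret = b"dev-only-secret"  # placeholder, never use in production
users = {"me@example.com": "analyst"}
old_token = issue_token("me@example.com", users["me@example.com"], secret)
users["me@example.com"] = "admin"  # the UPDATE from the snippet above
new_token = issue_token("me@example.com", users["me@example.com"], secret)
```

The token issued before the promotion still reports `analyst` until a fresh login issues a new one.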