agnes-the-ai-analyst/docs/auth-groups.md
Petr Simecek 1c18cdf15f
release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70)
* feat(auth): mock session.google_groups in LOCAL_DEV_MODE via LOCAL_DEV_GROUPS

LOCAL_DEV_MODE auto-logged-in the dev user but left session.google_groups
empty, so group-aware UI/code paths can't be exercised on localhost without
a real Google OAuth round-trip. New LOCAL_DEV_GROUPS env var (JSON array
matching the production {id, name} shape) populates the session on every
dev-bypass request — same structure the OAuth callback writes, so mock and
prod stay in lockstep. Compare-then-write avoids spurious Set-Cookie noise
on PAT/CLI requests; malformed input falls back to [] with a WARNING so
the dev mock never breaks the dev flow.

* refactor(auth): fail-fast LOCAL_DEV_GROUPS at startup + cache + no-mutate

Three small follow-ups on the same dev-mock vector before merge:

- Validate LOCAL_DEV_GROUPS at app startup and report the parsed group IDs
  in the LOCAL_DEV_MODE banner. A malformed value now warns loudly at boot
  instead of silently logging on the first authenticated request, where
  it's easy to miss.
- Cache the parsed result single-slot, keyed by the raw env-string. Avoids
  re-parsing JSON on every authenticated request without test-isolation
  surprises — when the env value changes, the key changes and the cache
  transparently rebuilds.
- Stop mutating the parsed-input dicts (item.setdefault → spread-merge)
  so the cached list stays a fresh value on every rebuild.
- Replace the try/except guard around request.session with hasattr —
  SessionMiddleware is always registered, the silent except was paranoid.

Tests grow by a direct session-cookie inspection (decoupled from the
profile template) and three startup-banner log assertions.

* fix(auth): drop fragile session-decoder test + actually skip empty-target write

Two follow-ups on the LOCAL_DEV_GROUPS feature before merge:

- Drop test_session_holds_mocked_groups_directly. It manually decoded the
  signed session cookie via TimestampSigner + base64, hardcoding both the
  Starlette session-cookie format and the 14-day max_age. Starlette has
  changed its session encoding before (URLSafeTimedSerializer pre-0.20)
  and would do so again silently — the test would fail with a cryptic
  BadSignature, not a clear "mock is broken" signal. The remaining
  test_dev_user_sees_mocked_groups_on_profile already covers the same
  observable signal (mocked groups in /profile body) without coupling to
  Starlette internals.

- Actually skip the session write when target_groups is empty. The previous
  comment claimed compare-then-write avoided spurious Set-Cookie noise on
  PAT/CLI requests, but on those requests session.get("google_groups") is
  None and target is [], so None != [] always evaluates True and the write
  fired anyway, marking the session dirty and re-issuing Set-Cookie on
  every request. Adding `target_groups and ...` to the guard makes the
  comment honest: empty mock now genuinely no-ops, stable browser sessions
  still skip via value-equality, and the only remaining write is the one
  that actually changes state.

33 auth tests still pass locally.

* fix(auth): match production's always-write semantics for stale dev groups

Devin code-review finding on PR #70: my earlier `target_groups and ...`
short-circuit silently diverged from the production OAuth callback. In
app/auth/providers/google.py:189-194 the callback always writes
session.google_groups on each login — including [] on failure or empty
token — so the session always reflects authoritative current state. The
mock should match.

Failure mode the previous guard left open: a developer sets
LOCAL_DEV_GROUPS=[{...}] for a session, the groups land in the signed
cookie, then the developer unsets the env var and reloads. target → [],
session.get → [{...}], `if target_groups and ...` is False, no write,
stale groups stay in the browser session indefinitely. Mock now lies
about state until logout.

Fix splits the guard:
- target_groups truthy + value-changed → write the new mock (existing path)
- target_groups falsy + non-empty stored → write [] to clear stale state
- otherwise no-op (target [] + stored None/[]: no transition to record)

PAT/CLI requests with no prior session still take the no-op path
(target=[], session.get → None which is falsy), so the original goal of
suppressing spurious Set-Cookie noise on token traffic is preserved.

Tests already cover the populated and unset paths; the new clear-stale
branch is correct by construction (production has the same shape) and
the rare manual reset workflow.

* release(0.11.2): default mocked groups in make local-dev + docs/local-development.md

Cuts 0.11.2 around the LOCAL_DEV_GROUPS work plus a small dev-experience
follow-up: every `make local-dev` now boots with two sensible default
mocked groups (Local Dev Engineers + Local Dev Admins on example.com),
so /profile and group-aware code paths render something realistic
without the operator having to discover and set LOCAL_DEV_GROUPS.

Layered so the default lives in the workflow, not the contract:

- scripts/run-local-dev.sh seeds LOCAL_DEV_GROUPS via shell ":="
  syntax — only sets the var when the operator hasn't already.
  Override: LOCAL_DEV_GROUPS='[...]' make local-dev. Disable:
  LOCAL_DEV_GROUPS= make local-dev.
- docker-compose.local-dev.yml swaps the commented JSON example for
  a bare `- LOCAL_DEV_GROUPS` passthrough — the value comes from the
  shell, the compose file just propagates it. Operators running
  `docker compose up` directly without the wrapper script get an
  empty mock (correct: they didn't opt into the make-driven defaults).
- Makefile help line mentions the mocked groups so the behavior is
  visible without grepping.

New docs/local-development.md consolidates dev-onboarding instructions
that were previously scattered across docker-compose.local-dev.yml
inline comments, docs/auth-groups.md "Local-dev mock" section, the
Makefile help text, and CLAUDE.md "First-Time Setup". Single page now
covers TL;DR, what LOCAL_DEV_MODE actually bypasses, group mocking
controls + verification, what is *not* mocked (Cloud Identity, real
OAuth, admin Workspace permissions), and the safety rails that keep
the dev shortcuts off production.

Version bump 0.11.1 → 0.11.2 in pyproject.toml, CHANGELOG cuts
[Unreleased] → [0.11.2] — 2026-04-26 with a fresh empty [Unreleased]
skeleton.

* fix(local-dev): default LOCAL_DEV_GROUPS truncated by shell parameter expansion

Reported by an operator running `make local-dev` against the freshly
released 0.11.2 — the LOCAL_DEV_MODE banner showed:

    LOCAL_DEV_GROUPS is not valid JSON, ignoring:
    Expecting ',' delimiter: line 1 column 70 (char 69)
    LOCAL_DEV_GROUPS is set but produced no valid groups —
    check the WARNING above for the parse error.

Cause: the default value lived inside `${LOCAL_DEV_GROUPS:=…}` parameter
expansion. Bash matches `}` to close the expansion at the *first* `}`
encountered in the body, regardless of context — even one inside a
nested JSON object literal. The two-element JSON array was therefore
truncated to the first group's closing brace, leaving an unparseable
fragment:

    [{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers"

There is no escaping syntax for `}` inside parameter expansion (the
backslash escapes I had only escaped the quotes — `}` reaches bash
literally). Fix: hold the default in a single-quoted variable and
reference it through `${LOCAL_DEV_GROUPS:-$DEFAULT_LOCAL_DEV_GROUPS}`.
The variable's value is opaque to the expansion — no `}` matching
inside it — so the JSON survives intact. Verified with `python -m json`:

    parsed OK: 2 groups: ['local-dev-engineers@example.com',
                          'local-dev-admins@example.com']

Operators on a running 0.11.2 stack: `make local-dev-down && make
local-dev` to pick up the corrected default.

* fix(local-dev): respect LOCAL_DEV_GROUPS= disable path + add 0.11.2 changelog link

Two follow-ups from a Devin code-review pass on PR #70:

- run-local-dev.sh: switch ${LOCAL_DEV_GROUPS:-$DEFAULT} to
  ${LOCAL_DEV_GROUPS-$DEFAULT} (no leading colon). The :- form
  substitutes the default when the variable is unset OR set-but-empty,
  silently overwriting the documented disable knob. Three places
  promise this works — docs/local-development.md, the CHANGELOG entry,
  and the script's own comment — so the bug was an operator-facing
  lie, not just an implementation detail. The bare - form only
  substitutes on unset, so `LOCAL_DEV_GROUPS= make local-dev` now
  reaches the Python parser as "" and short-circuits to []. Verified
  with both empty and unset shells.

- CHANGELOG.md: add the [0.11.2] link reference at the bottom.
  Keep-a-Changelog convention is to mirror every version heading
  with a release-tag link in the footer; the 0.11.2 heading was
  missing its counterpart, breaking the Markdown link rendering on
  GitHub.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-26 16:48:55 +02:00

70 lines
4.5 KiB
Markdown

# Google Workspace Groups in /profile
How Agnes pulls a user's group memberships at Google sign-in and where they end up.
## Google Cloud setup (per OAuth client / project)
In the GCP project hosting the OAuth client (for Keboola dev: `kids-ai-data-analysis`):
1. **Enable Cloud Identity API**`APIs & Services → Library → "Cloud Identity API" → Enable`.
2. **OAuth consent screen → Data Access → Add or Remove Scopes** — manually add:
```
https://www.googleapis.com/auth/cloud-identity.groups.readonly
```
3. **OAuth client → Authorized redirect URIs** — must include `https://<host>/auth/google/callback` for the deployment that uses this client.
4. **OAuth consent screen → Audience** — keep `Internal` (own Workspace tenant only). `External` triggers verification review for the sensitive Cloud Identity scope.
That's it. No service account, no domain-wide delegation, no admin role per user.
## The `security` label trap
Cloud Identity exposes membership listing through `groups/-/memberships:searchTransitiveGroups`. Its `query` (CEL) **must include a label predicate**. Two label types matter:
- `cloudidentity.googleapis.com/groups.discussion_forum` — every Workspace group has it. **Returns 403 "Insufficient permissions"** for non-admin users.
- `cloudidentity.googleapis.com/groups.security` — only security-flagged groups have it as a top-level capability, but in practice **every Keboola Workspace group also carries this label**. **Returns 200** with the full membership list.
Agnes therefore queries with `security` (in `app/auth/providers/google.py`):
```python
"member_key_id == '<email>' && 'cloudidentity.googleapis.com/groups.security' in labels"
```
Switching to `discussion_forum` will silently break for everyone but Workspace admins.
## Storage + use
`app/auth/providers/google.py:google_callback` runs on every Google sign-in:
1. Fetch via `_fetch_google_groups(access_token, email)` → list of `{"id": "<email>", "name": "<displayName>"}`.
2. Write to `request.session["google_groups"]` (Starlette signed-cookie session — per-user, not in DB).
3. Failures (403, 401, network, 4xx) are swallowed and become `[]` so login never breaks.
Display: `app/web/templates/profile.html` reads `session.google_groups` and renders the list. Empty state explains "Groups are populated when you sign in with Google on a Workspace-enabled tenant."
**Not in DB.** Admin views (e.g. `/admin/users`) can't see other users' groups today — adding a `users.groups` column + persisting on callback is the path forward when that's needed.
**Refresh.** A user's stale session keeps stale groups. `Logout → sign in again` is the only refresh.
## Local-dev mock (no Google round-trip)
When developing on `localhost` with `LOCAL_DEV_MODE=1`, Google OAuth never runs, so `session.google_groups` would normally stay empty and group-aware UI/code paths can't be exercised. Set `LOCAL_DEV_GROUPS` to inject a mocked membership list:
```bash
export LOCAL_DEV_GROUPS='[{"id":"engineers@example.com","name":"Engineering"},{"id":"admins@example.com","name":"Admins"}]'
```
The value is a JSON array of objects matching the production shape (`{"id", "name"}`) so the mock and the real callback write the *same* structure into `session.google_groups`. Extra fields are preserved verbatim — handy for forward-compat testing of group attributes Google may return later.
`get_current_user` in `app/auth/dependencies.py` writes the parsed list into the session on every dev-bypass request (compare-then-write — no spurious `Set-Cookie` when the value is unchanged). Malformed input (invalid JSON, non-list, items missing `id`) is logged at WARNING and falls back to `[]` — the dev mock must never break the dev flow.
`docker-compose.local-dev.yml` carries a commented example at the right escape level for Compose YAML. **Never set this in production** — the variable is only honored when `LOCAL_DEV_MODE=1`.
## Debugging
`scripts/debug/probe_google_groups.py` — stdlib, takes a Playground-issued OAuth access token + email, hits 6 candidate endpoints, prints raw response. Use this **before** changing the production query — saves a deploy cycle per attempt.
```bash
python3 scripts/debug/probe_google_groups.py "ya29.…" user@keboola.com
```
Token via [OAuth 2.0 Playground](https://developers.google.com/oauthplayground/) → gear icon → own credentials → request the three scopes (`cloud-identity.groups.readonly`, `cloud-identity.groups`, `admin.directory.group.readonly`) → exchange code → copy access token.