agnes-the-ai-analyst/docs/auth-groups.md
Petr Simecek 1c18cdf15f
release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70)
* feat(auth): mock session.google_groups in LOCAL_DEV_MODE via LOCAL_DEV_GROUPS

LOCAL_DEV_MODE auto-logged-in the dev user but left session.google_groups
empty, so group-aware UI/code paths can't be exercised on localhost without
a real Google OAuth round-trip. New LOCAL_DEV_GROUPS env var (JSON array
matching the production {id, name} shape) populates the session on every
dev-bypass request — same structure the OAuth callback writes, so mock and
prod stay in lockstep. Compare-then-write avoids spurious Set-Cookie noise
on PAT/CLI requests; malformed input falls back to [] with a WARNING so
the dev mock never breaks the dev flow.

* refactor(auth): fail-fast LOCAL_DEV_GROUPS at startup + cache + no-mutate

Four small follow-ups on the same dev-mock vector before merge:

- Validate LOCAL_DEV_GROUPS at app startup and report the parsed group IDs
  in the LOCAL_DEV_MODE banner. A malformed value now warns loudly at boot
  instead of silently logging on the first authenticated request, where
  it's easy to miss.
- Cache the parsed result single-slot, keyed by the raw env-string. Avoids
  re-parsing JSON on every authenticated request without test-isolation
  surprises — when the env value changes, the key changes and the cache
  transparently rebuilds.
- Stop mutating the parsed-input dicts (item.setdefault → spread-merge)
  so the cached list stays a fresh value on every rebuild.
- Replace the try/except guard around request.session with hasattr —
  SessionMiddleware is always registered, the silent except was paranoid.
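
The single-slot cache from the second bullet could look roughly like the sketch below. The names `_cache`, `_parse`, and `cached_local_dev_groups` are hypothetical, and the stand-in parser is deliberately simpler than whatever the repository uses; only the "key on the raw env string" idea comes from the text above.

```python
import json
import os

# Single-slot cache: (raw env string, parsed groups). Hypothetical module state.
_cache = None


def _parse(raw):
    """Stand-in parser for this sketch: any malformed value becomes []."""
    try:
        data = json.loads(raw) if raw else []
    except json.JSONDecodeError:
        return []
    return data if isinstance(data, list) else []


def cached_local_dev_groups():
    """Return parsed LOCAL_DEV_GROUPS, re-parsing only when the raw value changes."""
    global _cache
    raw = os.environ.get("LOCAL_DEV_GROUPS", "")
    if _cache is None or _cache[0] != raw:
        _cache = (raw, _parse(raw))  # key changed: rebuild the single slot
    return _cache[1]
```

Because the key is the raw string itself, a test that changes the env var gets a rebuilt value on the next call without any explicit cache reset.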

Tests grow by a direct session-cookie inspection (decoupled from the
profile template) and three startup-banner log assertions.

* fix(auth): drop fragile session-decoder test + actually skip empty-target write

Two follow-ups on the LOCAL_DEV_GROUPS feature before merge:

- Drop test_session_holds_mocked_groups_directly. It manually decoded the
  signed session cookie via TimestampSigner + base64, hardcoding both the
  Starlette session-cookie format and the 14-day max_age. Starlette has
  changed its session encoding before (URLSafeTimedSerializer pre-0.20)
  and would do so again silently — the test would fail with a cryptic
  BadSignature, not a clear "mock is broken" signal. The remaining
  test_dev_user_sees_mocked_groups_on_profile already covers the same
  observable signal (mocked groups in /profile body) without coupling to
  Starlette internals.

- Actually skip the session write when target_groups is empty. The previous
  comment claimed compare-then-write avoided spurious Set-Cookie noise on
  PAT/CLI requests, but on those requests session.get("google_groups") is
  None and target is [], so None != [] always evaluates True and the write
  fired anyway, marking the session dirty and re-issuing Set-Cookie on
  every request. Adding `target_groups and ...` to the guard makes the
  comment honest: empty mock now genuinely no-ops, stable browser sessions
  still skip via value-equality, and the only remaining write is the one
  that actually changes state.

33 auth tests still pass locally.

* fix(auth): match production's always-write semantics for stale dev groups

Devin code-review finding on PR #70: my earlier `target_groups and ...`
short-circuit silently diverged from the production OAuth callback. In
app/auth/providers/google.py:189-194 the callback always writes
session.google_groups on each login — including [] on failure or empty
token — so the session always reflects authoritative current state. The
mock should match.

Failure mode the previous guard left open: a developer sets
LOCAL_DEV_GROUPS=[{...}] for a session, the groups land in the signed
cookie, then the developer unsets the env var and reloads. target → [],
session.get → [{...}], `if target_groups and ...` is False, no write,
stale groups stay in the browser session indefinitely. Mock now lies
about state until logout.

Fix splits the guard:
- target_groups truthy + value-changed → write the new mock (existing path)
- target_groups falsy + non-empty stored → write [] to clear stale state
- otherwise no-op (target [] + stored None/[]: no transition to record)

PAT/CLI requests with no prior session still take the no-op path
(target=[], session.get → None which is falsy), so the original goal of
suppressing spurious Set-Cookie noise on token traffic is preserved.
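
The split guard can be sketched as a small function over a dict-like session. `sync_mock_groups` is a hypothetical name and a plain dict stands in for request.session; the three branches mirror the list above, not the literal code in the repository.

```python
def sync_mock_groups(session, target_groups):
    """Apply the three-way guard; return True only when the session was written.

    Hypothetical helper: `session` stands in for request.session.
    """
    stored = session.get("google_groups")
    if target_groups:
        if stored != target_groups:
            session["google_groups"] = target_groups  # new or changed mock
            return True
        return False  # unchanged mock: leave the session clean
    if stored:
        session["google_groups"] = []  # env var unset: clear stale groups
        return True
    return False  # target [] and stored None/[]: nothing to record
```

A PAT/CLI request with no prior session hits the last branch, so no write happens and no Set-Cookie is issued.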

Tests already cover the populated and unset paths; the new clear-stale
branch is correct by construction (production has the same shape) and
only affects the rare manual reset workflow.

* release(0.11.2): default mocked groups in make local-dev + docs/local-development.md

Cuts 0.11.2 around the LOCAL_DEV_GROUPS work plus a small dev-experience
follow-up: every `make local-dev` now boots with two sensible default
mocked groups (Local Dev Engineers + Local Dev Admins on example.com),
so /profile and group-aware code paths render something realistic
without the operator having to discover and set LOCAL_DEV_GROUPS.

Layered so the default lives in the workflow, not the contract:

- scripts/run-local-dev.sh seeds LOCAL_DEV_GROUPS via shell ":="
  syntax — only sets the var when the operator hasn't already.
  Override: LOCAL_DEV_GROUPS='[...]' make local-dev. Disable:
  LOCAL_DEV_GROUPS= make local-dev.
- docker-compose.local-dev.yml swaps the commented JSON example for
  a bare `- LOCAL_DEV_GROUPS` passthrough — the value comes from the
  shell, the compose file just propagates it. Operators running
  `docker compose up` directly without the wrapper script get an
  empty mock (correct: they didn't opt into the make-driven defaults).
- Makefile help line mentions the mocked groups so the behavior is
  visible without grepping.

New docs/local-development.md consolidates dev-onboarding instructions
that were previously scattered across docker-compose.local-dev.yml
inline comments, docs/auth-groups.md "Local-dev mock" section, the
Makefile help text, and CLAUDE.md "First-Time Setup". Single page now
covers TL;DR, what LOCAL_DEV_MODE actually bypasses, group mocking
controls + verification, what is *not* mocked (Cloud Identity, real
OAuth, admin Workspace permissions), and the safety rails that keep
the dev shortcuts off production.

Version bump 0.11.1 → 0.11.2 in pyproject.toml, CHANGELOG cuts
[Unreleased] → [0.11.2] — 2026-04-26 with a fresh empty [Unreleased]
skeleton.

* fix(local-dev): default LOCAL_DEV_GROUPS truncated by shell parameter expansion

Reported by an operator running `make local-dev` against the freshly
released 0.11.2 — the LOCAL_DEV_MODE banner showed:

    LOCAL_DEV_GROUPS is not valid JSON, ignoring:
    Expecting ',' delimiter: line 1 column 70 (char 69)
    LOCAL_DEV_GROUPS is set but produced no valid groups —
    check the WARNING above for the parse error.

Cause: the default value lived inside `${LOCAL_DEV_GROUPS:=…}` parameter
expansion. Bash matches `}` to close the expansion at the *first* `}`
encountered in the body, regardless of context — even one inside a
nested JSON object literal. The two-element JSON array was therefore
truncated to the first group's closing brace, leaving an unparseable
fragment:

    [{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers"

There is no escaping syntax for `}` inside parameter expansion (the
backslash escapes I had added only escaped the quotes, so `}` still
reached bash literally). Fix: hold the default in a single-quoted variable and
reference it through `${LOCAL_DEV_GROUPS:-$DEFAULT_LOCAL_DEV_GROUPS}`.
The variable's value is opaque to the expansion — no `}` matching
inside it — so the JSON survives intact. Verified with `python -m json`:

    parsed OK: 2 groups: ['local-dev-engineers@example.com',
                          'local-dev-admins@example.com']

Operators on a running 0.11.2 stack: `make local-dev-down && make
local-dev` to pick up the corrected default.
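
The failure and the fix can be reproduced without a shell. The sketch below emulates the reported truncation by cutting the value at the first `}` and then tries to parse both variants. The group ids and names come from the text above; the exact whitespace of the shipped default and the helper names are assumptions.

```python
import json

# Assumed compact form of the two default groups.
FULL = (
    '[{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers"},'
    '{"id":"local-dev-admins@example.com","name":"Local Dev Admins"}]'
)


def truncate_at_first_brace(value):
    """Emulate the reported failure: keep only the text before the first '}'."""
    return value.split("}", 1)[0]


def describe(value):
    """Return the parsed group ids, or the JSON error message as a string."""
    try:
        return [group["id"] for group in json.loads(value)]
    except json.JSONDecodeError as exc:
        return str(exc)


print(describe(FULL))                           # list of two group ids
print(describe(truncate_at_first_brace(FULL)))  # a JSON parse error message
```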

* fix(local-dev): respect LOCAL_DEV_GROUPS= disable path + add 0.11.2 changelog link

Two follow-ups from a Devin code-review pass on PR #70:

- run-local-dev.sh: switch ${LOCAL_DEV_GROUPS:-$DEFAULT} to
  ${LOCAL_DEV_GROUPS-$DEFAULT} (no leading colon). The :- form
  substitutes the default when the variable is unset OR set-but-empty,
  silently overwriting the documented disable knob. Three places
  promise this works — docs/local-development.md, the CHANGELOG entry,
  and the script's own comment — so the bug was an operator-facing
  lie, not just an implementation detail. The bare - form only
  substitutes on unset, so `LOCAL_DEV_GROUPS= make local-dev` now
  reaches the Python parser as "" and short-circuits to []. Verified
  with both empty and unset shells.

- CHANGELOG.md: add the [0.11.2] link reference at the bottom.
  Keep-a-Changelog convention is to mirror every version heading
  with a release-tag link in the footer; the 0.11.2 heading was
  missing its counterpart, breaking the Markdown link rendering on
  GitHub.
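
The difference between the two expansion forms in the first bullet is easy to misread, so here is a small Python emulation of the POSIX rules. `expand` is a hypothetical helper, and a dict stands in for the shell environment.

```python
def expand(env, name, default, colon):
    """Emulate ${name-default} (colon=False) and ${name:-default} (colon=True)."""
    if name not in env:
        return default  # unset: both forms substitute the default
    value = env[name]
    if colon and value == "":
        return default  # set but empty: only the colon form substitutes
    return value


DEFAULT = '[{"id":"local-dev-engineers@example.com"}]'

# LOCAL_DEV_GROUPS= make local-dev  (set, but empty)
print(repr(expand({"LOCAL_DEV_GROUPS": ""}, "LOCAL_DEV_GROUPS", DEFAULT, colon=True)))
print(repr(expand({"LOCAL_DEV_GROUPS": ""}, "LOCAL_DEV_GROUPS", DEFAULT, colon=False)))
```

With the colon form the empty value is replaced by the default, which defeats the disable knob; with the bare form the empty string passes through unchanged.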

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-26 16:48:55 +02:00


Google Workspace Groups in /profile

How Agnes pulls a user's group memberships at Google sign-in and where they end up.

Google Cloud setup (per OAuth client / project)

In the GCP project hosting the OAuth client (for Keboola dev: kids-ai-data-analysis):

  1. Enable the Cloud Identity API: APIs & Services → Library → "Cloud Identity API" → Enable.
  2. OAuth consent screen → Data Access → Add or Remove Scopes: manually add
     https://www.googleapis.com/auth/cloud-identity.groups.readonly
  3. OAuth client → Authorized redirect URIs — must include https://<host>/auth/google/callback for the deployment that uses this client.
  4. OAuth consent screen → Audience — keep Internal (own Workspace tenant only). External triggers verification review for the sensitive Cloud Identity scope.

That's it. No service account, no domain-wide delegation, no admin role per user.

The security label trap

Cloud Identity exposes membership listing through groups/-/memberships:searchTransitiveGroups. Its query (CEL) must include a label predicate. Two label types matter:

  • cloudidentity.googleapis.com/groups.discussion_forum — every Workspace group has it. Returns 403 "Insufficient permissions" for non-admin users.
  • cloudidentity.googleapis.com/groups.security — only security-flagged groups have it as a top-level capability, but in practice every Keboola Workspace group also carries this label. Returns 200 with the full membership list.

Agnes therefore queries with security (in app/auth/providers/google.py):

"member_key_id == '<email>' && 'cloudidentity.googleapis.com/groups.security' in labels"

Switching to discussion_forum will silently break for everyone but Workspace admins.
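
As a rough illustration, the query string and request could be assembled like this. The helper names are hypothetical, and the host and v1 prefix are assumptions based on the public Cloud Identity REST API; only the path suffix and the query expression come from this page.

```python
from urllib.parse import urlencode

SECURITY_LABEL = "cloudidentity.googleapis.com/groups.security"
# Assumed base URL; the page only names the groups/-/memberships path.
SEARCH_URL = (
    "https://cloudidentity.googleapis.com/v1/"
    "groups/-/memberships:searchTransitiveGroups"
)


def build_query(email):
    """Build the CEL expression with the security label predicate."""
    return f"member_key_id == '{email}' && '{SECURITY_LABEL}' in labels"


def build_request(access_token, email):
    """Return (url, headers) for a GET request; nothing is sent here."""
    url = f"{SEARCH_URL}?{urlencode({'query': build_query(email)})}"
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers
```

Keeping the label in one constant makes an accidental switch to discussion_forum easier to spot in review.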

Storage + use

app/auth/providers/google.py:google_callback runs on every Google sign-in:

  1. Fetch via _fetch_google_groups(access_token, email) → list of {"id": "<email>", "name": "<displayName>"}.
  2. Write to request.session["google_groups"] (Starlette signed-cookie session — per-user, not in DB).
  3. Failures (401, 403, other 4xx responses, network errors) are swallowed and become [] so login never breaks.
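
The steps above can be sketched as follows. `store_groups` and its `fetch_groups` parameter are hypothetical stand-ins for the callback and _fetch_google_groups, and whether the real code swallows errors inside the fetch helper or in the callback is not stated here.

```python
import logging

logger = logging.getLogger("google_groups_sketch")


def store_groups(session, fetch_groups, access_token, email):
    """Write fetched groups into the session; any failure becomes []."""
    try:
        groups = fetch_groups(access_token, email)
    except Exception as exc:  # 401/403, network errors, unexpected payloads
        logger.warning("Google groups fetch failed, storing []: %s", exc)
        groups = []
    session["google_groups"] = groups  # always written, even when empty
    return groups
```

Writing [] on failure means the session reflects the latest sign-in rather than keeping groups from an earlier one.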

Display: app/web/templates/profile.html reads session.google_groups and renders the list. Empty state explains "Groups are populated when you sign in with Google on a Workspace-enabled tenant."

Not in DB. Admin views (e.g. /admin/users) can't see other users' groups today — adding a users.groups column + persisting on callback is the path forward when that's needed.

Refresh. A user's stale session keeps stale groups. Logout → sign in again is the only refresh.

Local-dev mock (no Google round-trip)

When developing on localhost with LOCAL_DEV_MODE=1, Google OAuth never runs, so session.google_groups would normally stay empty and group-aware UI/code paths can't be exercised. Set LOCAL_DEV_GROUPS to inject a mocked membership list:

export LOCAL_DEV_GROUPS='[{"id":"engineers@example.com","name":"Engineering"},{"id":"admins@example.com","name":"Admins"}]'

The value is a JSON array of objects matching the production shape ({"id", "name"}) so the mock and the real callback write the same structure into session.google_groups. Extra fields are preserved verbatim — handy for forward-compat testing of group attributes Google may return later.

get_current_user in app/auth/dependencies.py writes the parsed list into the session on every dev-bypass request (compare-then-write — no spurious Set-Cookie when the value is unchanged). Malformed input (invalid JSON, non-list, items missing id) is logged at WARNING and falls back to [] — the dev mock must never break the dev flow.
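
A minimal sketch of the parse-and-fall-back behavior, assuming a hypothetical helper `parse_local_dev_groups`. Skipping individual items that lack an id (rather than rejecting the whole list) and defaulting name to the id are assumptions of this sketch, not confirmed behavior.

```python
import json
import logging

logger = logging.getLogger("local_dev_groups_sketch")


def parse_local_dev_groups(raw):
    """Parse a LOCAL_DEV_GROUPS value; hypothetical helper, not the project's code.

    Returns a list of {"id", "name", ...} dicts, or [] on malformed input.
    """
    if not raw:
        return []  # unset or empty string: no mocked groups
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        logger.warning("LOCAL_DEV_GROUPS is not valid JSON, ignoring: %s", exc)
        return []
    if not isinstance(data, list):
        logger.warning("LOCAL_DEV_GROUPS must be a JSON array, ignoring")
        return []
    groups = []
    for item in data:
        if not isinstance(item, dict) or not item.get("id"):
            logger.warning("Skipping LOCAL_DEV_GROUPS item without an id: %r", item)
            continue
        # Build a new dict instead of mutating the parsed input; extra
        # fields are kept, and "name" defaults to the id (assumption).
        groups.append({"name": item["id"], **item})
    return groups
```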

docker-compose.local-dev.yml passes LOCAL_DEV_GROUPS through from the shell environment, and make local-dev seeds two default groups unless the variable is already set. Never set this in production: the variable is only honored when LOCAL_DEV_MODE=1.

Debugging

scripts/debug/probe_google_groups.py — stdlib, takes a Playground-issued OAuth access token + email, hits 6 candidate endpoints, prints raw response. Use this before changing the production query — saves a deploy cycle per attempt.

python3 scripts/debug/probe_google_groups.py "ya29.…" user@keboola.com

Token via OAuth 2.0 Playground → gear icon → own credentials → request the three scopes (cloud-identity.groups.readonly, cloud-identity.groups, admin.directory.group.readonly) → exchange code → copy access token.