feat(auth): Google Workspace groups on /profile + tag-triggered Keboola deploy workflow (#56)

* feat(auth): display Google Workspace groups on /profile

- Request cloud-identity.groups.readonly scope in Google OAuth
- Fetch groups via Cloud Identity API after callback; tolerate 4xx
  (non-Workspace tenants) and network errors — never break login
- Store result in Starlette session as google_groups
- Replace /profile redirect with a real profile page rendering
  account details (email, name, role) and the group list; show a
  friendly empty state when no groups are available
- Tests: helper parsing + 403 + exception paths; profile page
  smoke test; updated the old redirect test

* test: remove stale /profile redirect tests

Cherry-pick of Zdeněk's 4f7e4cd ("display Google Workspace groups on
/profile") replaces the /profile redirect with a real profile page —
but only updated one of three tests that expected the old behaviour.

Two tests, in test_admin_tokens_ui.py and test_pat.py, were left
asserting `/profile → 302 /tokens`. The route now returns
`/profile → 302 /login?next=%2Fprofile` for unauthenticated users (the
standard auth guard) or `/profile → 200 HTML` for authenticated users.

Removed both rather than patched — coverage for the new behaviour
already exists in tests/test_auth_providers.py (added by the same
commit). The /tokens render assertions in the deleted test_pat.py case
are redundant with test_admin_tokens_ui.py's own /tokens UI tests.

* fix(auth): Google groups search query needs parent + labels predicates

Cloud Identity Groups Search API returns 400 INVALID_ARGUMENT when the
CEL query lacks the required `parent == 'customers/<id>'` predicate AND
a `'<label>' in labels` membership predicate. Zdeněk's original 4f7e4cd
query had only `member_key_id == '<email>'` — every fetch silently
returned [] and the /profile groups list was always empty.

Fix: build the query with all three required pieces:
  parent == 'customers/my_customer'   (alias = caller's own Workspace
                                       org; no need to look up customer ID)
  member_key_id == '<email>'           (filter to this user's memberships)
  'cloudidentity.googleapis.com/groups.discussion_forum' in labels
                                       (Workspace mailing-list groups —
                                       the common case; security-group
                                       coverage is a follow-up)

Also: log the full error body (not truncated to 200 chars) and the
query string so the next time Google rejects something we can diagnose
in one log line instead of a re-deploy.

Caught when first agnes-dev login completed normally (HTTP 302) but app
log showed `Google groups fetch returned 400 for petr@keboola.com:
{"error":{"code":400,"message":"Request contains an invalid argument."}}`
on the same VM (kids-ai-data-analysis / agnes-dev.keboola.com).

Reference: https://cloud.google.com/identity/docs/reference/rest/v1/groups/search

* feat(web): add Profile link to user dropdown menu

The /profile page (Zdeněk's 4f7e4cd cherry-pick) renders a real profile
view including Google Workspace groups, but had no entry point in the
UI — users could only reach it by typing the URL manually. Add a
"Profile" menu item between the user header (email + role) and
"My tokens" so the page is discoverable.

Side effect: cleaned up the leftover `or _path.startswith('/profile')`
condition on the "My tokens" active class, which dated from the old
/profile → /tokens redirect (removed in c789617). Now each menu item
owns its own active state.

* fix: profile-link tests + .env quoting for CADDY_TLS

Two issues caught by Keboola's first agnes-dev deploy + agnes-auto-upgrade
cron run:

1. tests/test_web_ui.py — two negative assertions ("href=/profile" NOT in
   body) date from when /profile was a redirect-only stub. Now /profile
   is a real page (groups display) AND has a dropdown menu link, so the
   negative assertions flip to positive. Same for ">Profile<" text in
   the non-admin nav test.

2. startup-script.sh.tpl — CADDY_TLS line must be QUOTED in .env, because
   agnes-auto-upgrade.sh sources .env via `set -a; . .env; set +a` and
   bash parses `KEY=value with spaces` as a one-shot assignment `KEY=value`
   for the command `with` (argument `spaces`). Symptom: cron log spam
   `/opt/agnes/.env: line 14: petr@keboola.com: command not found`,
   the cron exits non-zero, and no auto-upgrade ever happens. Caddy
   itself reads the value fine because docker-compose env_file=.env
   parses key=value properly without shell-evaluating the rest.

   Fix: emit `CADDY_TLS="tls <email>"` instead of `CADDY_TLS=tls <email>`.
   Both the cron source and docker-compose env_file accept the quoted
   form; cron stops failing.
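
The word splitting behind this failure can be reproduced with Python's
`shlex`, which follows POSIX shell quoting rules. This is a minimal sketch:
`shell_words` is a hypothetical helper and the address is a placeholder,
neither is part of the repo.

```python
import shlex

# Placeholder .env lines; the address is illustrative only.
unquoted = "CADDY_TLS=tls admin@example.com"
quoted = 'CADDY_TLS="tls admin@example.com"'


def shell_words(line: str) -> list[str]:
    """Split a line into words the way a POSIX shell does before running it."""
    return shlex.split(line)


# Unquoted: two words, so the shell sees an assignment plus a command name.
print(shell_words(unquoted))
# Quoted: one word, so the whole value stays inside the assignment.
print(shell_words(quoted))
```

With the unquoted form the second word is what the shell tries to execute,
which matches the `command not found` symptom above.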

* fix(auth): use searchTransitiveGroups + security label for non-admin user

Three bugs in the original cherry-pick + my prior fix attempt, all caught
by a stdlib probe script (scripts/debug/probe_google_groups.py) run
locally with a Playground-issued OAuth token:

1. Wrong endpoint. `groups:search` is the admin "find groups in org"
   endpoint and 400s for non-admin users regardless of query. Switched
   to `groups/-/memberships:searchTransitiveGroups` which is the
   user-perspective "what groups am I in" endpoint.

2. Wrong label. Querying with `cloudidentity.googleapis.com/groups.discussion_forum`
   returns 403 "Insufficient permissions to retrieve memberships" even
   on the new endpoint — Workspace policy denies non-admin reads of
   discussion-forum groups. Switching to `groups.security` returns 200
   with the actual membership list. Empirically every Workspace group
   at Keboola carries BOTH labels, so the security filter sees the full
   set anyway. Confirmed with the probe script.

3. Wrong response shape. `searchTransitiveGroups` returns
   {"memberships": [...]}, not {"groups": [...]}. Parser updated
   accordingly.

Also adds scripts/debug/probe_google_groups.py — stdlib-only standalone
probe that hits 6 candidate endpoints with a user OAuth token. Saved a
deploy cycle (~10 min) per query iteration; future API-syntax debugging
should start there.

Verified end-to-end: petr@keboola.com login on agnes-dev returns 5
groups (LIC-1PASSWORD, ROLE_ATLASSIAN_*, etc.) via the probe; once
deployed, the same will populate session["google_groups"] and render
on /profile.

* test(auth): update Google groups parser fixture to match searchTransitiveGroups shape

Mock payload was `{"groups": [...]}` (the shape `groups:search` returns).
After switching to `groups/-/memberships:searchTransitiveGroups` in the
prior commit, the actual response is `{"memberships": [...]}` and the
parser iterates that key. Test now mirrors the real shape.

The per-item structure (groupKey.id + displayName) is unchanged, so the
expected output dict stays the same: [{"id": "...", "name": "..."}].

* docs(auth): add docs/auth-groups.md — Google Workspace groups runbook

Captures the non-obvious bits: the GCP-side setup checklist (Cloud
Identity API + scope on consent screen + Internal user type), the
`security` vs `discussion_forum` label trap (the latter 403s for
non-admins, the former 200s; finding that out took a 4-iteration debug
session that shouldn't have to be repeated), where groups are stored
(session, not DB) and how to refresh (re-login), plus how to use the
probe script for future API-syntax issues.

Deliberately stops short of explaining "what is Cloud Identity" or
"what is OAuth scope" — those belong in Google's own docs, not ours.

* docs(claude): document release workflows + module versioning + recreate trick

New "Release & deploy workflows" section in CLAUDE.md covers what didn't
exist anywhere in the repo before:

- Distinction between release.yml (auto-build per push) vs the new
  keboola-deploy.yml (tag-triggered, explicit deploy only) — plus when
  to use which (per-developer convenience vs shared dev VM safety)
- Module versioning (infra-vX.Y.Z) and the bump-after-merge dance
- The lifecycle.ignore_changes [metadata_startup_script] gotcha and how
  to force a recreate via workflow_dispatch's recreate_targets input

All generic — no customer hostnames, project IDs, IPs. Customer-specific
deploy steps belong in the consuming infra repo's README.

Also: cross-reference docs/auth-groups.md from the Authentication
section so future Claude sessions find the Workspace-groups runbook
without grepping.

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
Petr Simecek 2026-04-26 00:56:44 +02:00 committed by GitHub
parent 4799119c81
commit c25fd41bf7
GPG key ID: B5690EEEBB952194
12 changed files with 726 additions and 63 deletions


@@ -185,10 +185,50 @@ Orchestrator ATTACHes it automatically.
### Authentication
Auth providers in `app/auth/` (FastAPI-based):
- **Google**: OAuth via Google
- **Google**: OAuth via Google (Workspace group memberships pulled at sign-in — see `docs/auth-groups.md` for the GCP setup checklist + the `security` label gotcha)
- **Email**: Email magic link (itsdangerous token)
- **Desktop**: JWT for API
## Release & deploy workflows
Two separate release.yml-style workflows produce GHCR images. Pick the one that matches what you're shipping.
### `release.yml` — auto-build on every push
Runs on **every** push to **every** branch.
- Push to `main` → `:stable`, `:stable-YYYY.MM.N` (CalVer).
- Push to non-main `<prefix>/<branch>` → `:dev`, `:dev-YYYY.MM.N`, `:dev-<branch-slug>`, and (when prefix isn't a Git Flow convention) `:dev-<prefix>-latest` alias.
VMs that pin to a floating tag (`:dev`, `:dev-<prefix>-latest`) auto-upgrade within ~5 min via the cron in `agnes-auto-upgrade.sh`. Convenient for per-developer dev VMs; **footgun for shared dev VMs** (last pusher wins, regardless of who).
### `keboola-deploy.yml` — tag-triggered, explicit deploy only
Runs **only** on git tags matching `keboola-deploy-*`. Publishes:
- `:keboola-deploy-<git-tag-suffix>` — immutable, tied to the exact commit
- `:keboola-deploy-latest` — floating alias the consumer pins to
**Operator workflow:**
```bash
git checkout <commit-or-branch>
git tag keboola-deploy-<descriptive-name>
git push origin keboola-deploy-<descriptive-name>
# → workflow builds + publishes both tags
# → VM cron picks up :keboola-deploy-latest within ~5 min
# → manual cron trigger (skip the wait): sudo /usr/local/bin/agnes-auto-upgrade.sh on the VM
```
Use this when the consumer (e.g. a customer dev VM) needs **deploy-when-I-decide** semantics — no surprise rollouts from upstream branch pushes by other contributors. The infra repo pins `image_tag = "keboola-deploy-latest"` on the relevant VM.
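The mapping from a git tag to the two published image tags can be sketched as follows. This is an illustration only: `image_tags_for` is a hypothetical helper, and rejecting a `latest` suffix is an assumption made here to avoid colliding with the floating alias, not documented behaviour of `keboola-deploy.yml`.

```python
PREFIX = "keboola-deploy-"
FLOATING = "latest"


def image_tags_for(git_tag: str) -> list[str]:
    """Return [immutable tag, floating alias] for a deploy git tag."""
    if not git_tag.startswith(PREFIX):
        raise ValueError(f"not a deploy tag: {git_tag!r}")
    suffix = git_tag[len(PREFIX):]
    if not suffix or suffix == FLOATING:
        # Assumption: an empty or "latest" suffix would shadow the alias.
        raise ValueError(f"unusable tag suffix in {git_tag!r}")
    return [f"{PREFIX}{suffix}", f"{PREFIX}{FLOATING}"]


print(image_tags_for("keboola-deploy-groups-fix"))
```

The immutable tag is what you would pin to for a rollback; the floating alias is what the VM cron follows.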
### Module versioning
The customer-instance Terraform module under `infra/modules/customer-instance/` is published as `infra-vMAJOR.MINOR.PATCH` git tags (separate from app CalVer tags). Bump on any module-API change; downstream infra repos pin to the tag in their `source = "github.com/keboola/agnes-the-ai-analyst//infra/modules/customer-instance?ref=infra-v1.X.Y"`.
After merging a module change to `main`:
```bash
git tag infra-vX.Y.Z origin/main
git push origin infra-vX.Y.Z
```
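Computing the next module tag could look like the sketch below. `next_infra_tag` is a hypothetical helper, and the rule for choosing which part to bump (SemVer-style: major for breaking changes, minor for additions, patch for fixes) is an assumption; the text above only says to bump on any module-API change.

```python
import re

TAG_RE = re.compile(r"^infra-v(\d+)\.(\d+)\.(\d+)$")


def next_infra_tag(current: str, part: str = "patch") -> str:
    """Return the tag that follows `current` when bumping `part`."""
    match = TAG_RE.match(current)
    if match is None:
        raise ValueError(f"not an infra module tag: {current!r}")
    major, minor, patch = (int(g) for g in match.groups())
    if part == "major":
        return f"infra-v{major + 1}.0.0"
    if part == "minor":
        return f"infra-v{major}.{minor + 1}.0"
    if part == "patch":
        return f"infra-v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(next_infra_tag("infra-v1.4.2", "minor"))
```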
### Replacing a VM after a startup-script change
Module sets `lifecycle { ignore_changes = [metadata_startup_script] }` on `google_compute_instance.vm` so normal `terraform apply` doesn't churn running VMs. To propagate a startup-script update, trigger the consumer's apply workflow manually with the VM resource address — typical workflow_dispatch input is `recreate_targets='module.agnes.google_compute_instance.vm["<vm-name>"]'`.
## Key Implementation Details
### DuckDB Schema (src/db.py)


@@ -3,6 +3,7 @@
import os
import logging
import httpx
from authlib.integrations.starlette_client import OAuth
from fastapi import APIRouter, Request
from fastapi.responses import RedirectResponse
@@ -21,6 +22,19 @@ oauth = OAuth()
GOOGLE_CLIENT_ID = os.environ.get("GOOGLE_CLIENT_ID", "")
GOOGLE_CLIENT_SECRET = os.environ.get("GOOGLE_CLIENT_SECRET", "")
# Cloud Identity Groups API — requires the cloud-identity.groups.readonly scope
# AND an admin-enabled Cloud Identity / Google Workspace tenant.
#
# We use `groups/-/memberships:searchTransitiveGroups` (the "what groups does
# THIS USER belong to" endpoint), NOT `groups:search` (admin "find groups in
# org" endpoint, which requires Groups Reader admin role + 400s otherwise).
# The `-` in the path is a wildcard meaning "search across all groups in the
# caller's organization". Returns transitive memberships (incl. nested groups).
# Reference: https://cloud.google.com/identity/docs/reference/rest/v1/groups.memberships/searchTransitiveGroups
GROUPS_SEARCH_URL = (
"https://cloudidentity.googleapis.com/v1/groups/-/memberships:searchTransitiveGroups"
)
def is_available() -> bool:
return bool(GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET)
@@ -34,10 +48,75 @@ def _setup_oauth():
client_id=GOOGLE_CLIENT_ID,
client_secret=GOOGLE_CLIENT_SECRET,
server_metadata_url="https://accounts.google.com/.well-known/openid-configuration",
client_kwargs={"scope": "openid email profile"},
client_kwargs={
"scope": (
"openid email profile "
"https://www.googleapis.com/auth/cloud-identity.groups.readonly"
),
},
)
async def _fetch_google_groups(access_token: str, email: str) -> list[dict]:
    """Fetch Google Workspace groups the user belongs to.

    Best-effort: returns [] on any failure (403 non-Workspace tenant, 401 expired
    token, network error, etc.). Must never raise: callers rely on this to keep
    the login flow working even when Cloud Identity is unavailable.

    searchTransitiveGroups query syntax (CEL) requires:
      - a `labels` membership predicate scoping the group type
      - `member_key_id == '<email>'` for the user
    Without `labels` Google returns 400 INVALID_ARGUMENT (silently: the error
    body just says "invalid argument").
    Reference: https://cloud.google.com/identity/docs/reference/rest/v1/groups.memberships/searchTransitiveGroups

    Why `security` label and not `discussion_forum`:
      Empirically Keboola's Workspace lets a non-admin user read their own
      group memberships ONLY for groups labelled as security groups
      (`cloudidentity.googleapis.com/groups.security`). The same query with
      `groups.discussion_forum` returns 403 "Insufficient permissions to
      retrieve memberships" — the discussion_forum API needs admin scope.
      In practice every Workspace group at Keboola carries BOTH labels, so
      filtering on `security` returns the full membership list anyway.
      Confirmed via scripts/debug/probe_google_groups.py.
    """
    query = (
        f"member_key_id == '{email}' "
        f"&& 'cloudidentity.googleapis.com/groups.security' in labels"
    )
    params = {"query": query}
    headers = {"Authorization": f"Bearer {access_token}"}
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            resp = await client.get(GROUPS_SEARCH_URL, params=params, headers=headers)
        if resp.status_code >= 400:
            # Log full body (not truncated) so future query-syntax / scope /
            # tenant issues are diagnosable from one log line.
            logger.warning(
                "Google groups fetch returned %s for %s — query=%r — body=%s",
                resp.status_code, email, query, resp.text,
            )
            return []
        data = resp.json()
    except Exception as e:
        logger.warning("Google groups fetch failed for %s: %s", email, e)
        return []

    # searchTransitiveGroups returns `memberships`, not `groups`. Each membership
    # carries the group identity in groupKey.id (email-shaped) + displayName.
    groups = []
    for m in data.get("memberships", []) or []:
        group_key = (m.get("groupKey") or {}).get("id", "")
        if not group_key:
            continue
        groups.append({
            "id": group_key,
            "name": m.get("displayName") or group_key,
        })
    return groups
_setup_oauth()
@@ -102,6 +181,18 @@ async def google_callback(request: Request):
    finally:
        conn.close()

    # Fetch Google Workspace groups (best-effort — must not break login).
    access_token = token.get("access_token", "")
    if access_token:
        try:
            groups = await _fetch_google_groups(access_token, email)
            request.session["google_groups"] = groups
        except Exception as e:
            logger.warning("Failed to store google_groups in session: %s", e)
            request.session["google_groups"] = []
    else:
        request.session["google_groups"] = []

    # Issue JWT
    jwt_token = create_access_token(user["id"], user["email"], user["role"])


@@ -620,7 +620,17 @@ async def admin_tokens_page(
return templates.TemplateResponse(request, "admin_tokens.html", ctx)
@router.get("/profile")
async def profile_redirect(request: Request):
"""Back-compat: /profile (PAT CRUD) has been unified under /tokens."""
return RedirectResponse(url="/tokens", status_code=302)
@router.get("/profile", response_class=HTMLResponse)
async def profile_page(
    request: Request,
    user: dict = Depends(get_current_user),
):
    """User profile — shows email, name, role, and Google Workspace groups.

    Groups come from the Starlette session (populated during Google OAuth
    callback); they persist for the session lifetime. Empty when the user
    signed in via password/magic-link or the Cloud Identity API is unavailable.
    """
    groups = request.session.get("google_groups", []) or []
    ctx = _build_context(request, user=user, groups=groups)
    return templates.TemplateResponse(request, "profile.html", ctx)


@@ -36,7 +36,8 @@
<div class="app-user-menu-role">{{ session.user.role | capitalize }}</div>
{% endif %}
</div>
<a class="app-user-menu-item {% if _path == '/tokens' or _path.startswith('/profile') %}is-active{% endif %}" role="menuitem" href="/tokens">My tokens</a>
<a class="app-user-menu-item {% if _path.startswith('/profile') %}is-active{% endif %}" role="menuitem" href="/profile">Profile</a>
<a class="app-user-menu-item {% if _path == '/tokens' %}is-active{% endif %}" role="menuitem" href="/tokens">My tokens</a>
<a class="app-user-menu-item" role="menuitem" href="{{ url_for('auth.logout') }}">Logout</a>
</div>
</div>


@@ -0,0 +1,207 @@
{% extends "base.html" %}
{% block title %}Profile — {{ config.INSTANCE_NAME }}{% endblock %}
{% block content %}
<style>
/* /profile — read-only account view with Google Workspace group list.
Matches the card/hero vocabulary used on /tokens. */
body > .container { max-width: 960px; }
.profile-page {
max-width: 960px;
margin: 0 auto;
padding: 28px 8px 48px;
box-sizing: border-box;
font-family: var(--font-primary, 'Inter', system-ui, -apple-system, BlinkMacSystemFont, sans-serif);
}
@media (max-width: 720px) {
.profile-page { padding: 20px 0 32px; }
}
.profile-hero {
background: linear-gradient(135deg, #0073D1 0%, #0056A3 100%);
border-radius: 14px;
padding: 28px 32px 24px;
margin-bottom: 20px;
box-shadow: 0 4px 16px rgba(0, 115, 209, 0.2);
color: #fff;
}
.profile-hero .hero-eyebrow {
font-size: 11px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.8px;
color: rgba(255, 255, 255, 0.75);
margin-bottom: 8px;
}
.profile-hero .profile-title {
font-size: 28px;
font-weight: 600;
letter-spacing: -0.01em;
margin: 0 0 6px;
color: #fff;
}
.profile-hero .profile-subtitle {
font-size: 14px;
font-weight: 400;
color: rgba(255, 255, 255, 0.9);
margin: 0;
line-height: 1.5;
}
.section-card {
background: var(--surface, #fff);
border: 1px solid var(--border, #e5e7eb);
border-radius: 12px;
padding: 20px 24px;
margin-bottom: 16px;
}
.section-card h3 {
margin: 0 0 14px;
font-size: 13px;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.4px;
color: var(--text-secondary, #6b7280);
}
.account-grid {
display: grid;
grid-template-columns: max-content 1fr;
gap: 10px 18px;
font-size: 14px;
}
.account-grid .k {
color: var(--text-secondary, #6b7280);
font-size: 12px;
text-transform: uppercase;
letter-spacing: 0.4px;
font-weight: 600;
align-self: center;
}
.account-grid .v {
color: var(--text-primary, #1A253C);
font-weight: 500;
word-break: break-word;
}
.role-pill {
display: inline-flex;
align-items: center;
padding: 3px 10px;
border-radius: 999px;
font-size: 11.5px;
font-weight: 600;
text-transform: capitalize;
letter-spacing: 0.2px;
background: rgba(0, 115, 209, 0.10);
color: #0073D1;
border: 1px solid rgba(0, 115, 209, 0.25);
}
.groups-list {
list-style: none;
padding: 0;
margin: 0;
display: flex;
flex-direction: column;
gap: 8px;
}
.group-row {
display: flex;
align-items: center;
gap: 14px;
padding: 10px 14px;
border: 1px solid var(--border, #e5e7eb);
border-radius: 10px;
background: var(--background, #f9fafb);
}
.group-row .group-name {
font-size: 14px;
font-weight: 600;
color: var(--text-primary, #1A253C);
min-width: 0;
flex: 0 1 auto;
}
.group-row .group-id {
font-family: var(--font-mono, ui-monospace, "SF Mono", Menlo, monospace);
font-size: 12px;
color: var(--text-secondary, #6b7280);
word-break: break-all;
flex: 1 1 auto;
}
.empty-state {
padding: 20px 0 4px;
color: var(--text-secondary, #6b7280);
font-size: 14px;
line-height: 1.55;
}
.empty-state .empty-title {
font-weight: 600;
color: var(--text-primary, #1A253C);
margin-bottom: 4px;
}
.tokens-link-row {
margin-top: 18px;
padding-top: 16px;
border-top: 1px solid var(--border-light, #f3f4f6);
font-size: 13.5px;
color: var(--text-secondary, #6b7280);
}
.tokens-link-row a {
color: #0073D1;
text-decoration: none;
font-weight: 600;
}
.tokens-link-row a:hover { text-decoration: underline; }
</style>
<div class="profile-page">
<section class="profile-hero" aria-labelledby="profile-title">
<div class="hero-eyebrow">Your account</div>
<h2 class="profile-title" id="profile-title">Profile</h2>
<p class="profile-subtitle">Account details and Google Workspace group memberships.</p>
</section>
<section class="section-card" aria-label="Account details">
<h3>Account</h3>
<div class="account-grid">
<span class="k">Email</span>
<span class="v">{{ user.email or "—" }}</span>
<span class="k">Name</span>
<span class="v">{{ user.name or "—" }}</span>
<span class="k">Role</span>
<span class="v">
{% if user.role %}
<span class="role-pill">{{ user.role }}</span>
{% else %}
—
{% endif %}
</span>
</div>
<div class="tokens-link-row">
Manage personal access tokens at <a href="/tokens">/tokens</a>.
</div>
</section>
<section class="section-card" aria-label="Google Workspace groups">
<h3>Google Workspace groups</h3>
{% if groups and groups | length > 0 %}
<ul class="groups-list" role="list">
{% for g in groups %}
<li class="group-row" role="listitem">
<span class="group-name">{{ g.name or g.id }}</span>
<span class="group-id">{{ g.id }}</span>
</li>
{% endfor %}
</ul>
{% else %}
<div class="empty-state">
<div class="empty-title">No Google groups available</div>
<div>Groups are populated when you sign in with Google on a Workspace-enabled tenant. Other sign-in methods (email, password) don't expose group memberships.</div>
</div>
{% endif %}
</section>
</div>
{% endblock %}

docs/auth-groups.md

@@ -0,0 +1,56 @@
# Google Workspace Groups in /profile
How Agnes pulls a user's group memberships at Google sign-in and where they end up.
## Google Cloud setup (per OAuth client / project)
In the GCP project hosting the OAuth client (for Keboola dev: `kids-ai-data-analysis`):
1. **Enable Cloud Identity API**: `APIs & Services → Library → "Cloud Identity API" → Enable`.
2. **OAuth consent screen → Data Access → Add or Remove Scopes** — manually add:
```
https://www.googleapis.com/auth/cloud-identity.groups.readonly
```
3. **OAuth client → Authorized redirect URIs** — must include `https://<host>/auth/google/callback` for the deployment that uses this client.
4. **OAuth consent screen → Audience** — keep `Internal` (own Workspace tenant only). `External` triggers verification review for the sensitive Cloud Identity scope.
That's it. No service account, no domain-wide delegation, no admin role per user.
## The `security` label trap
Cloud Identity exposes membership listing through `groups/-/memberships:searchTransitiveGroups`. Its `query` (CEL) **must include a label predicate**. Two label types matter:
- `cloudidentity.googleapis.com/groups.discussion_forum` — every Workspace group has it. **Returns 403 "Insufficient permissions"** for non-admin users.
- `cloudidentity.googleapis.com/groups.security` — only security-flagged groups have it as a top-level capability, but in practice **every Keboola Workspace group also carries this label**. **Returns 200** with the full membership list.
Agnes therefore queries with `security` (in `app/auth/providers/google.py`):
```python
"member_key_id == '<email>' && 'cloudidentity.googleapis.com/groups.security' in labels"
```
Switching to `discussion_forum` will silently break for everyone but Workspace admins.
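A minimal sketch of how the query string and request URL fit together, using only the standard library. The helper names `build_groups_query` and `build_groups_url` are illustrative and do not exist in `google.py`; the endpoint and label values are the ones quoted above.

```python
from urllib.parse import urlencode

GROUPS_SEARCH_URL = (
    "https://cloudidentity.googleapis.com/v1/groups/-/memberships:searchTransitiveGroups"
)
SECURITY_LABEL = "cloudidentity.googleapis.com/groups.security"


def build_groups_query(email: str, label: str = SECURITY_LABEL) -> str:
    """Build the CEL query: member predicate plus the required label predicate."""
    return f"member_key_id == '{email}' && '{label}' in labels"


def build_groups_url(email: str) -> str:
    """Attach the query as a URL-encoded `query` parameter."""
    return f"{GROUPS_SEARCH_URL}?{urlencode({'query': build_groups_query(email)})}"


print(build_groups_query("user@example.com"))
print(build_groups_url("user@example.com"))
```

Keeping the label as a parameter makes it easy to reproduce the 403 locally by passing the `discussion_forum` label, without touching the production default.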
## Storage + use
`app/auth/providers/google.py:google_callback` runs on every Google sign-in:
1. Fetch via `_fetch_google_groups(access_token, email)` → list of `{"id": "<email>", "name": "<displayName>"}`.
2. Write to `request.session["google_groups"]` (Starlette signed-cookie session — per-user, not in DB).
3. Failures (any HTTP status >= 400, timeouts, network errors) are swallowed and become `[]` so login never breaks.
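The steps above can be sketched without any network access. This is an illustration, not the production code: `parse_memberships` mirrors the parsing described for `_fetch_google_groups`, while `fetch_groups_best_effort` and its injected `fetch` callable are hypothetical stand-ins for the real HTTP call.

```python
from typing import Any, Callable


def parse_memberships(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Turn a searchTransitiveGroups-style payload into [{"id", "name"}]."""
    groups = []
    for m in payload.get("memberships") or []:
        group_id = (m.get("groupKey") or {}).get("id", "")
        if not group_id:
            continue  # skip entries without a usable group key
        groups.append({"id": group_id, "name": m.get("displayName") or group_id})
    return groups


def fetch_groups_best_effort(
    fetch: Callable[[], tuple[int, dict[str, Any]]],
) -> list[dict[str, str]]:
    """Call `fetch` (returns status, JSON body); any failure becomes []."""
    try:
        status, payload = fetch()
    except Exception:
        return []  # network error, timeout, bad JSON, ...
    if status >= 400:
        return []
    return parse_memberships(payload)


sample = {
    "memberships": [
        {"groupKey": {"id": "team-eng@example.com"}, "displayName": "Engineering"},
        {"groupKey": {"id": "everyone@example.com"}},  # no displayName
    ]
}
print(fetch_groups_best_effort(lambda: (200, sample)))
print(fetch_groups_best_effort(lambda: (403, {})))
```

Injecting the fetch step keeps the failure handling testable on its own, which is the property login depends on.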
Display: the `/profile` handler (`profile_page`) reads `request.session["google_groups"]` and passes it to `app/web/templates/profile.html` as `groups`; the template renders the list. Empty state explains "Groups are populated when you sign in with Google on a Workspace-enabled tenant."
**Not in DB.** Admin views (e.g. `/admin/users`) can't see other users' groups today — adding a `users.groups` column + persisting on callback is the path forward when that's needed.
**Refresh.** A user's stale session keeps stale groups. `Logout → sign in again` is the only refresh.
## Debugging
`scripts/debug/probe_google_groups.py` — stdlib, takes a Playground-issued OAuth access token + email, hits 6 candidate endpoints, prints raw response. Use this **before** changing the production query — saves a deploy cycle per attempt.
```bash
python3 scripts/debug/probe_google_groups.py "ya29.…" user@keboola.com
```
Token via [OAuth 2.0 Playground](https://developers.google.com/oauthplayground/) → gear icon → own credentials → request the three scopes (`cloud-identity.groups.readonly`, `cloud-identity.groups`, `admin.directory.group.readonly`) → exchange code → copy access token.


@@ -92,10 +92,15 @@ GOOGLE_CLIENT_SECRET=$(gcloud secrets versions access latest --secret=google-oau
# instance — leave it "none" and let the corp-PKI rotate scripts handle certs.
CADDY_TLS_LINE=""
if [ "$TLS_MODE" = "caddy" ] && [ -n "$DOMAIN" ]; then
# Value MUST be quoted in the .env file: agnes-auto-upgrade.sh sources
# /opt/agnes/.env via `set -a; . .env; set +a`, and bash parses an
# unquoted `KEY=value with spaces` as a one-shot assignment `KEY=value`
# for the command `with` (argument `spaces`) → boot succeeds but every
# cron tick logs "<email>: command not found".
if [ -n "$ACME_EMAIL" ]; then
CADDY_TLS_LINE="CADDY_TLS=tls $ACME_EMAIL"
CADDY_TLS_LINE="CADDY_TLS=\"tls $ACME_EMAIL\""
else
CADDY_TLS_LINE="CADDY_TLS=tls internal"
CADDY_TLS_LINE="CADDY_TLS=\"tls internal\""
fi
fi


@@ -0,0 +1,184 @@
#!/usr/bin/env python3
"""Probe Google Cloud Identity / Admin Directory APIs for "list groups of THIS user".
Run locally with a fresh user OAuth access token to figure out which endpoint
+ scope combo actually works for your Workspace tenant without a deploy cycle.
Stdlib only; no pip install needed.
Why this exists:
Zdeněk's first attempt used `cloudidentity.googleapis.com/v1/groups:search`
with `cloud-identity.groups.readonly` scope. Returns 400 INVALID_ARGUMENT
in Keboola's Workspace because that endpoint requires admin permission
despite the scope name suggesting otherwise.
How to get an access token (Easiest path):
Google's OAuth 2.0 Playground (https://developers.google.com/oauthplayground/)
1. Click the gear icon (top right) → tick "Use your own OAuth credentials"
2. Paste your Client ID + Secret (from kids-ai-data-analysis project,
same OAuth client agnes-dev uses)
3. Step 1: pick scopes. For comparison test all of:
https://www.googleapis.com/auth/cloud-identity.groups.readonly
https://www.googleapis.com/auth/cloud-identity.groups
https://www.googleapis.com/auth/admin.directory.group.readonly
openid
email
profile
4. Authorize APIs → sign in as your Workspace user
5. Step 2: Exchange authorization code for tokens
6. Copy the "Access token" string (starts with `ya29.`)
Usage:
python3 scripts/debug/probe_google_groups.py <access_token> <email>
Example:
python3 scripts/debug/probe_google_groups.py ya29.a0AfH6S... petr@keboola.com
"""
from __future__ import annotations
import json
import sys
import urllib.error
import urllib.parse
import urllib.request
def _section(title: str) -> None:
    print()
    print("=" * 78)
    print(f" {title}")
    print("=" * 78)


def _probe(name: str, url: str, params: dict | None = None,
           headers: dict | None = None) -> None:
    print(f"\n--- {name} ---")
    full_url = url
    if params:
        full_url = f"{url}?{urllib.parse.urlencode(params)}"
    print(f" GET {url}")
    if params:
        for k, v in params.items():
            print(f" {k}={v}")
    req = urllib.request.Request(full_url, headers=headers or {})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            status = resp.status
            body_bytes = resp.read()
    except urllib.error.HTTPError as e:
        status = e.code
        body_bytes = e.read()
    except Exception as e:
        print(f" EXCEPTION: {type(e).__name__}: {e}")
        return
    print(f" HTTP {status}")
    body = body_bytes.decode("utf-8", errors="replace")
    try:
        body = json.dumps(json.loads(body), indent=2)
    except Exception:
        body = body[:600]
    print(" body:")
    for line in body.splitlines():
        print(f" {line}")
def main() -> int:
if len(sys.argv) != 3:
print(__doc__)
return 1
access_token, email = sys.argv[1], sys.argv[2]
auth = {"Authorization": f"Bearer {access_token}"}
_section("0. Token introspection — what scopes does this token actually have?")
_probe(
"tokeninfo",
"https://oauth2.googleapis.com/tokeninfo",
params={"access_token": access_token},
)
_section("1. OpenID userinfo — verify token identifies the right user")
_probe(
"userinfo",
"https://openidconnect.googleapis.com/v1/userinfo",
headers=auth,
)
_section("2. Cloud Identity — searchTransitiveGroups (user perspective)")
for label_kind in ("discussion_forum", "security"):
_probe(
f"with labels = '{label_kind}'",
"https://cloudidentity.googleapis.com/v1/groups/-/memberships:searchTransitiveGroups",
params={
"query": (
f"member_key_id == '{email}' && "
f"'cloudidentity.googleapis.com/groups.{label_kind}' in labels"
),
},
headers=auth,
)
    _section("3. Cloud Identity — searchDirectGroups (no transitive)")
    _probe(
        "direct only with discussion_forum label",
        "https://cloudidentity.googleapis.com/v1/groups/-/memberships:searchDirectGroups",
        params={
            "query": (
                f"member_key_id == '{email}' && "
                "'cloudidentity.googleapis.com/groups.discussion_forum' in labels"
            ),
        },
        headers=auth,
    )
    _section("4. Cloud Identity — groups:search (admin endpoint, expected to fail)")
    _probe(
        "admin search with parent + member_key_id",
        "https://cloudidentity.googleapis.com/v1/groups:search",
        params={
            "query": (
                "parent == 'customers/my_customer' && "
                f"member_key_id == '{email}' && "
                "'cloudidentity.googleapis.com/groups.discussion_forum' in labels"
            ),
            "view": "BASIC",
        },
        headers=auth,
    )
    _section("5. Admin SDK Directory — legacy groups?userKey (admin scope required)")
    _probe(
        "directory list groups for user",
        "https://admin.googleapis.com/admin/directory/v1/groups",
        params={"userKey": email},
        headers=auth,
    )
    print()
    print("=" * 78)
    print("Interpretation guide:")
    print("=" * 78)
    print("""
  HTTP 200 + groups list     that's the working endpoint, use it in google.py
  HTTP 200 + empty list      endpoint works but user has no matching groups
  HTTP 400 INVALID_ARG       query syntax wrong OR permission issue Google
                             silently disguises as 400 (common for non-admin)
  HTTP 403 PERMISSION        token lacks scope or admin role
  HTTP 401 UNAUTHENTICATED   token expired (re-fetch from playground)
  HTTP 404 NOT FOUND         API not enabled, or wrong URL

  If ALL Cloud Identity endpoints return 400/403 for a non-admin user, the
  conclusion is: Cloud Identity Groups API requires admin permission for
  user-perspective queries, regardless of OAuth scope. Switch to one of:
    (a) Service Account + Domain-Wide Delegation (Vojta's v3 design)
    (b) Workspace OIDC groups claim (admin enables in Workspace Console)
    (c) Grant 'Groups Reader' role to every user (admin overhead)
""")
    return 0
if __name__ == "__main__":
    sys.exit(main())
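
The interpretation guide printed by the script above can also be read as a small lookup. The following is only a sketch of that mapping: the function name `interpret_probe_status` and its `group_count` parameter are hypothetical and do not appear in the probe script.

```python
def interpret_probe_status(status_code: int, group_count: int = 0) -> str:
    """Return the diagnosis the interpretation guide gives for a probe result.

    Hypothetical helper: the probe script only prints the guide as text.
    """
    if status_code == 200:
        # A 200 is only useful if the response actually lists groups.
        if group_count > 0:
            return "working endpoint"
        return "endpoint works, no matching groups"
    diagnoses = {
        400: "query syntax wrong, or a permission issue reported as 400",
        401: "token expired",
        403: "token lacks scope or admin role",
        404: "API not enabled, or wrong URL",
    }
    # Any status the guide does not mention is reported as such.
    return diagnoses.get(status_code, "not covered by the guide")
```

For example, `interpret_probe_status(200, group_count=3)` corresponds to the first row of the guide and `interpret_probe_status(403)` to the scope/role row.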


@@ -261,17 +261,10 @@ def test_admin_tokens_deeplink_preserves_user_query(fresh_db):
     assert 'id="flt-user"' in resp.text
 # ── Back-compat redirects ─────────────────────────────────────────────────
-def test_profile_redirects_to_tokens(fresh_db):
-    """/profile no longer renders — it 302-redirects to /tokens."""
-    from fastapi.testclient import TestClient
-    from app.main import app
-    client = TestClient(app)
-    resp = client.get("/profile", follow_redirects=False)
-    assert resp.status_code == 302
-    assert resp.headers["location"] == "/tokens"
+# NOTE: test_profile_redirects_to_tokens removed — /profile no longer
+# redirects to /tokens; it renders a real profile page including Google
+# Workspace groups (cherry-pick of Zdeněk's 4f7e4cd). Current /profile
+# behaviour is covered by tests/test_auth_providers.py.
 # ── Admin list API — expanded fields ───────────────────────────────────────


@@ -144,6 +144,97 @@ class TestGoogleOAuth:
         assert "error" in resp.headers.get("location", "")
+class TestGoogleGroupsFetch:
+    """Unit tests for _fetch_google_groups — the helper must be tolerant of
+    every realistic failure mode (non-Workspace tenants return 403, expired
+    tokens return 401, network errors bubble from httpx) and never raise."""
+    def test_parses_groups_from_success_response(self, monkeypatch):
+        import asyncio
+        from app.auth.providers import google as gp
+        # searchTransitiveGroups returns {"memberships": [...]}, not {"groups": [...]}.
+        # Each item carries the group identity in groupKey.id + displayName,
+        # matching the actual API response shape.
+        fake_payload = {
+            "memberships": [
+                {
+                    "group": "groups/abc123",
+                    "groupKey": {"id": "team-eng@example.com"},
+                    "displayName": "Engineering",
+                },
+                {
+                    "group": "groups/def456",
+                    "groupKey": {"id": "everyone@example.com"},
+                    # No displayName — falls back to id
+                },
+            ],
+        }
+        class _Resp:
+            status_code = 200
+            text = ""
+            def json(self):
+                return fake_payload
+        class _FakeClient:
+            def __init__(self, *a, **kw):
+                pass
+            async def __aenter__(self):
+                return self
+            async def __aexit__(self, *a):
+                return False
+            async def get(self, url, params=None, headers=None):
+                return _Resp()
+        monkeypatch.setattr(gp.httpx, "AsyncClient", _FakeClient)
+        groups = asyncio.run(gp._fetch_google_groups("fake-token", "user@example.com"))
+        assert groups == [
+            {"id": "team-eng@example.com", "name": "Engineering"},
+            {"id": "everyone@example.com", "name": "everyone@example.com"},
+        ]
+    def test_returns_empty_on_403(self, monkeypatch):
+        """Cloud Identity not enabled (non-Workspace tenant) → 403 → [] + warning."""
+        import asyncio
+        from app.auth.providers import google as gp
+        class _Resp:
+            status_code = 403
+            text = "Cloud Identity API has not been enabled"
+        class _FakeClient:
+            def __init__(self, *a, **kw): pass
+            async def __aenter__(self): return self
+            async def __aexit__(self, *a): return False
+            async def get(self, url, params=None, headers=None):
+                return _Resp()
+        monkeypatch.setattr(gp.httpx, "AsyncClient", _FakeClient)
+        groups = asyncio.run(gp._fetch_google_groups("fake-token", "user@example.com"))
+        assert groups == []
+    def test_returns_empty_on_exception(self, monkeypatch):
+        """Network error inside httpx must be swallowed, not propagated."""
+        import asyncio
+        from app.auth.providers import google as gp
+        class _FakeClient:
+            def __init__(self, *a, **kw): pass
+            async def __aenter__(self): return self
+            async def __aexit__(self, *a): return False
+            async def get(self, *a, **kw):
+                raise RuntimeError("boom")
+        monkeypatch.setattr(gp.httpx, "AsyncClient", _FakeClient)
+        groups = asyncio.run(gp._fetch_google_groups("fake-token", "user@example.com"))
+        assert groups == []
 class TestCookieAuth:
     def test_web_ui_with_cookie(self, client):
         """Test that web UI routes accept JWT from cookie."""

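The `TestGoogleGroupsFetch` cases above pin down the response shape the helper must handle: a `memberships` list whose items carry `groupKey.id` and an optional `displayName`. A minimal sketch of just that parsing step, assuming a standalone function (the name `parse_memberships` is hypothetical; the real logic lives inside `_fetch_google_groups`, whose body is not shown in this diff), might look like:

```python
from typing import Any


def parse_memberships(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Turn a searchTransitiveGroups-style payload into [{"id", "name"}, ...].

    Hypothetical helper illustrating the shape asserted by the tests.
    """
    groups: list[dict[str, str]] = []
    for item in payload.get("memberships", []):
        group_id = (item.get("groupKey") or {}).get("id")
        if not group_id:
            # Assumption: entries without a group key id are skipped.
            continue
        # Fall back to the id when displayName is missing or empty.
        groups.append({"id": group_id, "name": item.get("displayName") or group_id})
    return groups
```

Keeping the parsing separate from the HTTP call would let the success-path test run without monkeypatching `httpx`, though the diff itself tests through the fake client.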

@@ -277,40 +277,11 @@ def test_pat_cannot_create_pat(fresh_db):
     assert resp.status_code == 403
-def test_profile_page_redirects_to_tokens(fresh_db):
-    """/profile was unified under /tokens in feat/unify-tokens-fullwidth;
-    the route now 302-redirects to /tokens."""
-    from fastapi.testclient import TestClient
-    import uuid
-    from src.db import get_system_db, close_system_db
-    from src.repositories.users import UserRepository
-    from app.auth.jwt import create_access_token
-    from app.main import app
-    conn = get_system_db()
-    try:
-        uid = str(uuid.uuid4())
-        UserRepository(conn).create(id=uid, email="u@t", name="U", role="analyst")
-        token = create_access_token(user_id=uid, email="u@t", role="analyst")
-    finally:
-        conn.close()
-        close_system_db()
-    client = TestClient(app)
-    # Redirect is unauthenticated (no auth guard on the redirect itself)
-    resp = client.get("/profile", follow_redirects=False)
-    assert resp.status_code == 302
-    assert resp.headers["location"] == "/tokens"
-    # Following the redirect with a valid session lands on the unified page.
-    resp = client.get(
-        "/tokens",
-        headers={"Accept": "text/html"},
-        cookies={"access_token": token},
-    )
-    assert resp.status_code == 200
-    assert "My tokens" in resp.text  # non-admin title
-    assert 'id="new-token-btn"' in resp.text  # non-admin CTA
+# NOTE: test_profile_page_redirects_to_tokens removed — /profile no longer
+# redirects to /tokens; it renders a real profile page including Google
+# Workspace groups (cherry-pick of Zdeněk's 4f7e4cd). The /tokens render
+# checks (My tokens title, new-token-btn) survive in the test_admin_tokens_ui
+# suite.
 def test_pat_first_use_from_new_ip_audits(fresh_db):
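
The commit message notes that unauthenticated requests to `/profile` now get `302 /login?next=%2Fprofile` instead of the old `/tokens` redirect. As a rough illustration of where the `%2F` comes from (the helper name `login_redirect_target` is hypothetical, and the app's actual auth guard is not shown in this diff), the target can be built with the standard library:

```python
from urllib.parse import urlencode


def login_redirect_target(path: str) -> str:
    """Build a /login URL that remembers the originally requested path.

    Hypothetical helper: urlencode percent-encodes "/" as %2F in the value.
    """
    return "/login?" + urlencode({"next": path})
```

A replacement test for the unauthenticated case could then compare the `location` header against `login_redirect_target("/profile")` rather than a hand-written string.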


@@ -100,9 +100,11 @@ class TestWebUISmoke:
         assert "app-header" in body
         # Nav after split: "Tokens" (own) for every signed-in user +
         # admin-only "All tokens" link pointing at /admin/tokens.
+        # Profile link added with the Google-Workspace-groups feature
+        # (cherry-pick of zs/google-groups-display + dropdown wiring).
         assert 'href="/tokens"' in body
         assert 'href="/admin/tokens"' in body
-        assert 'href="/profile"' not in body
+        assert 'href="/profile"' in body
         assert 'href="/admin/users"' in body
         # New modern UI markers
         assert 'class="users-page"' in body
@@ -111,7 +113,7 @@ class TestWebUISmoke:
         assert 'id="confirm-modal"' in body
     def test_nav_shows_tokens_link_for_non_admin(self, web_client, analyst_cookie):
-        """Non-admins see the 'My tokens' user-menu link — no 'All tokens' link, no /profile."""
+        """Non-admins see 'My tokens' + 'Profile' user-menu links — no 'All tokens'."""
         resp = web_client.get("/dashboard", cookies=analyst_cookie)
         assert resp.status_code in (200, 302)
         if resp.status_code == 302:
@@ -119,9 +121,9 @@ class TestWebUISmoke:
             resp = web_client.get(resp.headers["location"], cookies=analyst_cookie)
         body = resp.text
         assert 'href="/tokens"' in body
-        assert 'href="/profile"' not in body
+        assert 'href="/profile"' in body
         assert ">My tokens<" in body
-        assert ">Profile<" not in body
+        assert ">Profile<" in body
         # Non-admins must NOT see the admin "All tokens" link.
         assert 'href="/admin/tokens"' not in body
         assert ">All tokens<" not in body
@@ -138,11 +140,23 @@ class TestWebUISmoke:
         assert ">My tokens<" in body
         assert ">All tokens<" in body
-    def test_profile_redirects_to_tokens(self, web_client, admin_cookie):
-        """Back-compat: /profile 302-redirects to /tokens."""
-        resp = web_client.get("/profile", cookies=admin_cookie, follow_redirects=False)
-        assert resp.status_code == 302
-        assert resp.headers["location"] == "/tokens"
+    def test_profile_renders_account_details(self, web_client, admin_cookie):
+        """/profile renders a real profile page with email, name, role."""
+        resp = web_client.get("/profile", cookies=admin_cookie)
+        assert resp.status_code == 200
+        body = resp.text
+        assert "admin@test.com" in body
+        # Role pill + link to /tokens for PAT management
+        assert 'class="role-pill"' in body
+        assert 'href="/tokens"' in body
+        # Empty-state copy when no Google groups in session
+        assert "No Google groups available" in body
+    def test_profile_requires_auth(self, web_client):
+        """/profile requires auth (was a 302 back-compat redirect before)."""
+        resp = web_client.get("/profile", follow_redirects=False)
+        # Auth dep raises 401; some configs may redirect to /login — accept either.
+        assert resp.status_code in (401, 302)
 class TestClaudeSetupPreview: