feat(auth): Google Workspace group prefix filter + system mapping (#131)

Three new env vars wire the Google OAuth callback to a configurable Workspace prefix and route admin/everyone Workspace groups onto the seeded system rows: AGNES_GOOGLE_GROUP_PREFIX, AGNES_GROUP_ADMIN_EMAIL, AGNES_GROUP_EVERYONE_EMAIL. Login gate redirects users with no prefix-matching group to /login?error=not_in_allowed_group. BREAKING: auto-Everyone membership for new users removed. Admin UI/API are read-only on Google-managed groups. See docs/auth-groups.md.
minasarustamyan 2026-04-29 14:08:04 +02:00 committed by GitHub
parent 82c5d71d63
commit c940593a90
GPG key ID: B5690EEEBB952194
19 changed files with 1285 additions and 293 deletions


@ -23,13 +23,28 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
- `POST /api/admin/register-table/precheck` — validation-only sibling of register-table. Returns `{"ok": true, "table": {rows, size_bytes, columns, …}}` for BQ rows after round-tripping `get_table`; surfaces NotFound → 404, Forbidden → 403, anything else → 400 with the GCP error verbatim. Also runs Pydantic validation for non-BQ source types so the CLI / UI gets a single endpoint shape.
- `--dry-run` flag on `da admin register-table` — calls `/precheck` and pretty-prints rows / bytes / columns; exits 0 on `ok`, 1 on validation or source-side error.
- Audit-log entries on every `register_table` / `update_table` / `unregister_table` mutation — closes the asymmetry where instance-config saves audited but registry mutations didn't (Decision 4 in #108). Secret-named fields in the request payload are masked as `***`; `description` is logged raw.
- **Google Workspace group prefix filter + system-group mapping.** Three new env vars wire the OAuth callback's group sync to a configurable Workspace prefix and route the admin/everyone Workspace groups into the seeded system rows.
- `AGNES_GOOGLE_GROUP_PREFIX` — when set (e.g. `grp_acme_`), only Workspace groups whose email local part starts with the prefix are mirrored into `user_group_members`. Empty = legacy behavior (mirror every fetched group).
- `AGNES_GROUP_ADMIN_EMAIL` — Workspace group email that maps onto the seeded `Admin` system row instead of creating a fresh `user_groups` entry. Members of that Workspace group land in `Admin` directly.
- `AGNES_GROUP_EVERYONE_EMAIL` — same mechanism for `Everyone`.
- **Login gate.** When `AGNES_GOOGLE_GROUP_PREFIX` is set and the user's Workspace fetch returned a non-empty list with zero prefix matches, the callback redirects to `/login?error=not_in_allowed_group`, which the login page renders as a friendly inline banner. An empty fetch result (for example a transient group-fetch failure) preserves the cached membership and lets the login proceed: only that empty-result path is fail-soft, and an explicit no-match still blocks. A new error code, `group_check_unavailable`, is wired through the login banner for future use.
- **Admin UI subtitle for synced groups.** The `/admin/groups` table and the `/admin/groups/{id}` detail page render a derived display name (prefix stripped, `@domain` removed, capitalized) above a small monospace subtitle showing the full Workspace email. Edit / Delete affordances are hidden on Google-managed rows, and a "managed by Google Workspace — read-only here" banner appears on the detail page.
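
The prefix filter and login gate described in the entries above can be read as a three-way decision. The following is a minimal sketch of that decision only, not the callback itself; the function name `filter_and_gate` and the decision labels `"preserve"`, `"deny"`, and `"sync"` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


def filter_and_gate(fetched: List[str], prefix: str) -> Tuple[str, List[str]]:
    """Return (decision, relevant_groups) for one login.

    Hypothetical helper; the decision labels are illustrative only.
    """
    prefix = prefix.strip().lower()
    groups = [g.lower() for g in fetched]

    # Empty fetch: treated as "no data", so the cached membership is kept
    # and the login proceeds.
    if not groups:
        return ("preserve", [])

    # Empty prefix means legacy behavior: mirror every fetched group.
    relevant = [g for g in groups if g.startswith(prefix)] if prefix else groups

    # Prefix set, real answer received, nothing matched: block the login.
    if prefix and not relevant:
        return ("deny", [])

    return ("sync", relevant)
```

Under these assumptions a caller would redirect to `/login?error=not_in_allowed_group` on `"deny"` and replace the user's `google_sync` memberships on `"sync"`.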
### Changed
- **BREAKING** Auto-`Everyone` membership for new users was removed. `UserRepository.create` no longer writes a `user_group_members` row, and `app.auth.access._user_group_ids` no longer adds a virtual `Everyone` id to the result. Every membership now traces to a real source row (`admin`, `google_sync`, or an explicit `system_seed`). If you relied on the implicit-Everyone behavior for plugin visibility, grant the plugin to a real group (e.g. an `everyone@example.com` Workspace group mapped via `AGNES_GROUP_EVERYONE_EMAIL`).
- **Admin UI / API are read-only on Google-managed groups.** `created_by='system:google-sync'` rows, plus the seeded `Admin` / `Everyone` rows when the matching email-mapping env var is set, return `409` with body `{"detail": {"code": "google_managed_readonly", ...}}` from `PATCH /api/admin/groups/{id}`, `DELETE /api/admin/groups/{id}`, `POST /api/admin/groups/{id}/members`, `DELETE /api/admin/groups/{id}/members/{user_id}`, `POST /api/admin/users/{id}/memberships`, `DELETE /api/admin/users/{id}/memberships/{group_id}`. Edit through admin.google.com, then sign in again to refresh.
- **Audit action names for corporate-memory operations renamed** from `km_<action>` to `corporate_memory.<action>` to match the 0.15.0 CHANGELOG documentation. The audit-tab filter accepts both prefixes for back-compat with rows already in the audit log (no historical-row rewrite). Issue #62.
- **`onDomainChange()` UX bug fixed** on `/corporate-memory`: domain and category filters now compose instead of resetting each other when either changes. Issue #62.
- `POST /api/memory/admin/edit` continues to accept title/content as before; the new `PATCH /api/memory/admin/{id}` is the recommended path for everything else (including title/content). The legacy endpoint is kept one release for back-compat.
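
For API clients affected by the read-only rule on Google-managed groups, the `409` body shape quoted above can be told apart from other `409` responses (such as "Cannot delete a system group", whose `detail` is a plain string). This is a minimal client-side sketch; `readonly_reason` is a hypothetical helper, not part of the project.

```python
import json
from typing import Optional


def readonly_reason(status_code: int, body: str) -> Optional[str]:
    """Return the server message if the response is a google_managed_readonly
    rejection, else None. Hypothetical helper for illustration."""
    if status_code != 409:
        return None
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        return None
    # Other 409s carry a plain-string detail; only the structured form matches.
    if isinstance(detail, dict) and detail.get("code") == "google_managed_readonly":
        return detail.get("message", "")
    return None
```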
### Internal
- New env vars surfaced into `ConfigProxy` so templates can derive the friendly display name client-side.
- New `is_google_managed: bool` field on `GroupResponse` (the API surface for the admin UI's group list/detail).
- New `UserGroupMembersRepository.has_any_google_sync_membership` helper (currently diagnostic; kept for a future tightening of the gate).
- New tests in `tests/test_google_group_prefix_sync.py`; `tests/test_repositories.py::TestUserRepositoryEveryoneAutoMember` renamed to `TestUserRepositoryNoAutoMembership` with inverted assertion; two `tests/test_marketplace_filter.py` tests adapted to the no-implicit-Everyone semantics. See `docs/auth-groups.md` for the full reference.
### Fixed
- `PATCH /api/memory/admin/{id}` now switches from `model_dump(exclude_none=True)` to `exclude_unset=True`, so an explicit `null` in the request body clears the field (e.g. `{"audience": null}` resets a previously-set audience to NULL). Pre-fix nulls were silently dropped, leaving no path to clear `audience` and only the empty-string short-circuit for `domain`. The endpoint now distinguishes "field absent from body" (untouched) from "field explicitly set to null" (cleared). Both `PATCH /api/memory/admin/{id}` and `POST /api/memory/admin/bulk-update` now reject an explicit `null` for `title` (NOT NULL in the schema) at the boundary with HTTP 400 instead of bubbling up as a 500 (PATCH) or per-item Constraint Error (bulk). Issue #62 / PR #126 review.
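
The difference between "field absent" and "field explicitly null" in the fix above can be illustrated without Pydantic. The sketch below uses plain dicts as a stand-in for the parsed request body; `old_style_update`, `new_style_update`, and the `_NOT_NULL` set are hypothetical, and the real endpoint answers with HTTP 400 where this sketch raises `ValueError`.

```python
from typing import Any, Dict

# Assumption for illustration: only `title` is NOT NULL, as the entry states.
_NOT_NULL = {"title"}


def old_style_update(body: Dict[str, Any]) -> Dict[str, Any]:
    """Mimics exclude_none=True: explicit nulls are silently dropped."""
    return {k: v for k, v in body.items() if v is not None}


def new_style_update(body: Dict[str, Any]) -> Dict[str, Any]:
    """Mimics exclude_unset=True: keys present in the body are kept, even
    when null, so a null clears the column. Absent keys stay untouched."""
    for field, value in body.items():
        if value is None and field in _NOT_NULL:
            raise ValueError(f"{field} cannot be null")
    return dict(body)
```

With these stand-ins, `{"audience": None}` produces no update in the old style and an explicit `audience = NULL` update in the new style.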


@ -15,6 +15,7 @@ for every mutation so an admin's group/grant changes are traceable.
from __future__ import annotations
import logging
import os
from datetime import datetime
from typing import Any, List, Optional
@ -66,6 +67,57 @@ def _audit(
logger.warning("audit log failed for %s/%s", action, resource)
def _is_google_managed(g: dict) -> bool:
"""Whether a group row is owned by Google sync — admin UI/API treat such
rows as read-only.
Two ways a group can be Google-managed:
1. ``created_by='system:google-sync'``: auto-created by the OAuth
callback when the user belonged to a prefix-matching Workspace
group; ``name`` is the full Workspace email.
2. ``is_system=TRUE`` AND the row is the seeded Admin/Everyone group
whose email-mapping env var is set: the OAuth callback routes
memberships from those Workspace groups into the seeded system
row instead of creating a separate user_groups row, so the system
row effectively *becomes* a Google-synced row in this deployment.
Without the env mapping, system groups stay regular admin-managed
rows (renaming Admin is still blocked separately by
``UserGroupsRepository`` for code-reference safety).
"""
if (g.get("created_by") or "") == "system:google-sync":
return True
if g.get("is_system"):
from src.db import SYSTEM_ADMIN_GROUP, SYSTEM_EVERYONE_GROUP
admin_email = os.environ.get(
"AGNES_GROUP_ADMIN_EMAIL", ""
).strip().lower()
everyone_email = os.environ.get(
"AGNES_GROUP_EVERYONE_EMAIL", ""
).strip().lower()
if admin_email and g.get("name") == SYSTEM_ADMIN_GROUP:
return True
if everyone_email and g.get("name") == SYSTEM_EVERYONE_GROUP:
return True
return False
def _guard_google_managed(g: dict) -> None:
"""Raise 409 google_managed_readonly when the group is Google-managed."""
if _is_google_managed(g):
raise HTTPException(
status_code=409,
detail={
"code": "google_managed_readonly",
"message": (
"This group is managed by Google Workspace and is "
"read-only here. Add or remove members via "
"admin.google.com, or sign in again to refresh."
),
},
)
def _validate_resource_type(value: str) -> ResourceType:
try:
return ResourceType(value)
@ -214,6 +266,9 @@ class GroupResponse(BaseModel):
created_by: Optional[str] = None
member_count: int = 0
grant_count: int = 0
# True iff the row is owned by Google sync — admin UI hides edit/delete
# affordances and the API rejects mutations with 409 google_managed_readonly.
is_google_managed: bool = False
class CreateGroupRequest(BaseModel):
@ -262,6 +317,7 @@ def _group_to_response(
created_by=g.get("created_by"),
member_count=members_repo.count_members(g["id"]),
grant_count=grants_repo.count_for_group(g["id"]),
is_google_managed=_is_google_managed(g),
)
@ -328,6 +384,7 @@ async def update_group(
g = repo.get(group_id)
if not g:
raise HTTPException(status_code=404, detail="Group not found")
_guard_google_managed(g)
if g.get("is_system") and payload.name is not None and payload.name.strip() != g["name"]:
# System groups: block renames (the canonical names "Admin" /
# "Everyone" are referenced from app.auth.access and the
@ -368,6 +425,7 @@ async def delete_group(
g = repo.get(group_id)
if not g:
raise HTTPException(status_code=404, detail="Group not found")
_guard_google_managed(g)
if g.get("is_system"):
raise HTTPException(status_code=409, detail="Cannot delete a system group")
# Cascade members + grants atomically with the group row so a partial
@ -445,8 +503,10 @@ async def add_member(
user: dict = Depends(require_admin),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
-    if not UserGroupsRepository(conn).get(group_id):
+    g = UserGroupsRepository(conn).get(group_id)
+    if not g:
         raise HTTPException(status_code=404, detail="Group not found")
+    _guard_google_managed(g)
target = UserRepository(conn).get_by_email(payload.email)
if not target:
raise HTTPException(status_code=404, detail=f"User {payload.email!r} not found")
@ -488,6 +548,7 @@ async def remove_member(
group = UserGroupsRepository(conn).get(group_id)
if not group:
raise HTTPException(status_code=404, detail="Group not found")
_guard_google_managed(group)
if group["name"] == "Admin" and user_id == user["id"]:
if UserRepository(conn).count_admins(active_only=True) <= 1:
raise HTTPException(
@ -711,6 +772,7 @@ async def add_user_to_group(
group = UserGroupsRepository(conn).get(payload.group_id)
if not group:
raise HTTPException(status_code=404, detail="Group not found")
_guard_google_managed(group)
members = UserGroupMembersRepository(conn)
if members.has_membership(user_id, payload.group_id):
raise HTTPException(status_code=409, detail="Already a member")
@ -754,6 +816,7 @@ async def remove_user_from_group(
group = UserGroupsRepository(conn).get(group_id)
if not group:
raise HTTPException(status_code=404, detail="Group not found")
_guard_google_managed(group)
if group["name"] == "Admin" and user_id == user["id"]:
if UserRepository(conn).count_admins(active_only=True) <= 1:
raise HTTPException(


@ -179,8 +179,8 @@ async def create_user(
import secrets
user_id = str(uuid.uuid4())
repo.create(id=user_id, email=payload.email, name=payload.name, role=payload.role)
# If the requested role is admin, add to Admin group. Anything else is just
# a member of Everyone (added implicitly by repo.create).
# If the requested role is admin, add to Admin group. Non-admin users start
# with no group memberships — admin-managed grants must be explicit.
if (payload.role or "").lower() == "admin":
_set_admin_membership(user_id, True, user.get("email"), conn)
_audit(conn, user["id"], "user.create", user_id, {"email": payload.email, "role": payload.role})


@ -36,7 +36,7 @@ from fastapi import Depends, HTTPException, Request, status
from app.auth.dependencies import _get_db, get_current_user
from app.resource_types import ResourceType
-from src.db import SYSTEM_ADMIN_GROUP, SYSTEM_EVERYONE_GROUP
+from src.db import SYSTEM_ADMIN_GROUP
logger = logging.getLogger(__name__)
@ -52,24 +52,21 @@ def _get_group_id_by_name(name: str, conn: duckdb.DuckDBPyConnection) -> Optiona
def _user_group_ids(user_id: str, conn: duckdb.DuckDBPyConnection) -> set[str]:
"""Set of group_ids the user is in. Always includes Everyone.
"""Set of group_ids the user is in.
Membership rows live in ``user_group_members``; Everyone is added
unconditionally so callers don't have to special-case it. If the
Everyone group is missing (impossible in healthy installs but seen in
fresh-test scenarios), the helper logs once and proceeds with the
explicit memberships.
Returns only the rows present in ``user_group_members``. The implicit
"every user is in Everyone" virtual row was removed when Google-prefix
mapping landed; every membership is now sourced from a concrete row
(``admin``, ``google_sync``, or ``system_seed``) so an operator
auditing /admin/access sees the same set the authorization layer
enforces. Callers that want Everyone-style "always granted" plugins
must grant them to a real group the user is a member of.
"""
rows = conn.execute(
"SELECT group_id FROM user_group_members WHERE user_id = ?",
[user_id],
).fetchall()
group_ids: set[str] = {r[0] for r in rows}
everyone_id = _get_group_id_by_name(SYSTEM_EVERYONE_GROUP, conn)
if everyone_id is not None:
group_ids.add(everyone_id)
return group_ids
return {r[0] for r in rows}
def is_user_admin(user_id: str, conn: duckdb.DuckDBPyConnection) -> bool:
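
The effect of dropping the implicit Everyone id can be sketched against an in-memory database. This example swaps in the standard-library `sqlite3` module for DuckDB so it runs without extra dependencies; the three-column table layout and the sample ids are assumptions for illustration, not the project's schema.

```python
import sqlite3
from typing import Set


def user_group_ids(user_id: str, conn: sqlite3.Connection) -> Set[str]:
    """Same query shape as the new helper: only concrete membership rows."""
    rows = conn.execute(
        "SELECT group_id FROM user_group_members WHERE user_id = ?",
        [user_id],
    ).fetchall()
    return {r[0] for r in rows}


# Assumed minimal schema; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_group_members (user_id TEXT, group_id TEXT, source TEXT)"
)
conn.execute(
    "INSERT INTO user_group_members VALUES ('u1', 'g-finance', 'google_sync')"
)

synced_user = user_group_ids("u1", conn)   # one real membership row
fresh_user = user_group_ids("u2", conn)    # no rows, so no groups at all
```

A user with no rows now resolves to an empty set, which is why plugins that relied on implicit Everyone need a grant to a real group.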


@ -1,50 +1,71 @@
"""Sync a user's Google Workspace group membership into users.groups.
"""Sync a user's Google Workspace group membership at OAuth callback.
Called from `app/auth/providers/google.py` in the OAuth callback. Uses the
Cloud Identity API (searchTransitiveGroups returns nested group
memberships too) with Application Default Credentials from the VM metadata
server. No JSON key, no domain-wide delegation.
Called from `app/auth/providers/google.py`. Uses keyless Domain-Wide
Delegation: the VM service account signs the impersonation JWT via the IAM
``signJwt`` API (no private key on disk), then exchanges that JWT for a
short-lived OAuth token scoped to ``admin.directory.group.readonly``. The
Admin SDK ``groups.list?userKey=`` endpoint returns the user's static AND
dynamic group memberships in one call.
Required one-off Workspace setup:
- Assign Groups Admin admin role to the VM service account.
- See docs/google-workspace-groups-request.md.
Required GCP setup (one-off):
Required VM config:
- `cloud-platform` access scope on the VM (already set on
grpn-sa-foundryai-execution) covers `cloud-identity.groups.readonly`.
- Cloud Identity API enabled on the project.
- The VM SA grants itself ``roles/iam.serviceAccountTokenCreator`` so it
can call ``IAMCredentials.signJwt`` for its own identity.
- A Domain-Wide Delegation entry exists in admin.google.com → Security →
API controls → Domain-wide Delegation, mapping the VM SA's numeric
Unique ID to scope ``admin.directory.group.readonly``.
Required env on the VM:
- ``GOOGLE_ADMIN_SDK_SUBJECT``: the Workspace admin email the SA
impersonates. Must be a real Workspace user with directory read
privileges. When unset, this module fails soft and returns ``[]``.
- ``GOOGLE_ADMIN_SDK_SA_EMAIL`` (optional): explicit SA email override.
When unset, the SA is auto-detected from the GCE metadata server, i.e.
whichever SA the VM is currently running as. Useful off-VM (CI, tests).
Local dev / CI:
Set GOOGLE_ADMIN_SDK_MOCK_GROUPS to a comma-separated list. ADC from the
metadata server doesn't exist off-VM; without this flag local runs fall
through to the real-path and bail out with an empty list (fail-soft).
Set ``GOOGLE_ADMIN_SDK_MOCK_GROUPS`` to a comma-separated list of group
emails to bypass all Google calls. Empty value → empty list. Unset →
the real keyless-DWD path.
"""
from __future__ import annotations
import logging
import os
import urllib.error
import urllib.request
from typing import List
logger = logging.getLogger(__name__)
SCOPE = "https://www.googleapis.com/auth/cloud-identity.groups.readonly"
# CEL label filter — regular Workspace email groups (grp_*, eng-team@..., etc).
# Skips security groups, dynamic groups, POSIX groups, which we don't use for
# plugin RBAC.
_GROUP_LABEL_DISCUSSION = "cloudidentity.googleapis.com/groups.discussion_forum"
# Env var that, when set, bypasses the real API entirely. Value is comma-
# separated group names. Empty string → empty list. Unset → real API path.
# Bypass real API entirely. Comma-separated group emails. Empty → []. Unset →
# real keyless-DWD path.
MOCK_ENV = "GOOGLE_ADMIN_SDK_MOCK_GROUPS"
# Required: the Workspace admin email impersonated through DWD.
SUBJECT_ENV = "GOOGLE_ADMIN_SDK_SUBJECT"
# Optional: SA email override. When unset, auto-detect from GCE metadata.
SA_EMAIL_ENV = "GOOGLE_ADMIN_SDK_SA_EMAIL"
SCOPE = "https://www.googleapis.com/auth/admin.directory.group.readonly"
_METADATA_SA_URL = (
"http://metadata.google.internal/computeMetadata/v1/instance/"
"service-accounts/default/email"
)
def fetch_user_groups(email: str) -> List[str]:
"""Return the list of group names (emails) the user belongs to.
"""Return the list of group emails ``email`` is a member of.
Fail-soft: returns [] on any error. Caller must treat this as a soft
signal (login proceeds, users.groups stays whatever it was before).
Fail-soft: returns ``[]`` on any error (missing config, metadata server
unreachable, API 4xx/5xx, network outage). The caller in the OAuth
callback treats ``[]`` as "no data" and leaves the previous membership
snapshot intact so a transient outage does not wipe a user's groups.
"""
mock = os.environ.get(MOCK_ENV)
if mock is not None:
@ -52,51 +73,93 @@ def fetch_user_groups(email: str) -> List[str]:
return _fetch_real(email)
def _detect_sa_email() -> str | None:
"""Return the SA email this process should impersonate as.
Order of resolution:
1. ``GOOGLE_ADMIN_SDK_SA_EMAIL`` env var: explicit override.
2. GCE metadata server: the SA the VM is attached to.
Returns ``None`` when neither is available (off-VM with no override).
"""
explicit = os.environ.get(SA_EMAIL_ENV, "").strip()
if explicit:
return explicit
try:
req = urllib.request.Request(
_METADATA_SA_URL,
headers={"Metadata-Flavor": "Google"},
)
with urllib.request.urlopen(req, timeout=2) as resp:
return resp.read().decode("ascii").strip()
except (urllib.error.URLError, urllib.error.HTTPError, OSError):
return None
def _fetch_real(email: str) -> List[str]:
try:
from google.auth import default
from google.auth import default, iam
from google.auth.transport.requests import Request
from google.oauth2 import service_account
from googleapiclient.discovery import build
except ImportError:
logger.warning(
"google-api-python-client not installed; skipping group fetch"
"google-api-python-client / google-auth not installed; "
"skipping group fetch"
)
return []
subject = os.environ.get(SUBJECT_ENV, "").strip()
if not subject:
logger.warning(
"%s not set; skipping group fetch (keyless DWD requires an "
"admin email to impersonate)",
SUBJECT_ENV,
)
return []
sa_email = _detect_sa_email()
if not sa_email:
logger.warning(
"Could not determine VM service account email "
"(metadata server unreachable and %s not set); "
"skipping group fetch",
SA_EMAIL_ENV,
)
return []
try:
creds, _ = default(scopes=[SCOPE])
source, _ = default()
signer = iam.Signer(Request(), source, sa_email)
creds = service_account.Credentials(
signer=signer,
service_account_email=sa_email,
token_uri="https://oauth2.googleapis.com/token",
scopes=[SCOPE],
subject=subject,
)
service = build(
"cloudidentity", "v1", credentials=creds, cache_discovery=False
"admin", "directory_v1",
credentials=creds,
cache_discovery=False,
)
except Exception as e: # noqa: BLE001 - fail-soft by design
logger.warning("Google client init failed: %s", e)
logger.warning("Admin SDK init failed: %s", e)
return []
# Escape single quotes in the email to keep the CEL query well-formed even
# if a user has a quote in their login (rare, but defensive).
safe_email = email.replace("'", "\\'")
query = (
f"member_key_id == '{safe_email}' && "
f"'{_GROUP_LABEL_DISCUSSION}' in labels"
)
groups: List[str] = []
page_token = None
page_token: str | None = None
try:
while True:
resp = (
service.groups()
.memberships()
.searchTransitiveGroups(
parent="groups/-",
query=query,
pageToken=page_token,
)
.execute()
)
for m in resp.get("memberships", []):
gkey = m.get("groupKey", {}).get("id")
if gkey:
groups.append(gkey)
resp = service.groups().list(
userKey=email,
maxResults=200,
pageToken=page_token,
).execute()
for g in resp.get("groups", []):
gid = g.get("email")
if gid:
groups.append(gid)
page_token = resp.get("nextPageToken")
if not page_token:
break
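
The pagination loop in this hunk can be exercised off-VM with a stub in place of the Admin SDK client. The sketch below assumes only the call shape visible in the diff (`service.groups().list(...).execute()` returning a dict with `groups` and `nextPageToken`); `FakeService`, `FakeGroups`, `FakeRequest`, and `collect_group_emails` are hypothetical names.

```python
from typing import Dict, List, Optional


class FakeRequest:
    def __init__(self, payload: dict) -> None:
        self._payload = payload

    def execute(self) -> dict:
        return self._payload


class FakeGroups:
    """Serves canned pages keyed by page token (None = first page)."""

    def __init__(self, pages: Dict[Optional[str], dict]) -> None:
        self._pages = pages

    def list(self, userKey: str, maxResults: int, pageToken: Optional[str] = None):
        return FakeRequest(self._pages[pageToken])


class FakeService:
    def __init__(self, pages: Dict[Optional[str], dict]) -> None:
        self._groups = FakeGroups(pages)

    def groups(self) -> FakeGroups:
        return self._groups


def collect_group_emails(service, email: str) -> List[str]:
    """Follow nextPageToken until exhausted, keeping non-empty group emails."""
    emails: List[str] = []
    page_token: Optional[str] = None
    while True:
        resp = service.groups().list(
            userKey=email, maxResults=200, pageToken=page_token
        ).execute()
        for g in resp.get("groups", []):
            if g.get("email"):
                emails.append(g["email"])
        page_token = resp.get("nextPageToken")
        if not page_token:
            break
    return emails
```

A stub like this keeps the loop testable without credentials; it does not model the API's error responses.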


@ -91,13 +91,30 @@ async def google_callback(request: Request):
# Find or create user, sync Workspace group memberships into
# user_group_members.
from src.db import get_system_db
from src.db import (
get_system_db,
SYSTEM_ADMIN_GROUP,
SYSTEM_EVERYONE_GROUP,
)
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupsRepository
from src.repositories.user_group_members import UserGroupMembersRepository
from app.auth.group_sync import fetch_user_groups
import uuid
# Optional Workspace-group prefix filter + system-group mapping. Read
# per-request so test fixtures and operators can flip via env without
# restarting the process. Empty prefix = legacy behavior (mirror all).
prefix = os.environ.get(
"AGNES_GOOGLE_GROUP_PREFIX", ""
).strip().lower()
admin_email = os.environ.get(
"AGNES_GROUP_ADMIN_EMAIL", ""
).strip().lower()
everyone_email = os.environ.get(
"AGNES_GROUP_EVERYONE_EMAIL", ""
).strip().lower()
conn = get_system_db()
try:
repo = UserRepository(conn)
@ -112,38 +129,81 @@ async def google_callback(request: Request):
# Sync Workspace groups → user_group_members (source='google_sync').
# Fail-soft: any error leaves the previous membership snapshot in
# place; admin-added rows survive regardless.
members_repo = UserGroupMembersRepository(conn)
try:
group_names = fetch_user_groups(email)
# `fetch_user_groups` is fail-soft and returns [] for both
# "user genuinely has no groups" and "transient API failure".
# We can't distinguish, so empty is treated as "no change":
# don't call replace_google_sync_groups (which would
# DELETE...source='google_sync' then INSERT zero, wiping
# all of the user's Workspace-synced memberships on a
# transient hiccup). Trade-off: a user whose Workspace
# groups were genuinely cleared keeps stale memberships
# until the next non-empty sync. Admin-added rows
# (source='admin') are unaffected either way.
if group_names:
ug_repo = UserGroupsRepository(conn)
members_repo = UserGroupMembersRepository(conn)
group_ids: list[str] = []
for group_name in group_names:
g = ug_repo.ensure(group_name)
group_ids.append(g["id"])
members_repo.replace_google_sync_groups(
user["id"], group_ids, added_by="system:google-sync",
)
logger.info(
"Google group sync for %s: %d group(s) [%s]",
email, len(group_ids), ", ".join(group_names),
)
else:
# Empty result is treated as "no change": preserve the
# previous snapshot rather than wiping it on a transient
# hiccup. Admin-added rows survive regardless.
if not group_names:
logger.info(
"Google group sync for %s: empty result, "
"preserving existing memberships",
email,
)
else:
# Lower-cased Workspace email of each group; comparisons
# against admin_email/everyone_email/prefix are all
# case-insensitive.
fetched = [g.lower() for g in group_names]
if prefix:
relevant = [g for g in fetched if g.startswith(prefix)]
else:
relevant = list(fetched)
# Login gate: prefix is set AND fetch returned a
# non-empty list AND none of those groups match the
# prefix → user is signed into Google but is not a
# member of any group permitted to use this Agnes
# instance. Pass-through-on-empty-fetch is preserved
# above (transient API failures must not lock users
# out), so this branch fires only when we got a real
# answer that excluded them.
if prefix and not relevant:
logger.info(
"Google login denied for %s: no group with "
"prefix %r in %s",
email, prefix, fetched,
)
return RedirectResponse(
url="/login?error=not_in_allowed_group"
)
ug_repo = UserGroupsRepository(conn)
group_ids: list[str] = []
for email_addr in relevant:
if admin_email and email_addr == admin_email:
sys_admin = ug_repo.get_by_name(
SYSTEM_ADMIN_GROUP
)
if sys_admin:
group_ids.append(sys_admin["id"])
continue
if everyone_email and email_addr == everyone_email:
sys_everyone = ug_repo.get_by_name(
SYSTEM_EVERYONE_GROUP
)
if sys_everyone:
group_ids.append(sys_everyone["id"])
continue
# Regular synced group: name = full email. ensure()
# is get-or-create-by-name and stamps
# created_by='system:google-sync' on first create.
g = ug_repo.ensure(email_addr)
group_ids.append(g["id"])
members_repo.replace_google_sync_groups(
user["id"], group_ids, added_by="system:google-sync",
)
logger.info(
"Google group sync for %s: %d group(s) "
"(filtered from %d fetched, prefix=%r) [%s]",
email, len(group_ids), len(fetched), prefix,
", ".join(relevant),
)
except Exception as sync_err: # noqa: BLE001 - fail-soft by design
logger.warning(
"Google group sync failed for %s: %s", email, sync_err

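The routing step in this hunk (admin/everyone emails land on the seeded system rows, everything else becomes its own synced group) can be sketched with a dict standing in for the `user_groups` table. The function `route_groups`, the `id-` prefix for new rows, and the literal names `"Admin"` / `"Everyone"` for the system constants are assumptions for illustration.

```python
from typing import Dict, List


def route_groups(
    relevant: List[str],
    admin_email: str,
    everyone_email: str,
    groups_by_name: Dict[str, str],
) -> List[str]:
    """Map prefix-matching Workspace emails to group ids.

    groups_by_name is a name -> id stand-in for the user_groups table.
    """
    ids: List[str] = []
    for addr in relevant:
        if admin_email and addr == admin_email:
            target = "Admin"          # assumed value of SYSTEM_ADMIN_GROUP
        elif everyone_email and addr == everyone_email:
            target = "Everyone"       # assumed value of SYSTEM_EVERYONE_GROUP
        else:
            target = addr
            # Stand-in for ensure(): get-or-create a row named by the email.
            groups_by_name.setdefault(addr, f"id-{addr}")
        gid = groups_by_name.get(target)
        if gid:                       # a missing system row is skipped
            ids.append(gid)
    return ids
```

Note that a mapped email never creates its own row, which is what keeps the system row the single home for those members.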

@ -150,6 +150,21 @@ def _build_context(request: Request, user: Optional[dict] = None, **extra) -> di
DEBUG_AUTH_ENABLED = os.environ.get("AGNES_DEBUG_AUTH", "").strip().lower() in (
"1", "true", "yes",
)
# Google Workspace prefix-mapping config — surfaced into templates
# so client-side JS can derive a friendly display name from the
# full Workspace email stored as the group's `name` (admin UI
# strips the prefix and `@domain` for the big line, keeps the
# full email as subtitle). Read at template render time so an
# operator can flip these via env without an image rebuild.
AGNES_GOOGLE_GROUP_PREFIX = os.environ.get(
"AGNES_GOOGLE_GROUP_PREFIX", ""
)
AGNES_GROUP_ADMIN_EMAIL = os.environ.get(
"AGNES_GROUP_ADMIN_EMAIL", ""
)
AGNES_GROUP_EVERYONE_EMAIL = os.environ.get(
"AGNES_GROUP_EVERYONE_EMAIL", ""
)
@staticmethod
def theme_overrides():
@ -728,10 +743,17 @@ async def admin_group_detail_page(
"""Single-group detail page — header + members table. Resource grants
live on /admin/grants (deep-linked from here)."""
from src.repositories.user_groups import UserGroupsRepository
from app.api.access import _is_google_managed
g = UserGroupsRepository(conn).get(group_id)
if not g:
raise HTTPException(status_code=404, detail="Group not found")
ctx = _build_context(request, user=user, target_group=g)
# Project a `is_google_managed` flag onto the dict the template reads,
# using the same rule the API enforces (created_by='system:google-sync'
# OR system + env mapping). Doing it server-side keeps the template
# free of env-var lookups and Python-side logic duplication.
g_view = dict(g)
g_view["is_google_managed"] = _is_google_managed(g)
ctx = _build_context(request, user=user, target_group=g_view)
return templates.TemplateResponse(request, "admin_group_detail.html", ctx)


@ -18,7 +18,23 @@
.gd-back:hover { color: var(--text-primary, #111827); }
.gd-title-block { flex: 1; }
.gd-title { font-size: 22px; font-weight: 600; margin: 0; }
.gd-title-email {
display: block;
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;
font-size: 12px; font-weight: 400; color: var(--text-secondary, #6b7280);
margin-top: 2px;
}
.gd-subtitle { font-size: 13px; color: var(--text-secondary, #6b7280); margin-top: 2px; }
.gd-managed-banner {
margin-bottom: 16px;
padding: 12px 16px;
background: #ecfdf5;
border: 1px solid #86efac;
color: #166534;
border-radius: 10px;
font-size: 13px;
line-height: 1.5;
}
.origin-chip {
display: inline-block; padding: 3px 10px; border-radius: 999px;
font-size: 10px; font-weight: 600;
@ -133,19 +149,31 @@
.toast.error { background: #b91c1c; }
</style>
<div class="gd-page" data-group-id="{{ target_group.id }}" data-group-name="{{ target_group.name }}" data-is-system="{{ 'true' if target_group.is_system else 'false' }}">
<div class="gd-page"
data-group-id="{{ target_group.id }}"
data-group-name="{{ target_group.name }}"
data-is-system="{{ 'true' if target_group.is_system else 'false' }}"
data-is-google-managed="{{ 'true' if target_group.is_google_managed else 'false' }}"
data-google-prefix="{{ config.AGNES_GOOGLE_GROUP_PREFIX }}">
<div class="gd-header">
<a href="/admin/groups" class="gd-back">← Back to groups</a>
<div class="gd-title-block">
<h1 class="gd-title">
{{ target_group.name }}
<h1 class="gd-title" id="header-title">
{% if target_group.is_google_managed %}
<span id="header-display-name">{{ target_group.name }}</span>
{% else %}
{{ target_group.name }}
{% endif %}
<span id="origin-chip" class="origin-chip" style="display:none;"></span>
</h1>
{% if target_group.is_google_managed %}
<span class="gd-title-email">{{ target_group.name }}</span>
{% endif %}
<div class="gd-subtitle" id="header-sub">
{{ target_group.description or "—" }}
</div>
</div>
{% if not target_group.is_system %}
{% if not target_group.is_system and not target_group.is_google_managed %}
<div style="display:flex; gap:6px;">
<button class="icon-btn" id="edit-group-btn">Edit</button>
<button class="icon-btn danger" id="delete-group-btn">Delete</button>
@ -153,6 +181,14 @@
{% endif %}
</div>
{% if target_group.is_google_managed %}
<div class="gd-managed-banner">
This group is managed by Google Workspace — read-only here.
Add or remove members via <a href="https://admin.google.com" target="_blank" rel="noopener">admin.google.com</a>,
or sign in again to refresh.
</div>
{% endif %}
<!-- Members section -->
<section class="gd-section">
<div class="gd-section-head">
@ -168,10 +204,12 @@
<tbody id="members-tbody"></tbody>
</table>
<div class="gd-empty" id="members-empty" style="display:none;">No members yet.</div>
{% if not target_group.is_google_managed %}
<div class="add-row">
<input id="add-email" type="email" autocomplete="off" placeholder="Add user by email…">
<button id="add-btn" disabled>Add member</button>
</div>
{% endif %}
</section>
<!-- Resource grants summary -->
@ -212,9 +250,27 @@
const root = document.querySelector(".gd-page");
const GROUP_ID = root.dataset.groupId;
const IS_SYSTEM = root.dataset.isSystem === "true";
const IS_GOOGLE_MANAGED = root.dataset.isGoogleManaged === "true";
const GOOGLE_GROUP_PREFIX = root.dataset.googlePrefix || "";
const GROUP_API = `/api/admin/groups/${encodeURIComponent(GROUP_ID)}`;
const MEMBERS_API = `${GROUP_API}/members`;
function deriveDisplayName(fullEmail) {
if (!fullEmail) return "";
const local = String(fullEmail).split("@")[0] || String(fullEmail);
const px = (GOOGLE_GROUP_PREFIX || "").toLowerCase();
let s = local;
if (px && s.toLowerCase().startsWith(px)) s = s.slice(px.length);
s = s.replace(/^[_\-\s]+/, "");
if (!s) return local;
return s.charAt(0).toUpperCase() + s.slice(1);
}
if (IS_GOOGLE_MANAGED) {
const dn = document.getElementById("header-display-name");
if (dn) dn.textContent = deriveDisplayName(root.dataset.groupName);
}
function esc(s) { const d = document.createElement("div"); d.textContent = s == null ? "" : String(s); return d.innerHTML; }
function fmtDate(s) { return s ? String(s).slice(0, 16).replace("T", " ") : "-"; }
function toast(msg, kind = "") {
@ -316,13 +372,19 @@ async function addMember() {
loadMembers();
}
document.getElementById("add-btn").addEventListener("click", addMember);
document.getElementById("add-email").addEventListener("input", e => {
document.getElementById("add-btn").disabled = !e.target.value.trim();
});
document.getElementById("add-email").addEventListener("keydown", e => {
if (e.key === "Enter") { e.preventDefault(); if (e.target.value.trim()) addMember(); }
});
// Add-member affordance is hidden server-side on google-managed rows; bind
// the listeners only when the elements actually exist.
const addBtnEl = document.getElementById("add-btn");
const addEmailEl = document.getElementById("add-email");
if (addBtnEl && addEmailEl) {
addBtnEl.addEventListener("click", addMember);
addEmailEl.addEventListener("input", e => {
addBtnEl.disabled = !e.target.value.trim();
});
addEmailEl.addEventListener("keydown", e => {
if (e.key === "Enter") { e.preventDefault(); if (e.target.value.trim()) addMember(); }
});
}
async function removeMember(userId, label) {
if (!confirm(`Remove ${label} from this group?`)) return;

View file

@ -47,6 +47,12 @@
text-decoration: none;
}
.gp-name:hover { color: var(--primary, #4338ca); text-decoration: underline; }
.gp-name-sub {
display: block;
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;
font-size: 11px; font-weight: 400; color: var(--text-secondary, #6b7280);
margin-top: 2px;
}
.gp-desc {
color: var(--text-secondary, #6b7280); font-size: 12px;
max-width: 380px;
@ -245,10 +251,28 @@
<script>
const API = "/api/admin/groups";
// Server-injected env: empty string = no prefix configured. Used to derive a
// friendly display name from the full Workspace email stored as the group's
// `name` (e.g. "grp_acme_finance@example.com" → "Finance").
const GOOGLE_GROUP_PREFIX = {{ config.AGNES_GOOGLE_GROUP_PREFIX | tojson }};
function esc(s) { const d = document.createElement("div"); d.textContent = s == null ? "" : String(s); return d.innerHTML; }
function fmtDate(s) { return s ? String(s).slice(0, 16).replace("T", " ") : "-"; }
function deriveDisplayName(fullEmail) {
// Strip @domain, then strip the configured prefix (case-insensitive),
// then capitalize the first letter. Fallback to the raw local-part if
// anything looks off — better to show the email than render an empty cell.
if (!fullEmail) return "";
const local = String(fullEmail).split("@")[0] || String(fullEmail);
const px = (GOOGLE_GROUP_PREFIX || "").toLowerCase();
let s = local;
if (px && s.toLowerCase().startsWith(px)) s = s.slice(px.length);
s = s.replace(/^[_\-\s]+/, "");
if (!s) return local;
return s.charAt(0).toUpperCase() + s.slice(1);
}
function toast(msg, kind = "") {
const el = document.createElement("div");
el.className = "toast " + kind;
@ -308,12 +332,26 @@ function render() {
tr.dataset.id = g.id;
tr.style.cursor = "pointer";
const origin = g.origin || "admin";
const actions = g.is_system
// Read-only when the row is owned by Google sync OR a non-mapped system
// group (Admin/Everyone canonical name without the env mapping — those
// still cannot be renamed/deleted, but they accept admin-managed members).
// The `is_google_managed` flag from the API is the union we care about for
// hiding Edit/Delete in the list.
const isGoogleManaged = !!g.is_google_managed;
const isReadOnly = g.is_system || isGoogleManaged;
const actions = isReadOnly
? `<span style="color:#9ca3af;font-size:11px">read-only</span>`
: `<button class="icon-btn" data-action="edit">Edit</button>
<button class="icon-btn danger" data-action="delete">Delete</button>`;
// For Google-managed rows, the canonical `name` is the full Workspace
// email — render a derived "Finance" big label with the email as a
// monospace subtitle. For everything else, name stays one line.
const nameCell = isGoogleManaged
? `<a class="gp-name" href="/admin/groups/${encodeURIComponent(g.id)}">${esc(deriveDisplayName(g.name))}</a>
<span class="gp-name-sub">${esc(g.name)}</span>`
: `<a class="gp-name" href="/admin/groups/${encodeURIComponent(g.id)}">${esc(g.name)}</a>`;
tr.innerHTML = `
<td><a class="gp-name" href="/admin/groups/${encodeURIComponent(g.id)}">${esc(g.name)}</a></td>
<td>${nameCell}</td>
<td><span class="gp-desc">${esc(g.description || "")}</span></td>
<td><span class="origin-chip origin-${esc(origin)}">${esc(origin.replace("_"," "))}</span></td>
<td class="count-cell">${g.member_count || 0}</td>

View file

@ -95,6 +95,22 @@
Access your AI data analysis workspace
</p>
{% set _err = request.query_params.get('error') %}
{% set _err_messages = {
'not_in_allowed_group': "Your Google account isn't a member of any group permitted to use this Agnes instance. Ask your Agnes administrator to grant you access.",
'group_check_unavailable': "We couldn't verify your group membership with Google right now. Please try signing in again in a moment.",
'deactivated': "This account has been deactivated. Contact your Agnes administrator if you believe this is in error.",
'oauth_failed': "Google sign-in failed. Please try again.",
'no_email': "Google didn't return an email for this account.",
'domain_not_allowed': "This email's domain is not permitted to sign in to this Agnes instance.",
'google_not_configured': "Google sign-in is not configured on this server.",
} %}
{% if _err and _err_messages.get(_err) %}
<div class="login-error" role="alert" style="background:#fef2f2;border:1px solid #fecaca;color:#b91c1c;padding:12px 14px;border-radius:8px;font-size:13px;line-height:1.5;margin:0 auto 16px;max-width:280px;text-align:left;">
{{ _err_messages.get(_err) }}
</div>
{% endif %}
{% for btn in login_buttons %}
<a href="{{ btn.url }}" class="btn {{ btn.css_class|default('btn-secondary') }}" style="width: 100%; max-width: 280px;">
{% if btn.icon_html %}{{ btn.icon_html|safe }}{% endif %}

View file

@ -1,72 +1,252 @@
# Google Workspace Groups in Agnes
# Google Workspace Group Sync
How Agnes pulls a user's group memberships at Google sign-in and where they end up.
How Agnes pulls a user's Workspace group memberships at Google sign-in and
where they end up in the database.
## Google Cloud setup (per OAuth client / project)
## Flow at a glance
In the GCP project hosting the OAuth client (e.g. `acme-internal-prod`):
The OAuth callback in `app/auth/providers/google.py` calls
`app.auth.group_sync.fetch_user_groups(email)` and feeds the result into
`UserGroupMembersRepository.replace_google_sync_groups`, which DELETE+INSERTs
the user's `source='google_sync'` rows in `user_group_members`. Admin-added
rows (`source='admin'`) and seeded system rows (`source='system_seed'`) are
untouched.
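
The source-scoped replace described above can be sketched roughly as follows. This is an illustration only: it uses the standard-library `sqlite3` as a stand-in for DuckDB, and the function body and the `make_demo_db` helper are assumptions, not the repository's actual code. Only the table and column names (`user_group_members`, `user_id`, `group_id`, `source`, `added_by`) come from the source.

```python
import sqlite3


def make_demo_db():
    """Hypothetical helper: in-memory table with the columns named in the source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_group_members "
        "(user_id TEXT, group_id TEXT, source TEXT, added_by TEXT)"
    )
    return conn


def replace_google_sync_groups(conn, user_id, group_ids):
    """Sketch: swap only this user's source='google_sync' rows, in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "DELETE FROM user_group_members "
            "WHERE user_id = ? AND source = 'google_sync'",
            (user_id,),
        )
        conn.executemany(
            "INSERT INTO user_group_members (user_id, group_id, source, added_by) "
            "VALUES (?, ?, 'google_sync', 'sketch')",
            [(user_id, gid) for gid in group_ids],
        )
```

Because the `DELETE` is filtered on `source`, rows added by an admin or by the system seed are never candidates for removal, which is the property the paragraph above relies on.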
1. **Enable Cloud Identity API**: `APIs & Services → Library → "Cloud Identity API" → Enable`.
2. **OAuth consent screen → Data Access → Add or Remove Scopes** — manually add:
```
https://www.googleapis.com/auth/cloud-identity.groups.readonly
```
3. **OAuth client → Authorized redirect URIs** — must include `https://<host>/auth/google/callback` for the deployment that uses this client.
4. **OAuth consent screen → Audience** — keep `Internal` (own Workspace tenant only). `External` triggers verification review for the sensitive Cloud Identity scope.
That's it. No service account, no domain-wide delegation, no admin role per user.
## The `security` label trap
Cloud Identity exposes membership listing through `groups/-/memberships:searchTransitiveGroups`. Its `query` (CEL) **must include a label predicate**. Two label types matter:
- `cloudidentity.googleapis.com/groups.discussion_forum` — every Workspace group has it. **Returns 403 "Insufficient permissions"** for non-admin users.
- `cloudidentity.googleapis.com/groups.security` — only security-flagged groups have it as a top-level capability, but in practice **every Keboola Workspace group also carries this label**. **Returns 200** with the full membership list.
Agnes therefore queries with `security` (in `app/auth/providers/google.py`):
```python
"member_key_id == '<email>' && 'cloudidentity.googleapis.com/groups.security' in labels"
```
Browser → /auth/google/callback
→ exchange code for ID token (email)
→ fetch_user_groups(email) ← keyless DWD + Admin SDK groups.list
→ optional prefix filter + system-group mapping
→ ensure each group in user_groups
→ replace_google_sync_groups(...) ← per-user DELETE+INSERT, source-scoped
→ set session cookie, redirect to /dashboard
```
Switching to `discussion_forum` will silently break for everyone but Workspace admins.
The fetch is **fail-soft**: any error (missing config, API 4xx/5xx, network
outage) returns `[]`, the membership snapshot from the previous login stays
intact, and the user is signed in regardless. A transient outage does not
empty a user's groups.
## Storage + use
## How `fetch_user_groups` authenticates to Google
`app/auth/providers/google.py:google_callback` runs on every Google sign-in:
The function in `app/auth/group_sync.py` uses **keyless Domain-Wide
Delegation**: the VM service account signs the impersonation JWT through
the IAM `signJwt` API (no private key on disk anywhere), then exchanges
that JWT for a short-lived OAuth token scoped to
`admin.directory.group.readonly`. The Admin SDK
`groups.list?userKey=<email>` endpoint returns both static and dynamic
group memberships in one paginated call.
1. Fetch via `fetch_user_groups(access_token, email)` (in `app/auth/group_sync.py`) → list of `{"id": "<email>", "name": "<displayName>"}`.
2. Write to `user_group_members` table with `source='google_sync'` (DuckDB-backed, persistent across sessions).
3. The previous Google-sync set is wholesale replaced (DELETE + INSERT for `source='google_sync'` rows) so a removed Workspace membership disappears immediately.
4. Admin-added memberships (`source='admin'`) are preserved — Google sync only touches its own rows.
5. **Fail-soft**: If the Cloud Identity API returns an error (403, 401, network), the callback preserves existing memberships instead of wiping them. This prevents a transient API outage from silently dropping all Workspace-synced group memberships.
Two identities are involved:
The `user_group_members` table is the single source of truth for group memberships, used by:
- RBAC authorization (`app/auth/access.py`) — `require_resource_access` checks group grants
- **The VM service account** (auto-detected from the GCE metadata server)
is the issuer of the JWT. Its IAM unique ID must be allowlisted via DWD.
- **The impersonated subject** (`GOOGLE_ADMIN_SDK_SUBJECT` env var) is a
real Workspace user with directory read privileges. The Admin SDK call
is authorized as if that admin made it.
## Filtering and storage
What the OAuth callback does with the list returned by `fetch_user_groups`:
1. **Prefix filter.** If `AGNES_GOOGLE_GROUP_PREFIX` is set (e.g.
`grp_acme_`), only emails whose local part starts with the prefix
survive into Agnes; the rest are discarded. If unset, every fetched
group is mirrored (legacy behavior).
2. **System-group mapping.** Two optional env vars route specific
Workspace emails into the seeded system rows instead of creating fresh
`user_groups` entries:
- `AGNES_GROUP_ADMIN_EMAIL` — when set, membership in the matching
Workspace group adds the user to the seeded `Admin` row.
- `AGNES_GROUP_EVERYONE_EMAIL` — same, for `Everyone`.
This lets operators have a Workspace group like
`grp_acme_admin@example.com` show up in Agnes as the canonical
`Admin` system group (with the same `is_system=TRUE` semantics, the
same membership-table id) — no parallel "near-Admin" row.
3. **Login gate.** If `AGNES_GOOGLE_GROUP_PREFIX` is set AND the fetch
returned a non-empty list AND none of those groups match the prefix →
the callback redirects to `/login?error=not_in_allowed_group`. The
prefix gate fires only on a real, prefix-mismatched answer; if the
Admin SDK returned an empty list (transient failure or genuine
no-membership), the previous cached snapshot is preserved (fail-soft)
and the login proceeds — locking returning users out on a flaky API
call would be worse than the alternative.
4. **Storage.** Surviving groups land in `user_group_members` with
`source='google_sync'`. The underlying `user_groups` row's `name` is
the **full Workspace email** (no separate `external_id` column — the
email IS the canonical identifier), `created_by='system:google-sync'`.
The admin UI strips the prefix and `@domain` for display:
`grp_acme_finance@example.com` renders as a large "Finance" label
with the full email as a small subtitle.
5. **Refresh semantics.** The previous Google-sync set is wholesale
replaced (DELETE + INSERT for `source='google_sync'` rows) so a removed
Workspace membership disappears immediately. Admin-added memberships
(`source='admin'`) are preserved — Google sync only touches its own
rows. Memberships are refreshed on every Google sign-in; a user's
stale memberships persist until their next login.
**Read-only admin UI on Google-managed rows.** The admin UI hides the
Edit / Delete affordances on rows owned by Google sync
(`created_by='system:google-sync'`) and on the seeded `Admin` / `Everyone`
rows when their email-mapping env var is set. The REST API enforces the
same rule: PATCH / DELETE / add-member / remove-member return
`409 google_managed_readonly` for these rows. To add or remove members,
an operator changes Workspace membership at admin.google.com and the user
signs in again to Agnes.
**No more implicit Everyone.** The auto-`system_seed` insert into
`Everyone` for every new user was removed when prefix-mapping landed.
Every membership now traces to a real source row (`admin`, `google_sync`,
or an explicit `system_seed`). If you want plugins visible to "everyone
in the company", grant them on a Workspace group every employee belongs
to, mapped to `Everyone` via `AGNES_GROUP_EVERYONE_EMAIL`.
The `user_group_members` table is the single source of truth for group
memberships, used by:
- RBAC authorization (`app/auth/access.py`) — `require_resource_access`
checks group grants
- Admin UI (`/admin/access`) — member lists, grant counts
- CLI (`da admin group members`) — group membership queries
- Marketplace filtering (`src/marketplace_filter.py`) — plugin access based on group grants
- Marketplace filtering (`src/marketplace_filter.py`) — plugin access
based on group grants
**Refresh.** Memberships are refreshed on every Google sign-in. A user's stale memberships persist until their next login.
## GCP setup (one-off, per deployment)
## Local-dev mock (no Google round-trip)
1. **Enable Admin SDK API** on the project:
```
APIs & Services → Library → "Admin SDK API" → Enable
```
2. **IAM binding on the VM SA** — grant the SA `roles/iam.serviceAccountTokenCreator`
on itself, so it can call `IAMCredentials.signJwt`:
```bash
gcloud iam service-accounts add-iam-policy-binding <sa-email> \
--member="serviceAccount:<sa-email>" \
--role="roles/iam.serviceAccountTokenCreator" \
--project=<project-id>
```
3. **Domain-Wide Delegation** in `admin.google.com`:
```
Security → API controls → Domain-wide Delegation → Add new
Client ID: <SA's numeric Unique ID, e.g. 103511645014740068359>
OAuth scope: https://www.googleapis.com/auth/admin.directory.group.readonly
```
The Unique ID is the field `uniqueId` returned by
`gcloud iam service-accounts describe <sa-email>`.
When developing on `localhost` with `LOCAL_DEV_MODE=1`, Google OAuth never runs, so group memberships would normally stay empty. Set `LOCAL_DEV_GROUPS` to inject a mocked membership list:
This setup is per Workspace tenant. A Workspace super admin must grant
the DWD entry; project-level GCP IAM cannot do it.
## Required env on the VM
```env
GOOGLE_ADMIN_SDK_SUBJECT=admin@your-domain.com
```
The Workspace admin email the SA impersonates. **Without this, the function
fails soft and returns `[]`** — group sync is silently disabled. The admin
must already have directory read privileges in `admin.google.com`; a regular
user with no admin role will produce a `403 Not Authorized` from the Admin
SDK even with DWD in place.
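
For orientation, the claim set that the service account signs in a domain-wide-delegation flow generally has the shape below. This is a sketch of the standard Google service-account JWT claims, not a copy of `app/auth/group_sync.py`; the function name `build_dwd_claims` and its parameters are hypothetical. `sub` is where `GOOGLE_ADMIN_SDK_SUBJECT` would go.

```python
import time

# Standard Google OAuth token endpoint and the scope named in this document.
TOKEN_URI = "https://oauth2.googleapis.com/token"
SCOPE = "https://www.googleapis.com/auth/admin.directory.group.readonly"


def build_dwd_claims(sa_email, subject, now=None, lifetime_s=3600):
    """Hypothetical helper: build the JWT claims for impersonating `subject`.

    `sa_email` is the issuing service account; `subject` is the Workspace
    admin being impersonated. The signing step (IAM `signJwt`) and the token
    exchange are deliberately left out of this sketch.
    """
    issued_at = int(time.time() if now is None else now)
    return {
        "iss": sa_email,
        "sub": subject,
        "scope": SCOPE,
        "aud": TOKEN_URI,
        "iat": issued_at,
        "exp": issued_at + lifetime_s,
    }
```

If `sub` names a user without directory read privileges, the token exchange can still succeed while the Admin SDK call itself returns the 403 described above.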
## Optional env
```env
AGNES_GOOGLE_GROUP_PREFIX=grp_acme_
AGNES_GROUP_ADMIN_EMAIL=grp_acme_admin@example.com
AGNES_GROUP_EVERYONE_EMAIL=grp_acme_everyone@example.com
GOOGLE_ADMIN_SDK_SA_EMAIL=explicit-sa@project.iam.gserviceaccount.com
```
- `AGNES_GOOGLE_GROUP_PREFIX` / `AGNES_GROUP_ADMIN_EMAIL` /
`AGNES_GROUP_EVERYONE_EMAIL` — see [Filtering and storage](#filtering-and-storage).
Empty / unset = legacy "mirror all groups, no gate, no system mapping".
- `GOOGLE_ADMIN_SDK_SA_EMAIL` — when unset, the SA email is auto-detected
from the GCE metadata server. Set this only when running off-VM (CI /
local dev with explicit ADC) or when impersonating a different SA than
the one the VM is attached to.
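
How the three `AGNES_*` variables interact can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the callback's real code: `SyncPlan` and `plan_group_sync` are hypothetical names, the prefix match is assumed case-sensitive, the admin/everyone email comparison is assumed case-insensitive, and the system mapping is assumed to run after the prefix filter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SyncPlan:
    gate_rejected: bool = False  # True -> redirect to /login?error=not_in_allowed_group
    system_groups: List[str] = field(default_factory=list)    # "Admin" / "Everyone"
    mirrored_emails: List[str] = field(default_factory=list)  # regular google_sync groups


def plan_group_sync(fetched, prefix="", admin_email="", everyone_email=""):
    """Hypothetical sketch of prefix filter, login gate, and system mapping."""
    fetched = list(fetched)
    if prefix:
        # Keep only groups whose local part starts with the prefix.
        kept = [e for e in fetched if e.split("@", 1)[0].startswith(prefix)]
        # The gate fires only on a non-empty answer with zero matches.
        if fetched and not kept:
            return SyncPlan(gate_rejected=True)
    else:
        kept = fetched  # legacy behavior: mirror everything
    plan = SyncPlan()
    for email in kept:
        key = email.lower()
        if admin_email and key == admin_email.lower():
            plan.system_groups.append("Admin")
        elif everyone_email and key == everyone_email.lower():
            plan.system_groups.append("Everyone")
        else:
            plan.mirrored_emails.append(email)
    return plan
```

An empty `fetched` list never trips the gate in this sketch, which mirrors the fail-soft rule: only a real, prefix-mismatched answer blocks the login.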
## Local dev / CI mock
```env
GOOGLE_ADMIN_SDK_MOCK_GROUPS=engineers@example.com,admins@example.com
```
When set, all Google calls in `fetch_user_groups` are bypassed and the
function returns the parsed list verbatim. Empty value (`""`) returns
`[]`. Unset → real keyless-DWD path. The mock is honoured regardless of
`LOCAL_DEV_MODE` so integration tests can exercise the full callback path
with deterministic group lists.
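
The three states of the mock variable (unset, empty, populated) could be distinguished along these lines. The helper name `parse_mock_groups` is hypothetical, and stripping whitespace around each entry is an assumption; the source only states that the list is comma-separated and that an empty value yields `[]`.

```python
import os


def parse_mock_groups(env=None):
    """Hypothetical helper: return the mocked group list, or None when unset.

    None  -> variable unset, caller should take the real keyless-DWD path.
    []    -> variable set but empty.
    [...] -> comma-separated emails, returned in order.
    """
    env = os.environ if env is None else env
    raw = env.get("GOOGLE_ADMIN_SDK_MOCK_GROUPS")
    if raw is None:
        return None
    # Assumption: tolerate spaces and skip empty segments.
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Returning `None` for the unset case keeps "no mock configured" distinct from "mock says the user has no groups", which matters for tests that exercise the login gate.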
A separate mechanism, `LOCAL_DEV_GROUPS`, is used when `LOCAL_DEV_MODE=1`
bypasses the OAuth flow entirely (so `fetch_user_groups` is never called).
`get_current_user` in `app/auth/dependencies.py` reads that JSON array and
writes it directly into `user_group_members`:
```bash
export LOCAL_DEV_GROUPS='[{"id":"engineers@example.com","name":"Engineering"},{"id":"admins@example.com","name":"Admins"}]'
```
The value is a JSON array of objects matching the production shape (`{"id", "name"}`). `get_current_user` in `app/auth/dependencies.py` writes the parsed list into `user_group_members` on every dev-bypass request.
`docker-compose.local-dev.yml` carries a commented example at the right
escape level for Compose YAML. **Never set this in production** — the
variable is only honoured when `LOCAL_DEV_MODE=1`.
`docker-compose.local-dev.yml` carries a commented example at the right escape level for Compose YAML. **Never set this in production** — the variable is only honored when `LOCAL_DEV_MODE=1`.
## Verifying the setup
## Debugging
`scripts/debug/probe_google_groups.py` — stdlib, takes a Playground-issued OAuth access token + email, hits 6 candidate endpoints, prints raw response. Use this **before** changing the production query — saves a deploy cycle per attempt.
After the Terraform apply, and with the subject seeded into `.env`, run on the VM:
```bash
python3 scripts/debug/probe_google_groups.py "ya29.…" user@keboola.com
sudo docker exec agnes-app-1 python -c "
from app.auth.group_sync import fetch_user_groups
print(fetch_user_groups('user@your-domain.com'))
"
```
Token via [OAuth 2.0 Playground](https://developers.google.com/oauthplayground/) → gear icon → own credentials → request the three scopes (`cloud-identity.groups.readonly`, `cloud-identity.groups`, `admin.directory.group.readonly`) → exchange code → copy access token.
Expected: a Python list of group emails. `[]` means either the user has no
groups or the function failed soft; check
`docker logs agnes-app-1 2>&1 | grep "group sync\|group fetch\|Admin SDK"`
for the actual reason.
Common failure modes:
- `... GOOGLE_ADMIN_SDK_SUBJECT not set; skipping group fetch` — env var
missing.
- `... Admin SDK init failed: ...` — DWD entry missing or wrong client ID,
Admin SDK API disabled, or `tokenCreator` IAM binding missing.
- `... Group fetch failed for X: HttpError 403 Not Authorized to access
this resource/api` — the impersonated subject does not have directory
read privileges in Workspace.
## Custom (admin-managed) groups
Admins can still create / rename / delete groups manually via
`/admin/groups`. Two caveats vs. the prefix-mapped flow:
- A renamed group's primary key (`id`) stays put, but DuckDB's UNIQUE
constraint on `name` combined with the FK from
`user_group_members.group_id` makes renaming a populated group awkward:
the operator must clear members and grants first, rename, then re-add
them. This is a documented limitation; the same constraint is why the
prefix-mapping design stores the email as `name` instead of using a
separate `external_id` column.
- System groups (`Admin`, `Everyone`) refuse renames at the repository
level regardless of `created_by` — those names are referenced from
code (`app.auth.access`, marketplace filter, the email-mapping check)
and must not move.
## Why not the simpler approaches
Earlier iterations tried two simpler paths that did not work in every
deployment:
- **User OAuth token + Cloud Identity API + `groups.security` label**.
Worked at one tenant where every group carried the `security` label, but
returned `403 Error 4013` at another where group label coverage differs.
Tenant-dependent, so dropped from the codebase.
- **VM SA + Cloud Identity `searchTransitiveGroups` with admin role**.
Requires assigning a Workspace admin role to the SA, which several
Workspace tenants block for cross-tenant service accounts (`prj-*` SAs
living under a different Cloud Organization than the Workspace customer
ID). DWD is the documented way around that.
Keyless DWD is the path that works regardless of tenant configuration and
keeps zero key material on the host.

View file

@ -3,10 +3,10 @@
The marketplace endpoint aggregates plugins from every registered marketplace
and returns only those the caller is allowed to see. Access is resolved
uniformly through ``resource_grants`` (resource_type='marketplace_plugin'):
the caller sees the distinct plugins granted to any of their groups
(Everyone is implicit; Admin is just one of those groups here, and there is no
god-mode shortcut for the marketplace feed, so admins curate their own
view by granting plugins to the Admin group).
the caller sees the distinct plugins granted to any of their groups. There
is no implicit Everyone membership and no god-mode shortcut for the
marketplace feed; admins curate their own view by granting plugins to a
group they belong to (Admin or otherwise).
Plugins from different marketplaces that happen to share a name are NOT the
same plugin; the caller needs both. We therefore prefix every plugin name
@ -75,10 +75,9 @@ def resolve_allowed_plugins(
root = get_marketplaces_dir()
# Distinct (marketplace_id, plugin_name) across all of the user's
# groups (Everyone is implicit via _user_group_ids). If two groups
# grant the same plugin, it still appears once. Admin is treated as
# a regular group — admins get only the plugins their groups have
# been granted.
# groups. If two groups grant the same plugin, it still appears
# once. Admin is treated as a regular group — admins get only the
# plugins their groups have been granted.
group_ids = _user_group_ids(user_id, conn) if user_id else set()
if not group_ids:
return []
@ -126,9 +125,8 @@ def resolve_user_groups(
granted me visibility into this plugin set?" without opening the admin UI.
Membership semantics mirror ``app.auth.access._user_group_ids``:
Everyone is implicit and is returned even if the explicit
``user_group_members`` row is missing (fresh-install / mis-seeded
fixture safety).
only real ``user_group_members`` rows are surfaced; there is no
implicit Everyone membership.
"""
user_id = user.get("id")
if not user_id:

View file

@ -8,9 +8,11 @@ table:
memberships on every login (DELETE+INSERT scoped to
this source).
- ``admin``: admin UI/CLI manual additions; survives Google sync.
- ``system_seed``: deploy-time seeds (Admin grant for SEED_ADMIN_EMAIL,
Everyone for every new user); survives Google sync
and refuses removal via the admin path.
- ``system_seed``: deploy-time seeds (Admin grant for SEED_ADMIN_EMAIL);
survives Google sync and refuses removal via the
admin path. The auto-Everyone seed for every new
user was removed when Google-prefix mapping landed;
explicit grants only.
The ``replace_google_sync_groups`` method is the bulk operation called from
the OAuth callback; ``add_member`` / ``remove_member`` cover admin actions.
@ -157,3 +159,23 @@ class UserGroupMembersRepository:
[group_id],
).fetchone()
return int(row[0]) if row else 0
def has_any_google_sync_membership(self, user_id: str) -> bool:
"""Whether the user has any prior `source='google_sync'` row.
Lets the OAuth callback distinguish a brand-new login (where an
empty fetch from the Admin SDK might mean the user genuinely has
no Workspace groups) from a returning user with a previously
cached membership snapshot. Returning users pass through on an
empty fetch (transient API failures must not lock them out). The
current callback also passes through a first login with no
cached snapshot, so this helper is presently diagnostic only; it
is kept so a future tightening of the gate can flip the branch
without a new query path.
"""
row = self.conn.execute(
"SELECT 1 FROM user_group_members "
"WHERE user_id = ? AND source = 'google_sync' LIMIT 1",
[user_id],
).fetchone()
return row is not None

View file

@ -39,17 +39,18 @@ class UserRepository:
role: str = "analyst",
password_hash: Optional[str] = None,
) -> None:
"""Create a user and add them to the Everyone system group.
"""Create a user. Group memberships are populated separately.
``role`` is accepted for legacy API compatibility (some callers still
pass it) but the value is written to the deprecated ``users.role``
column only; authorization no longer reads it. New users are
automatically members of Everyone via ``user_group_members``;
explicit Admin grants are issued separately by SEED_ADMIN_EMAIL or
admin UI.
column only; authorization no longer reads it. New users are NOT
auto-added to Everyone: the implicit membership was removed when
Google-prefix mapping landed, because access deployments need every
membership to be traceable to a real source (admin grant, Google
sync, or explicit system seed). If you need the previous "every
new user is in Everyone" behavior, add a `system_seed` row in the
caller after `create`.
"""
from src.db import SYSTEM_EVERYONE_GROUP
now = datetime.now(timezone.utc)
self.conn.execute(
"""INSERT INTO users (id, email, name, role, password_hash, created_at, updated_at)
@ -57,23 +58,6 @@ class UserRepository:
[id, email, name, role, password_hash, now, now],
)
# Auto-add to Everyone. Skip silently if Everyone is missing — that's
# only possible during fresh-install bootstrap before the seed runs;
# _seed_system_groups makes the row idempotently on next connect.
everyone = self.conn.execute(
"SELECT id FROM user_groups WHERE name = ?", [SYSTEM_EVERYONE_GROUP],
).fetchone()
if everyone:
try:
self.conn.execute(
"""INSERT INTO user_group_members
(user_id, group_id, source, added_by)
VALUES (?, ?, 'system_seed', 'user_repo.create')""",
[id, everyone[0]],
)
except duckdb.ConstraintException:
pass # already a member (re-create after delete?)
def update(self, id: str, **kwargs) -> None:
# `groups` was a v12-era column dropped in v13; fresh installs run
# `_SYSTEM_SCHEMA` only and never have it, so listing it here would

View file

@ -0,0 +1,432 @@
"""Tests for the Google-prefix mapping + system-group routing.
Covers:
- prefix filter (only `grp_acme_*` rows survive into user_group_members)
- login gate (302 when prefix is set and no Workspace group matches)
- system-group mapping (admin/everyone Workspace email → seeded
Admin/Everyone row instead of a fresh user_groups insert)
- idempotency (second login produces the same memberships)
- API guard `_is_google_managed` + 409 google_managed_readonly
"""
from __future__ import annotations
from types import SimpleNamespace
from unittest.mock import AsyncMock
import pytest
from fastapi.testclient import TestClient
@pytest.fixture
def google_callback_env(tmp_path, monkeypatch):
"""TestClient for the Google callback wired against monkeypatched OAuth.
Patches `is_available`, `oauth.google.authorize_access_token`, and
`app.auth.group_sync.fetch_user_groups` so no real network traffic is
required. The callback's domain check accepts `tester@example.com`
because no `allowed_domains` is configured by default in tests.
Per-test setup: monkeypatch the prefix/admin/everyone env vars and the
`fetch_user_groups` return value before issuing the callback request.
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
from app.main import create_app
import app.auth.providers.google as g_mod
monkeypatch.setattr(g_mod, "is_available", lambda: True)
fake_oauth_google = SimpleNamespace(
authorize_access_token=AsyncMock(
return_value={
"userinfo": {
"email": "tester@example.com",
"name": "Tester",
}
}
)
)
monkeypatch.setattr(g_mod.oauth, "google", fake_oauth_google, raising=False)
app = create_app()
return {
"client": TestClient(app, follow_redirects=False),
"monkeypatch": monkeypatch,
"g_mod": g_mod,
}
def _set_fetch(monkeypatch, groups):
import app.auth.group_sync as gs_mod
monkeypatch.setattr(gs_mod, "fetch_user_groups", lambda email: list(groups))
def _system_db():
from src.db import get_system_db
return get_system_db()
class TestPrefixFilter:
def test_prefix_filter_keeps_only_matching_groups(self, google_callback_env):
env = google_callback_env
env["monkeypatch"].setenv("AGNES_GOOGLE_GROUP_PREFIX", "grp_acme_")
_set_fetch(env["monkeypatch"], [
"grp_acme_finance@example.com",
"grp_acme_eng@example.com",
"grp_other@example.com",
"acme-everyone@example.com",
"drinks@example.com",
])
resp = env["client"].get("/auth/google/callback?code=x&state=y")
assert resp.status_code == 302
assert resp.headers["location"] == "/dashboard"
conn = _system_db()
try:
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupsRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
user = UserRepository(conn).get_by_email("tester@example.com")
assert user is not None
group_ids = UserGroupMembersRepository(conn).list_groups_for_user(
user["id"]
)
ug = UserGroupsRepository(conn)
names = sorted(ug.get(gid)["name"] for gid in group_ids)
assert names == [
"grp_acme_eng@example.com",
"grp_acme_finance@example.com",
]
for n in names:
assert ug.get_by_name(n)["created_by"] == "system:google-sync"
finally:
conn.close()
def test_prefix_set_no_match_redirects_to_login_error(
self, google_callback_env
):
env = google_callback_env
env["monkeypatch"].setenv("AGNES_GOOGLE_GROUP_PREFIX", "grp_acme_")
_set_fetch(env["monkeypatch"], [
"drinks@example.com",
"acme-everyone@example.com",
])
resp = env["client"].get("/auth/google/callback?code=x&state=y")
# Bare RedirectResponse defaults to 307 (matches the other error
# redirects in google.py — domain_not_allowed, oauth_failed, etc.).
assert resp.status_code in (302, 307)
assert resp.headers["location"] == "/login?error=not_in_allowed_group"
# No group memberships were written for the user (the gate fired
# before replace_google_sync_groups). The user row may exist
# because user creation happens before the gate — that's the
# documented behavior; admins can mark the row inactive if they
# want a hard block.
conn = _system_db()
try:
from src.repositories.users import UserRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
user = UserRepository(conn).get_by_email("tester@example.com")
if user:
groups = UserGroupMembersRepository(conn).list_groups_for_user(
user["id"]
)
assert groups == []
finally:
conn.close()
def test_no_prefix_means_legacy_behavior(self, google_callback_env):
"""Without AGNES_GOOGLE_GROUP_PREFIX, every fetched group is mirrored."""
env = google_callback_env
env["monkeypatch"].delenv("AGNES_GOOGLE_GROUP_PREFIX", raising=False)
_set_fetch(env["monkeypatch"], [
"grp_a@example.com",
"grp_b@example.com",
])
resp = env["client"].get("/auth/google/callback?code=x&state=y")
assert resp.status_code == 302
assert resp.headers["location"] == "/dashboard"
conn = _system_db()
try:
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupsRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
user = UserRepository(conn).get_by_email("tester@example.com")
group_ids = UserGroupMembersRepository(conn).list_groups_for_user(
user["id"]
)
names = sorted(
UserGroupsRepository(conn).get(gid)["name"] for gid in group_ids
)
assert names == ["grp_a@example.com", "grp_b@example.com"]
finally:
conn.close()
class TestSystemMapping:
def test_admin_email_routes_to_seeded_admin_row(self, google_callback_env):
env = google_callback_env
env["monkeypatch"].setenv("AGNES_GOOGLE_GROUP_PREFIX", "grp_acme_")
env["monkeypatch"].setenv(
"AGNES_GROUP_ADMIN_EMAIL", "grp_acme_admin@example.com"
)
_set_fetch(env["monkeypatch"], [
"grp_acme_admin@example.com",
"grp_acme_finance@example.com",
])
env["client"].get("/auth/google/callback?code=x&state=y")
conn = _system_db()
try:
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupsRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
ug = UserGroupsRepository(conn)
# Crucially: no separate user_groups row was created with the
# full admin email as `name`. Membership lands in the seeded
# Admin row instead.
assert ug.get_by_name("grp_acme_admin@example.com") is None
admin_row = ug.get_by_name("Admin")
assert admin_row is not None and admin_row["is_system"] is True
user = UserRepository(conn).get_by_email("tester@example.com")
group_ids = UserGroupMembersRepository(conn).list_groups_for_user(
user["id"]
)
assert admin_row["id"] in group_ids
# Finance group still goes through ensure() and creates a fresh row.
finance = ug.get_by_name("grp_acme_finance@example.com")
assert finance is not None
assert finance["is_system"] is False
assert finance["created_by"] == "system:google-sync"
assert finance["id"] in group_ids
finally:
conn.close()
def test_everyone_email_routes_to_seeded_everyone_row(
self, google_callback_env
):
env = google_callback_env
env["monkeypatch"].setenv("AGNES_GOOGLE_GROUP_PREFIX", "grp_acme_")
env["monkeypatch"].setenv(
"AGNES_GROUP_EVERYONE_EMAIL", "grp_acme_everyone@example.com"
)
_set_fetch(env["monkeypatch"], [
"grp_acme_everyone@example.com",
])
env["client"].get("/auth/google/callback?code=x&state=y")
conn = _system_db()
try:
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupsRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
ug = UserGroupsRepository(conn)
assert ug.get_by_name("grp_acme_everyone@example.com") is None
everyone_row = ug.get_by_name("Everyone")
assert everyone_row is not None
assert everyone_row["is_system"] is True
user = UserRepository(conn).get_by_email("tester@example.com")
group_ids = UserGroupMembersRepository(conn).list_groups_for_user(
user["id"]
)
assert everyone_row["id"] in group_ids
finally:
conn.close()
class TestIdempotency:
def test_second_login_does_not_duplicate_groups(self, google_callback_env):
env = google_callback_env
env["monkeypatch"].setenv("AGNES_GOOGLE_GROUP_PREFIX", "grp_acme_")
_set_fetch(env["monkeypatch"], [
"grp_acme_finance@example.com",
])
env["client"].get("/auth/google/callback?code=x&state=y")
env["client"].get("/auth/google/callback?code=x&state=y")
conn = _system_db()
try:
from src.repositories.users import UserRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
user = UserRepository(conn).get_by_email("tester@example.com")
group_ids = UserGroupMembersRepository(conn).list_groups_for_user(
user["id"]
)
# Exactly one membership: same group, deduplicated by the
# (user_id, group_id) PK in user_group_members.
assert len(group_ids) == 1
# Exactly one user_groups row for that name: ensure() is
# get-or-create, so the second login picks up the existing row.
count = conn.execute(
"SELECT COUNT(*) FROM user_groups WHERE name = ?",
["grp_acme_finance@example.com"],
).fetchone()[0]
assert count == 1
finally:
conn.close()
class TestIsGoogleManagedFlag:
"""Exercises the `_is_google_managed` rule used by GroupResponse +
the API guard."""
def test_google_sync_row_is_managed(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
from app.api.access import _is_google_managed
g = {
"name": "grp_acme_x@example.com",
"is_system": False,
"created_by": "system:google-sync",
}
assert _is_google_managed(g) is True
def test_system_admin_with_env_mapping_is_managed(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv(
"AGNES_GROUP_ADMIN_EMAIL", "grp_acme_admin@example.com"
)
from app.api.access import _is_google_managed
g = {"name": "Admin", "is_system": True, "created_by": "system:seed"}
assert _is_google_managed(g) is True
def test_system_admin_without_env_mapping_is_not_managed(
self, tmp_path, monkeypatch
):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.delenv("AGNES_GROUP_ADMIN_EMAIL", raising=False)
monkeypatch.delenv("AGNES_GROUP_EVERYONE_EMAIL", raising=False)
from app.api.access import _is_google_managed
g = {"name": "Admin", "is_system": True, "created_by": "system:seed"}
assert _is_google_managed(g) is False
def test_manual_custom_group_is_not_managed(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
from app.api.access import _is_google_managed
g = {
"name": "data-team",
"is_system": False,
"created_by": "alice@example.com",
}
assert _is_google_managed(g) is False
class TestApiGuard:
"""API endpoints reject mutations on Google-managed groups with 409."""
@pytest.fixture
def admin_client(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
monkeypatch.setenv(
"AGNES_GROUP_ADMIN_EMAIL", "grp_acme_admin@example.com"
)
from app.main import create_app
from src.db import get_system_db
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupsRepository
from src.repositories.user_group_members import (
UserGroupMembersRepository,
)
from app.auth.jwt import create_access_token
conn = get_system_db()
try:
ur = UserRepository(conn)
ur.create(id="admin1", email="admin@x", name="Admin1", role="admin")
ur.create(id="u1", email="u1@x", name="U1", role="analyst")
ug = UserGroupsRepository(conn)
admin_id = ug.get_by_name("Admin")["id"]
UserGroupMembersRepository(conn).add_member(
"admin1", admin_id, source="system_seed",
)
# A google-sync group to act on.
ug.ensure("grp_acme_finance@example.com")
finally:
conn.close()
app = create_app()
client = TestClient(app, follow_redirects=False)
token = create_access_token("admin1", "admin@x", "")
client.cookies.set("access_token", token)
return client
def _gid(self, name):
from src.db import get_system_db
from src.repositories.user_groups import UserGroupsRepository
conn = get_system_db()
try:
return UserGroupsRepository(conn).get_by_name(name)["id"]
finally:
conn.close()
def test_patch_google_managed_returns_409(self, admin_client):
gid = self._gid("grp_acme_finance@example.com")
r = admin_client.patch(
f"/api/admin/groups/{gid}",
json={"name": "renamed"},
)
assert r.status_code == 409
body = r.json()
# FastAPI wraps the dict detail under "detail"; assert the code is
# surfaced for the UI's machine-readable branch.
assert body["detail"]["code"] == "google_managed_readonly"
def test_delete_google_managed_returns_409(self, admin_client):
gid = self._gid("grp_acme_finance@example.com")
r = admin_client.delete(f"/api/admin/groups/{gid}")
assert r.status_code == 409
assert r.json()["detail"]["code"] == "google_managed_readonly"
def test_add_member_to_google_managed_returns_409(self, admin_client):
gid = self._gid("grp_acme_finance@example.com")
r = admin_client.post(
f"/api/admin/groups/{gid}/members",
json={"email": "u1@x"},
)
assert r.status_code == 409
assert r.json()["detail"]["code"] == "google_managed_readonly"
def test_patch_admin_with_env_mapping_returns_409(self, admin_client):
# AGNES_GROUP_ADMIN_EMAIL is set in the fixture → seeded Admin row
# is treated as Google-managed and rejects updates (here a
# description change) too.
gid = self._gid("Admin")
r = admin_client.patch(
f"/api/admin/groups/{gid}",
json={"description": "updated"},
)
assert r.status_code == 409
assert r.json()["detail"]["code"] == "google_managed_readonly"
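
The behavior these tests pin down (prefix filter, login gate, system-row mapping, and the Google-managed rule) can be summarized in a minimal sketch. This is not the code in `app/auth/` or `app/api/access.py`: the function names `plan_group_sync` and `is_google_managed_sketch` are hypothetical, and details such as exact-match comparison of the admin/everyone emails and returning an empty list when no prefix is configured are assumptions inferred from the assertions above.

```python
from __future__ import annotations

import os

# Names of the seeded system rows, as asserted in the tests.
ADMIN_ROW = "Admin"
EVERYONE_ROW = "Everyone"


def plan_group_sync(fetched: list[str]) -> list[str] | None:
    """Return the user_groups names a login would sync to.

    Returns None when a prefix is configured and no fetched group matches,
    i.e. the case where the callback redirects to
    /login?error=not_in_allowed_group. (Hypothetical helper.)
    """
    prefix = os.environ.get("AGNES_GOOGLE_GROUP_PREFIX", "")
    admin_email = os.environ.get("AGNES_GROUP_ADMIN_EMAIL", "")
    everyone_email = os.environ.get("AGNES_GROUP_EVERYONE_EMAIL", "")

    if prefix:
        kept = [g for g in fetched if g.startswith(prefix)]
        if not kept:
            return None  # login gate fires
    else:
        kept = list(fetched)  # legacy behavior: mirror everything

    targets: list[str] = []
    for email in kept:
        if admin_email and email == admin_email:
            name = ADMIN_ROW  # routed onto the seeded Admin row
        elif everyone_email and email == everyone_email:
            name = EVERYONE_ROW  # routed onto the seeded Everyone row
        else:
            name = email  # ordinary group, mirrored under its email
        if name not in targets:
            targets.append(name)
    return targets


def is_google_managed_sketch(group: dict) -> bool:
    """Approximation of the rule exercised by TestIsGoogleManagedFlag."""
    if group.get("created_by") == "system:google-sync":
        return True
    if group.get("is_system"):
        # Assumption: a system row counts as managed only when its own
        # mapping env var is set.
        env_name = {
            ADMIN_ROW: "AGNES_GROUP_ADMIN_EMAIL",
            EVERYONE_ROW: "AGNES_GROUP_EVERYONE_EMAIL",
        }.get(group.get("name"))
        return bool(env_name and os.environ.get(env_name))
    return False
```

Keeping the planning step free of database access makes the gate decision (the `None` result) easy to check separately from the repository writes.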


@ -2,8 +2,6 @@
from __future__ import annotations
import sys
from types import SimpleNamespace
from unittest import mock
import pytest
@ -45,163 +43,197 @@ class TestMockFlag:
# ---------------------------------------------------------------------------
# Real path (monkeypatched Google client)
# Real path (monkeypatched Google client) — keyless-DWD + Admin SDK shape
# ---------------------------------------------------------------------------
def _make_service_mock(pages: list[dict]) -> mock.Mock:
"""Build a mock for `service.groups().memberships().searchTransitiveGroups(...).execute()`
that returns the given pages in order."""
def _make_admin_service_mock(pages: list[dict]):
"""Mock for ``service.groups().list(...).execute()`` that yields ``pages``
in order. Returns ``(service, list_call)`` so tests can assert on the
call kwargs."""
page_iter = iter(pages)
def execute_side_effect(*_a, **_kw):
return next(page_iter)
search = mock.Mock()
search.return_value.execute.side_effect = execute_side_effect
memberships = mock.Mock()
memberships.return_value.searchTransitiveGroups = search
list_call = mock.Mock()
list_call.return_value.execute.side_effect = execute_side_effect
groups = mock.Mock()
groups.return_value.memberships = memberships
groups.return_value.list = list_call
service = mock.Mock()
service.groups = groups
return service, search
return service, list_call
@pytest.fixture
def real_path_env(monkeypatch):
"""Common setup: ensure mock-env is unset, subject + SA explicit (no
metadata-server call), and stub `google.auth.default` + `iam.Signer` +
`service_account.Credentials` so the SDK init never reaches Google."""
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
monkeypatch.setenv("GOOGLE_ADMIN_SDK_SUBJECT", "admin@example.com")
monkeypatch.setenv("GOOGLE_ADMIN_SDK_SA_EMAIL", "sa@example.iam.gserviceaccount.com")
monkeypatch.setattr(
"google.auth.default", lambda *a, **kw: (mock.Mock(), "test-project")
)
monkeypatch.setattr(
"google.auth.iam.Signer", lambda *a, **kw: mock.Mock()
)
monkeypatch.setattr(
"google.oauth2.service_account.Credentials",
lambda **kw: mock.Mock(),
)
class TestRealPath:
def test_success_single_page(self, monkeypatch):
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
service, search = _make_service_mock(
def test_success_single_page(self, monkeypatch, real_path_env):
service, list_call = _make_admin_service_mock(
[
{
"memberships": [
{"groupKey": {"id": "grp_a@groupon.com"}},
{"groupKey": {"id": "grp_b@groupon.com"}},
"groups": [
{"email": "grp_a@groupon.com", "name": "A"},
{"email": "grp_b@groupon.com", "name": "B"},
]
# no nextPageToken
}
]
)
monkeypatch.setattr(
"google.auth.default",
lambda scopes=None: (mock.Mock(), "test-project"),
)
monkeypatch.setattr(
"googleapiclient.discovery.build",
lambda *a, **kw: service,
"googleapiclient.discovery.build", lambda *a, **kw: service
)
from app.auth.group_sync import fetch_user_groups
result = fetch_user_groups("user@groupon.com")
assert result == ["grp_a@groupon.com", "grp_b@groupon.com"]
# CEL query contains email + discussion_forum label filter
call_kwargs = search.call_args.kwargs
assert call_kwargs["parent"] == "groups/-"
assert "member_key_id == 'user@groupon.com'" in call_kwargs["query"]
assert "discussion_forum" in call_kwargs["query"]
call_kwargs = list_call.call_args.kwargs
assert call_kwargs["userKey"] == "user@groupon.com"
assert call_kwargs["pageToken"] is None
def test_success_paginated(self, monkeypatch):
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
service, search = _make_service_mock(
def test_success_paginated(self, monkeypatch, real_path_env):
service, list_call = _make_admin_service_mock(
[
{
"memberships": [{"groupKey": {"id": "page1@x"}}],
"groups": [{"email": "page1@x"}],
"nextPageToken": "tok1",
},
{
"memberships": [{"groupKey": {"id": "page2@x"}}],
"groups": [{"email": "page2@x"}],
# terminal
},
]
)
monkeypatch.setattr(
"google.auth.default",
lambda scopes=None: (mock.Mock(), "test-project"),
)
monkeypatch.setattr(
"googleapiclient.discovery.build",
lambda *a, **kw: service,
"googleapiclient.discovery.build", lambda *a, **kw: service
)
from app.auth.group_sync import fetch_user_groups
result = fetch_user_groups("u@x")
assert result == ["page1@x", "page2@x"]
assert list_call.call_args_list[1].kwargs["pageToken"] == "tok1"
# Second call should have pageToken=tok1
assert search.call_args_list[1].kwargs["pageToken"] == "tok1"
def test_api_exception_returns_empty(self, monkeypatch):
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
def raise_boom(*a, **kw):
raise RuntimeError("boom")
def test_api_exception_returns_empty(self, monkeypatch, real_path_env):
service = mock.Mock()
service.groups.return_value.memberships.return_value.searchTransitiveGroups.return_value.execute.side_effect = raise_boom
monkeypatch.setattr(
"google.auth.default",
lambda scopes=None: (mock.Mock(), "test-project"),
service.groups.return_value.list.return_value.execute.side_effect = (
RuntimeError("boom")
)
monkeypatch.setattr(
"googleapiclient.discovery.build",
lambda *a, **kw: service,
"googleapiclient.discovery.build", lambda *a, **kw: service
)
from app.auth.group_sync import fetch_user_groups
assert fetch_user_groups("user@x") == []
def test_client_init_exception_returns_empty(self, monkeypatch):
"""Errors before the API call (ADC, discovery.build) also fail-soft."""
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
def test_client_init_exception_returns_empty(self, monkeypatch, real_path_env):
"""Errors before the API call (ADC, signer, build) also fail-soft."""
def boom(*a, **kw):
raise RuntimeError("no metadata server")
raise RuntimeError("adc unavailable")
monkeypatch.setattr("google.auth.default", boom)
from app.auth.group_sync import fetch_user_groups
assert fetch_user_groups("user@x") == []
def test_memberships_without_groupkey_are_skipped(self, monkeypatch):
"""Defensive: a malformed membership missing groupKey.id must not crash."""
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
service, _ = _make_service_mock(
def test_groups_without_email_are_skipped(self, monkeypatch, real_path_env):
"""Defensive: a malformed group entry missing 'email' must not crash."""
service, _ = _make_admin_service_mock(
[
{
"memberships": [
{"groupKey": {"id": "good@x"}},
{"groupKey": {}}, # missing id
{}, # missing groupKey
"groups": [
{"email": "good@x", "name": "Good"},
{"name": "no email"},
{},
]
}
]
)
monkeypatch.setattr(
"google.auth.default",
lambda scopes=None: (mock.Mock(), "test-project"),
)
monkeypatch.setattr(
"googleapiclient.discovery.build",
lambda *a, **kw: service,
"googleapiclient.discovery.build", lambda *a, **kw: service
)
from app.auth.group_sync import fetch_user_groups
assert fetch_user_groups("u@x") == ["good@x"]
def test_email_with_quote_is_escaped(self, monkeypatch):
"""A single quote in the email must not break the CEL query."""
# ---------------------------------------------------------------------------
# Pre-flight env / metadata checks (fail-soft when config is missing)
# ---------------------------------------------------------------------------
class TestPreflightFailSoft:
def test_missing_subject_returns_empty(self, monkeypatch):
"""Without GOOGLE_ADMIN_SDK_SUBJECT we cannot impersonate — bail."""
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
service, search = _make_service_mock([{"memberships": []}])
monkeypatch.setattr(
"google.auth.default",
lambda scopes=None: (mock.Mock(), "test-project"),
monkeypatch.delenv("GOOGLE_ADMIN_SDK_SUBJECT", raising=False)
monkeypatch.setenv("GOOGLE_ADMIN_SDK_SA_EMAIL", "sa@x.iam")
from app.auth.group_sync import fetch_user_groups
assert fetch_user_groups("u@x") == []
def test_missing_sa_and_no_metadata_returns_empty(self, monkeypatch):
"""No explicit SA + metadata server unreachable → bail."""
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
monkeypatch.delenv("GOOGLE_ADMIN_SDK_SA_EMAIL", raising=False)
monkeypatch.setenv("GOOGLE_ADMIN_SDK_SUBJECT", "admin@example.com")
# Force the metadata fetch to fail.
def boom(*a, **kw):
raise OSError("no route to metadata")
monkeypatch.setattr("urllib.request.urlopen", boom)
from app.auth.group_sync import fetch_user_groups
assert fetch_user_groups("u@x") == []
def test_explicit_sa_email_used(self, monkeypatch):
"""When GOOGLE_ADMIN_SDK_SA_EMAIL is set, metadata server is bypassed."""
monkeypatch.delenv("GOOGLE_ADMIN_SDK_MOCK_GROUPS", raising=False)
monkeypatch.setenv("GOOGLE_ADMIN_SDK_SUBJECT", "admin@example.com")
monkeypatch.setenv(
"GOOGLE_ADMIN_SDK_SA_EMAIL", "explicit@x.iam.gserviceaccount.com"
)
# Capture what email Signer is called with.
captured: dict[str, str] = {}
def fake_signer(_request, _source, sa_email):
captured["sa"] = sa_email
return mock.Mock()
monkeypatch.setattr(
"googleapiclient.discovery.build",
lambda *a, **kw: service,
"google.auth.default", lambda *a, **kw: (mock.Mock(), "p")
)
monkeypatch.setattr("google.auth.iam.Signer", fake_signer)
monkeypatch.setattr(
"google.oauth2.service_account.Credentials",
lambda **kw: mock.Mock(),
)
service, _ = _make_admin_service_mock([{"groups": []}])
monkeypatch.setattr(
"googleapiclient.discovery.build", lambda *a, **kw: service
)
from app.auth.group_sync import fetch_user_groups
fetch_user_groups("o'reilly@x")
assert "\\'" in search.call_args.kwargs["query"]
fetch_user_groups("u@x")
assert captured["sa"] == "explicit@x.iam.gserviceaccount.com"
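
The pagination and fail-soft behavior asserted in `TestRealPath` can be sketched as a loop over `service.groups().list(...)`. This is an illustration under stated assumptions, not the body of `fetch_user_groups`: the function name `list_group_emails` and the helper `make_fake_service` are hypothetical, and the real function also builds credentials (ADC, IAM signer, domain-wide delegation) before it reaches this point.

```python
from unittest import mock


def list_group_emails(service, user_email: str) -> list[str]:
    """Collect group emails for a user across all pages; [] on any error."""
    emails: list[str] = []
    page_token = None
    try:
        while True:
            resp = (
                service.groups()
                .list(userKey=user_email, pageToken=page_token)
                .execute()
            )
            for group in resp.get("groups", []):
                email = group.get("email")
                if email:  # skip malformed entries without an email
                    emails.append(email)
            page_token = resp.get("nextPageToken")
            if not page_token:
                return emails
    except Exception:
        # Fail-soft: a broken sync must not block login outright.
        return []


def make_fake_service(pages: list[dict]) -> mock.Mock:
    """Hypothetical stand-in for the Admin SDK client, yielding pages in order."""
    page_iter = iter(pages)
    service = mock.Mock()
    service.groups.return_value.list.return_value.execute.side_effect = (
        lambda *a, **kw: next(page_iter)
    )
    return service
```

Passing the service object in, rather than building it inside the function, keeps the loop testable with a plain `mock.Mock` and no patching of `googleapiclient.discovery.build`.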


@ -90,15 +90,25 @@ class TestResolveAllowedPlugins:
prefixed = {p["prefixed_name"] for p in result}
assert prefixed == {"mkt-a-p1", "mkt-b-p3"}
def test_everyone_grants_visible_to_all(self, db_conn):
def test_everyone_grants_require_explicit_membership(self, db_conn):
# Auto-Everyone removal: a freshly-created user is no longer
# implicitly a member of Everyone, so a grant on Everyone is
# invisible until the user is added as an explicit member.
from src.marketplace_filter import resolve_allowed_plugins
t = datetime.now(timezone.utc)
_register_marketplace(db_conn, id="mkt", registered_at=t,
plugins=[{"name": "public", "version": "1.0"}])
everyone_gid = db_conn.execute("SELECT id FROM user_groups WHERE name='Everyone'").fetchone()[0]
everyone_gid = db_conn.execute(
"SELECT id FROM user_groups WHERE name='Everyone'"
).fetchone()[0]
_grant(db_conn, group_id=everyone_gid, marketplace="mkt", plugin="public")
_make_user(db_conn, user_id="u1", email="u1@x")
# No membership written → no plugin visible.
assert resolve_allowed_plugins(db_conn, {"id": "u1"}) == []
# After explicit membership the grant resolves.
_add_member(db_conn, user_id="u1", group_id=everyone_gid)
result = resolve_allowed_plugins(db_conn, {"id": "u1"})
assert [p["prefixed_name"] for p in result] == ["mkt-public"]
@ -154,13 +164,14 @@ class TestResolveAllowedPlugins:
order = [p["marketplace_id"] for p in result]
assert order == ["earlier-mkt", "later-mkt"]
def test_user_with_unknown_group_sees_nothing(self, db_conn):
def test_user_with_no_groups_sees_nothing(self, db_conn):
from src.marketplace_filter import resolve_allowed_plugins
t = datetime.now(timezone.utc)
_register_marketplace(db_conn, id="mkt", registered_at=t,
plugins=[{"name": "p", "version": "1"}])
_make_user(db_conn, user_id="u-nogroup", email="ng@x")
# Has only Everyone (auto-membership) but no grants on Everyone.
# Auto-Everyone removal: a brand-new user has zero memberships and
# therefore sees nothing regardless of what's granted on Everyone.
result = resolve_allowed_plugins(db_conn, {"id": "u-nogroup"})
assert result == []
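
The effect of removing auto-Everyone membership on grant resolution can be illustrated with a small in-memory model. This is a sketch only: `visible_plugins` and its tuple-based inputs are hypothetical and stand in for the `user_group_members` and grant tables that `resolve_allowed_plugins` reads.

```python
def visible_plugins(
    user_id: str,
    memberships: set[tuple[str, str]],
    grants: list[tuple[str, str]],
) -> list[str]:
    """Return plugin names granted to groups the user explicitly belongs to.

    memberships: (user_id, group_id) pairs
    grants:      (group_id, prefixed_plugin_name) pairs
    """
    user_groups = {gid for uid, gid in memberships if uid == user_id}
    # No implicit fallback: a grant on Everyone counts only if the user
    # has an explicit membership row for that group.
    return sorted({name for gid, name in grants if gid in user_groups})


grants = [("everyone-gid", "mkt-public")]
before = visible_plugins("u1", set(), grants)
after = visible_plugins("u1", {("u1", "everyone-gid")}, grants)
```

Here `before` is empty and `after` contains `mkt-public`, which mirrors the two assertions in `test_everyone_grants_require_explicit_membership`.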


@ -156,13 +156,14 @@ class TestMarketplaceInfo:
assert names == {"mkt-b-plug-y"}
assert "TestGroup" in info["groups"]
def test_user_with_no_groups_falls_back_to_everyone(self, marketplace_env):
"""Everyone has no grants here, so the list is empty but call succeeds."""
def test_user_with_no_groups_sees_empty_payload(self, marketplace_env):
"""Auto-Everyone removal: a user with zero memberships now sees an
empty groups list and zero plugins (no implicit Everyone fallback)."""
c = marketplace_env["client"]
resp = c.get("/marketplace/info", headers=_auth(marketplace_env["nogroups_token"]))
assert resp.status_code == 200
info = resp.json()
assert "Everyone" in info["groups"]
assert info["groups"] == []
assert info["plugins"] == []
def test_missing_auth_returns_401(self, marketplace_env):


@ -482,18 +482,14 @@ class TestUserGroupsRepository:
assert returned["is_system"] is True
class TestUserRepositoryEveryoneAutoMember:
"""v12: UserRepository.create adds new users to the Everyone group."""
class TestUserRepositoryNoAutoMembership:
"""Auto-Everyone was removed when Google-prefix mapping landed —
UserRepository.create no longer writes any user_group_members row."""
def test_create_adds_everyone_membership(self, db_conn):
def test_create_adds_no_memberships(self, db_conn):
from src.repositories.users import UserRepository
from src.repositories.user_group_members import UserGroupMembersRepository
repo = UserRepository(db_conn)
repo.create(id="u1", email="u1@test", name="U1")
groups = UserGroupMembersRepository(db_conn).list_groups_for_user("u1")
assert len(groups) >= 1
# Find the Everyone group ID
everyone = db_conn.execute(
"SELECT id FROM user_groups WHERE name='Everyone'"
).fetchone()
assert everyone is not None and everyone[0] in groups
assert groups == []