Google Workspace Groups in Agnes
How Agnes pulls a user's group memberships at Google sign-in and where they end up.
Google Cloud setup (per OAuth client / project)
In the GCP project hosting the OAuth client (e.g. acme-internal-prod):
- Enable Cloud Identity API: APIs & Services → Library → "Cloud Identity API" → Enable.
- OAuth consent screen → Data Access → Add or Remove Scopes: manually add https://www.googleapis.com/auth/cloud-identity.groups.readonly
- OAuth client → Authorized redirect URIs: must include https://<host>/auth/google/callback for the deployment that uses this client.
- OAuth consent screen → Audience: keep Internal (own Workspace tenant only). External triggers verification review for the sensitive Cloud Identity scope.
That's it. No service account, no domain-wide delegation, no admin role per user.
The security label trap
Cloud Identity exposes membership listing through groups/-/memberships:searchTransitiveGroups. Its query (CEL) must include a label predicate. Two label types matter:
- cloudidentity.googleapis.com/groups.discussion_forum: every Workspace group has it. Returns 403 "Insufficient permissions" for non-admin users.
- cloudidentity.googleapis.com/groups.security: only security-flagged groups have it as a top-level capability, but in practice every Keboola Workspace group also carries this label. Returns 200 with the full membership list.
Agnes therefore queries with security (in app/auth/providers/google.py):
"member_key_id == '<email>' && 'cloudidentity.googleapis.com/groups.security' in labels"
Switching to discussion_forum will silently break for everyone but Workspace admins.
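As a rough illustration of the query above, the sketch below builds the searchTransitiveGroups request with the security label and maps the response to the {"id", "name"} shape. It is not the code in app/auth/providers/google.py. The helper names (build_query, build_search_url, parse_memberships, search_groups) are hypothetical, and the response fields used (memberships, groupKey.id, displayName, nextPageToken) reflect the Cloud Identity v1 response as understood here and should be checked against the API reference.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

SEARCH_URL = (
    "https://cloudidentity.googleapis.com/v1/"
    "groups/-/memberships:searchTransitiveGroups"
)
SECURITY_LABEL = "cloudidentity.googleapis.com/groups.security"


def build_query(email: str) -> str:
    """CEL query with the mandatory label predicate (security, not discussion_forum)."""
    return f"member_key_id == '{email}' && '{SECURITY_LABEL}' in labels"


def build_search_url(email: str, page_token: str | None = None) -> str:
    """Full request URL; the query string is percent-encoded."""
    params = {"query": build_query(email)}
    if page_token:
        params["pageToken"] = page_token
    return f"{SEARCH_URL}?{urllib.parse.urlencode(params)}"


def parse_memberships(payload: dict) -> list[dict]:
    """Map membership entries to {"id", "name"}; entries without a group key are skipped."""
    groups = []
    for entry in payload.get("memberships", []):
        group_id = entry.get("groupKey", {}).get("id")
        if group_id:
            groups.append({"id": group_id, "name": entry.get("displayName", group_id)})
    return groups


def search_groups(access_token: str, email: str) -> list[dict]:
    """Follow nextPageToken until exhausted. urlopen raises HTTPError on 401/403."""
    groups: list[dict] = []
    token = None
    while True:
        request = urllib.request.Request(
            build_search_url(email, token),
            headers={"Authorization": f"Bearer {access_token}"},
        )
        with urllib.request.urlopen(request, timeout=10) as response:
            payload = json.load(response)
        groups.extend(parse_memberships(payload))
        token = payload.get("nextPageToken")
        if not token:
            return groups
```

Keeping the label in a single constant makes the discussion_forum vs. security choice visible in one place. The sketch interpolates the email into the query without escaping, so it assumes the address comes from the verified Google identity rather than from user input.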
Storage + use
app/auth/providers/google.py:google_callback runs on every Google sign-in:
- Fetch via fetch_user_groups(access_token, email) (in app/auth/group_sync.py) → list of {"id": "<email>", "name": "<displayName>"}.
- Write to the user_group_members table with source='google_sync' (DuckDB-backed, persistent across sessions).
- The previous Google-sync set is wholesale replaced (DELETE + INSERT for source='google_sync' rows), so a removed Workspace membership disappears on that same sign-in.
- Admin-added memberships (source='admin') are preserved; Google sync only touches its own rows.
- Fail-soft: if the Cloud Identity API returns an error (403, 401, network), the callback preserves existing memberships instead of wiping them. This prevents a transient API outage from silently dropping all Workspace-synced group memberships.
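The replace-and-preserve behaviour in the steps above can be sketched as follows. This is a minimal illustration, not Agnes's implementation: it uses the stdlib sqlite3 module as a stand-in for DuckDB, the column names (user_id, group_id, group_name) are assumed, and sync_google_groups is a hypothetical helper that receives None when the group fetch failed.

```python
import sqlite3
from typing import Optional

# Assumed schema for illustration only; the real table may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_group_members (
    user_id    TEXT NOT NULL,
    group_id   TEXT NOT NULL,
    group_name TEXT NOT NULL,
    source     TEXT NOT NULL
)
"""


def sync_google_groups(
    conn: sqlite3.Connection, user_id: str, groups: Optional[list]
) -> bool:
    """Replace the user's google_sync rows. Returns False when nothing was changed."""
    if groups is None:
        # Fail-soft: the fetch failed, so keep whatever is already stored.
        return False
    with conn:  # one transaction: DELETE and INSERT commit together or roll back
        conn.execute(
            "DELETE FROM user_group_members "
            "WHERE user_id = ? AND source = 'google_sync'",
            (user_id,),
        )
        conn.executemany(
            "INSERT INTO user_group_members (user_id, group_id, group_name, source) "
            "VALUES (?, ?, ?, 'google_sync')",
            [(user_id, g["id"], g["name"]) for g in groups],
        )
    return True
```

Distinguishing None (fetch failed) from an empty list (user genuinely has no groups) is what lets an outage leave rows intact while a real removal still clears them. Rows with source='admin' are never matched by the DELETE.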
The user_group_members table is the single source of truth for group memberships, used by:
- RBAC authorization (app/auth/access.py): require_resource_access checks group grants
- Admin UI (/admin/access): member lists, grant counts
- CLI (da admin group members): group membership queries
- Marketplace filtering (src/marketplace_filter.py): plugin access based on group grants
Refresh. Memberships are refreshed on every Google sign-in. A user's stale memberships persist until their next login.
Local-dev mock (no Google round-trip)
When developing on localhost with LOCAL_DEV_MODE=1, Google OAuth never runs, so group memberships would normally stay empty. Set LOCAL_DEV_GROUPS to inject a mocked membership list:
export LOCAL_DEV_GROUPS='[{"id":"engineers@example.com","name":"Engineering"},{"id":"admins@example.com","name":"Admins"}]'
The value is a JSON array of objects matching the production shape ({"id", "name"}). get_current_user in app/auth/dependencies.py writes the parsed list into user_group_members on every dev-bypass request.
docker-compose.local-dev.yml carries a commented example at the right escape level for Compose YAML. Never set this in production — the variable is only honored when LOCAL_DEV_MODE=1.
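A minimal sketch of how such a variable could be parsed, assuming the gating described above. The function name local_dev_groups is hypothetical, and returning an empty list for malformed input is an assumption of this sketch; the actual get_current_user may log or raise instead.

```python
import json
import os
from typing import Mapping


def local_dev_groups(env: Mapping[str, str] = os.environ) -> list:
    """Return mocked groups from LOCAL_DEV_GROUPS, or [] unless LOCAL_DEV_MODE=1."""
    if env.get("LOCAL_DEV_MODE") != "1":
        # The variable is ignored outside local-dev mode.
        return []
    raw = env.get("LOCAL_DEV_GROUPS", "")
    if not raw.strip():
        return []
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Assumed behaviour: malformed JSON yields no groups.
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only entries matching the production shape {"id", "name"}.
    return [
        {"id": g["id"], "name": g["name"]}
        for g in parsed
        if isinstance(g, dict)
        and isinstance(g.get("id"), str)
        and isinstance(g.get("name"), str)
    ]
```

Passing the environment as a mapping keeps the function easy to exercise in tests without mutating os.environ.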
Debugging
scripts/debug/probe_google_groups.py — stdlib, takes a Playground-issued OAuth access token + email, hits 6 candidate endpoints, prints raw response. Use this before changing the production query — saves a deploy cycle per attempt.
python3 scripts/debug/probe_google_groups.py "ya29.…" user@keboola.com
Token via OAuth 2.0 Playground → gear icon → own credentials → request the three scopes (cloud-identity.groups.readonly, cloud-identity.groups, admin.directory.group.readonly) → exchange code → copy access token.