# Google Workspace Groups in Agnes
How Agnes pulls a user's group memberships at Google sign-in and where they end up.
## Google Cloud setup (per OAuth client / project)
In the GCP project hosting the OAuth client (e.g. `acme-internal-prod`):
1. **Enable Cloud Identity API**: `APIs & Services → Library → "Cloud Identity API" → Enable`.
2. **OAuth consent screen → Data Access → Add or Remove Scopes** — manually add:
```
https://www.googleapis.com/auth/cloud-identity.groups.readonly
```
3. **OAuth client → Authorized redirect URIs** — must include `https://<host>/auth/google/callback` for the deployment that uses this client.
4. **OAuth consent screen → Audience** — keep `Internal` (own Workspace tenant only). `External` triggers verification review for the sensitive Cloud Identity scope.
That's it. No service account, no domain-wide delegation, no admin role per user.
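As a rough illustration of how the settings above fit together, the sketch below builds a Google authorization URL that requests the Cloud Identity scope alongside the usual sign-in scopes. The function name, the `client_id` and `host` values, and the extra `openid email profile` scopes are assumptions for illustration; Agnes's actual OAuth code may assemble this differently (for example through an OAuth library).

```python
from urllib.parse import urlencode

# Google's standard OAuth 2.0 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
# The scope added manually on the consent screen in step 2.
GROUPS_SCOPE = "https://www.googleapis.com/auth/cloud-identity.groups.readonly"


def build_authorization_url(client_id: str, host: str, state: str) -> str:
    """Return an authorization URL asking for sign-in scopes plus group read access.

    Hypothetical helper: the redirect URI must match one registered in step 3.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": f"https://{host}/auth/google/callback",
        "response_type": "code",
        "scope": " ".join(["openid", "email", "profile", GROUPS_SCOPE]),
        "state": state,
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"
```

If the scope is missing from the consent screen or the redirect URI is not registered, Google rejects the request before Agnes ever sees a callback.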
## The `security` label trap
Cloud Identity exposes membership listing through `groups/-/memberships:searchTransitiveGroups`. Its `query` (CEL) **must include a label predicate**. Two label types matter:
- `cloudidentity.googleapis.com/groups.discussion_forum`: every Workspace group has it, but a query using it **returns 403 "Insufficient permissions"** for non-admin users.
- `cloudidentity.googleapis.com/groups.security`: only security-flagged groups have it as a top-level capability, but in practice **every Keboola Workspace group also carries this label**. A query using it **returns 200** with the full membership list.
Agnes therefore queries with `security` (in `app/auth/providers/google.py`):
```python
"member_key_id == '<email>' && 'cloudidentity.googleapis.com/groups.security' in labels"
```
Switching to `discussion_forum` will silently break for everyone but Workspace admins.
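To make the request shape concrete, here is a minimal sketch that builds the query string and URL and maps a response into the `{"id", "name"}` shape. The helper names are hypothetical, and the response field names (`memberships`, `groupKey.id`, `displayName`) reflect the Cloud Identity API as commonly documented; verify them against a real response before relying on this.

```python
from urllib.parse import urlencode

SEARCH_URL = (
    "https://cloudidentity.googleapis.com/v1/"
    "groups/-/memberships:searchTransitiveGroups"
)
SECURITY_LABEL = "cloudidentity.googleapis.com/groups.security"


def build_group_query(email: str) -> str:
    """Build the CEL query, including the mandatory label predicate."""
    return f"member_key_id == '{email}' && '{SECURITY_LABEL}' in labels"


def build_search_url(email: str) -> str:
    """URL-encode the query into the searchTransitiveGroups request URL."""
    return f"{SEARCH_URL}?{urlencode({'query': build_group_query(email)})}"


def parse_memberships(payload: dict) -> list[dict]:
    """Map a response body to [{"id": ..., "name": ...}], skipping entries without a group key."""
    groups = []
    for membership in payload.get("memberships", []):
        group_id = membership.get("groupKey", {}).get("id")
        if group_id:
            name = membership.get("displayName", group_id)
            groups.append({"id": group_id, "name": name})
    return groups
```

The request itself would be a `GET` on that URL with an `Authorization: Bearer <access_token>` header, using the token issued at sign-in.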
## Storage + use
`app/auth/providers/google.py:google_callback` runs on every Google sign-in:
1. Fetch via `fetch_user_groups(access_token, email)` (in `app/auth/group_sync.py`) → list of `{"id": "<email>", "name": "<displayName>"}`.
2. Write to `user_group_members` table with `source='google_sync'` (DuckDB-backed, persistent across sessions).
3. The previous Google-sync set is replaced wholesale (DELETE + INSERT for `source='google_sync'` rows), so a removed Workspace membership disappears as soon as the user signs in again.
4. Admin-added memberships (`source='admin'`) are preserved — Google sync only touches its own rows.
5. **Fail-soft**: If the Cloud Identity API returns an error (403, 401, network), the callback preserves existing memberships instead of wiping them. This prevents a transient API outage from silently dropping all Workspace-synced group memberships.
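The replace-own-rows and fail-soft behaviour in the steps above can be sketched as follows. This is an illustration only: it uses the stdlib `sqlite3` module as a stand-in for DuckDB, and the column names (`user_email`, `group_id`), the `SCHEMA` constant, and the convention of passing `None` for a failed fetch are all assumptions, not Agnes's actual schema or API.

```python
import sqlite3
from typing import Optional

# Hypothetical schema; the real table lives in DuckDB and may differ.
SCHEMA = "CREATE TABLE user_group_members (user_email TEXT, group_id TEXT, source TEXT)"


def sync_google_groups(
    conn: sqlite3.Connection, user_email: str, groups: Optional[list]
) -> bool:
    """Replace the user's google_sync rows with `groups`.

    `groups` is None when the fetch failed; in that case nothing is touched
    (fail-soft) and False is returned. An empty list is a successful fetch
    and clears the user's synced memberships.
    """
    if groups is None:
        return False
    with conn:  # single transaction: DELETE and INSERT commit together
        conn.execute(
            "DELETE FROM user_group_members "
            "WHERE user_email = ? AND source = 'google_sync'",
            (user_email,),
        )
        conn.executemany(
            "INSERT INTO user_group_members (user_email, group_id, source) "
            "VALUES (?, ?, 'google_sync')",
            [(user_email, g["id"]) for g in groups],
        )
    return True
```

Distinguishing "fetch failed" (`None`) from "user has no groups" (`[]`) is what keeps a transient API error from wiping memberships while still letting a genuine removal propagate.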
The `user_group_members` table is the single source of truth for group memberships, used by:
- RBAC authorization (`app/auth/access.py`) — `require_resource_access` checks group grants
- Admin UI (`/admin/access`) — member lists, grant counts
- CLI (`da admin group members`) — group membership queries
- Marketplace filtering (`src/marketplace_filter.py`) — plugin access based on group grants
**Refresh.** Memberships are refreshed on every Google sign-in. A user's stale memberships persist until their next login.
## Local-dev mock (no Google round-trip)
When developing on `localhost` with `LOCAL_DEV_MODE=1`, Google OAuth never runs, so group memberships would normally stay empty. Set `LOCAL_DEV_GROUPS` to inject a mocked membership list:
```bash
export LOCAL_DEV_GROUPS='[{"id":"engineers@example.com","name":"Engineering"},{"id":"admins@example.com","name":"Admins"}]'
```
The value is a JSON array of objects matching the production shape (`{"id", "name"}`). `get_current_user` in `app/auth/dependencies.py` writes the parsed list into `user_group_members` on every dev-bypass request.
`docker-compose.local-dev.yml` carries a commented example at the right escape level for Compose YAML. **Never set this in production** — the variable is only honored when `LOCAL_DEV_MODE=1`.
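A minimal sketch of how such a variable might be parsed and gated, assuming a hypothetical `load_dev_groups` helper (the real logic lives in `get_current_user` and may differ, including in how it treats malformed input):

```python
import json
from typing import Mapping


def load_dev_groups(env: Mapping[str, str]) -> list:
    """Parse LOCAL_DEV_GROUPS into [{"id", "name"}], honored only when LOCAL_DEV_MODE=1."""
    if env.get("LOCAL_DEV_MODE") != "1":
        return []  # ignore the variable entirely outside dev mode
    raw = env.get("LOCAL_DEV_GROUPS", "")
    if not raw:
        return []
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return []  # assumption: malformed JSON is treated as "no groups"
    if not isinstance(parsed, list):
        return []
    return [
        {"id": g["id"], "name": g.get("name", g["id"])}
        for g in parsed
        if isinstance(g, dict) and "id" in g
    ]
```

In an application it would typically be called as `load_dev_groups(os.environ)`; taking the environment as a parameter keeps the function easy to test.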
## Debugging
`scripts/debug/probe_google_groups.py` is a stdlib-only script that takes a Playground-issued OAuth access token and an email, hits 6 candidate endpoints, and prints each raw response. Use it **before** changing the production query: it saves a deploy cycle per attempt.
```bash
python3 scripts/debug/probe_google_groups.py "ya29.…" user@keboola.com
```
Get the token via the [OAuth 2.0 Playground](https://developers.google.com/oauthplayground/) → gear icon → own credentials → request the three scopes (`cloud-identity.groups.readonly`, `cloud-identity.groups`, `admin.directory.group.readonly`) → exchange code → copy access token.
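For a quick one-off check of a single endpoint, a stdlib-only probe along these lines may be enough. It is a sketch, not the contents of `probe_google_groups.py`; the function names are hypothetical.

```python
import urllib.error
import urllib.request


def build_probe_request(url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request carrying the OAuth access token as a Bearer header."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


def probe(url: str, access_token: str, timeout: float = 10.0):
    """Return (status, body) for the endpoint, including for HTTP error responses."""
    req = build_probe_request(url, access_token)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 401/403 bodies usually carry the reason, so return them instead of raising.
        return err.code, err.read().decode("utf-8", "replace")
```

Catching `HTTPError` matters here because the 403 body is the interesting part when comparing label predicates.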