# Access control (v14)

Two-layer authorization model:

- **App-level access** = membership in the seeded `Admin` user-group. Admins can do everything; everyone else is gated through resource grants.
- **Resource-level access** = generic `(group, resource_type, resource_id)` grants. A user has access to a specific resource if any of their groups holds a matching grant.
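
As a rough illustration of the two layers, the check can be sketched as an Admin-membership lookup followed by a grant lookup. This is a minimal sketch under stated assumptions, not the project's resolver: it uses the standard-library `sqlite3` module in place of DuckDB so that it runs on its own, and the function names `has_access` and `demo_conn`, the `user_groups.name` column, and the demo rows are hypothetical.

```python
import sqlite3

ADMIN_GROUP = "Admin"  # name of the seeded system group


def has_access(conn, user_id, resource_type, resource_id):
    """Return True if the user is an Admin or one of their groups holds the grant."""
    # Query 1: app-level check. Membership in the Admin group short-circuits.
    is_admin = conn.execute(
        "SELECT 1 FROM user_group_members m "
        "JOIN user_groups g ON g.id = m.group_id "
        "WHERE m.user_id = ? AND g.name = ? LIMIT 1",
        (user_id, ADMIN_GROUP),
    ).fetchone()
    if is_admin:
        return True
    # Query 2: resource-level check. Any of the user's groups with a matching grant.
    grant = conn.execute(
        "SELECT 1 FROM resource_grants r "
        "JOIN user_group_members m ON m.group_id = r.group_id "
        "WHERE m.user_id = ? AND r.resource_type = ? AND r.resource_id = ? LIMIT 1",
        (user_id, resource_type, resource_id),
    ).fetchone()
    return grant is not None


def demo_conn():
    """Build an in-memory database with hypothetical demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE user_group_members (user_id TEXT, group_id INTEGER, source TEXT);
        CREATE TABLE resource_grants (group_id INTEGER, resource_type TEXT, resource_id TEXT);
        INSERT INTO user_groups VALUES (1, 'Admin'), (2, 'Engineering');
        INSERT INTO user_group_members VALUES
            ('root', 1, 'system_seed'), ('alice', 2, 'admin');
        INSERT INTO resource_grants VALUES
            (2, 'marketplace_plugin', 'foundry-ai/metrics-plugin');
    """)
    return conn
```

With the demo data, `root` passes every check through the Admin short-circuit, while `alice` passes only for the one plugin granted to her group.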

There is no role hierarchy, no session cache, no implies expansion, no module-author registration step. Every protected endpoint resolves authorization with one or two DuckDB queries.

---

## Tables

| Table | Purpose |
|---|---|
| `user_groups` | Named groups. Two rows seeded as `is_system=TRUE`: **Admin** (god mode) and **Everyone** (auto-membership for all users). |
| `user_group_members` | `(user_id, group_id, source)`. `source ∈ {admin, google_sync, system_seed}` so each writer manipulates only its own rows; Google sync's DELETE+INSERT does not clobber admin-added members. **v14**: FK constraint on `group_id` referencing `user_groups.id` (cascade delete). |
| `resource_grants` | `(group_id, resource_type, resource_id)`. The grant table the resolver hits when the Admin short-circuit doesn't apply. **v14**: FK constraint on `group_id` referencing `user_groups.id` (cascade delete). |

`resource_type` is a string from the `app.resource_types.ResourceType` `StrEnum`. `resource_id` is a path string whose format is owned by the registering module; for `marketplace_plugin` it is `<marketplace_slug>/<plugin_name>`.

---

## Authorization API

```python
# Assumed framework: FastAPI (APIRouter / Depends).
from fastapi import APIRouter, Depends

from app.auth.access import require_admin, require_resource_access
from app.resource_types import ResourceType

router = APIRouter()

# App-level: admin actions, settings, user management.
@router.post("/admin/users")
async def create_user(user = Depends(require_admin)): ...

# Resource-level: entity-scoped reads/writes.
@router.get("/marketplace/{slug}/plugins/{name}")
async def get_plugin(
    slug: str, name: str,
    user = Depends(require_resource_access(
        ResourceType.MARKETPLACE_PLUGIN, "{slug}/{name}",
    )),
): ...
```

The `path_template` argument is a Python format string resolved against the request's `path_params` at gate time: `"{slug}/{name}"` becomes the `resource_id` for the grant lookup.
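
The template resolution described above presumably behaves like `str.format` applied to the path parameters. The helper below is a minimal sketch of that step; the name `resolve_resource_id` and the error handling are assumptions for illustration, not the code in `app.auth.access`.

```python
def resolve_resource_id(path_template: str, path_params: dict[str, str]) -> str:
    """Fill a template such as "{slug}/{name}" from the request's path parameters."""
    try:
        return path_template.format(**path_params)
    except KeyError as exc:
        # A placeholder with no matching route parameter is a wiring mistake,
        # so fail loudly instead of looking up a half-filled resource_id.
        raise ValueError(
            f"path template {path_template!r} needs parameter {exc}"
        ) from exc
```

For a request to `/marketplace/foundry-ai/plugins/metrics-plugin`, the parameters `{"slug": "foundry-ai", "name": "metrics-plugin"}` would yield the `resource_id` `foundry-ai/metrics-plugin`.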

Admin short-circuits both helpers; admins never need explicit grants.

---

## Adding a new resource type

Everything lives in `app/resource_types.py`. Three edits, one file:

1. Add an enum member to `ResourceType`:

   ```python
   from enum import StrEnum

   class ResourceType(StrEnum):
       MARKETPLACE_PLUGIN = "marketplace_plugin"
       DATASET = "dataset"  # new
   ```

2. Write a `list_blocks` delegate that projects the domain tables into the `(block → items)` shape the admin /access page consumes. Each item must include `resource_id` matching the path string written into `resource_grants`:

   ```python
   def _dataset_blocks(conn) -> list[Block]:
       rows = conn.execute(
           "SELECT bucket, name, description FROM table_registry ORDER BY bucket, name"
       ).fetchall()
       # Group tables by bucket: one block per bucket, one item per table.
       blocks: dict[str, Block] = {}
       for bucket, name, desc in rows:
           block = blocks.setdefault(bucket, {"id": bucket, "name": bucket, "items": []})
           block["items"].append({
               "resource_id": f"{bucket}.{name}",
               "name": name,
               "description": desc,
           })
       return list(blocks.values())
   ```

3. Register a `ResourceTypeSpec` in `RESOURCE_TYPES`. The dataclass requires `list_blocks`, so the type checker will catch a missing delegate:

   ```python
   RESOURCE_TYPES[ResourceType.DATASET] = ResourceTypeSpec(
       key=ResourceType.DATASET,
       display_name="Datasets",
       description="A table available in the analytics catalog.",
       id_format="<bucket>.<table_name>",
       list_blocks=_dataset_blocks,
   )
   ```

Then wire your endpoints with `require_resource_access(ResourceType.DATASET, "{bucket}.{table}")`.

No DB migration, no startup hook, no second wiring step in `access-overview`: the registry drives both `/api/admin/resource-types` (UI dropdown) and `/api/admin/access-overview` (resource tree).

---

## Group membership sources

Members are added to groups by three sources, distinguished by the `source` column:

- **`google_sync`**: written by the OAuth callback on every login. The previous Google-sync set is wholesale replaced (DELETE + INSERT) so a removed Workspace membership disappears immediately.
- **`admin`**: written by admin actions in the UI (`/admin/access`), CLI (`da admin group add-member …`), or REST (`POST /api/admin/groups/{id}/members`). Survives Google sync. Admins can delete only admin-source rows.
- **`system_seed`**: written at deploy time. Used for the `SEED_ADMIN_EMAIL` → Admin-group binding and the auto-Everyone membership of every new user. Never modified at runtime.
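
The reason a `source` column keeps the writers apart can be shown with a short sketch of the Google-sync replace step. It assumes the `(user_id, group_id, source)` layout from the Tables section and uses the standard-library `sqlite3` module as a stand-in for DuckDB; the function name `replace_google_memberships` is hypothetical.

```python
import sqlite3


def replace_google_memberships(conn, user_id, group_ids):
    """Replace one user's google_sync rows; rows from other sources are untouched."""
    with conn:  # commit on success, roll back if either statement fails
        conn.execute(
            "DELETE FROM user_group_members "
            "WHERE user_id = ? AND source = 'google_sync'",
            (user_id,),
        )
        conn.executemany(
            "INSERT INTO user_group_members (user_id, group_id, source) "
            "VALUES (?, ?, 'google_sync')",
            [(user_id, group_id) for group_id in group_ids],
        )
```

Because the DELETE filters on `source = 'google_sync'`, memberships added by an admin or by the seed step survive every sync, which is the property the list above relies on.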

Removing a user from a group via the admin path (UI/CLI/REST) only deletes admin-source rows. To revoke a Google-synced membership, the operator must change the upstream Workspace group instead; Agnes will pick up the change on the user's next login.

---

## Admin workflows

### UI

`/admin/access` is the single admin page. Two tabs:

- **Groups**: list user-groups with member/grant counts. Click a group to manage members. System groups are read-only.
- **Resource grants**: list grants across all groups (filterable by group / resource_type), create new grants via dropdowns wired against `/api/admin/resource-types`.

`/admin/users/{id}` (the existing user detail page) toggles the Admin-group membership when an operator switches a user's "role" between admin and non-admin. There is no four-level hierarchy left, just admin / non-admin.

### CLI

```bash
da admin group list
da admin group create Engineering --description "Eng team"
da admin group delete Engineering
da admin group members Engineering
da admin group add-member Engineering alice@example.com
da admin group remove-member Engineering alice@example.com

da admin grant resource-types
da admin grant create Engineering marketplace_plugin foundry-ai/metrics-plugin
da admin grant list --type marketplace_plugin
da admin grant list --group Engineering
da admin grant delete <grant-id>
```

All subcommands authenticate via PAT and exit non-zero on API errors.

### REST

| Endpoint | Method | Purpose |
|---|---|---|
| `/api/admin/groups` | GET / POST | list / create groups |
| `/api/admin/groups/{id}` | PATCH / DELETE | rename / delete (system groups read-only) |
| `/api/admin/groups/{id}/members` | GET / POST | list / add member |
| `/api/admin/groups/{id}/members/{user_id}` | DELETE | remove (admin-source rows only) |
| `/api/admin/grants` | GET / POST | list (with `?resource_type=` / `?group_id=`) / create |
| `/api/admin/grants/{id}` | DELETE | delete |
| `/api/admin/resource-types` | GET | enumerate the StrEnum |

Every mutation writes an audit log entry (`user_group.created`, `resource_grant.deleted`, …).

---

## Bootstrapping the first admin

`SEED_ADMIN_EMAIL` (env var, set by the infra Terraform module) points at the operator's email. The app startup hook in `app/main.py`:

1. Creates a `users` row for that email if missing (with `password_hash` from `SEED_ADMIN_PASSWORD` if provided).
2. Adds an Admin-group membership with `source='system_seed'`.
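
One way to make the two steps safe to repeat is to lean on uniqueness constraints and ignore conflicting inserts. The sketch below is an illustration only, not the hook in `app/main.py`: it uses the standard-library `sqlite3` module instead of DuckDB, and the function name `seed_admin`, the `users.email` and `user_groups.name` columns, and the uniqueness constraints it relies on are assumptions.

```python
import sqlite3


def seed_admin(conn, email, password_hash=None):
    """Create the seed user if missing and bind it to the Admin group.

    Assumes UNIQUE(users.email) and a unique key on
    user_group_members(user_id, group_id, source), so a repeated run
    hits a conflict and is silently skipped.
    """
    with conn:  # both steps commit together or not at all
        conn.execute(
            "INSERT OR IGNORE INTO users (email, password_hash) VALUES (?, ?)",
            (email, password_hash),
        )
        user_id = conn.execute(
            "SELECT id FROM users WHERE email = ?", (email,)
        ).fetchone()[0]
        admin_id = conn.execute(
            "SELECT id FROM user_groups WHERE name = 'Admin'"
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO user_group_members (user_id, group_id, source) "
            "VALUES (?, ?, 'system_seed')",
            (user_id, admin_id),
        )
```

Calling `seed_admin` twice with the same email leaves exactly one `users` row and one `system_seed` membership.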

The hook is idempotent: re-running a deploy neither duplicates rows nor revokes existing memberships. To add additional initial admins post-deploy, log in as the seed admin and use `/admin/access` or `da admin group add-member Admin <email>`.

---

## Migration from v9–v12 (schema v13 cutover)

The v12→v13 migration is a single-step hard cutover. The Python helper `_v12_to_v13_finalize` runs after the new tables are created and:

1. Seeds Admin/Everyone in `user_groups` (idempotent).
2. Backfills `user_group_members` from `users.groups` JSON with `source='google_sync'`.
3. Promotes every `core.admin` user-role grant to Admin-group membership with `source='system_seed'`.
4. Adds Everyone-group membership for every existing user.
5. Translates `plugin_access` rows to `resource_grants` of type `marketplace_plugin`, with `resource_id` set to `<marketplace>/<plugin>`.
6. Drops `plugin_access`, `user_role_grants`, `group_mappings`, `internal_roles` (FK-correct order).
7. Drops the `users.groups` JSON column. The legacy `users.role` column is kept, set to NULL, as an artifact (DuckDB historical FK constraints sometimes block DROP COLUMN; the field carries no semantic meaning post-v13).

There is no dual-write window: the schema is either on v12 (old code) or on v13 (new code).

---

## Schema v14 — FK constraints

The v13→v14 migration adds DuckDB foreign-key constraints to `user_group_members` and `resource_grants`:

- `user_group_members.group_id` → `user_groups.id` (ON DELETE CASCADE)
- `resource_grants.group_id` → `user_groups.id` (ON DELETE CASCADE)

This prevents orphaned member/grant rows pointing at a deleted group. The migration uses RENAME → CREATE-with-FK → INSERT → DROP, wrapped in `BEGIN TRANSACTION` so a partial failure rolls back without leaving the DB at a half-applied schema.
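
The RENAME → CREATE-with-FK → INSERT → DROP pattern can be sketched for one of the two tables as follows. This is a hedged illustration rather than the project's migration: it runs against the standard-library `sqlite3` module instead of DuckDB, the function name `add_group_fk`, the `_old` suffix, and the column types are assumptions, and the connection is expected to be opened with `isolation_level=None` and `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def add_group_fk(conn):
    """Rebuild user_group_members with a FK to user_groups, all or nothing."""
    conn.execute("BEGIN")
    try:
        # 1. Move the existing table out of the way.
        conn.execute("ALTER TABLE user_group_members RENAME TO user_group_members_old")
        # 2. Recreate it with the foreign-key constraint.
        conn.execute(
            "CREATE TABLE user_group_members ("
            " user_id TEXT NOT NULL,"
            " group_id INTEGER NOT NULL REFERENCES user_groups(id) ON DELETE CASCADE,"
            " source TEXT NOT NULL)"
        )
        # 3. Copy the rows; an orphaned group_id fails here and aborts the migration.
        conn.execute(
            "INSERT INTO user_group_members "
            "SELECT user_id, group_id, source FROM user_group_members_old"
        )
        # 4. Drop the old copy only once the new table is fully populated.
        conn.execute("DROP TABLE user_group_members_old")
        conn.execute("COMMIT")
    except Exception:
        # Undo the rename and the create so the schema is exactly as before.
        conn.execute("ROLLBACK")
        raise
```

If the copy step hits a row whose `group_id` has no parent, the rollback restores the original table under its original name, which is the half-applied state the transaction is meant to rule out.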

No semantic changes: v14 is backward compatible with v13 application code.