agnes-the-ai-analyst/docs/TODO-rbac-data-enforcement.md
ZdenekSrotyr e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three workstreams.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE listing-only — admins can record table grants,
  runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00

154 lines
7.7 KiB
Markdown

# TODO — RBAC data enforcement (drop `dataset_permissions`)
## Goal
Switch table-access authorization from the legacy per-user
`dataset_permissions` table to the unified per-group `resource_grants`
model (via `ResourceType.TABLE`), then drop `dataset_permissions`
entirely.
The first step — surfacing tables on the `/admin/access` page as a new
resource type — has already shipped. Admins can grant/revoke per-group,
but the runtime check still walks the legacy code path. This document
describes the follow-up work that closes the loop.
## Background
Two parallel access models exist today:
- **v13 unified model** — `user_groups``user_group_members`
`resource_grants(group_id, resource_type, resource_id)`. Used by
`MARKETPLACE_PLUGIN` and (for listing only) `TABLE`.
- **Legacy per-user model** — `dataset_permissions(user_id, dataset)`
with `bucket.*` wildcards and an `is_public` bypass on
`table_registry`. Used at runtime by `src/rbac.py:can_access_table`,
which is the call site referenced by `app/api/catalog.py`,
`app/api/sync.py`, and any other endpoint that filters tables by
caller.
The two layers are mutually exclusive — `dataset_permissions` is
per-user, `resource_grants` is per-group. v13 already pivoted marketplace
plugins to groups; tables are the last holdout.
## Steps
1. **Refactor `src/rbac.py:can_access_table`** (line 63). Preserve the
external signature — callers stay untouched. New check order:
1. Admin bypass (already in place).
2. `is_public=True` on `table_registry` row → allow.
3. Delegate to
`app.auth.access.can_access(user_id, ResourceType.TABLE.value, table_id, conn)`.
Remove the `DatasetPermissionRepository.has_access(user_id, table_id)`
branch and the `bucket.*` wildcard branch. The bucket-level "grant
all" UX is now expressed via the per-table bulk action in
`/admin/access`, not via a wildcard token.
2. **Audit callers of `can_access_table`.** Run
`grep -rn 'can_access_table\|get_accessible_tables' --include='*.py'`
and confirm each site still works with the new semantics. Known
sites: `app/api/catalog.py:24`, `app/api/catalog.py:90`,
`app/api/sync.py`. Either keep delegating through `can_access_table`
(cheapest), or migrate the endpoint to
`Depends(require_resource_access(ResourceType.TABLE, "{table_name}"))`
for endpoints that have a clean `{table_name}` path param.
3. **Schema v14 migration in `src/db.py`.**
- Decide on backfill strategy for existing `dataset_permissions`
rows. Two options:
- **(a) Backfill** — for each `(user_id, dataset)` row, ensure a
personal group `user-{email}` exists, add the user as a member
(source `system_seed`), and insert
`resource_grants(group_id, "table", dataset)`. Wildcard rows
(`bucket.*`) need expansion to all matching `table_registry.id`
values at migration time.
- **(b) Drop without backfill** — log a CHANGELOG warning,
require admins to re-grant via `/admin/access` post-deploy.
- **Recommended: (b)**, contingent on an ops check that
`dataset_permissions` is not heavily populated in any active
deployment. (b) is much simpler and avoids creating "groups of one"
that clutter the access UI.
- Drop the `dataset_permissions` table.
- Decide whether `access_requests` (`src/db.py:183`) goes too —
it implements the "I'd like access to X" flow that pairs with
`dataset_permissions`. If retained, it should write to
`resource_grants` on approval; if not used in practice, drop it
in the same migration.
4. **Remove `DatasetPermissionRepository`** from
`src/repositories/sync_settings.py:47`. Grep for any remaining
importers and clean them up.
5. **Update `tests/test_access_control.py`.** The current suite seeds
`dataset_permissions` directly. Rewrite the relevant cases to seed
`resource_grants` via the access API (or directly via
`ResourceGrantsRepository`). Most assertions should carry over
verbatim because the public surface (HTTP status codes, manifest
content) does not change.
6. **CHANGELOG entry under `## [Unreleased]`:**
```
### Changed
- **BREAKING** Table access moved from per-user `dataset_permissions`
to per-group `resource_grants` (via `ResourceType.TABLE`). Existing
`dataset_permissions` entries are dropped on upgrade — admins must
re-grant via /admin/access.
```
7. **Update `docs/RBAC.md`.** It already calls out v13's removal of
`plugin_access`; add the analogous note for `dataset_permissions`
and document the `is_public` bypass alongside the admin bypass.
8. **Delete the dummy seed script.** Once enforcement is wired up and
verified end-to-end against a real data source, remove
`scripts/seed_dummy_tables.py` and the corresponding `### Internal`
entry from `CHANGELOG.md`. The script exists only to make the
`/admin/access` Tables section testable while no real connector is
configured; once production-style tables flow through `table_registry`
via Keboola/BigQuery discovery, the dummy rows just clutter the UI.
Also drop any leftover dummy rows from deployments that ran the
script (`DELETE FROM table_registry WHERE source_type = 'dummy'`).
## Open questions
- **Production usage of `dataset_permissions`.** Determines whether
step 3 needs a backfill. Check with ops or query a representative
deployment. If essentially empty, prefer (b) and skip the backfill.
- **Fate of `access_requests`.** Does anyone use the request-access
workflow today? If not, drop it together with `dataset_permissions`.
If yes, it needs an updated approval path that writes to
`resource_grants`.
- **`is_public` flag.** Recommended to keep — semantics "default
visible to everyone, no grant needed" is independent of the
group-grant model and useful for shared reference tables. Confirm
before locking in.
- **Bucket-level grants.** v1 deliberately models the bucket as a
UI-only grouping (mirroring how `marketplace` is not a grantable
resource — only individual plugins are). If, after living with the
per-table bulk-action UX, admins still want first-class bucket grants
that auto-cover newly registered tables, add `ResourceType.BUCKET`
in a later iteration.
- **Legacy role-based remnants.** Several places in the code still
carry role-based artifacts (API contracts, CLI gestures, helper
functions, DB column, JWT claim). Audit the codebase end-to-end and
replace each with the v13 group-based mechanism. **Gated on Google
Workspace group sync (`app/auth/group_sync.py`) being verified
reliable end-to-end in production** — until then, `users.role`
serves as a defensive fallback for admin elevation when Cloud
Identity calls silently return `[]`. The asymmetry flagged in PR
#106 (Devin review on `_set_admin_membership`) falls under this
cleanup and will be resolved together with the rest.
- **v9 admins via group_mapping not carried to v13.** The v12→v13
finalize backfill at `src/db.py:851` matches `core.admin` grants
in `user_role_grants` only — direct grants. Any v9 admin who held
`core.admin` exclusively via a `group_mappings` row (Google group →
internal role mapping) is silently dropped on upgrade and gets 403
on admin-gated endpoints post-deploy. In this deploy the set is
empty (Google Workspace group sync was never functional →
`group_mappings` table was never populated → no admin promotions
came through that path), so the loss is theoretical here. For
forked installs that wired up Workspace groups under v9, those
admins need to be re-promoted via `/admin/access` post-upgrade.
Falls under the broader role-based cleanup pass once Workspace
sync is verified end-to-end. Flagged in PR #106 (cvrysanek review
on `src/db.py:851`).