agnes-the-ai-analyst/docs/TODO-rbac-data-enforcement.md
ZdenekSrotyr e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three workstreams.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE listing-only — admins can record table grants,
  runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00

7.7 KiB

TODO — RBAC data enforcement (drop dataset_permissions)

Goal

Switch table-access authorization from the legacy per-user dataset_permissions table to the unified per-group resource_grants model (via ResourceType.TABLE), then drop dataset_permissions entirely.

The first step — surfacing tables on the /admin/access page as a new resource type — has already shipped. Admins can grant/revoke per-group, but the runtime check still walks the legacy code path. This document describes the follow-up work that closes the loop.

Background

Two parallel access models exist today:

  • v13 unified modeluser_groupsuser_group_membersresource_grants(group_id, resource_type, resource_id). Used by MARKETPLACE_PLUGIN and (for listing only) TABLE.
  • Legacy per-user modeldataset_permissions(user_id, dataset) with bucket.* wildcards and an is_public bypass on table_registry. Used at runtime by src/rbac.py:can_access_table, which is the call site referenced by app/api/catalog.py, app/api/sync.py, and any other endpoint that filters tables by caller.

The two layers are mutually exclusive — dataset_permissions is per-user, resource_grants is per-group. v13 already pivoted marketplace plugins to groups; tables are the last holdout.

Steps

  1. Refactor src/rbac.py:can_access_table (line 63). Preserve the external signature — callers stay untouched. New check order:

    1. Admin bypass (already in place).
    2. is_public=True on table_registry row → allow.
    3. Delegate to app.auth.access.can_access(user_id, ResourceType.TABLE.value, table_id, conn).

    Remove the DatasetPermissionRepository.has_access(user_id, table_id) branch and the bucket.* wildcard branch. The bucket-level "grant all" UX is now expressed via the per-table bulk action in /admin/access, not via a wildcard token.

  2. Audit callers of can_access_table. Run grep -rn 'can_access_table\|get_accessible_tables' --include='*.py' and confirm each site still works with the new semantics. Known sites: app/api/catalog.py:24, app/api/catalog.py:90, app/api/sync.py. Either keep delegating through can_access_table (cheapest), or migrate the endpoint to Depends(require_resource_access(ResourceType.TABLE, "{table_name}")) for endpoints that have a clean {table_name} path param.

  3. Schema v14 migration in src/db.py.

    • Decide on backfill strategy for existing dataset_permissions rows. Two options:
      • (a) Backfill — for each (user_id, dataset) row, ensure a personal group user-{email} exists, add the user as a member (source system_seed), and insert resource_grants(group_id, "table", dataset). Wildcard rows (bucket.*) need expansion to all matching table_registry.id values at migration time.
      • (b) Drop without backfill — log a CHANGELOG warning, require admins to re-grant via /admin/access post-deploy.
    • Recommended: (b), contingent on an ops check that dataset_permissions is not heavily populated in any active deployment. (b) is much simpler and avoids creating "groups of one" that clutter the access UI.
    • Drop the dataset_permissions table.
    • Decide whether access_requests (src/db.py:183) goes too — it implements the "I'd like access to X" flow that pairs with dataset_permissions. If retained, it should write to resource_grants on approval; if not used in practice, drop it in the same migration.
  4. Remove DatasetPermissionRepository from src/repositories/sync_settings.py:47. Grep for any remaining importers and clean them up.

  5. Update tests/test_access_control.py. The current suite seeds dataset_permissions directly. Rewrite the relevant cases to seed resource_grants via the access API (or directly via ResourceGrantsRepository). Most assertions should carry over verbatim because the public surface (HTTP status codes, manifest content) does not change.

  6. CHANGELOG entry under ## [Unreleased]:

    ### Changed
    - **BREAKING** Table access moved from per-user `dataset_permissions`
      to per-group `resource_grants` (via `ResourceType.TABLE`). Existing
      `dataset_permissions` entries are dropped on upgrade — admins must
      re-grant via /admin/access.
    
  7. Update docs/RBAC.md. It already calls out v13's removal of plugin_access; add the analogous note for dataset_permissions and document the is_public bypass alongside the admin bypass.

  8. Delete the dummy seed script. Once enforcement is wired up and verified end-to-end against a real data source, remove scripts/seed_dummy_tables.py and the corresponding ### Internal entry from CHANGELOG.md. The script exists only to make the /admin/access Tables section testable while no real connector is configured; once production-style tables flow through table_registry via Keboola/BigQuery discovery, the dummy rows just clutter the UI. Also drop any leftover dummy rows from deployments that ran the script (DELETE FROM table_registry WHERE source_type = 'dummy').

Open questions

  • Production usage of dataset_permissions. Determines whether step 3 needs a backfill. Check with ops or query a representative deployment. If essentially empty, prefer (b) and skip the backfill.
  • Fate of access_requests. Does anyone use the request-access workflow today? If not, drop it together with dataset_permissions. If yes, it needs an updated approval path that writes to resource_grants.
  • is_public flag. Recommended to keep — semantics "default visible to everyone, no grant needed" is independent of the group-grant model and useful for shared reference tables. Confirm before locking in.
  • Bucket-level grants. v1 deliberately models the bucket as a UI-only grouping (mirroring how marketplace is not a grantable resource — only individual plugins are). If, after living with the per-table bulk-action UX, admins still want first-class bucket grants that auto-cover newly registered tables, add ResourceType.BUCKET in a later iteration.
  • Legacy role-based remnants. Several places in the code still carry role-based artifacts (API contracts, CLI gestures, helper functions, DB column, JWT claim). Audit the codebase end-to-end and replace each with the v13 group-based mechanism. Gated on Google Workspace group sync (app/auth/group_sync.py) being verified reliable end-to-end in production — until then, users.role serves as a defensive fallback for admin elevation when Cloud Identity calls silently return []. The asymmetry flagged in PR #106 (Devin review on _set_admin_membership) falls under this cleanup and will be resolved together with the rest.
  • v9 admins via group_mapping not carried to v13. The v12→v13 finalize backfill at src/db.py:851 matches core.admin grants in user_role_grants only — direct grants. Any v9 admin who held core.admin exclusively via a group_mappings row (Google group → internal role mapping) is silently dropped on upgrade and gets 403 on admin-gated endpoints post-deploy. In this deploy the set is empty (Google Workspace group sync was never functional → group_mappings table was never populated → no admin promotions came through that path), so the loss is theoretical here. For forked installs that wired up Workspace groups under v9, those admins need to be re-promoted via /admin/access post-upgrade. Falls under the broader role-based cleanup pass once Workspace sync is verified end-to-end. Flagged in PR #106 (cvrysanek review on src/db.py:851).