agnes-the-ai-analyst/docs/RBAC.md
ZdenekSrotyr e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three workstreams.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE listing-only — admins can record table grants,
  runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00


Access control (v13)

Two-layer authorization model:

  • App-level access = membership in the seeded Admin user-group. Admins can do everything; everyone else is gated through resource grants.
  • Resource-level access = generic (group, resource_type, resource_id) grants. A user has access to a specific resource if any of their groups holds a matching grant.

There is no role hierarchy, no session cache, no implies expansion, no module-author registration step. Every protected endpoint resolves authorization with one or two DuckDB queries.


Tables

| Table | Purpose |
| --- | --- |
| user_groups | Named groups. Two rows seeded as is_system=TRUE: Admin (god mode) and Everyone (auto-membership for all users). |
| user_group_members | (user_id, group_id, source). source ∈ {admin, google_sync, system_seed} so each writer only manipulates its own rows; Google sync's nightly DELETE+INSERT does not clobber admin-added members. |
| resource_grants | (group_id, resource_type, resource_id). The grant table the resolver hits when the Admin short-circuit doesn't apply. |

resource_type is a string from the app.resource_types.ResourceType StrEnum. resource_id is a path string whose format is owned by the registering module — for marketplace_plugin it's <marketplace_slug>/<plugin_name>.
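The two-layer lookup described above can be sketched as follows. This is a minimal illustration, not the code in app.auth.access: it uses the standard-library sqlite3 module as a stand-in for DuckDB, the column types are assumptions, and the function names is_admin and has_access are hypothetical.

```python
import sqlite3

ADMIN_GROUP = "Admin"  # name of the seeded system group


def make_schema(conn: sqlite3.Connection) -> None:
    # Simplified stand-ins for the three tables; column types are assumptions.
    conn.executescript(
        """
        CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT UNIQUE, is_system INTEGER);
        CREATE TABLE user_group_members (user_id TEXT, group_id INTEGER, source TEXT);
        CREATE TABLE resource_grants (group_id INTEGER, resource_type TEXT, resource_id TEXT);
        """
    )


def is_admin(conn: sqlite3.Connection, user_id: str) -> bool:
    # Query 1: app-level access is plain membership in the Admin group.
    row = conn.execute(
        "SELECT 1 FROM user_group_members m "
        "JOIN user_groups g ON g.id = m.group_id "
        "WHERE m.user_id = ? AND g.name = ? LIMIT 1",
        (user_id, ADMIN_GROUP),
    ).fetchone()
    return row is not None


def has_access(conn: sqlite3.Connection, user_id: str, resource_type: str, resource_id: str) -> bool:
    # Admin short-circuit: admins never need an explicit grant.
    if is_admin(conn, user_id):
        return True
    # Query 2: any of the user's groups holding a matching grant is enough.
    row = conn.execute(
        "SELECT 1 FROM user_group_members m "
        "JOIN resource_grants r ON r.group_id = m.group_id "
        "WHERE m.user_id = ? AND r.resource_type = ? AND r.resource_id = ? LIMIT 1",
        (user_id, resource_type, resource_id),
    ).fetchone()
    return row is not None


# Illustrative data only: group ids and user ids are made up.
conn = sqlite3.connect(":memory:")
make_schema(conn)
conn.executemany(
    "INSERT INTO user_groups VALUES (?, ?, ?)",
    [(1, "Admin", 1), (2, "Everyone", 1), (3, "Engineering", 0)],
)
conn.executemany(
    "INSERT INTO user_group_members VALUES (?, ?, ?)",
    [("root", 1, "system_seed"), ("alice", 3, "admin")],
)
conn.execute(
    "INSERT INTO resource_grants VALUES (3, 'marketplace_plugin', 'foundry-ai/metrics-plugin')"
)
```

With this data, root passes every check through the Admin short-circuit, while alice passes only for the one plugin granted to Engineering.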


Authorization API

from fastapi import APIRouter, Depends

from app.auth.access import require_admin, require_resource_access
from app.resource_types import ResourceType

router = APIRouter()

# App-level: admin actions, settings, user management.
@router.post("/admin/users")
async def create_user(user=Depends(require_admin)): ...

# Resource-level: entity-scoped reads/writes.
@router.get("/marketplace/{slug}/plugins/{name}")
async def get_plugin(
    slug: str,
    name: str,
    user=Depends(require_resource_access(
        ResourceType.MARKETPLACE_PLUGIN, "{slug}/{name}",
    )),
): ...

The path_template argument is a Python format string resolved against the request's path_params at gate time — "{slug}/{name}" becomes the resource_id for the grant lookup.

Admin short-circuits both helpers — admins never need explicit grants.
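The template resolution can be sketched without any web framework. The names resolve_resource_id and make_gate below are hypothetical, and the has_access callable is passed in as an assumption; the real require_resource_access may be structured differently.

```python
from typing import Callable, Mapping


def resolve_resource_id(path_template: str, path_params: Mapping[str, str]) -> str:
    """Fill a template such as "{slug}/{name}" from the request's path parameters."""
    try:
        return path_template.format(**path_params)
    except KeyError as exc:
        # The route does not define a parameter the template refers to.
        raise ValueError(f"path template needs missing parameter {exc}") from None


def make_gate(
    resource_type: str,
    path_template: str,
    has_access: Callable[[str, str, str], bool],
) -> Callable[[str, Mapping[str, str]], str]:
    """Return a gate that resolves the resource_id at call time, then checks it."""

    def gate(user_id: str, path_params: Mapping[str, str]) -> str:
        resource_id = resolve_resource_id(path_template, path_params)
        if not has_access(user_id, resource_type, resource_id):
            raise PermissionError(f"{user_id} has no grant on {resource_type}:{resource_id}")
        return resource_id

    return gate


# Stand-in access check for the example: only alice holds the one grant.
def fake_has_access(user_id: str, resource_type: str, resource_id: str) -> bool:
    return (user_id, resource_id) == ("alice", "foundry-ai/metrics-plugin")


plugin_gate = make_gate("marketplace_plugin", "{slug}/{name}", fake_has_access)
```

Resolving the template inside the gate, rather than when the route is declared, is what lets a single declaration cover every slug/name pair.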


Adding a new resource type

Everything lives in app/resource_types.py. Three edits, one file:

  1. Add an enum member to ResourceType:

    class ResourceType(StrEnum):
        MARKETPLACE_PLUGIN = "marketplace_plugin"
        DATASET = "dataset"  # new
    
  2. Write a list_blocks delegate that projects the domain tables into the (block → items) shape the admin /access page consumes. Each item must include resource_id matching the path string written into resource_grants:

    def _dataset_blocks(conn) -> list[Block]:
        rows = conn.execute(
            "SELECT bucket, name, description FROM table_registry ORDER BY bucket, name"
        ).fetchall()
        blocks: dict[str, Block] = {}
        for bucket, name, desc in rows:
            block = blocks.setdefault(bucket, {"id": bucket, "name": bucket, "items": []})
            block["items"].append({
                "resource_id": f"{bucket}.{name}",
                "name": name,
                "description": desc,
            })
        return list(blocks.values())
    
  3. Register a ResourceTypeSpec in RESOURCE_TYPES. The dataclass requires list_blocks so the type checker will catch a missing delegate:

    RESOURCE_TYPES[ResourceType.DATASET] = ResourceTypeSpec(
        key=ResourceType.DATASET,
        display_name="Datasets",
        description="A table available in the analytics catalog.",
        id_format="<bucket>.<table_name>",
        list_blocks=_dataset_blocks,
    )
    

Then wire your endpoints with require_resource_access(ResourceType.DATASET, "{bucket}.{table}").

No DB migration, no startup hook, no second wiring step in access-overview — the registry drives both /api/admin/resource-types (UI dropdown) and /api/admin/access-overview (resource tree).
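One way a single registry could drive both endpoints is sketched below. The ResourceTypeSpec field names come from the registration example above; the register, resource_types_payload, and access_overview helpers and the payload shapes are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List


class ResourceType(str, Enum):  # the doc uses StrEnum; (str, Enum) also runs on older Pythons
    MARKETPLACE_PLUGIN = "marketplace_plugin"
    DATASET = "dataset"


@dataclass(frozen=True)
class ResourceTypeSpec:
    key: ResourceType
    display_name: str
    description: str
    id_format: str
    list_blocks: Callable[[Any], List[dict]]  # no default, so omitting it is an error


RESOURCE_TYPES: Dict[ResourceType, ResourceTypeSpec] = {}


def register(spec: ResourceTypeSpec) -> None:
    RESOURCE_TYPES[spec.key] = spec


def resource_types_payload() -> List[dict]:
    # What a UI dropdown would need: one entry per registered type.
    return [
        {"key": s.key.value, "display_name": s.display_name, "id_format": s.id_format}
        for s in RESOURCE_TYPES.values()
    ]


def access_overview(conn: Any) -> Dict[str, List[dict]]:
    # The resource tree: each type's delegate projects its own domain tables.
    return {s.key.value: s.list_blocks(conn) for s in RESOURCE_TYPES.values()}


def _dataset_blocks(conn: Any) -> List[dict]:
    # Stand-in delegate with fixed data; a real one would query through conn.
    return [{"id": "sales", "name": "sales",
             "items": [{"resource_id": "sales.orders", "name": "orders"}]}]


register(ResourceTypeSpec(
    key=ResourceType.DATASET,
    display_name="Datasets",
    description="A table available in the analytics catalog.",
    id_format="<bucket>.<table_name>",
    list_blocks=_dataset_blocks,
))
```

Because list_blocks has no default value, constructing a spec without it fails immediately with a TypeError, in addition to being flagged by a type checker.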


Group membership sources

Members are added to groups by three sources, distinguished by the source column:

  • google_sync: written by the OAuth callback on every login. The previous Google-sync set is replaced wholesale (DELETE + INSERT), so a removed Workspace membership disappears immediately.
  • admin: written by admin actions in the UI (/admin/access), CLI (da admin group add-member …), or REST (POST /api/admin/groups/{id}/members). Survives Google sync. Admins can delete only admin-source rows.
  • system_seed: written at deploy time. Used for the SEED_ADMIN_EMAIL → Admin-group binding and the auto-Everyone membership of every new user. Never modified at runtime.

Removing a user from a group via the admin path (UI/CLI/REST) only deletes admin-source rows. To revoke a Google-synced membership, the operator must change the upstream Workspace group instead — Agnes will pick up the change on the user's next login.
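The source-scoped writes can be sketched as follows, again with sqlite3 standing in for DuckDB. The function names sync_google_groups and admin_remove_member are hypothetical; the point is only that every DELETE filters on source, so writers cannot touch each other's rows.

```python
import sqlite3
from typing import Iterable


def sync_google_groups(conn: sqlite3.Connection, user_id: str, group_ids: Iterable[int]) -> None:
    # Replace only this user's google_sync rows, in one transaction.
    with conn:
        conn.execute(
            "DELETE FROM user_group_members WHERE user_id = ? AND source = 'google_sync'",
            (user_id,),
        )
        conn.executemany(
            "INSERT INTO user_group_members (user_id, group_id, source) "
            "VALUES (?, ?, 'google_sync')",
            [(user_id, gid) for gid in group_ids],
        )


def admin_remove_member(conn: sqlite3.Connection, user_id: str, group_id: int) -> int:
    # The admin path deletes admin-source rows only; returns how many rows went away.
    with conn:
        cur = conn.execute(
            "DELETE FROM user_group_members "
            "WHERE user_id = ? AND group_id = ? AND source = 'admin'",
            (user_id, group_id),
        )
    return cur.rowcount


# Illustrative data: alice is in group 3 via an admin and in group 4 via Google sync.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_group_members (user_id TEXT, group_id INTEGER, source TEXT)")
conn.executemany(
    "INSERT INTO user_group_members VALUES (?, ?, ?)",
    [("alice", 3, "admin"), ("alice", 4, "google_sync")],
)

# The next login reports Workspace group 5 instead of 4.
sync_google_groups(conn, "alice", [5])
rows_after_sync = conn.execute(
    "SELECT group_id, source FROM user_group_members WHERE user_id = 'alice' ORDER BY group_id"
).fetchall()
```

After the sync, the admin-added membership in group 3 is untouched, and an admin removal aimed at the synced group 5 deletes nothing.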


Admin workflows

UI

/admin/access is the single admin page. Two tabs:

  • Groups — list user-groups with member/grant counts. Click a group to manage members. System groups are read-only.
  • Resource grants — list grants across all groups (filterable by group / resource_type), create new grants via dropdowns wired against /api/admin/resource-types.

/admin/users/{id} (the existing user detail page) toggles the Admin-group membership when an operator switches a user's "role" between admin and non-admin — there's no four-level hierarchy left, just admin / non-admin.

CLI

da admin group list
da admin group create Engineering --description "Eng team"
da admin group delete Engineering
da admin group members Engineering
da admin group add-member Engineering alice@example.com
da admin group remove-member Engineering alice@example.com

da admin grant resource-types
da admin grant create Engineering marketplace_plugin foundry-ai/metrics-plugin
da admin grant list --type marketplace_plugin
da admin grant list --group Engineering
da admin grant delete <grant-id>

All subcommands authenticate via PAT and exit non-zero on API errors.

REST

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/admin/groups | GET / POST | list / create groups |
| /api/admin/groups/{id} | PATCH / DELETE | rename / delete (system groups read-only) |
| /api/admin/groups/{id}/members | GET / POST | list / add member |
| /api/admin/groups/{id}/members/{user_id} | DELETE | remove (admin-source rows only) |
| /api/admin/grants | GET / POST | list (with ?resource_type= / ?group_id=) / create |
| /api/admin/grants/{id} | DELETE | delete |
| /api/admin/resource-types | GET | enumerate the StrEnum |

Every mutation writes an audit log entry (user_group.created, resource_grant.deleted, …).
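As a rough sketch of scripting against the grants endpoint, the snippet below builds (but does not send) a POST request with the standard library. The JSON field names mirror the resource_grants columns and are an assumption, as are the Bearer scheme for the PAT, the example host, and the helper name build_grant_request; check the actual API before relying on any of them.

```python
import json
import urllib.request


def build_grant_request(
    base_url: str,
    token: str,
    group_id: int,
    resource_type: str,
    resource_id: str,
) -> urllib.request.Request:
    # Assumed body shape: one key per resource_grants column.
    body = json.dumps(
        {"group_id": group_id, "resource_type": resource_type, "resource_id": resource_id}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/admin/grants",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed PAT scheme
        },
    )


# Placeholder host and token; nothing is sent over the network here.
req = build_grant_request(
    "https://agnes.example.com/", "pat-123", 3, "marketplace_plugin", "foundry-ai/metrics-plugin"
)
```

Passing the request to urllib.request.urlopen would send it; a non-2xx response raises urllib.error.HTTPError, which a script can turn into a non-zero exit code.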


Bootstrapping the first admin

SEED_ADMIN_EMAIL (env var, set by the infra Terraform module) points at the operator's email. The app startup hook in app/main.py:

  1. Creates a users row for that email if missing (with password_hash from SEED_ADMIN_PASSWORD if provided).
  2. Adds an Admin-group membership with source='system_seed'.

The hook is idempotent: re-running a deploy neither duplicates the user row or the membership nor revokes existing ones. To add more admins after deploy, log in as the seed admin and use /admin/access or da admin group add-member Admin <email>.
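An idempotent seed step along these lines could look like the sketch below. It is not the hook in app/main.py: sqlite3 stands in for DuckDB, the uniqueness constraints and the function name seed_admin are assumptions, and the email would come from SEED_ADMIN_EMAIL in a real deployment.

```python
import sqlite3
from typing import Optional


def seed_admin(conn: sqlite3.Connection, email: str, password_hash: Optional[str] = None) -> None:
    with conn:
        # 1. Create the user only if missing; an existing row is left untouched.
        conn.execute(
            "INSERT OR IGNORE INTO users (email, password_hash) VALUES (?, ?)",
            (email, password_hash),
        )
        user_id = conn.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchone()[0]
        admin_id = conn.execute("SELECT id FROM user_groups WHERE name = 'Admin'").fetchone()[0]
        # 2. Add the Admin membership; the primary key makes a repeat a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO user_group_members (user_id, group_id, source) "
            "VALUES (?, ?, 'system_seed')",
            (user_id, admin_id),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE, password_hash TEXT);
    CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE user_group_members (
        user_id INTEGER, group_id INTEGER, source TEXT,
        PRIMARY KEY (user_id, group_id, source)
    );
    INSERT INTO user_groups (name) VALUES ('Admin');
    """
)

# Simulate two deploys in a row with the same placeholder seed email.
seed_admin(conn, "ops@example.com", "hash-from-first-deploy")
seed_admin(conn, "ops@example.com", "hash-from-second-deploy")

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
member_count = conn.execute("SELECT COUNT(*) FROM user_group_members").fetchone()[0]
stored_hash = conn.execute("SELECT password_hash FROM users").fetchone()[0]
```

In this sketch the second run leaves the stored password hash alone; whether the real hook also refuses to overwrite it is not stated in this document.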


Migration from v9-v12 (schema v13 cutover)

The v12→v13 migration is a single-step hard cutover. The Python helper _v12_to_v13_finalize runs after the new tables are created and:

  1. Seeds Admin/Everyone in user_groups (idempotent).
  2. Backfills user_group_members from users.groups JSON with source='google_sync'.
  3. Promotes every core.admin user-role grant to Admin-group membership with source='system_seed'.
  4. Adds Everyone-group membership for every existing user.
  5. Translates plugin_access rows to resource_grants of type marketplace_plugin, resource_id <marketplace>/<plugin>.
  6. Drops plugin_access, user_role_grants, group_mappings, internal_roles (FK-correct order).
  7. Drops the users.groups JSON column. The legacy users.role column is kept but set to NULL, because leftover foreign-key constraints in DuckDB sometimes block DROP COLUMN; the field carries no semantic meaning after v13.

No dual-write window. Either the schema is on v12 (old code) or v13 (new code).
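The all-or-nothing behaviour of the cutover can be sketched with an explicit transaction. This is not _v12_to_v13_finalize: it covers only the plugin_access translation (step 5) and the table drop, uses sqlite3 in place of DuckDB, and assumes a plugin_access layout of (group_id, marketplace, plugin) and a one-row schema_version table. The fail_midway flag exists purely to simulate an error.

```python
import sqlite3


def migrate_v12_to_v13(conn: sqlite3.Connection, fail_midway: bool = False) -> None:
    conn.execute("BEGIN")
    try:
        # Translate legacy rows into grants with resource_id <marketplace>/<plugin>.
        conn.execute(
            "INSERT INTO resource_grants (group_id, resource_type, resource_id) "
            "SELECT group_id, 'marketplace_plugin', marketplace || '/' || plugin "
            "FROM plugin_access"
        )
        if fail_midway:
            raise RuntimeError("simulated failure during backfill")
        conn.execute("DROP TABLE plugin_access")
        conn.execute("UPDATE schema_version SET version = 13")
        conn.execute("COMMIT")
    except Exception:
        # Undo everything so the database still reports v12 and can be retried.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


# isolation_level=None disables implicit transactions; BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(
    """
    CREATE TABLE schema_version (version INTEGER);
    INSERT INTO schema_version VALUES (12);
    CREATE TABLE plugin_access (group_id INTEGER, marketplace TEXT, plugin TEXT);
    INSERT INTO plugin_access VALUES (3, 'foundry-ai', 'metrics-plugin');
    CREATE TABLE resource_grants (group_id INTEGER, resource_type TEXT, resource_id TEXT);
    """
)

try:
    migrate_v12_to_v13(conn, fail_midway=True)
except RuntimeError:
    pass
version_after_failure = conn.execute("SELECT version FROM schema_version").fetchone()[0]
grants_after_failure = conn.execute("SELECT COUNT(*) FROM resource_grants").fetchone()[0]

migrate_v12_to_v13(conn)
version_after_success = conn.execute("SELECT version FROM schema_version").fetchone()[0]
grants_after_success = conn.execute("SELECT * FROM resource_grants").fetchall()
```

The failed attempt leaves no partial grants behind and the version stays at 12, so the retry starts from the same state as the first run.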