Commit graph

4 commits

Author SHA1 Message Date
ZdenekSrotyr
b6543c9c55 fix: Devin Review on #194 — 2 BUG-class findings
1. .env_overlay write paths now match read path under STATE_DIR.
   app/main.py:343 reads via _state_dir() (post-PR #194), but two
   write sites still hardcoded ${DATA_DIR}/state/.env_overlay:
     - app/api/admin.py:2687 — configure endpoint secrets persistence
     - app/api/marketplaces.py:152 — marketplace PAT persistence
   Under flat-mount layout (STATE_DIR=/data-state) the admin UI wrote
   secrets to /data/state/.env_overlay while the app read from
   /data-state/.env_overlay, silently dropping the value on next
   restart. Both write sites now go through _state_dir().

2. host-mount.yml: caddy inherits data:/srv:ro from base, but no
   service populates the data: named volume any more (the other
   services switched to direct /data binds), so the inherited mount
   points at an empty Docker volume. try_files finds nothing and every
   parquet download falls through to uvicorn, defeating the v0.36.0
   file_server bypass under the host-mount layout. Added a caddy
   override that restates all mounts, including a direct /data:/srv:ro
   bind. Mirrors the comment + treatment already in flat-mount.yml.
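
Finding 1 comes down to readers and writers resolving the overlay path
through one helper. A minimal sketch of that idea follows; it is not the
actual _state_dir() from app/main.py. The fallback to ${DATA_DIR}/state,
the "/data" default, and the env_overlay_path name are assumptions made
for illustration.

```python
import os
from pathlib import Path


def state_dir(env=None):
    """Resolve the state directory: STATE_DIR if set, else ${DATA_DIR}/state."""
    env = os.environ if env is None else env
    explicit = env.get("STATE_DIR")
    if explicit:
        return Path(explicit)
    # Assumed default for DATA_DIR; the commit does not state the real one.
    return Path(env.get("DATA_DIR", "/data")) / "state"


def env_overlay_path(env=None):
    """Hypothetical helper: one path for both readers and writers."""
    return state_dir(env) / ".env_overlay"
```

Under the flat-mount layout (STATE_DIR=/data-state) both sides then
resolve to /data-state/.env_overlay instead of diverging.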
2026-05-05 19:47:12 +02:00
Vojtech Rysanek
655822b953 host-mount: replace named-volume driver_opts with direct service binds
The previous version of docker-compose.host-mount.yml modified the
'data' named volume's driver_opts to point at /data with 'o:
bind,rbind'. Docker named volumes have an immutability footgun:
once a volume is created, its driver options are fixed for the life
of the volume. Editing this file and re-running 'docker compose up
-d' does NOT propagate the new options to existing volumes — they
keep whatever options were in effect at create time.

This bit a deployer (Groupon FoundryAI) on 2026-05-05: the volume
was created before this overlay had bind,rbind, kept the old bind
(non-recursive) propagation, and containers wrote to a shadowed
subdirectory of the parent disk instead of the nested child mount.
DuckDB went FATAL on a root-owned WAL during a routine container
recreate; sign-in broke. Recovery required docker volume rm +
manual data migration on every affected VM.

Direct service-level bind mounts ('/host/path:/container/path')
don't go through Docker's volume layer at all. They re-evaluate
mount options every container start, and modern Docker Engine
(20.10+) defaults to recursive bind for these. No options to
forget, no immutable state to migrate, no shadow-mount class of bugs.

Validated via 'docker compose config' merge — overlay correctly
replaces 'data:/data' with bind type:none on app, extract,
scheduler, telegram-bot, ws-gateway.

Compose-spec version note: the !override merge tag is part of the
Compose Specification and is supported by Docker Compose v2.24.4+.
Tested against Compose v5.1.3 used by Groupon's deployment.
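
The 'docker compose config' validation above could also be scripted. The
sketch below is illustrative only and not part of this change. It assumes
the merged config has been loaded into a dict (for example from
'docker compose config --format json') and that each mount is either a
long-form mapping with "type" and "source" keys or a short-form string.
The function name is hypothetical.

```python
def services_using_named_volume(config, volume="data"):
    """Return names of services that still mount the given named volume."""
    offenders = []
    for name, service in config.get("services", {}).items():
        for mount in service.get("volumes", []):
            if isinstance(mount, dict):
                # Long form: {"type": "volume" | "bind", "source": ..., "target": ...}
                is_named = (
                    mount.get("type") == "volume"
                    and mount.get("source") == volume
                )
            else:
                # Short form: "data:/data" or "/data:/data[:ro]"
                is_named = str(mount).split(":", 1)[0] == volume
            if is_named:
                offenders.append(name)
                break
    return sorted(offenders)
```

An empty result for the overlaid services means none of them still goes
through the named volume.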
2026-05-05 19:27:14 +02:00
ZdenekSrotyr
e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three main workstreams, plus smaller fixes.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE is listing-only: admins can record table grants,
  but runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).
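
The atomic v12→v13 backfill described above can be sketched as follows.
This is an illustration only: it uses Python's sqlite3 as a stand-in for
the project's database, and the schema_version layout, the columns of
user_group_members, and the function names are assumptions.

```python
import sqlite3


def make_db():
    """Build an in-memory v12 database (hypothetical minimal schema)."""
    # isolation_level=None: no implicit transactions, so BEGIN/COMMIT are ours.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE schema_version (version INTEGER)")
    conn.execute("INSERT INTO schema_version VALUES (12)")
    conn.execute("CREATE TABLE user_group_members (user_id TEXT, group_name TEXT)")
    return conn


def current_version(conn):
    return conn.execute("SELECT version FROM schema_version").fetchone()[0]


def migrate_v12_to_v13(conn, backfill):
    """Run the backfill and the version bump in one transaction."""
    conn.execute("BEGIN")
    try:
        backfill(conn)
        conn.execute("UPDATE schema_version SET version = 13")
        conn.execute("COMMIT")
        return True
    except Exception:
        # Undo partial backfill rows; schema_version stays at 12 for retry.
        conn.execute("ROLLBACK")
        return False
```

Because the version bump sits inside the same transaction as the
backfill, a failure partway through leaves neither partial rows nor a
bumped version behind.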

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).
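
For operators adapting binary alerting to the 3-state exit codes from
#81 Group B, a minimal sketch of the mapping follows. The enum and
function names are hypothetical, and treating unknown codes as total
failure is an assumption, not documented behavior.

```python
from enum import IntEnum


class ExtractorExit(IntEnum):
    SUCCESS = 0
    TOTAL_FAIL = 1
    PARTIAL_FAIL = 2


def classify_exit(code):
    """Map an extractor exit code to (status, should_alert)."""
    try:
        status = ExtractorExit(code)
    except ValueError:
        # Assumption: anything outside 0/1/2 is handled as a total failure.
        status = ExtractorExit.TOTAL_FAIL
    return status, status is not ExtractorExit.SUCCESS
```

A plain "exit code != 0" check still fires on exit 2, but it cannot tell
a partial failure from a total one; keeping the status alongside the
alert flag preserves that distinction.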

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00
ZdenekSrotyr
1acc89c486 fix(ci): move bind-mount of /data to separate overlay, fix CI smoke test
The CI smoke test failed because docker-compose.prod.yml forced a bind mount
to /data on the host — which doesn't exist on GitHub runners.

Split the bind mount into docker-compose.host-mount.yml, which is only
composed by the VM startup script (/data exists there, mounted from the
persistent disk). CI continues to use the default named volume.

Module startup script + auto-upgrade cron now compose all three:
  -f docker-compose.yml -f docker-compose.prod.yml -f docker-compose.host-mount.yml
2026-04-21 16:54:18 +02:00