Commit graph

135 commits

Author SHA1 Message Date
ZdenekSrotyr
39146288e1 feat: admin-editable setup_banner on /setup page (schema v22)
Adds an optional Jinja2/HTML banner displayed above the bootstrap
commands on /setup. Empty by default; admin authors it at
/admin/setup-banner. autoescape=True — safe for HTML context.
Render failures return "" so a broken banner never breaks /setup.

Schema v22: setup_banner singleton table, auto-migration v21→v22.
2026-05-03 16:12:13 +02:00
ZdenekSrotyr
40d221f20a feat(admin-welcome): CodeMirror editor + live preview pane 2026-05-03 16:12:13 +02:00
ZdenekSrotyr
4bcdc4e7d7 feat(dashboard): link Claude Code setup CTA to /setup page 2026-05-03 16:12:13 +02:00
ZdenekSrotyr
85967e14ca feat(web): rename /install → /setup; nav label 'Setup local agent'
- Add GET /setup serving install.html (CLI + Claude Code setup page)
- Add GET /install → 301 redirect to /setup for backwards compat
- Move first-time setup wizard from /setup to /first-time-setup
- Update nav link: href=/setup, label 'Setup local agent', active on both /setup and /install paths
- Update page <title> to 'Setup local agent — …'
- Update /dashboard and /setup comment in _claude_setup_instructions.jinja
- Update tests and OpenAPI snapshot accordingly
2026-05-03 16:12:13 +02:00
ZdenekSrotyr
92fd78cfb4 fix(admin-welcome): redesign with peer chrome, toast, btn-copy 2026-05-03 16:12:13 +02:00
ZdenekSrotyr
ecaa113c68 fix(admin-welcome): credentials: include, real-content preview, refresh after mutate 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
2b3048f77f feat(web): /admin/welcome editor page 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
93b713900b fix(api): validate template render on PUT; broaden render-time catch 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
0d1ecd235d feat(api): /api/welcome + /api/admin/welcome-template endpoints 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
d055417377 feat(config): default welcome template in jinja2 + sync_interval 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
91caefaca9
security(auth): per-IP rate limit + last-admin guard (#165)
* security(auth): per-IP rate limit on auth endpoints + generalize last-admin guard

Closes #45 and #151.

#45 — every auth endpoint was unthrottled (login, magic-link, token,
bootstrap), leaving us open to password brute-force and SMTP
email-bombing. Wires slowapi (new dep) into the middleware chain with
per-route limits: 10/min on login + token, 5/min on send-link, 3/min on
bootstrap. Returns 429 with Retry-After: 60 once exceeded. Per-IP key
respects the leftmost X-Forwarded-For hop (Caddy in front of the app
strips client-supplied XFF). Operator escape hatch:
AGNES_AUTH_RATELIMIT_ENABLED=0. Test suite disables the limiter via
autouse conftest fixture so existing auth tests that hammer endpoints
in tight loops are unaffected.

#151 — DELETE /api/admin/users/{id}/memberships/{group_id} and the
mirror DELETE /api/admin/groups/{group_id}/members/{user_id} only
guarded against self-removal as last admin. Generalizes to refuse
removing anyone from the seeded Admin group when they are the only
remaining active admin (mirrors the existing
count_admins(active_only=True) <= 1 check on delete_user / update_user).
Recovery from zero admins requires direct DB access, so this closes
a path where a scheduler/bootstrap actor that bypasses normal admin
checks could otherwise empty the group.

* security(auth): throttle remaining email-bombing + token-confirm endpoints

Address code-review gap on PR #165 — the first commit covered /send-link
but missed two endpoints with the IDENTICAL email-bombing surface:

- POST /auth/password/reset       — sends reset mail, anti-enum response
- POST /auth/password/setup/request — sends setup mail, anti-enum response

Both now share the 5/min limit with /send-link.

Also add 10/min to the token-confirm surfaces — high-entropy tokens but
partial leaks via logs / referer have surfaced before, and unbounded
guess rate would let an attacker exhaust the keyspace adjacent to a
leaked prefix:

- POST /auth/email/verify
- GET  /auth/email/verify         — closes the click-through bypass
- POST /auth/password/reset/confirm
- POST /auth/password/setup/confirm

Doc fix: rate_limit.py module docstring + CHANGELOG entry no longer
claim "disable without a redeploy" (misleading). The Limiter constructor
freezes `enabled` from env at import time, matching every other Agnes
env knob — operators set the flag and bounce the container.

Tests: 4 new cases in test_auth_rate_limit.py covering
/reset, /setup/request, /reset/confirm, GET /verify. Full suite:
2583 passed, 32 skipped, 0 failed.

* security(auth): throttle JSON /auth/password/setup — closes form-throttle bypass

Second code-review pass on PR #165 caught a fifth gap: POST /auth/password/setup
(JSON variant, kept for backward compat) consumes the same setup_token as
the web form /setup/confirm but was unthrottled — an attacker brute-forcing
the token just switches from the form path to the JSON path and resumes
at unbounded RPS. Apply the same 10/min limit and signature shape used
on /setup/confirm.

Also extend CHANGELOG note about the JSON-variant bypass for future
operators reading the security entry.

Test: 1 new case (test_password_setup_json_rate_limited_after_10_requests),
9 rate-limit tests + 28 password-flow tests + 41 auth-provider tests pass,
no regressions.

* chore(release): cut 0.30.1 — auth security hardening (rate limit + last-admin guard)
2026-05-02 21:08:33 +02:00
ZdenekSrotyr
dc03837a7b feat(query-api): better error message when --remote query references a materialized-but-not-rebuilt id
E2E sub-agent finding: `da query --remote "SELECT * FROM <id>"` against a
materialized table that hasn't yet been rebuilt in the server's
analytics.duckdb returns a confusing DuckDB "Table does not exist"
message even though the table is in the registry. Materialized rows
produce parquets at `${DATA_DIR}/extracts/<source>/data/<id>.parquet`,
but the orchestrator's master-view creation is `_meta`-driven — fresh
instances or pre-tick states have the registry row without a
corresponding view, so analysts hit the bare "does not exist" with no
path forward.

Improve the error rendering in `app/api/query.py:execute_query`. When
DuckDB raises a "table does not exist" error, scan the registry for any
`query_mode='materialized'` row whose id or name appears in the failed
SQL. On a hit, return a 400 whose detail names the table, explains the
materialize state, and offers two concrete next steps:

1. Run `da sync` (or wait for the scheduler tick / hit
   POST /api/sync/trigger) to materialize the parquet, OR
2. Query the source directly via the catalog alias when the registry row
   carries bucket+source_table (e.g. `bq."dataset"."table"` for BigQuery,
   `kbc."bucket"."table"` for Keboola).

Detection is bounded — the registry round-trip only fires when DuckDB's
error mentions a missing table, so happy-path queries pay no cost.
Non-materialized unknowns fall through to DuckDB's raw error.

2 new tests: materialized id surfaces the hint with the bucket+source_table
payload; unknown table falls back to the generic error path with no false
positive on the new hint.
2026-05-01 23:09:52 +02:00
ZdenekSrotyr
8030a867ec fix(admin-api): keep source_type validator permissive when primary is 'local' (bootstrap)
The strict source_type-availability validator from the prior commit
broke ~12 existing tests that register tables on the default test
instance (where `data_source.type` resolves to 'local' since no
instance.yaml is loaded).

The intent of the validator is to catch *explicit* misconfig:
`type=bigquery` instance + `source_type=keboola` payload with no
`data_source.keboola.*` block. The bootstrap workflow — admin sets up
a fresh instance and registers a few tables before pointing at a real
source — should not be gated here.

Loosen the check: when `get_data_source_type()` returns 'local' (the
fallback when no `data_source.type` is set), skip the rejection. The
explicit mismatch case still 422s because that path resolves
`configured_primary` to a real source type.

Also adds an autouse keboola_instance fixture to test_journey_sync_query.py
which exercises Keboola registrations through the full sync→query
flow — the fixture documents the test's data-source assumption rather
than relying on the bootstrap escape hatch.
2026-05-01 23:09:15 +02:00
ZdenekSrotyr
bc3ba0d43d feat(admin-api): reject register-table for source_type not configured on instance
E2E sub-agent finding: instance configured with `data_source.type='bigquery'`
and no `data_source.keboola.*` block. Admin POSTs `{source_type: 'keboola'}`
to /api/admin/register-table → returns 201, row lands in the registry, but
never syncs because the scheduler has no Keboola URL/token to ATTACH
against. Operator only notices the gap when `da catalog` keeps showing
nothing.

The new `_validate_source_type_configured` helper runs immediately after
the id/view-name collision checks in `register_table`. A source_type is
considered configured when:

- it matches `get_data_source_type()` (the instance's primary), OR
- a non-empty `data_source.<source_type>` block exists in the effective
  `instance.yaml` (multi-source instance), OR
- it's in `_SOURCE_TYPES_INDEPENDENT_OF_DATA_SOURCE` (Jira / local — both
  get data through paths that don't involve `data_source.*`).

Returns 422 with a message that names the configured primary source and
points at `/admin/server-config` for enabling a secondary one. None /
empty source_type is still tolerated for backward compat with legacy CLI
scripts that don't set the field — the route resolves it later.

5 new tests cover: keboola-on-bq rejected, bq-on-keboola rejected,
matching source_type still works, jira allowed regardless, omitted
source_type passes through.

Existing tests that registered Keboola rows on the unconfigured default
test instance now opt into a `keboola_instance` fixture to satisfy the
new validator (tests/test_admin_bq_register.py + .keboola_materialized
+ .unregister_cleanup; the multi-source PUT test in test_admin_bq_register
adds a `keboola` block to its synthetic config).

Pre-existing test_missing_project_returns_error failure in
TestRebuildFromRegistry is unrelated (config-cache leakage from a
previous test in the same class) — confirmed pre-existing on the prior
commit via `git stash` reproduction.
2026-05-01 23:04:51 +02:00
ZdenekSrotyr
dd46461c6c fix(admin+orchestrator): DELETE registry drops parquet + sync_state; rebuild skips orphan parquets
E2E sub-agent finding: register a materialized BQ row → sync to materialize
the parquet at `/data/extracts/bigquery/data/<id>.parquet` → DELETE the
registry row. The DB row goes away but:

- the parquet file stays on disk forever, AND
- the sync_state row stays, so `/api/sync/manifest` keeps advertising the
  dropped table to `da sync`, AND
- the orchestrator's next rebuild can resurrect a master view by picking
  up the leftover parquet.

Two-part fix in `unregister_table`:

1. For materialized rows on bigquery/keboola, remove
   `${DATA_DIR}/extracts/<source_type>/data/<name>.parquet` (and any stale
   `<name>.parquet.tmp` from a crashed prior materialize). Filename is
   keyed on `table_registry.name` to match sync_state bookkeeping.
   File-removal errors are logged but don't fail the DELETE — the registry
   row is already gone, and an orphan parquet won't get a master view at
   next rebuild because the orchestrator's _meta-driven scan never picks
   up bare parquet files.

2. Always clear `sync_state` + `sync_history` rows for the dropped table_id
   so the manifest stops advertising the table — applies to all source
   types and modes, not just materialized, since any synced row had a
   sync_state entry.

Orchestrator-side defensive guard (Finding 2b) is a no-op in the current
implementation: `_attach_and_create_views` only creates master views from
`_meta` rows in each connector's `extract.duckdb`, so a parquet without a
matching `_meta` entry is already invisible to the rebuild. The new
test `test_orchestrator_skips_orphan_parquet_in_extracts` is kept as a
regression guard for that contract.

5 tests cover: BQ + Keboola materialized DELETE removes parquet, remote
DELETE doesn't error trying to remove a non-existent file, sync_state
cleared on DELETE, orchestrator orphan-skip invariant.
2026-05-01 22:54:11 +02:00
ZdenekSrotyr
f0979f997a fix(admin-api): reject backtick BQ-native source_query at register; surface materialize errors per-row
E2E testing showed admin POSTs of materialized BQ rows whose source_query
uses BigQuery-native backtick identifiers (`prj.ds.t`) silently no-op'd at
the next sync tick — the materialize path runs the SQL through the DuckDB
BQ extension's COPY which uses DuckDB's parser; backticks aren't recognized
and the query either parse-errors or matches zero rows. No parquet lands at
the canonical path and no error reaches an operator-visible surface.

Two-part fix:

1. RegisterTableRequest's _check_mode_query_coherence model_validator now
   rejects any source_query containing a backtick with a 422 + actionable
   message pointing at the DuckDB equivalent (bq."dataset"."table"). Same
   check is applied in update_table on the merged record so PATCHes that
   flip a stored source_query to backtick form are also caught. Covers BQ
   AND Keboola materialized rows since both connectors funnel source_query
   through DuckDB's COPY.

2. _run_materialized_pass now persists per-row failures via the new
   SyncStateRepository.set_error / clear_error methods (existing
   sync_state.error / status columns — no schema migration). GET
   /api/admin/registry enriches each row with `last_sync_error` from a
   single batched SELECT against sync_state, so the admin UI / da admin
   status can show "this table failed last sync because: X" instead of
   operators having to trawl scheduler logs. Recovered rows have the
   error cleared automatically — update_sync's success path resets
   status='ok' / error=NULL on the upsert.

The materialized-path test fixture's _materialized_payload helper is
updated to use DuckDB-flavor SQL (the prior backtick example pre-dated the
fix). 6 new tests cover register/update rejection on BQ + Keboola, the
sync_state error persistence, and the registry response surface.
2026-05-01 22:51:02 +02:00
ZdenekSrotyr
a4339ce679 fix(admin+diagnose): address 2 additional Devin Review findings on PR #152
Devin's second review pass on commit 16938ae7 surfaced 2 more issues:

BUG_pr-review-job-58ae3148_0001 — non-BQ materialized via PUT bypasses source_query check
  app/api/admin.py update_table only enforces 'query_mode=materialized
  requires source_query' for source_type='bigquery' rows (via the
  synthetic RegisterTableRequest at line 2129+). Non-BQ source types
  (Keboola) skip the check — admin could PUT {query_mode: materialized}
  on a Keboola local row without source_query, persist successfully,
  then crash at the next sync tick when kb_materialize_query received
  sql=None and DuckDB rejected COPY (None) TO '...'.
  Fix: generic coherence guard before the BQ-specific block — for ALL
  source types, query_mode='materialized' requires non-empty source_query
  in the merged record. Returns 422 with a hint about reverting via
  query_mode='local'/'remote'.

ANALYSIS_pr-review-job-642ff90f_0007 — diagnose returns 'ok' on BQ resolution failure
  app/api/health.py:_check_bq_billing_project caught get_bq_access()
  exceptions and returned status='ok' with a 'could not resolve' detail.
  Automated alerting keyed on status != 'ok' would silently miss missing
  google-cloud-bigquery, auth failures, or malformed config. Fix: return
  status='unknown' on resolution failure — surfaces it on operator
  dashboards without promoting the overall health to 'degraded' (which
  'warning' does, intentionally for the billing==project case).

Tests:
  - test_update_keboola_to_materialized_without_source_query_rejected:
    PUT {query_mode: materialized} on a Keboola local row returns 422
    with 'source_query' in the detail
  - test_diagnose_returns_unknown_status_when_bq_resolution_fails:
    when get_bq_access raises, the bq_config service entry surfaces
    status='unknown' (not 'ok')

Full sweep: 2507 passed, 25 skipped, 0 failed (+2 from previous sweep
because of the 2 new regression tests; 8 pre-existing internal_roles
schema-migration failures still ignored per task brief).
2026-05-01 21:21:23 +02:00
ZdenekSrotyr
16938ae7cb fix(materialized): address 4 Devin Review findings on PR #152
Devin Review on commit 7052a235 flagged 4 real bugs in the Keboola
materialized path. All four are fixed; 3 new regression tests pin the
behavior so future refactors can't quietly regress.

BUG_pr-review-job-3fbd31c9_0001 — _run_materialized_pass gated behind 'if bq_project:'
  app/api/sync.py:444-466 wrapped the entire materialized pass (which
  dispatches BOTH BigQuery AND Keboola rows by source_type) in a check
  for data_source.bigquery.project being non-empty. On Keboola-only
  instances this short-circuited and Keboola materialized rows sat in
  table_registry forever without their SQL being evaluated — the feature
  CHANGELOG advertised was dead code on the most common deployment shape.
  Fix: always run the materialized pass; the BQ branch's per-row try/except
  catches the typed BqAccessError(not_configured) the sentinel raises
  when no BQ project is set, so non-BQ instances incur a per-row error
  for any (hypothetical) BQ-tagged row but the Keboola path runs cleanly.
  Log line renamed 'Materialized BQ' → 'Materialized SQL' to match.

BUG_pr-review-job-3fbd31c9_0004 — wrong config key 'url' instead of 'stack_url'
  app/api/sync.py:149 read get_value('data_source', 'keboola', 'url'),
  but the canonical config key documented in instance.yaml.example:111
  and used by app/api/admin.py:1503 + 2359 is 'stack_url'. Production
  Keboola instances would always see an empty URL and fail with the
  'not configured' error. The pre-existing test patched the wrong key
  too, so it passed without catching the mismatch. Fix: use stack_url
  in both sync.py and the test fixture.

BUG_pr-review-job-3fbd31c9_0003 — no atomic write in Keboola materialize_query
  connectors/keboola/extractor.py wrote COPY directly to the final
  '<id>.parquet' path. A mid-COPY failure (network, disk full, extension
  crash) left a partial parquet that the orchestrator rebuild would
  later pick up and serve to analysts. BQ's materialize_query already
  uses a '<id>.parquet.tmp' staging path + os.replace() atomic swap
  (connectors/bigquery/extractor.py:370-445); Keboola now mirrors that
  pattern with the same try/except cleanup on COPY failure.

BUG_pr-review-job-3fbd31c9_0002 — full file read into memory for MD5
  Same file:60-62 used parquet_path.read_bytes() for the MD5 hash.
  Multi-GB Keboola materialized results would OOM on memory-constrained
  containers. BQ's version uses streaming 8 KiB-chunk hashing
  (connectors/bigquery/extractor.py:438-442); Keboola now mirrors it.

Tests:
  - test_run_sync_runs_materialized_pass_on_keboola_only_instance —
    pins BUG_0001's fix; setting bigquery.project='' must NOT skip
    Keboola materialized dispatch
  - test_keboola_materialize_atomic_write_on_failure — pins BUG_0003;
    a mid-COPY RuntimeError leaves no .parquet AND no .parquet.tmp at
    the canonical path
  - test_keboola_materialize_uses_tmp_path_during_copy — documents the
    atomic-write contract: COPY targets .parquet.tmp, final swap to
    .parquet (no .tmp suffix on the result['path'])
  - existing test_run_materialized_pass_dispatches_keboola_to_keboola_extractor
    fixture updated: stack_url instead of url

Full sweep: 2505 passed, 25 skipped, 0 failed (modulo 8 pre-existing
internal_roles schema-migration failures called out in the task brief).
2026-05-01 20:58:17 +02:00
ZdenekSrotyr
b627de8344 feat(diagnose) + docs: warn on USER_PROJECT_DENIED footgun + document all newly-exposed knobs
Diagnostic + operator-facing documentation that closes the loop on the work in this PR.

`da diagnose` (via /api/health/detailed):
  - New _check_bq_billing_project() helper. When data_source.type='bigquery' and BqProjects.billing == .data, surface a yellow warning: 'BigQuery billing project equals data project'. Hint includes the YAML field path + the /admin/server-config UI shortcut. Diagnose's overall status promotes warning → degraded so the CLI echoes it.
  - Non-BQ instances (Keboola-only, etc.) skip the check.
  - Implementation hooks into the existing /api/health/detailed surface — no new endpoint, no CLI changes.

config/instance.yaml.example documentation:
  - data_source.bigquery.billing_project: USER_PROJECT_DENIED hint, /admin/server-config UI reference
  - data_source.bigquery.legacy_wrap_views: analyst-side discipline note (use `da fetch` / `da query --remote`), issue #101 history, view-heavy deployment guidance
  - data_source.bigquery.max_bytes_per_materialize: cost guardrail block (NEW — wasn't documented in .example before)
  - ai.base_url: provider list + UI hint
  - openmetadata + desktop: 'configurable via /admin/server-config UI' headers
  - corporate_memory: leading note that the schema is editable via UI

Other docs:
  - CHANGELOG.md: comprehensive Unreleased section
  - CLAUDE.md: schema chain → v20 + Materialized SQL connector mode + per-connector tab UI mention
  - README.md: mode-first source table summary
  - docs/architecture.md: per-connector tab UI mention
  - cli/skills/connectors.md: bootstrap rails (parallel to #154)
  - docs/superpowers/plans/2026-05-01-admin-tables-form-cleanup.md: implementation plan archive (2515 lines)
  - scripts/seed_dummy_tables.py: drop is_public after #150 RBAC migration (column gone)

Tests:
  - test_diagnose_billing.py — 3 cases (BQ with billing==data warns, BQ with billing!=data clean, non-BQ skips)
2026-05-01 20:27:24 +02:00
ZdenekSrotyr
df7f5b1d9a feat(admin-ui): /admin/server-config known-fields registry + structured nested editor
Today /admin/server-config renders fields by iterating Object.keys(payload) on the YAML value — if a key isn't in instance.yaml, the operator can't see it. They have to know to type it via the JSON-patch textarea (which only renders for empty sections) or SSH and edit YAML.

Adds a known-fields registry (`_KNOWN_FIELDS` in app/api/admin.py) the UI consumes alongside the YAML payload. Renderer shows BOTH:
  - existing fields (from YAML) with current value
  - known-but-unset fields with dashed-border placeholder + hint, ready to fill in

Renderer (`renderField`, `renderSection`, `collectSection`):
  - kind="string"|"secret"|"bool"|"int"|"select"|"object"|"array"|"map" — picks input type
  - kind="object" with `fields` — recursive structured form, arbitrary depth (corporate_memory needs 3-4 levels)
  - kind="array" with `item_kind` — vertical stack of typed inputs + add/remove buttons
  - kind="map" with `key_kind` + `value_kind` — key:value rows + add/remove (used for confidence.base, domain_owners, entity_resolution.entities)
  - data-path encoded as JSON segment array so map keys with embedded dots (e.g. 'user_verification.correction') survive collect → patch round-trip
  - .cfg-field.is-unset CSS — dashed border, muted label, italic hint

Sections newly exposed (added to _EDITABLE_SECTIONS):
  - openmetadata: url, token (secret), cache_ttl_seconds, verify_ssl
  - desktop: jwt_issuer, jwt_secret (secret), url_scheme

Known fields populated for existing sections:
  - data_source.bigquery: billing_project (the cause of the 403 USER_PROJECT_DENIED footgun when SA can read but not bill the data project), legacy_wrap_views (bigquery_query() wrap for VIEWs — issue #101 default off, ON for view-heavy deployments), max_bytes_per_materialize (cost guardrail)
  - data_source.keboola: stack_url, project_id (hints; values already populated)
  - ai: base_url (required for openai_compat), structured_output (select)
  - corporate_memory: full schema from instance.yaml.example — distribution_mode, approval_mode, review_period_months, notify_on_new_items, sources.{claude_local_md,session_transcripts}, extraction.{model,sensitivity_check,contradiction_check}, confidence.{base,modifiers,decay.{mode,half_life_months,decay_rate_monthly,floor}}, contradiction_detection.{enabled,max_candidates}, entity_resolution.{enabled,entities}, domain_owners, domains
  - Known partial: confidence.modifiers is map<string, map<string, float>> — falls through to JSON-textarea with TODO; structured editor for that one shape needs more renderer work

Tests:
  - test_admin_server_config_known_fields — registry envelope shape, smoke fixture
  - test_admin_server_config_renderer_depth — 4-level nested objects, arrays of strings, maps of floats, dotted-key safety
  - test_admin_server_config_corp_memory — full corporate_memory schema, 12 fields incl. nested
  - test_admin_server_config — existing tests adjusted for new shape
2026-05-01 20:27:01 +02:00
ZdenekSrotyr
c63f54d643 feat(admin-ui): /admin/tables per-connector tabs + Keboola materialized parity + form cleanup + Manage access deep link
Replaces the single mixed Jinja-branched form at /admin/tables with a per-connector tab interface and brings Keboola to capability parity with BigQuery.

Tab structure:
  - BigQuery tab: Register modal with two-question radio model (Q1 Live | Synced × Q2 Whole | Custom SQL), Discover datasets / List tables / Use-table-as-base autocomplete buttons, table-vs-view auto-detection hint, per-tab listing filter
  - Keboola tab: same two-question radio (Q2 only — no Live mode for Keboola), Custom SQL textarea against kbc."bucket"."table" for materialized rows
  - Jira tab: read-only listing (Jira is webhook-driven; no Register form)
  - Active tab persists in window.location.hash so refresh keeps the operator in place

Form cleanup (within tabs):
  - Drops the misleading 'Sync Strategy' dropdown — runtime never read it (only profiler.is_partitioned() consumes the value for parquet-layout detection); kept in DB for back-compat (Pydantic deprecated)
  - Adds Sync Schedule input to Keboola Register/Edit (was missing — scheduler honored per-table cron via is_table_due() for every source but the Keboola UI had no surface)
  - Hides Primary Key under <details>Advanced with clarifying hint that it's catalog-metadata only (Agnes does not perform upsert/dedup; every sync is a full overwrite)
  - Drops the Strategy column from the registry listing (every Keboola row defaulted to full_refresh after Strategy was hidden — column was noise)
  - Removes the legacy out-of-tab #registerModal + the legacy global Discovery panel; each tab now owns its own header + Register button + listing div

Edit modal:
  - BigQuery Edit modal physically relocated into <section id="tab-content-bigquery"> (mirrors Phase E Register placement)
  - Keboola Edit modal mirrors Register (same Q2 radio, Discover/List buttons via parameterized helpers)
  - openEditModal(table) dispatches by source_type to the right modal — fixes a quiet bug where Phase F's openEditKeboolaModal was never wired up and Keboola edits silently used the legacy modal

Per-row Manage access deep link:
  - Each row in the per-tab listing has a lock-icon button between Edit and Delete that navigates to /admin/access#table:<table_id>
  - admin_access.html bootstrap reads window.location.hash and pre-fills the resource filter, mirroring the existing ?group=<id> deep-link pattern

Tests:
  - test_admin_tables_tab_ui.py — tab nav, hash persistence, register-button-per-tab, listing partition by source_type, Manage access deep link
  - test_admin_tables_ui_materialized.py — two-question radio (BQ + Keboola), Discover/List/Use-as-base buttons, Edit modal parity, Jira read-only
2026-05-01 20:26:29 +02:00
ZdenekSrotyr
85d3810535 feat(materialized): query_mode='materialized' for BigQuery + Keboola — admin SELECT → parquet → analyst
Closes the 'admin pre-stages a curated table/view for analysts' use case end-to-end across both supported source connectors.

Backend (BigQuery + Keboola, schema v20):
  - schema v20 adds source_query TEXT to table_registry (renumbered from v19 after main's #150 RBAC migration also bumped to v19)
  - connectors/bigquery/extractor.py adds materialize_query(table_id, sql, *, bq, output_dir, max_bytes=...) — BqAccess session, dry-run cost guardrail (default 10 GiB, configurable via data_source.bigquery.max_bytes_per_materialize), idempotent ATTACH, rows/bytes/md5 metadata for sync_state
  - connectors/keboola/access.py — new KeboolaAccess facade (parallel of BqAccess) wrapping ATTACH 'keboola://...' AS kbc
  - connectors/keboola/extractor.py adds materialize_query — same shape, no dry-run analog (Keboola Storage API has different cost model); legacy bucket-download path skips query_mode='materialized' rows
  - app/api/sync.py:_run_materialized_pass dispatches by source_type to the right materialize_query
  - app/api/admin.py: RegisterTableRequest accepts source_query; model_validator coheres mode↔source_query↔bucket; PUT preserves omitted fields; deprecation marks (Field(deprecated=True)) on sync_strategy + profile_after_sync (no extractor reads them; profile_after_sync becomes inert — bug from earlier work where /api/sync/trigger never honored the flag); _BQ_OPTIONAL_FIELD_DEFAULTS injects defaults into GET /server-config payload

Operator + CLI surface:
  - da admin register-table --query / --query-mode materialized
  - scripts/smoke-test-materialized-bq.sh — end-to-end smoke for operators

Tests (incl. spike + integration + regression):
  - test_db_migration_v20, test_table_registry_source_query
  - test_bq_materialize, test_bq_cost_guardrail, test_bq_init_extract_skips
  - test_keboola_access, test_keboola_extension_query_passthrough (lock-in for the DuckDB extension capability), test_keboola_materialize, test_keboola_init_extract_skips, test_keboola_materialized_e2e (skipped without KBC_TEST_* creds)
  - test_sync_trigger_materialized, test_sync_trigger_keboola_materialized
  - test_api_admin_materialized, test_cli_admin_materialized
  - test_admin_bq_register, test_admin_discover_bigquery, test_admin_keboola_materialized, test_admin_phase_c_deprecation, test_admin_put_preservation, test_materialized_e2e

Cost: BQ uses bigquery_query() (jobs API, view-aware) — works on tables, views, materialized views uniformly. Keboola uses ATTACH+COPY parquet through the DuckDB extension.
2026-05-01 20:25:56 +02:00
minasarustamyan
d4ac84dd46
feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150)
* feat(rbac): drop dataset_permissions + access_requests + users.role + is_public; v19 migration

BREAKING. Sjednocení datové RBAC vrstvy do per-group resource_grants modelu.
Před PR byla legacy data RBAC vrstva (dataset_permissions + is_public bypass)
de-facto neaktivní — is_public neměl API/UI/CLI surface, default true znamenal
že can_access_table vždycky bypassl. Dnes každý non-admin přístup vyžaduje
explicitní resource_grants(group, "table", id) řádek.

Schema v18 → v19 (src/db.py:_v18_to_v19_finalize):
- DROP TABLE dataset_permissions, access_requests
- DROP COLUMN users.role (NULL artifact since v13)
- DROP COLUMN table_registry.is_public
- Drops přes table-rebuild idiom (rename → create new → INSERT … SELECT
  → drop old) kvůli DuckDB ALTER DROP COLUMN limitacím na tabulkách
  s historic FK constraints. INSERT picks intersection sloupců, takže
  test fixtures s minimal pre-v19 schemou migrate cleanly.

Runtime:
- src/rbac.py:can_access_table → deleguje na app.auth.access.can_access
- DatasetPermissionRepository, AccessRequestRepository smazány
- AGNES_ENABLE_TABLE_GRANTS env-gate v app/resource_types.py odstraněn
  (TABLE je unconditionally enabled)

API drop:
- app/api/permissions.py, app/api/access_requests.py celé soubory
- /admin/permissions web route + admin_permissions.html
- "Request Access" modal v catalog.html + locked-row UI
- ~10 if user.get("role") != "admin" checků nahrazeno (admin shortcut
  je uvnitř can_access_table)
- /api/settings: drop permissions field z GET; PUT /api/settings/dataset
  gate přepnut na can_access(user_id, "table", dataset, conn)

Auth:
- app/auth/jwt.py:create_access_token: drop role parametr (claim zmizí
  z nově vydávaných JWT; staré tokeny zůstávají valid, claim ignored)
- app/api/users.py: drop role z CreateUserRequest / UpdateUserRequest
  (admin promotion = explicit add to Admin group via memberships API)
- src/repositories/users.py: drop role z create() / update()

CLI:
- da admin set-role smazán → hard-fail s replacement command
- da admin add-user --role flag pryč
- da auth import-token --role flag pryč
- da auth whoami: drop "Role:" výpis
- cli/config.py:save_token: role parametr now optional, no longer written
  (back-compat se starými token.json soubory zachována — pole se ignoruje)

Tests:
- DELETE: test_permissions.py, test_permissions_api.py, test_access_requests_api.py
- REWRITE: test_access_control.py (resource_grants flow), test_rbac.py
  (can_access_table over resource_grants), test_journey_rbac.py
  (drop access-request flow), test_resource_types.py (drop env-gate
  tests, drop is_public from helpers), test_v2_*.py (drop role-based
  user dicts in favor of id-based + Admin group membership),
  test_settings_api.py (no permissions field, can_access gate)
- TRIVIAL: ~30 souborů — drop role="admin" arg z UserRepository.create
  a 3rd positional role z create_access_token
- NEW: test_v18_to_v19 migration test (test_db.py),
  test_can_access_table_no_implicit_public (test_rbac.py),
  test_admin_set_role_returns_hardfail (test_cli_admin.py)
- OpenAPI snapshot regenerated

Docs:
- CHANGELOG: BREAKING entry pod [Unreleased]
- CLAUDE.md: schema v18 → v19
- docs/architecture.md: schema table + RBAC sekce přepsána
- docs/auth-google-oauth.md: admin promotion přes da admin break-glass
- cli/skills/security.md: kompletně přepsáno na group-based model
- docs/TODO-rbac-data-enforcement.md: smazáno (TODO splněn)

Test results: 2363 passed, 19 failed. Zbývající failures jsou pre-existing
Windows-specific issues (fcntl, charset) nesouvisející s tímto PR —
ověřeno git stash pop.

Plan: ~/.claude/plans/floofy-coalescing-parnas.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): cut 0.27.0

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-04-30 22:02:16 +02:00
minasarustamyan
fb1573766a
feat(admin): users/groups UI polish + SSO lock + v18 migration (#142)
Cuts release 0.24.0.

## Highlights
- SSO-managed accounts read-only for password / delete operations (UI + API). New `is_sso_user` flag derived from group memberships.
- Admin/Everyone system rows show `google_sync` chip + Workspace email subtitle when env-mapped.
- Origin pill vocabulary unified across `/admin/groups`, `/admin/access`, `/admin/users`, `/admin/users/{id}`, `/profile` (Admin yellow, Everyone gray, google_sync green, custom purple).
- Effective-access readout no longer short-circuits for admin users — always renders per-resource breakdown.
- Schema migration v18 drops stranded non-google memberships in env-mapped Admin/Everyone groups (cleans up v13's blanket Everyone backfill).

## Devin findings addressed
- _is_sso_user requires source='google_sync' on system-group branches (so v13 system_seed memberships in env-mapped Everyone don't lock out the admin).
- POST add-to-group returns correct origin via _derive_origin (matching GET).
- 8 customer-specific token instances (groupon.com / foundryai) replaced with vendor-neutral placeholders across templates, tests, and CHANGELOG.
- deriveDisplayName name-skip for canonical "Admin"/"Everyone" so an overlapping AGNES_GOOGLE_GROUP_PREFIX doesn't mangle the chip text.

See CHANGELOG [0.24.0] for full notes.
2026-04-30 15:16:04 +02:00
ZdenekSrotyr
70672204fe
feat(memory): admin Edit + MEMORY_DOMAIN RBAC + ai-section UI (#141)
Cuts release 0.23.0.

## Highlights
- Single-item Edit button on every memory item card (modal hits PATCH /api/memory/admin/{id}).
- MEMORY_DOMAIN RBAC resource type — admins grant user_groups access to specific domains via /admin/access. Composes with existing audience filter (OR semantics, no-op when no grants).
- ai: section editable in /admin/server-config — admins can set ANTHROPIC_API_KEY / model / provider / base_url for the corporate-memory extractor without editing instance.yaml directly. api_key auto-masked.

## Devin findings addressed
- Modal NULL→empty fix (audience visibility wouldn't break).
- Stats endpoint granted_domains parity with list endpoint.
- Documented intentional MEMORY_DOMAIN→audience bypass.
- Documented conscious ai.base_url SSRF exclusion (legit internal LiteLLM/vLLM proxies).

See CHANGELOG [0.23.0] for full notes.
2026-04-30 11:04:41 +02:00
ZdenekSrotyr
83adf01bde
fix(v2): #134 BigQuery cross-project errors return structured 502/400 + BqAccess facade (#138)
* docs(spec): #134 unify BigQuery access behind BqAccess facade

Brainstorm output for issue #134. Captures:
- root cause (incl. correction of the issue's hypothesis about commit 33a9964)
- BqAccess facade API + project resolution rules
- error contract — typed BqAccessError mapped to HTTP 502 for upstream
  BQ failures, 500 for deployment/config bugs
- migration plan for v2_scan, v2_sample, RemoteQueryEngine
- test rewrite eliminating _bq_client_factory injection point
- E2E verification protocol on agnes-development as success criterion

* docs(spec): #134 revise after first review

Incorporates code-reviewer findings:

Must-fix:
- Add v2_schema (2 copies of INSTALL/LOAD/SECRET dance) to migration scope.
- Reframe v2_scan headline: missing try/except around BQ calls is the
  actual cause of bare 500s, not project resolution (which 33a9964 fixed).
- List two more deferred call sites (extractor.py, register_bq_table)
  with explicit rationale.

Important:
- Drop billing != data clause from cross_project_forbidden heuristic;
  rely only on 'serviceusage' substring. billing != data is normal
  for cross-project setup, was over-classifying.
- Split bq_bad_request into _user (400) and _server (502) variants;
  add sql_origin parameter to translate_bq_error so call sites declare
  whether SQL contains user input.
- Add @functools.cache to BqAccess.from_config; document tests bypass
  via dependency_overrides.
- Replace monkey-patched-classmethod test pattern with
  BqAccess(client_factory=...) injection at construction time. Cleaner
  than today's _bq_client_factory and 1:1 migration shape.
- Keep BqProjects.data (reviewer assumed registry has source_project;
  it doesn't). Multi-project explicitly listed as non-goal with note.

Nice-to-have:
- Add 'Implementation strategy' section: 2 staged commits (bug fix
  alone is revertable; refactor follows).
- Extend E2E protocol to cover all three endpoints, not just /sample.
- Note removal of stale docstring at src/remote_query.py:204.

* docs(spec): #134 revision 3 — incorporates second-round review

Must-fix from second review:
- v2_schema split into two migration cases: _fetch_bq_schema translates
  errors via translate_bq_error; _fetch_bq_table_options preserves its
  swallow-all 'except Exception → return {}' so /schema doesn't 502 on
  partition-info failures.
- RemoteQueryEngine.__init__ now resolves BqAccess lazily (in
  _get_bq_client, not in __init__). Without this, ~7 DuckDB-only tests
  in test_remote_query.py would suddenly fail with not_configured.
- translate_bq_error pass-through for BqAccessError is now load-bearing
  (clause 1, before any Google-API branch). bq.client() raises BqAccessError
  for bq_lib_missing/auth_failed; without explicit pass-through those
  fall to 'unknown' and re-raise as bare 500.
- Commit 1 now emits the SAME structured response shape as commit 2 to
  avoid contract churn between commits.
- BIGQUERY_PROJECT env-var precedence is BREAKING for env-only deployments
  — flagged in CHANGELOG ### Changed.

Editorial:
- sql_origin renamed to bad_request_status with values 'client_error' /
  'upstream_error' (clearer about what the parameter actually decides).
  bq_bad_request_user/_server kinds collapsed to bq_bad_request (400)
  and bq_upstream_error (502).
- CLI (cli/commands/query.py) noted as external RemoteQueryEngine caller;
  unaffected because new bq_access kwarg has default None.
- Added unit/integration tests for the new contracts:
  test_translate_passes_through_BqAccessError,
  test_v2_scan_returns_500_on_bq_lib_missing,
  test_v2_schema_returns_200_with_empty_partition_on_bq_failure,
  test_resolve_succeeds_after_config_set.
- E2E protocol now covers /schema as the fourth endpoint.
- Documented functools.cache-doesn't-cache-exceptions semantics and
  fixture nullcontext-doesn't-close caveat for nested sessions.

* docs(spec): #134 revision 4 — incorporates third-round review

Third reviewer verdict: 'implementation-ready with two trivial edits';
explicitly noted prior rounds did the heavy lifting.

Edits:
1. get_bq_access() module-level function instead of @classmethod
   @functools.cache from_config. Removes the classmethod-cache stacking
   footgun (different Python versions wrap differently) and gives FastAPI's
   dependency introspection a clean function signature. Drops the
   'Do not subclass BqAccess' caveat that no longer applies.

2. Commit 1 strategy explicitly: wrap _fetch_bq_sample (v2_sample),
   _bq_dry_run_bytes + _run_bq_scan (v2_scan), and _fetch_bq_schema
   (v2_schema strict block). Do NOT touch _fetch_bq_table_options swallow-all
   in commit 1 — preserved as-is, then migrated (still preserved) in commit 2.
   All three endpoints emit the same structured body shape so client parsers
   see one consistent contract throughout the staged rollout. No more
   half-rolled-out window where /sample is bare 500 while /scan is
   structured 502.

* docs(plan): #134 implementation plan — Phase 1 (atomic bug fix) + Phase 2 (BqAccess refactor) + Phase 3 (verification)

Bite-sized TDD tasks. 3 phases, 16 tasks total:

Phase 1 (Commit 1) — atomic bug fix across all four v2 endpoints:
  Tasks 1.1-1.5 wrap _fetch_bq_sample, _bq_dry_run_bytes, _run_bq_scan,
  _fetch_bq_schema with structured 502/400 try/except. _fetch_bq_table_options
  preserved untouched. CHANGELOG Fixed entries.

Phase 2 (Commit 2) — BqAccess facade extraction + migration:
  Tasks 2.1-2.5 build connectors/bigquery/access.py bottom-up
  (BqProjects, BqAccessError, translate_bq_error, default factories,
  BqAccess class, get_bq_access module-level cached). Task 2.6 adds
  conftest.py fixture. Tasks 2.7-2.9 migrate v2_scan, v2_sample, v2_schema
  to BqAccess. Tasks 2.10-2.11 migrate RemoteQueryEngine + tests
  (lazy bq_access, drop _bq_client_factory). Task 2.12 CHANGELOG
  Changed BREAKING + Internal.

Phase 3 — Verification:
  3.1 full pytest. 3.2 squash into two PR-shape commits. 3.3 manual
  E2E on agnes-development per spec protocol → close #134.

Self-review table maps spec sections to implementing tasks; no gaps.

* fix(v2): #134 structured 502/400 on BQ errors across /scan, /scan/estimate, /sample, /schema

Wraps the BigQuery call sites in v2_scan, v2_sample, and v2_schema (strict
block only) with try/except for google.api_core exceptions, translating to
HTTPException with a structured body shape: {error, message, details}.

Fixes Pavel's report (#134) where these endpoints returned bare HTTP 500
with no body when the SA on agnes-development hit cross-project Forbidden
on serviceusage.services.use.

Also fixes /sample's missing billing_project fallback (the bug 33a9964
fixed for /scan never landed here).

Status code split:
  - /scan, /scan/estimate: BadRequest -> 400 (bq_bad_request) since SQL is
    user-derived from req.select/where/order_by.
  - /sample, /schema: BadRequest -> 502 (bq_upstream_error) since SQL is
    server-constructed from validated identifiers.
  - All Forbidden -> 502 with cross_project_forbidden if 'serviceusage' in
    error message (with hint pointing at data_source.bigquery.billing_project),
    else bq_forbidden.

Body shape matches what the upcoming BqAccess refactor (next commit) will
produce, so client-side parsers see one consistent contract throughout
the staged rollout.

_fetch_bq_table_options preserved exactly as-is — its swallow-all-and-return-empty
contract is intentional and survives into the refactor; /schema continues to
return 200 with empty partition info when partition queries fail.

Outer wraps in scan_endpoint, scan_estimate_endpoint, sample, and schema
endpoints exist only to make the test pattern (monkeypatching whole
_fetch_* functions) work, and are tagged TODO(#134 Phase 2) for removal
once BqAccess centralizes translation.

* refactor(bq): #134 BqAccess facade — unify v2_scan, v2_sample, v2_schema, RemoteQueryEngine

Extracts the duplicated BigQuery-access pattern (project resolution +
client construction + DuckDB-extension session + Google-API error
translation) into connectors/bigquery/access.py. Migrates four
call sites to use it:

- app/api/v2_scan.py — _bq_dry_run_bytes, _run_bq_scan
- app/api/v2_sample.py — _fetch_bq_sample
- app/api/v2_schema.py — _fetch_bq_schema (strict translation),
  _fetch_bq_table_options (preserves swallow-all best-effort contract)
- src/remote_query.py — RemoteQueryEngine, lazy bq_access kwarg

The new module exposes:
- BqProjects (frozen dataclass: billing + data project IDs)
- BqAccessError (typed exception with HTTP_STATUS class mapping)
- BqAccess (facade with injectable client_factory/duckdb_session_factory
  for tests; defaults call the real google-cloud-bigquery + DuckDB extension)
- get_bq_access (module-level @functools.cache; FastAPI Depends target)
- translate_bq_error (Google API exception → BqAccessError mapper, with
  BqAccessError pass-through, 'serviceusage'-substring heuristic for
  cross_project_forbidden, and bad_request_status param distinguishing
  user-derived (400) from server-constructed (502) SQL)
- _default_client_factory, _default_duckdb_session_factory

RemoteQueryEngine.__init__ no longer accepts _bq_client_factory; tests
migrate to bq_access=BqAccess(projects, client_factory=...). DuckDB-only
RemoteQueryEngine tests need no changes — bq_access defaults to None and
get_bq_access() is only invoked on first BQ call (lazy resolution).
BqAccessError raised internally is translated to RemoteQueryError(
error_type="bq_error") in _get_bq_client to preserve the engine's
existing public contract — CLI and /api/query/hybrid callers see no change.

Endpoint tests (test_v2_scan, test_v2_scan_estimate, test_v2_sample,
test_v2_schema) migrate from monkey-patching whole _fetch_* functions
to using the new bq_access fixture in tests/conftest.py — which
exercises the REAL translation path through BqAccess + translate_bq_error,
closing the test gap flagged in Task 1.1's review.

Side-effect behavior change: v2_sample's FROM clause now uses the data
project (instance.yaml data_source.bigquery.project), not the conflated
billing_project from Phase 1. Documented in CHANGELOG ### Internal.

BREAKING for deployments combining BIGQUERY_PROJECT env var with
data_source.bigquery.project in instance.yaml — env var now overrides
data project too. See CHANGELOG ### Changed.

Two known-duplicate BQ-access sites (connectors/bigquery/extractor.py,
scripts/duckdb_manager.register_bq_table) explicitly out of scope;
tracked as follow-up.

Removed stale docstring at the previous src/remote_query.py:204
that referenced scripts.duckdb_manager._create_bq_client as the default
BQ client factory (RemoteQueryEngine never actually used that function).

Test counts: tests/test_bq_access.py +27 (new), tests/test_v2_*.py +
tests/test_remote_query.py migrated to bq_access fixture (counts unchanged
or +1-2 per file). Full suite: 2086 passed, 8 pre-existing failures
(DB migration tests with unrelated internal_roles DependencyException —
not introduced by this PR).

* fix(bq_access): translate DefaultCredentialsError to BqAccessError(auth_failed)

CI on PR #138 caught: bigquery.Client(...) resolves Application Default
Credentials at construction time; without ADC (CI without SA key, dev
laptop without 'gcloud auth application-default login') it raises
google.auth.exceptions.DefaultCredentialsError synchronously.

Pre-fix _default_client_factory only caught ImportError, so DefaultCredentialsError
propagated as raw exception — and from production endpoints would surface
as bare 500 (the exact failure mode #134 sets out to fix).

Now translates to BqAccessError(kind='auth_failed', details.hint='Run
gcloud auth application-default login...'). Endpoint catch chain returns
HTTP 502 with structured body. Adds unit test
test_raises_auth_failed_on_default_credentials_error.

Third-round spec review flagged this case in passing; the fix didn't land.
CI's auth-less environment surfaced it.

* fix(bq_access): get_bq_access() returns sentinel instead of raising when not configured

Devin BUG_0001 on PR #138 review: 'get_bq_access() as FastAPI Depends
breaks all v2 endpoints for non-BigQuery instances'.

Pre-fix: get_bq_access() raised BqAccessError(not_configured) when
neither BIGQUERY_PROJECT env nor data_source.bigquery.project was set.
Because FastAPI resolves Depends() BEFORE the endpoint body runs, this
exception fires during dep-injection — the endpoint's try/except
BqAccessError clause never gets a chance to catch it. Result: every
v2 request on Keboola-only or CSV-only instances returned bare HTTP
500, even for local-source tables that never touch BigQuery.

Fix: get_bq_access() now returns a sentinel BqAccess with empty
BqProjects and factories that raise BqAccessError(not_configured)
on actual use. Construction succeeds, FastAPI's dep-injection cleanly
yields the sentinel, the endpoint runs. The local-source code path
in build_sample / build_schema / etc. never calls bq.client() or
bq.duckdb_session() (it reads parquet directly), so non-BQ tables
return 200 as before. Only when an endpoint actually tries to query
BQ (source_type == 'bigquery') does the sentinel raise — and the
endpoint's existing except BqAccessError catches it normally,
returning structured 502 with hint.

Test get_bq_access::test_raises_not_configured_when_neither_set
renamed and rewritten to test_returns_sentinel_when_neither_set:
asserts BqAccess is returned, then asserts client() and
duckdb_session() each raise BqAccessError(not_configured) on call.

Test test_does_not_cache_exceptions removed (no longer applicable)
and replaced with test_sentinel_is_cached_per_process documenting
the operator-restart-on-config-change contract.

* docs(spec+plan): #134 genericize customer-specific tokens (CLAUDE.md OSS rule)

Devin BUG_0001/0002 round 3 on PR #138: spec and plan docs contained
customer-specific deployment hostnames, deployment names, and a GCP
project ID that violated CLAUDE.md's vendor-agnostic OSS rule
('Nothing customer-specific belongs in code, configuration defaults,
comments, docs, commit messages, PR titles, or PR bodies').

Replacements:
  agnes-development.groupondev.com -> <your-agnes-host>
  agnes-development                -> <your-dev-instance>
  prj-grp-dataview-prod-1ff9       -> <your-data-project>
  s1_session_landings              -> <bq_table_id>

E2E verification semantics unchanged — operators still run the same
four curls + config flip + retry, just substituting their own host /
deployment name / project / table.

* fix(bq_access): hook get_bq_access.cache_clear into instance_config.reset_cache

Devin ANALYSIS_0004 on PR #138: get_bq_access is @functools.cache'd at
process level, so it captures BigQuery project IDs at first call and
ignores subsequent instance.yaml changes. Pre-Phase-2 the v2 endpoints
re-read get_value() on every request, so admin /api/admin/server-config
saves (which call instance_config.reset_cache()) hot-reloaded the BQ
project. Without this fix, my refactor silently regresses that contract
— operators editing instance.yaml via the admin UI would see no effect
on v2 endpoints until container restart.

instance_config.reset_cache() now also calls
connectors.bigquery.access.get_bq_access.cache_clear() (lazy import,
swallowed if connectors module isn't loaded — keeps instance_config
usable in isolated unit tests).

Adds test_instance_config_reset_cache_invalidates_get_bq_access as
regression guard. Updates CHANGELOG Internal entry to mention the
hot-reload contract + the not-configured sentinel behavior (round-3
fix from Devin BUG_0001 was previously only in commit message).

* fix(bq_access): surface not_configured before identifier validation + plan path genericize

Devin BUG_0001 + BUG_0002 round 5 on PR #138.

BUG_0001 (plan doc): personal filesystem path violated CLAUDE.md
vendor-agnostic rule. Replaced with '<worktree-root>' placeholder.

BUG_0002 (sentinel error path): when get_bq_access() returns the sentinel
BqAccess (BQ not configured), the empty bq.projects.data was reaching
validate_quoted_identifier first and raising ValueError -> endpoint
mapped to HTTP 400 'unsafe_identifier' instead of structured 500
'not_configured' with hint.

Each fetch helper now checks 'if not bq.projects.data: bq.client()' as
the first step, which triggers the sentinel's BqAccessError(not_configured).
Endpoint catches the typed error and returns HTTP 500 with hint pointing
at data_source.bigquery.project. Best-effort _fetch_bq_table_options
returns {} silently in this case (preserves the swallow-all contract).

* fix(bq_access): classify DuckDB-native exceptions from bigquery_query() via string match

Devin ANALYSIS on PR #138 review (latest round). The DuckDB bigquery
extension is a C++ plugin making its own HTTP calls — when BQ returns
403, it throws duckdb.IOException with the BQ error embedded as text,
not gax.Forbidden. translate_bq_error's isinstance checks would miss
these, falling to case 7 → bare 500 in production for v2_scan, v2_sample,
and v2_schema (the bigquery_query() paths).

Fix: last-resort string-match heuristic before the re-raise. 'Forbidden'
/ '403' / 'Bad Request' / '400' in the lowercased message classifies via
the same kind hierarchy. The 'serviceusage' substring still distinguishes
cross_project_forbidden from bq_forbidden. Specific enough that random
exceptions without HTTP-error keywords still re-raise.

Adds 4 unit tests covering the new heuristic + the 'don't swallow random
exceptions' invariant.

* chore(release): cut 0.22.0

PR #138 contains issue #134 user-visible behavior changes:
- BREAKING: BIGQUERY_PROJECT env var now overrides instance.yaml
  data_source.bigquery.project for v2 endpoints (previously
  RemoteQueryEngine billing only).
- Fixed: structured 502/400 on /api/v2/sample, /scan, /scan/estimate,
  /schema when BigQuery raises Forbidden/BadRequest (was bare 500).
- Internal: BqAccess facade refactor unifying four duplicate BQ-access
  call sites; instance_config.reset_cache() now invalidates BqAccess
  cache too so admin server-config saves hot-reload BQ project IDs.

Bumps to 0.22.0 because PR #137 merged first and took 0.21.0.
2026-04-30 10:11:20 +02:00
minasarustamyan
4ec5ff44dd
feat(setup): cross-platform TLS bootstrap + marketplace plugin install (#137)
Bootstraps the Agnes Claude Code marketplace + RBAC-allowed plugins from
the dashboard CTA, and inlines the server's TLS cert when the chain isn't
publicly trusted (self-signed / private CA). Cross-platform setup prompt
covers Windows Git Bash, macOS, Linux. Includes Bun-compiled `claude` fix
(macOS goes via git-clone fallback, same as Windows), PAT stripping after
clone, explicit error handling, and four rounds of Devin Review fixes
(phantom step references, $PLATFORM re-detection, heredoc/awk line-count
sync). Cuts 0.21.0.

See CHANGELOG.md [0.21.0] section for details.
2026-04-30 08:56:45 +02:00
Vojtech
38f6b639d2
feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136)
Cuts release 0.20.0.

## Highlights
- X-Request-ID header on every response + sanitized to [A-Za-z0-9_-] (CRLF log-forging mitigation)
- Error pages (HTML + JSON 500) surface request_id for support tickets
- Dev debug toolbar gated by DEBUG=1 — fastapi-debug-toolbar with custom DuckDBPanel
- Centralized app.logging_config.setup_logging() replaces 23 scattered basicConfig calls
- Telegram bot drops bot.log file — stdout only (BREAKING)

## Devin findings addressed
- BUG_0001: .env.template no longer claims FastAPI debug=True
- BUG_0002: subprocess extractor logs INFO to stderr again
- ANALYSIS_0003: _wants_html no longer matches Accept: */* (curl gets JSON as before)
- BUG on b1c6ee9: HTML 500 page no longer leaks str(exc) in production
- BUG on b13d2fe: 2 CLAUDE.md compliance flags (transform.py + ws_gateway) accepted as scope-limited logging refactor — follow-up to update CLAUDE.md if needed

See CHANGELOG [0.20.0] for full notes.
2026-04-29 22:54:21 +02:00
ZdenekSrotyr
b7a1795834
feat(scheduler): re-wire sync_schedule + script.schedule; tune via env; OpenMetadata TLS (#135)
Bundles 4 issues:
- #79 — table_registry.sync_schedule honored at runtime (API-side filter + Pydantic validators)
- #78 — script_registry.schedule honored via new POST /api/scripts/run-due (atomic claim, BackgroundTask exec, deploy-time safety validation)
- #77 — sidecar JOBS env-driven (SCHEDULER_DATA_REFRESH_INTERVAL/HEALTH_CHECK_INTERVAL/SCRIPT_RUN_INTERVAL/TICK_SECONDS)
- #89 — OpenMetadataClient verify=True default (BREAKING for self-signed)

Cuts release 0.19.0. See CHANGELOG for full notes incl. Known Limitations.
2026-04-29 22:06:30 +02:00
minasarustamyan
953bd9d250
fix(marketplace): use plugin.json name in synth marketplace.json (#133)
Closes the /plugin UI 'Plugin <X> not found in marketplace' bug. Synth marketplace.json catalog 'name' now reads from <plugin_dir>/.claude-plugin/plugin.json (with fallback to upstream marketplace.json name). On-disk plugins/<slug>-<plugin>/ layout preserved so cross-marketplace files don't collide. /marketplace/info exposes both name and prefixed_name (BREAKING — downstream tooling parsing 'name' for the slug-prefixed form must switch to prefixed_name).
2026-04-29 19:25:57 +02:00
minasarustamyan
c940593a90
feat(auth): Google Workspace group prefix filter + system mapping (#131)
Three new env vars wire the Google OAuth callback to a configurable Workspace prefix and route admin/everyone Workspace groups onto the seeded system rows: AGNES_GOOGLE_GROUP_PREFIX, AGNES_GROUP_ADMIN_EMAIL, AGNES_GROUP_EVERYONE_EMAIL. Login gate redirects users with no prefix-matching group to /login?error=not_in_allowed_group. BREAKING: auto-Everyone membership for new users removed. Admin UI/API are read-only on Google-managed groups. See docs/auth-groups.md.
2026-04-29 14:08:04 +02:00
ZdenekSrotyr
82c5d71d63
feat(memory): #62 — duplicate hints + tree-view + bulk-edit (#126)
Issue #62. Tree view with cross-axis filtering, duplicate-candidate hints (Jaccard score on entity overlap), bulk-edit endpoints (PATCH /api/memory/admin/{id} + POST /api/memory/admin/bulk-update), schema v17 (knowledge_item_relations), full CLI parity (da admin memory tree/edit/bulk-edit/duplicates list/resolve).
2026-04-29 13:55:15 +02:00
ZdenekSrotyr
1824b9dd9c
feat(admin): #108 M1 — BigQuery table registration in UI + CLI (#119)
Issue #108 Milestone 1. Adds BigQuery table registration via /admin/tables UI and `da admin register-table` CLI without hand-editing table_registry. POST /api/admin/register-table/precheck for round-trip validation. --dry-run flag on CLI. Audit-log entries on register/update/unregister. PUT /api/admin/registry/{id} now preserves registered_at (closes #130).
2026-04-29 13:18:31 +02:00
ZdenekSrotyr
995e4cd366
fix(scheduler): HTTP marketplaces job + SCHEDULER_API_TOKEN shared secret (#127)
* fix(scheduler): HTTP marketplaces job + SCHEDULER_API_TOKEN shared secret

Two scheduler-reliability bugs surfaced after the v0.12.1 USER-agnes flip:

1. The marketplaces job called src.marketplace.sync_marketplaces() in-process
   from the scheduler container, racing the app's long-lived system.duckdb
   handle. DuckDB rejects cross-process writers — every cron tick 500-ed on
   "Could not set lock on file ... PID 0".

2. The data-refresh + new marketplaces jobs both 401-ed on the API because
   SCHEDULER_API_TOKEN was never propagated by the Terraform startup script.
   The scheduler had no credential to authenticate with.

Fix:
- New POST /api/marketplaces/sync-all (admin-only) drives the nightly refresh
  through the app process so it inherits the existing DB connection.
- Scheduler swaps fn->http for marketplaces; all jobs are now plain HTTP and
  the scheduler is reduced to a cron clock.
- New app/auth/scheduler_token.py adds a shared-secret auth path. The
  startup script generates a 256-bit secret on first boot, persists it
  across reboots, and writes it to /opt/agnes/.env. Both containers source
  the same .env. The app validates incoming Bearer tokens against the env
  var (constant-time, length-floored) and resolves matches to a synthetic
  scheduler@system.local user that's a member of the Admin system group.
  Audit-log entries from the scheduler are attributed to this user.
- app/main.py seeds the synthetic user at startup so the first cron tick
  has a valid actor; lazy seed in get_scheduler_user covers token rotation
  before the next app restart.

Tests: 5 new in tests/test_auth_scheduler_token.py covering empty/short
secret rejection, exact-match comparison, idempotent user seeding, and
lazy provisioning. 142 marketplace + scheduler tests + 96 auth tests
remain green.

Existing VMs with .env from before this change need a one-time
re-provisioning (re-run startup-script or rotate via openssl rand);
documented in CHANGELOG.

* fix(audit): use '_all' sentinel for bulk marketplace sync — Devin review #127

Avoids the literal string 'marketplace:None' in the audit_log resource
column when the bulk sync endpoint writes its summary row.

* fix(scheduler): unblock event loop + per-job timeouts — Devin review #127

Two findings from Devin re-review on commit 5fbad15:

1. BUG: trigger_sync_all was async def, so FastAPI ran it on the asyncio
   event loop. sync_marketplaces() does blocking I/O (subprocess git
   clones up to GIT_TIMEOUT_SEC=300 each, threading.Lock, DuckDB writes)
   and would freeze every concurrent request for the duration of a bulk
   sync. Switched to plain def so FastAPI auto-routes to the thread pool.

2. ANALYSIS: scheduler used a fixed 120s httpx timeout for every POST.
   Bulk marketplace sync iterates the registry under a single lock with
   up to 300s per repo — easily exceeds 120s on 2-3 slow repos. The
   scheduler then sees a timeout, doesn't update last_run, and re-fires
   on the next 30s tick, queueing redundant work. Per-job timeout
   override added to the JOBS tuple; marketplaces gets 900s (15 min),
   data-refresh keeps 120s, health-check 30s.

* fix(auth): require_session_token rejects scheduler shared secret — Devin review #127

require_session_token gates /auth/tokens (PAT minting). Pre-fix it only
rejected JWTs with typ=pat — but the scheduler shared secret is an opaque
string, so verify_token() returns None, payload becomes {}, and the
PAT-claim check silently passed. A caller bearing SCHEDULER_API_TOKEN
could mint persistent PATs that survive a secret rotation.

Added explicit is_scheduler_token() check before the PAT-claim check;
new regression test in tests/test_auth_scheduler_token.py.

Devin's other note (pre-existing async def trigger_sync at marketplaces.py:392
also calls blocking sync_one) — Devin flagged it as out-of-scope for this PR
and I agree; tracking separately.

* release(0.17.0): cut + clean up CHANGELOG duplicates

Cuts 0.17.0 (minor: scheduler shared-secret auth + sync-all endpoint
plus the deploy-shape fixes that landed since the last release tag).

Bumps pyproject from 0.15.0 — also corrects the missed bump from PR #120
(v0.16.0 was tagged on GitHub and shipped as :stable, but pyproject
stayed at 0.15.0, so /api/version, /cli/latest, and `da --version` had
been under-reporting the running release).

Removes the long-form duplicate entries for 0.13.0 / 0.14.0 / 0.15.0
above [0.16.0] — the canonical short summaries (with GitHub-release
links) already exist below 0.16.0, the long forms were leftover state
from before those versions were cut and have been silently shadowed
ever since.
2026-04-29 11:44:00 +02:00
ZdenekSrotyr
61f6b8d2d5
feat(ci+tests): deploy safety audit — linting, rollback, smoke tests, 50+ new tests (#120)
Comprehensive deploy safety audit implementing 19 improvements across CI/CD pipeline, test coverage, and source code.

### CI/CD Pipeline
- ruff + mypy added to both release.yml and keboola-deploy.yml (continue-on-error)
- Smoke test added to keboola-deploy.yml (was missing)
- Automatic rollback on smoke test failure in release.yml
- Expanded smoke-test.sh with catalog, admin/tables, marketplace.zip, metrics
- Required status checks via .github/settings.yml
- Dependabot + CODEOWNERS + pre-commit hooks + ruff config

### Source Code
- DB schema version check in /api/health (db_schema: ok/mismatch/unhealthy)
- Config versioning (config_version: 1 in instance.yaml, non-blocking validation)
- BigQuery extractor ATTACH error handling (try/except around INSTALL+ATTACH)
- Post-deploy smoke test script for prod VM validation

### Test Coverage (~50 new tests)
- v13->v14 migration, Email magic link TTL, PAT, Marketplace ZIP/Git,
  Jira webhooks, Hybrid Query BQ, Keboola/BQ extractor failure modes,
  Orchestrator failure modes

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-04-29 09:18:55 +02:00
ZdenekSrotyr
6752c4a53e
fix(web): restore admin nav menu items (#122)
v13 RBAC migration nulled users.role and moved admin authority onto user_group_members. Header still gated on session.user.role == 'admin', so admin menu was hidden for everyone. Inject user['is_admin'] via is_user_admin in get_current_user; header reads session.user.is_admin.
2026-04-29 09:09:23 +02:00
ZdenekSrotyr
1baadd172e
fix(ui): render shared header full-width on corporate memory pages (#117)
Move {% include '_app_header.html' %} out of .container-memory (max-width: 1000px)
in corporate_memory.html and corporate_memory_admin.html so the header spans the
viewport, matching dashboard.html. Page content stays constrained by the container.
2026-04-29 07:45:56 +02:00
PavelDo
e1108b6112
feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72)
Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).
2026-04-29 07:16:22 +02:00
minasarustamyan
7a06f1a585
feat(auth): /me/debug self-only auth diagnostic page (#116)
Adds /me/debug HTML page rendering the logged-in user's own session state — decoded JWT claims (no raw token, sha256[:12] fingerprint for log correlation), group memberships with sources and bound external_id when present, resource grants effective via those memberships, and a Refetch from Google (dry-run) button that diffs a fresh fetch_user_groups call against the cached user_group_members snapshot. Gated by AGNES_DEBUG_AUTH env var (default off → 404, route existence undetectable in production). Self-only by construction: user_id is read from the validated session, never echoes raw JWT / password hash / full PAT. Tolerates v13 + v14 schemas via information_schema check on users.external_id.
2026-04-29 06:36:28 +02:00
ZdenekSrotyr
2e1dfb7553
feat(v2): claude-driven fetch primitives + 0.14.0 (#102)
Replaces the BigQuery wrap-view pattern with a discovery + scoped-fetch toolkit driven by the analyst's Claude session. Adds /api/v2/{catalog,schema,sample,scan,scan/estimate}, da catalog/schema/describe/fetch/snapshot/disk-info CLI commands, sqlglot-backed WHERE validator, process-local quota tracker, agent rails skill (cli/skills/agnes-data-querying.md). BREAKING: BQ wrap views off by default — set data_source.bigquery.legacy_wrap_views=true for one cycle. Backward-compat field_validator on primary_key. Catalog cache now matches documented 300s TTL with RBAC fresh per request. Cuts release v0.14.0.
2026-04-29 01:07:19 +02:00
ZdenekSrotyr
a222f92e70
feat(admin): server configuration editor + 0.13.0 (#107)
Adds /admin/server-config UI for editing instance.yaml from the web. Hardening: SSRF gate on data_source URLs, narrow-overlay write strategy, atomic writes, audit log with secret masking on shape changes, threading lock on read-modify-write, corrupt-overlay refusal on write side + louder log on read side, modal Promise resolution on backdrop dismiss, sentinel scrub on save (defense-in-depth client+server). Bundles Windows PowerShell wrapper from #80. Cuts release v0.13.0.
2026-04-29 00:47:23 +02:00
ZdenekSrotyr
5f6bb7a4b2
fix(security+ops) + release(0.12.1): #82 #85 #87 hardening + cut 0.12.1 (#104)
* fix(security+ops): #82 #85 #87 — auth hardening, API validation, deploy posture

Security and operational hardening across three issue groups:

- M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun)
- C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile)
- M21: Docker resource limits (mem_limit, cpus) on app + scheduler
- M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server)
- M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING)
- M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs

- C2: table_id traversal validation on /api/data/{table_id}/download
- M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename

- C5: reset_token removed from POST /api/users/{id}/reset-password response
- C8: Startup WARNING when no user has password_hash (bootstrap window visible)
- M9: Audit log on failed web form login (mirrors /auth/token endpoint)
- M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch)

Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety

Review fixes:
- Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex — was
  missing, allowing requests to AWS/GCP/Azure metadata endpoints
- Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex
- M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with
  warning log — prevents unhandled exception if DB write fails after
  successful token consumption
- Add test for 169.254.169.254 SSRF rejection

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak

Address Devin Review findings on PR #104:

1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution +
   ipaddress module checks. The old regex patterns like `fe80:` only
   matched up to the first colon, missing real IPv6 addresses like
   `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the
   hostname via getaddrinfo and checks each resulting IP against
   ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast.

2. CLI commands broken: `da setup test-connection`, `da setup verify`,
   `da diagnose`, `da status` all called /api/health expecting the old
   format (status=="healthy", services dict). Now they call
   /api/health/detailed for service-level checks (with graceful fallback
   to the minimal endpoint when auth is not configured).

3. Temp file handle leak: _stream_to_temp returns an open
   NamedTemporaryFile; callers now close it before shutil.move() to
   prevent FD leaks until GC.

Also adds IPv6 SSRF test cases (loopback, link-local, unique-local,
multicast) with mocked DNS resolution for test environment independence.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): download regex blocks hyphenated IDs; document health split

Address Devin Review round-3 findings on PR #104:

1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download
   endpoint used the strict SQL-identifier regex which does not allow
   dots or hyphens, but Keboola table IDs like in.c-crm.orders
   contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots
   and hyphens while still blocking path-traversal chars (/, .., \)
   and quote/control characters. Added test for hyphenated/dotted IDs.

2. Documented health endpoint split in DEPLOYMENT.md: Added Health
   checks & external monitoring section explaining both endpoints
   (minimal unauth /api/health vs authenticated /api/health/detailed)
   and how to wire external monitoring tools to the detailed endpoint
   with a PAT.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening

* fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up)

Devin review on the rebased PR flagged the asymmetry: magic-link verify
got the atomic compare-and-swap pattern in the original M10 fix, but
password reset confirm at /auth/password/reset/confirm was still using
read-validate-clear. Two concurrent POSTs with the same valid reset
token could both succeed in setting different new passwords (last-write-
wins). Lower severity than the magic-link race because the attacker
would need the reset token AND to race the legitimate user, but the
asymmetry was a polish gap.

Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write
unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then
SELECT to verify our marker won, then proceed. Only the winner clears
the marker and applies the password change.

New regression test_concurrent_reset_only_one_wins in
tests/test_password_flows.py::TestResetConfirm pins the contract: two
ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same
token; exactly one gets 302 (password applied), the other gets 200 with
'Invalid or expired'. Sanity-checked against the pre-CAS code — both
POSTs got 302 (race confirmed).

---------

Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-04-28 19:57:30 +02:00
ZdenekSrotyr
0c963b55ef fix(rbac+auth): system-group PATCH accepts description; Google sync preserves memberships on empty
Two Devin-flagged regressions on the squashed PR #106 head:

1) PATCH /api/admin/groups/{id} blanket-rejected on system groups.

   The repository guard at src/repositories/user_groups.py was already
   narrowed to "rename only" by 7147bac (PR #110 follow-up), but the
   endpoint at app/api/access.py:331-343 still short-circuited with
   409 "System groups are immutable" for any mutation. A description-only
   payload like {"description": "..."} returned 409 instead of 200 even
   though the repo would have accepted it. CHANGELOG entry promised the
   fix but the code didn't match.

   Endpoint now mirrors the repo contract: 409 only when payload.name
   is set AND differs from existing name. Same-name no-op renames are
   dropped before the repo call. Description-only updates flow through.

2) Google OAuth callback wiped google_sync memberships on transient
   API failure.

   fetch_user_groups is fail-soft and returns [] for both "user has no
   groups" and "Cloud Identity API error". The callback fed that empty
   list into replace_google_sync_groups, which DELETEs all rows with
   source='google_sync' for the user then INSERTs zero — silently
   wiping every Workspace-synced membership on a hiccup.

   Callback now skips replace_google_sync_groups when group_names is
   empty and logs "preserving existing memberships". Trade-off: a user
   whose Workspace groups were genuinely cleared keeps stale memberships
   until the next non-empty sync. Admin-added rows (source='admin') were
   already protected by source-scope and are unaffected. The previous
   guard against this exact regression was test_callback_empty_groups_
   does_not_overwrite_existing in tests/test_auth_providers.py — that
   test class has been skipped since v12 (asserts users.groups JSON,
   needs rewrite for user_group_members).
2026-04-28 14:40:27 +02:00
Vojtech Rysanek
7147bac079 feat(rbac+marketplace): schema v14 FK + AGNES_ENABLE_TABLE_GRANTS + break-glass CLI
Follow-up to the RBAC v13 + marketplace work in the parent commit. Addresses
deferred Devin findings, gemini-flagged blockers, and adds three guard rails.

== Schema v14 — FK constraints on user_group_members + resource_grants ==
Adds DuckDB foreign-key constraints so cascade deletes can no longer leave
orphaned member / grant rows pointing at a deleted group_id (which were
relying on application-level cascades up to v13). Migration is RENAME →
CREATE-with-FK → INSERT → DROP, wrapped in BEGIN TRANSACTION so a partial
failure rolls back without leaving the DB at a half-applied schema.

== AGNES_ENABLE_TABLE_GRANTS feature flag (default off) ==
ResourceType.TABLE was shipped in the parent commit as listing-only — admins
can record grants but runtime enforcement still flows through legacy
dataset_permissions. To avoid the misleading-UX surface area, the chip is
hidden from /admin/access and POST /api/admin/grants returns 422 with the
env-var name in detail until the operator opts in. Existing TABLE rows in
resource_grants stay listable + deletable so cleanup is never blocked.

Helpers: is_resource_type_enabled(rt), enabled_resource_types().

== Break-glass admin CLI ==
`da admin break-glass <user>` adds the user to the Admin user_group with
source='system_seed' regardless of RBAC state. Bypasses authentication —
relies on filesystem access to ${DATA_DIR}/state/system.duckdb implying
host-level trust. Recovery path when the operator has locked themselves
out of /admin/access.

== Devin round-2 fixes (deferred on b4ec4c4) ==
- src/repositories/user_groups.py — narrow update() guard from blocking any
  mutation on system groups to blocking name change only. Description edits
  now pass through. Endpoint pre-check stays as defense-in-depth. Prior
  behavior surfaced as a misleading 409 'Cannot rename a system group' on
  description-only PATCH.
- app/api/access.py:delete_group — wrap cascade DELETEs + repo.delete in
  BEGIN TRANSACTION / COMMIT / ROLLBACK. Prevents orphan rows if any
  DELETE fails after the user_groups row is gone.
- app/marketplace_server/{packager,router}.py — split compute_etag_for_user()
  from build_zip(); router resolves etag first and 304-shorts before any
  file read or ZIP_DEFLATED. In-process cachetools.TTLCache (default 120s,
  env-tunable via AGNES_MARKETPLACE_ETAG_TTL, set 0 to disable).
  invalidate_etag_cache() called by sync to force re-hash on content drift.

== Tests ==
- TestTableGrantsFeatureFlag (4 cases) — endpoint exclude/include, grant
  rejection/acceptance under the flag.
- test_v12_to_v13_finalize_rollback_on_failure — destructive: monkeypatches
  _seed_system_groups to raise mid-transaction, asserts schema_version stays
  at 12, legacy tables intact, new tables empty (rollback fired). Then
  restores the real function and asserts the retry succeeds.
- test_update_system_group_description_allowed,
  test_update_system_group_same_name_no_op — repo-level coverage of the
  narrowed guard.
2026-04-28 14:25:13 +02:00
ZdenekSrotyr
e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three workstreams.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE listing-only — admins can record table grants,
  runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00
ZdenekSrotyr
ef74ec010c
fix(ops): #81 Group B — Keboola partial-failure exit code 2 (squashed) (#99)
Closes M14 from issue #81. Keboola extractor exits 0/1/2
(success/full-fail/partial). sync.py interprets exit 2 as
PARTIAL FAILURE (data-quality alert, distinct from exit 1).

Tests: tests/test_keboola_extractor_exit_codes.py — 14 cases including
runtime mock subprocess (rc=0/1/2/124).

Refs #81 Group B.
2026-04-27 21:52:46 +02:00
ZdenekSrotyr
23be8ad46f
fix(security): #81 Group A — orchestrator attach hardening (squashed) (#95)
Closes the C1 findings from issue #81 plus the round-3/4 follow-ups
on the read-only query path.

Both _attach_remote_extensions (rebuild path) and
_reattach_remote_extensions (query path) now apply the same hard
allowlists for extensions and token-env names, single-quote-escape
the URL, and split built-in vs community install. The CHANGELOG bullet
documents the full scope including the table_schema → table_catalog
fix that made the rebuild path a silent no-op for every connector.

New module src/orchestrator_security.py centralises the policy. Tests
in tests/test_orchestrator_remote_attach_security.py — 28/28 pass.

Refs #81.
2026-04-27 21:34:04 +02:00
ZdenekSrotyr
24e81fb671
fix(security): gate Script-API /run on admin role (#44) (#92)
* fix(security): gate Script-API /run on admin role (#44)

The AST + string-blocklist sandbox in `_execute_script` is defense-in-depth,
not a primary trust boundary. It does not block `vars()`, `type()`, or
`__class__.__bases__` introspection chains, and the string blocklist is
trivially evadable via concatenation/dunder encoding. Treat the role gate
as the actual barrier: only admin can run scripts.

- `POST /api/scripts/run` and `POST /api/scripts/{id}/run` now require admin.
- `POST /api/scripts/deploy` stays analyst-accessible (storing != executing).
- Existing /run tests retargeted to admin_token; added regression tests
  asserting analyst → 403 on both endpoints.
- CHANGELOG: BREAKING (security) bullet under Unreleased/Changed.

Closes #44.

* fix(security): admin-gate /deploy + harden sandbox blocklist (review #92)

Reviewer of PR #92 flagged three MUST-FIXes that #44 wasn't fully closed:

1. /api/scripts/deploy still accepted analyst → planted-script attack
   path (analyst plants malicious source, waits for admin to /run).
   Now: /deploy also requires admin; the entire Script API is admin-only.

2. The "Minimum (same-day)" blocklist mitigations from issue #44 weren't
   applied. Added the introspection-chain dunders that the issue PoC
   pivots through: __subclasses__, __globals__, __class__, __base__,
   __bases__, __mro__, __dict__, __code__, __builtins__. Plus `vars`
   in BLOCKED_FUNCTIONS. Deliberately NOT adding __init__ /
   __getattribute__ (substring match would flag every legit `def __init__`)
   nor `type`/`dir` (frequent in legitimate admin scripts). Documented
   the trade-off inline.

3. Tests didn't cover the actual PoC payload nor non-analyst non-admin
   roles. Added test_run_pwn_payload_blocked parametrized over the issue's
   own PoC + two equivalent variants (lambda+__globals__, __mro__
   traversal); these stay green only as long as the dunder list does.
   test_*_requires_admin tests now parametrize over (analyst, viewer,
   km_admin) so all three non-admin core roles are pinned at 403.

Conftest extension: seeded_app now exposes viewer_token and
km_admin_token as siblings to admin_token / analyst_token.

CHANGELOG bullet updated to reflect /deploy gate change and new
internal regression tests. 35/35 scripts tests pass locally.

Refs review of #92.

* fix(tests): test_security TestScriptSandbox needs admin token after #44 hardening

CI failure on PR #92 caught a missed test file. tests/test_security.py
seeded only an analyst user and used the analyst token to drive sandbox
tests. After the #44 admin-gate (deploy + run both admin-only), every
sandbox test got 403 from the role gate before the AST/string check
could run, so 'blocks os.system' / 'blocks eval' / etc. all failed.

Fix: extend the fixture to also seed an admin user and return the admin
token. Sandbox tests now reach the sandbox layer; access-control tests
further down in the module continue to use the analyst that was kept
around. 41/41 test_security.py tests pass locally.

* fix(security): #92 round-3 — gate GET /api/scripts on admin role

Devin Review caught: GET /api/scripts (app/api/scripts.py:44-51) was
left on Depends(get_current_user) when the rest of the API moved to
admin-only. ScriptRepository.list_all() does SELECT * FROM script_registry
which returns ALL columns including 'source' (the full script body).
So any authenticated user (viewer / analyst / km_admin) could read
admin-deployed scripts — leak of code that may contain credentials,
business logic, or admin-only operational details.

CHANGELOG already says 'The entire Script API is now admin-only',
which was true for /deploy, /run, /{id}/run, DELETE — just not for
GET. Now consistent: every Script endpoint requires admin.

Tests:
- New parametrized test_list_scripts_requires_admin over (analyst,
  viewer, km_admin) tokens — all assert 403.
- Updated test_list_scripts_empty in both test_scripts_api.py and
  test_api_scripts.py to use admin_token.

79 tests pass.

Refs Devin Review of #92.

* fix: cleanup unused imports, stale docstrings, and incomplete CHANGELOG

- Remove unused imports: Path, List, get_current_user (ruff F401)
- Trim docstrings to describe current behavior, not change history
- CHANGELOG now lists GET /api/scripts among admin-gated endpoints
- Remove diff-commenting inline comments from tests

Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com>

* fix: merge duplicate Changed sections into one per CLAUDE.md convention

Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-04-27 21:13:56 +02:00
ZdenekSrotyr
2f783c5c0a
fix(security): close Jira webhook fail-open + path traversal (#83) (#93)
* fix(security): close Jira webhook fail-open + path traversal (#83)

Two related vulnerabilities:

1. Fail-open signature check: when JIRA_WEBHOOK_SECRET was unset,
   _verify_signature returned True and any unauthenticated POST to
   /webhooks/jira would run the full ingest pipeline. Now fail-closed —
   the handler short-circuits with 503 (operator-misconfiguration signal,
   distinct from 401 wrong-signature) when the secret is missing.

2. Path traversal via attacker-controlled issue_key: webhook payloads
   carry issue.key, which flowed unsanitized into save_issue (issues_dir /
   "{issue_key}.json"), download_attachment (attachments_dir / issue_key),
   and incremental_transform (raw_dir / "issues" / "{issue_key}.json"). A
   crafted webhook with issue.key="../../etc/passwd" could write outside
   the Jira data dir.

Defense-in-depth: new connectors/jira/validation.py exposes
is_valid_issue_key (whitelist regex ^[A-Z][A-Z0-9_]{0,31}-\d{1,12}$) and
safe_join_under (Path.resolve() containment check). Both are enforced at
the webhook entry point AND at every filesystem boundary in the connector.

Tests:
- New tests/test_jira_validation.py — unit tests for both helpers
  (parametrized invalid keys, traversal/symlink/absolute-path cases).
- Webhook tests: test_unconfigured_secret_returns_503,
  test_path_traversal_in_issue_key_rejected (parametrized over 10 bad keys),
  test_valid_issue_key_accepted.

CHANGELOG: two CRITICAL Fixed bullets under Unreleased.

Closes #83.

* fix(security): close remaining #83 review findings — webhookEvent traversal, _handle_deletion guard, regex tightening

Reviewer of PR #93 flagged four MUST-FIXes:

1. _log_webhook_event used the attacker-controlled `webhookEvent` field
   as a filename component without sanitization. Payload with
   `webhookEvent: "../../tmp/pwn"` could escape WEBHOOK_LOG_DIR. Now:
   - non-`[A-Za-z0-9_-]` runs are replaced with `_` (dot excluded so
     `..` cannot survive sanitization as a directory component)
   - length capped at 64 chars
   - final path routed through safe_join_under
   New regression test `test_webhook_event_path_traversal_sanitized`.

2. _handle_deletion (connectors/jira/service.py:530) and
   process_webhook_event (line 487) still used raw issue_key in path
   builds. Even though the webhook handler validates upstream, the
   "defense-in-depth at every filesystem boundary" claim required these
   too. Both now run is_valid_issue_key and safe_join_under guards.

3. Regex `^[A-Z][A-Z0-9_]{0,31}-\d{1,12}$` permitted underscores in
   project keys. Atlassian's project-key validator does not — `A_B-1`
   is rejected by Jira itself. Tightened to `[A-Z0-9]` and updated
   tests: `ABC_DEF-1` is now invalid, added Cyrillic А-1 (lookalike),
   CRLF, and oversize cases to the bad-key parametrization.

4. Existing test test_deletion_of_nonexistent_issue_returns_true used
   `PROJ-NOEXIST` which is not a real Jira key shape. Updated to
   `PROJ-99999`. The test still exercises the same intent (deletion of
   issue with no local file is idempotent).

73/73 jira tests pass locally (test_jira_webhooks + test_jira_validation
+ test_jira_service + test_jira_service_full + test_jira_incremental).

CHANGELOG updated to document the regex tightening and the new
webhookEvent sanitization.

Refs review of #93.

* fix(tests): test_journey_jira tests assumed fail-open before #83 fix

CI failure on PR #93 caught two journey tests that pinned the OLD
fail-open contract:

- test_webhook_with_no_secret_configured_accepted asserted 200 when
  JIRA_WEBHOOK_SECRET was unset. After the #83 fix that's a 503
  (operator misconfig). Renamed to _refused and flipped the assertion.

- test_webhook_empty_payload_rejected didn't set the secret, so the
  503 short-circuit fired before the empty-payload 400 could. Set
  JIRA_WEBHOOK_SECRET in the patched Config so the test exercises the
  intended path.

56/56 jira journey + webhook + validation tests now pass.

* fix(security): #93 round-3 — webhook fallback format + save_issue early validation

Devin Review caught two real findings:

1. Webhook handler regression: the round-2 fix extracted issue_key only
   from event_data['issue']['key'], but process_webhook_event has long
   supported a fallback 'issue_key' top-level field for certain Jira
   event formats (e.g. delete events historically). The handler now
   blocks those events with 400 before they reach the service layer.
   Fix: mirror process_webhook_event's fallback in the handler — try
   issue.key first, fall through to event_data.get('issue_key') when
   empty. is_valid_issue_key still validates whichever source provided
   the key.

2. save_issue defense-in-depth was incomplete: is_valid_issue_key ran
   AFTER fetch_remote_links and fetch_sla_fields had already used the
   unvalidated issue_key in HTTP URL construction
   ({base_url}/issue/{issue_key}/remotelink etc.). A future internal
   caller invoking save_issue directly with attacker-controlled input
   could trigger outbound requests with a malicious path component
   (limited SSRF / URL-path manipulation against the Jira API server).
   Fix: move the is_valid_issue_key check to immediately after the
   null guard, before any HTTP request or filesystem op. Webhook layer
   still validates upstream, this is the second layer.

66 jira tests pass.

Refs Devin Review of #93.

* fix(changelog): #93 round-4 — add BREAKING marker to fail-closed bullet

Devin Review caught: the JIRA_WEBHOOK_SECRET fail-closed change is a
behavior change for operators (response code 503 vs old 200) that
existing alerting may treat differently. Per CLAUDE.md changelog
discipline rule, operators grep for **BREAKING** before bumping the
pin. Added the marker + a short note on what action operators need
to take (set the env var if they haven't).

Refs Devin Review of #93.

* fix: #93 round-5 — null-issue crash + comment drift

Devin Review caught two findings on the round-4 commit:

1. Pre-existing crash on null issue field: a webhook payload with
   {"issue": null} (rather than omitting the key) caused
   event_data.get("issue", {}) to return None, then issue.get("key")
   raised AttributeError → unhandled 500. Pre-existing but reachable.
   Fix: 'event_data.get("issue") or {}' normalises None to {}, then
   the existing fallback / validation path returns 400 cleanly.
   New regression test test_null_issue_field_does_not_crash.

2. Inline comment drift: the comment at line 77 documented the allowed
   character class as [A-Za-z0-9._-] (with dot) but the regex at line 27
   excludes dot deliberately (so '..' cannot survive sanitization).
   Fixed the comment to match.

52 jira tests pass.

Refs Devin Review of #93 round 5.

* fix: #93 round-6 — process_webhook_event also normalises null issue field

Devin Review caught: the webhook handler at app/api/jira_webhooks.py
correctly handles {"issue": null} via 'event_data.get("issue") or {}',
but process_webhook_event at connectors/jira/service.py:509 still
used the bare 'event_data.get("issue", {})' which returns None on
explicit null. Internal callers (anything that invokes
process_webhook_event without going through the HTTP handler) would
hit the same AttributeError the round-5 fix closed at the handler
layer. Same one-line fix.

32 jira tests pass.

Refs Devin Review of #93 round 5.

* fix: #93 round-7 — issue-key regex uses [0-9] not \d

Devin Review caught: Python 3's \d matches any Unicode decimal digit
(Arabic-Indic ٣, Bengali ৩, Devanagari ३, …). A key like TEST-٣ would
pass the regex even though it's not a valid Jira input. Tightened to
[0-9] (ASCII only).

Added three Unicode-digit cases to the bad-key parametrization in
test_jira_validation.py to lock in the contract.

Refs Devin Review of #93 round 6.

* fix: #93 round-8 — use \\Z anchor not $ in issue-key regex

Devin Review caught: Python's $ anchor matches before a trailing \\n,
so re.match('…$', 'TEST-1\\n') returns a match. is_valid_issue_key
returned True for CRLF-injected keys. \\Z is hard end-of-string and
closes that bypass.

Manual verification:
  is_valid_issue_key('TEST-1\\n') → False (was True before fix)
  is_valid_issue_key('TEST-1\\r\\n') → False
  is_valid_issue_key('TEST-1') → True

Refs Devin Review of #93 round 7.

* docs: #93 round-9 — CHANGELOG regex matches implementation
2026-04-27 19:53:55 +02:00
Petr Simecek
2dfb246996
release(0.11.5): post-merge follow-up — Devin review fixes + authlib warning silenced (#74)
Cuts 0.11.5 with all the [Unreleased] bullets that landed on top of PR #73
between commit a899877 (the original "v0.11.4" tag in the chain) and the
final merge commit on main. No new public-API surface; the user-visible
payoff is that v8→v9-migrated installations work end-to-end (login flows,
GET /api/users, admin nav, the new role-management REST API and its
last-admin protection) and `make local-dev` startup is finally quiet.

Bullets covered (full text in CHANGELOG.md [0.11.5]):
- _hydrate_legacy_role re-resolves from grants on every request — fixes
  privilege-retention after grant revoke via the role-management API.
- Dev-bypass + OAuth callback now pass user_id to resolve_internal_roles
  so direct grants land in the session cache (not the DB-fallback path).
- GET /api/users hydrates user dicts before Pydantic validation
  (HTTP 500 on every migrated install) + same fix for update/delete
  paths so last-admin protection triggers on migrated admins.
- Scheduler stopped spamming POST /auth/token 401 — the auto-fetch
  fallback was always broken; SCHEDULER_API_TOKEN is now the only path.
- POST /auth/token / Google OAuth / password / email-magic-link all
  hydrate user["role"] before issuing the JWT (Pydantic 500 + wrong
  token payload). New TestAuthLoginFlowsPostMigration regression class.
- docs/RBAC.md no longer documents the non-existent implies= keyword
  on register_internal_role.
- _seed_core_roles now actually runs on every connect (the docstring
  was lying — only ran during fresh install + v8→v9). New
  TestSeedCoreRolesSafetyNet regression class.

This commit also adds:
- AuthlibDeprecationWarning suppression at app/main.py top — upstream-
  internal forward-compat note from authlib._joserfc_helpers, not
  actionable on our side. Filter is targeted by class (with a
  message-based fallback) so other DeprecationWarnings remain visible.
- pyproject.toml version: 0.11.4 → 0.11.5.
- CHANGELOG.md: [Unreleased] → [0.11.5] — 2026-04-27, new empty
  [Unreleased] skeleton appended for the next PR to land on.

Tag v0.11.5 follows; keboola-deploy-v0.11.5 tag triggers the
keboola-deploy.yml workflow for agnes-dev.keboola.com.
2026-04-27 02:32:18 +02:00