agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	a222f92e70	feat(admin): server configuration editor + 0.13.0 (#107 ) Adds /admin/server-config UI for editing instance.yaml from the web. Hardening: SSRF gate on data_source URLs, narrow-overlay write strategy, atomic writes, audit log with secret masking on shape changes, threading lock on read-modify-write, corrupt-overlay refusal on write side + louder log on read side, modal Promise resolution on backdrop dismiss, sentinel scrub on save (defense-in-depth client+server). Bundles Windows PowerShell wrapper from #80. Cuts release v0.13.0.	2026-04-29 00:47:23 +02:00
ZdenekSrotyr	5f6bb7a4b2	fix(security+ops) + release(0.12.1): #82 #85 #87 hardening + cut 0.12.1 (#104 ) * fix(security+ops): #82 #85 #87 — auth hardening, API validation, deploy posture Security and operational hardening across three issue groups: - M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun) - C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile) - M21: Docker resource limits (mem_limit, cpus) on app + scheduler - M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server) - M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING) - M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs - C2: table_id traversal validation on /api/data/{table_id}/download - M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename - C5: reset_token removed from POST /api/users/{id}/reset-password response - C8: Startup WARNING when no user has password_hash (bootstrap window visible) - M9: Audit log on failed web form login (mirrors /auth/token endpoint) - M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch) Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90) Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety Review fixes: - Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex — was missing, allowing requests to AWS/GCP/Azure metadata endpoints - Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex - M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with warning log — prevents unhandled exception if DB write fails after successful token consumption - Add test for 169.254.169.254 SSRF rejection Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak Address Devin Review findings on PR #104: 1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution + ipaddress module checks. The old regex patterns like `fe80:` only matched up to the first colon, missing real IPv6 addresses like `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the hostname via getaddrinfo and checks each resulting IP against ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast. 2. CLI commands broken: `da setup test-connection`, `da setup verify`, `da diagnose`, `da status` all called /api/health expecting the old format (status=="healthy", services dict). Now they call /api/health/detailed for service-level checks (with graceful fallback to the minimal endpoint when auth is not configured). 3. Temp file handle leak: _stream_to_temp returns an open NamedTemporaryFile; callers now close it before shutil.move() to prevent FD leaks until GC. Also adds IPv6 SSRF test cases (loopback, link-local, unique-local, multicast) with mocked DNS resolution for test environment independence. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): download regex blocks hyphenated IDs; document health split Address Devin Review round-3 findings on PR #104: 1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download endpoint used the strict SQL-identifier regex which does not allow dots or hyphens, but Keboola table IDs like in.c-crm.orders contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots and hyphens while still blocking path-traversal chars (/, .., \) and quote/control characters. Added test for hyphenated/dotted IDs. 2. Documented health endpoint split in DEPLOYMENT.md: Added Health checks & external monitoring section explaining both endpoints (minimal unauth /api/health vs authenticated /api/health/detailed) and how to wire external monitoring tools to the detailed endpoint with a PAT. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening * fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up) Devin review on the rebased PR flagged the asymmetry: magic-link verify got the atomic compare-and-swap pattern in the original M10 fix, but password reset confirm at /auth/password/reset/confirm was still using read-validate-clear. Two concurrent POSTs with the same valid reset token could both succeed in setting different new passwords (last-write- wins). Lower severity than the magic-link race because the attacker would need the reset token AND to race the legitimate user, but the asymmetry was a polish gap. Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then SELECT to verify our marker won, then proceed. Only the winner clears the marker and applies the password change. New regression test_concurrent_reset_only_one_wins in tests/test_password_flows.py::TestResetConfirm pins the contract: two ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same token; exactly one gets 302 (password applied), the other gets 200 with 'Invalid or expired'. Sanity-checked against the pre-CAS code — both POSTs got 302 (race confirmed). --------- Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-04-28 19:57:30 +02:00
ZdenekSrotyr	5c54320f75	chore(release): cut 0.12.0 Pre-1.0 SemVer convention: BREAKING changes land in MINOR. This release is heavy on those — RBAC v13 schema rewrite, schema v14 FK constraints, marketplace admin god-mode drop, internal_roles/group_mappings/user_role_grants tables removed, exit code 2 from Keboola extractor (partial fail), Script API admin-only, scripts/grpn/ → scripts/ops/. CHANGELOG.md: rename Unreleased → [0.12.0] — 2026-04-28; append a fresh empty Unreleased above so the next PR has somewhere to land. pyproject.toml: 0.11.5 → 0.12.0. Tag the merge commit as v0.12.0 + push the tag after merge to main.	2026-04-28 14:25:13 +02:00
Vojtech Rysanek	7147bac079	feat(rbac+marketplace): schema v14 FK + AGNES_ENABLE_TABLE_GRANTS + break-glass CLI Follow-up to the RBAC v13 + marketplace work in the parent commit. Addresses deferred Devin findings, gemini-flagged blockers, and adds three guard rails. == Schema v14 — FK constraints on user_group_members + resource_grants == Adds DuckDB foreign-key constraints so cascade deletes can no longer leave orphaned member / grant rows pointing at a deleted group_id (which were relying on application-level cascades up to v13). Migration is RENAME → CREATE-with-FK → INSERT → DROP, wrapped in BEGIN TRANSACTION so a partial failure rolls back without leaving the DB at a half-applied schema. == AGNES_ENABLE_TABLE_GRANTS feature flag (default off) == ResourceType.TABLE was shipped in the parent commit as listing-only — admins can record grants but runtime enforcement still flows through legacy dataset_permissions. To avoid the misleading-UX surface area, the chip is hidden from /admin/access and POST /api/admin/grants returns 422 with the env-var name in detail until the operator opts in. Existing TABLE rows in resource_grants stay listable + deletable so cleanup is never blocked. Helpers: is_resource_type_enabled(rt), enabled_resource_types(). == Break-glass admin CLI == `da admin break-glass <user>` adds the user to the Admin user_group with source='system_seed' regardless of RBAC state. Bypasses authentication — relies on filesystem access to ${DATA_DIR}/state/system.duckdb implying host-level trust. Recovery path when the operator has locked themselves out of /admin/access. == Devin round-2 fixes (deferred on b4ec4c4) == - src/repositories/user_groups.py — narrow update() guard from blocking any mutation on system groups to blocking name change only. Description edits now pass through. Endpoint pre-check stays as defense-in-depth. Prior behavior surfaced as a misleading 409 'Cannot rename a system group' on description-only PATCH. - app/api/access.py:delete_group — wrap cascade DELETEs + repo.delete in BEGIN TRANSACTION / COMMIT / ROLLBACK. Prevents orphan rows if any DELETE fails after the user_groups row is gone. - app/marketplace_server/{packager,router}.py — split compute_etag_for_user() from build_zip(); router resolves etag first and 304-shorts before any file read or ZIP_DEFLATED. In-process cachetools.TTLCache (default 120s, env-tunable via AGNES_MARKETPLACE_ETAG_TTL, set 0 to disable). invalidate_etag_cache() called by sync to force re-hash on content drift. == Tests == - TestTableGrantsFeatureFlag (4 cases) — endpoint exclude/include, grant rejection/acceptance under the flag. - test_v12_to_v13_finalize_rollback_on_failure — destructive: monkeypatches _seed_system_groups to raise mid-transaction, asserts schema_version stays at 12, legacy tables intact, new tables empty (rollback fired). Then restores the real function and asserts the retry succeeds. - test_update_system_group_description_allowed, test_update_system_group_same_name_no_op — repo-level coverage of the narrowed guard.	2026-04-28 14:25:13 +02:00
ZdenekSrotyr	e9d7af3cce	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening This squashes 13 commits from ma/staging plus a small docstring translation into a single coherent unit. Three workstreams. == RBAC v13 redesign == - Drops core.viewer/analyst/km_admin/admin hierarchy and the internal_roles / group_mappings / user_role_grants / plugin_access tables. - Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry. - Two authorization primitives in app.auth.access: require_admin — Admin-group god-mode require_resource_access(rt, "{path}") — entity-scoped grants Single DB lookup per request; no session cache; no implies BFS. - /admin/access UI (single page) replaces /admin/role-mapping + /admin/plugin-access. CLI `da admin group/grant ` replaces `da admin role/mapping/grant-role/revoke-role/effective-roles`. - ResourceType.TABLE listing-only — admins can record table grants, runtime enforcement still flows through legacy dataset_permissions (migration plan in docs/TODO-rbac-data-enforcement.md). == Claude Code marketplace == - Aggregated /marketplace.zip + /marketplace.git/ (PAT-gated, RBAC-filtered, content-addressed cache via dulwich). - Admin god-mode dropped on the marketplace surface — admins curate their own view via grants like everyone else. - Bare-repo cache materializes per RBAC-filtered ETag; stale entries not pruned in this iteration (disclaimed in git_backend.py docstring). == #81 #83 #44 security/ops hardening == - #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias). - #81 Group B — Keboola extractor 3-state exit codes: 0 success / 1 total fail / 2 PARTIAL fail Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary alerting must teach it the new partial signal. - #81 Group C — schema v10 view_ownership; rejects silent overwrite of a prior connector's view name on collision. - #81 Group D — extractor-side identifier validation. - #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset + path-traversal fix. - #44 — entire /api/scripts/* surface is admin-only (planted-script + sandbox-bypass risk closed). == Web UI polish + deploy fix == - /admin/access: live grant-count badges (no stale snapshot revert), shared-header CSS link added to /catalog and /admin/{tables,permissions}, per-resource-type colored stripes. - docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't silently shadow sub-mounts and write state to the wrong disk. == OSS vendor-neutralization (waves 1+2) == - scripts/grpn/ → scripts/ops/. Customer-specific identifiers (project IDs, internal hostnames, dev/prod VM IPs, brand names) replaced with placeholders across code, docs, Terraform, Caddyfile, OAuth probe, and planning docs. Downstream infra repos that copied scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must update the path. == Translation == - src/repositories/user_groups.py::ensure_system docstring translated from Czech to English for codebase consistency. Co-authored-by: Mina Rustamyan <mina@keboola.com>	2026-04-28 14:25:04 +02:00
Petr Simecek	2dfb246996	release(0.11.5): post-merge follow-up — Devin review fixes + authlib warning silenced (#74 ) Cuts 0.11.5 with all the [Unreleased] bullets that landed on top of PR #73 between commit a899877 (the original "v0.11.4" tag in the chain) and the final merge commit on main. No new public-API surface; the user-visible payoff is that v8→v9-migrated installations work end-to-end (login flows, GET /api/users, admin nav, the new role-management REST API and its last-admin protection) and `make local-dev` startup is finally quiet. Bullets covered (full text in CHANGELOG.md [0.11.5]): - _hydrate_legacy_role re-resolves from grants on every request — fixes privilege-retention after grant revoke via the role-management API. - Dev-bypass + OAuth callback now pass user_id to resolve_internal_roles so direct grants land in the session cache (not the DB-fallback path). - GET /api/users hydrates user dicts before Pydantic validation (HTTP 500 on every migrated install) + same fix for update/delete paths so last-admin protection triggers on migrated admins. - Scheduler stopped spamming POST /auth/token 401 — the auto-fetch fallback was always broken; SCHEDULER_API_TOKEN is now the only path. - POST /auth/token / Google OAuth / password / email-magic-link all hydrate user["role"] before issuing the JWT (Pydantic 500 + wrong token payload). New TestAuthLoginFlowsPostMigration regression class. - docs/RBAC.md no longer documents the non-existent implies= keyword on register_internal_role. - _seed_core_roles now actually runs on every connect (the docstring was lying — only ran during fresh install + v8→v9). New TestSeedCoreRolesSafetyNet regression class. This commit also adds: - AuthlibDeprecationWarning suppression at app/main.py top — upstream- internal forward-compat note from authlib._joserfc_helpers, not actionable on our side. Filter is targeted by class (with a message-based fallback) so other DeprecationWarnings remain visible. - pyproject.toml version: 0.11.4 → 0.11.5. - CHANGELOG.md: [Unreleased] → [0.11.5] — 2026-04-27, new empty [Unreleased] skeleton appended for the next PR to land on. Tag v0.11.5 follows; keboola-deploy-v0.11.5 tag triggers the keboola-deploy.yml workflow for agnes-dev.keboola.com.	2026-04-27 02:32:18 +02:00
Petr Simecek	83ced81966	feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73 ) * feat(auth): v9 schema — unified role management foundation (WIP) Tasks 1-5, 10 of the role-management-complete plan. Foundation only, follow-up commits add REST API, CLI, UI, and tests. Schema v9: - user_role_grants table: direct user → internal_role mapping (complementary to group_mappings). Drives PAT/headless auth and persists across sessions. Source field tracks 'direct' vs auto-seed. - internal_roles.implies (JSON): transitive role hierarchy. core.admin implies core.km_admin → core.analyst → core.viewer. Resolver does BFS expand at lookup time. - internal_roles.is_core (BOOL): distinguishes seeded core.* hierarchy from module-registered roles. UI renders them differently. - v8→v9 migration: ADD COLUMN, CREATE TABLE, _seed_core_roles + _backfill_users_role_to_grants, then NULL legacy users.role values. DuckDB FK constraint blocks DROP COLUMN — sloupec zůstává jako deprecated artifact (UserRepository ignoruje), fyzický drop deferred. Resolver: - Regex extended to allow dotted namespace (core.admin, context_engineering.admin), max 64 chars total. - expand_implies(role_keys, conn): BFS over implies JSON column. - resolve_internal_roles signature gains optional user_id parameter; unions group-mapping resolution with user_role_grants direct grants before implies expansion. require_internal_role: - Two-path resolution: session cache (OAuth) → DB grants (PAT/headless fallback). PAT clients now legitimately satisfy gates without the OAuth round-trip, fixing the v8 limitation where every PAT-callable admin endpoint needed require_role(Role.ADMIN) instead of require_internal_role(...). Backward-compat: - require_role(Role.X) and require_admin become thin wrappers over require_internal_role(f"core.{role}"). Implies hierarchy preserves the legacy "at least this level" semantics automatically — no per-level comparison code needed. - src/rbac.py helpers (is_admin, has_role, get_user_role, set_user_role, can_access_table, get_accessible_tables) all read from the resolver via _get_internal_role_keys. - UserRepository.create() and update() now mirror role changes into user_role_grants via _grant_core_role helper. Preserves API while making the new table the source of truth. - UserRepository.delete() pre-deletes user_role_grants rows (FK cascade — DuckDB doesn't auto-cascade). - count_admins() reads user_role_grants ⨝ internal_roles instead of the now-NULL users.role column. First consumer: - app/api/admin.py module-level docstring documents the v9 pattern for future module authors. Existing require_role(Role.ADMIN) callsites flow through the wrapper; no behavior change for OAuth callers, and PAT callers gain access via direct grants. Tests: full suite green (1396 passed, 6 skipped). Existing tests exercise the new pathway transparently because UserRepository.create auto-grants. New test_pat_caller_with_direct_grant_passes pins the PAT-aware contract. Schema: v9 (was v8). pyproject.toml + CHANGELOG bump deferred to the final PR-prep commit. * feat(auth): role management complete — REST API + CLI + UI + docs (v0.11.4) Sjednocuje legacy users.role enum s v8 internal-roles foundation pod jeden model s implies hierarchií, dodává admin UI + REST API + CLI pro správu group mappings i přímých user grants, a dělá require_internal_role PAT-aware tak, aby admin endpointy fungovaly uniformly napříč OAuth i headless callery. REST API (app/api/role_management.py, +496 LOC): - 8 endpointů pod /api/admin: internal-roles list, group-mappings CRUD, users/{id}/role-grants CRUD, users/{id}/effective-roles debug. - Všechny gated require_internal_role("core.admin"). Audit-log na každé mutaci (role_mapping.created/deleted, role_grant.created/deleted). - Last-admin protection: refuse to delete the final core.admin grant (mirrors users.py:count_admins protection). - Nový UserRoleGrantsRepository v src/repositories/user_role_grants.py. CLI (cli/commands/admin.py extension, +258 LOC): - da admin role list / show <key> - da admin mapping list / create <group-id> <role-key> / delete <id> - da admin grant-role <email> <role-key> - da admin revoke-role <email> <role-key> - da admin effective-roles <email> - Všechno přes typer + PAT auth, --json flag, response-shape tolerantní. UI (admin_role_mapping.html + admin_user_detail.html + nav + user list): - Nová stránka /admin/role-mapping: internal_roles read-only table + group_mappings table with create/delete forms. - Nová stránka /admin/users/{id}: core role single-select + capabilities multi-checkbox + effective-roles debug (direct + group + expanded). - Existing user list dostává "Detail" link na novou stránku. - Nav link na /admin/role-mapping. Tests: +85 nových testů přes 4 nové soubory: - test_schema_v9_migration.py (8) — fresh install + v8→v9 backfill + legacy column NULL semantics + unknown-role fallback + invariants. - test_api_role_management.py (33) — všech 8 endpointů, happy + error paths, audit-log assertions, last-admin protection. - test_cli_admin_role.py (25 + 1 conditional) — typer subcommands, text + json output, PAT integration smoke. - test_admin_role_mapping_ui.py (9) + test_admin_user_capabilities_ui.py (10) — page rendering, auth gating, form contracts, JS hooks. Full suite: 1482 passed, 6 skipped (was 1396 → +86, žádné regrese). Docs: - docs/internal-roles.md kompletní rewrite — odstranil "no UI yet", přidal hierarchy diagram, dual-path resolution, dotted-namespace convention, admin workflow přes UI/CLI/REST, refresh semantics for group mappings vs direct grants, migration notes. - CLAUDE.md schema v8 → v9. - CHANGELOG.md [0.11.4] s BREAKING marker pro users.role NULL semantics + complete Added/Changed/Removed/Internal sekce. - pyproject.toml: 0.11.3 → 0.11.4. Sequencing: po mergi tohoto PR Pabu rebasuje pabu/local-dev (PR #72) na main, jeho schema migrations se posouvají z v9/v10/v11 na v10/v11/v12. Implementation breakdown: - Sequential (já): foundation tasks — schema v9, resolver, PAT-aware require_internal_role, backward-compat wrappers, rbac refactor, UserRepository auto-grant. - Parallel sub-agents (3 worktrees, ~10 min): REST API, CLI, UI. - Sequential (já): integrace, docs/CHANGELOG/version, schema tests, fullsuite verification. * fix(auth): address Devin review on PR #73 — three regressions Three concrete bugs caught in Devin's PR review, all fixed in this commit. 1. users.role hydration on read (the big one): v8→v9 migration NULLs users.role for every existing user, but a long tail of read sites still inspect user["role"] directly: - app/web/templates/_app_header.html:15 — admin nav gate - app/web/templates/_app_header.html:36-37 — role badge in dropdown - app/web/router.py:319-321 — UserInfo.is_admin/is_analyst/is_privileged - app/web/router.py:489 — corporate memory is_km_admin - app/api/catalog.py:54 — admin "see all tables" bypass - app/api/sync.py:215 — admin "see all sync states" bypass Without a fix, every existing admin loses the entire admin nav (and API admin bypasses) immediately after upgrade — a serious regression. Fix: new helper _hydrate_legacy_role() in app/auth/dependencies.py maps the highest-level core.* grant back into user["role"] as the legacy enum string. Called from get_current_user() on both auth paths (LOCAL_DEV_MODE + JWT/PAT). Idempotent — skips when role is already populated. Net effect: every pre-v9 callsite keeps working transparently for both OAuth and PAT callers, with one extra DB round-trip per authenticated request (same cost as the existing PAT-aware require_internal_role fallback). 3 regression tests in tests/test_schema_v9_migration.py: - test_hydration_recovers_role_from_user_role_grants - test_hydration_returns_highest_grant (multi-grant → highest wins) - test_hydration_falls_back_to_viewer_when_no_grants (safe fallback) 2. CLI effective-roles TypeError: API returns direct/group as List[Dict] (RoleGrantResponse-shaped), but the CLI did ', '.join(direct) which raises TypeError on dicts. Tests masked it because mocks used bare string lists. Replaced raw .join() with a _names() helper that extracts role_key from each item, falling back to str() for legacy mock shapes. 3. UI template field-name mismatch: admin_user_detail.html JS reads data.groups but the API serializes the field as group (singular, per EffectiveRolesResponse pydantic). Currently benign because the API always returns group:[], but the field would silently disappear once the group-derived view is wired up. Added data.group as the primary lookup, kept the legacy aliases for shape-drift tolerance. Full suite: 1485 passed (was 1482, +3 hydration tests), 6 skipped, no regressions. * fix(auth): Devin review #2 + UX self-service + RBAC docs rename Three threads landed in one commit because they share the same auth/role surface and CHANGELOG entry. Devin review #73 second round (2 actionable findings): - _hydrate_legacy_role no longer short-circuits on truthy users.role. The role-management endpoints (POST/DELETE /api/admin/users/{id}/ role-grants + the changeCoreRole UI flow) only mutate user_role_grants — they don't update the legacy column. The early return trusted that stale value, so a user downgraded via the new REST/UI kept role="admin" in their dict on subsequent requests, which fooled _is_admin_user_dict (src/rbac.py) and the catalog/sync admin-bypass short-circuits into retaining elevated table access even though require_internal_role correctly denied the API gates. Always re-resolves now, making user_role_grants the single source of truth on every authenticated request. Cost: one DB round-trip per request — same as the existing PAT-aware fallback. Pinned by test_hydration_ignores_stale_legacy_role_after_grant_revoke. - Dev-bypass (app/auth/dependencies.py) and OAuth callback (app/auth/providers/google.py) now pass user_id to resolve_internal_roles so direct grants land in session["internal_roles"] alongside group-mapped roles. Pre-fix, every admin-gated request fell through to the per-request DB fallback inside require_internal_role and the dev-bypass log line read "resolved 0 internal role(s)" for an obviously-admin user. test_session_internal_roles_populated updated to assert union. User-visible UX (also addresses local-test feedback): - HTTP 500 on /admin/users post-v8→v9 migration — UserResponse.role is required str, but legacy users.role was NULL-ed by the migration. _to_response in app/api/users.py now routes every dict through _hydrate_legacy_role; same fix lifts the silent no-op of last-admin protection in update_user/delete_user (the role-equality short-circuits would skip the count_admins guard for migrated admins). Three regression tests under TestAPIUsersPostMigration. - /profile is now a real self-service detail page for every signed-in user (not just admins). Three new server-side sections: Effective roles (resolver output as chip cloud), Direct grants (rows in user_role_grants with source label), Roles via groups (which Cloud Identity / dev group grants which role for the current user). Non-admins finally see why a feature is or isn't accessible. Admins additionally see a deep-link to /admin/users/{id} for editing their own grants. - /admin/role-mapping group-id picker. New "Known groups" panel above the create form: clickable chips for the calling admin's own session.google_groups (tagged "your group") merged with external_group_ids already used in existing mappings (tagged "already mapped"). Click a chip → fills the form. Empty-state copy points operators at LOCAL_DEV_GROUPS / Google sign-in instead of leaving them to guess Cloud Identity opaque IDs from memory. Operational fixes: - Scheduler log-noise: every cron tick produced a POST /auth/token 401 because the auto-fetch fallback called the endpoint with just an email (no password) and silently fell through. Removed the broken path entirely. Operators set SCHEDULER_API_TOKEN (long-lived PAT) in production; in LOCAL_DEV_MODE the dev-bypass auto-authenticates the un-tokenized request, so jobs continue to work. Docs: - docs/internal-roles.md → docs/RBAC.md (git mv preserves history). Standard industry term, more discoverable for engineers grepping for RBAC in a new repo. Restructured: Quickstart-by-role (operator / end-user / module author), step-by-step Module-author workflow with code examples (register key, gate endpoint, declare implies, write contract test), naming pitfalls, refresh semantics. CLAUDE.md gets a new "Extensibility → RBAC" section pointing contributors at the doc before they add gated endpoints. Cross-refs in app/api/admin.py + tests/test_role_resolver.py updated. Tests: 293 in the auth/role/scheduler/UI test set passed, 0 regressions. * fix(auth): Devin review #3 — login flows + RBAC docs Two new findings on commit 7d1c048, both real and addressed. Finding 1 (BUG, HTTP 500): every auth login flow loaded users via UserRepository.get_by_email and passed user["role"] straight to create_access_token, Pydantic response models, and _set_login_cookie without going through _hydrate_legacy_role. Post-v9 the legacy column is NULL for migrated users, and TokenResponse.role is a required str — so POST /auth/token raised ValidationError → HTTP 500 for any v8-admin trying to log in via password. Same root cause produced non-crashing but semantically wrong JWTs (role: null) from Google OAuth, password web flows, and email magic-link verification. Fix: hydrate inline in every login flow before reading user["role"]: - app/auth/router.py — POST /auth/token (the crash site) - app/auth/providers/google.py — OAuth callback (was just stale JWT) - app/auth/providers/password.py — 5 flows: JSON login, web login, JSON setup, web reset confirm, web setup confirm - app/auth/providers/email.py — centralized in _consume_token, covers both /verify endpoints New regression class TestAuthLoginFlowsPostMigration pins both the no-crash and the correct-role contracts for all four legacy levels (viewer/analyst/km_admin/admin) on POST /auth/token. Finding 2 (DOCS): docs/RBAC.md showed register_internal_role() being called with implies=[...], but the function signature is (key, , display_name, description, owner_module). A module author copying the example would TypeError at import time. The implies field on internal_roles IS honored at runtime by expand_implies, but the registry-side write path (register_internal_role + InternalRoleSpec + sync_registered_roles_to_db) doesn't exist yet — implies is currently seeded only for the core. hierarchy via _seed_core_roles in src/db.py. Rewrote the Implies hierarchy and Module-author workflow sections to document what's actually supported in 0.11.4 and what a future change would need to add. The "for cross-module hierarchies, register each level + grant both" pattern works today. Tests: 322 in the auth/role/scheduler/UI/password test set passed, 0 regressions. * fix(db): _seed_core_roles actually runs on every connect (Devin review #4) Devin flagged that the docstring on `_seed_core_roles` promised per-connect execution as a safety net for accidental DELETEs and in-code seed changes, but the only call sites lived inside `if current < SCHEMA_VERSION:` — so once a DB was on v9 the function never ran again, and the docstring lied. Picked option (b) from the review (actually call it on every startup) over option (a) (fix the docstring) because the safety net is genuinely useful: - recovery from accidental admin DELETE on internal_roles, - in-code _CORE_ROLES_SEED tweaks (display_name/description/implies) ship without a manual SQL deploy, - fresh installs and migrations stop needing their own seed call sites. Tail call gated by `get_schema_version(conn) <= SCHEMA_VERSION` so the future-version-is-noop rollback contract still holds — a v9 binary won't touch a DB that's been upgraded past v9. Test coverage: new TestSeedCoreRolesSafetyNet class (3 tests) pins the three contracts — deleted row re-seeds, mutated display_name re-syncs from in-code seed, applied_at on schema_version doesn't churn on already-current DBs. Existing TestMigrationSafety::test_future_version_is_noop still passes (verified against the gating logic).	2026-04-27 02:23:01 +02:00
Petr Simecek	6c36b26979	release(0.11.3): internal roles + external→internal group mapping (foundation) (#71 ) * feat(auth): internal roles + external→internal group mapping (foundation) Two-layer authorization model: external Cloud Identity groups (org-managed) get mapped onto internal Agnes-defined capabilities (app-managed) via an admin-curated many-to-many table. Per-request permission checks read off the session — no DB hit. Refresh requires re-login. Schema v8 — new tables: - internal_roles (id, key UNIQUE, display_name, description, owner_module, …) — app-defined capabilities like 'context_admin'. Modules self-register at import; the startup hook syncs the registry into this table (idempotent). - group_mappings (id, external_group_id, internal_role_id FK, …) — admin-managed bindings, UNIQUE(external_group_id, internal_role_id). app/auth/role_resolver.py — new module: - register_internal_role(key, display_name, description, owner_module) Module-author entry point. lower_snake_case key, immutable, validated. Same key + same fields = no-op (re-import safe); same key + different fields = ValueError so two modules can't silently overwrite each other. - sync_registered_roles_to_db(conn) — startup reconciliation. Inserts new keys, updates drifted metadata, never deletes (preserves mappings). - resolve_internal_roles(external_groups, conn) — joins group_mappings. Sorted, deduplicated role-key list. Plugged into google_callback + dev-bypass branch in get_current_user. - require_internal_role('key') — FastAPI dependency factory; reads session.internal_roles; 403 with explicit message when missing. Resolution runs at sign-in only (Google callback + LOCAL_DEV_GROUPS change in dev-bypass) — same semantics as session.google_groups. No admin UI yet; mappings created via repository directly until follow-up PR ships UI. 21 new tests in tests/test_role_resolver.py: register/list, idempotency, collision detection, key-format validation; sync insert/update/no-delete; resolve empty/single/many-to-many/malformed-input; e2e via LOCAL_DEV_GROUPS — gated endpoint allowed/denied + direct session-cookie inspection. Full sweep: 178/178 passed across auth + db + repo tests. (Two pre-existing test_catalog_export.py failures verified unrelated.) * fix(auth): polish review feedback — first-request dev populate + PAT doc Two follow-ups from a code-reviewer pass on the foundation commit before opening the PR: - Dev-bypass populates session["internal_roles"] on the first request after sign-in, not just when external groups change. The previous guard only resolved when groups_changed=True, which left a hole for the LOCAL_DEV_GROUPS=`""` (explicit empty) flow: target=[], current=None, neither write branch fires, internal_roles stays unset, and require_internal_role then 403s with no roles to check against. The OAuth callback writes session["internal_roles"] unconditionally on sign-in (even []); dev-bypass now matches that semantics. Adds a single-pass populate gated on the key being absent from the session, so subsequent same-state requests still no-op (cheap session lookup, no resolver call). - Document that internal roles are session-scoped and PAT/headless clients will get 403 from any require_internal_role(...) endpoint. Same constraint already applies to session.google_groups (PAT JWTs deliberately don't snapshot group memberships — they could change after issuance with no way to re-sign), but the doc didn't surface this — an operator pointing a CLI at a role-gated endpoint would see 403 with no clue why. New "PAT and headless requests" section spells out the constraint, the rationale, and the three escape valves (use users.role for the gate; route through OAuth; wait for the planned `da admin grant-role` CLI helper). 54 auth tests still pass locally (21 role-resolver + 33 existing auth-provider). * release(0.11.3): cut release for the internal-roles foundation Bumps pyproject.toml 0.11.2 → 0.11.3 and renames CHANGELOG's [Unreleased] section to [0.11.3] — 2026-04-26 (with a fresh empty [Unreleased] skeleton appended). Adds the matching [0.11.3] link reference at the bottom of CHANGELOG so the section heading renders as a hyperlink to the GitHub release page once the tag lands. The bullet itself is unchanged content; the rephrasing of "dev-bypass when external groups change" → "dev-bypass — populates on first request and whenever external groups change, mirroring the OAuth callback's always-write semantics" reflects the polish committed in d590579, plus the appended PAT/headless caveat pointing at the doc section that landed in the same polish pass. * fix(auth): address review feedback from Pavel — PAT-specific 403, audit logs, hardening Round-2 polish over the internal-roles foundation, addressing Pavel's review on PR #71. No behavior change for the happy path; tightens the safety rails and makes the failure modes self-explanatory. User-visible: - require_internal_role now distinguishes "no session" (Bearer/PAT caller) from "signed in but missing role" and surfaces a PAT-specific 403 detail in the first case ("This endpoint needs an interactive (OAuth) session — Bearer/PAT tokens do not carry session-resolved roles by design"). - docs/internal-roles.md documents deactivate+reactivate as the supported "force re-resolve now" lever for users that can't be made to log out. Internal hardening: - INFO-level audit log on every successful resolve (OAuth callback + dev-bypass) so a wrong-role complaint is debuggable from the log alone. - Startup warning when SESSION_SECRET is shorter than 32 chars, matching the existing JWT_SECRET_KEY gate — both HMAC surfaces sign trust-laden state (session.internal_roles, session.google_groups, JWTs). - _clear_registry_for_tests() now refuses to run unless TESTING=1 so a stray import path in production can't drop the registered capabilities. Tests: - 4 new tests in tests/test_role_resolver.py covering: stale-session contract after a mid-session mapping revoke (pin the documented limitation), PAT 403 detail wording, OAuth pipeline data flow from external groups to internal_roles, and the dev-bypass empty-list fallback when the resolver raises. CHANGELOG.md updated under [0.11.3] (### Changed + ### Internal). CLAUDE.md schema doc bumped from v7 to v8. --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-26 23:49:10 +02:00
Petr Simecek	1c18cdf15f	release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70 ) * feat(auth): mock session.google_groups in LOCAL_DEV_MODE via LOCAL_DEV_GROUPS LOCAL_DEV_MODE auto-logged-in the dev user but left session.google_groups empty, so group-aware UI/code paths can't be exercised on localhost without a real Google OAuth round-trip. New LOCAL_DEV_GROUPS env var (JSON array matching the production {id, name} shape) populates the session on every dev-bypass request — same structure the OAuth callback writes, so mock and prod stay in lockstep. Compare-then-write avoids spurious Set-Cookie noise on PAT/CLI requests; malformed input falls back to [] with a WARNING so the dev mock never breaks the dev flow. * refactor(auth): fail-fast LOCAL_DEV_GROUPS at startup + cache + no-mutate Three small follow-ups on the same dev-mock vector before merge: - Validate LOCAL_DEV_GROUPS at app startup and report the parsed group IDs in the LOCAL_DEV_MODE banner. A malformed value now warns loudly at boot instead of silently logging on the first authenticated request, where it's easy to miss. - Cache the parsed result single-slot, keyed by the raw env-string. Avoids re-parsing JSON on every authenticated request without test-isolation surprises — when the env value changes, the key changes and the cache transparently rebuilds. - Stop mutating the parsed-input dicts (item.setdefault → spread-merge) so the cached list stays a fresh value on every rebuild. - Replace the try/except guard around request.session with hasattr — SessionMiddleware is always registered, the silent except was paranoid. Tests grow by a direct session-cookie inspection (decoupled from the profile template) and three startup-banner log assertions. * fix(auth): drop fragile session-decoder test + actually skip empty-target write Two follow-ups on the LOCAL_DEV_GROUPS feature before merge: - Drop test_session_holds_mocked_groups_directly. It manually decoded the signed session cookie via TimestampSigner + base64, hardcoding both the Starlette session-cookie format and the 14-day max_age. Starlette has changed its session encoding before (URLSafeTimedSerializer pre-0.20) and would do so again silently — the test would fail with a cryptic BadSignature, not a clear "mock is broken" signal. The remaining test_dev_user_sees_mocked_groups_on_profile already covers the same observable signal (mocked groups in /profile body) without coupling to Starlette internals. - Actually skip the session write when target_groups is empty. The previous comment claimed compare-then-write avoided spurious Set-Cookie noise on PAT/CLI requests, but on those requests session.get("google_groups") is None and target is [], so None != [] always evaluates True and the write fired anyway, marking the session dirty and re-issuing Set-Cookie on every request. Adding `target_groups and ...` to the guard makes the comment honest: empty mock now genuinely no-ops, stable browser sessions still skip via value-equality, and the only remaining write is the one that actually changes state. 33 auth tests still pass locally. * fix(auth): match production's always-write semantics for stale dev groups Devin code-review finding on PR #70: my earlier `target_groups and ...` short-circuit silently diverged from the production OAuth callback. In app/auth/providers/google.py:189-194 the callback always writes session.google_groups on each login — including [] on failure or empty token — so the session always reflects authoritative current state. The mock should match. Failure mode the previous guard left open: a developer sets LOCAL_DEV_GROUPS=[{...}] for a session, the groups land in the signed cookie, then the developer unsets the env var and reloads. target → [], session.get → [{...}], `if target_groups and ...` is False, no write, stale groups stay in the browser session indefinitely. Mock now lies about state until logout. Fix splits the guard: - target_groups truthy + value-changed → write the new mock (existing path) - target_groups falsy + non-empty stored → write [] to clear stale state - otherwise no-op (target [] + stored None/[]: no transition to record) PAT/CLI requests with no prior session still take the no-op path (target=[], session.get → None which is falsy), so the original goal of suppressing spurious Set-Cookie noise on token traffic is preserved. Tests already cover the populated and unset paths; the new clear-stale branch is correct by construction (production has the same shape) and the rare manual reset workflow. * release(0.11.2): default mocked groups in make local-dev + docs/local-development.md Cuts 0.11.2 around the LOCAL_DEV_GROUPS work plus a small dev-experience follow-up: every `make local-dev` now boots with two sensible default mocked groups (Local Dev Engineers + Local Dev Admins on example.com), so /profile and group-aware code paths render something realistic without the operator having to discover and set LOCAL_DEV_GROUPS. Layered so the default lives in the workflow, not the contract: - scripts/run-local-dev.sh seeds LOCAL_DEV_GROUPS via shell ":=" syntax — only sets the var when the operator hasn't already. Override: LOCAL_DEV_GROUPS='[...]' make local-dev. Disable: LOCAL_DEV_GROUPS= make local-dev. - docker-compose.local-dev.yml swaps the commented JSON example for a bare `- LOCAL_DEV_GROUPS` passthrough — the value comes from the shell, the compose file just propagates it. Operators running `docker compose up` directly without the wrapper script get an empty mock (correct: they didn't opt into the make-driven defaults). - Makefile help line mentions the mocked groups so the behavior is visible without grepping. New docs/local-development.md consolidates dev-onboarding instructions that were previously scattered across docker-compose.local-dev.yml inline comments, docs/auth-groups.md "Local-dev mock" section, the Makefile help text, and CLAUDE.md "First-Time Setup". Single page now covers TL;DR, what LOCAL_DEV_MODE actually bypasses, group mocking controls + verification, what is not mocked (Cloud Identity, real OAuth, admin Workspace permissions), and the safety rails that keep the dev shortcuts off production. Version bump 0.11.1 → 0.11.2 in pyproject.toml, CHANGELOG cuts [Unreleased] → [0.11.2] — 2026-04-26 with a fresh empty [Unreleased] skeleton. * fix(local-dev): default LOCAL_DEV_GROUPS truncated by shell parameter expansion Reported by an operator running `make local-dev` against the freshly released 0.11.2 — the LOCAL_DEV_MODE banner showed: LOCAL_DEV_GROUPS is not valid JSON, ignoring: Expecting ',' delimiter: line 1 column 70 (char 69) LOCAL_DEV_GROUPS is set but produced no valid groups — check the WARNING above for the parse error. Cause: the default value lived inside `${LOCAL_DEV_GROUPS:=…}` parameter expansion. Bash matches `}` to close the expansion at the first `}` encountered in the body, regardless of context — even one inside a nested JSON object literal. The two-element JSON array was therefore truncated to the first group's closing brace, leaving an unparseable fragment: [{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers" There is no escaping syntax for `}` inside parameter expansion (the backslash escapes I had only escaped the quotes — `}` reaches bash literally). Fix: hold the default in a single-quoted variable and reference it through `${LOCAL_DEV_GROUPS:-$DEFAULT_LOCAL_DEV_GROUPS}`. The variable's value is opaque to the expansion — no `}` matching inside it — so the JSON survives intact. Verified with `python -m json`: parsed OK: 2 groups: ['local-dev-engineers@example.com', 'local-dev-admins@example.com'] Operators on a running 0.11.2 stack: `make local-dev-down && make local-dev` to pick up the corrected default. * fix(local-dev): respect LOCAL_DEV_GROUPS= disable path + add 0.11.2 changelog link Two follow-ups from a Devin code-review pass on PR #70: - run-local-dev.sh: switch ${LOCAL_DEV_GROUPS:-$DEFAULT} to ${LOCAL_DEV_GROUPS-$DEFAULT} (no leading colon). The :- form substitutes the default when the variable is unset OR set-but-empty, silently overwriting the documented disable knob. Three places promise this works — docs/local-development.md, the CHANGELOG entry, and the script's own comment — so the bug was an operator-facing lie, not just an implementation detail. The bare - form only substitutes on unset, so `LOCAL_DEV_GROUPS= make local-dev` now reaches the Python parser as "" and short-circuits to []. Verified with both empty and unset shells. - CHANGELOG.md: add the [0.11.2] link reference at the bottom. Keep-a-Changelog convention is to mirror every version heading with a release-tag link in the footer; the 0.11.2 heading was missing its counterpart, breaking the Markdown link rendering on GitHub. --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-26 16:48:55 +02:00
Petr Simecek	2099bb816e	release(0.11.1): hotfix the missed CADDY_TLS passthrough + changelog discipline (#67 ) Patch release containing the two follow-up changes from 0.11.0: - Caddy CADDY_TLS env passthrough in docker-compose.yml (#55) — should have shipped with #52 but the first PR got accidentally closed before merge. Without this fix Caddy ignores .env CADDY_TLS and crash-loops on any LE / internal-CA deployment. - CLAUDE.md changelog discipline (#59) — every PR touching user-visible behavior must update CHANGELOG.md under [Unreleased] in the same PR. The discipline rule itself caused this release to exist: writing the [Unreleased] entry made the missed fix obvious, which is exactly the feedback loop the rule is supposed to create.	2026-04-26 01:52:08 +02:00
Petr Simecek	598f186eb1	release(0.11.0): reset to pre-1.0 semver + first changelog (#58 ) The version = "2.x" strings in earlier pyproject.toml snapshots were arbitrary placeholders from the initial scaffold (cookiecutter default), not a reflection of API maturity. Resetting to 0.11.0 to signal pre-1.0 status: public surface (CLI flags, REST endpoints, instance.yaml schema, extract.duckdb contract) may still shift between minor versions. CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) continue from CI; semver tags (v0.X.Y) are cut at release boundaries and reference the same commit as a stable-* tag from the same day. CHANGELOG.md replaces the old CalVer draft format with Keep a Changelog + semver. The 0.11.0 entry curates everything currently in main: - Auth: Workspace groups, password reset, PAT, magic-link, seed admin pwd - Deploy: keboola-deploy workflow, Caddy/LE/cert-file TLS, dev_instances TLS, optional Google OAuth from SM, LOCAL_DEV_MODE, /setup wizard - CLI: wheel distribution, auto-update, --version, --dry-run, gzip - Data: remote query (BQ+DuckDB), business metrics, OpenAPI snapshot test - Security: padak-security.md audit batch + urllib3 + argon2-cffi - Two BREAKING items called out (Caddy profile rename, Caddyfile default cert mode flipped to cert-file)	2026-04-26 01:05:55 +02:00
Petr Simecek	1bbbe58ea0	release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43 ) * fix(cli): versioned wheel URL in setup instructions; drop broken /cli/agnes.whl alias (#36) * fix(cli): inline PEP 427 wheel filename in setup instructions `uv tool install <server>/cli/agnes.whl` fails with error: The wheel filename "agnes.whl" is invalid: Must have a version because uv validates the filename in the URL path before fetching — so the server-side Content-Disposition header (which has the real versioned filename) is never consulted, and an HTTP redirect does not help either: uv resolves the filename from the initial URL. Fix the root cause by inlining the real PEP 427 filename into the setup snippet the dashboard copies to the clipboard. The wheel filename is resolved server-side via `_find_wheel()` and substituted into the lines returned from `setup_instructions.resolve_lines()`, so both the read-only HTML preview and the JS clipboard renderer get byte-identical output. Also added `/cli/wheel/{filename}` to serve wheels at their PEP 427 path, and kept `/cli/agnes.whl` as a 302 redirect for manual/legacy callers — though that redirect alone is NOT sufficient for `uv tool install` (uv validates before following redirects) and is there only as defense-in-depth. Verified locally: - `uv tool install <server>/cli/wheel/agnes_the_ai_analyst-2.0.0-py3-none-any.whl` succeeds - `/install` HTML now renders the versioned URL; `/cli/agnes.whl` no longer appears in the rendered snippet * fix(cli): remove /cli/agnes.whl alias entirely — it only confused users The bareword alias was never actually usable: - `uv tool install <server>/cli/agnes.whl` fails at filename validation before any HTTP fetch, so neither the Content-Disposition header nor a 302 redirect rescued it. - The 302-to-versioned-path fallback left a visibly "working" URL in browser / curl -L contexts, which is exactly how the original bug got reported in the first place ("the URL loads, why doesn't install work?"). Remove the endpoint and scrub all remaining references. The only CLI wheel URL is now `/cli/wheel/{filename}` with the real PEP 427 filename, which the setup-instructions template already generates server-side. Existing tests that referenced /cli/agnes.whl become negative tests ("must not appear") so we don't regress. * feat(cli): --version flag; sync --dry-run + progress indicator (#38) * feat(cli): add --version / -V flag Prints `da <version>` from package metadata (importlib.metadata). Falls back to "unknown" when the package is not installed (e.g. running from a source checkout without `uv pip install -e .`), instead of crashing. Eager typer callback, so `da --version` exits before subcommand resolution and does not require any auth/config. * feat(cli): da sync --dry-run + X/N progress indicator --dry-run reports what would be downloaded/uploaded without hitting the API or writing local state. Supports the full flag set (--table, --json, --upload-only); JSON shape is {"dry_run": true, "would_download": [...], "summary": {...}}. Progress bar now shows "[X/N] Downloading <table>..." with a Rich BarColumn + TaskProgressColumn + TimeElapsedColumn instead of a bare spinner — makes long syncs visible. * feat(cli): durable sync + server gzip + auto-update check (#41) * fix(sync): atomic writes + manifest hash verification + retry on transient errors Three durability hooks around stream_download and the sync command: 1. Atomic writes. stream_download now streams into `<target>.tmp` and calls os.replace() on success, so the real target file never exists in a half-written state. On failure the tmp is unlinked — no cleanup leftovers, no guard needed at read time. 2. Retry with backoff. Transient errors (ConnectError, ReadError, WriteError, RemoteProtocolError, TimeoutException, 5xx) are retried up to 3× with 0.3s / 1s / 3s backoff. 4xx (auth, 404) surfaces immediately — retrying those is pointless. 3. Manifest-hash verification. After download, sync.py computes MD5 of the target (same 8KiB chunking as app/api/sync.py:_file_hash) and compares against `server_tables[tid]["hash"]`. Mismatch ⇒ unlink, record error, skip state commit. The PAR1 structural check survives as a fallback for legacy manifests without a hash. Also makes _rebuild_duckdb_views tolerant: single broken parquet is skipped with a stderr warning instead of killing the whole rebuild. Supersedes #40 — this commit is a strict super-set (hash check + PAR1 fallback + atomic write + retry). #40 can be closed without merging. * perf(server): enable GZipMiddleware for JSON / HTML responses GZipMiddleware at minimum_size=1024 shaves bandwidth on manifest-style JSON endpoints (/api/sync/manifest, /api/version, …) and the /install HTML preview. Parquet file downloads are already columnar-compressed so the middleware sees limited benefit there — but it doesn't hurt, httpx on the client side decompresses transparently. Placed after session middleware so gzip wraps the session-Set-Cookie response too, and before CORSMiddleware so compression is applied to both cross-origin and same-origin responses. * feat(cli): auto-check for newer CLI version on startup Server side - GET /cli/latest returns {version, wheel_filename, download_url_path} for whatever wheel is currently in AGNES_CLI_DIST_DIR. Public, cacheable, no secrets — consumed by the CLI auto-update probe. Client side - New cli/update_check.py: reads /cli/latest with a 3s timeout, caches the result in $DA_CONFIG_DIR/update_check.json for 24h. Cache is invalidated when the installed version changes (e.g. after a fresh `uv tool install`) so stale "you're behind" warnings don't linger. - Root typer callback fires the probe before subcommand dispatch; any failure is swallowed so a bad network never blocks a working command. - Outdated → one-line stderr warning: [update] da 2.0.0 is out of date — latest on this server is 2.1.0. Upgrade: uv tool install --force <server>/cli/wheel/<…>.whl - Disable with DA_NO_UPDATE_CHECK=1. * fix(pr-review): None-guard the upgrade line + skip gzip on parquet paths Two follow-ups from Devin review on #41. 1. format_outdated_notice(UpdateInfo(download_url=None)) emitted literal "uv tool install --force None" — copy-pasting that fails. Drop the upgrade snippet when the URL is absent and keep only the version line. 2. GZipMiddleware compressed everything over 1024 bytes, including the parquet FileResponses served by /api/data/{tid}/download, /cli/wheel/{name}, and /cli/download. Parquet is already columnar- compressed — gzip there is pure CPU + latency with no size win, and /api/data bodies can reach hundreds of MB. Wrap GZipMiddleware in a small _SelectiveGZipMiddleware that skips those path prefixes and delegates the rest to the stock middleware. JSON / HTML endpoints (manifest, /install, /api/version, …) still get compressed. * release: bump to 2.1.0 — unify AGNES_VERSION with pyproject.toml version (#42) Before: two independent version systems. pyproject.toml carried semver (2.0.0 → wheel filename → `da --version`) while release.yml injected CalVer into AGNES_VERSION (e.g. 2026.04.155 → /api/version). Users saw different strings in the CLI vs. the /install page, and the CLI auto- update check couldn't tell "new deploy, same package version" apart from "new package version". Make pyproject.toml [project].version the single product-version source of truth. release.yml extracts it and feeds AGNES_VERSION, so every surface (/api/version, /api/health, /cli/latest, `da --version`) agrees on one number. The CalVer tag keeps doing what CalVer is for: release identity on the git tag and Docker image tag (versioned_tag). Also wires AGNES_TAG through the build: release.yml → Dockerfile ARG → env, so /api/version.image_tag finally reports the actual image tag instead of the "unknown" fallback. Bump to 2.1.0 to reflect the PRs shipped on ps/wheel-name-fix: durable sync (atomic writes + manifest MD5 + retry), server GZip, CLI auto- update probe, setup snippet PEP 427 URL. * fix(pr-review): directional version compare in is_outdated() UpdateInfo.is_outdated() used `self.latest != self.installed`, which fires in both directions. If the server is rolled back or the user connects to an older deployment, the CLI would warn "out of date" and — worse — the formatted notice would prompt uv tool install --force <older-version>.whl i.e. an unintended downgrade. Compare with packaging.version.Version (PEP 440 aware, handles pre- release tags). Fall back to dotted-int tuple compare if packaging is somehow missing, and return False on unparseable strings — better to miss an upgrade hint than to silently suggest a downgrade. Adds 4 test cases: installed older (True), installed newer (False), 10.0.0 vs 2.1.0 lexical-compare trap (correct), unparseable strings (False). Addresses Devin review on #43. * fix(pr-review): read FastAPI app version from package metadata app/main.py:80 hardcoded `version="2.0.0"` in the FastAPI constructor. After #42 bumped pyproject.toml to 2.1.0, /api/version, /cli/latest, and `da --version` all reported 2.1.0 while /openapi.json and the /docs UI still advertised 2.0.0. Read `agnes-the-ai-analyst` version via importlib.metadata (same pattern cli/main.py:_cli_version already uses), with a `"dev"` fallback when the package is not installed (source checkout). This way pyproject.toml stays the single source of truth across every version surface — /openapi.json now tracks the bump automatically. Adds a dedicated test file to pin this behavior so a future regression to a hardcoded literal fails at CI. Addresses second Devin finding on #43. * fix(pr-review): _fmt_bytes PiB label + negative cache in update_check Two more follow-ups from Devin review on #43. 1. _fmt_bytes off-by-unit. The old loop exited at TiB but the fallback labelled PiB, so 1 PiB rendered as "1024.0 PiB". Restructure: put every unit inside the loop (KiB through EiB) so the division count always matches the label. Covers up to 1 ZiB cleanly; anything beyond renders as "<big>.0 EiB" rather than crashing. 2. Negative cache for failed /cli/latest probes. On a corporate firewall / VPN that silently drops packets, the 3s HTTP timeout fired on every `da` invocation. Writing a `latest=None` cache entry with a 5-minute TTL caps that at one probe per 5min. Successful probes still use the 24h TTL. Reading logic branches on whether the cached `latest` is None. Adds TestFmtBytes (2 cases: small/medium sizes and the PiB/EiB fallback regression), plus two TestSync update-check cases covering negative- cache reuse and TTL expiry.	2026-04-22 21:18:18 +02:00
dependabot[bot]	6e93461918	chore(deps): bump python-multipart from 0.0.24 to 0.0.26 Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.24 to 0.0.26. - [Release notes](https://github.com/Kludex/python-multipart/releases) - [Changelog](https://github.com/Kludex/python-multipart/blob/master/CHANGELOG.md) - [Commits](https://github.com/Kludex/python-multipart/compare/0.0.24...0.0.26) --- updated-dependencies: - dependency-name: python-multipart dependency-version: 0.0.26 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-04-21 13:26:19 +00:00
dependabot[bot]	043ae4b378	chore(deps): bump authlib from 1.6.9 to 1.6.11 Bumps [authlib](https://github.com/authlib/authlib) from 1.6.9 to 1.6.11. - [Release notes](https://github.com/authlib/authlib/releases) - [Changelog](https://github.com/authlib/authlib/blob/v1.6.11/docs/changelog.rst) - [Commits](https://github.com/authlib/authlib/compare/v1.6.9...v1.6.11) --- updated-dependencies: - dependency-name: authlib dependency-version: 1.6.11 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-04-17 00:41:27 +00:00
ZdenekSrotyr	510608813c	test: add shared test infrastructure (fixtures, factories, assertions, mocks) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 11:05:35 +02:00
ZdenekSrotyr	86fe4b411d	fix: upgrade urllib3 1.26→2.6.3 — resolves all 4 Dependabot security alerts Removed kbcstorage from all dependency groups (optional + dev) so urllib3 is no longer pinned to <2.0. Legacy Keboola client is available via manual install: pip install kbcstorage	2026-04-09 14:53:30 +02:00
ZdenekSrotyr	809448e02b	fix: move kbcstorage to optional dep — unblocks urllib3 security updates kbcstorage pins urllib3<2.0.0 which blocks Dependabot security patches. Moved to [project.optional-dependencies] keboola-legacy since the primary extraction path uses the DuckDB Keboola extension, not kbcstorage. Legacy fallback uses lazy import — app works without it installed.	2026-04-09 14:46:50 +02:00
ZdenekSrotyr	0279cc06fa	refactor: consolidate deps into pyproject.toml, remove requirements.txt - All dependencies now in pyproject.toml [project.dependencies] - Dev/test deps in [project.optional-dependencies] dev and [tool.uv] - Dockerfile uses uv pip install . from pyproject.toml - CI uses uv pip install ".[dev]" - Deleted requirements.txt and requirements-dev.txt - Updated README, CLAUDE.md install instructions - Enhanced .dockerignore (exclude tests, docs, infra from image)	2026-04-09 13:17:59 +02:00
ZdenekSrotyr	224635b88d	security: fix auth (argon2, cookie, JWT), CORS, session middleware, pyproject.toml	2026-04-08 12:08:52 +02:00
ZdenekSrotyr	5ee12d78e7	refactor: final cleanup — delete legacy auth, clean deps, fix hash, migrate to uv - Delete root auth/ directory (legacy Flask providers, orphaned) - Clean requirements.txt: remove Flask, gunicorn, authlib, sendgrid, anthropic, openai, argon2-cffi (9 unused deps) - Fix hash computation in orchestrator: MD5 of parquet mtime+size (CLI sync now skips unchanged tables correctly) - Migrate pip → uv in CLAUDE.md, scripts/init.sh, pyproject.toml - Sync pyproject.toml dependencies with requirements.txt 578 tests passing.	2026-03-31 19:18:30 +02:00
ZdenekSrotyr	3701130a11	feat: add Docker, CLI tool, scheduler, and agent skills - Dockerfile (uv-based) + docker-compose.yml (3 services) - CLI tool 'da' with commands: auth, sync, query, status, admin, diagnose, skills - Scheduler sidecar service (replaces systemd timers) - pyproject.toml for uv distribution - Built-in skills (setup, troubleshoot) for AI agents - 17 CLI tests, 75 total tests passing	2026-03-27 15:30:03 +01:00

1 2 3

121 commits