agnes-the-ai-analyst

Author	SHA1	Message	Date
Vojtech	2e2e1a1eca	feat(home): state-aware /home + /setup-advanced + schema v26 (#228 ) * feat(home+news): state-aware /home + /news + admin-edited news section Squash of the vr/home-page feature work for clean rebase onto main. Original 18-commit history preserved in branch backup/vr-home-page-pre-rebase. What's in this PR: State-aware /home page - New `/home` route with hero + auto-mode + connectors (Asana / GWS / Atlassian) + lookarounds. Onboarded vs not-onboarded state-machine branches a single template (`home_not_onboarded.html`); the install steps, "Setup a new Claude Code" CTA (90-day PAT mint), and per- connector setup prompts hide once `users.onboarded=TRUE`. A completion badge replaces them. - "Mark me as offboarded" button reverses the flag without an SQL UPDATE. - `users.onboarded BOOLEAN` column added; default FALSE; flipped by the CLI's `agnes init` post-success POST and the `/admin/users` API. - Connector setup prompts pre-check whether the tool is already installed/connected before re-running setup. - GWS scope set widened to include Google Chat (`chat.spaces`, `chat.messages`). Single template + design tokens - `dashboard.html` now extends `base.html` via the new `{% block layout %}` opt-out (full-width pages skip the 800px `.container`). Net: every page shares one shell. - `style-custom.css` `:root` extended with `--space-{7,9,10,12}`, `--radius-2xl`, `--shadow-{card,elevated}`, `--text-{muted,disabled}`, `--focus-ring`, `--transition-`, `--width-{narrow,app,wide}` so inline page styles can migrate incrementally. Auth redirects honor AGNES_HOME_ROUTE* - `safe_next_path` resolves the configured home route when no `default=` is passed; OAuth callbacks, magic-link clicks, password form, and LOCAL_DEV_MODE shortcuts now land on `/home` (or whatever the operator picked) instead of always /dashboard. News section + /news permalink + /admin/news editor - Schema-bumped `news_template` table (single versioned entity, draft + publish gate). 
`published BOOLEAN` distinguishes draft from public; monotonically-increasing `version` per save; rows >30d pruned on save except the currently-displayed published version. - `/home` bottom-of-page renders the latest published intro with a "Read more →" link to `/news` (which renders the full body). - `/admin/news` editor with sandboxed live preview, versions table, per-row Unpublish, Format-help cheatsheet. - `agnes admin news show / draft / edit / publish / unpublish / versions / export` (CLI). Talks to the live server via the `/api/admin/news/` endpoints (PAT-authed) — no direct DB access so it coexists with a running uvicorn. - Optimistic-lock guard: `agnes admin news publish --version N` and PUT/PATCH endpoints accept `expected_version` and 409 with structured `{error: "version_conflict", expected, actual, actual_by}` when a concurrent admin replaced the draft. Edit refuses to overwrite a draft authored by someone else without `--force` or `--expect-version`. - nh3 (Rust-backed ammonia) HTML sanitizer; iframe pre-pass strips any iframe whose src is not on the YouTube/Vimeo/Loom allowlist; javascript:/data: schemes blocked everywhere. - Author CSS vocabulary: `.news-hero` (blue gradient hero block), `.callout`/`.callout-{info,warn,success,danger}`, `.video-embed`, `.news-section`, `.news-grid-{2,3}`, `.news-cta` — all consolidated in `style-custom.css` under "News content vocabulary (shared)" so /home perex, /news body, and /admin/news preview share one source of styling. - Code-inside-`<pre>` contrast fix (was unreadable amber-on-silver). - `.news-content` table styling (border, header band, row-hover). `scripts/dev/run-local.sh`* — local uvicorn launcher. Pulls Google OAuth client id/secret from GCP Secret Manager (`AGNES_OAUTH_GCP_PROJECT`-driven, no vendor defaults), points `AGNES_CLI_DIST_DIR` at `./dist` so the wheel endpoint resolves, and `--dev` flips `LOCAL_DEV_MODE=1` + `AGNES_HOME_ROUTE=/home` for one- command iteration. 
`LOCAL_DEV_MODE=1` also enables the FastAPI debug toolbar. CLAUDE.md "Run tests before every push" section codifies `pytest tests/ -n auto -q` as non-negotiable before each push. Tests: 51 + 14 + 8 = 73 new tests across news-template repo, sanitizer, API, web, CLI; plus updated home/auth/template tests for the new shared-shell architecture. Origin docs (gitignored, customer-fork content): docs/brainstorms/home-page-requirements.md, docs/plans/2026-05-07-001-feat-home-page-plan.md. * feat(cli): agnes onboarded {on,off,status} — self-scoped flag toggle User-facing equivalent of the in-page "Mark me as (off)boarded" button on /home. POSTs /api/me/onboarded with {onboarded, source}; --source overrides the audit-log marker so flips made from the CLI vs the web button vs agnes init automation stay distinguishable. `status` reads via /api/me/profile (when present); falls back to a quick body-marker scan of /home so the read path doesn't write an audit_log row. PAT-authed via cli.client.api_post — same convention as agnes admin news / agnes admin add-user etc. Tests: 5 covering on/off/status round-trip, idempotency, and audit-log source recording. Full suite holds at 12 pre-existing failures (same set as before). * ui(nav+home): primary nav reorg + green What's new band + /marketplace link fix Primary nav (post-rebase audit + per-user feedback): - Items: Home → Marketplace → Data Packages → Memory. Admin dropdown for admins only. The "Dashboard" label was renamed Home — point still resolves through `home_route` so customer instances on /dashboard still land there. - Activity Center moved into the Admin dropdown. Per-team adoption analytics is admin-consumed in practice; the route still allows any authed user for direct deep-links so existing /home tile + bookmarks keep working. - Memory link added (→ /corporate-memory) — was previously buried in the /home "Look around" tiles. - Setup local agent + My Stack dropped from main nav. 
Setup is the /home install flow's home now; My Stack lives as a tab inside /marketplace. /home tweaks: - Plugin marketplace tile now points at /marketplace (was /store — legacy from before the marketplace rebrand landed in #230). - "What's new" section header gets a green band (success-flavored D1FAE5 background, A7F3D0 border, darker green title) so the bottom-of-page news block visibly distinguishes from the blue install-hero at the top. Header strip only — body stays white. Test fix: test_home_route_resolution renamed `dashboard_link_uses_home_route` → `home_link_uses_home_route` and asserts `href="/home">Home` instead of `href="/home">Dashboard` after the label change. * fix(home): decouple Step 3 + Connect-tools collapse from server onboarded flag The server-side `users.onboarded` flip happens through two paths: 1. Explicit user click on "Mark me as onboarded" or `agnes onboarded on`. 2. Implicit `agnes init` POST → /api/me/onboarded on success. Path 2 produced a UX surprise: an analyst running `agnes init` mid-flow reloaded /home and saw Step 3 (auto-mode) + Connect-your-tools auto- collapse to summary bars. They were actively working through those sections — the install POST never signalled "I'm done with the rest of setup", just "Agnes itself is installed". Decouple the section-collapse decision from the server flag: - Step 1 + Step 2 install blocks: still hidden on `onboarded=TRUE` (their completion is a hard server signal — Agnes IS installed). - Step 3 + Connect-your-tools: render flat by default in BOTH states. Wrapped in `<details class="setup-collapsible" open>` so the browser's native disclosure handles per-section toggle without JS, but the `<summary>` is CSS-hidden until the page-level `data-setup-minimized="1"` attribute is set on `.home-mock`. - New "Minimize setup view" toggle inside the blue install-hero, rendered only when onboarded. Click flips the data-attr on `.home-mock` AND removes the `open` attribute from each `<details>`. 
State persists in `localStorage["agnes_home_setup_minimized"]` so the choice survives reloads but is per-device. - "Show full setup view" (the same button when minimized) re-opens both `<details>` and clears localStorage. When minimized, each `<details>` still has its own native expand/ collapse — click the gray summary bar to peek at one section without toggling the page-level minimize off. Tests: - test_step3_and_connectors_render_flat_when_onboarded_by_default — asserts `<details class="setup-collapsible" ... open>` for both sections post-onboarding and the absence of any server-rendered `data-setup-minimized` attribute on the `.home-mock` root. - test_minimize_toggle_visible_only_when_onboarded — toggle button rendered only when onboarded. Full pytest holds at 12 pre-existing failures (same set).	2026-05-08 18:28:47 +02:00
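The optimistic-lock guard described in the news section of this commit can be sketched as follows. This is a minimal illustration, not the project's code: the `Draft` class and `check_expected_version` function are hypothetical names, and letting a write through when no `expected_version` is supplied is an assumption made for brevity. Only the 409 body shape (`error`, `expected`, `actual`, `actual_by`) comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Draft:
    """Hypothetical stand-in for the current news_template draft row."""
    version: int
    author: str


def check_expected_version(
    current: Draft, expected_version: Optional[int]
) -> Tuple[int, dict]:
    """Return (http_status, body) for a write that may carry expected_version."""
    # Assumption: a write without an expectation is allowed through here.
    if expected_version is None or expected_version == current.version:
        return 200, {"ok": True}
    # A concurrent admin replaced the draft: report what the caller expected
    # versus what is actually stored, and who stored it.
    return 409, {
        "error": "version_conflict",
        "expected": expected_version,
        "actual": current.version,
        "actual_by": current.author,
    }
```

A caller that read version 6 and then tries to publish after someone else saved version 7 would get the 409 body and could re-read before retrying.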
Vojtech	107195730d	feat(observability): optional PostHog integration (#231 ) * feat(observability): optional PostHog integration (errors, LLM traces, replay, flags) Off by default. Activates when POSTHOG_API_KEY is set in env. Defaults to PostHog Cloud EU; override host for US Cloud or self-hosted. Coverage: - FastAPI 500 handler captures unhandled exceptions - src/orchestrator.py rebuild + rebuild_source failures - services/scheduler/ HTTP-job failures - cli/main.py uncaught CLI errors (Typer.Exit/SystemExit/KeyboardInterrupt skipped; flushes before re-raise so short-lived CLI invocations don't drop events) - connectors/llm/anthropic_provider.py + openai_compat.py emit $ai_generation events with provider, model, latency, token counts (prompt/completion bodies stay off unless POSTHOG_LLM_PAYLOADS=1 because LLM prompts here routinely include customer SQL/data) - Browser snippet injected into every text/html response by PosthogInjectionMiddleware — registered inside the GZip layer so it sees uncompressed HTML before compression. Many templates are standalone (their own DOCTYPE) and never extend base.html, so a per-template include would miss them. - Frontend: $pageview, $pageleave, JS error capture via window.error and unhandledrejection handlers, masked session replay (maskAllInputs: true plus CSS-selector mask for known data surfaces), feature flags (browser posthog.isFeatureEnabled + server-side feature_enabled with fallback for older SDKs). Identification mode operator-configurable: none / id / email / full. Default email ships user.id + email but never name. CLI entry point moves from cli.main:app to cli.main:main (Typer wrapper). 
Files: - src/observability/posthog_client.py — lazy singleton, no network when disabled, single-process flush on shutdown - src/observability/llm_tracing.py — trace_generation context manager - app/middleware/posthog_inject.py — HTML rewrite middleware - app/web/templates/_posthog.html — browser snippet template - docs/observability.md — operator guide - config/.env.template — documented POSTHOG_* knobs - tests/test_posthog_disabled.py + tests/test_posthog_client.py + tests/test_llm_tracing.py — 18 tests covering disabled state, identify-mode payloads, $ai_generation shape, error variant. CHANGELOG entry under [Unreleased] Added. * feat(observability): tag every PostHog event with environment + release Splits PostHog dashboards cleanly between localhost / dev / staging / production without manual tagging on every capture call. - POSTHOG_ENVIRONMENT explicit override; auto-resolves to "local" when LOCAL_DEV_MODE=1, else RELEASE_CHANNEL, else AGNES_DEPLOYMENT_ENV, else "unknown". - AGNES_VERSION → RELEASE_CHANNEL fallback feeds the `release` property for "is this error new in this release?" cohorting. - Backend gets both via the PostHog SDK's super_properties constructor arg (every captured event picks them up automatically). - Browser snippet calls posthog.register({environment, release}) inside the loaded callback so $pageview, $exception, autocapture, etc. all carry the same labels. - request.state.user now populated by auth dependencies so the snippet can actually call posthog.identify(user_id, {email}) for logged-in users (previously the user block always resolved to None because nothing wrote to request.state.user). 4 new tests cover env resolution: explicit > LOCAL_DEV_MODE > channel > unknown, plus super-properties forwarding into the SDK constructor. 
* feat(observability): inline user attrs on every PostHog event + debug throw route PostHog's UI shows person properties on the Person profile page, not inline on each event — so a reviewer triaging an exception couldn't tell which user hit the bug without clicking through. Fix it on both sides. - Backend capture_exception merges user_id / user_email / user_name into the event properties (gated by POSTHOG_IDENTIFY_PII: none/id/email/full). Backed by a new _user_props_for_event helper on PosthogClient. - Browser snippet registers user_id + user_email + user_name as super- properties via posthog.register({...}) so every $exception, $pageview, and custom event coming from posthog.captureException() carries them inline. Mirrors the backend so cross-referencing client/server events doesn't require a person-profile lookup. - /api/debug/throw — debug-only endpoint gated by DEBUG=1 (404 in prod). Runs Depends(get_current_user) first so request.state.user is set when the unhandled-exception handler captures the event. Lets operators exercise the full observability path end-to-end without hand-rolling a TestClient script. Configurable via ?kind=ValueError&msg=... 7 new tests cover: backend user-attr merge across identify modes, anonymous request fall-through, browser snippet super-prop emission for logged-in / anonymous / id-only / full-name cases. * fix(observability): address minasarustamyan PR #231 review Two bugs caught in review. 1. PosthogInjectionMiddleware dropped Response.background on every return path. BaseHTTPMiddleware materialises the body and asks subclasses to return a fresh Response — three paths in dispatch() omitted background=, silently cancelling any BackgroundTask / BackgroundTasks the route attached (audit logging, async webhooks, email sends) with no log line. Fix: route every return through a _passthrough() helper that forwards background. Also adds a _MAX_BUFFER_BYTES (4 MB) cap so a streamed-HTML response can't balloon RSS during buffering. 
Bigger bodies short-circuit through with a warning rather than being injected. Regression tests in tests/test_posthog_inject_middleware.py exercise four return paths (snippet present, render-fail, double-injection guard, non-HTML passthrough) plus the streaming-guard short-circuit. 2. $ai_input / $ai_output_choices were emitted without truncation, so POSTHOG_LLM_PAYLOADS=1 silently dropped events past PostHog's ~32 KB per-event ingest limit — exactly the calls (large prompts with schemas / sample rows / SQL) an operator would want to inspect. Fix: clip both at POSTHOG_LLM_PAYLOAD_MAX_CHARS (default 30000) with an explicit "…[truncated N chars]" marker so readers don't mistake truncated captures for complete ones. Metadata (provider, model, tokens, latency, error) flows regardless. Three new tests cover default-cap clipping, env-override, and pass-through under the cap. 37 PostHog tests pass.	2026-05-08 17:57:10 +04:00
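The payload clipping from review item 2 can be illustrated with a small sketch. The function names `clip_payload` and `max_chars_from_env` are hypothetical, and reading "N" as the number of dropped characters is an assumption; the env var name, the 30000 default, and the marker text are taken from the commit message.

```python
import os
from typing import Mapping, Optional

DEFAULT_MAX_CHARS = 30000  # default stated in the commit message


def max_chars_from_env(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the cap from POSTHOG_LLM_PAYLOAD_MAX_CHARS, falling back to the default."""
    env = os.environ if env is None else env
    return int(env.get("POSTHOG_LLM_PAYLOAD_MAX_CHARS", DEFAULT_MAX_CHARS))


def clip_payload(text: str, max_chars: int = DEFAULT_MAX_CHARS) -> str:
    """Clip text to max_chars and append an explicit truncation marker."""
    if len(text) <= max_chars:
        return text  # under the cap: pass through unchanged
    dropped = len(text) - max_chars
    # Assumption: N in the marker counts the characters that were cut off.
    return f"{text[:max_chars]}…[truncated {dropped} chars]"
```

The marker makes a clipped capture distinguishable from a complete one, which is the point the commit message makes about readers not mistaking one for the other.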
ZdenekSrotyr	cc1886c97c	release: 0.47.4 — Docker collector skip + FIFO session-pipeline check (#229 ) ## Summary Two minimum-viable fixes after today's 0.44.0 → 0.47.3 release train and the production 30-user launch. Devil's advocate review of a 3-PR / 7-item plan cut scope to these 2 — the rest is deferred to a separate "operate-first, instrument-second" backlog item. ### B2 — Docker session_collector log skip `services/session_collector` was logging `Collection complete: 0 users, 0 files copied` + `WARNING: Group 'data-ops' not found, using default group` every 10 minutes in the Docker layout (where `/home//user/sessions/` doesn't exist). New env var `AGNES_SKIP_LEGACY_COLLECTOR=1` set by default in `docker-compose.yml` short-circuits the collector pass. The bare-VM deployment path (where /home/ IS populated by Claude Code) leaves the env var unset and continues to scan normally — including the data-ops warning, which is load-bearing for catching missing-group mis-deploys. ### O2 — FIFO check in `_check_session_pipeline` The existing check compares `MAX(processed_at)` to newest jsonl mtime — catches "detector hasn't run lately" but blind to "old file was skipped while newer ones were processed". New code finds the oldest FS jsonl that's NOT in `session_extraction_state.session_file` and flags if its mtime is older than `SESSION_PIPELINE_STUCK_FILE_GRACE_SECONDS` (default 4× the existing grace = 2h). Severity intentionally starts at `info` so we can collect prod data on false-positive rate before tightening to `warning`. The aggregator already treats `info` as non-promoting (see the severity vocabulary docstring at the top of `app/api/health.py`), so the headline `status` stays at `healthy` even when this fires — the operator sees the entry in the per-check breakdown but no spurious `degraded` overall. ## Test plan - [x] `pytest tests/test_session_collector.py` — 17 tests pass (existing 9 + new 8 covering env-set/unset, truthy variants, falsy non-skip). 
- [x] `pytest tests/test_health_session_pipeline.py` — 8 tests pass (existing 4 + new 4 FIFO tests covering stuck-file, under-threshold, all-processed, env-override).	2026-05-08 09:38:21 +02:00
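The FIFO check described under O2 can be sketched as a pure function. This is an assumption-laden illustration rather than the code in `_check_session_pipeline`: `find_stuck_file` and its parameters are hypothetical, with the filesystem scan and the `session_extraction_state` lookup replaced by plain arguments.

```python
import time
from typing import Dict, Optional, Set


def find_stuck_file(
    file_mtimes: Dict[str, float],
    processed: Set[str],
    grace_seconds: float,
    now: Optional[float] = None,
) -> Optional[str]:
    """Return the oldest unprocessed file if it is older than the grace window."""
    now = time.time() if now is None else now
    # Keep only files that have no processed-state entry.
    unprocessed = {path: m for path, m in file_mtimes.items() if path not in processed}
    if not unprocessed:
        return None
    oldest = min(unprocessed, key=unprocessed.get)
    # Flag only when the oldest leftover has waited longer than the grace period.
    if now - unprocessed[oldest] > grace_seconds:
        return oldest
    return None
```

Unlike a comparison of the newest processed timestamp against the newest file, this catches an old file that was skipped while newer ones went through.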
ZdenekSrotyr	6fe9135cb5	release: 0.47.3 — self-upgrade ignores 24h cache, always re-probes /cli/latest (#227 ) ## Summary `agnes self-upgrade` without `--force` previously short-circuited on the local 24h `update_check.json` cache. After a server-side version bump within that window, the explicit command exited silently as a no-op — empirically observed today when prod 0.47.1 → 0.47.2 didn't propagate. Fix: always invalidate the cache in `_resolve_info`. The cache still gates the implicit warning loop in the root callback (correctly — that runs on every `agnes <anything>` and can't hammer `/cli/latest`). ## Test plan - [x] New `test_self_upgrade_bypasses_24h_cache_without_force` — stale cache claims current; mocked server reports newer; assert UpdateInfo carries the newer version, not the cached one. - [x] Existing self-upgrade tests pass (including `--force` semantics — force is now downstream-only, behavior preserved).	2026-05-07 22:08:21 +02:00
ZdenekSrotyr	917f9aaef0	release: 0.47.2 — restore #218 + #219 fixes silently reverted by #217 (#225 ) ## Summary Smoke-testing the just-shipped 0.47.1 against production exposed two regressions: 1. `agnes query --remote "SELECT FROM unit_economics WHERE bad_col=1"` returned `Table "unit_economics" must be qualified` (the OLD error) instead of `Unrecognized name: bad_col` (the #218 fix's intended behavior). 2. `agnes query "DESCRIBE unit_economics"` showed only DuckDB's misleading `Did you mean order_economics?` with no Agnes hint paragraph (the #219 fix is missing). Root cause: PR #217's squash merge (`506a378c`) carried stale snapshots of `app/api/query.py` and `cli/commands/query.py` from before #218 and #219 merged. The rebase-and-merge auto-merged those files cleanly (no conflict markers) but the result silently reverted both fixes. Restore the two changes verbatim. Tests for both fixes are already on main and continue to pass against the restored code. ## Test plan - [x] `pytest tests/test_api_query_guardrail.py tests/test_cli_query.py` — clean - [x] Manual repro against prod after deploy: both flows now surface the intended diagnostic.	2026-05-07 19:57:18 +02:00
ZdenekSrotyr	506a378c3a	release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217 ) ## Summary Brings the Keboola connector to feature parity with the legacy internal data-analyst's per-table sync strategies. Closes the four documented gaps from the spec branch (`zs/keboola-connector-specs`): - Typed parquet in the legacy SDK extraction path — column types from Keboola Storage metadata (provider cascade `user > ai-metadata-enrichment > keboola.snowflake-transformation`) survive the CSV → parquet roundtrip; invalid date strings (`'0000-00-00'`) and invalid numeric strings (`'Non-Manager'`) become NULL while keeping the column's typed schema. Pre-fix everything was VARCHAR. - Incremental sync via Storage API `changedSince` — opt-in per table; pulls only delta rows, merges into the existing parquet by `primary_key` (drop_duplicates with keep='last'). Cuts daily extraction from O(full table) to O(delta). - Partitioned sync — flat per-partition layout `data/<table>/<key>.parquet` (e.g. `2026_05.parquet`), per-affected-partition merge for daily updates, chunked initial load with 1-day overlap and 2-empty-chunk stop heuristic. - `where_filters` — server-side row filter with date placeholders (`{{today}}`, `{{last_3_months}}`, `{{start_of_3_months_ago}}`, etc.) resolved at sync time. Force the SDK path; reject `incremental + where_filters` combination at API layer (changedSince already filters temporally). ## Architecture - Schema migration v25 → v26: 7 new columns on `table_registry`. Existing `sync_strategy` column reused (pre-v26 it was inert catalog metadata; post-v26 the extractor dispatches off it). - Per-table dispatcher in `extractor.run()` routes to one of `_extract_via_extension` (full_refresh + extension), `_extract_via_legacy` (full_refresh + filters or extension fallback), `extract_incremental`, or `extract_partitioned`. 
- API conflict policy: `incremental + where_filters` → 422; `partitioned + query_mode='remote'` → 422; `partitioned ⇒ partition_by required`. - Admin UI: third "Direct extract (Storage API)" radio in the Keboola Register / Edit modals, alongside existing "Whole table (extension)" and "Custom SQL". When selected, exposes a v26 sync-strategy panel with conditional fields per strategy. ## Test plan - [x] Unit + module — 134 v26 tests covering migration, repo, parquet_io, where_filters, incremental (compute_changed_since + merge_parquet + extract_incremental E2E), partitioned (key derivation + merge_partition + chunked windows + extract_partitioned E2E), extractor dispatcher, admin API validators, PUT field clearing, registry-shape → dispatcher bridge - [x] HTML form structure — all v26 inputs + visibility classes + JS payload fields verified in rendered template - [x] Real Keboola roundtrip — registered a small test table as `sync_strategy='incremental'` against a test Storage project, triggered two syncs: - Sync 1: `changedSince=None` → full pull → 9 rows typed parquet - Sync 2: `changedSince=last_sync - 1d window` → 9 delta rows merged with 9 existing → 9 after dedup on primary_key (PK merge confirmed) - [x] Browser UX — agent-browser session against a local uvicorn: login → admin/tables → register modal → switch radios → verify field visibility per strategy → submit → edit existing row → switch to Direct/Incremental → save → confirm DB persistence - [x] Regression — no regressions in the broader 3252-test suite (3 pre-v26 tests updated for the deprecation-marker removal + schema-version bump; 2 pre-existing environment-sensitive test failures unrelated to this change) ## Bugs caught + fixed during E2E The browser + real-Keboola roundtrip exposed four bugs the unit tests missed: 1. 
JS visibility race — two competing `forEach` loops set `display=''` then `display='none'` on form elements sharing `kb-strategy-incremental kb-strategy-partitioned` classes (window_days + max_history_days are reused across strategies). Fix: single-pass selector with class-based visibility resolver. 2. PUT cannot clear field — pre-v26 `updates = {k: v ... if v is not None}` collapsed "omitted from body" and "sent as null" into the same case, so admin couldn't switch a partitioned row back to full_refresh and have stale `partition_by` clear. Fix: `model_dump(exclude_unset=True)`. 3. Subprocess DB lock conflict — `_read_last_sync` reopened `system.duckdb` while the parent server held the write lock (subprocess contract at `app/api/sync.py:_run_sync` line 260). Fix: parent injects `__last_sync__` into table_config before subprocess spawn. 4. Wrong KBC table_id — `extract_incremental` / `extract_partitioned` built the Storage API table_id from the registry row's slugified `id` (`circle_inc`) instead of `bucket.source_table` (`in.c-finance.circle`), producing 404s. Fix: prefer `bucket+source_table`; fall back to `id` only when bucket empty. ## Operator notes - Existing tables stay on `full_refresh` after migration; admins opt individual tables in via `agnes admin register-table --sync-strategy ...`, the Keboola Edit modal, or `POST/PUT /api/admin/registry`. - `merge_parquet` and `merge_partition` use `pd.concat + drop_duplicates`, loading both existing and delta into pandas RAM. For tables in the multi-million-row range this may OOM — switch to `partitioned` strategy for those (per-partition merge keeps memory bounded). Documented in `### Internal` of the changelog entry. - Date placeholders are resolved at sync time, not register time — a typo'd `{{lasst_week}}` is accepted at register and surfaces only when the next sync runs. By design (rolling windows need late-binding). 
## Spec source The four corresponding plans on the `zs/keboola-connector-specs` branch under `docs/superpowers/plans/2026-05-07-0[1-4]-*.md` capture the design rationale and link back to internal repo references for each subsystem.	2026-05-07 19:01:27 +02:00
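The incremental merge in this commit is described as `pd.concat` followed by `drop_duplicates` with `keep='last'` on the primary key. The sketch below shows the same keep-last semantics using plain dicts instead of pandas, so it is a stand-in for the idea and not the project's `merge_parquet`; the function name and row shape are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, object]


def merge_by_primary_key(
    existing: Iterable[Row], delta: Iterable[Row], primary_key: Sequence[str]
) -> List[Row]:
    """Merge delta rows over existing rows, keeping the last row per key."""
    merged: Dict[Tuple[object, ...], Row] = {}
    for row in list(existing) + list(delta):
        key = tuple(row[col] for col in primary_key)
        # Re-insert so a replaced row moves to the position of its last
        # occurrence, mirroring concat + drop_duplicates(keep='last').
        merged.pop(key, None)
        merged[key] = row
    return list(merged.values())
```

As the operator notes point out for the pandas version, this approach holds both inputs in memory at once, which is why the partitioned strategy is suggested for very large tables.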
ZdenekSrotyr	aa5921da67	release: 0.47.0 — source-agnostic catalog metadata + cache discipline (#223 ) ## Summary - Catalog enrichment for `query_mode='remote'` rows: `rows`, `size_bytes`, `partition_by`, `clustered_by` per table (BQ + Keboola providers). - `/api/v2/schema/{id}` cache miss: 2 BQ jobs → 1 (-50%) via shared `fetch_bq_columns_full`. - All four catalog/schema/sample/metadata caches flush on registry change; single-row re-warm scheduled. - Automatic cache warmup at server startup (bounded concurrency, opt-out via `AGNES_SKIP_CACHE_WARMUP=1`). - SSE-driven freshness toolbar on `/admin/tables` with progress bar, log, and per-row badge. - New admin doc `docs/admin/query-modes.md` — single source of truth on `local` / `remote` / `materialized` choice. Closes #155. Closes #156. ## Test plan - [x] 65+ targeted tests pass across 11 new test modules + 3 modified ones. - [x] No DB migration; no wire-break; `MIN_COMPAT_CLI_VERSION` unchanged. - [ ] Reviewer: register a remote BQ table via `/admin/tables`, observe the toolbar populates within ~2 s and the per-row badge transitions warming → fresh. - [ ] Reviewer: trigger `Re-warm all`, verify SSE log scrolls and `cacheWarmupBar` progresses. - [ ] Reviewer: edit a registered row's bucket, verify `agnes schema <id>` returns updated columns immediately (no 1-hour staleness). - [ ] Reviewer: confirm `agnes admin register-table --query-mode remote` prints the new IAM-smoke-check hint. ## Notable design decisions - BigQuery `INFORMATION_SCHEMA.TABLE_STORAGE` is the only valid scope for size+rows (verified live 2026-05-07; dataset-scoped doesn't exist). Region resolved from `instance.yaml.data_source.bigquery.location` → `bq.client().get_dataset(...)` → fall back to legacy `__TABLES__`. - VIEW handling: TABLE_STORAGE returns no rows for views, fall through to `__TABLES__` (also empty) → `TableMetadata(rows=None, size_bytes=None, partition_by=..., clustered_by=...)`. 
Null size signals analyst Claude to apply existing CLAUDE.md guidance. - `size_bytes` is `active_logical_bytes + long_term_logical_bytes` — full BQ scan reads both; reporting only active undercounts aged partitioned tables. - Source-agnostic provider seam: per-source `connectors/<source>/metadata.py:fetch(MetadataRequest)`; dispatcher in `app/api/v2_catalog.py:_metadata_provider_for` lazily imports per source_type so a Keboola-only deployment doesn't pay the BQ-extension import cost. - Warmup non-blocking: FastAPI `lifespan` schedules `asyncio.create_task(_warm_catalog_caches_bg)` before `yield`. Per-row failures isolated. ## Out of scope - Profile / column histograms / dimension cardinality for remote tables (separate issue). - Onboarding nudge ("you have 0 remote tables, consider registering some BQ ones") — separate UX call. - Provider plug-in registration via entry-points (the dispatch table is a hardcoded if-tree today; one line per future source). ## Release Bumps `pyproject.toml` 0.46.1 → 0.47.0 (main shipped 0.46.0 + 0.46.1 during this PR — see commit `d98976ec`). New CHANGELOG section under `## [0.47.0] — 2026-05-07`. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-07 18:33:55 +02:00
ZdenekSrotyr	751cc25327	release: 0.46.5 — agnes describe -n parses, server sanitizes NaN (#224 ) ## Summary Two bugs in `agnes describe` surfaced from a real analyst session following the CLAUDE.md agent-rails discovery workflow. Together they break `agnes describe` end-to-end for any analyst (or analyst-AI) who follows the documented form. ### A) CLI parsing `agnes describe TABLE -n 5` failed with `Missing argument 'TABLE_ID'`. Root cause: the command was registered as a `typer.Typer` subcommand group via `app.add_typer(describe_app, name="describe")` + `@describe_app.callback(invoke_without_command=True)`, and that pattern mis-parses positional + short-int option in some orderings. Same pattern in `cli/commands/schema.py` works only because schema has no INTEGER short option. Fix: switch to flat `@app.command("describe")`. ### B) Server NaN `/api/v2/sample/<id>` (called by `agnes describe`) returned HTTP 500 with `ValueError: Out of range float values are not JSON compliant: nan` whenever a row contained NaN. Fix: sanitize NaN/±inf to None before JSON serialization. ## Test plan - [x] `pytest tests/test_cli_describe.py` — added regression tests pinning `-n` parsing on either side of the positional. - [x] `pytest tests/test_api_v2_sample.py` — added regression test for NaN row → JSON `null` (not 500).	2026-05-07 18:16:21 +02:00
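The server-side fix in part B can be sketched as a recursive pass over the sample rows. This is a minimal illustration under assumptions: `sanitize_non_finite` is a hypothetical name, and the real endpoint may sanitize at a different layer (for example on the dataframe before conversion).

```python
import json
import math
from typing import Any


def sanitize_non_finite(value: Any) -> Any:
    """Replace NaN and +/-inf floats with None, recursing into containers."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_non_finite(item) for item in value]
    return value


# Strict serialization (allow_nan=False) rejects NaN, so sanitize first.
rows = [{"amount": float("nan"), "ratio": 0.5}]
encoded = json.dumps(sanitize_non_finite(rows), allow_nan=False)
```

With the sanitizing pass in place, a NaN cell is serialized as JSON `null` rather than raising during encoding.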
ZdenekSrotyr	8d0bb43b06	release: 0.46.4 — detach SessionEnd push so it survives claude -p SIGTERM (#222 ) ## Summary `claude -p` (headless mode) gives SessionEnd hook subprocesses ~1 second before SIGTERM, regardless of work in progress. `agnes push` for a typical workspace takes 5-30s. The current synchronous SessionEnd hook (`agnes push --quiet 2>/dev/null || true`) was therefore being killed mid-first-upload — `|| true` masks the SIGTERM as exit 0, so this regression was invisible until I traced it via a wrapper script and Claude's `~/.claude/debug/<sid>.txt` log. Fix: wrap SessionEnd push in `bash -c "( nohup agnes push --quiet </dev/null >/dev/null 2>&1 & ) ; true"`. The subshell exits immediately, orphaning the upload child to init so it survives the hook subprocess kill. Same `bash -c` pattern as the existing `refresh-marketplace` SessionStart entry (for Windows compatibility). End-to-end verified against production: claude exited in 5s, detached child completed the upload, file `491e3a23-...jsonl` landed on the server within 30s with mtime 14:30 UTC. ## Test plan - [x] `pytest tests/test_lib_hooks.py` — added `test_session_end_push_is_detached` regression test asserting `nohup`, `&`, `</dev/null` are all present. - [x] `pytest tests/test_setup_hooks_template.py` — assertions loosened from `==` to `in` where necessary. - [x] Verified end-to-end against production with the detached wrapper before opening this PR (manual probe).	2026-05-07 17:59:27 +02:00
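A regression check in the spirit of `test_session_end_push_is_detached` might look like the sketch below. The two command strings are quoted from the commit message; the helper `is_detached_hook` and its token-based check are assumptions, not the project's actual test.

```python
import shlex

# Hook commands as quoted in the commit message.
OLD_HOOK = "agnes push --quiet 2>/dev/null || true"
NEW_HOOK = 'bash -c "( nohup agnes push --quiet </dev/null >/dev/null 2>&1 & ) ; true"'


def is_detached_hook(command: str) -> bool:
    """Return True when the hook backgrounds its work inside `bash -c`.

    Hypothetical helper: it requires `nohup`, a standalone `&`, and a
    stdin redirect from /dev/null inside the script passed to bash.
    """
    argv = shlex.split(command)
    if len(argv) != 3 or argv[:2] != ["bash", "-c"]:
        return False
    tokens = argv[2].split()
    # Check whole tokens so the `&` inside `2>&1` does not count.
    return all(required in tokens for required in ("nohup", "&", "</dev/null"))


print(is_detached_hook(OLD_HOOK), is_detached_hook(NEW_HOOK))
```

Matching on whole tokens rather than substrings is a deliberate choice here: a plain `"&" in script` check would also pass on a synchronous command that merely contains `2>&1`.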
ZdenekSrotyr	7fc5365891	release: 0.46.3 — self-heal session pipeline + clearer diagnose (#220 ) ## Summary Verified against production: `claude -p` headless mode doesn't fire SessionEnd hooks (proven via `--output-format stream-json --include-hook-events`: zero `SessionEnd` events), so any session JSONLs from `-p` invocations stay orphaned locally and never reach the server. Fix: add `agnes push --quiet` as a third SessionStart entry — symmetric self-heal alongside the existing `agnes pull` entry. Existing workspaces pick this up on their next `agnes init` via the marker-based migration already in `cli/lib/hooks.py`. Separately: a colleague's fresh install showed `agnes diagnose` warning "uploads are not being processed", which led them to suspect their `agnes push` was broken. The warning is actually about the LLM-based `verification-detector` backlog (uploads themselves were arriving fine — confirmed by 23+3 JSONLs landed on the server while the warning was firing). Reword the warning to "verification-detector backlog" + add `last_processed` to the diagnose dict so operators don't have to grep logs to confirm. ## Test plan - [x] `pytest tests/test_lib_hooks.py` — updated count + added `agnes push in SessionStart` assertion. - [x] `pytest tests/test_setup_hooks_template.py` — updated. - [x] `pytest tests/test_clean_install_integration.py` — updated. - [x] `pytest tests/test_health_session_pipeline.py` — updated warning text + asserted `last_processed` field.	2026-05-07 17:41:22 +02:00
ZdenekSrotyr	50d10443d1	release: 0.46.2 — friendlier hint on missing-table errors for remote tables (#219 ) ## Summary `agnes query "DESCRIBE unit_economics"` (where `unit_economics` is `query_mode='remote'`) previously returned DuckDB's nearest-name suggestion (`Did you mean "order_economics"`?), sending users down the wrong path. Now appends a friendly hint about remote tables. Reproduced from a real analyst session — colleague spent ~30s diagnosing what was actually "this is a remote table, not materialized locally". ## Test plan - [x] New test: `_query_local("DESCRIBE unit_economics", ...)` against an empty local DuckDB triggers the new hint, original DuckDB error still echoed. - [x] Negative test: a syntax-error query does NOT trigger the hint (regex only matches "Table with name X does not exist"). - [x] `pytest tests/test_cli_query*.py` clean.	2026-05-07 17:24:10 +02:00
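The hint logic can be sketched as a small post-processor over the error text. Everything named here is illustrative: `append_remote_hint`, the hint wording, and the sample error strings are assumptions. The only detail taken from the commit message is that the regex matches "Table with name X does not exist" and that the original error is still echoed.

```python
import re

# Matches only the missing-table message, so syntax errors pass through untouched.
_MISSING_TABLE_RE = re.compile(r"Table with name (\S+) does not exist")

# Hypothetical hint wording; the real text is not shown in the commit message.
_HINT = (
    "Hint: '{name}' may be a remote table that is not materialized locally. "
    "Try `agnes query --remote` instead."
)


def append_remote_hint(error_text: str) -> str:
    """Echo the original error and append a remote-table hint when it applies."""
    match = _MISSING_TABLE_RE.search(error_text)
    if match is None:
        return error_text
    name = match.group(1).strip('"')
    return f"{error_text}\n{_HINT.format(name=name)}"


# Illustrative error strings, not captured from a real session.
missing = 'Catalog Error: Table with name unit_economics does not exist!\nDid you mean "order_economics"?'
syntax = 'Parser Error: syntax error at or near "DESCRIBEE"'

print(append_remote_hint(missing))
```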
ZdenekSrotyr	378ee40459	release: 0.46.1 — surface real BQ error from remote_estimate_failed retry (#218 ) ## Summary When `agnes query --remote` references a column that doesn't exist on the FROM table, users were seeing `Table "<id>" must be qualified with a dataset` instead of the actually-useful `Unrecognized name: <column>` from BigQuery. Surface the first-attempt diagnostic now; keep the second-attempt context as `underlying_original`. Reproduced against production: ``` $ agnes query --remote "SELECT COUNT(*) FROM unit_economics WHERE authorize_date = DATE '2025-05-06'" Error: remote_estimate_failed (HTTP 400) message: Could not estimate scan size for this query. underlying: 400 ... Table "unit_economics" must be qualified with a dataset. ``` (`unit_economics` has `authorize_timestamp`, not `authorize_date`.) ## Test plan - [x] New `test_remote_estimate_failed_surfaces_first_error_when_attempts_differ` asserts the first-attempt message wins, second-attempt is preserved as `underlying_original`, hint points to `agnes schema`. - [x] Existing `test_guardrail_returns_400_remote_estimate_failed_on_double_parse_error` still passes (both attempts mocked to identical error). - [x] `pytest tests/test_api_query_guardrail.py` clean.	2026-05-07 16:54:45 +02:00
ZdenekSrotyr	f1561a67d8	release: 0.46.0 — Keboola cutover bundle (#216 ) Cuts the [Unreleased] section into [0.46.0] in CHANGELOG.md and bumps pyproject.toml. The user-visible content was already on main via PR #190 (commit `28430ced`); this is the release-cut commit that should have been the last commit on that PR — splitting it out so the operator-facing release artifact (tag + GitHub Release) lines up with what's already deployed at :stable.	2026-05-07 12:39:36 +02:00
ZdenekSrotyr	28430ced09	Keboola cutover: native parquet path + sync correctness + auto-discover protection (#190 ) * fix: cutover regressions + parallel Keboola legacy fallback Bundled fixes from a fresh-deploy run on a Keboola Storage backend with the block-shared-snowflake-access feature flag — DuckDB Keboola extension's per-table scan can't access bucket schemas, so the legacy kbcstorage Storage-API client is the only working path. CUTOVER REGRESSIONS - agnes pull hash mismatch on every Keboola local-mode table — src/orchestrator.py:_update_sync_state stored md5(mtime+size)[:12] while the CLI compares against full 32-char content MD5. Now stores the same content MD5 the materialized SQL path already used. - Trailing-slash sanitization in connectors/keboola/access.py and extractor.py — DuckDB Keboola extension's ATTACH fails when the URL ends in / (canonical form). - src/profiler.py:TableInfo.description becomes optional — two call sites instantiated without it, crashing the profiler pass. - scripts/ops/agnes-auto-upgrade.sh: chown on UID change — older images ran as root, current runs as agnes (uid 999). Reads target uid:gid from /etc/passwd inside the new image and chowns ${STATE_DIR}, /data/extracts, /data/analytics when the digest moves. - POST /api/sync/trigger is now singleton per process — two near-simultaneous trigger calls each forked an extractor subprocess, fought for extract.duckdb's file lock, starved uvicorn, flipped the container to unhealthy. Trigger now returns 409 (sync_already_in_progress) when held; _run_sync acquires non-blocking. PARALLEL LEGACY FALLBACK - Process pool fan-out for the _extract_via_legacy queue (default 8 workers, override via AGNES_KEBOOLA_PARALLELISM). Process pool, not thread pool, because connectors/keboola/client.py:export_table does os.chdir(temp_dir) — process-global, so threads raced and slice files landed in the wrong directory ("[Errno 2] No such file or directory: '<job_id>.csv_X_Y_Z.csv'"). 
- Extractor subprocess timeout 1800s -> 3600s (configurable via AGNES_EXTRACTOR_TIMEOUT_SEC). 28+ tables × multi-minute Keboola export jobs need the headroom on telemetry-class projects. - Process group cleanup on timeout — Popen(start_new_session=True) puts the extractor in its own group. On timeout the parent SIGTERMs the group (10s grace) then SIGKILLs stragglers. Without this, the pool workers were reparented to PID 1 and continued holding open Keboola Storage export jobs. Inline extractor script also installs a SIGTERM -> sys.exit(143) handler so the with ProcessPoolExecutor(...) block __exit__ runs cleanly. Tests: existing tests that patched subprocess.run updated to patch subprocess.Popen with a _FakePopen stand-in (same exit-code-injection contract). Two tests that exercised the parallel path forced AGNES_KEBOOLA_PARALLELISM=1 to keep mocks alive (mocks don't ride into ProcessPoolExecutor subprocesses). Squashed onto current main (was 7 commits + multi-commit CHANGELOG + agnes-auto-upgrade.sh conflicts; squash avoids per-commit conflict resolution against main's flat-mount STATE_DIR refactor and 0.38.0 release cut). * feat(keboola): Storage API direct extract path; drop extension data path The DuckDB Keboola extension's COPY routes through Keboola QueryService, which is unreliable on linked-bucket projects (extension v0.1.6 fixes that case but isn't yet in the community CDN, and pre-fix any project with the block-shared-snowflake-access feature flag couldn't see bucket schemas at all). Move the extract path off the extension entirely and talk to the Storage API directly via signed-URL download — works on any project, regardless of extension state. connectors/keboola/storage_api.py (NEW) Lightweight client built on requests.Session. 
Three endpoints: - POST /v2/storage/tables/{id}/export-async (kicks off job) - GET /v2/storage/jobs/{id} (poll until done) - GET /v2/storage/files/{id}?federationToken=1 (signed URL detail) - GET <signed_url> (download bytes) Supports sliced exports (manifest + per-slice signed URLs) and gzipped payloads. ExportFilter dataclass mirrors the Keboola filter spec (whereFilters / columns / changedSince / limit) and handles JSON round-trip with the registry's source_query column. Token redaction in error messages. Bounded exponential backoff on job polling. No cloud-SDK dependency on the data path; thread-safe. connectors/keboola/extractor.py - materialize_query() rewritten: takes bucket/source_table/source_query (JSON filter spec), exports via KeboolaStorageClient, converts CSV to parquet via DuckDB, atomic os.replace. Same return shape so sync.py downstream code stays uniform with the BQ branch. - _extract_via_legacy() also moved to Storage API direct (kept the name for caller compatibility with _legacy_worker / the parallel batch extractor). Per-call temp directories — no os.chdir, threads don't race. app/api/sync.py _run_materialized_pass for source_type='keboola' rows now constructs a KeboolaStorageClient (replaces KeboolaAccess) and passes bucket/source_table/source_query to materialize_query. Reuses one client across rows for HTTP keep-alive. Sources keboola URL from env too (KEBOOLA_STACK_URL) when instance.yaml doesn't have stack_url configured. cli/commands/admin.py discover-and-register defaults Keboola rows to query_mode='materialized' (NULL source_query = full table), matching the v26 migration's unification of the local/materialized split for Keboola. BigQuery and Jira keep their per-source defaults. src/db.py Schema bump 25 → 26. Migration: UPDATE table_registry SET query_mode='materialized' WHERE source_type='keboola' AND query_mode='local'. 
NULL source_query on those rows means "full table export" — same effective behavior the local mode provided, but now via Storage API instead of the extension. pyproject.toml kbcstorage dep stays (admin-side bucket/table list still uses the SDK in app/api/admin.py / connectors/keboola/client.py); only the data path is migrated off the SDK. Comment updated to reflect the new boundary. tests - test_keboola_storage_api.py (NEW, 19 tests): ExportFilter parsing, HTTP client (token redaction, retry logic, polling), download_file (single, gzipped, sliced), end-to-end export_table_to_csv. - test_keboola_materialize.py rewritten: mocks KeboolaStorageClient instead of FakeAccess; same atomic-write + zero-rows + unsafe-id contracts. - test_sync_trigger_keboola_materialized.py: registry rows now carry bucket+source_table+JSON-shape source_query. 114+ Keboola-impacted tests green locally. * test: schema version assertion bumped to 26 alongside the keboola query_mode migration * fix(keboola): cutover hot-patches surfaced on agnes-dev Five small fixes that were applied as in-container hot-patches during agnes-dev cutover and need to be on the source-of-truth image so a fresh upgrade does not undo them. - app/api/sync.py: auto-discover gate considers the WHOLE registry (any source, any mode), not just rows where source matches and query_mode is local. After the v25→v26 keboola materialized migration an instance can have 30 materialized rows and zero local rows; the previous gate kept re-firing _discover_and_register_tables every scheduler tick, creating duplicate auto-discovered rows with the wrong bucket prefix every time. - app/api/admin.py: _discover_and_register_tables reassembles the bucket as <stage>.<bucket-id> (e.g. in.c-finance) instead of dropping the stage prefix; default query_mode for keboola is now materialized (the v26 contract); validator allows NULL source_query for keboola materialized rows (full-table export via Storage API export-async, no SQL needed). 
- cli/commands/admin.py: register-table mirrors the server validator (NULL source_query allowed for source_type=keboola); --bucket help text generalized to cover both BQ dataset and Keboola bucket id. - connectors/keboola/extractor.py: max_line_size=64 MiB on read_csv_auto so embedded JSON / SQL cells (kbc_component_configuration in particular) do not trip the default 2 MiB ceiling. - connectors/keboola/storage_api.py: GCP backend support — when the Storage API returns a manifest whose slice URLs are gs:// references with a gcsCredentials block, rewrite to the JSON REST download endpoint and authenticate with the issued OAuth bearer token; redact tokens in any surfaced error string. * test: align with new keboola materialized + auto-discover-gate contracts - test_admin_keboola_materialized: rename test_register_keboola_materialized_rejects_missing_source_query → test_register_keboola_materialized_accepts_missing_source_query. v25→v26 introduced 'keboola materialized with NULL source_query means full-table export via Storage API export-async' as the default registration shape; the rejection case is no longer the contract. - test_sync_filter: add list_all() to _StubRegistry. The auto-discover gate in _run_sync now keys off the WHOLE registry (not just local rows) so materialized-only Keboola instances do not re-trigger discovery on every tick. * feat(keboola): native parquet export — skip CSV roundtrip Storage API export-async accepts fileType={csv,parquet}. Switching the materialized sync to parquet eliminates the CSV → DuckDB COPY → parquet roundtrip that pinned a single uvicorn worker over 4 GiB on multi-GB tables (read_csv with all_varchar + max_line_size=64MB has to materialize the whole CSV in memory before COPY can stream out a parquet). Snowflake UNLOAD on Keboola's side already produces typed, self-contained parquet files; the extractor downloads them and renames into place. 
Two cases: - Single-file export (small table): file_info.url points at one signed URL; download_file streams chunks straight to .parquet.tmp and we're done. No DuckDB. - Sliced export (Snowflake UNLOAD respects MAX_FILE_SIZE — 16 MiB default — so anything larger arrives as N parquet slices): each slice is a complete parquet file with its own footer; naive concat would corrupt them. download_file_slices keeps the slices as separate files in a tempdir, then DuckDB COPY (SELECT * FROM read_parquet([slice0, slice1, ...])) merges them into one consolidated parquet. DuckDB streams row groups during this — peak memory bounded to one row group (~1 MiB) regardless of source size. The legacy CSV path stays as the explicit opt-in via source_query= '{"file_type":"csv"}' for projects whose backend can't UNLOAD parquet (none known today; cheap escape hatch). Backward-compat alias KeboolaStorageClient.export_table_to_csv kept. Also fixes a latent bug in download_file's gzip detection: previous heuristic flagged any unencrypted file as gzipped, which would have corrupted parquet downloads at gunzip time. Name-suffix-only now. * fix: tempdir leak cleanup, every 0m schedule, /sync/trigger body shapes Three small self-contained fixes uncovered during agnes-dev cutover. - connectors/keboola/extractor.py: tempfile.TemporaryDirectory now uses ignore_cleanup_errors=True so a worker death mid-write doesn't leave multi-GiB stale slice trees on the boot disk. (12 GiB seen after a disk-full crash where TemporaryDirectory's own cleanup also raised and got swallowed.) - src/scheduler.py: is_valid_schedule accepts 'every 0m' (interval=0 = always due). Force-resync of an errored row no longer requires waiting out the default 'every 1h' interval — admin can flip the schedule, trigger, then flip back. 
- app/api/sync.py: POST /api/sync/trigger accepts both ['table_id'] (legacy bare-array body) and {'tables': ['table_id']} (matches the response payload shape, more discoverable for clients building requests by hand). Malformed bodies return 422 with a structured detail; null/missing means 'sync everything' as before. Tests cover: tempdir cleanup on raise (sliced parquet path), is_valid_schedule + is_table_due 'every 0m' acceptance, and trigger body parametrized matrix (8 valid shapes + 6 rejection cases). * fix: targeted-trigger filter in materialized pass + auto-upgrade defer Two operational gaps observed during agnes-dev cutover, in the same sync-routing area. - _run_materialized_pass now takes a 'tables' arg and skips rows not in the target set with reason='not_in_target'. POST /api/sync/trigger with a body of tables previously only scoped the legacy extractor subprocess — the materialized pass kept iterating every due materialized row, so an admin asking to re-sync kbc_job re-ran every other due materialized row alongside it. Match on registry id OR name (admins commonly pass either form). tables=None preserves the no-filter behavior. - New GET /api/sync/status (public, no auth) returns {locked: bool} off _sync_lock.locked(). agnes-auto-upgrade.sh probes this before docker compose up -d and exits 0 with a 'deferred recreate' log line if a sync is in flight — the next 5-min cron tick retries. Pre-fix, an auto-upgrade triggered mid-sync would recreate the uvicorn worker and kill the in-flight extractor / Snowflake-UNLOAD download (observed when kbc_job's first 7-day retry got SIGKILLed). Connection failures in the probe fall through to the upgrade — being stuck on a wedged image is worse than interrupting a hypothetical sync. * fix: auto-discover protects admin overrides + surfaces drift Two real-world incidents on agnes-dev drove this: 1. kbc_job was registered manually with the correct (in.c-kbc_telemetry, kbc_job) coordinates. 
A naive auto-discover re-run would have inserted a SECOND kbc_job row at the slugified id 'in_c-keboola-storage_kbc_job' (where Keboola's discovery places it) — and that row's Storage API export-async 404s. 2. An earlier auto-discover bug stripped the stage prefix from bucket ids ('c-finance' instead of 'in.c-finance'), inserting 137 rows whose syncs all failed. Fix: - _discover_and_register_tables now builds a plan first (_build_keboola_discovery_plan) classifying each discovered table into one of new / existing_match / existing_drift / invalid, then executes only the 'new' bucket. Drift rows are reported with both sides of the disagreement plus drift_kind: - same_id_diff_coords: registry has the same id but different bucket / source_table (admin migrated coords inline). - name_collision: discovery's slugified id differs from any registry id, but the discovered .name matches an existing row's .name (case-insensitive). Catches the kbc_job case. - Bucket detection now prefers the API's authoritative bucket_id field (separate field on the Keboola tables.list response, normalised by KeboolaClient.discover_all_tables). Falls back to id-string parsing only when bucket_id is missing (older fallback path inside discover_all_tables). - Endpoint POST /api/admin/discover-and-register?dry_run=true returns the plan without writing — would_register, drift, invalid lists. Lets an operator audit before merging discovery with a registry that has admin overrides. Removed 'every 0m' from test_register_request_rejects_malformed_sync_schedule — the runtime started accepting it in the previous commit (force-resync override) and the validator follows suit. * feat(keboola): AGNES_TEMP_DIR routes tempfiles off overlayfs /tmp The container's /tmp lives on the boot disk's overlayfs (29 GiB on agnes-dev, shared with /var). Snowflake UNLOAD of a wide table writes slices into per-call /tmp tempdirs that fill multi-GiB / many-slice exports long before the dedicated data disk fills. 
agnes-dev hit 100% boot-disk while the 20 GiB data disk had 15 GiB free. connectors.keboola.storage_api.get_temp_root() reads AGNES_TEMP_DIR; mkdirs the target on first use; unset / empty / unwritable falls back to None (system tempdir, OSS-pre-fix behaviour). Both materialize_query (parquet path) and _extract_via_legacy (CSV fallback) and the sliced-CSV concat path in storage_api use the helper now. docker-compose.yml defaults AGNES_TEMP_DIR=/data/tmp on app, scheduler, and extract services. The data volume is the dedicated disk in production layouts and a plain docker volume in single-disk dev/laptop setups — same blast radius as the previous /tmp default on the latter, no regression.	2026-05-07 12:12:14 +02:00
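The two accepted `POST /api/sync/trigger` body shapes described in this commit can be sketched as a single normalizer. This is an assumption-laden illustration: `normalize_trigger_body` is a hypothetical name, the rejection rules (unknown keys, empty or non-string ids) are guesses at what "malformed" means, and the caller is assumed to map `ValueError` to the 422 response.

```python
from typing import Any, List, Optional


def normalize_trigger_body(body: Any) -> Optional[List[str]]:
    """Normalize a trigger body to a list of table ids, or None for "sync everything".

    Accepts the legacy bare array (["table_id"]) and the object form
    ({"tables": ["table_id"]}). Raises ValueError on anything else.
    """
    if body is None:
        return None  # null/missing body: sync everything
    if isinstance(body, dict):
        unknown = set(body) - {"tables"}
        if unknown:
            # assumption: unexpected keys count as malformed
            raise ValueError(f"unexpected keys: {sorted(unknown)}")
        body = body.get("tables")
        if body is None:
            return None
    if not isinstance(body, list):
        raise ValueError("expected a list of table ids")
    if not all(isinstance(item, str) and item for item in body):
        raise ValueError("table ids must be non-empty strings")
    return body


print(normalize_trigger_body(["kbc_job"]))
print(normalize_trigger_body({"tables": ["kbc_job"]}))
```

Returning `None` rather than an empty list for the no-filter case keeps "sync everything" distinct from "sync nothing", which mirrors the `tables=None` convention the same commit mentions for `_run_materialized_pass`.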
ZdenekSrotyr	c97fd504c5	release: 0.45.0 — easy-wins bundle (#84 #164 #177 #178 #203 #204 ) Operator-and-analyst quality bundle: a security fix for the optional Telegram bot, two CLI gaps closed, and three rounds of UX polish on `agnes diagnose` and `agnes pull` so non-TTY consumers (CI runners, Claude Code SessionStart hooks, sub-agent watchdogs) get readable, actionable signal. - Pairing-code RNG: random.choices -> secrets.choice (CSPRNG). - Telegram script runner: refuse out-of-shape usernames before sudo -u. CLAUDE.md.bak.<ISO-timestamp> before regenerating. - agnes admin unregister-table <id> -> DELETE /api/admin/registry/{id} - agnes admin update-table <id> --field=value ... -> PUT /api/admin/registry/{id} response but never promotes the headline. BQ billing-equals-data check downgraded warning -> info. default (5 s / 1 MiB vs 30 s / 10%) so sub-agent watchdogs don't kill the pull as a hung process. New env knobs: AGNES_PULL_PROGRESS_INTERVAL_{SECONDS,BYTES}. --include-schema (or ?include=schema) to opt back in. Tests: 120 passed across the touched modules, including new tests for each fix. Pre-existing failures on main (DB migration v1->v9, binary rename) are unrelated and not introduced here.	2026-05-07 11:43:16 +02:00
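The pairing-code change from `random.choices` to `secrets.choice` can be illustrated as follows. The alphabet, code length, and function name are assumptions chosen for the example; only the switch to a CSPRNG comes from the commit message.

```python
import secrets
import string

# Hypothetical alphabet and length; the real bot's values are not shown.
PAIRING_ALPHABET = string.ascii_uppercase + string.digits
PAIRING_LENGTH = 8


def new_pairing_code(length: int = PAIRING_LENGTH) -> str:
    """Build a pairing code from the OS CSPRNG via secrets.choice.

    random.choices draws from a Mersenne Twister, whose output can be
    predicted after enough observations, so it is unsuitable for codes
    that grant access.
    """
    return "".join(secrets.choice(PAIRING_ALPHABET) for _ in range(length))


code = new_pairing_code()
print(len(code))
```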
ZdenekSrotyr	cb55374a66	release: 0.44.1 — admin-user-detail empty-group-dropdown hint Patch release: pure UX fix on /admin/users/{id} (frontend-only, no API, no DB, no schema). See CHANGELOG.md for the full entry.	2026-05-07 09:09:45 +02:00
dependabot[bot]	970067e6c3	chore(deps): bump python-multipart from 0.0.26 to 0.0.27 Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.26 to 0.0.27. - [Release notes](https://github.com/Kludex/python-multipart/releases) - [Changelog](https://github.com/Kludex/python-multipart/blob/main/CHANGELOG.md) - [Commits](https://github.com/Kludex/python-multipart/compare/0.0.26...0.0.27) --- updated-dependencies: - dependency-name: python-multipart dependency-version: 0.0.27 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-05-07 09:09:45 +02:00
ZdenekSrotyr	bb36a69b1e	release: 0.44.0	2026-05-07 07:00:00 +02:00
ZdenekSrotyr	e1ac7d41f1	release: 0.43.0 — server-pinned CLI auto-upgrade See CHANGELOG.md for the full entry. (Bumped from 0.42.0 to 0.43.0 since 0.42.0 was taken by PR #208's backtick-rewriter fix during this branch's review cycle.)	2026-05-06 23:24:44 +02:00
ZdenekSrotyr	09958c9d87	release: 0.42.0	2026-05-06 18:04:39 +02:00
ZdenekSrotyr	dfb7f25e76	release: 0.41.0 — orchestrator filesystem fallback for missing _meta materialized rows 0.40.0 added _persist_materialized_inner_view in materialize_query, which tried to open extract.duckdb from a fresh DuckDB handle to write the _meta row + inner view. In production this conflicts with the same uvicorn process's existing read-only ATTACH (orchestrator's analytics conn holds extract.duckdb ATTACHed as <source_name> alias), and DuckDB single-process file-handle uniqueness rejects with: Binder Error: Unique file handle conflict: Cannot attach "extract" — already attached by database "<source>" The helper logs WARNING fail-soft, parquet stays canonical, but the master view never appears via the meta path. Fix: at the end of _attach_and_create_views, scan <extract_dir>/data/*.parquet and CREATE OR REPLACE VIEW <id> AS SELECT * FROM read_parquet('<path>') for any parquet whose <id> is not already in the per-source tables list (= meta path didn't pick it up). Decoupled from materialize_query open-handle race. Honors the same view_ownership cross-connector collision rules as the meta path (first-come-first-served via view_repo.claim). Tests: - filesystem-fallback fires when _meta row missing - skipped when meta path already created the view (no shadow) - skips invalid identifiers (e.g. parquet stem starting with a digit) - doesn't crash when source has no data/ subdir	2026-05-06 16:58:18 +02:00
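The filesystem fallback can be sketched as a pure function that plans the view statements without touching DuckDB. This is a simplified illustration: `fallback_view_statements` and the identifier regex are assumptions, and the `view_ownership` claim step is left out entirely.

```python
import re
from pathlib import Path
from typing import Iterable, List

# Assumed rule for a safe view name: letters, digits, underscore, no leading digit.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def fallback_view_statements(extract_dir: str, known_ids: Iterable[str]) -> List[str]:
    """Plan CREATE OR REPLACE VIEW statements for parquet files the meta path missed."""
    data_dir = Path(extract_dir) / "data"
    if not data_dir.is_dir():
        return []  # source has no data/ subdir: nothing to do, no crash
    known = set(known_ids)
    statements = []
    for parquet in sorted(data_dir.glob("*.parquet")):
        table_id = parquet.stem
        if table_id in known:
            continue  # the meta path already created this view
        if not _IDENTIFIER_RE.match(table_id):
            continue  # e.g. a stem starting with a digit
        path_literal = str(parquet).replace("'", "''")  # escape quotes for SQL
        statements.append(
            f"CREATE OR REPLACE VIEW {table_id} AS "
            f"SELECT * FROM read_parquet('{path_literal}')"
        )
    return statements
```

A caller would execute each returned statement on the orchestrator's existing connection, which avoids opening a second handle on `extract.duckdb`.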
ZdenekSrotyr	b5b16e98a0	release: 0.40.0 — materialize_query writes _meta + inner view so master views appear Pre-fix flow: 1. extractor subprocess writes _meta with N remote rows + creates N inner views in extract.duckdb (rebuild_from_registry skips materialized rows per design — explicit `continue` at line 389) 2. _run_materialized_pass calls materialize_query, which writes parquet atomically + returns stats — but never updates _meta 3. orchestrator.rebuild scans _meta, finds only the N remote rows, creates master views only for them. Materialized parquet is on disk but invisible to /api/query → 400 'not yet materialized' Symptom appears after every container recreate (the previous run's _meta state is wiped because docker compose down nukes the named volume that backs extract.duckdb on some compose layouts; even on volumes that persist, the next extractor pass calls _create_meta_table which DROPs + CREATEs _meta cleanly). Fix: after os.replace(tmp_path, parquet_path) in materialize_query, open extract.duckdb (read-write), DELETE existing _meta row for table_id, INSERT new one with query_mode='materialized', and CREATE OR REPLACE VIEW <table_id> AS SELECT * FROM read_parquet(<path>). All inside a single transaction so concurrent reads see either old or new state, not torn rows. Fail-soft on lock contention or schema drift — parquet remains canonical, next sync pass recovers. Tests: 3 new in test_bq_materialize.py covering: - meta + inner view registered after materialize, alongside existing remote rows - re-run replaces (not duplicates) the meta row - skips inner-view registration when extract.duckdb doesn't exist yet (fresh BQ-only deployment edge case)	2026-05-06 16:04:58 +02:00
ZdenekSrotyr	3b9f6b447d	release: 0.39.0 — perf bundle (BQ query rewrite + session pool + chunked download + HTTP/2)	2026-05-06 13:18:19 +02:00
ZdenekSrotyr	830d1a38f6	merge: CLI perf (chunked DL + HTTP/2 + persistent client + progress) # Conflicts: # CHANGELOG.md	2026-05-06 13:16:31 +02:00
ZdenekSrotyr	bd1b5ad444	perf(cli): persistent HTTP/2 client across pull invocation Pool the httpx.Client used by `stream_download` so N parquet downloads share a single TLS handshake instead of one handshake each. With the optional `h2` package installed, HTTP/2 multiplexing further lets all chunk Range requests share a single TCP connection — synergizes with the range-chunked download path added in the previous commit. The shared client is created lazily on first stream-download call, kept alive for the duration of the process via a module-level slot, and closed at exit via `atexit.register`. Construction wraps in a try/except: when `h2` is unavailable (slim install), httpx raises ImportError on `http2=True` and we transparently fall back to an HTTP/1.1 client — pooling alone still amortizes TLS handshakes. `agnes pull` must never crash on a missing optional package, so the fallback path is non-negotiable. `h2>=4.1.0` is added to the core dependency set; downstream slim installs that drop it lose the HTTP/2 benefit but keep correctness.	2026-05-06 13:06:36 +02:00
ZdenekSrotyr	226eb71592	Merge remote-tracking branch 'origin/main' into pr198-review # Conflicts: # CHANGELOG.md	2026-05-06 11:35:45 +02:00
ZdenekSrotyr	d68c3c5fa2	release: 0.38.2 — bq_query_timeout_ms applied on every BQ attach + surfaced silent failures	2026-05-06 09:48:12 +02:00
ZdenekSrotyr	a7d19206d7	release: 0.38.1 — docs(marketplace) two-step fallback	2026-05-06 09:27:42 +02:00
ZdenekSrotyr	6c94d2cbce	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # pyproject.toml	2026-05-06 07:27:25 +02:00
ZdenekSrotyr	fdc6cd7fb4	release: 0.37.0 — STATE_DIR + flat-mount overlay; host-mount direct-bind fix	2026-05-06 06:53:48 +02:00
ZdenekSrotyr	f33475cec3	release: 0.36.0 — perf + analyst-clarity bundle Renames the [Unreleased] section to [0.36.0] in CHANGELOG, adds the top-level summary, drops a fresh empty [Unreleased] above, and bumps pyproject from 0.35.1. Also fixes the third Devin Review finding on this PR: the CLI ReadTimeout message hardcoded QUERY_TIMEOUT_S (300s) so a 30s-default call (agnes catalog, agnes auth, …) reported a wait window that didn't match reality. _translate_transport_error now takes the actual httpx timeout from the calling helper; the BQ-job advisory only appears for calls where the timeout was set ≥ 60s.	2026-05-05 18:57:04 +02:00
ZdenekSrotyr	28423907fd	feat: clean CLI errors + init progress + skip-materialize + claude.md catalog pointer Three first-try-failure-surface fixes from Pavel's #185 trace + the template guidance question, all under PR #188's umbrella so they land together with the file_server / parallel pull / Tier 1 work. 1. CLI clean-error wrapper — new AgnesTransportError raised by the api_*/stream_download helpers when httpx times out / drops / refuses, plus a top-level Typer wrapper (cli/main.py) that prints one-line "Error: …" + actionable hint and exits non-zero. Full traceback goes to ~/.config/agnes/last-error.log for support forwarding. Unhandled Exceptions are caught at the same boundary so no Python traceback ever leaks to the analyst's terminal. Pavel's #185 Phase 3B: a 30-frame httpx traceback from a slow BQ --remote query made it look like a CLI bug. Now: clean message + hint pointing at `agnes snapshot create` / partition-column guidance. Entry point in pyproject.toml flipped from `cli.main:app` → `cli.main:_run_with_clean_errors` so the wrapper actually runs under the installed `agnes` binary. 2. agnes init / agnes pull --skip-materialize + progress bar. --skip-materialize omits query_mode='materialized' rows from the download set so a first init doesn't spend 44 minutes silently pulling a single 6 GB parquet (Pavel's #185 Phase 1). Rich-driven per-file progress bar with label/bytes/rate/ETA renders to stderr when not --quiet and not --json. Aggregates across the parallel ThreadPoolExecutor workers added earlier in this PR. 3. config/claude_md_template.txt: explicit one-line snippet pointing at `agnes catalog --json | jq '.tables[] | select(.id=="<id>")'` for per-table descriptions + restated invariant: "the description field on each catalog row is the authoritative business-rules text — re-read live, never copy into this file." 
Resolves the regression-or-feature debate between Pavel (wants annotations) and the user feedback that landed in the prior commit (don't embed table-specific content; tables change). Catalog command stays the source of truth.	2026-05-05 18:11:59 +02:00
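The clean-error boundary from item 1 of 28423907fd might look roughly like the sketch below. `TransportError` stands in for the commit's AgnesTransportError, and `run_with_clean_errors` takes the log path as a parameter for illustration; the real wrapper's signature is not shown in the log.

```python
import sys
import traceback
from pathlib import Path
from typing import Callable


class TransportError(Exception):
    """Stand-in for the AgnesTransportError named in the commit."""

    def __init__(self, message: str, hint: str = "") -> None:
        super().__init__(message)
        self.hint = hint


def run_with_clean_errors(app: Callable[[], None], log_path: Path) -> int:
    """Run `app`; on failure print one line and save the traceback to `log_path`."""
    try:
        app()
        return 0
    except Exception as exc:  # boundary: no traceback may reach the terminal
        log_path.parent.mkdir(parents=True, exist_ok=True)
        log_path.write_text(traceback.format_exc(), encoding="utf-8")
        print(f"Error: {exc}", file=sys.stderr)
        hint = getattr(exc, "hint", "")
        if hint:
            print(f"Hint: {hint}", file=sys.stderr)
        return 1
```

Catching `Exception` rather than `BaseException` lets `SystemExit` and `KeyboardInterrupt` pass through, so normal CLI exits are unaffected.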
ZdenekSrotyr	4751094e1c	fix(keboola): per-table fallback to legacy Storage-API client (#183 ) * fix(keboola): per-table fallback to legacy Storage-API client The DuckDB Keboola extension's per-table COPY fails with `Schema '..."in.c-..."' does not exist or not authorized` on projects whose Snowflake backend doesn't expose bucket schemas to the storage-token-derived QueryService role (keboola/duckdb-extension#17). ATTACH itself succeeds, so the existing extension-level fallback in `_try_attach_extension` never triggers — the table is just marked failed. - Promote `kbcstorage>=0.9.0` from optional to core dep so the legacy client import in `_extract_via_legacy` doesn't crash default installs with `ModuleNotFoundError`. - Wrap `_extract_via_extension` in a per-table try/except so a scan failure retries via `_extract_via_legacy` instead of recording `tables_failed` and moving on. Slower than the extension path, but produces correct parquets on affected projects while the upstream extension fix lands. * test(keboola): cover per-table extension→legacy fallback Two existing tests mocked _extract_via_extension to throw and asserted the original message survived in result["errors"]. With per-table fallback, the new flow retries via _extract_via_legacy — which on the mock URLs would throw a different (404 / DNS-fail) error, replacing the asserted message. - Mock _extract_via_legacy alongside _extract_via_extension in test_network_timeout_during_extraction + test_partial_failure_continues + test_all_tables_fail_returns_full_failure_stats so the assertion observes the final propagated error from the fallback chain. - Add test_extension_per_table_failure_falls_back_to_legacy that exercises the new behavior directly: extension scan fails with the QueryService schema-not-authorized message (keboola/duckdb-extension#17), legacy succeeds, parquet ends up queryable.	2026-05-05 15:47:44 +02:00
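The per-table extension-to-legacy fallback described in 4751094e1c can be sketched as follows. The callables and the result keys are hypothetical stand-ins for `_extract_via_extension`, `_extract_via_legacy`, and the real result dict, whose exact shape the log does not show.

```python
from typing import Callable, Dict, Iterable, List


def extract_tables(
    tables: Iterable[str],
    via_extension: Callable[[str], None],
    via_legacy: Callable[[str], None],
) -> Dict[str, List[str]]:
    """Try the fast path per table; retry failures on the legacy path."""
    result: Dict[str, List[str]] = {"extracted": [], "fallback": [], "errors": []}
    for table in tables:
        try:
            via_extension(table)
            result["extracted"].append(table)
            continue
        except Exception:
            pass  # fall through to the slower legacy client
        try:
            via_legacy(table)
            result["fallback"].append(table)
        except Exception as exc:
            # Only the final error of the fallback chain is recorded.
            result["errors"].append(f"{table}: {exc}")
    return result
```

This also shows why the existing tests had to mock both paths: once the fallback exists, the error that surfaces is the legacy one, not the extension one.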
ZdenekSrotyr	a220955640	release: 0.35.1 — CLI --remote query timeout fix Patch release bundling the only Unreleased change: bump httpx client timeout for agnes query --remote from 30s to 300s (configurable via AGNES_QUERY_TIMEOUT). Renames CHANGELOG [Unreleased] section to [0.35.1] and bumps pyproject version to match.	2026-05-05 15:01:37 +02:00
ZdenekSrotyr	3d63965a67	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # app/web/templates/_app_header.html	2026-05-05 12:05:50 +02:00
ZdenekSrotyr	78cad8b235	release: 0.35.0 — /store + /my-ai-stack + security fixes + CLI	2026-05-05 08:18:16 +02:00
ZdenekSrotyr	567385d046	release: 0.35.0 - session pipeline fix (BREAKING) (#176) Five compounding defects on default `docker compose up` deploys made the session pipeline silently broken: sessions uploaded by analysts via `agnes push` landed on /data/user_sessions/<user>/.jsonl but nothing ever processed them. Fix is one PR: promote anthropic + openai to core deps, wire all three LLM-pipeline jobs into scheduler-v2 with offset cadences (10m/15m/17m), drop the side-car services from compose, seed a default ai: block on first-time setup with an env-var fallback in code, surface the pending review queue to admins, and expose a health check that warns when uploaded jsonls aren't being processed. BREAKING for operators on COMPOSE_PROFILES=full or with custom Compose overrides referencing the corporate-memory or session-collector service stanzas: drop them. The scheduler is now the sole driver.	2026-05-05 00:46:27 +02:00
ZdenekSrotyr	d2104555c6	fix(deps): promote anthropic + openai to core dependencies (#176) LLM provider SDKs are imported by services/corporate_memory and services/verification_detector, both production code paths. Listing them only in [project.optional-dependencies].dev caused the scheduler container to boot-loop with ModuleNotFoundError on default `docker compose up` deploys, because the Dockerfile installs core deps only (`uv pip install --system --no-cache .`). Adds tests/test_packaging.py to lock the contract: anthropic + openai must live in [project].dependencies, not in dev extras.	2026-05-04 23:52:30 +02:00
ZdenekSrotyr	0430c0de00	release: 0.34.0 — clean analyst bootstrap (BREAKING) + bundled fixes Headlines: - Clean analyst bootstrap rewrite: web /setup → paste prompt → Claude Code in empty folder = working analyst workspace. CLI binary renamed da → agnes. See CHANGELOG ## [0.34.0] for the full breaking-change matrix. - Unified /setup flow: collapsed the admin/analyst tile split (the ?role= query parameter introduced mid-cycle is gone). Every signed-in user sees the same flow; marketplace + plugins block emitted iff caller has plugin grants. PAT scope uniform (general 90 d). - Bundled fixes: supersedes #172 (Windows console encoding), merges #174 (BigQuery materialize view fix + concurrency, schema v24 migration), closes #171 (--remote query pre-check no longer over-rejects narrow queries on partitioned tables, ~30,000x over-estimate fix). - Devin Review findings addressed throughout the cycle: query.py:464 (rewriter cross-contamination), extractor.py:166 (TTL reclaim dead code), db.py:1757 (v24 migration retry path), init.py:99 (stale on-disk token override), and more. - Operator UX: register-table now requires --bucket for materialized rows + emits first-sync and grant hints on success. agnes status sessions counter reads from ~/.claude/projects/<encoded-cwd>/. agnes init --token now wins over stale ~/.config/agnes/token.json. Open follow-ups (separate issues): - #175 sync architecture redesign (full-extract Keboola, full-file downloads, user-global sync_state) - #177 admin CLI: missing unregister-table / update-table commands - #178 agnes diagnose: introduce "info" severity tier	2026-05-04 23:13:23 +02:00
ZdenekSrotyr	e438170ade	merge: pull #174 (BQ materialize view fix + concurrency, 0.33.0) into bootstrap branch Brings in zs/materialize-sync-fix (PR #174): - BigQuery view materialize works (wrap admin SQL in bigquery_query()) - Per-table mutex + fcntl.flock for concurrent COPY corruption - Cost guardrail dry-run engages on materialized rows - Schema v23 -> v24 migration: rewrite source_query to BQ-native - Server-generated trivial source_query from bucket+source_table - Validator backtick relaxation for materialized rows - 0.33.0 release cut Conflict resolution: - CHANGELOG.md: keep our [Unreleased] (bootstrap rewrite content) ABOVE the new [0.33.0] section from #174. The bootstrap rewrite remains unreleased; it'll cut 0.34.0 (or later) when this PR merges to main. - tests/conftest.py: union — keep our analyst-bootstrap fixture re-export AND #174's bq_instance / stub_bq_extractor fixtures. - pyproject.toml auto-merged to 0.33.0 (matches the cut), correct. - src/db.py auto-merged: SCHEMA_VERSION = 24, _v23_to_v24_finalize added — no overlap with our work which left schema at v23. - CLAUDE.md auto-merged: schema-history paragraph extended with v24. Verified: 79/79 across CLI bootstrap suite + materialize suite + schema v24 migration tests pass locally on Python 3.13/macOS.	2026-05-04 20:53:00 +02:00
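The "per-table mutex + fcntl.flock" item pulled in from #174 combines an in-process lock with a cross-process file lock. A minimal sketch follows, assuming a Unix host; the `table_write_lock` name and the lock-file layout are hypothetical.

```python
import fcntl
import threading
from collections import defaultdict
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator

_registry_guard = threading.Lock()
_table_locks: dict[str, threading.Lock] = defaultdict(threading.Lock)


@contextmanager
def table_write_lock(table_id: str, lock_dir: Path) -> Iterator[None]:
    """Serialize writers of one table across threads and processes."""
    with _registry_guard:  # guard creation of the per-table lock
        thread_lock = _table_locks[table_id]
    lock_dir.mkdir(parents=True, exist_ok=True)
    with thread_lock, open(lock_dir / f"{table_id}.lock", "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other processes release
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

The thread lock covers workers inside one process, and the flock covers separate processes writing the same parquet.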
ZdenekSrotyr	cd3293b994	release: 0.33.0 — BQ materialize view fix + concurrency control	2026-05-04 20:30:50 +02:00
ZdenekSrotyr	8c8cdf6a6a	feat(cli): rename binary from da to agnes (BREAKING)	2026-05-04 16:05:14 +02:00
ZdenekSrotyr	cf8930b593	chore(release): cut 0.32.0 — #160 da query --remote on VIEW + 4 reinforcing fixes CHANGELOG: rename [Unreleased] → [0.32.0] — 2026-05-04, prepend a new empty [Unreleased] for next-PR landing zone. pyproject.toml: version 0.31.0 → 0.32.0. Per repo discipline (memory: feedback_release_cut_with_pr.md), the release-cut commit lands as the FINAL commit of the PR that contained the user-visible behavior change — it does not get a separate PR. After merge: tag v0.32.0 on the merge commit + create a GitHub Release (memory: feedback_github_release_per_tag.md — the tag alone isn't enough; the Release prose is the operator-visible artifact). Headline: closes #160. da query --remote now resolves query_mode='remote' BQ rows whose entity is VIEW or MATERIALIZED_VIEW (the bug Pavel hit). Plus 4 reinforcing fixes — server-side cost guardrail (bq_max_scan_bytes, default 5 GiB), registry-gating of direct bq.* paths, bigquery_query() function-call backdoor closed, structured CLI render of typed BQ errors — and one operator-side admin convenience (BQ test-connection endpoint + billing_project placeholder UI). 14 issues caught and addressed across 6 iterations of Devin Review. E2E verified on agnes-zsrotyr.groupondev.com (commit `7f743d03`): - VIEW path resolves (count=23 from active_inventory_view) - VIEW aggregate parity vs filtered BASE TABLE - cost guardrail rejects with structured 400 detail - bq_path_not_registered 403 (incl. quoted "bq" variant) - bigquery_query() blocklist 400 - test-connection endpoint 200 with elapsed_ms	2026-05-04 14:37:52 +02:00
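The server-side cost guardrail from cf8930b593 (bq_max_scan_bytes, default 5 GiB, rejecting with a structured 400 detail) can be sketched as below. The function name and the fields of the detail dict, including the error code, are hypothetical; the log only states that the detail is structured.

```python
from typing import Optional

DEFAULT_MAX_SCAN_BYTES = 5 * 1024**3  # 5 GiB, the default named in the commit


def check_scan_budget(
    estimated_bytes: int,
    max_scan_bytes: int = DEFAULT_MAX_SCAN_BYTES,
) -> Optional[dict]:
    """Return None when the query fits, else a structured rejection detail."""
    if estimated_bytes <= max_scan_bytes:
        return None
    return {
        "status": 400,
        "error": "bq_scan_limit_exceeded",  # hypothetical error code
        "estimated_bytes": estimated_bytes,
        "max_scan_bytes": max_scan_bytes,
    }
```

The estimate itself would come from a BigQuery dry run, which is outside this sketch.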
ZdenekSrotyr	26dc367037	release(0.31.0): cut Agent Setup Prompt + BREAKING CLI/API removals	2026-05-03 21:03:57 +02:00
ZdenekSrotyr	91caefaca9	security(auth): per-IP rate limit + last-admin guard (#165 ) * security(auth): per-IP rate limit on auth endpoints + generalize last-admin guard Closes #45 and #151. #45 — every auth endpoint was unthrottled (login, magic-link, token, bootstrap), leaving us open to password brute-force and SMTP email-bombing. Wires slowapi (new dep) into the middleware chain with per-route limits: 10/min on login + token, 5/min on send-link, 3/min on bootstrap. Returns 429 with Retry-After: 60 once exceeded. Per-IP key respects the leftmost X-Forwarded-For hop (Caddy in front of the app strips client-supplied XFF). Operator escape hatch: AGNES_AUTH_RATELIMIT_ENABLED=0. Test suite disables the limiter via autouse conftest fixture so existing auth tests that hammer endpoints in tight loops are unaffected. #151 — DELETE /api/admin/users/{id}/memberships/{group_id} and the mirror DELETE /api/admin/groups/{group_id}/members/{user_id} only guarded against self-removal as last admin. Generalizes to refuse removing anyone from the seeded Admin group when they are the only remaining active admin (mirrors the existing count_admins(active_only=True) <= 1 check on delete_user / update_user). Recovery from zero admins requires direct DB access, so this closes a path where a scheduler/bootstrap actor that bypasses normal admin checks could otherwise empty the group. * security(auth): throttle remaining email-bombing + token-confirm endpoints Address code-review gap on PR #165 — the first commit covered /send-link but missed two endpoints with the IDENTICAL email-bombing surface: - POST /auth/password/reset — sends reset mail, anti-enum response - POST /auth/password/setup/request — sends setup mail, anti-enum response Both now share the 5/min limit with /send-link. 
Also add 10/min to the token-confirm surfaces — high-entropy tokens but partial leaks via logs / referer have surfaced before, and unbounded guess rate would let an attacker exhaust the keyspace adjacent to a leaked prefix: - POST /auth/email/verify - GET /auth/email/verify — closes the click-through bypass - POST /auth/password/reset/confirm - POST /auth/password/setup/confirm Doc fix: rate_limit.py module docstring + CHANGELOG entry no longer claim "disable without a redeploy" (misleading). The Limiter constructor freezes `enabled` from env at import time, matching every other Agnes env knob — operators set the flag and bounce the container. Tests: 4 new cases in test_auth_rate_limit.py covering /reset, /setup/request, /reset/confirm, GET /verify. Full suite: 2583 passed, 32 skipped, 0 failed. * security(auth): throttle JSON /auth/password/setup — closes form-throttle bypass Second code-review pass on PR #165 caught a fifth gap: POST /auth/password/setup (JSON variant, kept for backward compat) consumes the same setup_token as the web form /setup/confirm but was unthrottled — an attacker brute-forcing the token just switches from the form path to the JSON path and resumes at unbounded RPS. Apply the same 10/min limit and signature shape used on /setup/confirm. Also extend CHANGELOG note about the JSON-variant bypass for future operators reading the security entry. Test: 1 new case (test_password_setup_json_rate_limited_after_10_requests), 9 rate-limit tests + 28 password-flow tests + 41 auth-provider tests pass, no regressions. * chore(release): cut 0.30.1 — auth security hardening (rate limit + last-admin guard)	2026-05-02 21:08:33 +02:00
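The per-IP throttling in 91caefaca9 is implemented with slowapi; the sketch below is a standard-library stand-in that only illustrates the two ideas named in the message: keying on the leftmost X-Forwarded-For hop and refusing requests once a per-minute limit is reached. All names here are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Mapping


def client_key(headers: Mapping[str, str], peer_ip: str) -> str:
    """Use the leftmost X-Forwarded-For hop, else the socket peer address."""
    forwarded = headers.get("x-forwarded-for", "")
    first_hop = forwarded.split(",")[0].strip()
    return first_hop or peer_ip


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within `window_s` seconds."""

    def __init__(self, limit: int, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit, self.window_s, self.clock = limit, window_s, clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a hit and report whether it fits in the current window."""
        now = self.clock()
        hits = self._hits[key]
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()  # forget hits that left the window
        if len(hits) >= self.limit:
            return False  # caller would answer 429 with Retry-After
        hits.append(now)
        return True
```

Trusting the leftmost hop is only safe because, as the commit notes, the proxy in front of the app strips client-supplied X-Forwarded-For values.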
ZdenekSrotyr	7052a23552	release(0.30.0): per-connector tab UI + Keboola materialized parity + /admin/server-config full exposure Highlights (full prose in CHANGELOG.md [0.30.0]): - Smart local sync — Claude Code SessionStart/SessionEnd hooks via 'da analyst setup' + 'da sync --quiet' for hook-friendly output - query_mode='materialized' end-to-end for BigQuery + Keboola — admin SELECT (against bq.dataset.x or kbc.bucket.table) → scheduler runs through DuckDB extension → parquet → da sync distribution - /admin/tables per-connector tabs (BigQuery / Keboola / Jira), full Keboola Custom-SQL parity, form cleanup, per-row Manage access deep link - /admin/server-config known-fields registry + structured nested editor: surfaces BQ optional knobs (billing_project, legacy_wrap_views, max_bytes_per_materialize), ai.base_url, new openmetadata + desktop sections, full corporate_memory governance schema - da diagnose warns on USER_PROJECT_DENIED-prone billing_project=project config - Schema v20 — adds source_query TEXT to table_registry	2026-05-01 20:38:34 +02:00
Vojtech	c364f65127	fix(tls-rotate): self-signed fallback sets basicConstraints=critical,CA:FALSE (#159 ) * fix(tls-rotate): self-signed fallback sets basicConstraints=critical,CA:FALSE OpenSSL's default '[v3_ca]' config marks CA:TRUE on 'req -x509', which causes strict TLS stacks (rustls / webpki, used by uv, cargo, and future versions of pip) to reject the cert with 'invalid peer certificate: CaUsedAsEndEntity' per RFC 5280 §4.2.1.9. Browsers, curl, and OpenSSL-based clients tolerated the violation, hiding the bug until a uv user hit it. Affects every VM running on the self-signed fallback while the corp PKI hasn't published the real chain yet. Fix lands on the next agnes-tls-rotate.timer tick (or 'systemctl start agnes-tls-rotate.service' for an immediate refresh). Existing CSR / real-cert paths unaffected; only the bring-up fallback regenerates. * chore(release): cut 0.29.0 --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-01 12:23:14 +02:00
Vojtech	bd7b8c3233	fix(analyst): document BigQuery remote-query capability in bootstrap CLAUDE.md template (#154 ) * fix(analyst): document BigQuery remote-query capability in bootstrap CLAUDE.md template Closes #153. The CLAUDE.md template generated by `da analyst bootstrap` (config/claude_md_template.txt) covered metrics, sync, corporate memory, and directory layout — but had ZERO mention of query_mode: "remote", da fetch, da query --remote, or --register-bq. Result: the AI analyst running in a freshly-bootstrapped workspace had no idea BigQuery-backed tables existed, no path to fetch unsynced data, and no fallback for tables not in the catalog. Validated against /Users/<user>/foundry-ai/foundryai-data-analyst/CLAUDE.md on 2026-05-01: section confirmed missing. Workspace-level (parent-dir) CLAUDE.md carried legacy SSH-heredoc instructions but the analyst-level file (which Claude reads as primary project context) had nothing. ## Changes ### config/claude_md_template.txt (+83) Added a `## Remote Queries (BigQuery)` section covering: - Discovery first — `da catalog --json \| jq '...'` to see all tables with their query_mode, then `da schema` and `da describe` for shape. - Three query patterns: - `da fetch` (preferred) — materialize a filtered subset locally, query the snapshot, drop when done. - `da query --remote` — one-shot server-side execution (cheap probes). - `da query --register-bq` — hybrid joins between local + ad-hoc BQ. - `da fetch` estimate-first discipline — rules of thumb on --select / --where / --estimate / snapshot reuse. - BigQuery SQL flavor cheat sheet for `--where` (DATE literal, DATE_SUB, REGEXP_CONTAINS, CAST AS INT64). - Unknown-table fallback: when a table isn't in `da catalog` at all, use ad-hoc `--register-bq` if the agnes server SA has BQ access, or ask admin to register with `query_mode: "remote"` for ongoing use. - Pointer to `da skills show agnes-data-querying` for deeper guidance. 
### docs/setup/claude_md_template.txt (deleted) Stale 359-line template that documented the deprecated SSH-heredoc remote_query.sh protocol. No code references it (verified via grep across .py / .sh / .yml / .md). Removing eliminates two failure modes: 1. A future refactor accidentally pulling it into a workspace and shipping deprecated guidance to analyst Claude sessions. 2. Reviewer confusion over which template is canonical. ### CHANGELOG.md `### Fixed` and `### Removed` entries under [Unreleased]. ## Tested - Manually walked the diff against `da skills show agnes-data-querying` output on a live VM (foundryai-development) — patterns + flags match the modern CLI exactly. - Re-bootstrap test deferred: requires network round-trip; pattern is identical to existing template substitution path so render is not at risk. ## Out of scope - The companion gap that data_description.md often only enumerates query_mode: "local" tables (no signal that other modes exist) — separate concern, fix likely belongs in the metadata generator on the server side, not in the analyst template. - Encouraging admins to register frequently-queried BQ tables as `query_mode: "remote"` in the registry — workflow improvement, not a code bug. * chore(release): cut 0.28.0 --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-01 12:06:41 +02:00
minasarustamyan	d4ac84dd46	feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150) * feat(rbac): drop dataset_permissions + access_requests + users.role + is_public; v19 migration BREAKING. Unifies the data RBAC layer into the per-group resource_grants model. Before this PR the legacy data RBAC layer (dataset_permissions + is_public bypass) was de facto inactive: is_public had no API/UI/CLI surface, and its default of true meant can_access_table always bypassed the check. Now every non-admin access requires an explicit resource_grants(group, "table", id) row. Schema v18 → v19 (src/db.py:_v18_to_v19_finalize): - DROP TABLE dataset_permissions, access_requests - DROP COLUMN users.role (NULL artifact since v13) - DROP COLUMN table_registry.is_public - Drops go through the table-rebuild idiom (rename → create new → INSERT … SELECT → drop old) because of DuckDB ALTER DROP COLUMN limitations on tables with historic FK constraints. INSERT picks the intersection of columns, so test fixtures with a minimal pre-v19 schema migrate cleanly. 
Runtime: - src/rbac.py:can_access_table → delegates to app.auth.access.can_access - DatasetPermissionRepository, AccessRequestRepository deleted - AGNES_ENABLE_TABLE_GRANTS env-gate in app/resource_types.py removed (TABLE is unconditionally enabled) API drop: - app/api/permissions.py, app/api/access_requests.py (entire files) - /admin/permissions web route + admin_permissions.html - "Request Access" modal in catalog.html + locked-row UI - ~10 if user.get("role") != "admin" checks replaced (the admin shortcut lives inside can_access_table) - /api/settings: drop permissions field from GET; PUT /api/settings/dataset gate switched to can_access(user_id, "table", dataset, conn) Auth: - app/auth/jwt.py:create_access_token: drop role parameter (the claim disappears from newly issued JWTs; old tokens remain valid, claim ignored) - app/api/users.py: drop role from CreateUserRequest / UpdateUserRequest (admin promotion = explicit add to Admin group via memberships API) - src/repositories/users.py: drop role from create() / update() CLI: - da admin set-role removed → hard-fail with replacement command - da admin add-user --role flag removed - da auth import-token --role flag removed - da auth whoami: drop "Role:" output - cli/config.py:save_token: role parameter now optional, no longer written (back-compat with old token.json files is preserved; the field is ignored) Tests: - DELETE: test_permissions.py, test_permissions_api.py, test_access_requests_api.py - REWRITE: test_access_control.py (resource_grants flow), test_rbac.py (can_access_table over resource_grants), test_journey_rbac.py (drop access-request flow), test_resource_types.py (drop env-gate tests, drop is_public from helpers), test_v2_.py (drop role-based user dicts in favor of id-based + Admin group membership), test_settings_api.py (no permissions field, can_access gate) - TRIVIAL: ~30 files: drop role="admin" arg from UserRepository.create and the 3rd positional role from create_access_token - NEW: test_v18_to_v19 migration test (test_db.py), 
test_can_access_table_no_implicit_public (test_rbac.py), test_admin_set_role_returns_hardfail (test_cli_admin.py) - OpenAPI snapshot regenerated Docs: - CHANGELOG: BREAKING entry under [Unreleased] - CLAUDE.md: schema v18 → v19 - docs/architecture.md: schema table + RBAC section rewritten - docs/auth-google-oauth.md: admin promotion via da admin break-glass - cli/skills/security.md: completely rewritten for the group-based model - docs/TODO-rbac-data-enforcement.md: deleted (TODO done) Test results: 2363 passed, 19 failed. The remaining failures are pre-existing Windows-specific issues (fcntl, charset) unrelated to this PR, verified with git stash pop. Plan: ~/.claude/plans/floofy-coalescing-parnas.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> chore(release): cut 0.27.0 --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-04-30 22:02:16 +02:00
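The table-rebuild idiom used by the v18 → v19 migration in d4ac84dd46 (rename, create new, INSERT … SELECT over the shared columns, drop old) can be sketched with the standard-library sqlite3 module. SQLite is only a stand-in here; the real migration runs on DuckDB, and the helper name and signature are hypothetical.

```python
import sqlite3


def drop_columns_via_rebuild(conn: sqlite3.Connection, table: str,
                             new_ddl: str, new_columns: list[str]) -> None:
    """Rename old table, create the new shape, copy shared columns, drop old."""
    old = f"{table}__old"
    conn.execute(f'ALTER TABLE "{table}" RENAME TO "{old}"')
    conn.execute(new_ddl)
    existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{old}")')}
    # Copy only columns present on both sides, so minimal legacy schemas
    # (for example trimmed test fixtures) still migrate.
    shared = [c for c in new_columns if c in existing]
    cols = ", ".join(f'"{c}"' for c in shared)
    conn.execute(f'INSERT INTO "{table}" ({cols}) SELECT {cols} FROM "{old}"')
    conn.execute(f'DROP TABLE "{old}"')
```

Columns that exist only in the new shape are left at their defaults, and columns that exist only in the old table (such as the dropped `role`) are simply not copied. The sketch assumes at least one shared column.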
Vojtech	2447da7bb1	refactor(ops): bake all host artifacts into image, drop every curl-from-main (#149 ) * refactor(ops): bake all host artifacts into image, drop every curl-from-main Replaces the curl-from-main pattern (originally introduced in 0.25.0 for agnes-auto-upgrade.sh; older for the compose files + Caddyfile) with image- bundled host artifacts. Same-tag delivery for everything the host runs, version-pinned by AGNES_TAG, atomically rolled back by reverting the image. ## Motivation The customer-instance startup template was curling 6 files from raw.githubusercontent.com on every VM boot: docker-compose.yml docker-compose.prod.yml docker-compose.host-mount.yml docker-compose.tls.yml Caddyfile scripts/ops/agnes-auto-upgrade.sh (added in 0.25.0) Every one of them already lives inside the image (`COPY . .` copies the whole repo to /app/). Curling them from the public internet duplicates content the image already carries and introduces three problems: 1. Split-brain version pinning. image_tag pins the docker image to an immutable digest. The compose files + script bypassed that pinning by tracking `main` (or the rarely-set compose_ref). A customer pinned to stable-2026.04.516 could wake up tomorrow with their host artifacts floating on whatever shipped to main overnight — even though they're explicitly pinned for stability. 2. No rollback knob. Reverting a bad host artifact meant reverting the upstream PR globally — affects every customer that reboots after the bad commit. No "rollback for me only" path; tag-pinning gave no protection. 3. Public-internet dependency on every boot. The image is already pulled from a private registry on the same boot. Reusing that channel is strictly cheaper than adding a second one. Customers with restricted egress (no raw.githubusercontent.com reachability) silently broke on every boot. ## Changes ### Dockerfile (+19 -8) After `COPY . 
.` and before the wheel build, an explicit `cp` lifts every host-side artifact into a stable contract path /opt/agnes-host/: agnes-auto-upgrade.sh (mode 0755 — host cron driver) docker-compose.{yml,prod,host-mount,tls}.yml Caddyfile (mode 0644) Why a copy instead of pointing at /app directly: /app is owned by uid 999 (USER agnes); /opt/agnes-host is root-owned, mode 0755 across the board, stable path that won't shift if /app structure refactors. ### infra/modules/customer-instance/startup-script.sh.tpl (+22 -36) Replaced six curls and the standalone agnes-auto-upgrade.sh extract block (introduced earlier in this PR) with one extract sequence in section 3: docker pull "$${IMAGE_REPO}:$${IMAGE_TAG}" EXTRACT_CONTAINER=$(docker create "$${IMAGE_REPO}:$${IMAGE_TAG}") trap "docker rm '$EXTRACT_CONTAINER' >/dev/null 2>&1 \|\| true" EXIT docker cp "$EXTRACT_CONTAINER:/opt/agnes-host/." "$APP_DIR/" docker cp "$EXTRACT_CONTAINER:/opt/agnes-host/agnes-auto-upgrade.sh" /usr/local/bin/agnes-auto-upgrade.sh chmod +x /usr/local/bin/agnes-auto-upgrade.sh The auto-upgrade section (#6) is now a no-op — script is already in place. ### infra/modules/customer-instance/variables.tf (+1 -1) `compose_ref` marked DEPRECATED in description. Default unchanged for one release cycle to avoid breaking existing terraform plans. Will be removed in a future major bump. ### CHANGELOG.md `### Changed` entry under [Unreleased] — supersedes the narrower entry this PR previously had (which only covered the script). ## Out of scope (filed as follow-ups) 1. agnes-the-ai-analyst-infra/startup.sh (operator deploy) still curls the same artifacts from main. Symmetric fix needed there. Will file as a separate PR against the infra repo. 2. Self-update inside agnes-auto-upgrade.sh after a successful `docker compose pull` of a new digest. Otherwise the running cron keeps using the OLD baked-in script for one tick after image upgrade. ~10 LOC. Deferred to keep this PR scoped. 3. 
scripts/ops/agnes-tls-rotate.sh has the same shape — host-side bash currently sourced via the infra repo. Should follow the same bake-into-image pattern. ## Tested - Local: `docker build .` succeeds with the new RUN block. - `docker create` + `docker cp /opt/agnes-host/.` round-trips all 6 artifacts; sha matches each source file. - Not yet tested on a live VM bring-up — that requires a CI image with this Dockerfile change. Recommend reviewer trigger CI build, then do a single VM-recreate against a dev VM (e.g. foundryai-development) to confirm the extract path works end-to-end before merge. ## Compatibility - Existing VMs running 0.25.0 are unaffected — they have host artifacts in place from `curl from main` already; this PR doesn't touch them. They pick up the new pattern only on next VM recreate. - VMs pinned to an image_tag older than this PR (no /opt/agnes-host in the image) would FAIL the docker cp. Current diff fails-loud (no fallback). Recommend operators upgrade to a fresh-enough image_tag alongside the template upgrade — same coupling as any compose-flag bump. * docs(infra): document image_tag >= v0.26.0 minimum on prod/dev_instances The new startup script extracts host artifacts from /opt/agnes-host/ inside the image — a directory added in this PR (will ship as v0.26.0). Pinning image_tag to an older tag would fail-loud at first boot with 'docker cp: No such file or directory'. Existing VMs are unaffected because the module ignores metadata_startup_script changes. Devin ANALYSIS_0004 on PR #149. * fix(changelog): mark BREAKING + drop private-repo reference Per CLAUDE.md, breaking changes start with BREAKING so operators can grep before bumping the pin. The image_tag minimum constraint introduced here qualifies — older tags fail-loud at first boot. Also drop the explicit 'agnes-the-ai-analyst-infra' name from the entry; the OSS distribution shouldn't reference operator-side deploy templates by their private-repo names. 
Generic 'consumer- side deploy templates' wording instead. Devin BUG_0001 + WARN_0001 on PR #149. * chore(release): cut 0.26.0 --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-04-30 21:40:25 +02:00
Vojtech	ddffdfeafd	fix(ops): fail-fast guard in agnes-auto-upgrade — refuse start if config disk not mounted (#146 ) * fix(ops): fail-fast guard in agnes-auto-upgrade — refuse to start containers if config disk not mounted Companion to keboola/agnes-the-ai-analyst-infra#62. Same incident: foundryai-development 2026-04-30, marketplaces / DuckDB / session secret written to /data (sdb) instead of the config disk (sdc), wiped on next container recreate. ## Why an app-side guard agnes-auto-upgrade.sh fires every 5 min on every VM. If `/data/state` is not on the config disk (because of the propagation regression fixed by the infra PR, or the boot-time udev race fixed by infra #58, or any future mount-loss path), this script previously ran `docker compose up -d` anyway — and the app silently wrote state onto the wrong disk. Next recreate, that state was gone. The boot-time fixes in infra are preventive. This is the runtime backstop. ## Behavior Before the existing pull/up logic, when /dev/disk/by-id/google-config-disk exists on the VM: 1. Up to 3 mount-and-verify attempts with backoff (2s, 4s, 6s). - Mount the config disk if /data/state is not a mountpoint. - Detect mismatch: if /data/state is mounted from the wrong source, umount and retry. 2. After the loop, assert findmnt source matches the config disk. - On mismatch: `logger -t agnes-auto-upgrade FATAL` + exit 1. systemd marks the service failed; no docker compose action runs; existing containers (if any) keep running on stale state, but no new write lands on the wrong disk. 3. Once verified mounted: re-apply `mount --make-rprivate /data /data/state` on every run. Idempotent. Guards against propagation regressions sneaking back in via future docker / kernel changes. VMs without a config disk (foundryai-poc, single-disk legacy) skip the whole block — the `if [ -e $CONFIG_DEVICE ]` guard. 
## Tested Patched script installed on foundryai-development as a hotfix; manual run post-migration was a no-op (digest unchanged); /data/state stayed on sdc across a full `docker compose down + up -d` cycle. ## Rollout - This file is fetched by infra startup.sh from raw.githubusercontent.com/keboola/agnes-the-ai-analyst/main on every boot. Once merged to main, all VMs pick up the new script on their next boot — no infra recreate needed. - For immediate rollout to running VMs without waiting for next boot: `scp scripts/ops/agnes-auto-upgrade.sh <vm>:/tmp/ && ssh <vm> sudo install -m755 -o root -g root /tmp/agnes-auto-upgrade.sh /usr/local/bin/agnes-auto-upgrade.sh` (already done on foundryai-development). * chore: vendor-agnostic comment + changelog text Drop customer-specific VM names from the script comment and CHANGELOG entry. The OSS distribution should not name a particular operator's hosts; the technical description already conveys why the guard exists. * fix(ops): suppress mount stderr in retry loop Match the rest of the script's error-tolerant idiom (2>/dev/null). Mount failures in the cold-boot udev race the loop is designed to handle gracefully should not flow to stdout — cron would mail on every transient retry. Devin BUG_0001 on PR #146. * fix(changelog): move auto-upgrade entry to [Unreleased] Entry landed under v0.20.0 because that section was [Unreleased] when this branch first opened — releases v0.21–v0.24 cut in the meantime stranded it inside an already-released section. Move it back where new entries belong. Devin BUG_0001 on PR #146. * fix(infra): single-source agnes-auto-upgrade.sh via curl from main Replace the inline heredoc copy of the auto-upgrade script in the customer-instance Terraform startup template with a curl fetch from raw.githubusercontent.com on every boot. 
The inline copy had drifted several iterations behind canonical scripts/ops/agnes-auto-upgrade.sh (missing TLS overlay detection, array-form COMPOSE_FILES, and now the config-disk fail-fast guard from this PR). Devin ANALYSIS_0001 on PR #146. * fix(infra): fetch docker-compose.tls.yml unconditionally + document coupling The canonical agnes-auto-upgrade.sh from main detects TLS at runtime via cert files on disk, regardless of the TLS_MODE Terraform variable. Certs can appear after boot via agnes-tls-rotate.sh or manual provisioning, and the cron job would then fail every 5 min under 'set -euo pipefail' because docker-compose.tls.yml was never fetched. Also document the main-vs-COMPOSE_REF coupling: when the canonical script references a new compose file, the fetch list above must be updated to match — pinned-ref VMs would otherwise break. Devin BUG_0001 + ANALYSIS_0001 on PR #146. * fix(ops,infra): unconditional Caddyfile + skip tls overlay if missing Caddyfile fetch now matches docker-compose.tls.yml: unconditional in startup-script.sh.tpl. Without it, Docker would auto-create an empty directory at the bind-mount target and Caddy would crash-loop while the tls overlay has already closed :8000 — making the app unreachable on any non-caddy VM where certs land via rotate or manual provisioning. Defensive layer: agnes-auto-upgrade.sh now also requires Caddyfile to exist (size > 0) before activating the tls profile, with a WARN log if it's missing. Belt-and-suspenders so the failure mode is contained even when the script is deployed by some other path (not just the customer-instance TF module). Devin BUG_0001 on PR #146. * chore(release): cut 0.25.0 --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-04-30 20:07:22 +02:00
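The fail-fast mount guard from ddffdfeafd (up to three mount-and-verify attempts with 2s, 4s, 6s backoff, then refuse to start containers) is a shell script in the repository. The Python sketch below only illustrates the control flow; the mount, unmount, and source-lookup operations are injected as hypothetical callables.

```python
import time
from typing import Callable, Optional


def ensure_config_mount(
    current_source: Callable[[], Optional[str]],
    mount: Callable[[], None],
    unmount: Callable[[], None],
    expected_source: str,
    attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the state dir is mounted from `expected_source`."""
    for attempt in range(1, attempts + 1):
        source = current_source()
        if source == expected_source:
            return True
        if source is not None:
            unmount()  # mounted from the wrong device: detach before retrying
        try:
            mount()
        except OSError:
            pass  # tolerated; the verification below decides
        if current_source() == expected_source:
            return True
        sleep(2.0 * attempt)  # 2s, 4s, 6s backoff
    return False
```

A caller would log a fatal message and exit non-zero when this returns False, so that no `docker compose` action runs against the wrong disk.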
minasarustamyan	fb1573766a	feat(admin): users/groups UI polish + SSO lock + v18 migration (#142 ) Cuts release 0.24.0. ## Highlights - SSO-managed accounts read-only for password / delete operations (UI + API). New `is_sso_user` flag derived from group memberships. - Admin/Everyone system rows show `google_sync` chip + Workspace email subtitle when env-mapped. - Origin pill vocabulary unified across `/admin/groups`, `/admin/access`, `/admin/users`, `/admin/users/{id}`, `/profile` (Admin yellow, Everyone gray, google_sync green, custom purple). - Effective-access readout no longer short-circuits for admin users — always renders per-resource breakdown. - Schema migration v18 drops stranded non-google memberships in env-mapped Admin/Everyone groups (cleans up v13's blanket Everyone backfill). ## Devin findings addressed - _is_sso_user requires source='google_sync' on system-group branches (so v13 system_seed memberships in env-mapped Everyone don't lock out the admin). - POST add-to-group returns correct origin via _derive_origin (matching GET). - 8 customer-specific token instances (groupon.com / foundryai) replaced with vendor-neutral placeholders across templates, tests, and CHANGELOG. - deriveDisplayName name-skip for canonical "Admin"/"Everyone" so an overlapping AGNES_GOOGLE_GROUP_PREFIX doesn't mangle the chip text. See CHANGELOG [0.24.0] for full notes.	2026-04-30 15:16:04 +02:00
ZdenekSrotyr	70672204fe	feat(memory): admin Edit + MEMORY_DOMAIN RBAC + ai-section UI (#141 ) Cuts release 0.23.0. ## Highlights - Single-item Edit button on every memory item card (modal hits PATCH /api/memory/admin/{id}). - MEMORY_DOMAIN RBAC resource type — admins grant user_groups access to specific domains via /admin/access. Composes with existing audience filter (OR semantics, no-op when no grants). - ai: section editable in /admin/server-config — admins can set ANTHROPIC_API_KEY / model / provider / base_url for the corporate-memory extractor without editing instance.yaml directly. api_key auto-masked. ## Devin findings addressed - Modal NULL→empty fix (audience visibility wouldn't break). - Stats endpoint granted_domains parity with list endpoint. - Documented intentional MEMORY_DOMAIN→audience bypass. - Documented conscious ai.base_url SSRF exclusion (legit internal LiteLLM/vLLM proxies). See CHANGELOG [0.23.0] for full notes.	2026-04-30 11:04:41 +02:00
ZdenekSrotyr	83adf01bde	fix(v2): #134 BigQuery cross-project errors return structured 502/400 + BqAccess facade (#138 ) * docs(spec): #134 unify BigQuery access behind BqAccess facade Brainstorm output for issue #134. Captures: - root cause (incl. correction of the issue's hypothesis about commit 33a9964) - BqAccess facade API + project resolution rules - error contract — typed BqAccessError mapped to HTTP 502 for upstream BQ failures, 500 for deployment/config bugs - migration plan for v2_scan, v2_sample, RemoteQueryEngine - test rewrite eliminating _bq_client_factory injection point - E2E verification protocol on agnes-development as success criterion * docs(spec): #134 revise after first review Incorporates code-reviewer findings: Must-fix: - Add v2_schema (2 copies of INSTALL/LOAD/SECRET dance) to migration scope. - Reframe v2_scan headline: missing try/except around BQ calls is the actual cause of bare 500s, not project resolution (which 33a9964 fixed). - List two more deferred call sites (extractor.py, register_bq_table) with explicit rationale. Important: - Drop billing != data clause from cross_project_forbidden heuristic; rely only on 'serviceusage' substring. billing != data is normal for cross-project setup, was over-classifying. - Split bq_bad_request into _user (400) and _server (502) variants; add sql_origin parameter to translate_bq_error so call sites declare whether SQL contains user input. - Add @functools.cache to BqAccess.from_config; document tests bypass via dependency_overrides. - Replace monkey-patched-classmethod test pattern with BqAccess(client_factory=...) injection at construction time. Cleaner than today's _bq_client_factory and 1:1 migration shape. - Keep BqProjects.data (reviewer assumed registry has source_project; it doesn't). Multi-project explicitly listed as non-goal with note. Nice-to-have: - Add 'Implementation strategy' section: 2 staged commits (bug fix alone is revertable; refactor follows). 
- Extend E2E protocol to cover all three endpoints, not just /sample. - Note removal of stale docstring at src/remote_query.py:204. * docs(spec): #134 revision 3 — incorporates second-round review Must-fix from second review: - v2_schema split into two migration cases: _fetch_bq_schema translates errors via translate_bq_error; _fetch_bq_table_options preserves its swallow-all 'except Exception → return {}' so /schema doesn't 502 on partition-info failures. - RemoteQueryEngine.__init__ now resolves BqAccess lazily (in _get_bq_client, not in __init__). Without this, ~7 DuckDB-only tests in test_remote_query.py would suddenly fail with not_configured. - translate_bq_error pass-through for BqAccessError is now load-bearing (clause 1, before any Google-API branch). bq.client() raises BqAccessError for bq_lib_missing/auth_failed; without explicit pass-through those fall to 'unknown' and re-raise as bare 500. - Commit 1 now emits the SAME structured response shape as commit 2 to avoid contract churn between commits. - BIGQUERY_PROJECT env-var precedence is BREAKING for env-only deployments — flagged in CHANGELOG ### Changed. Editorial: - sql_origin renamed to bad_request_status with values 'client_error' / 'upstream_error' (clearer about what the parameter actually decides). bq_bad_request_user/_server kinds collapsed to bq_bad_request (400) and bq_upstream_error (502). - CLI (cli/commands/query.py) noted as external RemoteQueryEngine caller; unaffected because new bq_access kwarg has default None. - Added unit/integration tests for the new contracts: test_translate_passes_through_BqAccessError, test_v2_scan_returns_500_on_bq_lib_missing, test_v2_schema_returns_200_with_empty_partition_on_bq_failure, test_resolve_succeeds_after_config_set. - E2E protocol now covers /schema as the fourth endpoint. - Documented functools.cache-doesn't-cache-exceptions semantics and fixture nullcontext-doesn't-close caveat for nested sessions. 
* docs(spec): #134 revision 4 — incorporates third-round review Third reviewer verdict: 'implementation-ready with two trivial edits'; explicitly noted prior rounds did the heavy lifting. Edits: 1. get_bq_access() module-level function instead of @classmethod @functools.cache from_config. Removes the classmethod-cache stacking footgun (different Python versions wrap differently) and gives FastAPI's dependency introspection a clean function signature. Drops the 'Do not subclass BqAccess' caveat that no longer applies. 2. Commit 1 strategy explicitly: wrap _fetch_bq_sample (v2_sample), _bq_dry_run_bytes + _run_bq_scan (v2_scan), and _fetch_bq_schema (v2_schema strict block). Do NOT touch _fetch_bq_table_options swallow-all in commit 1 — preserved as-is, then migrated (still preserved) in commit 2. All three endpoints emit the same structured body shape so client parsers see one consistent contract throughout the staged rollout. No more half-rolled-out window where /sample is bare 500 while /scan is structured 502. * docs(plan): #134 implementation plan — Phase 1 (atomic bug fix) + Phase 2 (BqAccess refactor) + Phase 3 (verification) Bite-sized TDD tasks. 3 phases, 16 tasks total: Phase 1 (Commit 1) — atomic bug fix across all four v2 endpoints: Tasks 1.1-1.5 wrap _fetch_bq_sample, _bq_dry_run_bytes, _run_bq_scan, _fetch_bq_schema with structured 502/400 try/except. _fetch_bq_table_options preserved untouched. CHANGELOG Fixed entries. Phase 2 (Commit 2) — BqAccess facade extraction + migration: Tasks 2.1-2.5 build connectors/bigquery/access.py bottom-up (BqProjects, BqAccessError, translate_bq_error, default factories, BqAccess class, get_bq_access module-level cached). Task 2.6 adds conftest.py fixture. Tasks 2.7-2.9 migrate v2_scan, v2_sample, v2_schema to BqAccess. Tasks 2.10-2.11 migrate RemoteQueryEngine + tests (lazy bq_access, drop _bq_client_factory). Task 2.12 CHANGELOG Changed BREAKING + Internal. Phase 3 — Verification: 3.1 full pytest. 
3.2 squash into two PR-shape commits. 3.3 manual E2E on agnes-development per spec protocol → close #134. Self-review table maps spec sections to implementing tasks; no gaps. * fix(v2): #134 structured 502/400 on BQ errors across /scan, /scan/estimate, /sample, /schema Wraps the BigQuery call sites in v2_scan, v2_sample, and v2_schema (strict block only) with try/except for google.api_core exceptions, translating to HTTPException with a structured body shape: {error, message, details}. Fixes Pavel's report (#134) where these endpoints returned bare HTTP 500 with no body when the SA on agnes-development hit cross-project Forbidden on serviceusage.services.use. Also fixes /sample's missing billing_project fallback (the bug 33a9964 fixed for /scan never landed here). Status code split: - /scan, /scan/estimate: BadRequest -> 400 (bq_bad_request) since SQL is user-derived from req.select/where/order_by. - /sample, /schema: BadRequest -> 502 (bq_upstream_error) since SQL is server-constructed from validated identifiers. - All Forbidden -> 502 with cross_project_forbidden if 'serviceusage' in error message (with hint pointing at data_source.bigquery.billing_project), else bq_forbidden. Body shape matches what the upcoming BqAccess refactor (next commit) will produce, so client-side parsers see one consistent contract throughout the staged rollout. _fetch_bq_table_options preserved exactly as-is — its swallow-all-and-return-empty contract is intentional and survives into the refactor; /schema continues to return 200 with empty partition info when partition queries fail. Outer wraps in scan_endpoint, scan_estimate_endpoint, sample, and schema endpoints exist only to make the test pattern (monkeypatching whole _fetch_* functions) work, and are tagged TODO(#134 Phase 2) for removal once BqAccess centralizes translation. 
* refactor(bq): #134 BqAccess facade — unify v2_scan, v2_sample, v2_schema, RemoteQueryEngine Extracts the duplicated BigQuery-access pattern (project resolution + client construction + DuckDB-extension session + Google-API error translation) into connectors/bigquery/access.py. Migrates four call sites to use it: - app/api/v2_scan.py — _bq_dry_run_bytes, _run_bq_scan - app/api/v2_sample.py — _fetch_bq_sample - app/api/v2_schema.py — _fetch_bq_schema (strict translation), _fetch_bq_table_options (preserves swallow-all best-effort contract) - src/remote_query.py — RemoteQueryEngine, lazy bq_access kwarg The new module exposes: - BqProjects (frozen dataclass: billing + data project IDs) - BqAccessError (typed exception with HTTP_STATUS class mapping) - BqAccess (facade with injectable client_factory/duckdb_session_factory for tests; defaults call the real google-cloud-bigquery + DuckDB extension) - get_bq_access (module-level @functools.cache; FastAPI Depends target) - translate_bq_error (Google API exception → BqAccessError mapper, with BqAccessError pass-through, 'serviceusage'-substring heuristic for cross_project_forbidden, and bad_request_status param distinguishing user-derived (400) from server-constructed (502) SQL) - _default_client_factory, _default_duckdb_session_factory RemoteQueryEngine.__init__ no longer accepts _bq_client_factory; tests migrate to bq_access=BqAccess(projects, client_factory=...). DuckDB-only RemoteQueryEngine tests need no changes — bq_access defaults to None and get_bq_access() is only invoked on first BQ call (lazy resolution). BqAccessError raised internally is translated to RemoteQueryError( error_type="bq_error") in _get_bq_client to preserve the engine's existing public contract — CLI and /api/query/hybrid callers see no change. 
Endpoint tests (test_v2_scan, test_v2_scan_estimate, test_v2_sample, test_v2_schema) migrate from monkey-patching whole _fetch_* functions to using the new bq_access fixture in tests/conftest.py, which exercises the REAL translation path through BqAccess + translate_bq_error, closing the test gap flagged in Task 1.1's review. Side-effect behavior change: v2_sample's FROM clause now uses the data project (instance.yaml data_source.bigquery.project), not the conflated billing_project from Phase 1. Documented in CHANGELOG ### Internal. BREAKING for deployments combining BIGQUERY_PROJECT env var with data_source.bigquery.project in instance.yaml: env var now overrides data project too. See CHANGELOG ### Changed. Two known-duplicate BQ-access sites (connectors/bigquery/extractor.py, scripts/duckdb_manager.register_bq_table) explicitly out of scope; tracked as follow-up. Removed stale docstring at the previous src/remote_query.py:204 that referenced scripts.duckdb_manager._create_bq_client as the default BQ client factory (RemoteQueryEngine never actually used that function). Test counts: tests/test_bq_access.py +27 (new), tests/test_v2_*.py + tests/test_remote_query.py migrated to bq_access fixture (counts unchanged or +1-2 per file). Full suite: 2086 passed, 8 pre-existing failures (DB migration tests with unrelated internal_roles DependencyException, not introduced by this PR). * fix(bq_access): translate DefaultCredentialsError to BqAccessError(auth_failed) CI on PR #138 caught: bigquery.Client(...) resolves Application Default Credentials at construction time; without ADC (CI without SA key, dev laptop without 'gcloud auth application-default login') it raises google.auth.exceptions.DefaultCredentialsError synchronously. Pre-fix _default_client_factory only caught ImportError, so DefaultCredentialsError propagated as raw exception, and from production endpoints would surface as bare 500 (the exact failure mode #134 sets out to fix). 
Now translates to BqAccessError(kind='auth_failed', details.hint='Run gcloud auth application-default login...'). Endpoint catch chain returns HTTP 502 with structured body. Adds unit test test_raises_auth_failed_on_default_credentials_error. Third-round spec review flagged this case in passing; the fix didn't land. CI's auth-less environment surfaced it. * fix(bq_access): get_bq_access() returns sentinel instead of raising when not configured Devin BUG_0001 on PR #138 review: 'get_bq_access() as FastAPI Depends breaks all v2 endpoints for non-BigQuery instances'. Pre-fix: get_bq_access() raised BqAccessError(not_configured) when neither BIGQUERY_PROJECT env nor data_source.bigquery.project was set. Because FastAPI resolves Depends() BEFORE the endpoint body runs, this exception fires during dep-injection — the endpoint's try/except BqAccessError clause never gets a chance to catch it. Result: every v2 request on Keboola-only or CSV-only instances returned bare HTTP 500, even for local-source tables that never touch BigQuery. Fix: get_bq_access() now returns a sentinel BqAccess with empty BqProjects and factories that raise BqAccessError(not_configured) on actual use. Construction succeeds, FastAPI's dep-injection cleanly yields the sentinel, the endpoint runs. The local-source code path in build_sample / build_schema / etc. never calls bq.client() or bq.duckdb_session() (it reads parquet directly), so non-BQ tables return 200 as before. Only when an endpoint actually tries to query BQ (source_type == 'bigquery') does the sentinel raise — and the endpoint's existing except BqAccessError catches it normally, returning structured 502 with hint. Test get_bq_access::test_raises_not_configured_when_neither_set renamed and rewritten to test_returns_sentinel_when_neither_set: asserts BqAccess is returned, then asserts client() and duckdb_session() each raise BqAccessError(not_configured) on call. 
Test test_does_not_cache_exceptions removed (no longer applicable) and replaced with test_sentinel_is_cached_per_process documenting the operator-restart-on-config-change contract. * docs(spec+plan): #134 genericize customer-specific tokens (CLAUDE.md OSS rule) Devin BUG_0001/0002 round 3 on PR #138: spec and plan docs contained customer-specific deployment hostnames, deployment names, and a GCP project ID that violated CLAUDE.md's vendor-agnostic OSS rule ('Nothing customer-specific belongs in code, configuration defaults, comments, docs, commit messages, PR titles, or PR bodies'). Replacements: agnes-development.groupondev.com -> <your-agnes-host> agnes-development -> <your-dev-instance> prj-grp-dataview-prod-1ff9 -> <your-data-project> s1_session_landings -> <bq_table_id> E2E verification semantics unchanged — operators still run the same four curls + config flip + retry, just substituting their own host / deployment name / project / table. * fix(bq_access): hook get_bq_access.cache_clear into instance_config.reset_cache Devin ANALYSIS_0004 on PR #138: get_bq_access is @functools.cache'd at process level, so it captures BigQuery project IDs at first call and ignores subsequent instance.yaml changes. Pre-Phase-2 the v2 endpoints re-read get_value() on every request, so admin /api/admin/server-config saves (which call instance_config.reset_cache()) hot-reloaded the BQ project. Without this fix, my refactor silently regresses that contract — operators editing instance.yaml via the admin UI would see no effect on v2 endpoints until container restart. instance_config.reset_cache() now also calls connectors.bigquery.access.get_bq_access.cache_clear() (lazy import, swallowed if connectors module isn't loaded — keeps instance_config usable in isolated unit tests). Adds test_instance_config_reset_cache_invalidates_get_bq_access as regression guard. 
Updates CHANGELOG Internal entry to mention the hot-reload contract + the not-configured sentinel behavior (round-3 fix from Devin BUG_0001 was previously only in commit message). * fix(bq_access): surface not_configured before identifier validation + plan path genericize Devin BUG_0001 + BUG_0002 round 5 on PR #138. BUG_0001 (plan doc): personal filesystem path violated CLAUDE.md vendor-agnostic rule. Replaced with '<worktree-root>' placeholder. BUG_0002 (sentinel error path): when get_bq_access() returns the sentinel BqAccess (BQ not configured), the empty bq.projects.data was reaching validate_quoted_identifier first and raising ValueError -> endpoint mapped to HTTP 400 'unsafe_identifier' instead of structured 500 'not_configured' with hint. Each fetch helper now checks 'if not bq.projects.data: bq.client()' as the first step, which triggers the sentinel's BqAccessError(not_configured). Endpoint catches the typed error and returns HTTP 500 with hint pointing at data_source.bigquery.project. Best-effort _fetch_bq_table_options returns {} silently in this case (preserves the swallow-all contract). * fix(bq_access): classify DuckDB-native exceptions from bigquery_query() via string match Devin ANALYSIS on PR #138 review (latest round). The DuckDB bigquery extension is a C++ plugin making its own HTTP calls — when BQ returns 403, it throws duckdb.IOException with the BQ error embedded as text, not gax.Forbidden. translate_bq_error's isinstance checks would miss these, falling to case 7 → bare 500 in production for v2_scan, v2_sample, and v2_schema (the bigquery_query() paths). Fix: last-resort string-match heuristic before the re-raise. 'Forbidden' / '403' / 'Bad Request' / '400' in the lowercased message classifies via the same kind hierarchy. The 'serviceusage' substring still distinguishes cross_project_forbidden from bq_forbidden. Specific enough that random exceptions without HTTP-error keywords still re-raise. 
Adds 4 unit tests covering the new heuristic + the 'don't swallow random exceptions' invariant. * chore(release): cut 0.22.0 PR #138 contains issue #134 user-visible behavior changes: - BREAKING: BIGQUERY_PROJECT env var now overrides instance.yaml data_source.bigquery.project for v2 endpoints (previously RemoteQueryEngine billing only). - Fixed: structured 502/400 on /api/v2/sample, /scan, /scan/estimate, /schema when BigQuery raises Forbidden/BadRequest (was bare 500). - Internal: BqAccess facade refactor unifying four duplicate BQ-access call sites; instance_config.reset_cache() now invalidates BqAccess cache too so admin server-config saves hot-reload BQ project IDs. Bumps to 0.22.0 because PR #137 merged first and took 0.21.0.	2026-04-30 10:11:20 +02:00
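The last-resort string-match classification described in the #138 commit message above can be sketched roughly as follows. This is an illustration only, not the code in connectors/bigquery/access.py: the function name `classify_bq_message` and its return shape are hypothetical, while the keywords, kind names, and status codes are taken from the commit text.

```python
from typing import Optional, Tuple


def classify_bq_message(
    message: str, bad_request_status: str = "upstream_error"
) -> Optional[Tuple[str, int]]:
    """Map an error message to (kind, http_status), or None if unclassifiable.

    Hypothetical helper: mirrors the heuristic described in the commit message
    for exceptions that carry the BigQuery error only as text.
    """
    text = message.lower()

    # Forbidden always maps to 502; 'serviceusage' marks the cross-project case.
    if "forbidden" in text or "403" in text:
        kind = "cross_project_forbidden" if "serviceusage" in text else "bq_forbidden"
        return kind, 502

    # Bad Request is 400 for user-derived SQL, 502 for server-constructed SQL.
    if "bad request" in text or "400" in text:
        if bad_request_status == "client_error":
            return "bq_bad_request", 400
        return "bq_upstream_error", 502

    # No HTTP-error keyword: the caller should re-raise instead of swallowing.
    return None
```

Returning `None` for messages without an HTTP-error keyword keeps the "don't swallow random exceptions" invariant the commit mentions: the caller decides to re-raise.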
minasarustamyan	4ec5ff44dd	feat(setup): cross-platform TLS bootstrap + marketplace plugin install (#137 ) Bootstraps the Agnes Claude Code marketplace + RBAC-allowed plugins from the dashboard CTA, and inlines the server's TLS cert when the chain isn't publicly trusted (self-signed / private CA). Cross-platform setup prompt covers Windows Git Bash, macOS, Linux. Includes Bun-compiled `claude` fix (macOS goes via git-clone fallback, same as Windows), PAT stripping after clone, explicit error handling, and four rounds of Devin Review fixes (phantom step references, $PLATFORM re-detection, heredoc/awk line-count sync). Cuts 0.21.0. See CHANGELOG.md [0.21.0] section for details.	2026-04-30 08:56:45 +02:00
Vojtech	38f6b639d2	feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136 ) Cuts release 0.20.0. ## Highlights - X-Request-ID header on every response + sanitized to [A-Za-z0-9_-] (CRLF log-forging mitigation) - Error pages (HTML + JSON 500) surface request_id for support tickets - Dev debug toolbar gated by DEBUG=1: fastapi-debug-toolbar with custom DuckDBPanel - Centralized app.logging_config.setup_logging() replaces 23 scattered basicConfig calls - Telegram bot drops bot.log file; stdout only (BREAKING) ## Devin findings addressed - BUG_0001: .env.template no longer claims FastAPI debug=True - BUG_0002: subprocess extractor logs INFO to stderr again - ANALYSIS_0003: _wants_html no longer matches Accept: */* (curl gets JSON as before) - BUG on b1c6ee9: HTML 500 page no longer leaks str(exc) in production - BUG on b13d2fe: 2 CLAUDE.md compliance flags (transform.py + ws_gateway) accepted as scope-limited logging refactor; follow-up to update CLAUDE.md if needed See CHANGELOG [0.20.0] for full notes.	2026-04-29 22:54:21 +02:00
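The X-Request-ID sanitization mentioned in the #136 entry above (restricting the value to [A-Za-z0-9_-] so CR/LF cannot forge log lines) might look something like this minimal sketch. The helper name, the 64-character cap, and the choice to strip rather than reject unsafe characters are all assumptions, not details from the commit.

```python
import re
import uuid

# Anything outside the allow-list from the commit message is dropped.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")
MAX_REQUEST_ID_LEN = 64  # assumption: the commit does not state a length limit


def sanitize_request_id(raw):
    """Return a log-safe request ID, generating one if nothing usable remains."""
    cleaned = _UNSAFE_CHARS.sub("", raw or "")[:MAX_REQUEST_ID_LEN]
    # Fall back to a fresh ID when the header is absent or fully stripped.
    return cleaned or uuid.uuid4().hex
```

Stripping keeps a caller-supplied correlation ID usable for support tickets while guaranteeing the logged value never contains a line break.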
ZdenekSrotyr	b7a1795834	feat(scheduler): re-wire sync_schedule + script.schedule; tune via env; OpenMetadata TLS (#135 ) Bundles 4 issues: - #79 — table_registry.sync_schedule honored at runtime (API-side filter + Pydantic validators) - #78 — script_registry.schedule honored via new POST /api/scripts/run-due (atomic claim, BackgroundTask exec, deploy-time safety validation) - #77 — sidecar JOBS env-driven (SCHEDULER_DATA_REFRESH_INTERVAL/HEALTH_CHECK_INTERVAL/SCRIPT_RUN_INTERVAL/TICK_SECONDS) - #89 — OpenMetadataClient verify=True default (BREAKING for self-signed) Cuts release 0.19.0. See CHANGELOG for full notes incl. Known Limitations.	2026-04-29 22:06:30 +02:00
ZdenekSrotyr	514fe2c8b6	chore(release): cut 0.18.0 Bundles #119 (BigQuery register-table M1), #126 (memory tree+duplicates+bulk-edit), #131 (Google groups prefix filter, BREAKING — auto-Everyone removed).	2026-04-29 14:34:58 +02:00
ZdenekSrotyr	995e4cd366	fix(scheduler): HTTP marketplaces job + SCHEDULER_API_TOKEN shared secret (#127 ) * fix(scheduler): HTTP marketplaces job + SCHEDULER_API_TOKEN shared secret Two scheduler-reliability bugs surfaced after the v0.12.1 USER-agnes flip: 1. The marketplaces job called src.marketplace.sync_marketplaces() in-process from the scheduler container, racing the app's long-lived system.duckdb handle. DuckDB rejects cross-process writers — every cron tick 500-ed on "Could not set lock on file ... PID 0". 2. The data-refresh + new marketplaces jobs both 401-ed on the API because SCHEDULER_API_TOKEN was never propagated by the Terraform startup script. The scheduler had no credential to authenticate with. Fix: - New POST /api/marketplaces/sync-all (admin-only) drives the nightly refresh through the app process so it inherits the existing DB connection. - Scheduler swaps fn->http for marketplaces; all jobs are now plain HTTP and the scheduler is reduced to a cron clock. - New app/auth/scheduler_token.py adds a shared-secret auth path. The startup script generates a 256-bit secret on first boot, persists it across reboots, and writes it to /opt/agnes/.env. Both containers source the same .env. The app validates incoming Bearer tokens against the env var (constant-time, length-floored) and resolves matches to a synthetic scheduler@system.local user that's a member of the Admin system group. Audit-log entries from the scheduler are attributed to this user. - app/main.py seeds the synthetic user at startup so the first cron tick has a valid actor; lazy seed in get_scheduler_user covers token rotation before the next app restart. Tests: 5 new in tests/test_auth_scheduler_token.py covering empty/short secret rejection, exact-match comparison, idempotent user seeding, and lazy provisioning. 142 marketplace + scheduler tests + 96 auth tests remain green. 
Existing VMs with .env from before this change need a one-time re-provisioning (re-run startup-script or rotate via openssl rand); documented in CHANGELOG. * fix(audit): use '_all' sentinel for bulk marketplace sync — Devin review #127 Avoids the literal string 'marketplace:None' in the audit_log resource column when the bulk sync endpoint writes its summary row. * fix(scheduler): unblock event loop + per-job timeouts — Devin review #127 Two findings from Devin re-review on commit 5fbad15: 1. BUG: trigger_sync_all was async def, so FastAPI ran it on the asyncio event loop. sync_marketplaces() does blocking I/O (subprocess git clones up to GIT_TIMEOUT_SEC=300 each, threading.Lock, DuckDB writes) and would freeze every concurrent request for the duration of a bulk sync. Switched to plain def so FastAPI auto-routes to the thread pool. 2. ANALYSIS: scheduler used a fixed 120s httpx timeout for every POST. Bulk marketplace sync iterates the registry under a single lock with up to 300s per repo — easily exceeds 120s on 2-3 slow repos. The scheduler then sees a timeout, doesn't update last_run, and re-fires on the next 30s tick, queueing redundant work. Per-job timeout override added to the JOBS tuple; marketplaces gets 900s (15 min), data-refresh keeps 120s, health-check 30s. * fix(auth): require_session_token rejects scheduler shared secret — Devin review #127 require_session_token gates /auth/tokens (PAT minting). Pre-fix it only rejected JWTs with typ=pat — but the scheduler shared secret is an opaque string, so verify_token() returns None, payload becomes {}, and the PAT-claim check silently passed. A caller bearing SCHEDULER_API_TOKEN could mint persistent PATs that survive a secret rotation. Added explicit is_scheduler_token() check before the PAT-claim check; new regression test in tests/test_auth_scheduler_token.py. 
Devin's other note (pre-existing async def trigger_sync at marketplaces.py:392 also calls blocking sync_one) — Devin flagged it as out-of-scope for this PR and I agree; tracking separately. * release(0.17.0): cut + clean up CHANGELOG duplicates Cuts 0.17.0 (minor: scheduler shared-secret auth + sync-all endpoint plus the deploy-shape fixes that landed since the last release tag). Bumps pyproject from 0.15.0 — also corrects the missed bump from PR #120 (v0.16.0 was tagged on GitHub and shipped as :stable, but pyproject stayed at 0.15.0, so /api/version, /cli/latest, and `da --version` had been under-reporting the running release). Removes the long-form duplicate entries for 0.13.0 / 0.14.0 / 0.15.0 above [0.16.0] — the canonical short summaries (with GitHub-release links) already exist below 0.16.0, the long forms were leftover state from before those versions were cut and have been silently shadowed ever since.	2026-04-29 11:44:00 +02:00
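The "constant-time, length-floored" comparison of the scheduler shared secret described in the #127 entry above can be illustrated with the standard library. This is a sketch under assumptions: `matches_scheduler_secret` and the 32-character floor are hypothetical, and the real check in app/auth/scheduler_token.py may differ.

```python
import hmac

MIN_SECRET_LEN = 32  # assumption: the commit only says the check is length-floored


def matches_scheduler_secret(presented: str, configured: str) -> bool:
    """Return True only if the presented bearer token equals the configured secret."""
    # Reject empty or too-short configured secrets outright, so a missing
    # env var can never authenticate an empty bearer token.
    if not configured or len(configured) < MIN_SECRET_LEN:
        return False
    # compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(presented.encode("utf-8"), configured.encode("utf-8"))
```

Per the same commit, a guard of this kind also has to run before the PAT-claim check in `require_session_token`, because an opaque shared secret carries no JWT payload to inspect.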
ZdenekSrotyr	61f6b8d2d5	feat(ci+tests): deploy safety audit — linting, rollback, smoke tests, 50+ new tests (#120 ) Comprehensive deploy safety audit implementing 19 improvements across CI/CD pipeline, test coverage, and source code. ### CI/CD Pipeline - ruff + mypy added to both release.yml and keboola-deploy.yml (continue-on-error) - Smoke test added to keboola-deploy.yml (was missing) - Automatic rollback on smoke test failure in release.yml - Expanded smoke-test.sh with catalog, admin/tables, marketplace.zip, metrics - Required status checks via .github/settings.yml - Dependabot + CODEOWNERS + pre-commit hooks + ruff config ### Source Code - DB schema version check in /api/health (db_schema: ok/mismatch/unhealthy) - Config versioning (config_version: 1 in instance.yaml, non-blocking validation) - BigQuery extractor ATTACH error handling (try/except around INSTALL+ATTACH) - Post-deploy smoke test script for prod VM validation ### Test Coverage (~50 new tests) - v13->v14 migration, Email magic link TTL, PAT, Marketplace ZIP/Git, Jira webhooks, Hybrid Query BQ, Keboola/BQ extractor failure modes, Orchestrator failure modes Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-04-29 09:18:55 +02:00
PavelDo	e1108b6112	feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72 ) Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).	2026-04-29 07:16:22 +02:00
ZdenekSrotyr	2e1dfb7553	feat(v2): claude-driven fetch primitives + 0.14.0 (#102 ) Replaces the BigQuery wrap-view pattern with a discovery + scoped-fetch toolkit driven by the analyst's Claude session. Adds /api/v2/{catalog,schema,sample,scan,scan/estimate}, da catalog/schema/describe/fetch/snapshot/disk-info CLI commands, sqlglot-backed WHERE validator, process-local quota tracker, agent rails skill (cli/skills/agnes-data-querying.md). BREAKING: BQ wrap views off by default — set data_source.bigquery.legacy_wrap_views=true for one cycle. Backward-compat field_validator on primary_key. Catalog cache now matches documented 300s TTL with RBAC fresh per request. Cuts release v0.14.0.	2026-04-29 01:07:19 +02:00
ZdenekSrotyr	a222f92e70	feat(admin): server configuration editor + 0.13.0 (#107 ) Adds /admin/server-config UI for editing instance.yaml from the web. Hardening: SSRF gate on data_source URLs, narrow-overlay write strategy, atomic writes, audit log with secret masking on shape changes, threading lock on read-modify-write, corrupt-overlay refusal on write side + louder log on read side, modal Promise resolution on backdrop dismiss, sentinel scrub on save (defense-in-depth client+server). Bundles Windows PowerShell wrapper from #80. Cuts release v0.13.0.	2026-04-29 00:47:23 +02:00
ZdenekSrotyr	5f6bb7a4b2	fix(security+ops) + release(0.12.1): #82 #85 #87 hardening + cut 0.12.1 (#104 ) * fix(security+ops): #82 #85 #87 — auth hardening, API validation, deploy posture Security and operational hardening across three issue groups: - M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun) - C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile) - M21: Docker resource limits (mem_limit, cpus) on app + scheduler - M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server) - M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING) - M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs - C2: table_id traversal validation on /api/data/{table_id}/download - M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename - C5: reset_token removed from POST /api/users/{id}/reset-password response - C8: Startup WARNING when no user has password_hash (bootstrap window visible) - M9: Audit log on failed web form login (mirrors /auth/token endpoint) - M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch) Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90) Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety Review fixes: - Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex — was missing, allowing requests to AWS/GCP/Azure metadata endpoints - Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex - M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with warning log — prevents unhandled exception if DB write fails after successful token consumption - Add test for 169.254.169.254 SSRF rejection Generated with 
[Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak Address Devin Review findings on PR #104: 1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution + ipaddress module checks. The old regex patterns like `fe80:` only matched up to the first colon, missing real IPv6 addresses like `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the hostname via getaddrinfo and checks each resulting IP against ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast. 2. CLI commands broken: `da setup test-connection`, `da setup verify`, `da diagnose`, `da status` all called /api/health expecting the old format (status=="healthy", services dict). Now they call /api/health/detailed for service-level checks (with graceful fallback to the minimal endpoint when auth is not configured). 3. Temp file handle leak: _stream_to_temp returns an open NamedTemporaryFile; callers now close it before shutil.move() to prevent FD leaks until GC. Also adds IPv6 SSRF test cases (loopback, link-local, unique-local, multicast) with mocked DNS resolution for test environment independence. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): download regex blocks hyphenated IDs; document health split Address Devin Review round-3 findings on PR #104: 1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download endpoint used the strict SQL-identifier regex which does not allow dots or hyphens, but Keboola table IDs like in.c-crm.orders contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots and hyphens while still blocking path-traversal chars (/, .., \) and quote/control characters. Added test for hyphenated/dotted IDs. 2. 
Documented health endpoint split in DEPLOYMENT.md: Added Health checks & external monitoring section explaining both endpoints (minimal unauth /api/health vs authenticated /api/health/detailed) and how to wire external monitoring tools to the detailed endpoint with a PAT. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening * fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up) Devin review on the rebased PR flagged the asymmetry: magic-link verify got the atomic compare-and-swap pattern in the original M10 fix, but password reset confirm at /auth/password/reset/confirm was still using read-validate-clear. Two concurrent POSTs with the same valid reset token could both succeed in setting different new passwords (last-write- wins). Lower severity than the magic-link race because the attacker would need the reset token AND to race the legitimate user, but the asymmetry was a polish gap. Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then SELECT to verify our marker won, then proceed. Only the winner clears the marker and applies the password change. New regression test_concurrent_reset_only_one_wins in tests/test_password_flows.py::TestResetConfirm pins the contract: two ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same token; exactly one gets 302 (password applied), the other gets 200 with 'Invalid or expired'. Sanity-checked against the pre-CAS code — both POSTs got 302 (race confirmed). --------- Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-04-28 19:57:30 +02:00
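Illustration of the SSRF fix described above (DNS resolution plus `ipaddress` checks instead of a hostname regex): a minimal sketch. The function names are hypothetical, and failing closed on unresolvable hosts is an assumption of this sketch, not something the commit message states.

```python
import ipaddress
import socket


def is_blocked_address(ip_text: str) -> bool:
    """Return True when the IP must not be reached from a server-side fetch."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
    )


def host_is_blocked(hostname: str) -> bool:
    """Resolve hostname and block it if any resolved address is internal."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Assumption: an unresolvable host is treated as blocked (fail closed).
        return True
    for info in infos:
        address = info[4][0].split("%", 1)[0]  # drop an IPv6 zone id, if any
        if is_blocked_address(address):
            return True
    return False
```

Because the check runs on parsed addresses, `fe80::1`, `fc00::1` and `ff02::1` are rejected without any pattern having to anticipate how they are written.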
ZdenekSrotyr	5c54320f75	chore(release): cut 0.12.0 Pre-1.0 SemVer convention: BREAKING changes land in MINOR. This release is heavy on those — RBAC v13 schema rewrite, schema v14 FK constraints, marketplace admin god-mode drop, internal_roles/group_mappings/user_role_grants tables removed, exit code 2 from Keboola extractor (partial fail), Script API admin-only, scripts/grpn/ → scripts/ops/. CHANGELOG.md: rename Unreleased → [0.12.0] — 2026-04-28; append a fresh empty Unreleased above so the next PR has somewhere to land. pyproject.toml: 0.11.5 → 0.12.0. Tag the merge commit as v0.12.0 + push the tag after merge to main.	2026-04-28 14:25:13 +02:00
Vojtech Rysanek	7147bac079	feat(rbac+marketplace): schema v14 FK + AGNES_ENABLE_TABLE_GRANTS + break-glass CLI Follow-up to the RBAC v13 + marketplace work in the parent commit. Addresses deferred Devin findings, gemini-flagged blockers, and adds three guard rails. == Schema v14 — FK constraints on user_group_members + resource_grants == Adds DuckDB foreign-key constraints so cascade deletes can no longer leave orphaned member / grant rows pointing at a deleted group_id (which were relying on application-level cascades up to v13). Migration is RENAME → CREATE-with-FK → INSERT → DROP, wrapped in BEGIN TRANSACTION so a partial failure rolls back without leaving the DB at a half-applied schema. == AGNES_ENABLE_TABLE_GRANTS feature flag (default off) == ResourceType.TABLE was shipped in the parent commit as listing-only — admins can record grants but runtime enforcement still flows through legacy dataset_permissions. To avoid the misleading-UX surface area, the chip is hidden from /admin/access and POST /api/admin/grants returns 422 with the env-var name in detail until the operator opts in. Existing TABLE rows in resource_grants stay listable + deletable so cleanup is never blocked. Helpers: is_resource_type_enabled(rt), enabled_resource_types(). == Break-glass admin CLI == `da admin break-glass <user>` adds the user to the Admin user_group with source='system_seed' regardless of RBAC state. Bypasses authentication — relies on filesystem access to ${DATA_DIR}/state/system.duckdb implying host-level trust. Recovery path when the operator has locked themselves out of /admin/access. == Devin round-2 fixes (deferred on b4ec4c4) == - src/repositories/user_groups.py — narrow update() guard from blocking any mutation on system groups to blocking name change only. Description edits now pass through. Endpoint pre-check stays as defense-in-depth. Prior behavior surfaced as a misleading 409 'Cannot rename a system group' on description-only PATCH. 
- app/api/access.py:delete_group — wrap cascade DELETEs + repo.delete in BEGIN TRANSACTION / COMMIT / ROLLBACK. Prevents orphan rows if any DELETE fails after the user_groups row is gone. - app/marketplace_server/{packager,router}.py — split compute_etag_for_user() from build_zip(); router resolves etag first and 304-shorts before any file read or ZIP_DEFLATED. In-process cachetools.TTLCache (default 120s, env-tunable via AGNES_MARKETPLACE_ETAG_TTL, set 0 to disable). invalidate_etag_cache() called by sync to force re-hash on content drift. == Tests == - TestTableGrantsFeatureFlag (4 cases) — endpoint exclude/include, grant rejection/acceptance under the flag. - test_v12_to_v13_finalize_rollback_on_failure — destructive: monkeypatches _seed_system_groups to raise mid-transaction, asserts schema_version stays at 12, legacy tables intact, new tables empty (rollback fired). Then restores the real function and asserts the retry succeeds. - test_update_system_group_description_allowed, test_update_system_group_same_name_no_op — repo-level coverage of the narrowed guard.	2026-04-28 14:25:13 +02:00
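Illustration of the RENAME → CREATE-with-FK → INSERT → DROP migration wrapped in a transaction: a minimal sketch that uses the standard-library `sqlite3` module as a stand-in for DuckDB. The two-column table layout and the function name are assumptions made for the example.

```python
import sqlite3


def migrate_members_with_fk(conn: sqlite3.Connection) -> None:
    """Rebuild user_group_members with a foreign key, all or nothing.

    Expects a connection opened with isolation_level=None so that BEGIN,
    COMMIT and ROLLBACK are issued explicitly.
    """
    conn.execute("BEGIN")
    try:
        conn.execute(
            "ALTER TABLE user_group_members RENAME TO user_group_members_old"
        )
        conn.execute(
            "CREATE TABLE user_group_members ("
            " user_id TEXT NOT NULL,"
            " group_id TEXT NOT NULL REFERENCES user_groups(id))"
        )
        conn.execute(
            "INSERT INTO user_group_members"
            " SELECT user_id, group_id FROM user_group_members_old"
        )
        conn.execute("DROP TABLE user_group_members_old")
        conn.execute("COMMIT")
    except Exception:
        # A failure at any step restores the pre-migration table.
        conn.execute("ROLLBACK")
        raise
```

With foreign keys enforced, an orphaned member row makes the INSERT fail, and the rollback leaves the original table in place so the migration can be retried after cleanup.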
ZdenekSrotyr	e9d7af3cce	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening This squashes 13 commits from ma/staging plus a small docstring translation into a single coherent unit. Three workstreams. == RBAC v13 redesign == - Drops core.viewer/analyst/km_admin/admin hierarchy and the internal_roles / group_mappings / user_role_grants / plugin_access tables. - Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry. - Two authorization primitives in app.auth.access: require_admin — Admin-group god-mode require_resource_access(rt, "{path}") — entity-scoped grants Single DB lookup per request; no session cache; no implies BFS. - /admin/access UI (single page) replaces /admin/role-mapping + /admin/plugin-access. CLI `da admin group/grant ` replaces `da admin role/mapping/grant-role/revoke-role/effective-roles`. - ResourceType.TABLE listing-only — admins can record table grants, runtime enforcement still flows through legacy dataset_permissions (migration plan in docs/TODO-rbac-data-enforcement.md). == Claude Code marketplace == - Aggregated /marketplace.zip + /marketplace.git/ (PAT-gated, RBAC-filtered, content-addressed cache via dulwich). - Admin god-mode dropped on the marketplace surface — admins curate their own view via grants like everyone else. - Bare-repo cache materializes per RBAC-filtered ETag; stale entries not pruned in this iteration (disclaimed in git_backend.py docstring). == #81 #83 #44 security/ops hardening == - #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias). - #81 Group B — Keboola extractor 3-state exit codes: 0 success / 1 total fail / 2 PARTIAL fail Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary alerting must teach it the new partial signal. - #81 Group C — schema v10 view_ownership; rejects silent overwrite of a prior connector's view name on collision. 
- #81 Group D — extractor-side identifier validation. - #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset + path-traversal fix. - #44 — entire /api/scripts/* surface is admin-only (planted-script + sandbox-bypass risk closed). == Web UI polish + deploy fix == - /admin/access: live grant-count badges (no stale snapshot revert), shared-header CSS link added to /catalog and /admin/{tables,permissions}, per-resource-type colored stripes. - docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't silently shadow sub-mounts and write state to the wrong disk. == OSS vendor-neutralization (waves 1+2) == - scripts/grpn/ → scripts/ops/. Customer-specific identifiers (project IDs, internal hostnames, dev/prod VM IPs, brand names) replaced with placeholders across code, docs, Terraform, Caddyfile, OAuth probe, and planning docs. Downstream infra repos that copied scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must update the path. == Translation == - src/repositories/user_groups.py::ensure_system docstring translated from Czech to English for codebase consistency. Co-authored-by: Mina Rustamyan <mina@keboola.com>	2026-04-28 14:25:04 +02:00
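Illustration of the 3-state extractor exit codes (0 success / 1 total fail / 2 partial fail): a minimal sketch. The enum, the per-table counters and the alert labels are hypothetical; only the three numeric codes come from the commit message.

```python
import enum


class ExtractorExit(enum.IntEnum):
    SUCCESS = 0
    TOTAL_FAILURE = 1
    PARTIAL_FAILURE = 2


def exit_code_for(succeeded: int, failed: int) -> ExtractorExit:
    """Derive the exit code from per-table outcomes (assumed inputs)."""
    if failed == 0:
        return ExtractorExit.SUCCESS
    if succeeded == 0:
        return ExtractorExit.TOTAL_FAILURE
    return ExtractorExit.PARTIAL_FAILURE


def alert_level(code: int) -> str:
    """Map an exit code to an alert label; unknown codes count as total failure."""
    if code == ExtractorExit.SUCCESS:
        return "ok"
    if code == ExtractorExit.PARTIAL_FAILURE:
        return "partial_failure"
    return "total_failure"
```

Alerting that only tests for a non-zero exit code would report a partial failure as a total one, which is why the commit asks operators to handle exit code 2 separately.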
Petr Simecek	2dfb246996	release(0.11.5): post-merge follow-up — Devin review fixes + authlib warning silenced (#74 ) Cuts 0.11.5 with all the [Unreleased] bullets that landed on top of PR #73 between commit a899877 (the original "v0.11.4" tag in the chain) and the final merge commit on main. No new public-API surface; the user-visible payoff is that v8→v9-migrated installations work end-to-end (login flows, GET /api/users, admin nav, the new role-management REST API and its last-admin protection) and `make local-dev` startup is finally quiet. Bullets covered (full text in CHANGELOG.md [0.11.5]): - _hydrate_legacy_role re-resolves from grants on every request — fixes privilege-retention after grant revoke via the role-management API. - Dev-bypass + OAuth callback now pass user_id to resolve_internal_roles so direct grants land in the session cache (not the DB-fallback path). - GET /api/users hydrates user dicts before Pydantic validation (HTTP 500 on every migrated install) + same fix for update/delete paths so last-admin protection triggers on migrated admins. - Scheduler stopped spamming POST /auth/token 401 — the auto-fetch fallback was always broken; SCHEDULER_API_TOKEN is now the only path. - POST /auth/token / Google OAuth / password / email-magic-link all hydrate user["role"] before issuing the JWT (Pydantic 500 + wrong token payload). New TestAuthLoginFlowsPostMigration regression class. - docs/RBAC.md no longer documents the non-existent implies= keyword on register_internal_role. - _seed_core_roles now actually runs on every connect (the docstring was lying — only ran during fresh install + v8→v9). New TestSeedCoreRolesSafetyNet regression class. This commit also adds: - AuthlibDeprecationWarning suppression at app/main.py top — upstream- internal forward-compat note from authlib._joserfc_helpers, not actionable on our side. Filter is targeted by class (with a message-based fallback) so other DeprecationWarnings remain visible. 
- pyproject.toml version: 0.11.4 → 0.11.5. - CHANGELOG.md: [Unreleased] → [0.11.5] — 2026-04-27, new empty [Unreleased] skeleton appended for the next PR to land on. Tag v0.11.5 follows; keboola-deploy-v0.11.5 tag triggers the keboola-deploy.yml workflow for agnes-dev.keboola.com.	2026-04-27 02:32:18 +02:00
Petr Simecek	83ced81966	feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) * feat(auth): v9 schema — unified role management foundation (WIP) Tasks 1-5, 10 of the role-management-complete plan. Foundation only, follow-up commits add REST API, CLI, UI, and tests. Schema v9: - user_role_grants table: direct user → internal_role mapping (complementary to group_mappings). Drives PAT/headless auth and persists across sessions. Source field tracks 'direct' vs auto-seed. - internal_roles.implies (JSON): transitive role hierarchy. core.admin implies core.km_admin → core.analyst → core.viewer. Resolver does BFS expand at lookup time. - internal_roles.is_core (BOOL): distinguishes seeded core.* hierarchy from module-registered roles. UI renders them differently. - v8→v9 migration: ADD COLUMN, CREATE TABLE, _seed_core_roles + _backfill_users_role_to_grants, then NULL legacy users.role values. DuckDB FK constraint blocks DROP COLUMN, so the column stays as a deprecated artifact (UserRepository ignores it); the physical drop is deferred. Resolver: - Regex extended to allow dotted namespace (core.admin, context_engineering.admin), max 64 chars total. - expand_implies(role_keys, conn): BFS over implies JSON column. - resolve_internal_roles signature gains optional user_id parameter; unions group-mapping resolution with user_role_grants direct grants before implies expansion. require_internal_role: - Two-path resolution: session cache (OAuth) → DB grants (PAT/headless fallback). PAT clients now legitimately satisfy gates without the OAuth round-trip, fixing the v8 limitation where every PAT-callable admin endpoint needed require_role(Role.ADMIN) instead of require_internal_role(...). Backward-compat: - require_role(Role.X) and require_admin become thin wrappers over require_internal_role(f"core.{role}"). Implies hierarchy preserves the legacy "at least this level" semantics automatically — no per-level comparison code needed. 
- src/rbac.py helpers (is_admin, has_role, get_user_role, set_user_role, can_access_table, get_accessible_tables) all read from the resolver via _get_internal_role_keys. - UserRepository.create() and update() now mirror role changes into user_role_grants via _grant_core_role helper. Preserves API while making the new table the source of truth. - UserRepository.delete() pre-deletes user_role_grants rows (FK cascade — DuckDB doesn't auto-cascade). - count_admins() reads user_role_grants ⨝ internal_roles instead of the now-NULL users.role column. First consumer: - app/api/admin.py module-level docstring documents the v9 pattern for future module authors. Existing require_role(Role.ADMIN) callsites flow through the wrapper; no behavior change for OAuth callers, and PAT callers gain access via direct grants. Tests: full suite green (1396 passed, 6 skipped). Existing tests exercise the new pathway transparently because UserRepository.create auto-grants. New test_pat_caller_with_direct_grant_passes pins the PAT-aware contract. Schema: v9 (was v8). pyproject.toml + CHANGELOG bump deferred to the final PR-prep commit. * feat(auth): role management complete — REST API + CLI + UI + docs (v0.11.4) Unifies the legacy users.role enum with the v8 internal-roles foundation under a single model with an implies hierarchy, ships an admin UI + REST API + CLI for managing group mappings and direct user grants, and makes require_internal_role PAT-aware so that admin endpoints work uniformly across OAuth and headless callers. REST API (app/api/role_management.py, +496 LOC): - 8 endpoints under /api/admin: internal-roles list, group-mappings CRUD, users/{id}/role-grants CRUD, users/{id}/effective-roles debug. - All gated by require_internal_role("core.admin"). Audit log on every mutation (role_mapping.created/deleted, role_grant.created/deleted). - Last-admin protection: refuse to delete the final core.admin grant (mirrors users.py:count_admins protection). 
- New UserRoleGrantsRepository in src/repositories/user_role_grants.py. CLI (cli/commands/admin.py extension, +258 LOC): - da admin role list / show <key> - da admin mapping list / create <group-id> <role-key> / delete <id> - da admin grant-role <email> <role-key> - da admin revoke-role <email> <role-key> - da admin effective-roles <email> - Everything goes through typer + PAT auth, with a --json flag, and tolerates response-shape drift. UI (admin_role_mapping.html + admin_user_detail.html + nav + user list): - New page /admin/role-mapping: internal_roles read-only table + group_mappings table with create/delete forms. - New page /admin/users/{id}: core role single-select + capabilities multi-checkbox + effective-roles debug (direct + group + expanded). - The existing user list gets a "Detail" link to the new page. - Nav link to /admin/role-mapping. Tests: +85 new tests across 4 new files: - test_schema_v9_migration.py (8) — fresh install + v8→v9 backfill + legacy column NULL semantics + unknown-role fallback + invariants. - test_api_role_management.py (33) — all 8 endpoints, happy + error paths, audit-log assertions, last-admin protection. - test_cli_admin_role.py (25 + 1 conditional) — typer subcommands, text + json output, PAT integration smoke. - test_admin_role_mapping_ui.py (9) + test_admin_user_capabilities_ui.py (10) — page rendering, auth gating, form contracts, JS hooks. Full suite: 1482 passed, 6 skipped (was 1396 → +86, no regressions). Docs: - docs/internal-roles.md complete rewrite: removed "no UI yet", added hierarchy diagram, dual-path resolution, dotted-namespace convention, admin workflow via UI/CLI/REST, refresh semantics for group mappings vs direct grants, migration notes. - CLAUDE.md schema v8 → v9. - CHANGELOG.md [0.11.4] with a BREAKING marker for users.role NULL semantics + complete Added/Changed/Removed/Internal sections. - pyproject.toml: 0.11.3 → 0.11.4. 
Sequencing: after this PR merges, Pabu rebases pabu/local-dev (PR #72) onto main; its schema migrations shift from v9/v10/v11 to v10/v11/v12. Implementation breakdown: - Sequential (me): foundation tasks — schema v9, resolver, PAT-aware require_internal_role, backward-compat wrappers, rbac refactor, UserRepository auto-grant. - Parallel sub-agents (3 worktrees, ~10 min): REST API, CLI, UI. - Sequential (me): integration, docs/CHANGELOG/version, schema tests, full-suite verification. * fix(auth): address Devin review on PR #73 — three regressions Three concrete bugs caught in Devin's PR review, all fixed in this commit. 1. users.role hydration on read (the big one): v8→v9 migration NULLs users.role for every existing user, but a long tail of read sites still inspect user["role"] directly: - app/web/templates/_app_header.html:15 — admin nav gate - app/web/templates/_app_header.html:36-37 — role badge in dropdown - app/web/router.py:319-321 — UserInfo.is_admin/is_analyst/is_privileged - app/web/router.py:489 — corporate memory is_km_admin - app/api/catalog.py:54 — admin "see all tables" bypass - app/api/sync.py:215 — admin "see all sync states" bypass Without a fix, every existing admin loses the entire admin nav (and API admin bypasses) immediately after upgrade — a serious regression. Fix: new helper _hydrate_legacy_role() in app/auth/dependencies.py maps the highest-level core.* grant back into user["role"] as the legacy enum string. Called from get_current_user() on both auth paths (LOCAL_DEV_MODE + JWT/PAT). Idempotent — skips when role is already populated. Net effect: every pre-v9 callsite keeps working transparently for both OAuth and PAT callers, with one extra DB round-trip per authenticated request (same cost as the existing PAT-aware require_internal_role fallback). 
3 regression tests in tests/test_schema_v9_migration.py: - test_hydration_recovers_role_from_user_role_grants - test_hydration_returns_highest_grant (multi-grant → highest wins) - test_hydration_falls_back_to_viewer_when_no_grants (safe fallback) 2. CLI effective-roles TypeError: API returns direct/group as List[Dict] (RoleGrantResponse-shaped), but the CLI did ', '.join(direct) which raises TypeError on dicts. Tests masked it because mocks used bare string lists. Replaced raw .join() with a _names() helper that extracts role_key from each item, falling back to str() for legacy mock shapes. 3. UI template field-name mismatch: admin_user_detail.html JS reads data.groups but the API serializes the field as group (singular, per EffectiveRolesResponse pydantic). Currently benign because the API always returns group:[], but the field would silently disappear once the group-derived view is wired up. Added data.group as the primary lookup, kept the legacy aliases for shape-drift tolerance. Full suite: 1485 passed (was 1482, +3 hydration tests), 6 skipped, no regressions. * fix(auth): Devin review #2 + UX self-service + RBAC docs rename Three threads landed in one commit because they share the same auth/role surface and CHANGELOG entry. Devin review #73 second round (2 actionable findings): - _hydrate_legacy_role no longer short-circuits on truthy users.role. The role-management endpoints (POST/DELETE /api/admin/users/{id}/ role-grants + the changeCoreRole UI flow) only mutate user_role_grants — they don't update the legacy column. The early return trusted that stale value, so a user downgraded via the new REST/UI kept role="admin" in their dict on subsequent requests, which fooled _is_admin_user_dict (src/rbac.py) and the catalog/sync admin-bypass short-circuits into retaining elevated table access even though require_internal_role correctly denied the API gates. Always re-resolves now, making user_role_grants the single source of truth on every authenticated request. 
Cost: one DB round-trip per request — same as the existing PAT-aware fallback. Pinned by test_hydration_ignores_stale_legacy_role_after_grant_revoke. - Dev-bypass (app/auth/dependencies.py) and OAuth callback (app/auth/providers/google.py) now pass user_id to resolve_internal_roles so direct grants land in session["internal_roles"] alongside group-mapped roles. Pre-fix, every admin-gated request fell through to the per-request DB fallback inside require_internal_role and the dev-bypass log line read "resolved 0 internal role(s)" for an obviously-admin user. test_session_internal_roles_populated updated to assert union. User-visible UX (also addresses local-test feedback): - HTTP 500 on /admin/users post-v8→v9 migration — UserResponse.role is required str, but legacy users.role was NULL-ed by the migration. _to_response in app/api/users.py now routes every dict through _hydrate_legacy_role; same fix lifts the silent no-op of last-admin protection in update_user/delete_user (the role-equality short-circuits would skip the count_admins guard for migrated admins). Three regression tests under TestAPIUsersPostMigration. - /profile is now a real self-service detail page for every signed-in user (not just admins). Three new server-side sections: Effective roles (resolver output as chip cloud), Direct grants (rows in user_role_grants with source label), Roles via groups (which Cloud Identity / dev group grants which role for the current user). Non-admins finally see why a feature is or isn't accessible. Admins additionally see a deep-link to /admin/users/{id} for editing their own grants. - /admin/role-mapping group-id picker. New "Known groups" panel above the create form: clickable chips for the calling admin's own session.google_groups (tagged "your group") merged with external_group_ids already used in existing mappings (tagged "already mapped"). Click a chip → fills the form. 
Empty-state copy points operators at LOCAL_DEV_GROUPS / Google sign-in instead of leaving them to guess Cloud Identity opaque IDs from memory. Operational fixes: - Scheduler log-noise: every cron tick produced a POST /auth/token 401 because the auto-fetch fallback called the endpoint with just an email (no password) and silently fell through. Removed the broken path entirely. Operators set SCHEDULER_API_TOKEN (long-lived PAT) in production; in LOCAL_DEV_MODE the dev-bypass auto-authenticates the un-tokenized request, so jobs continue to work. Docs: - docs/internal-roles.md → docs/RBAC.md (git mv preserves history). Standard industry term, more discoverable for engineers grepping for RBAC in a new repo. Restructured: Quickstart-by-role (operator / end-user / module author), step-by-step Module-author workflow with code examples (register key, gate endpoint, declare implies, write contract test), naming pitfalls, refresh semantics. CLAUDE.md gets a new "Extensibility → RBAC" section pointing contributors at the doc before they add gated endpoints. Cross-refs in app/api/admin.py + tests/test_role_resolver.py updated. Tests: 293 in the auth/role/scheduler/UI test set passed, 0 regressions. * fix(auth): Devin review #3 — login flows + RBAC docs Two new findings on commit 7d1c048, both real and addressed. Finding 1 (BUG, HTTP 500): every auth login flow loaded users via UserRepository.get_by_email and passed user["role"] straight to create_access_token, Pydantic response models, and _set_login_cookie without going through _hydrate_legacy_role. Post-v9 the legacy column is NULL for migrated users, and TokenResponse.role is a required str — so POST /auth/token raised ValidationError → HTTP 500 for any v8-admin trying to log in via password. Same root cause produced non-crashing but semantically wrong JWTs (role: null) from Google OAuth, password web flows, and email magic-link verification. 
Fix: hydrate inline in every login flow before reading user["role"]: - app/auth/router.py — POST /auth/token (the crash site) - app/auth/providers/google.py — OAuth callback (was just stale JWT) - app/auth/providers/password.py — 5 flows: JSON login, web login, JSON setup, web reset confirm, web setup confirm - app/auth/providers/email.py — centralized in _consume_token, covers both /verify endpoints New regression class TestAuthLoginFlowsPostMigration pins both the no-crash and the correct-role contracts for all four legacy levels (viewer/analyst/km_admin/admin) on POST /auth/token. Finding 2 (DOCS): docs/RBAC.md showed register_internal_role() being called with implies=[...], but the function signature is (key, *, display_name, description, owner_module). A module author copying the example would TypeError at import time. The implies field on internal_roles IS honored at runtime by expand_implies, but the registry-side write path (register_internal_role + InternalRoleSpec + sync_registered_roles_to_db) doesn't exist yet — implies is currently seeded only for the core.* hierarchy via _seed_core_roles in src/db.py. Rewrote the Implies hierarchy and Module-author workflow sections to document what's actually supported in 0.11.4 and what a future change would need to add. The "for cross-module hierarchies, register each level + grant both" pattern works today. Tests: 322 in the auth/role/scheduler/UI/password test set passed, 0 regressions. * fix(db): _seed_core_roles actually runs on every connect (Devin review #4) Devin flagged that the docstring on `_seed_core_roles` promised per-connect execution as a safety net for accidental DELETEs and in-code seed changes, but the only call sites lived inside `if current < SCHEMA_VERSION:` — so once a DB was on v9 the function never ran again, and the docstring lied. 
Picked option (b) from the review (actually call it on every startup) over option (a) (fix the docstring) because the safety net is genuinely useful: - recovery from accidental admin DELETE on internal_roles, - in-code _CORE_ROLES_SEED tweaks (display_name/description/implies) ship without a manual SQL deploy, - fresh installs and migrations stop needing their own seed call sites. Tail call gated by `get_schema_version(conn) <= SCHEMA_VERSION` so the future-version-is-noop rollback contract still holds — a v9 binary won't touch a DB that's been upgraded past v9. Test coverage: new TestSeedCoreRolesSafetyNet class (3 tests) pins the three contracts — deleted row re-seeds, mutated display_name re-syncs from in-code seed, applied_at on schema_version doesn't churn on already-current DBs. Existing TestMigrationSafety::test_future_version_is_noop still passes (verified against the gating logic).	2026-04-27 02:23:01 +02:00
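Illustration of the implies expansion described in this commit (BFS over the implies hierarchy): a minimal sketch. The real `expand_implies(role_keys, conn)` reads the implies JSON column through a DB connection; here an in-memory dict stands in for that column, which is an assumption of the example.

```python
from collections import deque

# Stand-in for the internal_roles.implies column, using the core hierarchy
# named in the commit message.
CORE_IMPLIES = {
    "core.admin": ["core.km_admin"],
    "core.km_admin": ["core.analyst"],
    "core.analyst": ["core.viewer"],
    "core.viewer": [],
}


def expand_implies(role_keys, implies):
    """Return role_keys plus every transitively implied role, sorted."""
    seen = set(role_keys)
    queue = deque(role_keys)
    while queue:
        role = queue.popleft()
        for implied in implies.get(role, []):
            if implied not in seen:  # the seen set also guards against cycles
                seen.add(implied)
                queue.append(implied)
    return sorted(seen)
```

A gate for `core.viewer` is then satisfied by checking membership in the expanded set, which is how the legacy "at least this level" semantics survive without per-level comparison code.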
Petr Simecek	6c36b26979	release(0.11.3): internal roles + external→internal group mapping (foundation) (#71 ) * feat(auth): internal roles + external→internal group mapping (foundation) Two-layer authorization model: external Cloud Identity groups (org-managed) get mapped onto internal Agnes-defined capabilities (app-managed) via an admin-curated many-to-many table. Per-request permission checks read off the session — no DB hit. Refresh requires re-login. Schema v8 — new tables: - internal_roles (id, key UNIQUE, display_name, description, owner_module, …) — app-defined capabilities like 'context_admin'. Modules self-register at import; the startup hook syncs the registry into this table (idempotent). - group_mappings (id, external_group_id, internal_role_id FK, …) — admin-managed bindings, UNIQUE(external_group_id, internal_role_id). app/auth/role_resolver.py — new module: - register_internal_role(key, display_name, description, owner_module) Module-author entry point. lower_snake_case key, immutable, validated. Same key + same fields = no-op (re-import safe); same key + different fields = ValueError so two modules can't silently overwrite each other. - sync_registered_roles_to_db(conn) — startup reconciliation. Inserts new keys, updates drifted metadata, never deletes (preserves mappings). - resolve_internal_roles(external_groups, conn) — joins group_mappings. Sorted, deduplicated role-key list. Plugged into google_callback + dev-bypass branch in get_current_user. - require_internal_role('key') — FastAPI dependency factory; reads session.internal_roles; 403 with explicit message when missing. Resolution runs at sign-in only (Google callback + LOCAL_DEV_GROUPS change in dev-bypass) — same semantics as session.google_groups. No admin UI yet; mappings created via repository directly until follow-up PR ships UI. 
21 new tests in tests/test_role_resolver.py: register/list, idempotency, collision detection, key-format validation; sync insert/update/no-delete; resolve empty/single/many-to-many/malformed-input; e2e via LOCAL_DEV_GROUPS — gated endpoint allowed/denied + direct session-cookie inspection. Full sweep: 178/178 passed across auth + db + repo tests. (Two pre-existing test_catalog_export.py failures verified unrelated.) * fix(auth): polish review feedback — first-request dev populate + PAT doc Two follow-ups from a code-reviewer pass on the foundation commit before opening the PR: - Dev-bypass populates session["internal_roles"] on the first request after sign-in, not just when external groups change. The previous guard only resolved when groups_changed=True, which left a hole for the LOCAL_DEV_GROUPS=`""` (explicit empty) flow: target=[], current=None, neither write branch fires, internal_roles stays unset, and require_internal_role then 403s with no roles to check against. The OAuth callback writes session["internal_roles"] unconditionally on sign-in (even []); dev-bypass now matches that semantics. Adds a single-pass populate gated on the key being absent from the session, so subsequent same-state requests still no-op (cheap session lookup, no resolver call). - Document that internal roles are session-scoped and PAT/headless clients will get 403 from any require_internal_role(...) endpoint. Same constraint already applies to session.google_groups (PAT JWTs deliberately don't snapshot group memberships — they could change after issuance with no way to re-sign), but the doc didn't surface this — an operator pointing a CLI at a role-gated endpoint would see 403 with no clue why. New "PAT and headless requests" section spells out the constraint, the rationale, and the three escape valves (use users.role for the gate; route through OAuth; wait for the planned `da admin grant-role` CLI helper). 
54 auth tests still pass locally (21 role-resolver + 33 existing auth-provider). * release(0.11.3): cut release for the internal-roles foundation Bumps pyproject.toml 0.11.2 → 0.11.3 and renames CHANGELOG's [Unreleased] section to [0.11.3] — 2026-04-26 (with a fresh empty [Unreleased] skeleton appended). Adds the matching [0.11.3] link reference at the bottom of CHANGELOG so the section heading renders as a hyperlink to the GitHub release page once the tag lands. The bullet itself is unchanged content; the rephrasing of "dev-bypass when external groups change" → "dev-bypass — populates on first request and whenever external groups change, mirroring the OAuth callback's always-write semantics" reflects the polish committed in d590579, plus the appended PAT/headless caveat pointing at the doc section that landed in the same polish pass. * fix(auth): address review feedback from Pavel — PAT-specific 403, audit logs, hardening Round-2 polish over the internal-roles foundation, addressing Pavel's review on PR #71. No behavior change for the happy path; tightens the safety rails and makes the failure modes self-explanatory. User-visible: - require_internal_role now distinguishes "no session" (Bearer/PAT caller) from "signed in but missing role" and surfaces a PAT-specific 403 detail in the first case ("This endpoint needs an interactive (OAuth) session — Bearer/PAT tokens do not carry session-resolved roles by design"). - docs/internal-roles.md documents deactivate+reactivate as the supported "force re-resolve now" lever for users that can't be made to log out. Internal hardening: - INFO-level audit log on every successful resolve (OAuth callback + dev-bypass) so a wrong-role complaint is debuggable from the log alone. - Startup warning when SESSION_SECRET is shorter than 32 chars, matching the existing JWT_SECRET_KEY gate — both HMAC surfaces sign trust-laden state (session.internal_roles, session.google_groups, JWTs). 
- _clear_registry_for_tests() now refuses to run unless TESTING=1 so a stray import path in production can't drop the registered capabilities. Tests: - 4 new tests in tests/test_role_resolver.py covering: stale-session contract after a mid-session mapping revoke (pin the documented limitation), PAT 403 detail wording, OAuth pipeline data flow from external groups to internal_roles, and the dev-bypass empty-list fallback when the resolver raises. CHANGELOG.md updated under [0.11.3] (### Changed + ### Internal). CLAUDE.md schema doc bumped from v7 to v8. --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-26 23:49:10 +02:00
Petr Simecek	1c18cdf15f	release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70 ) * feat(auth): mock session.google_groups in LOCAL_DEV_MODE via LOCAL_DEV_GROUPS LOCAL_DEV_MODE auto-logged-in the dev user but left session.google_groups empty, so group-aware UI/code paths can't be exercised on localhost without a real Google OAuth round-trip. New LOCAL_DEV_GROUPS env var (JSON array matching the production {id, name} shape) populates the session on every dev-bypass request — same structure the OAuth callback writes, so mock and prod stay in lockstep. Compare-then-write avoids spurious Set-Cookie noise on PAT/CLI requests; malformed input falls back to [] with a WARNING so the dev mock never breaks the dev flow. * refactor(auth): fail-fast LOCAL_DEV_GROUPS at startup + cache + no-mutate Three small follow-ups on the same dev-mock vector before merge: - Validate LOCAL_DEV_GROUPS at app startup and report the parsed group IDs in the LOCAL_DEV_MODE banner. A malformed value now warns loudly at boot instead of silently logging on the first authenticated request, where it's easy to miss. - Cache the parsed result single-slot, keyed by the raw env-string. Avoids re-parsing JSON on every authenticated request without test-isolation surprises — when the env value changes, the key changes and the cache transparently rebuilds. - Stop mutating the parsed-input dicts (item.setdefault → spread-merge) so the cached list stays a fresh value on every rebuild. - Replace the try/except guard around request.session with hasattr — SessionMiddleware is always registered, the silent except was paranoid. Tests grow by a direct session-cookie inspection (decoupled from the profile template) and three startup-banner log assertions. * fix(auth): drop fragile session-decoder test + actually skip empty-target write Two follow-ups on the LOCAL_DEV_GROUPS feature before merge: - Drop test_session_holds_mocked_groups_directly. 
It manually decoded the signed session cookie via TimestampSigner + base64, hardcoding both the Starlette session-cookie format and the 14-day max_age. Starlette has changed its session encoding before (URLSafeTimedSerializer pre-0.20) and would do so again silently — the test would fail with a cryptic BadSignature, not a clear "mock is broken" signal. The remaining test_dev_user_sees_mocked_groups_on_profile already covers the same observable signal (mocked groups in /profile body) without coupling to Starlette internals. - Actually skip the session write when target_groups is empty. The previous comment claimed compare-then-write avoided spurious Set-Cookie noise on PAT/CLI requests, but on those requests session.get("google_groups") is None and target is [], so None != [] always evaluates True and the write fired anyway, marking the session dirty and re-issuing Set-Cookie on every request. Adding `target_groups and ...` to the guard makes the comment honest: empty mock now genuinely no-ops, stable browser sessions still skip via value-equality, and the only remaining write is the one that actually changes state. 33 auth tests still pass locally. * fix(auth): match production's always-write semantics for stale dev groups Devin code-review finding on PR #70: my earlier `target_groups and ...` short-circuit silently diverged from the production OAuth callback. In app/auth/providers/google.py:189-194 the callback always writes session.google_groups on each login — including [] on failure or empty token — so the session always reflects authoritative current state. The mock should match. Failure mode the previous guard left open: a developer sets LOCAL_DEV_GROUPS=[{...}] for a session, the groups land in the signed cookie, then the developer unsets the env var and reloads. target → [], session.get → [{...}], `if target_groups and ...` is False, no write, stale groups stay in the browser session indefinitely. Mock now lies about state until logout. 
Fix splits the guard: - target_groups truthy + value-changed → write the new mock (existing path) - target_groups falsy + non-empty stored → write [] to clear stale state - otherwise no-op (target [] + stored None/[]: no transition to record) PAT/CLI requests with no prior session still take the no-op path (target=[], session.get → None which is falsy), so the original goal of suppressing spurious Set-Cookie noise on token traffic is preserved. Tests already cover the populated and unset paths; the new clear-stale branch is correct by construction (production has the same shape) and the rare manual reset workflow. * release(0.11.2): default mocked groups in make local-dev + docs/local-development.md Cuts 0.11.2 around the LOCAL_DEV_GROUPS work plus a small dev-experience follow-up: every `make local-dev` now boots with two sensible default mocked groups (Local Dev Engineers + Local Dev Admins on example.com), so /profile and group-aware code paths render something realistic without the operator having to discover and set LOCAL_DEV_GROUPS. Layered so the default lives in the workflow, not the contract: - scripts/run-local-dev.sh seeds LOCAL_DEV_GROUPS via shell ":=" syntax — only sets the var when the operator hasn't already. Override: LOCAL_DEV_GROUPS='[...]' make local-dev. Disable: LOCAL_DEV_GROUPS= make local-dev. - docker-compose.local-dev.yml swaps the commented JSON example for a bare `- LOCAL_DEV_GROUPS` passthrough — the value comes from the shell, the compose file just propagates it. Operators running `docker compose up` directly without the wrapper script get an empty mock (correct: they didn't opt into the make-driven defaults). - Makefile help line mentions the mocked groups so the behavior is visible without grepping. 
New docs/local-development.md consolidates dev-onboarding instructions that were previously scattered across docker-compose.local-dev.yml inline comments, docs/auth-groups.md "Local-dev mock" section, the Makefile help text, and CLAUDE.md "First-Time Setup". Single page now covers TL;DR, what LOCAL_DEV_MODE actually bypasses, group mocking controls + verification, what is not mocked (Cloud Identity, real OAuth, admin Workspace permissions), and the safety rails that keep the dev shortcuts off production. Version bump 0.11.1 → 0.11.2 in pyproject.toml, CHANGELOG cuts [Unreleased] → [0.11.2] — 2026-04-26 with a fresh empty [Unreleased] skeleton. * fix(local-dev): default LOCAL_DEV_GROUPS truncated by shell parameter expansion Reported by an operator running `make local-dev` against the freshly released 0.11.2 — the LOCAL_DEV_MODE banner showed: LOCAL_DEV_GROUPS is not valid JSON, ignoring: Expecting ',' delimiter: line 1 column 70 (char 69) LOCAL_DEV_GROUPS is set but produced no valid groups — check the WARNING above for the parse error. Cause: the default value lived inside `${LOCAL_DEV_GROUPS:=…}` parameter expansion. Bash matches `}` to close the expansion at the first `}` encountered in the body, regardless of context — even one inside a nested JSON object literal. The two-element JSON array was therefore truncated to the first group's closing brace, leaving an unparseable fragment: [{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers" There is no escaping syntax for `}` inside parameter expansion (the backslash escapes I had only escaped the quotes — `}` reaches bash literally). Fix: hold the default in a single-quoted variable and reference it through `${LOCAL_DEV_GROUPS:-$DEFAULT_LOCAL_DEV_GROUPS}`. The variable's value is opaque to the expansion — no `}` matching inside it — so the JSON survives intact. 
Verified with `python -m json`: parsed OK: 2 groups: ['local-dev-engineers@example.com', 'local-dev-admins@example.com'] Operators on a running 0.11.2 stack: `make local-dev-down && make local-dev` to pick up the corrected default. * fix(local-dev): respect LOCAL_DEV_GROUPS= disable path + add 0.11.2 changelog link Two follow-ups from a Devin code-review pass on PR #70: - run-local-dev.sh: switch ${LOCAL_DEV_GROUPS:-$DEFAULT} to ${LOCAL_DEV_GROUPS-$DEFAULT} (no leading colon). The :- form substitutes the default when the variable is unset OR set-but-empty, silently overwriting the documented disable knob. Three places promise this works — docs/local-development.md, the CHANGELOG entry, and the script's own comment — so the bug was an operator-facing lie, not just an implementation detail. The bare - form only substitutes on unset, so `LOCAL_DEV_GROUPS= make local-dev` now reaches the Python parser as "" and short-circuits to []. Verified with both empty and unset shells. - CHANGELOG.md: add the [0.11.2] link reference at the bottom. Keep-a-Changelog convention is to mirror every version heading with a release-tag link in the footer; the 0.11.2 heading was missing its counterpart, breaking the Markdown link rendering on GitHub. --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-04-26 16:48:55 +02:00
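The split guard from the "always-write semantics" fix in the 0.11.2 commit above can be illustrated with a short sketch. It assumes a plain dict in place of the Starlette session and uses hypothetical helper names (`parse_dev_groups`, `apply_dev_groups`); it is not the project's actual implementation, only the three branches the commit message lists.

```python
import json


def parse_dev_groups(raw):
    """Parse a LOCAL_DEV_GROUPS-style JSON array; malformed input yields []."""
    if not raw:
        return []
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []  # the real code also logs a WARNING here
    if not isinstance(data, list):
        return []
    # Copy each item instead of mutating the parsed input.
    return [dict(item) for item in data if isinstance(item, dict) and "id" in item]


def apply_dev_groups(session, target_groups):
    """Write mocked groups into the session only on a real state change.

    Returns True when the session was modified (which would re-issue the
    cookie) and False when the call was a no-op.
    """
    stored = session.get("google_groups")
    if target_groups:
        if stored != target_groups:
            session["google_groups"] = list(target_groups)
            return True
        return False
    if stored:
        # Mock was unset but stale groups remain: clear them.
        session["google_groups"] = []
        return True
    # target is [] and stored is None or []: nothing to record.
    return False
```

With this shape, a PAT or CLI request that has no prior session takes the final no-op branch, while a browser session that still carries groups from an earlier mock is cleared on the next request.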
Petr Simecek	2099bb816e	release(0.11.1): hotfix the missed CADDY_TLS passthrough + changelog discipline (#67 ) Patch release containing the two follow-up changes from 0.11.0: - Caddy CADDY_TLS env passthrough in docker-compose.yml (#55) — should have shipped with #52 but the first PR got accidentally closed before merge. Without this fix Caddy ignores .env CADDY_TLS and crash-loops on any LE / internal-CA deployment. - CLAUDE.md changelog discipline (#59) — every PR touching user-visible behavior must update CHANGELOG.md under [Unreleased] in the same PR. The discipline rule itself caused this release to exist: writing the [Unreleased] entry made the missed fix obvious, which is exactly the feedback loop the rule is supposed to create.	2026-04-26 01:52:08 +02:00
Petr Simecek	598f186eb1	release(0.11.0): reset to pre-1.0 semver + first changelog (#58) The version = "2.x" strings in earlier pyproject.toml snapshots were arbitrary placeholders from the initial scaffold (cookiecutter default), not a reflection of API maturity. Resetting to 0.11.0 to signal pre-1.0 status: public surface (CLI flags, REST endpoints, instance.yaml schema, extract.duckdb contract) may still shift between minor versions. CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) continue from CI; semver tags (v0.X.Y) are cut at release boundaries and reference the same commit as a stable-* tag from the same day. CHANGELOG.md replaces the old CalVer draft format with Keep a Changelog + semver. The 0.11.0 entry curates everything currently in main: - Auth: Workspace groups, password reset, PAT, magic-link, seed admin pwd - Deploy: keboola-deploy workflow, Caddy/LE/cert-file TLS, dev_instances TLS, optional Google OAuth from SM, LOCAL_DEV_MODE, /setup wizard - CLI: wheel distribution, auto-update, --version, --dry-run, gzip - Data: remote query (BQ+DuckDB), business metrics, OpenAPI snapshot test - Security: padak-security.md audit batch + urllib3 + argon2-cffi - Two BREAKING items called out (Caddy profile rename, Caddyfile default cert mode flipped to cert-file)	2026-04-26 01:05:55 +02:00
Petr Simecek	1bbbe58ea0	release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43 ) * fix(cli): versioned wheel URL in setup instructions; drop broken /cli/agnes.whl alias (#36) * fix(cli): inline PEP 427 wheel filename in setup instructions `uv tool install <server>/cli/agnes.whl` fails with error: The wheel filename "agnes.whl" is invalid: Must have a version because uv validates the filename in the URL path before fetching — so the server-side Content-Disposition header (which has the real versioned filename) is never consulted, and an HTTP redirect does not help either: uv resolves the filename from the initial URL. Fix the root cause by inlining the real PEP 427 filename into the setup snippet the dashboard copies to the clipboard. The wheel filename is resolved server-side via `_find_wheel()` and substituted into the lines returned from `setup_instructions.resolve_lines()`, so both the read-only HTML preview and the JS clipboard renderer get byte-identical output. Also added `/cli/wheel/{filename}` to serve wheels at their PEP 427 path, and kept `/cli/agnes.whl` as a 302 redirect for manual/legacy callers — though that redirect alone is NOT sufficient for `uv tool install` (uv validates before following redirects) and is there only as defense-in-depth. Verified locally: - `uv tool install <server>/cli/wheel/agnes_the_ai_analyst-2.0.0-py3-none-any.whl` succeeds - `/install` HTML now renders the versioned URL; `/cli/agnes.whl` no longer appears in the rendered snippet * fix(cli): remove /cli/agnes.whl alias entirely — it only confused users The bareword alias was never actually usable: - `uv tool install <server>/cli/agnes.whl` fails at filename validation before any HTTP fetch, so neither the Content-Disposition header nor a 302 redirect rescued it. 
- The 302-to-versioned-path fallback left a visibly "working" URL in browser / curl -L contexts, which is exactly how the original bug got reported in the first place ("the URL loads, why doesn't install work?"). Remove the endpoint and scrub all remaining references. The only CLI wheel URL is now `/cli/wheel/{filename}` with the real PEP 427 filename, which the setup-instructions template already generates server-side. Existing tests that referenced /cli/agnes.whl become negative tests ("must not appear") so we don't regress. * feat(cli): --version flag; sync --dry-run + progress indicator (#38) * feat(cli): add --version / -V flag Prints `da <version>` from package metadata (importlib.metadata). Falls back to "unknown" when the package is not installed (e.g. running from a source checkout without `uv pip install -e .`), instead of crashing. Eager typer callback, so `da --version` exits before subcommand resolution and does not require any auth/config. * feat(cli): da sync --dry-run + X/N progress indicator --dry-run reports what would be downloaded/uploaded without hitting the API or writing local state. Supports the full flag set (--table, --json, --upload-only); JSON shape is {"dry_run": true, "would_download": [...], "summary": {...}}. Progress bar now shows "[X/N] Downloading <table>..." with a Rich BarColumn + TaskProgressColumn + TimeElapsedColumn instead of a bare spinner — makes long syncs visible. * feat(cli): durable sync + server gzip + auto-update check (#41) * fix(sync): atomic writes + manifest hash verification + retry on transient errors Three durability hooks around stream_download and the sync command: 1. Atomic writes. stream_download now streams into `<target>.tmp` and calls os.replace() on success, so the real target file never exists in a half-written state. On failure the tmp is unlinked — no cleanup leftovers, no guard needed at read time. 2. Retry with backoff. 
Transient errors (ConnectError, ReadError, WriteError, RemoteProtocolError, TimeoutException, 5xx) are retried up to 3× with 0.3s / 1s / 3s backoff. 4xx (auth, 404) surfaces immediately — retrying those is pointless. 3. Manifest-hash verification. After download, sync.py computes MD5 of the target (same 8KiB chunking as app/api/sync.py:_file_hash) and compares against `server_tables[tid]["hash"]`. Mismatch ⇒ unlink, record error, skip state commit. The PAR1 structural check survives as a fallback for legacy manifests without a hash. Also makes _rebuild_duckdb_views tolerant: single broken parquet is skipped with a stderr warning instead of killing the whole rebuild. Supersedes #40 — this commit is a strict super-set (hash check + PAR1 fallback + atomic write + retry). #40 can be closed without merging. * perf(server): enable GZipMiddleware for JSON / HTML responses GZipMiddleware at minimum_size=1024 shaves bandwidth on manifest-style JSON endpoints (/api/sync/manifest, /api/version, …) and the /install HTML preview. Parquet file downloads are already columnar-compressed so the middleware sees limited benefit there — but it doesn't hurt, httpx on the client side decompresses transparently. Placed after session middleware so gzip wraps the session-Set-Cookie response too, and before CORSMiddleware so compression is applied to both cross-origin and same-origin responses. * feat(cli): auto-check for newer CLI version on startup Server side - GET /cli/latest returns {version, wheel_filename, download_url_path} for whatever wheel is currently in AGNES_CLI_DIST_DIR. Public, cacheable, no secrets — consumed by the CLI auto-update probe. Client side - New cli/update_check.py: reads /cli/latest with a 3s timeout, caches the result in $DA_CONFIG_DIR/update_check.json for 24h. Cache is invalidated when the installed version changes (e.g. after a fresh `uv tool install`) so stale "you're behind" warnings don't linger. 
- Root typer callback fires the probe before subcommand dispatch; any failure is swallowed so a bad network never blocks a working command. - Outdated → one-line stderr warning: [update] da 2.0.0 is out of date — latest on this server is 2.1.0. Upgrade: uv tool install --force <server>/cli/wheel/<…>.whl - Disable with DA_NO_UPDATE_CHECK=1. * fix(pr-review): None-guard the upgrade line + skip gzip on parquet paths Two follow-ups from Devin review on #41. 1. format_outdated_notice(UpdateInfo(download_url=None)) emitted literal "uv tool install --force None" — copy-pasting that fails. Drop the upgrade snippet when the URL is absent and keep only the version line. 2. GZipMiddleware compressed everything over 1024 bytes, including the parquet FileResponses served by /api/data/{tid}/download, /cli/wheel/{name}, and /cli/download. Parquet is already columnar- compressed — gzip there is pure CPU + latency with no size win, and /api/data bodies can reach hundreds of MB. Wrap GZipMiddleware in a small _SelectiveGZipMiddleware that skips those path prefixes and delegates the rest to the stock middleware. JSON / HTML endpoints (manifest, /install, /api/version, …) still get compressed. * release: bump to 2.1.0 — unify AGNES_VERSION with pyproject.toml version (#42) Before: two independent version systems. pyproject.toml carried semver (2.0.0 → wheel filename → `da --version`) while release.yml injected CalVer into AGNES_VERSION (e.g. 2026.04.155 → /api/version). Users saw different strings in the CLI vs. the /install page, and the CLI auto- update check couldn't tell "new deploy, same package version" apart from "new package version". Make pyproject.toml [project].version the single product-version source of truth. release.yml extracts it and feeds AGNES_VERSION, so every surface (/api/version, /api/health, /cli/latest, `da --version`) agrees on one number. The CalVer tag keeps doing what CalVer is for: release identity on the git tag and Docker image tag (versioned_tag). 
Also wires AGNES_TAG through the build: release.yml → Dockerfile ARG → env, so /api/version.image_tag finally reports the actual image tag instead of the "unknown" fallback. Bump to 2.1.0 to reflect the PRs shipped on ps/wheel-name-fix: durable sync (atomic writes + manifest MD5 + retry), server GZip, CLI auto- update probe, setup snippet PEP 427 URL. * fix(pr-review): directional version compare in is_outdated() UpdateInfo.is_outdated() used `self.latest != self.installed`, which fires in both directions. If the server is rolled back or the user connects to an older deployment, the CLI would warn "out of date" and — worse — the formatted notice would prompt uv tool install --force <older-version>.whl i.e. an unintended downgrade. Compare with packaging.version.Version (PEP 440 aware, handles pre- release tags). Fall back to dotted-int tuple compare if packaging is somehow missing, and return False on unparseable strings — better to miss an upgrade hint than to silently suggest a downgrade. Adds 4 test cases: installed older (True), installed newer (False), 10.0.0 vs 2.1.0 lexical-compare trap (correct), unparseable strings (False). Addresses Devin review on #43. * fix(pr-review): read FastAPI app version from package metadata app/main.py:80 hardcoded `version="2.0.0"` in the FastAPI constructor. After #42 bumped pyproject.toml to 2.1.0, /api/version, /cli/latest, and `da --version` all reported 2.1.0 while /openapi.json and the /docs UI still advertised 2.0.0. Read `agnes-the-ai-analyst` version via importlib.metadata (same pattern cli/main.py:_cli_version already uses), with a `"dev"` fallback when the package is not installed (source checkout). This way pyproject.toml stays the single source of truth across every version surface — /openapi.json now tracks the bump automatically. Adds a dedicated test file to pin this behavior so a future regression to a hardcoded literal fails at CI. Addresses second Devin finding on #43. 
* fix(pr-review): _fmt_bytes PiB label + negative cache in update_check Two more follow-ups from Devin review on #43. 1. _fmt_bytes off-by-unit. The old loop exited at TiB but the fallback labelled PiB, so 1 PiB rendered as "1024.0 PiB". Restructure: put every unit inside the loop (KiB through EiB) so the division count always matches the label. Covers up to 1 ZiB cleanly; anything beyond renders as "<big>.0 EiB" rather than crashing. 2. Negative cache for failed /cli/latest probes. On a corporate firewall / VPN that silently drops packets, the 3s HTTP timeout fired on every `da` invocation. Writing a `latest=None` cache entry with a 5-minute TTL caps that at one probe per 5min. Successful probes still use the 24h TTL. Reading logic branches on whether the cached `latest` is None. Adds TestFmtBytes (2 cases: small/medium sizes and the PiB/EiB fallback regression), plus two TestSync update-check cases covering negative- cache reuse and TTL expiry.	2026-04-22 21:18:18 +02:00
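The _fmt_bytes restructuring described in the last fix above (every unit inside the loop, so the number of divisions always matches the label) can be sketched as follows. The function name `fmt_bytes` and the exact output format are assumptions for illustration, not the project's code.

```python
def fmt_bytes(n):
    """Format a byte count with binary units, KiB through EiB."""
    if n < 1024:
        return f"{int(n)} B"
    value = float(n)
    units = ("KiB", "MiB", "GiB", "TiB", "PiB", "EiB")
    for unit in units:
        value /= 1024  # one division per unit step, label stays in sync
        if value < 1024 or unit == units[-1]:
            # The last unit absorbs anything larger instead of crashing.
            return f"{value:.1f} {unit}"
    return f"{value:.1f} {units[-1]}"  # unreachable; kept for type checkers
```

For example, `fmt_bytes(1024**5)` gives "1.0 PiB" in this sketch, rather than the "1024.0 PiB" that results when the loop stops one unit before the fallback label.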
dependabot[bot]	6e93461918	chore(deps): bump python-multipart from 0.0.24 to 0.0.26 Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.24 to 0.0.26. - [Release notes](https://github.com/Kludex/python-multipart/releases) - [Changelog](https://github.com/Kludex/python-multipart/blob/master/CHANGELOG.md) - [Commits](https://github.com/Kludex/python-multipart/compare/0.0.24...0.0.26) --- updated-dependencies: - dependency-name: python-multipart dependency-version: 0.0.26 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-04-21 13:26:19 +00:00
dependabot[bot]	043ae4b378	chore(deps): bump authlib from 1.6.9 to 1.6.11 Bumps [authlib](https://github.com/authlib/authlib) from 1.6.9 to 1.6.11. - [Release notes](https://github.com/authlib/authlib/releases) - [Changelog](https://github.com/authlib/authlib/blob/v1.6.11/docs/changelog.rst) - [Commits](https://github.com/authlib/authlib/compare/v1.6.9...v1.6.11) --- updated-dependencies: - dependency-name: authlib dependency-version: 1.6.11 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-04-17 00:41:27 +00:00
ZdenekSrotyr	510608813c	test: add shared test infrastructure (fixtures, factories, assertions, mocks) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 11:05:35 +02:00
ZdenekSrotyr	86fe4b411d	fix: upgrade urllib3 1.26→2.6.3 — resolves all 4 Dependabot security alerts Removed kbcstorage from all dependency groups (optional + dev) so urllib3 is no longer pinned to <2.0. Legacy Keboola client is available via manual install: pip install kbcstorage	2026-04-09 14:53:30 +02:00
ZdenekSrotyr	809448e02b	fix: move kbcstorage to optional dep — unblocks urllib3 security updates kbcstorage pins urllib3<2.0.0 which blocks Dependabot security patches. Moved to [project.optional-dependencies] keboola-legacy since the primary extraction path uses the DuckDB Keboola extension, not kbcstorage. Legacy fallback uses lazy import — app works without it installed.	2026-04-09 14:46:50 +02:00
ZdenekSrotyr	0279cc06fa	refactor: consolidate deps into pyproject.toml, remove requirements.txt - All dependencies now in pyproject.toml [project.dependencies] - Dev/test deps in [project.optional-dependencies] dev and [tool.uv] - Dockerfile uses uv pip install . from pyproject.toml - CI uses uv pip install ".[dev]" - Deleted requirements.txt and requirements-dev.txt - Updated README, CLAUDE.md install instructions - Enhanced .dockerignore (exclude tests, docs, infra from image)	2026-04-09 13:17:59 +02:00
ZdenekSrotyr	224635b88d	security: fix auth (argon2, cookie, JWT), CORS, session middleware, pyproject.toml	2026-04-08 12:08:52 +02:00
ZdenekSrotyr	5ee12d78e7	refactor: final cleanup — delete legacy auth, clean deps, fix hash, migrate to uv - Delete root auth/ directory (legacy Flask providers, orphaned) - Clean requirements.txt: remove Flask, gunicorn, authlib, sendgrid, anthropic, openai, argon2-cffi (9 unused deps) - Fix hash computation in orchestrator: MD5 of parquet mtime+size (CLI sync now skips unchanged tables correctly) - Migrate pip → uv in CLAUDE.md, scripts/init.sh, pyproject.toml - Sync pyproject.toml dependencies with requirements.txt 578 tests passing.	2026-03-31 19:18:30 +02:00
ZdenekSrotyr	3701130a11	feat: add Docker, CLI tool, scheduler, and agent skills - Dockerfile (uv-based) + docker-compose.yml (3 services) - CLI tool 'da' with commands: auth, sync, query, status, admin, diagnose, skills - Scheduler sidecar service (replaces systemd timers) - pyproject.toml for uv distribution - Built-in skills (setup, troubleshoot) for AI agents - 17 CLI tests, 75 total tests passing	2026-03-27 15:30:03 +01:00


133 commits