agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	917f9aaef0	release: 0.47.2 — restore #218 + #219 fixes silently reverted by #217 (#225 ) ## Summary Smoke-testing the just-shipped 0.47.1 against production exposed two regressions: 1. `agnes query --remote "SELECT FROM unit_economics WHERE bad_col=1"` returned `Table "unit_economics" must be qualified` (the OLD error) instead of `Unrecognized name: bad_col` (the #218 fix's intended behavior). 2. `agnes query "DESCRIBE unit_economics"` showed only DuckDB's misleading `Did you mean order_economics?` with no Agnes hint paragraph (the #219 fix is missing). Root cause: PR #217's squash merge (`506a378c`) carried stale snapshots of `app/api/query.py` and `cli/commands/query.py` from before #218 and #219 merged. The rebase-and-merge auto-merged those files cleanly (no conflict markers) but the result silently reverted both fixes. Restore the two changes verbatim. Tests for both fixes already on main and continue to pass against the restored code. ## Test plan - [x] `pytest tests/test_api_query_guardrail.py tests/test_cli_query.py` — clean - [x] Manual repro against prod after deploy: both flows now surface the intended diagnostic. <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/225" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review"> </picture> </a> <!-- devin-review-badge-end -->	2026-05-07 19:57:18 +02:00
ZdenekSrotyr	506a378c3a	release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217 ) ## Summary Brings the Keboola connector to feature parity with the legacy internal data-analyst's per-table sync strategies. Closes the four documented gaps from the spec branch (`zs/keboola-connector-specs`): - Typed parquet in the legacy SDK extraction path — column types from Keboola Storage metadata (provider cascade `user > ai-metadata-enrichment > keboola.snowflake-transformation`) survive the CSV → parquet roundtrip; invalid date strings (`'0000-00-00'`) and invalid numeric strings (`'Non-Manager'`) become NULL while keeping the column's typed schema. Pre-fix everything was VARCHAR. - Incremental sync via Storage API `changedSince` — opt-in per table; pulls only delta rows, merges into the existing parquet by `primary_key` (drop_duplicates with keep='last'). Cuts daily extraction from O(full table) to O(delta). - Partitioned sync — flat per-partition layout `data/<table>/<key>.parquet` (e.g. `2026_05.parquet`), per-affected-partition merge for daily updates, chunked initial load with 1-day overlap and 2-empty-chunk stop heuristic. - `where_filters` — server-side row filter with date placeholders (`{{today}}`, `{{last_3_months}}`, `{{start_of_3_months_ago}}`, etc.) resolved at sync time. Force the SDK path; reject `incremental + where_filters` combination at API layer (changedSince already filters temporally). ## Architecture - Schema migration v25 → v26: 7 new columns on `table_registry`. Existing `sync_strategy` column reused (pre-v26 it was inert catalog metadata; post-v26 the extractor dispatches off it). - Per-table dispatcher in `extractor.run()` routes to one of `_extract_via_extension` (full_refresh + extension), `_extract_via_legacy` (full_refresh + filters or extension fallback), `extract_incremental`, or `extract_partitioned`. - API conflict policy: `incremental + where_filters` → 422; `partitioned + query_mode='remote'` → 422; `partitioned ⇒ partition_by required`. - Admin UI: third "Direct extract (Storage API)" radio in the Keboola Register / Edit modals, alongside existing "Whole table (extension)" and "Custom SQL". When selected, exposes a v26 sync-strategy panel with conditional fields per strategy. ## Test plan - [x] Unit + module — 134 v26 tests covering migration, repo, parquet_io, where_filters, incremental (compute_changed_since + merge_parquet + extract_incremental E2E), partitioned (key derivation + merge_partition + chunked windows + extract_partitioned E2E), extractor dispatcher, admin API validators, PUT field clearing, registry-shape → dispatcher bridge - [x] HTML form structure — all v26 inputs + visibility classes + JS payload fields verified in rendered template - [x] Real Keboola roundtrip — registered a small test table as `sync_strategy='incremental'` against a test Storage project, triggered two syncs: - Sync 1: `changedSince=None` → full pull → 9 rows typed parquet - Sync 2: `changedSince=last_sync - 1d window` → 9 delta rows merged with 9 existing → 9 after dedup on primary_key (PK merge confirmed) - [x] Browser UX — agent-browser session against a local uvicorn: login → admin/tables → register modal → switch radios → verify field visibility per strategy → submit → edit existing row → switch to Direct/Incremental → save → confirm DB persistence - [x] Regression — no regressions in the broader 3252-test suite (3 pre-v26 tests updated for the deprecation-marker removal + schema-version bump; 2 pre-existing environment-sensitive test failures unrelated to this change) ## Bugs caught + fixed during E2E The browser + real-Keboola roundtrip exposed four bugs the unit tests missed: 1. JS visibility race — two competing `forEach` loops set `display=''` then `display='none'` on form elements sharing `kb-strategy-incremental kb-strategy-partitioned` classes (window_days + max_history_days are reused across strategies). Fix: single-pass selector with class-based visibility resolver. 2. PUT cannot clear field — pre-v26 `updates = {k: v ... if v is not None}` collapsed "omitted from body" and "sent as null" into the same case, so admin couldn't switch a partitioned row back to full_refresh and have stale `partition_by` clear. Fix: `model_dump(exclude_unset=True)`. 3. Subprocess DB lock conflict — `_read_last_sync` reopened `system.duckdb` while the parent server held the write lock (subprocess contract at `app/api/sync.py:_run_sync` line 260). Fix: parent injects `__last_sync__` into table_config before subprocess spawn. 4. Wrong KBC table_id — `extract_incremental` / `extract_partitioned` built the Storage API table_id from the registry row's slugified `id` (`circle_inc`) instead of `bucket.source_table` (`in.c-finance.circle`), producing 404s. Fix: prefer `bucket+source_table`; fall back to `id` only when bucket empty. ## Operator notes - Existing tables stay on `full_refresh` after migration; admins opt individual tables in via `agnes admin register-table --sync-strategy ...`, the Keboola Edit modal, or `POST/PUT /api/admin/registry`. - `merge_parquet` and `merge_partition` use `pd.concat + drop_duplicates`, loading both existing and delta into pandas RAM. For tables in the multi-million-row range this may OOM — switch to `partitioned` strategy for those (per-partition merge keeps memory bounded). Documented in `### Internal` of the changelog entry. - Date placeholders are resolved at sync time, not register time — a typo'd `{{lasst_week}}` is accepted at register and surfaces only when the next sync runs. By design (rolling windows need late-binding). ## Spec source The four corresponding plans on the `zs/keboola-connector-specs` branch under `docs/superpowers/plans/2026-05-07-0[1-4]-*.md` capture the design rationale and link back to internal repo references for each subsystem. <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/217" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review"> </picture> </a> <!-- devin-review-badge-end -->	2026-05-07 19:01:27 +02:00
ZdenekSrotyr	aa5921da67	release: 0.47.0 — source-agnostic catalog metadata + cache discipline (#223 ) ## Summary - Catalog enrichment for `query_mode='remote'` rows: `rows`, `size_bytes`, `partition_by`, `clustered_by` per table (BQ + Keboola providers). - `/api/v2/schema/{id}` cache miss: 2 BQ jobs → 1 (-50%) via shared `fetch_bq_columns_full`. - All four catalog/schema/sample/metadata caches flush on registry change; single-row re-warm scheduled. - Automatic cache warmup at server startup (bounded concurrency, opt-out via `AGNES_SKIP_CACHE_WARMUP=1`). - SSE-driven freshness toolbar on `/admin/tables` with progress bar, log, and per-row badge. - New admin doc `docs/admin/query-modes.md` — single source of truth on `local` / `remote` / `materialized` choice. Closes #155. Closes #156. ## Test plan - [x] 65+ targeted tests pass across 11 new test modules + 3 modified ones. - [x] No DB migration; no wire-break; `MIN_COMPAT_CLI_VERSION` unchanged. - [ ] Reviewer: register a remote BQ table via `/admin/tables`, observe the toolbar populates within ~2 s and the per-row badge transitions warming → fresh. - [ ] Reviewer: trigger `Re-warm all`, verify SSE log scrolls and `cacheWarmupBar` progresses. - [ ] Reviewer: edit a registered row's bucket, verify `agnes schema <id>` returns updated columns immediately (no 1-hour staleness). - [ ] Reviewer: confirm `agnes admin register-table --query-mode remote` prints the new IAM-smoke-check hint. ## Notable design decisions - BigQuery `INFORMATION_SCHEMA.TABLE_STORAGE` is the only valid scope for size+rows (verified live 2026-05-07; dataset-scoped doesn't exist). Region resolved from `instance.yaml.data_source.bigquery.location` → `bq.client().get_dataset(...)` → fall back to legacy `__TABLES__`. - VIEW handling: TABLE_STORAGE returns no rows for views, fall through to `__TABLES__` (also empty) → `TableMetadata(rows=None, size_bytes=None, partition_by=..., clustered_by=...)`. Null size signals analyst Claude to apply existing CLAUDE.md guidance. - `size_bytes` is `active_logical_bytes + long_term_logical_bytes` — full BQ scan reads both; reporting only active undercounts aged partitioned tables. - Source-agnostic provider seam: per-source `connectors/<source>/metadata.py:fetch(MetadataRequest)`; dispatcher in `app/api/v2_catalog.py:_metadata_provider_for` lazily imports per source_type so a Keboola-only deployment doesn't pay the BQ-extension import cost. - Warmup non-blocking: FastAPI `lifespan` schedules `asyncio.create_task(_warm_catalog_caches_bg)` before `yield`. Per-row failures isolated. ## Out of scope - Profile / column histograms / dimension cardinality for remote tables (separate issue). - Onboarding nudge ("you have 0 remote tables, consider registering some BQ ones") — separate UX call. - Provider plug-in registration via entry-points (the dispatch table is a hardcoded if-tree today; one line per future source). ## Release Bumps `pyproject.toml` 0.46.1 → 0.47.0 (main shipped 0.46.0 + 0.46.1 during this PR — see commit `d98976ec`). New CHANGELOG section under `## [0.47.0] — 2026-05-07`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/223" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review"> </picture> </a> <!-- devin-review-badge-end -->	2026-05-07 18:33:55 +02:00
ZdenekSrotyr	751cc25327	release: 0.46.5 — agnes describe -n parses, server sanitizes NaN (#224 ) ## Summary Two bugs in `agnes describe` surfaced from a real analyst session following the CLAUDE.md agent-rails discovery workflow. Together they break `agnes describe` end-to-end for any analyst (or analyst-AI) who follows the documented form. ### A) CLI parsing `agnes describe TABLE -n 5` failed with `Missing argument 'TABLE_ID'`. Root cause: the command was registered as a `Typer.Typer` subcommand group via `app.add_typer(describe_app, name="describe")` + `@describe_app.callback(invoke_without_command=True)`, and that pattern mis-parses positional + short-int option in some orderings. Same pattern in `cli/commands/schema.py` works only because schema has no INTEGER short option. Fix: switch to flat `@app.command("describe")`. ### B) Server NaN `/api/v2/sample/<id>` (called by `agnes describe`) returned HTTP 500 with `ValueError: Out of range float values are not JSON compliant: nan` whenever a row contained NaN. Fix: sanitize NaN/±inf to None before JSON serialization. ## Test plan - [x] `pytest tests/test_cli_describe.py` — added regression tests pinning `-n` parsing on either side of the positional. - [x] `pytest tests/test_api_v2_sample.py` — added regression test for NaN row → JSON `null` (not 500). <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/224" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review"> </picture> </a> <!-- devin-review-badge-end -->	2026-05-07 18:16:21 +02:00
ZdenekSrotyr	7fc5365891	release: 0.46.3 — self-heal session pipeline + clearer diagnose (#220 ) ## Summary Verified against production: `claude -p` headless mode doesn't fire SessionEnd hooks (proven via `--output-format stream-json --include-hook-events`: zero `SessionEnd` events), so any session JSONLs from `-p` invocations stay orphaned locally and never reach the server. Fix: add `agnes push --quiet` as a third SessionStart entry — symmetric self-heal alongside the existing `agnes pull` entry. Existing workspaces pick this up on their next `agnes init` via the marker-based migration already in `cli/lib/hooks.py`. Separately: a colleague's fresh install showed `agnes diagnose` warning "uploads are not being processed", which led them to suspect their `agnes push` was broken. The warning is actually about the LLM-based `verification-detector` backlog (uploads themselves were arriving fine — confirmed by 23+3 JSONLs landed on the server while the warning was firing). Reword the warning to "verification-detector backlog" + add `last_processed` to the diagnose dict so operators don't have to grep logs to confirm. ## Test plan - [x] `pytest tests/test_lib_hooks.py` — updated count + added `agnes push in SessionStart` assertion. - [x] `pytest tests/test_setup_hooks_template.py` — updated. - [x] `pytest tests/test_clean_install_integration.py` — updated. - [x] `pytest tests/test_health_session_pipeline.py` — updated warning text + asserted `last_processed` field. <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/220" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review"> </picture> </a> <!-- devin-review-badge-end -->	2026-05-07 17:41:22 +02:00
ZdenekSrotyr	378ee40459	release: 0.46.1 — surface real BQ error from remote_estimate_failed retry (#218 ) ## Summary When `agnes query --remote` references a column that doesn't exist on the FROM table, users were seeing `Table "<id>" must be qualified with a dataset` instead of the actually-useful `Unrecognized name: <column>` from BigQuery. Surface the first-attempt diagnostic now; keep the second-attempt context as `underlying_original`. Reproduced against production: ``` $ agnes query --remote "SELECT COUNT(*) FROM unit_economics WHERE authorize_date = DATE '2025-05-06'" Error: remote_estimate_failed (HTTP 400) message: Could not estimate scan size for this query. underlying: 400 ... Table "unit_economics" must be qualified with a dataset. ``` (`unit_economics` has `authorize_timestamp`, not `authorize_date`.) ## Test plan - [x] New `test_remote_estimate_failed_surfaces_first_error_when_attempts_differ` asserts the first-attempt message wins, second-attempt is preserved as `underlying_original`, hint points to `agnes schema`. - [x] Existing `test_guardrail_returns_400_remote_estimate_failed_on_double_parse_error` still passes (both attempts mocked to identical error). - [x] `pytest tests/test_api_query_guardrail.py` clean. <!-- devin-review-badge-begin --> --- <a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/218" target="_blank"> <picture> <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1"> <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review"> </picture> </a> <!-- devin-review-badge-end -->	2026-05-07 16:54:45 +02:00
ZdenekSrotyr	28430ced09	Keboola cutover: native parquet path + sync correctness + auto-discover protection (#190 ) * fix: cutover regressions + parallel Keboola legacy fallback Bundled fixes from a fresh-deploy run on a Keboola Storage backend with the block-shared-snowflake-access feature flag — DuckDB Keboola extension's per-table scan can't access bucket schemas, so the legacy kbcstorage Storage-API client is the only working path. CUTOVER REGRESSIONS - agnes pull hash mismatch on every Keboola local-mode table — src/orchestrator.py:_update_sync_state stored md5(mtime+size)[:12] while the CLI compares against full 32-char content MD5. Now stores the same content MD5 the materialized SQL path already used. - Trailing-slash sanitization in connectors/keboola/access.py and extractor.py — DuckDB Keboola extension's ATTACH fails when the URL ends in / (canonical form). - src/profiler.py:TableInfo.description becomes optional — two call sites instantiated without it, crashing the profiler pass. - scripts/ops/agnes-auto-upgrade.sh: chown on UID change — older images ran as root, current runs as agnes (uid 999). Reads target uid:gid from /etc/passwd inside the new image and chowns ${STATE_DIR}, /data/extracts, /data/analytics when the digest moves. - POST /api/sync/trigger is now singleton per process — two near-simultaneous trigger calls each forked an extractor subprocess, fought for extract.duckdb's file lock, starved uvicorn, flipped the container to unhealthy. Trigger now returns 409 (sync_already_in_progress) when held; _run_sync acquires non-blocking. PARALLEL LEGACY FALLBACK - Process pool fan-out for the _extract_via_legacy queue (default 8 workers, override via AGNES_KEBOOLA_PARALLELISM). Process pool, not thread pool, because connectors/keboola/client.py:export_table does os.chdir(temp_dir) — process-global, so threads raced and slice files landed in the wrong directory ("[Errno 2] No such file or directory: '<job_id>.csv_X_Y_Z.csv'"). - Extractor subprocess timeout 1800s -> 3600s (configurable via AGNES_EXTRACTOR_TIMEOUT_SEC). 28+ tables × multi-minute Keboola export jobs need the headroom on telemetry-class projects. - Process group cleanup on timeout — Popen(start_new_session=True) puts the extractor in its own group. On timeout the parent SIGTERMs the group (10s grace) then SIGKILLs stragglers. Without this, the pool workers were reparented to PID 1 and continued holding open Keboola Storage export jobs. Inline extractor script also installs a SIGTERM -> sys.exit(143) handler so the with ProcessPoolExecutor(...) block __exit__ runs cleanly. Tests: existing tests that patched subprocess.run updated to patch subprocess.Popen with a _FakePopen stand-in (same exit-code-injection contract). Two tests that exercised the parallel path forced AGNES_KEBOOLA_PARALLELISM=1 to keep mocks alive (mocks don't ride into ProcessPoolExecutor subprocesses). Squashed onto current main (was 7 commits + multi-commit CHANGELOG + agnes-auto-upgrade.sh conflicts; squash avoids per-commit conflict resolution against main's flat-mount STATE_DIR refactor and 0.38.0 release cut). * feat(keboola): Storage API direct extract path; drop extension data path The DuckDB Keboola extension's COPY routes through Keboola QueryService, which is unreliable on linked-bucket projects (extension v0.1.6 fixes that case but isn't yet in the community CDN, and pre-fix any project with the block-shared-snowflake-access feature flag couldn't see bucket schemas at all). Move the extract path off the extension entirely and talk to the Storage API directly via signed-URL download — works on any project, regardless of extension state. connectors/keboola/storage_api.py (NEW) Lightweight client built on requests.Session. Three endpoints: - POST /v2/storage/tables/{id}/export-async (kicks off job) - GET /v2/storage/jobs/{id} (poll until done) - GET /v2/storage/files/{id}?federationToken=1 (signed URL detail) - GET <signed_url> (download bytes) Supports sliced exports (manifest + per-slice signed URLs) and gzipped payloads. ExportFilter dataclass mirrors the Keboola filter spec (whereFilters / columns / changedSince / limit) and handles JSON round-trip with the registry's source_query column. Token redaction in error messages. Bounded exponential backoff on job polling. No cloud-SDK dependency on the data path; thread-safe. connectors/keboola/extractor.py - materialize_query() rewritten: takes bucket/source_table/source_query (JSON filter spec), exports via KeboolaStorageClient, converts CSV to parquet via DuckDB, atomic os.replace. Same return shape so sync.py downstream code stays uniform with the BQ branch. - _extract_via_legacy() also moved to Storage API direct (kept the name for caller compatibility with _legacy_worker / the parallel batch extractor). Per-call temp directories — no os.chdir, threads don't race. app/api/sync.py _run_materialized_pass for source_type='keboola' rows now constructs a KeboolaStorageClient (replaces KeboolaAccess) and passes bucket/source_table/source_query to materialize_query. Reuses one client across rows for HTTP keep-alive. Sources keboola URL from env too (KEBOOLA_STACK_URL) when instance.yaml doesn't have stack_url configured. cli/commands/admin.py discover-and-register defaults Keboola rows to query_mode='materialized' (NULL source_query = full table), matching the v26 migration's unification of the local/materialized split for Keboola. BigQuery and Jira keep their per-source defaults. src/db.py Schema bump 25 → 26. Migration: UPDATE table_registry SET query_mode='materialized' WHERE source_type='keboola' AND query_mode='local'. NULL source_query on those rows means "full table export" — same effective behavior the local mode provided, but now via Storage API instead of the extension. pyproject.toml kbcstorage dep stays (admin-side bucket/table list still uses the SDK in app/api/admin.py / connectors/keboola/client.py); only the data path is migrated off the SDK. Comment updated to reflect the new boundary. tests - test_keboola_storage_api.py (NEW, 19 tests): ExportFilter parsing, HTTP client (token redaction, retry logic, polling), download_file (single, gzipped, sliced), end-to-end export_table_to_csv. - test_keboola_materialize.py rewritten: mocks KeboolaStorageClient instead of FakeAccess; same atomic-write + zero-rows + unsafe-id contracts. - test_sync_trigger_keboola_materialized.py: registry rows now carry bucket+source_table+JSON-shape source_query. 114+ Keboola-impacted tests green locally. * test: schema version assertion bumped to 26 alongside the keboola query_mode migration * fix(keboola): cutover hot-patches surfaced on agnes-dev Five small fixes that were applied as in-container hot-patches during agnes-dev cutover and need to be on the source-of-truth image so a fresh upgrade does not undo them. - app/api/sync.py: auto-discover gate considers the WHOLE registry (any source, any mode), not just rows where source matches and query_mode is local. After the v25→v26 keboola materialized migration an instance can have 30 materialized rows and zero local rows; the previous gate kept re-firing _discover_and_register_tables every scheduler tick, creating duplicate auto-discovered rows with the wrong bucket prefix every time. - app/api/admin.py: _discover_and_register_tables reassembles the bucket as <stage>.<bucket-id> (e.g. in.c-finance) instead of dropping the stage prefix; default query_mode for keboola is now materialized (the v26 contract); validator allows NULL source_query for keboola materialized rows (full-table export via Storage API export-async, no SQL needed). - cli/commands/admin.py: register-table mirrors the server validator (NULL source_query allowed for source_type=keboola); --bucket help text generalized to cover both BQ dataset and Keboola bucket id. - connectors/keboola/extractor.py: max_line_size=64 MiB on read_csv_auto so embedded JSON / SQL cells (kbc_component_configuration in particular) do not trip the default 2 MiB ceiling. - connectors/keboola/storage_api.py: GCP backend support — when the Storage API returns a manifest whose slice URLs are gs:// references with a gcsCredentials block, rewrite to the JSON REST download endpoint and authenticate with the issued OAuth bearer token; redact tokens in any surfaced error string. * test: align with new keboola materialized + auto-discover-gate contracts - test_admin_keboola_materialized: rename test_register_keboola_materialized_rejects_missing_source_query → test_register_keboola_materialized_accepts_missing_source_query. v25→v26 introduced 'keboola materialized with NULL source_query means full-table export via Storage API export-async' as the default registration shape; the rejection case is no longer the contract. - test_sync_filter: add list_all() to _StubRegistry. The auto-discover gate in _run_sync now keys off the WHOLE registry (not just local rows) so materialized-only Keboola instances do not re-trigger discovery on every tick. * feat(keboola): native parquet export — skip CSV roundtrip Storage API export-async accepts fileType={csv,parquet}. Switching the materialized sync to parquet eliminates the CSV → DuckDB COPY → parquet roundtrip that pinned a single uvicorn worker over 4 GiB on multi-GB tables (read_csv with all_varchar + max_line_size=64MB has to materialize the whole CSV in memory before COPY can stream out a parquet). Snowflake UNLOAD on Keboola's side already produces typed, self-contained parquet files; the extractor downloads them and renames into place. Two cases: - Single-file export (small table): file_info.url points at one signed URL; download_file streams chunks straight to .parquet.tmp and we're done. No DuckDB. - Sliced export (Snowflake UNLOAD respects MAX_FILE_SIZE — 16 MiB default — so anything larger arrives as N parquet slices): each slice is a complete parquet file with its own footer; naive concat would corrupt them. download_file_slices keeps the slices as separate files in a tempdir, then DuckDB COPY (SELECT * FROM read_parquet([slice0, slice1, ...])) merges them into one consolidated parquet. DuckDB streams row groups during this — peak memory bounded to one row group (~1 MiB) regardless of source size. The legacy CSV path stays as the explicit opt-in via source_query= '{"file_type":"csv"}' for projects whose backend can't UNLOAD parquet (none known today; cheap escape hatch). Backward-compat alias KeboolaStorageClient.export_table_to_csv kept. Also fixes a latent bug in download_file's gzip detection: previous heuristic flagged any unencrypted file as gzipped, which would have corrupted parquet downloads at gunzip time. Name-suffix-only now. * fix: tempdir leak cleanup, every 0m schedule, /sync/trigger body shapes Three small self-contained fixes uncovered during agnes-dev cutover. - connectors/keboola/extractor.py: tempfile.TemporaryDirectory now uses ignore_cleanup_errors=True so a worker death mid-write doesn't leave multi-GiB stale slice trees on the boot disk. (12 GiB seen after a disk-full crash where TemporaryDirectory's own cleanup also raised and got swallowed.) - src/scheduler.py: is_valid_schedule accepts 'every 0m' (interval=0 = always due). Force-resync of an errored row no longer requires waiting out the default 'every 1h' interval — admin can flip the schedule, trigger, then flip back. - app/api/sync.py: POST /api/sync/trigger accepts both ['table_id'] (legacy bare-array body) and {'tables': ['table_id']} (matches the response payload shape, more discoverable for clients building requests by hand). Malformed bodies return 422 with a structured detail; null/missing means 'sync everything' as before. Tests cover: tempdir cleanup on raise (sliced parquet path), is_valid_schedule + is_table_due 'every 0m' acceptance, and trigger body parametrized matrix (8 valid shapes + 6 rejection cases). * fix: targeted-trigger filter in materialized pass + auto-upgrade defer Two operational gaps observed during agnes-dev cutover, in the same sync-routing area. - _run_materialized_pass now takes a 'tables' arg and skips rows not in the target set with reason='not_in_target'. POST /api/sync/trigger with a body of tables previously only scoped the legacy extractor subprocess — the materialized pass kept iterating every due materialized row, so an admin asking to re-sync kbc_job re-ran every other due materialized row alongside it. Match on registry id OR name (admins commonly pass either form). tables=None preserves the no-filter behavior. - New GET /api/sync/status (public, no auth) returns {locked: bool} off _sync_lock.locked(). agnes-auto-upgrade.sh probes this before docker compose up -d and exits 0 with a 'deferred recreate' log line if a sync is in flight — the next 5-min cron tick retries. Pre-fix, an auto-upgrade triggered mid-sync would recreate the uvicorn worker and kill the in-flight extractor / Snowflake-UNLOAD download (observed when kbc_job's first 7-day retry got SIGKILLed). Connection failures in the probe fall through to the upgrade — being stuck on a wedged image is worse than interrupting a hypothetical sync. * fix: auto-discover protects admin overrides + surfaces drift Two real-world incidents on agnes-dev drove this: 1. kbc_job was registered manually with the correct (in.c-kbc_telemetry, kbc_job) coordinates. A naive auto-discover re-run would have inserted a SECOND kbc_job row at the slugified id 'in_c-keboola-storage_kbc_job' (where Keboola's discovery places it) — and that row's Storage API export-async 404s. 2. An earlier auto-discover bug stripped the stage prefix from bucket ids ('c-finance' instead of 'in.c-finance'), inserting 137 rows whose syncs all failed. Fix: - _discover_and_register_tables now builds a plan first (_build_keboola_discovery_plan) classifying each discovered table into one of new / existing_match / existing_drift / invalid, then executes only the 'new' bucket. Drift rows are reported with both sides of the disagreement plus drift_kind: - same_id_diff_coords: registry has the same id but different bucket / source_table (admin migrated coords inline). - name_collision: discovery's slugified id differs from any registry id, but the discovered .name matches an existing row's .name (case-insensitive). Catches the kbc_job case. - Bucket detection now prefers the API's authoritative bucket_id field (separate field on the Keboola tables.list response, normalised by KeboolaClient.discover_all_tables). Falls back to id-string parsing only when bucket_id is missing (older fallback path inside discover_all_tables). - Endpoint POST /api/admin/discover-and-register?dry_run=true returns the plan without writing — would_register, drift, invalid lists. Lets an operator audit before merging discovery with a registry that has admin overrides. Removed 'every 0m' from test_register_request_rejects_malformed_sync_schedule — the runtime started accepting it in the previous commit (force-resync override) and the validator follows suit. * feat(keboola): AGNES_TEMP_DIR routes tempfiles off overlayfs /tmp The container's /tmp lives on the boot disk's overlayfs (29 GiB on agnes-dev, shared with /var). Snowflake UNLOAD of a wide table writes slices into per-call /tmp tempdirs that fill multi-GiB / many-slice exports long before the dedicated data disk fills. agnes-dev hit 100% boot-disk while the 20 GiB data disk had 15 GiB free. connectors.keboola.storage_api.get_temp_root() reads AGNES_TEMP_DIR; mkdirs the target on first use; unset / empty / unwritable falls back to None (system tempdir, OSS-pre-fix behaviour). Both materialize_query (parquet path) and _extract_via_legacy (CSV fallback) and the sliced-CSV concat path in storage_api use the helper now. docker-compose.yml defaults AGNES_TEMP_DIR=/data/tmp on app, scheduler, and extract services. The data volume is the dedicated disk in production layouts and a plain docker volume in single-disk dev/laptop setups — same blast radius as the previous /tmp default on the latter, no regression.	2026-05-07 12:12:14 +02:00
ZdenekSrotyr	c97fd504c5	release: 0.45.0 — easy-wins bundle (#84 #164 #177 #178 #203 #204 ) Operator-and-analyst quality bundle: a security fix for the optional Telegram bot, two CLI gaps closed, and three rounds of UX polish on `agnes diagnose` and `agnes pull` so non-TTY consumers (CI runners, Claude Code SessionStart hooks, sub-agent watchdogs) get readable, actionable signal. - Pairing-code RNG: random.choices -> secrets.choice (CSPRNG). - Telegram script runner: refuse out-of-shape usernames before sudo -u. CLAUDE.md.bak.<ISO-timestamp> before regenerating. - agnes admin unregister-table <id> -> DELETE /api/admin/registry/{id} - agnes admin update-table <id> --field=value ... -> PUT /api/admin/registry/{id} response but never promotes the headline. BQ billing-equals-data check downgraded warning -> info. default (5 s / 1 MiB vs 30 s / 10%) so sub-agent watchdogs don't kill the pull as a hung process. New env knobs: AGNES_PULL_PROGRESS_INTERVAL_{SECONDS,BYTES}. --include-schema (or ?include=schema) to opt back in. Tests: 120 passed across the touched modules, including new tests for each fix. Pre-existing failures on main (DB migration v1->v9, binary rename) are unrelated and not introduced here.	2026-05-07 11:43:16 +02:00
ZdenekSrotyr	df896816d8	chore: rename stale 'da' references to 'agnes' + CHANGELOG Drive-by docstring/comment cleanup in cli_artifacts.py and update_check.py. CHANGELOG entry for the auto-upgrade feature shipped in this branch.	2026-05-06 23:23:59 +02:00
ZdenekSrotyr	f4bc04958d	fix: Devin Review #1 — apply backtick mask to wrapping rewriter `_rewrite_user_sql_for_bigquery_query` does its own bare-name detection (mirroring the non-RBAC parts of `_bq_guardrail_inputs`). The backtick masking from #201 was applied to `_bq_guardrail_inputs` and the forbidden-table loop, but missed this third site — so a registered local-mode table name appearing as the table segment of a user-supplied full backtick path (e.g. ``\`prj.ds.orders\`` matching registered local ``orders``) tripped the cross-source guard and forced every backtick-path query into the 50-100× slower ATTACH-catalog fallback. Mask once at the top of the function, route both the BQ-name detection (line ~830) and the cross-source check (line ~867) through the masked copy. New regression test `test_local_name_inside_backtick_path_does_not_trip_cross_source` proves the wrapper now wraps when it should.	2026-05-06 21:06:21 +02:00
ZdenekSrotyr	824e3cb636	feat(query): registry-gate full backtick BigQuery paths (#201 ) Adds Pass 3 to `_bq_guardrail_inputs` that scans user SQL for full backtick paths `<project>.<dataset>.<table>` and gates them identically to the `bq."<dataset>"."<table>"` pass: - Project must match the configured BigQuery data project (`get_bq_access().projects.data`). Mismatch → HTTP 403 `bq_path_cross_project`. - Path must point at a registered row. Unregistered → HTTP 403 `bq_path_not_registered`. - Non-admin caller must hold a grant on the registered row's id. Missing grant → HTTP 403 `bq_path_access_denied`. Pre-fix, full backtick paths bypassed Agnes RBAC entirely — only the service account scope limited reach. Post-fix the boundary matches what `agnes catalog`-driven flows already enforce. Admin still bypasses the per-id grant check but cannot bypass registration or project match. Pass 3 also seeds `dry_run_set` for resolved registered paths so the cost-cap dry-run runs against the same physical table the user named — composing cleanly with the Layer 2 fail-fast fallback.	2026-05-06 18:02:53 +02:00
ZdenekSrotyr	c32be3fe96	fix(query): cap-guard fallback retries original SQL, fails fast (#201 ) When BQ rejects the rewritten dry-run SQL with `bq_bad_request`, the cap-guard now retries with the user's ORIGINAL SQL instead of building a synthetic `SELECT * FROM <table>` per registered table. The synthetic path threw away user filters / projections / partition predicates and routinely ballooned the estimate to "full table size", falsely tripping `remote_scan_too_large` on legitimate narrow queries (typical issue #201 trace: rewriter corrupts a backtick path → BQ parse error → synthetic over-estimate → 400). Behaviour: - Rewritten SQL succeeds: same as before (issue #171 single-dry-run). - Rewritten SQL parse-errors, original SQL succeeds: use original estimate. Common case for users submitting BQ-native input. - Both fail with `bq_bad_request`: HTTP 400 `remote_estimate_failed` with a hint pointing at `agnes catalog` / BQ-native syntax. No silent over-estimate. - Non-parse BQ error (forbidden, upstream): still 502 as before. This is a behaviour change for clients matching error kinds — failure to estimate scan size now surfaces as `remote_estimate_failed` instead of being masked behind `remote_scan_too_large` from the synthetic path. Replaces the existing `test_guardrail_falls_back_to_per_table_estimate_on_bq_parse_error` (which pinned the old contract) with `test_fallback_tries_original_sql_first` and `test_fallback_fails_fast_on_pure_duckdb_syntax`.	2026-05-06 18:02:53 +02:00
ZdenekSrotyr	720a2180c0	fix(query): rewriter respects backtick segments (#201 ) `agnes query --remote` corrupted user SQL when the request contained a full BigQuery backtick path (`<project>.<dataset>.<table>`) whose table segment matched a registered bare-name alias. The bare-name rewriter used `\b` word-boundary matching against the lower-cased SQL; both `.` and `` ` `` are non-word characters, so the regex fired INSIDE the user's backtick path and produced malformed nested-backtick SQL that BigQuery rejected at parse time. Fix: - Add `_mask_backticks(sql)` helper: replace each `…` segment with spaces of equal length, preserving offsets so word-boundary searches find positions only outside backticks. - `_bq_guardrail_inputs` (bare-name pass + forbidden-table pass) searches against the masked SQL. - `_rewrite_bq_table_refs_to_native` Pass 1 splits the SQL on `(\`[^\`]*\`)` and rewrites only the outside-backtick chunks. Pass 2 (`bq."ds"."tbl"` → backtick form) is unchanged — its prefix can't appear inside backticks. Adds three regressions covering the rewrite + guardrail paths.	2026-05-06 18:02:53 +02:00
ZdenekSrotyr	81d065b1ea	fix: Devin Review #1 — bigquery_query() first arg uses billing project, not data In cross-project BQ setups (where billing != data), the SA typically has serviceusage.services.use on the billing project but not on the data project. The rewriter passed bq.projects.data as the first arg to bigquery_query(), which BQ uses as the execution + billing project → 403 USER_PROJECT_DENIED. Match the convention used everywhere else in the codebase (app/api/v2_scan.py, app/api/v2_sample.py, app/api/v2_schema.py, connectors/bigquery/extractor.py): backtick paths inside the inner SQL use the data project (resolves the actual table location), the bigquery_query() first arg uses the billing project (decides who pays + which project the job runs under). For single-project deploys the two are identical so the fix is a no-op there. Test pins the cross-project case: data-prj for backticks, billing-prj for the bigquery_query() first arg.	2026-05-06 14:07:38 +02:00
ZdenekSrotyr	aee585fac6	fix: devil's advocate R2 — narrow shared-client try, PID tmp suffix, Syntax error anchor R2 adversarial review surfaced 3 issues, all addressed: #1 cli/client.py:572-577 outer try/except wrapped both _get_shared_client() AND the actual download. A 401/403/404/5xx from the server triggered a full second download attempt with a fresh client — wasted bandwidth on hard failures, no fail-fast on revoked PAT. Narrowed the try to only the shared-client construction; the download itself is no longer retried under the fallback except. #2 concurrent agnes pull invocations (e.g. SessionStart hook + manual run) collided on bare <target>.tmp / <target>.partN paths — one process's in-progress write got yanked by the other's cleanup, manifest hash check then failed spuriously. Per-process suffix (<target>.{pid}.tmp, <target>.{pid}.partN) makes intermediate files disjoint; the final os.replace to the bare target is atomic so last-writer-wins. #3 _looks_like_bq_rewrite_parse_error patterns 'Syntax error' could false-positive on a query like WHERE log_msg = 'Syntax error in foo' that fails for an unrelated reason (quota, network) and has the literal substring echoed in the error text. Anchored to 'Syntax error: ' (with trailing colon) — BQ always emits the colon in this error format, user SQL string literals normally don't.	2026-05-06 13:57:29 +02:00
ZdenekSrotyr	e5645fd280	fix: devil's advocate R1 — chunked probe, parse-error heuristic narrow, pool settings refresh, content-length sanity, multi-project skip R1 adversarial review surfaced 5 issues, all addressed: #1 chunked download silently disabled in non-Caddy deployments (HEAD on GET-only FastAPI route returns 405). _probe_range_support now falls back to GET with Range: bytes=0-0 when HEAD fails — works against both Caddy file_server (HEAD-friendly) and dev FastAPI direct (GET-only). #2 parse-error fallback heuristic too broad — matched on Unrecognized name / Function not found / No matching signature / Invalid cast, which BQ surfaces for ordinary user-column typos. That triggered slow ATTACH-catalog retry on every typo (2× latency tax). Narrowed to just 'Syntax error' / 'syntax error' which are the genuine DuckDB-vs-BQ dialect mismatch markers. #3 apply_bq_session_settings was only run on fresh-built pool entries, not on reuse. An operator's /admin/server-config change to bq_query _timeout_ms wouldn't propagate to long-lived pooled sessions until restart. Fixed: re-apply on every pool acquire (idempotent + fail-soft). #4 content-length sanity bound — a misconfigured proxy returning a wildly inflated Content-Length would cause overlapping chunked Range requests against the actual file → corrupt assembled output (caught by manifest hash check, but only after wasted bandwidth). Cap at 100 GiB; above that, drop to single-stream. #5 rewriter assumed every BQ row resolves under the single bq.projects.data project. Bucket containing '.' suggests a project- qualified bucket (multi-project deployment); rewriter would silently target the wrong project. Conservative skip with regression test.	2026-05-06 13:50:46 +02:00
ZdenekSrotyr	8e56d45c68	fix(query): code-review fixes — outer LIMIT wrap, dollar-quoting, parse-error fallback Address code-reviewer findings on the bigquery_query() rewrite path: 1. Outer LIMIT wrap — bigquery_query() materialises BQ result into DuckDB before fetchmany sees it (vs ATTACH-catalog Storage Read API streaming). A user 'SELECT *' against a billion-row remote table would buffer the entire result before request.limit applied. Wrap rewritten SQL in an outer 'LIMIT N+1' so the cap pushes into the BQ job itself. 2. Dollar-quoted inner SQL — naive replace("'", "''") doubling missed DuckDB backslash-escape sequences (\\, \\n, \\t, …). A predicate like 'WHERE name = ''O\\'Brien''' was unsafe under the doubling path. DuckDB $bqq_inner$ … $bqq_inner$ form takes the inner SQL verbatim with no escapes whatsoever. Falls back to legacy doubling if user SQL improbably contains the literal tag. 3. Parse-error fallback — when the rewritten path fails with a BQ-side parse / validation error (DuckDB-only syntax like ::INT cast that survives identifier rewrite but BQ refuses), retry the user's original SQL via the legacy ATTACH-catalog path so the request still succeeds. Mirrors the existing dry-run fallback contract. 4. CHANGELOG — delete duplicate CLI bullets that landed under already-released [0.38.1] (file corruption from merge — entries are correctly under [0.39.0]).	2026-05-06 13:29:45 +02:00
ZdenekSrotyr	b2c1ff143c	fix(query): rewrite BQ-backed user SQL via bigquery_query() to enable predicate pushdown User SQL hitting query_mode='remote' BigQuery rows was 50-100x slower than the equivalent direct bigquery_query() call because DuckDB's master view (CREATE VIEW … AS SELECT * FROM bigquery.<ds>.<tbl>) does not push WHERE/SELECT/LIMIT into BQ in ATTACH-catalog mode. The BQ extension opens a Storage Read API session over the entire upstream table; on >100M-row sources this was 70-150s and frequently failed with 'Response too large to return'. Extract the existing dry-run rewriter's core (table-name → BQ-native backtick path) into a shared helper. Add an execution-path rewriter that wraps the whole user SQL in bigquery_query('<project>', '<inner>') so the BQ planner sees the full query and engages partition pruning + projection pushdown server-side. Conservative fall-through: cross-source JOINs (BQ ↔ Keboola/Jira local), queries already containing bigquery_query(, and unconfigured BQ project all skip the rewrite and run the original SQL via ATTACH-catalog so behavior degrades gracefully.	2026-05-06 13:02:34 +02:00
ZdenekSrotyr	05e535d743	fix(admin/tables): unescape shell-quoting backslashes in descriptions	2026-05-06 10:13:49 +02:00
ZdenekSrotyr	6c94d2cbce	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # pyproject.toml	2026-05-06 07:27:25 +02:00
ZdenekSrotyr	df2c33147c	fix: Devin Review on #194 round 2 — 3 BUG-class findings 1. instance.yaml overlay path now matches read site under STATE_DIR. Three sites updated: - app/api/admin.py:1005 (server-config endpoint writer) - app/api/admin.py:2610 (configure endpoint writer) - app/instance_config.py:106 (overlay reader) All three now go through _state_dir() so under flat-mount layout (STATE_DIR=/data-state) the irreplaceable instance.yaml overlay lands on the state disk (sdc) instead of the regenerable data disk (sdb). Without this fix, .env_overlay correctly went to the state disk while instance.yaml went to the data disk — config would be lost if an operator wiped sdb. 2. Strip customer-specific tokens from OSS repo per CLAUDE.md vendor-agnostic rule: - docker-compose.host-mount.yml: 'a deployer (Groupon FoundryAI)' → 'a deployer in production' - docker-compose.flat-mount.yml: 'caused 2026-05-05 in the Groupon FoundryAI deployment' → generic 'production failure mode' - docs/state-dir.md: rewrote the incident reference to describe the failure mode abstractly without naming the deployment; updated the recommendation table to say 'shadow-mount class' instead of dating the specific incident. 3. Updated docs/state-dir.md 'What reads STATE_DIR' to list all read/write sites including the three migrated in this round (admin.py, instance_config.py, marketplaces.py). ANALYSIS finding (tls-rotate.sh hardcoded host-mount.yml) deferred — same operator-side class as auto-upgrade.sh hardcoded host-mount, documented limitation per the PR body.	2026-05-05 20:02:50 +02:00
ZdenekSrotyr	b6543c9c55	fix: Devin Review on #194 — 2 BUG-class findings 1. .env_overlay write paths now match read path under STATE_DIR. app/main.py:343 reads via _state_dir() (post-PR #194), but two write sites still hardcoded ${DATA_DIR}/state/.env_overlay: - app/api/admin.py:2687 — configure endpoint secrets persistence - app/api/marketplaces.py:152 — marketplace PAT persistence Under flat-mount layout (STATE_DIR=/data-state) the admin UI wrote secrets to /data/state/.env_overlay while the app read from /data-state/.env_overlay, silently dropping the value on next restart. Both write sites now go through _state_dir(). 2. host-mount.yml: caddy inherits data:/srv:ro from base, but with no service populating the data: named volume (other services switched to direct /data binds), the inherited mount points at an empty Docker volume — try_files finds nothing, every parquet download falls through to uvicorn, defeating the v0.36.0 file_server bypass under the host-mount layout. Added a caddy override that restates all mounts including a direct /data:/srv:ro bind. Mirrors the comment + treatment already in flat-mount.yml.	2026-05-05 19:47:12 +02:00
ZdenekSrotyr	f2ce915458	fix: Devin Review on #188 commit `28423907` — 2 bugs 🚩 /api/v2/catalog still async def while now calling sync stat() `/api/v2/catalog` was left as `async def` when the rest of Tier 1 was converted, on the assumption it was lightweight. The new `_materialized_size_hint` populator added in this PR calls `Path.stat()` / `Path.exists()` for every visible row to bucket the parquet size — on a local FS that's microseconds, but on a network-mounted DATA_DIR (NFS / CIFS / GCS-FUSE) those syscalls can block the event loop. Convert to plain `def` so FastAPI auto-offloads to the thread pool, mirroring /api/query etc. 🔴 stream_download translates HTTPStatusError as generic transport error `response.raise_for_status()` inside the retry loop raises `httpx.HTTPStatusError` on 4xx/5xx. After retries exhaust, the new `isinstance(last_exc, httpx.HTTPError)` check at line 219 was eating the status code: HTTPStatusError is a subclass of HTTPError, so the generic transport translation produced "Unexpected error: HTTPStatusError" instead of the informative "Client error '401 Unauthorized' for url …" that callers expect. Fix: short-circuit HTTPStatusError before the HTTPError branch — it re-raises verbatim so the caller's status-code handling + the rich server error body (e.g. 401 expired token, 403 cross_project_forbidden) reach the analyst. api_get / api_post / api_delete / api_patch don't have the same bug: httpx Client.get/etc. don't raise HTTPStatusError unless the caller explicitly calls .raise_for_status(), and our wrappers don't. Only stream_download does, hence the targeted fix there.	2026-05-05 18:29:44 +02:00
ZdenekSrotyr	e5fb913cec	perf: Tier 1 event-loop unblocking — async def → def on BQ-bound handlers Five hottest BQ-touching endpoints were `async def` but invoked synchronous DuckDB / BQ-extension calls inside the body. Under uvicorn's single event loop that meant a single heavy `agnes query --remote` (waiting up to ~200 s for BQ's jobs.query) froze EVERY other request — /api/health, dashboard, auth, even another query — for the full BQ wait. Operators saw "VM idle, app frozen" during PR #188's testing. Convert to plain `def` so FastAPI auto-offloads the body to the anyio thread pool. Event loop stays free for non-BQ requests. - app/api/query.py:execute_query - app/api/v2_scan.py:scan_estimate_endpoint, scan_endpoint - app/api/v2_sample.py:sample - app/api/v2_schema.py:schema Audit: 0 `await` statements in any converted handler (verified file-by- file), so the rename is safe. Tests in tests/test_v2_*.py called the handlers via `asyncio.run(...)` which now fails on a non-coroutine return; swapped for direct calls (asyncio.run( -> ( ) — keeps paren balance). Plus AGNES_THREADPOOL_SIZE env var (default 200, was anyio's stock 40) in app/main.py:lifespan. Set via anyio.to_thread.current_default_thread_limiter().total_tokens. 200 is comfortable headroom for <50 concurrent analysts; bump for more. 480/480 impacted tests pass (the 2 remaining errors are a pre-existing fixture setup issue in test_reader_smoke_matrix.py unrelated to this change).	2026-05-05 17:44:08 +02:00
ZdenekSrotyr	30e81a15b9	feat(workspace-prompt): decision tree + size-hint so analyst Claude gets it right first try Three concrete changes addressing the "analyst Claude misuses the CLI" class of bugs (image.png table — issues #3, #5, plus the recurrent "how big is this table" guesswork): 1. config/claude_md_template.txt — the template agnes init writes to <workspace>/CLAUDE.md. Surfaces every catalog-row field with a why, adds a query_mode-based decision tree, explicit --estimate scoping (snapshot create ONLY — was the #1 first-try error), an agnes fetch → agnes snapshot create rename note, and a 6-row failure-mode table that maps each common error wording to its right next step. 2. app/api/v2_catalog.py — populate rough_size_hint for local + materialized rows from the on-disk parquet size, bucketed small/medium/large/very_large. Was hardcoded null with a TODO; AI couldn't tell "is this 6.8 GB" without a failed --remote round-trip. 3. cli/update_check.py — the [update] banner survived the da→agnes rename and printed "[update] da X is out of date" on every command, training analysts to associate the binary with the old name. Verified by rendering the template against representative contexts (33/33 tests pass) and running every use case from the original screenshot through the real CLI against a dev VM.	2026-05-05 16:44:24 +02:00
ZdenekSrotyr	1be997f6d4	feat(caddy): file_server for parquet downloads — bypass uvicorn A single analyst's multi-GB `agnes pull` held the only uvicorn worker for the duration of the stream, starving UI / /api/health / every other API endpoint. Container flipped to `unhealthy`. Triggered while a 6.8 GB `order_economics` pull was in-flight on prod 2026-05-05. Caddy now intercepts `GET /api/data/{table_id}/download` and serves the parquet directly via sendfile from the data volume (mounted r-o at /srv inside the caddy container). RBAC enforced by `forward_auth` to a new lightweight `GET /api/data/{table_id}/check-access` endpoint (returns 204 / 403) — the bulk transfer never reaches uvicorn. Path discovery via `try_files` over the known extract.duckdb v2 source subdirs. Anything not at a static path falls through to the existing app handler so legacy `src_data/parquet` and future connectors still work without a Caddyfile change. Non-Caddy deployments are unchanged. Stage 1 (multi-worker uvicorn) was considered but blocked by the single-writer DuckDB lock on system.duckdb — workers > 1 would crash at startup on "Could not set lock on file", the same race that pushed the scheduler from in-process writes to HTTP-via-app. Multi-reader workers + single-writer coordination is out of scope for this PR.	2026-05-05 16:41:33 +02:00
ZdenekSrotyr	4f04235502	feat(bigquery): bq_query_timeout_ms knob; default 600s (was 90s) DuckDB BigQuery extension defaults `bq_query_timeout_ms` to 90 s, which is too tight for analyst-scale queries against view-backed BQ datasets. `agnes query --remote` HTTP 400'd with `Binder Error: Query execution exceeded the timeout. Job ID: ...` whenever the underlying BQ job ran longer than 90 s, even though the job itself was healthy. Add `data_source.bigquery.query_timeout_ms` (default 600 000 ms = 10 min, sentinel 0 falls through to the extension default). Applied via `SET bq_query_timeout_ms` after every `LOAD bigquery` on every BQ-touching DuckDB session: orchestrator's `_remote_attach` ATTACH path, BqAccess session factory, and the standalone extractor. Configurable via `/admin/server-config` UI. Fail-soft: extension versions that don't recognise the setting silently keep the default rather than poisoning the session.	2026-05-05 16:40:40 +02:00
ZdenekSrotyr	8d8d2c219e	refactor(cli-store): pull/info → agnes admin store; add agnes store mine Backup-orchestration commands were split across two namespaces (pull in agnes store, push in agnes admin store), which broke the operator mental model — pull/push are a paired operation and should sit together. Move pull + info into agnes admin store so all bulk operations share one help screen. Add agnes store mine as the user-facing equivalent — calls the same /api/store/bundle.zip endpoint with ?owner=me, which the server resolves to the caller's user_id. Authors can archive their own uploads without admin role; whole-Store bulk reads stay admin-flavored as a discoverability hint. Server: 3-line addition to export_bundle handles owner='me' as a magic alias for the caller. No new endpoint. Tests updated: pull/info expectations move from agnes store to agnes admin store; new tests cover agnes store mine and the ?owner=me server resolution. 69/69 store tests green locally.	2026-05-05 13:49:18 +02:00
ZdenekSrotyr	3d63965a67	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # app/web/templates/_app_header.html	2026-05-05 12:05:50 +02:00
ZdenekSrotyr	a8f9d065c8	feat(store): bundle export/import + agnes store update + agnes admin store push Adds whole-Store backup/restore primitives so an external CI/CD job can mirror the Store to a git repo (and restore back from one). REST: - GET /api/store/bundle.zip — deterministic ZIP of all (filtered) Store entities. Layout: manifest.json + entities/<id>/{plugin,assets}/. Manifest carries owner_email for cross-instance restore. Auth: any authenticated user (Store is community-open). - POST /api/store/import-bundle — admin-only restore. Modes merge\|replace\|skip; owner resolution by email with stub-disabled-user fallback when the email is unknown on the target instance. CLI: - agnes store update <id> [--description X] [--zip PATH] ... — in-place edit (server PUT permits owner OR admin per F4). Closes the missing edit affordance for analysts who want to fix a typo or push a new ZIP without losing install_count. - agnes store pull [-o store.zip] [--unpack DIR] — download the bundle. --unpack streams + extracts so an external git-backup workflow can drop the tree straight into a repo and `git add .`. - agnes store info [--json] — counts + size summary. - agnes admin store push <zip-or-dir> [--mode ...] — admin-only restore. Auto-zips a directory client-side so a working-tree → server round-trip is one command. cli/v2_client.py gains api_get_stream helper for binary downloads. Tests: 5 new server tests (bundle shape + filters + round-trip + stub user creation + skip mode + admin-only gate) + 11 new CLI tests (update, pull/unpack, info, admin push). 66/66 store-related tests green locally.	2026-05-05 11:51:31 +02:00
ZdenekSrotyr	d878764ac1	fix(session-collector-api): mirror sibling endpoints' audit-on-exception (Devin Review on #179 ) Devin flagged that run_session_collector still had the same audit-skip gap I fixed in run_verification_detector and run_corporate_memory in the previous two rounds — a PermissionError walking /home, an OSError on /data/user_sessions mkdir, or any other unhandled exception from collector.run() would skip the audit_log row and only show in docker logs. Same try/except + unhandled_error pattern as the sibling endpoints. All three LLM-pipeline run-* endpoints now record their failures the same way; /admin/scheduler-runs sees them. Regression test in tests/test_admin_run_endpoints.py::TestRunSessionCollector::test_unhandled_exception_still_audits.	2026-05-05 09:31:33 +02:00
ZdenekSrotyr	e86da72997	fix(corporate-memory-api): mirror verification-detector audit-on-exception (Devin Review on #179 ) Devin flagged that run_corporate_memory still had the same audit-skip gap I just fixed in run_verification_detector — if collect_all() throws anything other than the already-translated ValueError (DuckDB lock, network blip, unexpected SDK error), the audit_log row was never written and /admin/scheduler-runs missed the failure. Same try/except + unhandled_error pattern as the verification_detector fix from `4c4dfee8`. Regression test in tests/test_admin_run_endpoints.py::TestRunCorporateMemory::test_unhandled_exception_still_audits.	2026-05-05 09:11:13 +02:00
ZdenekSrotyr	4c4dfee8e6	feat(profile): /profile/sessions page + audit on detector exception + correct SCHEDULER_AUDIT_ACTIONS Three changes addressing user feedback during e2e test of #179 + Devin Review on `e86dd5ed`. 1) /profile/sessions — new self-service user page in the user menu. Lists all session jsonls the caller uploaded via `agnes push` joined against session_extraction_state. Each row shows uploaded_at, file size, status badge (pending/processed/extracted), processed_at, and items_extracted. The page docstring + help text explicitly call out that items_extracted=0 means the verification detector ran fine but the LLM found no claims to track — that's the documented "no items" outcome, not a broken pipeline. Closes the gap surfaced during the e2e test of #176 where a user could see their sessions on disk and process them through the LLM but had no UI to inspect what happened. 2) run_verification_detector audits unhandled exceptions (Devin #1). If detector.run() threw anything other than the already-translated ValueError, the audit_log row was never written. The endpoint now wraps detector.run in try/except, records the exception in audit_params["unhandled_error"], then re-raises as 500 after audit. The /admin/scheduler-runs page surfaces the failure row with the error type + message. 3) SCHEDULER_AUDIT_ACTIONS list corrected (Devin #2). Previous list had "marketplaces_sync_all" (wrong — actual is "marketplace.sync_all") plus "data_refresh" and "scripts_run_due" which app/api/sync.py and app/api/scripts.py don't write to audit_log. Fixed to the four actually-logged strings; comment points at the missing audit calls as a follow-up. Tests: tests/test_web_ui.py adds TestAdminRoleGuards::test_profile_sessions_page_no_admin_required and tightens test_admin_scheduler_runs_page_admin_only to assert the correct marketplace.sync_all string.	2026-05-05 08:57:35 +02:00
ZdenekSrotyr	f0d091f721	fix(store): scratch dir leak on ZIP validation failure (Devin Review) create_entity + update_entity created the `scratch` temp dir inside one try/finally but cleaned it up in a separate one. Validation HTTPExceptions raised by _safe_zip_extract (zip_unsafe_path, zip_too_large_uncompressed) or the BadZipFile→422 conversion exited the first scope, and the second finally was never entered → temp dir leaked on every failed upload. Devin flagged this on the F2 commit. The leak pre-existed (zip_unsafe_path was the original vector); F2 added zip_too_large_uncompressed to the same broken cleanup path. Fixed by collapsing scratch creation + cleanup into one outer try/finally that covers both extraction AND metadata/bake; the inner try/except/finally still handles BadZipFile→422 + tmp file cleanup. Same restructure in update_entity. Regression test `test_scratch_dir_cleaned_up_after_failed_extraction` triggers a zip_unsafe_path 422 and asserts tmp/agnes_store_* contains no leaked dirs.	2026-05-05 08:52:15 +02:00
ZdenekSrotyr	fd3c76d21b	fix(store): security + correctness blockers found in PR review (F1, F2, F4, F5) Three independent reviews of PR #180 surfaced four real defects in the new Store / my-ai-stack surface. CHANGELOG entries detail each; one-liners: - F1 video_url XSS: any authenticated user could upload a Store entity with `video_url=javascript:...` and pop XSS in any viewer's session via the `<a href=...>` "Watch video" link in store_detail.html. Jinja2 autoescape doesn't block URI schemes inside attribute values. Fixed by scheme-validating to http(s) only on create + update; 400 invalid_video_url. - F2 ZIP decompression bomb: _safe_zip_extract checked path-traversal but not declared file_size totals — a 50 MB compressed upload at 1:1000 ratio decompresses to 50 GB and DOS the host disk. Fixed by summing zinfo.file_size across infolist() and refusing > 200 MB before extractall touches disk. 413 zip_too_large_uncompressed. - F4 admin authz parity: PUT /api/store/entities/{id} was owner-only while DELETE allowed owner OR admin; the store-detail page hid Edit/Delete buttons from admin even though DELETE was permitted. Fixed by allowing admin on PUT and passing is_admin to the template; gate is now is_owner OR is_admin everywhere. - F5 cross-owner suffix collision: sanitize_username is many-to-one (alice.smith / alice_smith both → alice-smith). Two such users uploading entities with the same display name produced identical `<name>-by-<username>` suffixes, silently colliding in the served agnes-store-bundle on-disk paths AND the manifest catalog (Claude Code dedupes by plugin.json `name`). Fixed by enforcing global uniqueness on the suffixed value at create_entity; 409 conflict_global_suffix. F3 (ZIP symlink members) was investigated and confirmed to be a false-positive — Python's stdlib ZipFile.extractall does not honor symlink mode bits, so no exploit exists. 9 new regression tests in tests/test_store_api.py::TestStoreSecurityFixes covering all four. Test run locally: 60/60 store-related tests pass.	2026-05-05 08:18:02 +02:00
ZdenekSrotyr	e68c2d3f0f	fix(session-collector): argv-free run() helper, drop SystemExit footgun (Devin Review on #179 ) run_session_collector called collector.main() which did argparse.parse_args() on uvicorn's sys.argv (['app.main:app', '--host', ...]) → sys.exit(2) → SystemExit(2), which inherits from BaseException, escapes FastAPI handlers, and propagates through the thread pool. Every scheduler tick that fired the endpoint either 500-ed or risked killing the uvicorn worker. services/session_collector/collector.py now exposes run(dry_run, verbose) that returns (rc, stats); main() is a thin CLI shim that parses argv and delegates. The admin endpoint calls run() directly and audit-logs the per-run stats (users_processed, files_copied, files_skipped) instead of just the rc. Three regression tests in TestRunHelper. Closes Devin Review finding on app/api/admin.py:2819 (#179).	2026-05-05 06:31:55 +02:00
ZdenekSrotyr	9f33e24bf9	fix(config): overlay-aware LLM consumers + env-ref resolution (#179 review) Devin BUG: /api/admin/configure seeds an ai: block to the writable overlay at DATA_DIR/state/instance.yaml, but the three LLM consumers imported from config.loader.load_instance_config — which reads the static config dir only. Even if they had read the overlay, the loader ran yaml.safe_load directly without passing through _resolve_env_refs, so '${ANTHROPIC_API_KEY}' would have stayed a literal placeholder. The pipeline appeared to work because the factory falls back to the env var directly, but the overlay path itself was dead code. Two fixes, both required: 1. Switched the three LLM consumers to app.instance_config.load_instance_config: - services/corporate_memory/collector.py:collect_all - services/verification_detector/__main__.py:main - app/api/admin.py:run_verification_detector 2. app/instance_config.py runs the loaded overlay through config.loader._resolve_env_refs before the deep-merge, so '${ANTHROPIC_API_KEY}' resolves at config-load time. New regression suite tests/test_instance_config_overlay.py pins: - env-ref resolution against the overlay (resolved when env set, empty when env missing — never the literal placeholder) - deep-merge still preserves static-only sections - the three consumers reach app.instance_config (inspected via inspect.getsource so a future refactor that reverts the import fails the test) - end-to-end: a seeded overlay + ANTHROPIC_API_KEY env reaches the factory with a resolved api_key	2026-05-05 05:57:22 +02:00
ZdenekSrotyr	98a8aba3be	fix(tests): align test_llm_connector with new factory + fail-fast (#179 review) The PR rewrote collect_all() to call the new create_extractor_from_env_or_config() helper, but the existing tests still mocked the old direct create_extractor() symbol and the old silent-skip-on-missing-config behavior. Five tests in TestCorporateMemoryCollector and one in TestCollectorExtractorIntegration were red on the PR branch. Changes: - Tests now mock connectors.llm.create_extractor_from_env_or_config (the symbol the collector imports lazily). - Renamed test_collect_all_no_ai_config_skips -> test_collect_all_no_ai_config_or_env_raises and test_collector_handles_invalid_config -> test_collector_raises_on_invalid_config. Both assert pytest.raises(ValueError) — the explicit fail-fast semantics defect 5 of #176 was supposed to enforce. - collect_all() no longer swallows the factory's ValueError into stats["errors"]; it propagates so the scheduler / admin endpoint surface the actionable misconfiguration message instead of pretending the run was a no-op. - /api/admin/run-corporate-memory translates the propagated ValueError into a 500 with the factory's message, matching /api/admin/run-verification-detector.	2026-05-05 05:55:01 +02:00
Minas Arustamyan	d5a7c9ad79	feat(store): /store + /my-ai-stack — community marketplace + per-user composition Adds a community-driven Store where any authenticated user uploads skills/agents/plugins as ZIPs, plus /my-ai-stack as the per-user composition view. The served Claude Code marketplace is now: (admin_granted ∖ opt_outs) ∪ store_installs Skill + agent installs are merged into a single `agnes-store-bundle` plugin in the served marketplace; type=plugin uploads stay standalone. Names are suffixed with `-by-<owner-username>` at upload time so two owners can use the same display name without colliding in Claude Code's flat skill/agent namespace. Schema v23 → v24 adds three tables: - store_entities — community-uploaded skills/agents/plugins - user_store_installs — what each user has chosen to install - user_plugin_optouts — opt-out overlay on top of admin grants Admin grant-delete drops every user's opt-out for that plugin so re-grant resets cleanly to enabled (no sticky personal preference). UI: - /store — e-commerce-style listing with type/category/owner filters, search, pagination, owner-aware [Install] buttons, clickable cards - /store/new — 2-step upload wizard with drag & drop, preview validation (POST /api/store/entities/preview), docs multi-upload, photo + video URL - /store/{id} — detail page with hero, file list, docs, owner actions (Edit/Delete) for the uploader - /my-ai-stack — Granted plugins (toggle opt-out) + From the Store (uninstall) sections - Admin nav: Marketplaces moved into Admin dropdown, renamed to "Curated Marketplaces" Validation hardening: type-mismatch guards reject skill ZIP uploaded as agent (or vice versa), and plugin ZIPs masquerading as skills/agents. Human-readable error messages mapped client-side from machine codes. Cross-source naming: Store entity-id-prefixed dirs (`plugins/store-<id>/`) plus the bundle (`plugins/store-bundle/`) avoid collisions with admin marketplaces (whose `store` slug is reserved by `is_valid_slug`). Bundle composition is content-hashed at serve time — install/uninstall or owner re-upload bumps the bundle's plugin.json `version`, so Claude Code's auto-update toggle picks up changes. Tests: 50+ new tests across naming, repositories, filter (admin ∪ store ∪ bundle), API (upload/install/uninstall/delete/preview/docs), end-to-end marketplace.zip with bundle merging.	2026-05-05 02:53:49 +02:00
ZdenekSrotyr	a621a415cc	fix(health): session-pipeline staleness check (#176 ) GET /api/health/detailed now returns a session_pipeline service entry. Heuristic: max(mtime of /data/user_sessions/*/.jsonl) <= max(processed_at in session_extraction_state) + grace_seconds grace_seconds = 2 × verification-detector cadence (default 30 min; configurable via SCHEDULER_VERIFICATION_DETECTOR_INTERVAL). When the assert fails, status='warning' (never 'error') with an actionable detail pointing at the verification-detector scheduler job. A warning bubbles up to the existing overall='degraded' aggregation — operators querying /api/health/detailed (or /agnes diagnose system) get a clear breadcrumb instead of a silently-broken pipeline. Cold-start case (no session files, or files newer than the grace window with empty state table) is handled explicitly to avoid noise on a fresh deploy. Tests: tests/test_health_session_pipeline.py.	2026-05-05 00:04:28 +02:00
ZdenekSrotyr	45de71e8ab	fix(scheduler): wire LLM pipeline into scheduler-v2 (#176 ) The session-collector, verification-detector, and corporate-memory services now run on the same scheduler-v2 model that already drives data-refresh, health-check, script-runner, and marketplaces: - New admin endpoints in app/api/admin.py: POST /api/admin/run-session-collector POST /api/admin/run-verification-detector POST /api/admin/run-corporate-memory All admin-gated, sync-def (FastAPI thread pool), with one audit row per invocation. Same single-writer-of-system.duckdb pattern as the existing /api/marketplaces/sync-all job. - services/scheduler/__main__.py JOBS gains three entries with offset cadences (10m / 15m / 17m, all coprime modulo the 30s tick) so the three LLM-backed jobs don't fire on the same tick and stack their API + DB load. - The verification-detector endpoint surfaces the LLM factory's fail-fast ValueError as HTTP 500 with the actionable message, preserving the no-silent-skip contract from the previous commit. Tests: - tests/test_admin_run_endpoints.py covers admin gating + scheduler registration + endpoint contract. - tests/test_scheduler_sidecar.py existing tests continue to pass.	2026-05-04 23:57:43 +02:00
ZdenekSrotyr	bbb04ac041	fix(setup): seed default ai: block + env-var fallback (#176 ) POST /api/admin/configure now writes a default ai: block into the instance.yaml overlay when the request leaves it untouched and either ANTHROPIC_API_KEY or LLM_API_KEY is set in the environment. The block references the env var via ${VAR} syntax — secrets never land in YAML. connectors.llm.factory grows create_extractor_from_env_or_config which falls back to ANTHROPIC_API_KEY / LLM_API_KEY when ai_config is empty and raises a clear ValueError when neither is available. Both services/corporate_memory and services/verification_detector switch to the new helper, replacing the old 'silently skip when ai: missing' path that was the silent-failure root cause. Tests: - tests/test_setup_ai_block.py — overlay seeding contract. - tests/test_llm_provider_env_fallback.py — fallback + fail-fast.	2026-05-04 23:55:19 +02:00
ZdenekSrotyr	5915f92eaa	fix(query-guardrail): single-pass alternation regex (Devin Review on query.py:464) The iterative bare-name rewriter (one re.sub per name, longest-first) was vulnerable to cross-contamination when the GCP project ID contained a registered table name as a hyphen-delimited word. Concrete repro: project = 'my-ue-project' registered = ['orders', 'ue'] user SQL = 'SELECT * FROM orders JOIN ue ON ...' iter 1 (orders): produces 'FROM `my-ue-project.fin.orders` JOIN ue ...' iter 2 (ue): '\bue\b' matches 'ue' INSIDE 'my-ue-project' (hyphen creates word boundary on both sides) — corrupts the iter-1 path Fallback at query.py:576 caught the resulting BQ parse error and fell back to per-table SELECT * estimate, so impact was over-estimation, not fail-open — but the #171 partition-pruning fix silently degraded to pre-fix behavior whenever a project name shared a hyphen-segment with a registered table. Fix: single re.sub call with an alternation regex sorted longest-first. Single-pass means each source position is processed exactly once, so freshly-inserted backticked text from one match isn't re-scanned by later names in the alternation. Regression test test_rewrite_helper_does_not_corrupt_when_project_id_contains_registered_name covers the exact Devin repro.	2026-05-04 22:51:33 +02:00
ZdenekSrotyr	103efb69f0	chore(cli-rename): replace stale `da` verbs in active code paths Bring admin UI, audit-log messages, code comments, and analyst-facing skill docs in line with the post-bootstrap CLI surface (`agnes pull`, `agnes push`, `agnes init`, `agnes snapshot create`). The legacy `_LEGACY_STRINGS` detection tuple in `app/api/claude_md.py` and the hook upgrade markers in `cli/lib/hooks.py` are intentionally left as-is — they exist precisely to flag pre-rewrite content for re-authoring. Strip "(folded from `da metrics list`)" / "(lifted from `da metrics show`)" / "Replaces the old `da analyst status`" docstring noise — the rename history is in CHANGELOG.md, not in module docstrings.	2026-05-04 21:10:43 +02:00
ZdenekSrotyr	500db8cd3c	fix(query-guardrail): dry-run user SQL not synthetic SELECT * (#171 ) Closes #171. The /api/query cost guardrail used to dry-run a synthetic `SELECT * FROM <table>` for each registered remote-BQ row referenced by the user SQL — which made BigQuery estimate a full table scan, with column projection, predicate pushdown, and partition pruning all disabled. Narrow queries on big partitioned/clustered tables (the documented happy path for `agnes query --remote`) hit ~30,000× over-estimates and got rejected with 400 `remote_scan_too_large` even when BQ's own dry-run reported single-digit MB. Pavel's report on #171 traced the root cause and proposed the fix: rewrite the user SQL to BQ-native syntax and dry-run it as a single job, exactly the way `bq query --dry_run` works. Implementation: - New helper _rewrite_user_sql_for_bq_dry_run rewrites bare registered names (word-boundary, case-insensitive, longest-first to avoid prefix collisions) + bq."<ds>"."<tbl>" forms to backticked `<project>.<ds>.<tbl>` paths. - _bq_quota_and_cap_guard runs ONE dry-run on the rewritten SQL. Cap check uses the real estimate. - Fallback path: if BQ rejects with bq_bad_request (e.g. DuckDB-only syntax like ::INT casts), the guard falls back to the pre-fix per-table SELECT * approach so non-portable queries still get a (loose) cap estimate instead of fail-opening. Non-parse BQ errors (forbidden, upstream) still propagate as 502. - _bq_guardrail_inputs now also returns name_lookups so the rewriter has the (registered_name, bucket, source_table) mapping it needs. - Per-table breakdown is unavailable from a composite dry-run; total bytes are pinned to dry_run_set[0] for the post-flight record_bytes(sum(...)) call to keep returning the right total. Tests (7 new, 3 existing still pass): - dry-run receives rewritten user SQL with WHERE clause intact (the load-bearing assertion for #171) - single dry-run per request even with multiple registered tables (JOIN, UNION) referenced - fallback to per-table SELECT * on bq_bad_request - non-parse BQ errors (forbidden) still 502 - rewriter unit tests: bare + bq.path in same SQL, longest-name-wins on prefix collision, case-insensitive bare-name match	2026-05-04 21:08:21 +02:00
ZdenekSrotyr	e438170ade	merge: pull #174 (BQ materialize view fix + concurrency, 0.33.0) into bootstrap branch Brings in zs/materialize-sync-fix (PR #174): - BigQuery view materialize works (wrap admin SQL in bigquery_query()) - Per-table mutex + fcntl.flock for concurrent COPY corruption - Cost guardrail dry-run engages on materialized rows - Schema v23 -> v24 migration: rewrite source_query to BQ-native - Server-generated trivial source_query from bucket+source_table - Validator backtick relaxation for materialized rows - 0.33.0 release cut Conflict resolution: - CHANGELOG.md: keep our [Unreleased] (bootstrap rewrite content) ABOVE the new [0.33.0] section from #174. The bootstrap rewrite remains unreleased; it'll cut 0.34.0 (or later) when this PR merges to main. - tests/conftest.py: union — keep our analyst-bootstrap fixture re-export AND #174's bq_instance / stub_bq_extractor fixtures. - pyproject.toml auto-merged to 0.33.0 (matches the cut), correct. - src/db.py auto-merged: SCHEMA_VERSION = 24, _v23_to_v24_finalize added — no overlap with our work which left schema at v23. - CLAUDE.md auto-merged: schema-history paragraph extended with v24. Verified: 79/79 across CLI bootstrap suite + materialize suite + schema v24 migration tests pass locally on Python 3.13/macOS.	2026-05-04 20:53:00 +02:00
ZdenekSrotyr	3d58768143	fix: address Devin Review findings — incomplete renames + estimate guard 13 Devin findings across 10 files: 🔴 Critical: - app/api/v2_catalog.py:42 — `_fetch_hint` returns `da fetch` in /api/v2/catalog responses (user-visible in every catalog list) - cli/skills/agnes-data-querying.md — 11 stale `da fetch`/`da sync` refs in the bundled skill markdown - config/claude_md_template.txt:38 — referenced `agnes pull --docs-only` flag that does NOT exist in agnes pull (removed; spec only ships --quiet/--json/ --dry-run) 🟡 Important: - app/api/admin.py:252 — `da fetch` in bq_max_scan_bytes hint - cli/commands/auth.py:119 — `da sync` in import-token docstring (--help text) - cli/commands/tokens.py:48 — "Export it so `da` can use it" prose - ARCHITECTURE.md — 4 stale rows in CLI commands table - README.md — stale paragraphs for analysts (da sync, da analyst setup) 🚩 Substantive observations addressed: - app/api/query.py:249,302,489 — server-side error/help strings still said `da sync`/`da fetch` (returned in API responses to clients) - cli/commands/snapshot.py:235-241 — DuckDB existence guard incorrectly blocked `--estimate` (server-side dry-run that never opens local DB). Added test ensuring estimate path skips the guard. Skipped (intentionally historical): - app/api/admin.py:2377,2429,2437 — historical comments describing past manifest-vs-sync_state bug; past tense, accurate to keep as `da sync`.	2026-05-04 20:05:06 +02:00
ZdenekSrotyr	5bffec641f	chore(lint): final ruff fixes	2026-05-04 19:32:52 +02:00
ZdenekSrotyr	6c0846fd17	feat(config): expose materialize.lock_ttl_seconds in server-config New top-level 'materialize' section, single field (lock_ttl_seconds). Default 86400 (24h). Backs the file-lock TTL reclaim added in the per-table-mutex change. Editable via PUT /api/admin/server-config and the /admin/server-config UI.	2026-05-04 18:52:54 +02:00
ZdenekSrotyr	3871d5320a	feat(admin): server-generate materialized source_query, allow BQ backticks When admin registers a materialized BQ row with bucket+source_table but no source_query, the server generates 'SELECT * FROM `<project>.<ds>.<tbl>`' from instance.yaml's configured BQ project. Same fallback fires on PUT when flipping to materialized. The backtick rejection guard, which was appropriate for DuckDB-flavor source_query, is relaxed for materialized rows since the new wrapping path (Task 2) runs admin SQL through BQ jobs API which uses BQ-native syntax (backticks for dashed identifiers).	2026-05-04 18:37:27 +02:00

1 2 3 4

158 commits