Commit graph

75 commits

Author SHA1 Message Date
ZdenekSrotyr
6fe9135cb5
release: 0.47.3 — self-upgrade ignores 24h cache, always re-probes /cli/latest (#227)
## Summary

`agnes self-upgrade` without `--force` previously short-circuited on the local 24h `update_check.json` cache. After a server-side version bump within that window, the explicit command exited silently as a no-op — empirically observed today when prod 0.47.1 → 0.47.2 didn't propagate.

Fix: always invalidate the cache in `_resolve_info`. The cache still gates the implicit warning loop in the root callback (correctly — that runs on every `agnes <anything>` and can't hammer `/cli/latest`).

## Test plan

- [x] New `test_self_upgrade_bypasses_24h_cache_without_force` — stale cache claims current; mocked server reports newer; assert UpdateInfo carries the newer version, not the cached one.
- [x] Existing self-upgrade tests pass (including `--force` semantics — force is now downstream-only, behavior preserved).
2026-05-07 22:08:21 +02:00
ZdenekSrotyr
917f9aaef0
release: 0.47.2 — restore #218 + #219 fixes silently reverted by #217 (#225)
## Summary

Smoke-testing the just-shipped 0.47.1 against production exposed two regressions:

1. `agnes query --remote "SELECT FROM unit_economics WHERE bad_col=1"` returned `Table "unit_economics" must be qualified` (the OLD error) instead of `Unrecognized name: bad_col` (the #218 fix's intended behavior).
2. `agnes query "DESCRIBE unit_economics"` showed only DuckDB's misleading `Did you mean order_economics?` with no Agnes hint paragraph (the #219 fix is missing).

Root cause: PR #217's squash merge (`506a378c`) carried stale snapshots of `app/api/query.py` and `cli/commands/query.py` from before #218 and #219 merged. The rebase-and-merge auto-merged those files cleanly (no conflict markers) but the result silently reverted both fixes.

Restore the two changes verbatim. Tests for both fixes already on main and continue to pass against the restored code.

## Test plan

- [x] `pytest tests/test_api_query_guardrail.py tests/test_cli_query.py` — clean
- [x] Manual repro against prod after deploy: both flows now surface the intended diagnostic.
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/225" target="_blank">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
    <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->
2026-05-07 19:57:18 +02:00
ZdenekSrotyr
506a378c3a
release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217)
## Summary

Brings the Keboola connector to feature parity with the legacy internal data-analyst's per-table sync strategies. Closes the four documented gaps from the spec branch (`zs/keboola-connector-specs`):

- **Typed parquet** in the legacy SDK extraction path — column types from Keboola Storage metadata (provider cascade `user > ai-metadata-enrichment > keboola.snowflake-transformation`) survive the CSV → parquet roundtrip; invalid date strings (`'0000-00-00'`) and invalid numeric strings (`'Non-Manager'`) become NULL while keeping the column's typed schema. Pre-fix everything was VARCHAR.
- **Incremental sync** via Storage API `changedSince` — opt-in per table; pulls only delta rows, merges into the existing parquet by `primary_key` (drop_duplicates with keep='last'). Cuts daily extraction from O(full table) to O(delta).
- **Partitioned sync** — flat per-partition layout `data/<table>/<key>.parquet` (e.g. `2026_05.parquet`), per-affected-partition merge for daily updates, chunked initial load with 1-day overlap and 2-empty-chunk stop heuristic.
- **`where_filters`** — server-side row filter with date placeholders (`{{today}}`, `{{last_3_months}}`, `{{start_of_3_months_ago}}`, etc.) resolved at sync time. Force the SDK path; reject `incremental + where_filters` combination at API layer (changedSince already filters temporally).

## Architecture

- **Schema migration v25 → v26**: 7 new columns on `table_registry`. Existing `sync_strategy` column reused (pre-v26 it was inert catalog metadata; post-v26 the extractor dispatches off it).
- **Per-table dispatcher** in `extractor.run()` routes to one of `_extract_via_extension` (full_refresh + extension), `_extract_via_legacy` (full_refresh + filters or extension fallback), `extract_incremental`, or `extract_partitioned`.
- **API conflict policy**: `incremental + where_filters` → 422; `partitioned + query_mode='remote'` → 422; `partitioned ⇒ partition_by required`.
- **Admin UI**: third "Direct extract (Storage API)" radio in the Keboola Register / Edit modals, alongside existing "Whole table (extension)" and "Custom SQL". When selected, exposes a v26 sync-strategy panel with conditional fields per strategy.

## Test plan

- [x] **Unit + module** — 134 v26 tests covering migration, repo, parquet_io, where_filters, incremental (compute_changed_since + merge_parquet + extract_incremental E2E), partitioned (key derivation + merge_partition + chunked windows + extract_partitioned E2E), extractor dispatcher, admin API validators, PUT field clearing, registry-shape → dispatcher bridge
- [x] **HTML form structure** — all v26 inputs + visibility classes + JS payload fields verified in rendered template
- [x] **Real Keboola roundtrip** — registered a small test table as `sync_strategy='incremental'` against a test Storage project, triggered two syncs:
  - Sync 1: `changedSince=None` → full pull → 9 rows typed parquet
  - Sync 2: `changedSince=last_sync - 1d window` → 9 delta rows merged with 9 existing → 9 after dedup on primary_key (PK merge confirmed)
- [x] **Browser UX** — agent-browser session against a local uvicorn: login → admin/tables → register modal → switch radios → verify field visibility per strategy → submit → edit existing row → switch to Direct/Incremental → save → confirm DB persistence
- [x] **Regression** — no regressions in the broader 3252-test suite (3 pre-v26 tests updated for the deprecation-marker removal + schema-version bump; 2 pre-existing environment-sensitive test failures unrelated to this change)

## Bugs caught + fixed during E2E

The browser + real-Keboola roundtrip exposed four bugs the unit tests missed:

1. **JS visibility race** — two competing `forEach` loops set `display=''` then `display='none'` on form elements sharing `kb-strategy-incremental kb-strategy-partitioned` classes (window_days + max_history_days are reused across strategies). Fix: single-pass selector with class-based visibility resolver.
2. **PUT cannot clear field** — pre-v26 `updates = {k: v ... if v is not None}` collapsed "omitted from body" and "sent as null" into the same case, so admin couldn't switch a partitioned row back to full_refresh and have stale `partition_by` clear. Fix: `model_dump(exclude_unset=True)`.
3. **Subprocess DB lock conflict** — `_read_last_sync` reopened `system.duckdb` while the parent server held the write lock (subprocess contract at `app/api/sync.py:_run_sync` line 260). Fix: parent injects `__last_sync__` into table_config before subprocess spawn.
4. **Wrong KBC table_id** — `extract_incremental` / `extract_partitioned` built the Storage API table_id from the registry row's slugified `id` (`circle_inc`) instead of `bucket.source_table` (`in.c-finance.circle`), producing 404s. Fix: prefer `bucket+source_table`; fall back to `id` only when bucket empty.

## Operator notes

- Existing tables stay on `full_refresh` after migration; admins opt individual tables in via `agnes admin register-table --sync-strategy ...`, the Keboola Edit modal, or `POST/PUT /api/admin/registry`.
- `merge_parquet` and `merge_partition` use `pd.concat + drop_duplicates`, loading both existing and delta into pandas RAM. For tables in the multi-million-row range this may OOM — switch to `partitioned` strategy for those (per-partition merge keeps memory bounded). Documented in `### Internal` of the changelog entry.
- Date placeholders are resolved at **sync time**, not register time — a typo'd `{{lasst_week}}` is accepted at register and surfaces only when the next sync runs. By design (rolling windows need late-binding).

## Spec source

The four corresponding plans on the `zs/keboola-connector-specs` branch under `docs/superpowers/plans/2026-05-07-0[1-4]-*.md` capture the design rationale and link back to internal repo references for each subsystem.
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/217" target="_blank">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
    <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->
2026-05-07 19:01:27 +02:00
ZdenekSrotyr
aa5921da67
release: 0.47.0 — source-agnostic catalog metadata + cache discipline (#223)
## Summary

- Catalog enrichment for `query_mode='remote'` rows: `rows`, `size_bytes`, `partition_by`, `clustered_by` per table (BQ + Keboola providers).
- `/api/v2/schema/{id}` cache miss: 2 BQ jobs → 1 (-50%) via shared `fetch_bq_columns_full`.
- All four catalog/schema/sample/metadata caches flush on registry change; single-row re-warm scheduled.
- Automatic cache warmup at server startup (bounded concurrency, opt-out via `AGNES_SKIP_CACHE_WARMUP=1`).
- SSE-driven freshness toolbar on `/admin/tables` with progress bar, log, and per-row badge.
- New admin doc `docs/admin/query-modes.md` — single source of truth on `local` / `remote` / `materialized` choice.

Closes #155.
Closes #156.

## Test plan

- [x] 65+ targeted tests pass across 11 new test modules + 3 modified ones.
- [x] No DB migration; no wire-break; `MIN_COMPAT_CLI_VERSION` unchanged.
- [ ] Reviewer: register a remote BQ table via `/admin/tables`, observe the toolbar populates within ~2 s and the per-row badge transitions warming → fresh.
- [ ] Reviewer: trigger `Re-warm all`, verify SSE log scrolls and `cacheWarmupBar` progresses.
- [ ] Reviewer: edit a registered row's bucket, verify `agnes schema <id>` returns updated columns immediately (no 1-hour staleness).
- [ ] Reviewer: confirm `agnes admin register-table --query-mode remote` prints the new IAM-smoke-check hint.

## Notable design decisions

- BigQuery `INFORMATION_SCHEMA.TABLE_STORAGE` is the only valid scope for size+rows (verified live 2026-05-07; dataset-scoped doesn't exist). Region resolved from `instance.yaml.data_source.bigquery.location` → `bq.client().get_dataset(...)` → fall back to legacy `__TABLES__`.
- VIEW handling: TABLE_STORAGE returns no rows for views, fall through to `__TABLES__` (also empty) → `TableMetadata(rows=None, size_bytes=None, partition_by=..., clustered_by=...)`. Null size signals analyst Claude to apply existing CLAUDE.md guidance.
- `size_bytes` is `active_logical_bytes + long_term_logical_bytes` — full BQ scan reads both; reporting only active undercounts aged partitioned tables.
- Source-agnostic provider seam: per-source `connectors/<source>/metadata.py:fetch(MetadataRequest)`; dispatcher in `app/api/v2_catalog.py:_metadata_provider_for` lazily imports per source_type so a Keboola-only deployment doesn't pay the BQ-extension import cost.
- Warmup non-blocking: FastAPI `lifespan` schedules `asyncio.create_task(_warm_catalog_caches_bg)` before `yield`. Per-row failures isolated.

## Out of scope

- Profile / column histograms / dimension cardinality for remote tables (separate issue).
- Onboarding nudge ("you have 0 remote tables, consider registering some BQ ones") — separate UX call.
- Provider plug-in registration via entry-points (the dispatch table is a hardcoded if-tree today; one line per future source).

## Release

Bumps `pyproject.toml` 0.46.1 → 0.47.0 (main shipped 0.46.0 + 0.46.1 during this PR — see commit `d98976ec`). New CHANGELOG section under `## [0.47.0] — 2026-05-07`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/223" target="_blank">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
    <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->
2026-05-07 18:33:55 +02:00
ZdenekSrotyr
751cc25327
release: 0.46.5 — agnes describe -n parses, server sanitizes NaN (#224)
## Summary

Two bugs in `agnes describe` surfaced from a real analyst session following the CLAUDE.md agent-rails discovery workflow. Together they break `agnes describe` end-to-end for any analyst (or analyst-AI) who follows the documented form.

### A) CLI parsing

`agnes describe TABLE -n 5` failed with `Missing argument 'TABLE_ID'`. Root cause: the command was registered as a `Typer.Typer` subcommand group via `app.add_typer(describe_app, name="describe")` + `@describe_app.callback(invoke_without_command=True)`, and that pattern mis-parses positional + short-int option in some orderings. Same pattern in `cli/commands/schema.py` works only because schema has no INTEGER short option. Fix: switch to flat `@app.command("describe")`.

### B) Server NaN

`/api/v2/sample/<id>` (called by `agnes describe`) returned HTTP 500 with `ValueError: Out of range float values are not JSON compliant: nan` whenever a row contained NaN. Fix: sanitize NaN/±inf to None before JSON serialization.

## Test plan

- [x] `pytest tests/test_cli_describe*.py` — added regression tests pinning `-n` parsing on either side of the positional.
- [x] `pytest tests/test_api_v2_sample*.py` — added regression test for NaN row → JSON `null` (not 500).
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/224" target="_blank">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
    <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->
2026-05-07 18:16:21 +02:00
ZdenekSrotyr
50d10443d1
release: 0.46.2 — friendlier hint on missing-table errors for remote tables (#219)
## Summary

`agnes query "DESCRIBE unit_economics"` (where `unit_economics` is `query_mode='remote'`) previously returned DuckDB's nearest-name suggestion (`Did you mean "order_economics"`?), sending users down the wrong path. Now appends a friendly hint about remote tables.

Reproduced from a real analyst session — colleague spent ~30s diagnosing what was actually "this is a remote table, not materialized locally".

## Test plan

- [x] New test: `_query_local("DESCRIBE unit_economics", ...)` against an empty local DuckDB triggers the new hint, original DuckDB error still echoed.
- [x] Negative test: a syntax-error query does NOT trigger the hint (regex only matches "Table with name X does not exist").
- [x] `pytest tests/test_cli_query*.py` clean.
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/219" target="_blank">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
    <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->
2026-05-07 17:24:10 +02:00
ZdenekSrotyr
28430ced09
Keboola cutover: native parquet path + sync correctness + auto-discover protection (#190)
* fix: cutover regressions + parallel Keboola legacy fallback

Bundled fixes from a fresh-deploy run on a Keboola Storage backend with
the block-shared-snowflake-access feature flag — DuckDB Keboola
extension's per-table scan can't access bucket schemas, so the legacy
kbcstorage Storage-API client is the only working path.

CUTOVER REGRESSIONS

- agnes pull hash mismatch on every Keboola local-mode table —
  src/orchestrator.py:_update_sync_state stored md5(mtime+size)[:12]
  while the CLI compares against full 32-char content MD5. Now stores
  the same content MD5 the materialized SQL path already used.

- Trailing-slash sanitization in connectors/keboola/access.py and
  extractor.py — DuckDB Keboola extension's ATTACH fails when the URL
  ends in / (canonical form).

- src/profiler.py:TableInfo.description becomes optional — two call
  sites instantiated without it, crashing the profiler pass.

- scripts/ops/agnes-auto-upgrade.sh: chown on UID change — older images
  ran as root, current runs as agnes (uid 999). Reads target uid:gid
  from /etc/passwd inside the new image and chowns ${STATE_DIR},
  /data/extracts, /data/analytics when the digest moves.

- POST /api/sync/trigger is now singleton per process — two
  near-simultaneous trigger calls each forked an extractor subprocess,
  fought for extract.duckdb's file lock, starved uvicorn, flipped the
  container to unhealthy. Trigger now returns 409
  (sync_already_in_progress) when held; _run_sync acquires non-blocking.

PARALLEL LEGACY FALLBACK

- Process pool fan-out for the _extract_via_legacy queue (default 8
  workers, override via AGNES_KEBOOLA_PARALLELISM). Process pool, not
  thread pool, because connectors/keboola/client.py:export_table does
  os.chdir(temp_dir) — process-global, so threads raced and slice files
  landed in the wrong directory ("[Errno 2] No such file or directory:
  '<job_id>.csv_X_Y_Z.csv'").

- Extractor subprocess timeout 1800s -> 3600s (configurable via
  AGNES_EXTRACTOR_TIMEOUT_SEC). 28+ tables × multi-minute Keboola export
  jobs need the headroom on telemetry-class projects.

- Process group cleanup on timeout — Popen(start_new_session=True) puts
  the extractor in its own group. On timeout the parent SIGTERMs the
  group (10s grace) then SIGKILLs stragglers. Without this, the pool
  workers were reparented to PID 1 and continued holding open Keboola
  Storage export jobs. Inline extractor script also installs a SIGTERM
  -> sys.exit(143) handler so the with ProcessPoolExecutor(...) block
  __exit__ runs cleanly.

Tests: existing tests that patched subprocess.run updated to patch
subprocess.Popen with a _FakePopen stand-in (same exit-code-injection
contract). Two tests that exercised the parallel path forced
AGNES_KEBOOLA_PARALLELISM=1 to keep mocks alive (mocks don't ride into
ProcessPoolExecutor subprocesses).

Squashed onto current main (was 7 commits + multi-commit CHANGELOG +
agnes-auto-upgrade.sh conflicts; squash avoids per-commit conflict
resolution against main's flat-mount STATE_DIR refactor and 0.38.0
release cut).

* feat(keboola): Storage API direct extract path; drop extension data path

The DuckDB Keboola extension's COPY routes through Keboola QueryService,
which is unreliable on linked-bucket projects (extension v0.1.6 fixes
that case but isn't yet in the community CDN, and pre-fix any project
with the block-shared-snowflake-access feature flag couldn't see bucket
schemas at all). Move the extract path off the extension entirely and
talk to the Storage API directly via signed-URL download — works on any
project, regardless of extension state.

connectors/keboola/storage_api.py (NEW)
  Lightweight client built on requests.Session. Three endpoints:
  - POST /v2/storage/tables/{id}/export-async        (kicks off job)
  - GET  /v2/storage/jobs/{id}                        (poll until done)
  - GET  /v2/storage/files/{id}?federationToken=1     (signed URL detail)
  - GET  <signed_url>                                 (download bytes)
  Supports sliced exports (manifest + per-slice signed URLs) and gzipped
  payloads. ExportFilter dataclass mirrors the Keboola filter spec
  (whereFilters / columns / changedSince / limit) and handles JSON
  round-trip with the registry's source_query column. Token redaction
  in error messages. Bounded exponential backoff on job polling.
  No cloud-SDK dependency on the data path; thread-safe.

connectors/keboola/extractor.py
  - materialize_query() rewritten: takes bucket/source_table/source_query
    (JSON filter spec), exports via KeboolaStorageClient, converts CSV
    to parquet via DuckDB, atomic os.replace. Same return shape so
    sync.py downstream code stays uniform with the BQ branch.
  - _extract_via_legacy() also moved to Storage API direct (kept the
    name for caller compatibility with _legacy_worker / the parallel
    batch extractor). Per-call temp directories — no os.chdir, threads
    don't race.

app/api/sync.py
  _run_materialized_pass for source_type='keboola' rows now constructs a
  KeboolaStorageClient (replaces KeboolaAccess) and passes
  bucket/source_table/source_query to materialize_query. Reuses one
  client across rows for HTTP keep-alive. Sources keboola URL from env
  too (KEBOOLA_STACK_URL) when instance.yaml doesn't have stack_url
  configured.

cli/commands/admin.py
  discover-and-register defaults Keboola rows to query_mode='materialized'
  (NULL source_query = full table), matching the v26 migration's
  unification of the local/materialized split for Keboola. BigQuery and
  Jira keep their per-source defaults.

src/db.py
  Schema bump 25 → 26. Migration: UPDATE table_registry SET
  query_mode='materialized' WHERE source_type='keboola' AND
  query_mode='local'. NULL source_query on those rows means "full table
  export" — same effective behavior the local mode provided, but now
  via Storage API instead of the extension.

pyproject.toml
  kbcstorage dep stays (admin-side bucket/table list still uses the
  SDK in app/api/admin.py / connectors/keboola/client.py); only the
  data path is migrated off the SDK. Comment updated to reflect the
  new boundary.

tests
  - test_keboola_storage_api.py (NEW, 19 tests): ExportFilter parsing,
    HTTP client (token redaction, retry logic, polling), download_file
    (single, gzipped, sliced), end-to-end export_table_to_csv.
  - test_keboola_materialize.py rewritten: mocks KeboolaStorageClient
    instead of FakeAccess; same atomic-write + zero-rows + unsafe-id
    contracts.
  - test_sync_trigger_keboola_materialized.py: registry rows now carry
    bucket+source_table+JSON-shape source_query.

114+ Keboola-impacted tests green locally.

* test: schema version assertion bumped to 26 alongside the keboola query_mode migration

* fix(keboola): cutover hot-patches surfaced on agnes-dev

Five small fixes that were applied as in-container hot-patches during
agnes-dev cutover and need to be on the source-of-truth image so a fresh
upgrade does not undo them.

- app/api/sync.py: auto-discover gate considers the WHOLE registry (any
  source, any mode), not just rows where source matches and query_mode
  is local. After the v25→v26 keboola materialized migration an
  instance can have 30 materialized rows and zero local rows; the
  previous gate kept re-firing _discover_and_register_tables every
  scheduler tick, creating duplicate auto-discovered rows with the
  wrong bucket prefix every time.

- app/api/admin.py: _discover_and_register_tables reassembles the
  bucket as <stage>.<bucket-id> (e.g. in.c-finance) instead of
  dropping the stage prefix; default query_mode for keboola is now
  materialized (the v26 contract); validator allows NULL source_query
  for keboola materialized rows (full-table export via Storage API
  export-async, no SQL needed).

- cli/commands/admin.py: register-table mirrors the server validator
  (NULL source_query allowed for source_type=keboola); --bucket help
  text generalized to cover both BQ dataset and Keboola bucket id.

- connectors/keboola/extractor.py: max_line_size=64 MiB on
  read_csv_auto so embedded JSON / SQL cells (kbc_component_configuration
  in particular) do not trip the default 2 MiB ceiling.

- connectors/keboola/storage_api.py: GCP backend support — when the
  Storage API returns a manifest whose slice URLs are gs://
  references with a gcsCredentials block, rewrite to the JSON REST
  download endpoint and authenticate with the issued OAuth bearer
  token; redact tokens in any surfaced error string.

* test: align with new keboola materialized + auto-discover-gate contracts

- test_admin_keboola_materialized: rename
  test_register_keboola_materialized_rejects_missing_source_query →
  test_register_keboola_materialized_accepts_missing_source_query.
  v25→v26 introduced 'keboola materialized with NULL source_query
  means full-table export via Storage API export-async' as the
  default registration shape; the rejection case is no longer the
  contract.

- test_sync_filter: add list_all() to _StubRegistry. The auto-discover
  gate in _run_sync now keys off the WHOLE registry (not just local
  rows) so materialized-only Keboola instances do not re-trigger
  discovery on every tick.

* feat(keboola): native parquet export — skip CSV roundtrip

Storage API export-async accepts fileType={csv,parquet}. Switching the
materialized sync to parquet eliminates the CSV → DuckDB COPY → parquet
roundtrip that pinned a single uvicorn worker over 4 GiB on multi-GB
tables (read_csv with all_varchar + max_line_size=64MB has to
materialize the whole CSV in memory before COPY can stream out a
parquet). Snowflake UNLOAD on Keboola's side already produces typed,
self-contained parquet files; the extractor downloads them and renames
into place.

Two cases:

- **Single-file** export (small table): file_info.url points at one
  signed URL; download_file streams chunks straight to .parquet.tmp
  and we're done. No DuckDB.

- **Sliced** export (Snowflake UNLOAD respects MAX_FILE_SIZE — 16 MiB
  default — so anything larger arrives as N parquet slices): each
  slice is a complete parquet file with its own footer; naive concat
  would corrupt them. download_file_slices keeps the slices as
  separate files in a tempdir, then DuckDB COPY (SELECT * FROM
  read_parquet([slice0, slice1, ...])) merges them into one
  consolidated parquet. DuckDB streams row groups during this — peak
  memory bounded to one row group (~1 MiB) regardless of source size.

The legacy CSV path stays as the explicit opt-in via source_query=
'{"file_type":"csv"}' for projects whose backend can't UNLOAD
parquet (none known today; cheap escape hatch). Backward-compat alias
KeboolaStorageClient.export_table_to_csv kept.

Also fixes a latent bug in download_file's gzip detection: previous
heuristic flagged any unencrypted file as gzipped, which would have
corrupted parquet downloads at gunzip time. Name-suffix-only now.

* fix: tempdir leak cleanup, every 0m schedule, /sync/trigger body shapes

Three small self-contained fixes uncovered during agnes-dev cutover.

- connectors/keboola/extractor.py: tempfile.TemporaryDirectory now uses
  ignore_cleanup_errors=True so a worker death mid-write doesn't leave
  multi-GiB stale slice trees on the boot disk. (12 GiB seen after a
  disk-full crash where TemporaryDirectory's own cleanup also raised
  and got swallowed.)

- src/scheduler.py: is_valid_schedule accepts 'every 0m' (interval=0
  = always due). Force-resync of an errored row no longer requires
  waiting out the default 'every 1h' interval — admin can flip the
  schedule, trigger, then flip back.

- app/api/sync.py: POST /api/sync/trigger accepts both ['table_id']
  (legacy bare-array body) and {'tables': ['table_id']} (matches the
  response payload shape, more discoverable for clients building
  requests by hand). Malformed bodies return 422 with a structured
  detail; null/missing means 'sync everything' as before.

Tests cover: tempdir cleanup on raise (sliced parquet path),
is_valid_schedule + is_table_due 'every 0m' acceptance, and trigger
body parametrized matrix (8 valid shapes + 6 rejection cases).

* fix: targeted-trigger filter in materialized pass + auto-upgrade defer

Two operational gaps observed during agnes-dev cutover, in the same
sync-routing area.

- _run_materialized_pass now takes a 'tables' arg and skips rows not in
  the target set with reason='not_in_target'. POST /api/sync/trigger
  with a body of tables previously only scoped the legacy extractor
  subprocess — the materialized pass kept iterating every due
  materialized row, so an admin asking to re-sync kbc_job re-ran
  every other due materialized row alongside it. Match on registry id
  OR name (admins commonly pass either form). tables=None preserves
  the no-filter behavior.

- New GET /api/sync/status (public, no auth) returns {locked: bool}
  off _sync_lock.locked(). agnes-auto-upgrade.sh probes this before
  docker compose up -d and exits 0 with a 'deferred recreate' log
  line if a sync is in flight — the next 5-min cron tick retries.
  Pre-fix, an auto-upgrade triggered mid-sync would recreate the
  uvicorn worker and kill the in-flight extractor / Snowflake-UNLOAD
  download (observed when kbc_job's first 7-day retry got SIGKILLed).
  Connection failures in the probe fall through to the upgrade —
  being stuck on a wedged image is worse than interrupting a
  hypothetical sync.

* fix: auto-discover protects admin overrides + surfaces drift

Two real-world incidents on agnes-dev drove this:

1. kbc_job was registered manually with the correct
   (in.c-kbc_telemetry, kbc_job) coordinates. A naive auto-discover
   re-run would have inserted a SECOND kbc_job row at the slugified
   id 'in_c-keboola-storage_kbc_job' (where Keboola's discovery
   places it) — and that row's Storage API export-async 404s.

2. An earlier auto-discover bug stripped the stage prefix from
   bucket ids ('c-finance' instead of 'in.c-finance'), inserting
   137 rows whose syncs all failed.

Fix:

- _discover_and_register_tables now builds a plan first
  (_build_keboola_discovery_plan) classifying each discovered table
  into one of new / existing_match / existing_drift / invalid, then
  executes only the 'new' bucket. Drift rows are reported with both
  sides of the disagreement plus drift_kind:
  - same_id_diff_coords: registry has the same id but different
    bucket / source_table (admin migrated coords inline).
  - name_collision: discovery's slugified id differs from any
    registry id, but the discovered .name matches an existing row's
    .name (case-insensitive). Catches the kbc_job case.

- Bucket detection now prefers the API's authoritative bucket_id
  field (separate field on the Keboola tables.list response,
  normalised by KeboolaClient.discover_all_tables). Falls back to
  id-string parsing only when bucket_id is missing (older fallback
  path inside discover_all_tables).

- Endpoint POST /api/admin/discover-and-register?dry_run=true
  returns the plan without writing — would_register, drift,
  invalid lists. Lets an operator audit before merging discovery
  with a registry that has admin overrides.

Removed 'every 0m' from test_register_request_rejects_malformed_sync_schedule
— the runtime started accepting it in the previous commit (force-resync
override) and the validator follows suit.

* feat(keboola): AGNES_TEMP_DIR routes tempfiles off overlayfs /tmp

The container's /tmp lives on the boot disk's overlayfs (29 GiB on
agnes-dev, shared with /var). Snowflake UNLOAD of a wide table writes
slices into per-call /tmp tempdirs that fill multi-GiB / many-slice
exports long before the dedicated data disk fills. agnes-dev hit
100% boot-disk while the 20 GiB data disk had 15 GiB free.

connectors.keboola.storage_api.get_temp_root() reads AGNES_TEMP_DIR;
mkdirs the target on first use; unset / empty / unwritable falls
back to None (system tempdir, OSS-pre-fix behaviour). Both
materialize_query (parquet path) and _extract_via_legacy (CSV
fallback) and the sliced-CSV concat path in storage_api use the
helper now.

docker-compose.yml defaults AGNES_TEMP_DIR=/data/tmp on app, scheduler,
and extract services. The data volume is the dedicated disk in
production layouts and a plain docker volume in single-disk
dev/laptop setups — same blast radius as the previous /tmp default
on the latter, no regression.
2026-05-07 12:12:14 +02:00
ZdenekSrotyr
c97fd504c5 release: 0.45.0 — easy-wins bundle (#84 #164 #177 #178 #203 #204)
Operator-and-analyst quality bundle: a security fix for the optional
Telegram bot, two CLI gaps closed, and three rounds of UX polish on
`agnes diagnose` and `agnes pull` so non-TTY consumers (CI runners,
Claude Code SessionStart hooks, sub-agent watchdogs) get readable,
actionable signal.

- Pairing-code RNG: random.choices -> secrets.choice (CSPRNG).
- Telegram script runner: refuse out-of-shape usernames before sudo -u.

CLAUDE.md.bak.<ISO-timestamp> before regenerating.

- agnes admin unregister-table <id> -> DELETE /api/admin/registry/{id}
- agnes admin update-table <id> --field=value ...  -> PUT /api/admin/registry/{id}

response but never promotes the headline. BQ billing-equals-data check
downgraded warning -> info.

default (5 s / 1 MiB vs 30 s / 10%) so sub-agent watchdogs don't kill
the pull as a hung process. New env knobs:
AGNES_PULL_PROGRESS_INTERVAL_{SECONDS,BYTES}.

--include-schema (or ?include=schema) to opt back in.

Tests: 120 passed across the touched modules, including new tests for
each fix. Pre-existing failures on main (DB migration v1->v9, binary
rename) are unrelated and not introduced here.
2026-05-07 11:43:16 +02:00
Minas Arustamyan
cd10aefdbd fix(refresh-marketplace): align manual-mode hint with hook JSON
Hook JSON path uses /reload-plugins (no restart needed); manual-mode
echo path was still telling the operator to /exit + restart. Both now
say /reload-plugins.

Tests renamed to *_reload_hint_* to match the new wording.
2026-05-07 06:59:13 +02:00
Minas Arustamyan
3aeb0f2fbd fix(refresh-marketplace): use /reload-plugins instead of /exit + restart
Claude Code's `/reload-plugins` slash command picks up newly installed
plugins into the running session without forcing the user to /exit and
restart Claude Code. The hook JSON `systemMessage` and `additionalContext`
both now point at it.

Tests updated to pin the new hint shape.
2026-05-07 06:59:13 +02:00
Minas Arustamyan
166c1c0752 fix(refresh-marketplace): pass --scope project to claude plugin update
Without `--scope project`, `claude plugin update <name>@agnes` operated
at user scope (the default) instead of updating the project-scoped
install — so version bumps in the served manifest never propagated to
the workspace, even though `claude plugin install` correctly used
`--scope project` for the missing-plugin path.

Mirrors the install line in the same function. Any change refresh-
marketplace makes to a plugin must now stay in project scope —
consistent with the SessionStart hook firing per-workspace.
2026-05-07 06:59:13 +02:00
Minas Arustamyan
50e0463501 feat(marketplace): clone-based plugin setup + auto-refresh SessionStart hook
Adds end-to-end flow for installing and keeping the per-user filtered
Claude Code marketplace in sync with the user's Agnes stack
(admin RBAC grants \ MyAIStack opt-outs U /store installs).

Setup (one-liner in install prompt step 5):
  `agnes refresh-marketplace --bootstrap` clones the per-user marketplace
  bare repo to ~/.agnes/marketplace, strips PAT from the cloned origin
  URL, registers the local path with Claude Code, and installs every
  plugin in the served manifest at --scope project. Replaces a 15-line
  inline shell sequence that tripped Claude Code's agent-driven `rm -rf`
  permission gate.

Auto-refresh (SessionStart hook installed by `agnes init`):
  `agnes refresh-marketplace --quiet` runs every Claude Code session,
  fetches+resets the clone (server rebuilds as orphan commits, so
  pull --ff-only is impossible), and version-aware reconciles:
    - missing in workspace -> claude plugin install <name>@agnes --scope project
    - version differs       -> claude plugin update <name>@agnes
    - matches               -> skip
  Don't auto-uninstall plugins that disappeared from the manifest --
  a transient empty manifest from the server would wipe the stack.

Hook output: when --quiet AND something actually changed, emits Claude
Code hook JSON on stdout -- `systemMessage` (transient toast) and
`hookSpecificOutput.additionalContext` (model-side system reminder),
both carrying the change summary plus a "/exit + restart Claude Code"
instruction (Claude only scans plugins at session start).

Windows hook compatibility: the refresh-marketplace hook command is
wrapped in `bash -c "..."` because Claude Code on Windows runs hook
commands directly without invoking a shell, so `2>/dev/null || true`
would otherwise be passed as literal argv tokens.

Cross-cutting:
  - cli/lib/marketplace.py: shared CLONE_DIR + MARKETPLACE_NAME constants.
  - cli/lib/hooks.py: SessionStart now has two independent entries
    (pull + refresh-marketplace) so a failure in one doesn't suppress
    the other; legacy `da sync` and prior single-pull layouts upgrade
    cleanly on re-init.
  - PAT injection on every git fetch via per-invocation credential
    helper (token in \$AGNES_TOKEN env, never in argv or .git/config).
  - Pre-snapshot of installed plugins captured BEFORE
    `claude plugin marketplace update` so silent auto-applied version
    bumps still fire notifications.
  - scripts/dev/agnes-client-reset.sh: cleans ~/.claude/plugins/marketplaces/agnes,
    ~/.claude/plugins/cache/agnes, drops uv build cache, documents
    workspace-scoped residue that can't be enumerated from the script.
  - app/web/setup_instructions.py: legacy AGNES_DEBUG_AUTH path also
    uses clone (direct HTTPS marketplace add is broken end-to-end on
    every Claude Code distribution -- stores response as single file,
    plugin source paths then 404).

28 new tests (test_cli_refresh_marketplace.py) + extended hook + setup
template tests cover bootstrap, fetch+reset ordering, version-aware
reconcile, project-path filtering, hook JSON shape, and the bash-c
Windows wrapper invariant.
2026-05-07 06:59:13 +02:00
ZdenekSrotyr
630e224578 feat(cli): add agnes self-upgrade with smoke test + rollback
Reuses cli.update_check.check() for the version probe — extended with
bypass_disabled=True so explicit user-typed self-upgrade is not silenced
by AGNES_NO_UPDATE_CHECK (which is for the implicit warning loop).

Install path: uv tool install --force when uv is on PATH; otherwise
curl + pip via sys.executable (NOT system python3, NOT --user — both
would land outside the agnes venv and silently no-op the upgrade).

Smoke test execs the binary at the install-resolved path (uv tool dir
joined with agnes-the-ai-analyst/bin/agnes, or sys.executable's sibling
agnes for pip) — never via shutil.which, which can resolve a stale shadow
on PATH and produce a false-positive smoke pass on the OLD version. Smoke
also asserts --version output contains info.latest via PEP 440 Version()
equality (so 0.40.0 does not falsely match 0.40.10).

On smoke fail: rollback to last_known_good.json (written only after a
previous run's smoke passed). Rollback rc is captured and surfaced on
stderr if it also fails. First-ever upgrade or unrecoverable rollback
prints the canonical bootstrap recovery: curl -fsSL <server>/cli/install.sh | bash.

AGNES_SELF_UPGRADE_IN_PROGRESS=1 is set for the duration of the run
and propagated to the smoke-test subprocess. Layer B's _check_version_headers
honors the sentinel and skips the < min hard-stop, so an in-flight
upgrade can never sys.exit(2) itself.

--force invalidates the update_check cache BEFORE probing. --force +
offline = exit 1 with explicit stderr (without --force, offline is silent).
--quiet suppresses progress output but never gags failure stderr.
2026-05-06 23:23:23 +02:00
ZdenekSrotyr
6c94d2cbce Merge remote-tracking branch 'origin/main' into pr180-review
# Conflicts:
#	CHANGELOG.md
#	pyproject.toml
2026-05-06 07:27:25 +02:00
ZdenekSrotyr
28423907fd feat: clean CLI errors + init progress + skip-materialize + claude.md catalog pointer
Three first-try-failure-surface fixes from Pavel's #185 trace + the
template guidance question, all under PR #188's umbrella so they land
together with the file_server / parallel pull / Tier 1 work.

1. CLI clean-error wrapper — new AgnesTransportError raised by the
   api_*/stream_download helpers when httpx times out / drops /
   refuses, plus a top-level Typer wrapper (cli/main.py) that prints
   one-line "Error: …" + actionable hint and exits non-zero. Full
   traceback goes to ~/.config/agnes/last-error.log for support
   forwarding. Unhandled Exceptions are caught at the same boundary
   so no Python traceback ever leaks to the analyst's terminal.

   Pavel's #185 Phase 3B: a 30-frame httpx traceback from a slow BQ
   --remote query made it look like a CLI bug. Now: clean message +
   hint pointing at `agnes snapshot create` / partition-column
   guidance.

   Entry point in pyproject.toml flipped from `cli.main:app` →
   `cli.main:_run_with_clean_errors` so the wrapper actually runs
   under the installed `agnes` binary.

2. agnes init / agnes pull --skip-materialize + progress bar.
   --skip-materialize omits query_mode='materialized' rows from the
   download set so a first init doesn't spend 44 minutes silently
   pulling a single 6 GB parquet (Pavel's #185 Phase 1). Rich-driven
   per-file progress bar with label/bytes/rate/ETA renders to stderr
   when not --quiet and not --json. Aggregates across the parallel
   ThreadPoolExecutor workers added earlier in this PR.

3. config/claude_md_template.txt: explicit one-line snippet pointing
   at `agnes catalog --json | jq '.tables[] | select(.id=="<id>")'`
   for per-table descriptions + restated invariant: "the description
   field on each catalog row is the authoritative business-rules
   text — re-read live, never copy into this file." Resolves the
   regression-or-feature debate between Pavel (wants annotations)
   and the user feedback that landed in the prior commit (don't
   embed table-specific content; tables change). Catalog command
   stays the source of truth.
2026-05-05 18:11:59 +02:00
ZdenekSrotyr
4908a0d7a2 Merge remote-tracking branch 'origin/main' into pr180-review
# Conflicts:
#	CHANGELOG.md
#	pyproject.toml
2026-05-05 15:22:10 +02:00
Vojtech Rysanek
0843c2bd1b fix(cli): bump --remote query timeout to 300s, add AGNES_QUERY_TIMEOUT
The httpx client behind 'agnes query --remote' used the default 30s
timeout, killing every BigQuery SELECT that took longer than half a
minute — i.e. most non-trivial remote queries.

cli/client.py now exposes QUERY_TIMEOUT_S (default 300s, override via
AGNES_QUERY_TIMEOUT) and propagates a kw-only 'timeout' through
api_get/post/delete/patch. _query_remote passes QUERY_TIMEOUT_S so only
the long-running /api/query path gets the bump; every other CLI call
keeps the 30s default.

Server-side has no read deadline on /api/query, so the client cap was
the sole bottleneck.
2026-05-05 16:40:54 +04:00
ZdenekSrotyr
8d8d2c219e refactor(cli-store): pull/info → agnes admin store; add agnes store mine
Backup-orchestration commands were split across two namespaces (pull in
agnes store, push in agnes admin store), which broke the operator
mental model — pull/push are a paired operation and should sit
together.

Move pull + info into agnes admin store so all bulk operations share
one help screen. Add agnes store mine as the user-facing equivalent —
calls the same /api/store/bundle.zip endpoint with ?owner=me, which
the server resolves to the caller's user_id. Authors can archive
their own uploads without admin role; whole-Store bulk reads stay
admin-flavored as a discoverability hint.

Server: 3-line addition to export_bundle handles owner='me' as a
magic alias for the caller. No new endpoint.

Tests updated: pull/info expectations move from agnes store to
agnes admin store; new tests cover agnes store mine and the
?owner=me server resolution. 69/69 store tests green locally.
2026-05-05 13:49:18 +02:00
ZdenekSrotyr
a8f9d065c8 feat(store): bundle export/import + agnes store update + agnes admin store push
Adds whole-Store backup/restore primitives so an external CI/CD job can
mirror the Store to a git repo (and restore back from one).

REST:
- GET /api/store/bundle.zip — deterministic ZIP of all (filtered) Store
  entities. Layout: manifest.json + entities/<id>/{plugin,assets}/.
  Manifest carries owner_email for cross-instance restore. Auth: any
  authenticated user (Store is community-open).
- POST /api/store/import-bundle — admin-only restore. Modes
  merge|replace|skip; owner resolution by email with stub-disabled-user
  fallback when the email is unknown on the target instance.

CLI:
- agnes store update <id> [--description X] [--zip PATH] ... — in-place
  edit (server PUT permits owner OR admin per F4). Closes the missing
  edit affordance for analysts who want to fix a typo or push a new
  ZIP without losing install_count.
- agnes store pull [-o store.zip] [--unpack DIR] — download the bundle.
  --unpack streams + extracts so an external git-backup workflow can
  drop the tree straight into a repo and `git add .`.
- agnes store info [--json] — counts + size summary.
- agnes admin store push <zip-or-dir> [--mode ...] — admin-only restore.
  Auto-zips a directory client-side so a working-tree → server
  round-trip is one command.

cli/v2_client.py gains api_get_stream helper for binary downloads.

Tests: 5 new server tests (bundle shape + filters + round-trip + stub
user creation + skip mode + admin-only gate) + 11 new CLI tests
(update, pull/unpack, info, admin push). 66/66 store-related tests
green locally.
2026-05-05 11:51:31 +02:00
ZdenekSrotyr
16373d6b0b feat(cli): agnes store + agnes my-stack commands
Adds CLI coverage for the new REST surface introduced in this PR:

  agnes store list / show / install / uninstall / upload / delete
  agnes my-stack show / toggle

Covers 11 of the 15 new endpoints — listing, detail, install/uninstall,
upload (multipart), delete, my-stack get + curated toggle. Photo / docs
download endpoints intentionally skipped; analyst-side automation rarely
needs raw bytes back, and the web UI already covers them.

cli/v2_client.py: api_post_multipart + api_put_multipart helpers (httpx
files= passthrough). api_delete + api_put_json fillers were already
needed for non-multipart writes; added together.

Tests: tests/test_cli_store.py — help-text smoke tests + happy-path
mocked tests for list, install, upload, my-stack show, my-stack toggle.
12 new tests, all green.
2026-05-05 08:18:12 +02:00
ZdenekSrotyr
4c7ce9ce32
Update cli/commands/init.py
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-04 23:25:06 +02:00
ZdenekSrotyr
36012e0833 fix(admin): register-table real-world UX gaps for materialized BQ
Three items from operator feedback after running the actual flow:

(1) Help docstring lied: "--bucket / --source-table ignored" for
materialized rows. Reality: --bucket is load-bearing because
`agnes schema <name>` builds the BQ identifier as
`bq.<bucket>.<source_table>`. An empty bucket registered the row but
broke schema/describe with HTTP 400 "unsafe BQ identifier in
registry". Fix: docstring rewritten to reflect reality, plus
client-side validation rejects materialized + empty bucket with a
clear error pointing at the right knob.

(2) Post-register UX cliff: `agnes pull` after register-table reports
"Updated 0 tables (1 total)" because registration adds a registry
row but does NOT trigger a parquet build. Operators routinely
assume something's broken when they need to run
`agnes setup first-sync` to kick off the materialization. Hint
emitted on success now points at first-sync.

(3) RBAC gotcha: `agnes catalog` is RBAC-filtered via
`resource_grants`, so non-admin users don't see freshly-registered
rows until a grant is created. Hint emitted on success now points at
`agnes admin grant create <group> table <name>`.

Tests: 8/8 in test_cli_admin_materialized.py, including two new
regression tests for the validation + the hint output.
2026-05-04 23:06:17 +02:00
ZdenekSrotyr
8784f10a6b fix(devin-review): stale-token override + status sessions counter + lock comment
Three Devin Review findings on PR #173 addressed in one commit since
they're in adjacent code paths:

1. cli/commands/init.py:99 (\u{1F534}): `agnes init --token NEW` ran
   step 2 verify against the OLD on-disk token because `get_token()`
   read `~/.config/agnes/token.json` before the env var, and
   `_override_server_env` only set the env var. So `agnes init --force`
   on a machine with a stale token.json failed 401 with a confusing
   'token expired' even though the --token arg was valid.

   Fix: ContextVar-based override in `cli.config._token_override`
   checked by `get_token()` BEFORE the on-disk read.
   `_with_token_override` context manager scopes the override.
   `_override_server_env` now also sets the contextvar via
   `_with_token_override(token)`, so both env var and contextvar
   carry the override (env for back-compat with anything bypassing
   get_token; contextvar is the authoritative source).
   Async-safe (each task sees its own override) and leak-proof
   (resets on context exit).
   2 new tests: regression on stale-disk-token + scope leak guard.

2. cli/commands/status.py:43 (\u{1F7E1}): sessions_pending_upload only
   checked legacy `<workspace>/user/sessions/` and always reported 0
   in workspaces bootstrapped with `agnes init` (Claude Code writes
   to `~/.claude/projects/`, not the legacy path). Same bug we fixed
   for `agnes push` in 08e49591.

   Fix: route through `cli.lib.claude_sessions.list_session_files()`
   so status and push agree on what counts as a pending session.

3. connectors/bigquery/extractor.py:111 (\u{1F7E1}): docstring claimed
   "a live holder still wins the second flock attempt" — incorrect on
   Linux. After `unlink()` + `open()`, the new file is a new inode;
   fcntl.flock keys per-inode, so the old holder's lock does NOT block
   the new acquisition. In a genuine TTL-overrun scenario two writers
   CAN race the parquet.tmp.

   Fix: documentation only. Comment now honestly describes the
   inode-recreation behavior, names the threading.Lock as the actual
   in-process guard, and flags pid-gating as the next-iteration fix
   if real corruption surfaces. The 24h default TTL is well above
   typical COPY durations so the practical risk is low.

Tests: 17/17 across test_cli_init.py + test_lib_pull.py + the broader
regression set.
2026-05-04 21:26:30 +02:00
ZdenekSrotyr
103efb69f0 chore(cli-rename): replace stale da verbs in active code paths
Bring admin UI, audit-log messages, code comments, and analyst-facing
skill docs in line with the post-bootstrap CLI surface (`agnes pull`,
`agnes push`, `agnes init`, `agnes snapshot create`). The legacy
`_LEGACY_STRINGS` detection tuple in `app/api/claude_md.py` and the hook
upgrade markers in `cli/lib/hooks.py` are intentionally left as-is —
they exist precisely to flag pre-rewrite content for re-authoring.

Strip "(folded from `da metrics list`)" / "(lifted from `da metrics
show`)" / "Replaces the old `da analyst status`" docstring noise — the
rename history is in CHANGELOG.md, not in module docstrings.
2026-05-04 21:10:43 +02:00
ZdenekSrotyr
ee83cebbda fix(cli): Windows console crash on cs-CZ codepage (port + broaden #172)
Ports Minas's PR #172 (against pre-rename `da` CLI on main) and applies
the principle to the post-rename `agnes` CLI. Two distinct failure modes
on Windows consoles whose default codepage is cp1250 (cs-CZ) / cp1252
(en-US):

1. `agnes pull` and other Rich-progress codepaths
   UnicodeEncodeError on Braille spinner glyphs. Fix: `cli/main.py`
   reconfigures stdout/stderr to UTF-8 with errors='replace' at import
   time on `sys.platform == 'win32'` so Rich's legacy-Windows render
   path emits decodable bytes. Wrapped in try/except so pytest's
   captured streams (which aren't TextIOWrapper) don't break.

2. `agnes skills list` and `agnes skills show`
   UnicodeDecodeError when reading skill markdown containing em-dashes /
   accented chars. Default `Path.read_text()` uses
   locale.getpreferredencoding(False), which is the broken codepage on
   Windows. Fix: every call site passes encoding='utf-8' explicitly.

Broader scope than #172 because:
- The bootstrap rewrite renamed/removed several files Minas's PR
  patched (`cli/commands/analyst.py` -> rolled into init.py;
  `cli/commands/sync.py` -> split into pull/push). Those targets no
  longer exist; the equivalent code lives in init.py.
- Other call sites Minas didn't touch (still bare in his branch) are
  patched here too — config.py / update_check.py / snapshot_meta.py /
  setup.py / skills.py — so the codebase has zero locale-default text
  I/O in cli/.

Side cleanup: stale `Run `da`` reference in snapshot_meta.py:88 fixed
to `agnes` while touching the file.
2026-05-04 20:45:29 +02:00
ZdenekSrotyr
e323ab76cc fix(snapshot): catch httpx transport errors in --estimate path
CI failure: test_readers_in_pre_init_dir asserted no Traceback in stderr
when running `agnes snapshot create x --as y --estimate` in a folder
that never saw `agnes init`. The estimate-guard fix in 3d587681 let
`--estimate` skip the local_db check and reach `api_post_json`, but the
existing `except V2ClientError` doesn't cover transport-layer failures.
With no server configured the URL defaults to http://localhost:8000;
httpx raises ConnectError → ConnectError isn't a V2ClientError → the
exception bubbles up through Typer/rich as a full traceback.

Add `except httpx.HTTPError` next to V2ClientError so connection /
DNS / TLS / timeout failures all render the friendly hint
`Run `agnes init …` first` instead of leaking transport noise.
2026-05-04 20:36:30 +02:00
ZdenekSrotyr
08e4959185 fix(push): read sessions from ~/.claude/projects/<encoded-cwd>/
Real bug: `agnes push` was reading `<workspace>/user/sessions/`, but
Claude Code writes session jsonls to `~/.claude/projects/<encoded-cwd>/`
and nothing on the analyst side ever copies them across. The SessionEnd
hook ran `agnes push` happily and uploaded zero sessions every time.

`cli/lib/claude_sessions.py` probes both Claude Code encoding variants
(older `/`→`-` keeping spaces+tildes; newer all-non-alphanumeric→`-`
with collapsed runs) and unions whichever exist. Users who upgraded
Claude Code mid-project end up with both encoded dirs side-by-side on
disk; the union ensures no session is left behind. Same-named jsonl in
both dirs → newest mtime wins. `<workspace>/user/sessions/` survives as
a fallback for any setup that explicitly mirrors sessions there.

Verified on real disk: helper returns 2 dirs + 8 unioned session files
for the Agnes-test workspace where the previous code returned 0.
2026-05-04 20:29:59 +02:00
ZdenekSrotyr
3d58768143 fix: address Devin Review findings — incomplete renames + estimate guard
13 Devin findings across 10 files:

🔴 Critical:
- app/api/v2_catalog.py:42 — `_fetch_hint` returns `da fetch` in /api/v2/catalog
  responses (user-visible in every catalog list)
- cli/skills/agnes-data-querying.md — 11 stale `da fetch`/`da sync` refs in the
  bundled skill markdown
- config/claude_md_template.txt:38 — referenced `agnes pull --docs-only` flag
  that does NOT exist in agnes pull (removed; spec only ships --quiet/--json/
  --dry-run)

🟡 Important:
- app/api/admin.py:252 — `da fetch` in bq_max_scan_bytes hint
- cli/commands/auth.py:119 — `da sync` in import-token docstring (--help text)
- cli/commands/tokens.py:48 — "Export it so `da` can use it" prose
- ARCHITECTURE.md — 4 stale rows in CLI commands table
- README.md — stale paragraphs for analysts (da sync, da analyst setup)

🚩 Substantive observations addressed:
- app/api/query.py:249,302,489 — server-side error/help strings still said
  `da sync`/`da fetch` (returned in API responses to clients)
- cli/commands/snapshot.py:235-241 — DuckDB existence guard incorrectly
  blocked `--estimate` (server-side dry-run that never opens local DB).
  Added test ensuring estimate path skips the guard.

Skipped (intentionally historical):
- app/api/admin.py:2377,2429,2437 — historical comments describing past
  manifest-vs-sync_state bug; past tense, accurate to keep as `da sync`.
2026-05-04 20:05:06 +02:00
ZdenekSrotyr
20bb9efc0e chore(lint): drop unused os import from init.py 2026-05-04 19:32:18 +02:00
ZdenekSrotyr
7e1dd1adba refactor(cli): drop sync/fetch/analyst/metrics; register init/pull/push (BREAKING) 2026-05-04 18:59:51 +02:00
ZdenekSrotyr
5551f12bb0 fix(cli): hint text 'Run: da sync' → 'Run: agnes pull' 2026-05-04 18:42:21 +02:00
ZdenekSrotyr
ff5da0af90 feat(cli): agnes admin metrics {import,export,validate} 2026-05-04 18:39:05 +02:00
ZdenekSrotyr
42b8d0309b feat(cli): agnes catalog --metrics replaces da metrics list/show 2026-05-04 18:33:17 +02:00
ZdenekSrotyr
8309141705 feat(cli): agnes snapshot create (folded from da fetch); friendly exit if no DuckDB 2026-05-04 18:32:30 +02:00
ZdenekSrotyr
5e1e8c4e14 feat(cli): agnes status = workspace state; old health check moves to agnes diagnose system 2026-05-04 18:29:15 +02:00
ZdenekSrotyr
b799aa534a fix(cli): I1+I2 review — surface manifest_unauthorized + add 3 typed-error tests 2026-05-04 18:19:35 +02:00
ZdenekSrotyr
9b70ca3069 feat(cli): agnes init orchestrator + AGNES_WORKSPACE.md template 2026-05-04 18:15:08 +02:00
ZdenekSrotyr
60b6fbed97 feat(cli): agnes push command (extracted from sync --upload-only) 2026-05-04 18:09:57 +02:00
ZdenekSrotyr
7f89e1d594 feat(cli): agnes pull command (Typer wrapper around lib.pull.run_pull) 2026-05-04 18:07:28 +02:00
ZdenekSrotyr
1563b05f2e refactor(cli): hard-cutover env vars + config dir to AGNES_*
Task 0.5 of clean-analyst-bootstrap. Greenfield rewrite — no fallback,
no aliases. Existing dev environments lose their cached PAT and must
re-authenticate.

Env var renames (hard cutover):
- DA_CONFIG_DIR    -> AGNES_CONFIG_DIR
- DA_SERVER        -> AGNES_SERVER
- DA_SERVER_URL    -> AGNES_SERVER_URL  (test-only stale ref, not in spec)
- DA_NO_UPDATE_CHECK -> AGNES_NO_UPDATE_CHECK
- DA_LOCAL_DIR     -> AGNES_LOCAL_DIR
- DA_TOKEN         -> AGNES_TOKEN
- DA_STREAM_RETRIES -> AGNES_STREAM_RETRIES

Config dir rename: ~/.config/da/ -> ~/.config/agnes/ (across code,
comments, docstrings, error messages, install templates, dev scripts).

Stale `da X` references in CLI source (and adjacent app/, tests/):
swept docstrings, comments, help text, and error messages where the
verb survives the rewrite (init, pull, push, catalog, status, diagnose,
auth, admin, skills, query, schema, describe, explore, disk-info,
snapshot, login, logout, whoami, server, setup) and replaced `da X`
with `agnes X`. Intentionally kept `da sync`, `da fetch`, `da analyst`,
`da metrics` — those verbs are removed in later tasks; the legacy
strings will be detected by `_LEGACY_STRINGS` (added in Task 2).

Test fixes:
- TestCLIVersion now asserts output starts with `agnes ` (was `da `).

Test results: 2675 passed, 25 skipped (full pytest run, excluding 9
pre-existing test_db.py / test_user_management.py / test_e2e_extract.py
/ test_cli_binary_rename.py failures unrelated to this rename).
2026-05-04 16:35:44 +02:00
ZdenekSrotyr
57482be263 feat(cli): #160 shared structured error renderer for BQ-typed responses
The reporter (#160) saw `USER_PROJECT_DENIED` raw in the CLI because
all three CLI error-rendering paths flatten typed BqAccessError /
guardrail / RBAC dicts to a truncated single-line string, hiding the
structured `hint` field that explains how to fix the misconfig.

Fix: shared `cli/error_render.py:render_error(status_code, body)` that
recognizes the canonical typed shapes and pretty-prints them. Falls
back to truncated-and-flattened form for unrecognized bodies, so the
renderer never makes worse-than-status-quo output.

Recognized shapes:
- {detail: {kind: ..., hint?, billing_project?, data_project?}}
  — typed BqAccessError responses from /api/v2/scan, /sample, /schema,
  /api/query (when /api/query escalates a BQ failure)
- {detail: {reason: 'remote_scan_too_large', scan_bytes, limit_bytes,
  tables, suggestion}} — new /api/query cost-guardrail rejection
- {detail: {reason: 'bq_path_not_registered'/'bq_path_access_denied',
  path, hint?, registered_as?}} — new /api/query RBAC patch
- {detail: '...'} — string detail (legacy endpoints)

Wired through 3 CLI paths:
- cli/v2_client.py: V2ClientError.__str__ delegates to render_error;
  pre-truncation removed from V2ClientError.message (was hiding hints
  past 200 chars).
- cli/commands/query.py:_query_remote: parse JSON body, call renderer
  on error.
- cli/commands/query.py:_query_hybrid: catch RemoteQueryError, build
  synthetic `{detail: {kind: error_type, **details}}` payload, render.

tests/test_cli_query.py:test_remote_query_failure: assertion updated
from `"Query failed"` (no longer printed) to `HTTP 400` + `bad SQL`
(what the renderer surfaces for string detail).

Sample output for cross_project_forbidden:

  Error: cross_project_forbidden (HTTP 502)
    billing_project: (empty)
    data_project: prj-example-data-001
    message: USER_PROJECT_DENIED on bigquery.googleapis.com
    hint: Set data_source.bigquery.billing_project in
        /admin/server-config to a project where the SA has
        serviceusage.services.use, or grant the SA that role on the
        data project.

19 tests pass — 10 from T4a now GREEN + 3 prior cli_query tests still
green + 6 ancillary.
2026-05-04 10:31:35 +02:00
ZdenekSrotyr
297d07f2a1 fix(cli): setup summary reflects actual CLAUDE.md write outcome (True/False return) 2026-05-04 07:17:37 +02:00
ZdenekSrotyr
955b56608d feat(api,web,cli): /admin/workspace-prompt + /api/welcome restored + da analyst writes CLAUDE.md
- app/api/claude_md.py: GET /api/welcome (analyst, auth required); GET/PUT/DELETE
  /api/admin/workspace-prompt-template; POST …/preview; two-pass Jinja2 validation
  on PUT; validation stub mirrors build_claude_md_context() shape
- app/main.py: register claude_md_router
- app/web/router.py: GET /admin/workspace-prompt → admin_workspace_prompt.html
- app/web/templates/admin_workspace_prompt.html: CodeMirror editor + live preview +
  status chip + reset modal; mirrors admin_welcome.html for Agent Setup Prompt
- app/web/templates/_app_header.html: add "Agent Workspace Prompt" nav item next to
  "Agent Setup Prompt"; extend _admin_active to cover /admin/workspace-prompt
- cli/commands/analyst.py: _init_claude_workspace now accepts server_url + token;
  _write_claude_md fetches GET /api/welcome, writes CLAUDE.md, graceful 404/5xx;
  setup command adds --no-claude-md flag to opt out; default = write CLAUDE.md
- tests: test_claude_md_api.py (16 tests); test_analyst_bootstrap.py updated with
  4 new CLAUDE.md bootstrap tests; test_welcome_template_api.py: update stale
  assertion about /api/welcome being removed (endpoint restored)
- tests/snapshots/openapi.json: regenerated
2026-05-03 22:44:14 +02:00
ZdenekSrotyr
61ef0d0eed fix(devin-review): address 4 findings on PR #167
- Fix #1: _detect_existing_project now checks .claude/settings.json for
  "da sync" marker instead of deleted CLAUDE.md; update tests accordingly.
- Fix #2: preview endpoint uses autoescape=False to match /setup rendering;
  align render_agent_prompt_banner in welcome_template.py to the same.
- Fix #3: apply _sanitize_banner_html to override render path in setup_page
  so all render paths sanitize consistently.
- Fix #4: move .setup-link-banner into the existing-user branch where
  account_details.last_sync_display is reachable; remove dead copy from
  new-user branch.
2026-05-03 21:15:01 +02:00
ZdenekSrotyr
8db4c1645b feat(admin-prompt): variant C — banner on /setup, drop CLAUDE.md generation
- src/welcome_template.py: rewrite as HTML banner renderer
  (render_agent_prompt_banner); drop _list_tables, _metrics_summary,
  _marketplaces_for_user, render_welcome, _load_default_template.
  build_context now exposes only instance/server/user/now/today.
  _sanitize_banner_html strips script/iframe/on*/javascript: post-render.
- app/api/welcome.py: drop get_welcome handler, WelcomeResponse, old
  _VALIDATION_STUB_CONTEXT. Admin endpoints stay at same URLs; validation
  stub updated to match new slim context. Preview now uses autoescape=True.
- app/web/router.py: setup_page calls render_agent_prompt_banner and passes
  banner_html to install.html; admin_agent_prompt_page drops _load_default_template.
- app/web/templates/install.html: add .setup-banner CSS + banner block above hero.
- cli/commands/analyst.py: replace _generate_claude_md with _init_claude_workspace;
  no CLAUDE.md written, only .claude/CLAUDE.local.md placeholder + settings.json hooks.
- tests: delete test_cli_analyst_welcome.py (tests deleted endpoint/function);
  rewrite TestGenerateClaudeMd → TestInitClaudeWorkspace; update api test to
  assert /api/welcome returns 404 and remove welcome-fetch tests.
2026-05-03 16:12:13 +02:00
ZdenekSrotyr
517e63d217 fix(cli): warn on welcome-fetch failures; expand test coverage 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
c604dad9cf feat(cli): da analyst setup fetches rendered welcome from /api/welcome 2026-05-03 16:10:48 +02:00
ZdenekSrotyr
85d3810535 feat(materialized): query_mode='materialized' for BigQuery + Keboola — admin SELECT → parquet → analyst
Closes the 'admin pre-stages a curated table/view for analysts' use case end-to-end across both supported source connectors.

Backend (BigQuery + Keboola, schema v20):
  - schema v20 adds source_query TEXT to table_registry (renumbered from v19 after main's #150 RBAC migration also bumped to v19)
  - connectors/bigquery/extractor.py adds materialize_query(table_id, sql, *, bq, output_dir, max_bytes=...) — BqAccess session, dry-run cost guardrail (default 10 GiB, configurable via data_source.bigquery.max_bytes_per_materialize), idempotent ATTACH, rows/bytes/md5 metadata for sync_state
  - connectors/keboola/access.py — new KeboolaAccess facade (parallel of BqAccess) wrapping ATTACH 'keboola://...' AS kbc
  - connectors/keboola/extractor.py adds materialize_query — same shape, no dry-run analog (Keboola Storage API has different cost model); legacy bucket-download path skips query_mode='materialized' rows
  - app/api/sync.py:_run_materialized_pass dispatches by source_type to the right materialize_query
  - app/api/admin.py: RegisterTableRequest accepts source_query; model_validator coheres mode↔source_query↔bucket; PUT preserves omitted fields; deprecation marks (Field(deprecated=True)) on sync_strategy + profile_after_sync (no extractor reads them; profile_after_sync becomes inert — bug from earlier work where /api/sync/trigger never honored the flag); _BQ_OPTIONAL_FIELD_DEFAULTS injects defaults into GET /server-config payload

Operator + CLI surface:
  - da admin register-table --query / --query-mode materialized
  - scripts/smoke-test-materialized-bq.sh — end-to-end smoke for operators

Tests (incl. spike + integration + regression):
  - test_db_migration_v20, test_table_registry_source_query
  - test_bq_materialize, test_bq_cost_guardrail, test_bq_init_extract_skips
  - test_keboola_access, test_keboola_extension_query_passthrough (lock-in for the DuckDB extension capability), test_keboola_materialize, test_keboola_init_extract_skips, test_keboola_materialized_e2e (skipped without KBC_TEST_* creds)
  - test_sync_trigger_materialized, test_sync_trigger_keboola_materialized
  - test_api_admin_materialized, test_cli_admin_materialized
  - test_admin_bq_register, test_admin_discover_bigquery, test_admin_keboola_materialized, test_admin_phase_c_deprecation, test_admin_put_preservation, test_materialized_e2e

Cost: BQ uses bigquery_query() (jobs API, view-aware) — works on tables, views, materialized views uniformly. Keboola uses ATTACH+COPY parquet through the DuckDB extension.
2026-05-01 20:25:56 +02:00
ZdenekSrotyr
d0b7e122d6 feat(cli): smart local sync — Claude Code SessionStart/SessionEnd hooks + da sync --quiet
The analyst flow becomes a closed loop with the server-curated table catalog:

  - `da analyst setup` writes `<workspace>/.claude/settings.json` with two hooks:
      SessionStart → `da sync --quiet || true`        — pulls fresh RBAC-filtered parquets at session start
      SessionEnd   → `da sync --upload-only --quiet || true` — uploads session jsonl + CLAUDE.local.md
  - `|| true` keeps Claude Code unblocked when the server is down.
  - Workspace-level (not user-home) so the hooks fire only when Claude Code opens this analyst workspace.
  - `da sync --quiet` rewrites the CLI output for hook consumption — 0 stdout on success, single-line error on failure.
  - Existing settings.json is patched (deep-merged), not overwritten; malformed JSON is reported, not silently overwritten.

Tests cover: workspace bootstrap, hook insertion, malformed-json safety, quiet-mode output shape.
2026-05-01 20:25:27 +02:00
minasarustamyan
d4ac84dd46
feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150)
* feat(rbac): drop dataset_permissions + access_requests + users.role + is_public; v19 migration

BREAKING. Sjednocení datové RBAC vrstvy do per-group resource_grants modelu.
Před PR byla legacy data RBAC vrstva (dataset_permissions + is_public bypass)
de-facto neaktivní — is_public neměl API/UI/CLI surface, default true znamenal
že can_access_table vždycky bypassl. Dnes každý non-admin přístup vyžaduje
explicitní resource_grants(group, "table", id) řádek.

Schema v18 → v19 (src/db.py:_v18_to_v19_finalize):
- DROP TABLE dataset_permissions, access_requests
- DROP COLUMN users.role (NULL artifact since v13)
- DROP COLUMN table_registry.is_public
- Drops přes table-rebuild idiom (rename → create new → INSERT … SELECT
  → drop old) kvůli DuckDB ALTER DROP COLUMN limitacím na tabulkách
  s historic FK constraints. INSERT picks intersection sloupců, takže
  test fixtures s minimal pre-v19 schemou migrate cleanly.

Runtime:
- src/rbac.py:can_access_table → deleguje na app.auth.access.can_access
- DatasetPermissionRepository, AccessRequestRepository smazány
- AGNES_ENABLE_TABLE_GRANTS env-gate v app/resource_types.py odstraněn
  (TABLE je unconditionally enabled)

API drop:
- app/api/permissions.py, app/api/access_requests.py celé soubory
- /admin/permissions web route + admin_permissions.html
- "Request Access" modal v catalog.html + locked-row UI
- ~10 if user.get("role") != "admin" checků nahrazeno (admin shortcut
  je uvnitř can_access_table)
- /api/settings: drop permissions field z GET; PUT /api/settings/dataset
  gate přepnut na can_access(user_id, "table", dataset, conn)

Auth:
- app/auth/jwt.py:create_access_token: drop role parametr (claim zmizí
  z nově vydávaných JWT; staré tokeny zůstávají valid, claim ignored)
- app/api/users.py: drop role z CreateUserRequest / UpdateUserRequest
  (admin promotion = explicit add to Admin group via memberships API)
- src/repositories/users.py: drop role z create() / update()

CLI:
- da admin set-role smazán → hard-fail s replacement command
- da admin add-user --role flag pryč
- da auth import-token --role flag pryč
- da auth whoami: drop "Role:" výpis
- cli/config.py:save_token: role parametr now optional, no longer written
  (back-compat se starými token.json soubory zachována — pole se ignoruje)

Tests:
- DELETE: test_permissions.py, test_permissions_api.py, test_access_requests_api.py
- REWRITE: test_access_control.py (resource_grants flow), test_rbac.py
  (can_access_table over resource_grants), test_journey_rbac.py
  (drop access-request flow), test_resource_types.py (drop env-gate
  tests, drop is_public from helpers), test_v2_*.py (drop role-based
  user dicts in favor of id-based + Admin group membership),
  test_settings_api.py (no permissions field, can_access gate)
- TRIVIAL: ~30 souborů — drop role="admin" arg z UserRepository.create
  a 3rd positional role z create_access_token
- NEW: test_v18_to_v19 migration test (test_db.py),
  test_can_access_table_no_implicit_public (test_rbac.py),
  test_admin_set_role_returns_hardfail (test_cli_admin.py)
- OpenAPI snapshot regenerated

Docs:
- CHANGELOG: BREAKING entry pod [Unreleased]
- CLAUDE.md: schema v18 → v19
- docs/architecture.md: schema table + RBAC sekce přepsána
- docs/auth-google-oauth.md: admin promotion přes da admin break-glass
- cli/skills/security.md: kompletně přepsáno na group-based model
- docs/TODO-rbac-data-enforcement.md: smazáno (TODO splněn)

Test results: 2363 passed, 19 failed. Zbývající failures jsou pre-existing
Windows-specific issues (fcntl, charset) nesouvisející s tímto PR —
ověřeno git stash pop.

Plan: ~/.claude/plans/floofy-coalescing-parnas.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): cut 0.27.0

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-04-30 22:02:16 +02:00