agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	6c94d2cbce	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # pyproject.toml	2026-05-06 07:27:25 +02:00
ZdenekSrotyr	fdc6cd7fb4	release: 0.37.0 — STATE_DIR + flat-mount overlay; host-mount direct-bind fix	2026-05-06 06:53:48 +02:00
ZdenekSrotyr	a9ae5f9c35	fix(flat-mount): preserve data:/srv:ro and caddy_config:/config in caddy override; CHANGELOG The flat-mount overlay's caddy `volumes: !override` block listed only three mounts, but the base docker-compose.yml caddy service has five. `!override` (compose-spec semantics) replaces the entire list, so two mounts were silently dropped under the flat layout: - `data:/srv:ro` — Caddy's read-only view of the agnes data dir, used by the `@download` file_server handler in Caddyfile (added in v0.36.0 as the perf bypass for multi-GB parquet downloads). Without this mount, `try_files /bigquery/data/<id>.parquet …` finds no file and every parquet download falls through to the app's uvicorn worker — defeating the bypass entirely. - `caddy_config:/config` — Caddy's autosave/ACME state. Less critical (we feed certs in via /certs) but loses the autosaved adapter config across container recreates. Restated both mounts with a comment block explaining the !override caveat for any future overlay author. Plus: CHANGELOG entries for the host-mount.yml direct-bind fix and the STATE_DIR + flat-mount overlay under [Unreleased].	2026-05-05 19:29:38 +02:00
ZdenekSrotyr	e2f740d7ab	fix(changelog): consolidate duplicate Added/Changed sections in 0.36.0 Devin Review on PR #188 (15:53Z): the renamed [0.36.0] section had two separate ### Added blocks and two separate ### Changed blocks, which violates Keep-a-Changelog grouping (and CLAUDE.md's explicit 'group by section' rule). Merged each set into a single ordered block: Added, Changed, Fixed. No content removed; only reflowed.	2026-05-05 19:04:51 +02:00
ZdenekSrotyr	f33475cec3	release: 0.36.0 — perf + analyst-clarity bundle Renames the [Unreleased] section to [0.36.0] in CHANGELOG, adds the top-level summary, drops a fresh empty [Unreleased] above, and bumps pyproject from 0.35.1. Also fixes the third Devin Review finding on this PR: the CLI ReadTimeout message hardcoded QUERY_TIMEOUT_S (300s) so a 30s-default call (agnes catalog, agnes auth, …) reported a wait window that didn't match reality. _translate_transport_error now takes the actual httpx timeout from the calling helper; the BQ-job advisory only appears for calls where the timeout was set ≥ 60s.	2026-05-05 18:57:04 +02:00
ZdenekSrotyr	28423907fd	feat: clean CLI errors + init progress + skip-materialize + claude.md catalog pointer Three first-try-failure-surface fixes from Pavel's #185 trace + the template guidance question, all under PR #188's umbrella so they land together with the file_server / parallel pull / Tier 1 work. 1. CLI clean-error wrapper — new AgnesTransportError raised by the api_*/stream_download helpers when httpx times out / drops / refuses, plus a top-level Typer wrapper (cli/main.py) that prints one-line "Error: …" + actionable hint and exits non-zero. Full traceback goes to ~/.config/agnes/last-error.log for support forwarding. Unhandled Exceptions are caught at the same boundary so no Python traceback ever leaks to the analyst's terminal. Pavel's #185 Phase 3B: a 30-frame httpx traceback from a slow BQ --remote query made it look like a CLI bug. Now: clean message + hint pointing at `agnes snapshot create` / partition-column guidance. Entry point in pyproject.toml flipped from `cli.main:app` → `cli.main:_run_with_clean_errors` so the wrapper actually runs under the installed `agnes` binary. 2. agnes init / agnes pull --skip-materialize + progress bar. --skip-materialize omits query_mode='materialized' rows from the download set so a first init doesn't spend 44 minutes silently pulling a single 6 GB parquet (Pavel's #185 Phase 1). Rich-driven per-file progress bar with label/bytes/rate/ETA renders to stderr when not --quiet and not --json. Aggregates across the parallel ThreadPoolExecutor workers added earlier in this PR. 3. config/claude_md_template.txt: explicit one-line snippet pointing at `agnes catalog --json | jq '.tables[] | select(.id=="<id>")'` for per-table descriptions + restated invariant: "the description field on each catalog row is the authoritative business-rules text — re-read live, never copy into this file."
Resolves the regression-or-feature debate between Pavel (wants annotations) and the user feedback that landed in the prior commit (don't embed table-specific content; tables change). Catalog command stays the source of truth.	2026-05-05 18:11:59 +02:00
ZdenekSrotyr	e5fb913cec	perf: Tier 1 event-loop unblocking — async def → def on BQ-bound handlers Five hottest BQ-touching endpoints were `async def` but invoked synchronous DuckDB / BQ-extension calls inside the body. Under uvicorn's single event loop that meant a single heavy `agnes query --remote` (waiting up to ~200 s for BQ's jobs.query) froze EVERY other request — /api/health, dashboard, auth, even another query — for the full BQ wait. Operators saw "VM idle, app frozen" during PR #188's testing. Convert to plain `def` so FastAPI auto-offloads the body to the anyio thread pool. Event loop stays free for non-BQ requests. - app/api/query.py:execute_query - app/api/v2_scan.py:scan_estimate_endpoint, scan_endpoint - app/api/v2_sample.py:sample - app/api/v2_schema.py:schema Audit: 0 `await` statements in any converted handler (verified file-by- file), so the rename is safe. Tests in tests/test_v2_*.py called the handlers via `asyncio.run(...)` which now fails on a non-coroutine return; swapped for direct calls (asyncio.run( -> ( ) — keeps paren balance). Plus AGNES_THREADPOOL_SIZE env var (default 200, was anyio's stock 40) in app/main.py:lifespan. Set via anyio.to_thread.current_default_thread_limiter().total_tokens. 200 is comfortable headroom for <50 concurrent analysts; bump for more. 480/480 impacted tests pass (the 2 remaining errors are a pre-existing fixture setup issue in test_reader_smoke_matrix.py unrelated to this change).	2026-05-05 17:44:08 +02:00
ZdenekSrotyr	30e81a15b9	feat(workspace-prompt): decision tree + size-hint so analyst Claude gets it right first try Three concrete changes addressing the "analyst Claude misuses the CLI" class of bugs (image.png table — issues #3, #5, plus the recurrent "how big is this table" guesswork): 1. config/claude_md_template.txt — the template agnes init writes to <workspace>/CLAUDE.md. Surfaces every catalog-row field with a why, adds a query_mode-based decision tree, explicit --estimate scoping (snapshot create ONLY — was the #1 first-try error), an agnes fetch → agnes snapshot create rename note, and a 6-row failure-mode table that maps each common error wording to its right next step. 2. app/api/v2_catalog.py — populate rough_size_hint for local + materialized rows from the on-disk parquet size, bucketed small/medium/large/very_large. Was hardcoded null with a TODO; AI couldn't tell "is this 6.8 GB" without a failed --remote round-trip. 3. cli/update_check.py — the [update] banner survived the da→agnes rename and printed "[update] da X is out of date" on every command, training analysts to associate the binary with the old name. Verified by rendering the template against representative contexts (33/33 tests pass) and running every use case from the original screenshot through the real CLI against a dev VM.	2026-05-05 16:44:24 +02:00
ZdenekSrotyr	2ae486bc5d	feat(pull): parallel parquet downloads (AGNES_PULL_PARALLELISM=4 default) The download loop in cli/lib/pull.py was strictly serial — N tables took Σ stream_download(t_i). With the Caddy file_server change in this PR, the server can now sustain many parallel sendfile transfers without blocking app workers, so the client-side serialization became the new bottleneck. Switch to ThreadPoolExecutor capped by AGNES_PULL_PARALLELISM (default 4, set 1 to restore pre-PR serial). 4 matches typical home-broadband saturation without over-subscribing the analyst's NIC. Drops to serial when len(to_download) <= 1 to avoid executor overhead in the common single-table case. Per-table error semantics preserved via (tid, entry, err) tuple — a failure on one parquet doesn't abort the rest of the batch. Verified end-to-end against a dev VM with the new Caddy file_server deployed: 2-table pull through agnes CLI works under the new concurrency.	2026-05-05 16:42:55 +02:00
ZdenekSrotyr	ab61e30c91	chore(auto-upgrade): re-fetch compose + Caddyfile, self-update Sibling change to the Caddy file_server PR (#182). Without this, existing long-uptime VMs would pull the new agnes image on auto-upgrade but keep their stale Caddyfile + docker-compose.yml — leaving the file_server route + the data:/srv:ro mount inert. Confirmed live 2026-05-05 when the file_server change merged in main but stayed unreachable on a running dev VM until /opt/agnes/* was scp'd by hand. agnes-auto-upgrade.sh now hashes the bind-mounted config files (Caddyfile + every docker-compose overlay) on every 5 min tick and triggers a `docker compose up -d` recreation when the hash drifts — same trigger path as an image-digest change. Fail-soft via the .new-then-mv pattern: a curl 404 / network blip leaves the existing file untouched. Self-update at the bottom of the script: re-fetch /usr/local/bin/agnes-auto-upgrade.sh itself so the very fix that watches config files lands on running VMs without a manual ssh-and- curl cycle. Otherwise we'd have a self-perpetuating "old script problem" — the watch-config logic never propagating to the VMs that need it. Operators no longer need to ssh + scp Caddyfile/compose changes.	2026-05-05 16:42:13 +02:00
ZdenekSrotyr	1be997f6d4	feat(caddy): file_server for parquet downloads — bypass uvicorn A single analyst's multi-GB `agnes pull` held the only uvicorn worker for the duration of the stream, starving UI / /api/health / every other API endpoint. Container flipped to `unhealthy`. Triggered while a 6.8 GB `order_economics` pull was in-flight on prod 2026-05-05. Caddy now intercepts `GET /api/data/{table_id}/download` and serves the parquet directly via sendfile from the data volume (mounted r-o at /srv inside the caddy container). RBAC enforced by `forward_auth` to a new lightweight `GET /api/data/{table_id}/check-access` endpoint (returns 204 / 403) — the bulk transfer never reaches uvicorn. Path discovery via `try_files` over the known extract.duckdb v2 source subdirs. Anything not at a static path falls through to the existing app handler so legacy `src_data/parquet` and future connectors still work without a Caddyfile change. Non-Caddy deployments are unchanged. Stage 1 (multi-worker uvicorn) was considered but blocked by the single-writer DuckDB lock on system.duckdb — workers > 1 would crash at startup on "Could not set lock on file", the same race that pushed the scheduler from in-process writes to HTTP-via-app. Multi-reader workers + single-writer coordination is out of scope for this PR.	2026-05-05 16:41:33 +02:00
ZdenekSrotyr	4f04235502	feat(bigquery): bq_query_timeout_ms knob; default 600s (was 90s) DuckDB BigQuery extension defaults `bq_query_timeout_ms` to 90 s, which is too tight for analyst-scale queries against view-backed BQ datasets. `agnes query --remote` HTTP 400'd with `Binder Error: Query execution exceeded the timeout. Job ID: ...` whenever the underlying BQ job ran longer than 90 s, even though the job itself was healthy. Add `data_source.bigquery.query_timeout_ms` (default 600 000 ms = 10 min, sentinel 0 falls through to the extension default). Applied via `SET bq_query_timeout_ms` after every `LOAD bigquery` on every BQ-touching DuckDB session: orchestrator's `_remote_attach` ATTACH path, BqAccess session factory, and the standalone extractor. Configurable via `/admin/server-config` UI. Fail-soft: extension versions that don't recognise the setting silently keep the default rather than poisoning the session.	2026-05-05 16:40:40 +02:00
ZdenekSrotyr	4751094e1c	fix(keboola): per-table fallback to legacy Storage-API client (#183) * fix(keboola): per-table fallback to legacy Storage-API client The DuckDB Keboola extension's per-table COPY fails with `Schema '..."in.c-..."' does not exist or not authorized` on projects whose Snowflake backend doesn't expose bucket schemas to the storage-token-derived QueryService role (keboola/duckdb-extension#17). ATTACH itself succeeds, so the existing extension-level fallback in `_try_attach_extension` never triggers — the table is just marked failed. - Promote `kbcstorage>=0.9.0` from optional to core dep so the legacy client import in `_extract_via_legacy` doesn't crash default installs with `ModuleNotFoundError`. - Wrap `_extract_via_extension` in a per-table try/except so a scan failure retries via `_extract_via_legacy` instead of recording `tables_failed` and moving on. Slower than the extension path, but produces correct parquets on affected projects while the upstream extension fix lands. * test(keboola): cover per-table extension→legacy fallback Two existing tests mocked _extract_via_extension to throw and asserted the original message survived in result["errors"]. With per-table fallback, the new flow retries via _extract_via_legacy — which on the mock URLs would throw a different (404 / DNS-fail) error, replacing the asserted message. - Mock _extract_via_legacy alongside _extract_via_extension in test_network_timeout_during_extraction + test_partial_failure_continues + test_all_tables_fail_returns_full_failure_stats so the assertion observes the final propagated error from the fallback chain. - Add test_extension_per_table_failure_falls_back_to_legacy that exercises the new behavior directly: extension scan fails with the QueryService schema-not-authorized message (keboola/duckdb-extension#17), legacy succeeds, parquet ends up queryable.	2026-05-05 15:47:44 +02:00
ZdenekSrotyr	4908a0d7a2	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # pyproject.toml	2026-05-05 15:22:10 +02:00
ZdenekSrotyr	a220955640	release: 0.35.1 — CLI --remote query timeout fix Patch release bundling the only Unreleased change: bump httpx client timeout for agnes query --remote from 30s to 300s (configurable via AGNES_QUERY_TIMEOUT). Renames CHANGELOG [Unreleased] section to [0.35.1] and bumps pyproject version to match.	2026-05-05 15:01:37 +02:00
Vojtech Rysanek	0843c2bd1b	fix(cli): bump --remote query timeout to 300s, add AGNES_QUERY_TIMEOUT The httpx client behind 'agnes query --remote' used the default 30s timeout, killing every BigQuery SELECT that took longer than half a minute — i.e. most non-trivial remote queries. cli/client.py now exposes QUERY_TIMEOUT_S (default 300s, override via AGNES_QUERY_TIMEOUT) and propagates a kw-only 'timeout' through api_get/post/delete/patch. _query_remote passes QUERY_TIMEOUT_S so only the long-running /api/query path gets the bump; every other CLI call keeps the 30s default. Server-side has no read deadline on /api/query, so the client cap was the sole bottleneck.	2026-05-05 16:40:54 +04:00
ZdenekSrotyr	8d8d2c219e	refactor(cli-store): pull/info → agnes admin store; add agnes store mine Backup-orchestration commands were split across two namespaces (pull in agnes store, push in agnes admin store), which broke the operator mental model — pull/push are a paired operation and should sit together. Move pull + info into agnes admin store so all bulk operations share one help screen. Add agnes store mine as the user-facing equivalent — calls the same /api/store/bundle.zip endpoint with ?owner=me, which the server resolves to the caller's user_id. Authors can archive their own uploads without admin role; whole-Store bulk reads stay admin-flavored as a discoverability hint. Server: 3-line addition to export_bundle handles owner='me' as a magic alias for the caller. No new endpoint. Tests updated: pull/info expectations move from agnes store to agnes admin store; new tests cover agnes store mine and the ?owner=me server resolution. 69/69 store tests green locally.	2026-05-05 13:49:18 +02:00
ZdenekSrotyr	3d63965a67	Merge remote-tracking branch 'origin/main' into pr180-review # Conflicts: # CHANGELOG.md # app/web/templates/_app_header.html	2026-05-05 12:05:50 +02:00
ZdenekSrotyr	a8f9d065c8	feat(store): bundle export/import + agnes store update + agnes admin store push Adds whole-Store backup/restore primitives so an external CI/CD job can mirror the Store to a git repo (and restore back from one). REST: - GET /api/store/bundle.zip — deterministic ZIP of all (filtered) Store entities. Layout: manifest.json + entities/<id>/{plugin,assets}/. Manifest carries owner_email for cross-instance restore. Auth: any authenticated user (Store is community-open). - POST /api/store/import-bundle — admin-only restore. Modes merge|replace|skip; owner resolution by email with stub-disabled-user fallback when the email is unknown on the target instance. CLI: - agnes store update <id> [--description X] [--zip PATH] ... — in-place edit (server PUT permits owner OR admin per F4). Closes the missing edit affordance for analysts who want to fix a typo or push a new ZIP without losing install_count. - agnes store pull [-o store.zip] [--unpack DIR] — download the bundle. --unpack streams + extracts so an external git-backup workflow can drop the tree straight into a repo and `git add .`. - agnes store info [--json] — counts + size summary. - agnes admin store push <zip-or-dir> [--mode ...] — admin-only restore. Auto-zips a directory client-side so a working-tree → server round-trip is one command. cli/v2_client.py gains api_get_stream helper for binary downloads. Tests: 5 new server tests (bundle shape + filters + round-trip + stub user creation + skip mode + admin-only gate) + 11 new CLI tests (update, pull/unpack, info, admin push). 66/66 store-related tests green locally.	2026-05-05 11:51:31 +02:00
ZdenekSrotyr	952dc9e74d	fix(profile-sessions): tolerate stat() failures on individual jsonl (Devin Review on #179) The previous gather used `sorted(glob, key=lambda p: p.stat().st_mtime)`. A transient OSError (race with delete, permission flicker, EBADF on a weird filesystem) on any single file raised through the lambda and 500-ed the whole page. Reworked: stat each path under try/except into a (path, stat) list, sort the already-statted entries. Bad files drop silently from the listing. Regression test test_profile_sessions_page_tolerates_stat_failures patches Path.stat to raise on one of two files, asserts the page returns 200 with the good row rendered and the bad row dropped.	2026-05-05 09:53:06 +02:00
ZdenekSrotyr	d878764ac1	fix(session-collector-api): mirror sibling endpoints' audit-on-exception (Devin Review on #179) Devin flagged that run_session_collector still had the same audit-skip gap I fixed in run_verification_detector and run_corporate_memory in the previous two rounds — a PermissionError walking /home, an OSError on /data/user_sessions mkdir, or any other unhandled exception from collector.run() would skip the audit_log row and only show in docker logs. Same try/except + unhandled_error pattern as the sibling endpoints. All three LLM-pipeline run-* endpoints now record their failures the same way; /admin/scheduler-runs sees them. Regression test in tests/test_admin_run_endpoints.py::TestRunSessionCollector::test_unhandled_exception_still_audits.	2026-05-05 09:31:33 +02:00
ZdenekSrotyr	9ebe991b55	feat(profile): per-session jsonl download from /profile/sessions User feedback during e2e of #179: the listing page is nice but I want to grab the raw jsonl and look at what's inside. Adds GET /profile/sessions/<filename>: - Auth via get_current_user (owner-only). - Path safety: rejects "/", "\", "..", leading ".", and any non-".jsonl" filename. The served path resolves under ${DATA_DIR}/user_sessions/<caller.id>/; if resolution escapes that base directory, returns 404 (never 403, so existence of other users' files isn't leaked). - FileResponse with Content-Disposition: attachment. UI: Download button per row in profile_sessions.html. Tests in test_web_ui.py: path-traversal / nested / dotfile / non-jsonl all 404 for owner; unauthenticated 302/401/403; authenticated owner gets 200 + correct Content-Disposition.	2026-05-05 09:15:12 +02:00
ZdenekSrotyr	e86da72997	fix(corporate-memory-api): mirror verification-detector audit-on-exception (Devin Review on #179) Devin flagged that run_corporate_memory still had the same audit-skip gap I just fixed in run_verification_detector — if collect_all() throws anything other than the already-translated ValueError (DuckDB lock, network blip, unexpected SDK error), the audit_log row was never written and /admin/scheduler-runs missed the failure. Same try/except + unhandled_error pattern as the verification_detector fix from `4c4dfee8`. Regression test in tests/test_admin_run_endpoints.py::TestRunCorporateMemory::test_unhandled_exception_still_audits.	2026-05-05 09:11:13 +02:00
ZdenekSrotyr	4c4dfee8e6	feat(profile): /profile/sessions page + audit on detector exception + correct SCHEDULER_AUDIT_ACTIONS Three changes addressing user feedback during e2e test of #179 + Devin Review on `e86dd5ed`. 1) /profile/sessions — new self-service user page in the user menu. Lists all session jsonls the caller uploaded via `agnes push` joined against session_extraction_state. Each row shows uploaded_at, file size, status badge (pending/processed/extracted), processed_at, and items_extracted. The page docstring + help text explicitly call out that items_extracted=0 means the verification detector ran fine but the LLM found no claims to track — that's the documented "no items" outcome, not a broken pipeline. Closes the gap surfaced during the e2e test of #176 where a user could see their sessions on disk and process them through the LLM but had no UI to inspect what happened. 2) run_verification_detector audits unhandled exceptions (Devin #1). If detector.run() threw anything other than the already-translated ValueError, the audit_log row was never written. The endpoint now wraps detector.run in try/except, records the exception in audit_params["unhandled_error"], then re-raises as 500 after audit. The /admin/scheduler-runs page surfaces the failure row with the error type + message. 3) SCHEDULER_AUDIT_ACTIONS list corrected (Devin #2). Previous list had "marketplaces_sync_all" (wrong — actual is "marketplace.sync_all") plus "data_refresh" and "scripts_run_due" which app/api/sync.py and app/api/scripts.py don't write to audit_log. Fixed to the four actually-logged strings; comment points at the missing audit calls as a follow-up. Tests: tests/test_web_ui.py adds TestAdminRoleGuards::test_profile_sessions_page_no_admin_required and tightens test_admin_scheduler_runs_page_admin_only to assert the correct marketplace.sync_all string.	2026-05-05 08:57:35 +02:00
ZdenekSrotyr	f0d091f721	fix(store): scratch dir leak on ZIP validation failure (Devin Review) create_entity + update_entity created the `scratch` temp dir inside one try/finally but cleaned it up in a separate one. Validation HTTPExceptions raised by _safe_zip_extract (zip_unsafe_path, zip_too_large_uncompressed) or the BadZipFile→422 conversion exited the first scope, and the second finally was never entered → temp dir leaked on every failed upload. Devin flagged this on the F2 commit. The leak pre-existed (zip_unsafe_path was the original vector); F2 added zip_too_large_uncompressed to the same broken cleanup path. Fixed by collapsing scratch creation + cleanup into one outer try/finally that covers both extraction AND metadata/bake; the inner try/except/finally still handles BadZipFile→422 + tmp file cleanup. Same restructure in update_entity. Regression test `test_scratch_dir_cleaned_up_after_failed_extraction` triggers a zip_unsafe_path 422 and asserts tmp/agnes_store_* contains no leaked dirs.	2026-05-05 08:52:15 +02:00
ZdenekSrotyr	fd3c76d21b	fix(store): security + correctness blockers found in PR review (F1, F2, F4, F5) Three independent reviews of PR #180 surfaced four real defects in the new Store / my-ai-stack surface. CHANGELOG entries detail each; one-liners: - F1 video_url XSS: any authenticated user could upload a Store entity with `video_url=javascript:...` and pop XSS in any viewer's session via the `<a href=...>` "Watch video" link in store_detail.html. Jinja2 autoescape doesn't block URI schemes inside attribute values. Fixed by scheme-validating to http(s) only on create + update; 400 invalid_video_url. - F2 ZIP decompression bomb: _safe_zip_extract checked path-traversal but not declared file_size totals — a 50 MB compressed upload at 1:1000 ratio decompresses to 50 GB and can DoS the host disk. Fixed by summing zinfo.file_size across infolist() and refusing > 200 MB before extractall touches disk. 413 zip_too_large_uncompressed. - F4 admin authz parity: PUT /api/store/entities/{id} was owner-only while DELETE allowed owner OR admin; the store-detail page hid Edit/Delete buttons from admin even though DELETE was permitted. Fixed by allowing admin on PUT and passing is_admin to the template; gate is now is_owner OR is_admin everywhere. - F5 cross-owner suffix collision: sanitize_username is many-to-one (alice.smith / alice_smith both → alice-smith). Two such users uploading entities with the same display name produced identical `<name>-by-<username>` suffixes, silently colliding in the served agnes-store-bundle on-disk paths AND the manifest catalog (Claude Code dedupes by plugin.json `name`). Fixed by enforcing global uniqueness on the suffixed value at create_entity; 409 conflict_global_suffix. F3 (ZIP symlink members) was investigated and confirmed to be a false-positive — Python's stdlib ZipFile.extractall does not honor symlink mode bits, so no exploit exists. 9 new regression tests in tests/test_store_api.py::TestStoreSecurityFixes covering all four.
Test run locally: 60/60 store-related tests pass.	2026-05-05 08:18:02 +02:00
ZdenekSrotyr	e86dd5edc5	fix(anthropic): strict json_schema (additionalProperties=false) + add /admin/scheduler-runs UI E2E test on a real BQ deploy showed every verification-extraction call fails with HTTP 400 invalid_request_error: "output_config.format.schema: For 'object' type, 'additionalProperties' must be explicitly set to false". The Anthropic structured-output API now requires the field on every object node in the json_schema. Fix: connectors/llm/anthropic_provider.py wraps the caller-supplied schema through a recursive _strict_json_schema() walker that adds the field where missing (preserving any explicit override), then passes the strict variant to the API. Six unit tests in TestStrictJsonSchema pin the recursion across nested objects, array items, and the no-mutation invariant. Adds /admin/scheduler-runs — a read-only admin page that surfaces the last 200 audit-log entries from scheduler-driven actions. New AuditRepository.query_actions(actions, limit) helper, new admin nav entry. Failed scheduler ticks (HTTP 401, network errors) don't reach the audit_log; the page calls that out with a hint to set SCHEDULER_API_TOKEN if no rows show up.	2026-05-05 08:00:57 +02:00
ZdenekSrotyr	9f9aabd72b	fix(corporate-memory): CLI catches fail-fast ValueError, exits 1 with clean message (Devin Review on #179) The PR's #176 fail-fast change made collect_all() raise ValueError when neither an ai: block nor ANTHROPIC_API_KEY/LLM_API_KEY was available. verification_detector's CLI was updated to handle it; corporate_memory's CLI was missed and crashed with an unhandled traceback. services/corporate_memory/collector.py:main() now wraps the collect_all call in try/except ValueError, prints a one-line actionable message to stderr, and returns rc=1. Regression test: test_llm_connector.py::TestCorporateMemoryCollector::test_main_returns_1_on_no_ai_config_instead_of_traceback.	2026-05-05 06:45:10 +02:00
ZdenekSrotyr	e68c2d3f0f	fix(session-collector): argv-free run() helper, drop SystemExit footgun (Devin Review on #179) run_session_collector called collector.main() which did argparse.parse_args() on uvicorn's sys.argv (['app.main:app', '--host', ...]) → sys.exit(2) → SystemExit(2), which inherits from BaseException, escapes FastAPI handlers, and propagates through the thread pool. Every scheduler tick that fired the endpoint either 500-ed or risked killing the uvicorn worker. services/session_collector/collector.py now exposes run(dry_run, verbose) that returns (rc, stats); main() is a thin CLI shim that parses argv and delegates. The admin endpoint calls run() directly and audit-logs the per-run stats (users_processed, files_copied, files_skipped) instead of just the rc. Three regression tests in TestRunHelper. Closes Devin Review finding on app/api/admin.py:2819 (#179).	2026-05-05 06:31:55 +02:00
ZdenekSrotyr	046d8705ee	docs(changelog): correct "two paths" claim + document new env vars The 0.35.0 entry's 'two paths to a working LLM pipeline' wording was defensible only after the #179 review fixes — on the initial cut, the seeded-overlay path was dead code (consumers imported the static-only loader; even when they didn't, env refs in the overlay weren't resolved). Updated Defect 5's bullet to spell out what was broken and what shipped, and added a new bullet for the scheduler-cadence env-var fix. Added the two new test modules under Internal.	2026-05-05 06:05:27 +02:00
Minas Arustamyan	d5a7c9ad79	feat(store): /store + /my-ai-stack — community marketplace + per-user composition Adds a community-driven Store where any authenticated user uploads skills/agents/plugins as ZIPs, plus /my-ai-stack as the per-user composition view. The served Claude Code marketplace is now: (admin_granted ∖ opt_outs) ∪ store_installs Skill + agent installs are merged into a single `agnes-store-bundle` plugin in the served marketplace; type=plugin uploads stay standalone. Names are suffixed with `-by-<owner-username>` at upload time so two owners can use the same display name without colliding in Claude Code's flat skill/agent namespace. Schema v23 → v24 adds three tables: - store_entities — community-uploaded skills/agents/plugins - user_store_installs — what each user has chosen to install - user_plugin_optouts — opt-out overlay on top of admin grants Admin grant-delete drops every user's opt-out for that plugin so re-grant resets cleanly to enabled (no sticky personal preference). UI: - /store — e-commerce-style listing with type/category/owner filters, search, pagination, owner-aware [Install] buttons, clickable cards - /store/new — 2-step upload wizard with drag & drop, preview validation (POST /api/store/entities/preview), docs multi-upload, photo + video URL - /store/{id} — detail page with hero, file list, docs, owner actions (Edit/Delete) for the uploader - /my-ai-stack — Granted plugins (toggle opt-out) + From the Store (uninstall) sections - Admin nav: Marketplaces moved into Admin dropdown, renamed to "Curated Marketplaces" Validation hardening: type-mismatch guards reject skill ZIP uploaded as agent (or vice versa), and plugin ZIPs masquerading as skills/agents. Human-readable error messages mapped client-side from machine codes. Cross-source naming: Store entity-id-prefixed dirs (`plugins/store-<id>/`) plus the bundle (`plugins/store-bundle/`) avoid collisions with admin marketplaces (whose `store` slug is reserved by `is_valid_slug`). 
Bundle composition is content-hashed at serve time — install/uninstall or owner re-upload bumps the bundle's plugin.json `version`, so Claude Code's auto-update toggle picks up changes. Tests: 50+ new tests across naming, repositories, filter (admin ∪ store ∪ bundle), API (upload/install/uninstall/delete/preview/docs), end-to-end marketplace.zip with bundle merging.	2026-05-05 02:53:49 +02:00
ZdenekSrotyr	567385d046	release: 0.35.0 — session pipeline fix (BREAKING) (#176) Five compounding defects on default `docker compose up` deploys made the session pipeline silently broken: sessions uploaded by analysts via `agnes push` landed on /data/user_sessions/<user>/.jsonl but nothing ever processed them. Fix is one PR: promote anthropic + openai to core deps, wire all three LLM-pipeline jobs into scheduler-v2 with offset cadences (10m/15m/17m), drop the side-car services from compose, seed a default ai: block on first-time setup with an env-var fallback in code, surface the pending review queue to admins, and expose a health check that warns when uploaded jsonls aren't being processed. BREAKING for operators on COMPOSE_PROFILES=full or with custom Compose overrides referencing the corporate-memory or session-collector service stanzas — drop them. The scheduler is now the sole driver.	2026-05-05 00:46:27 +02:00
ZdenekSrotyr	0430c0de00	release: 0.34.0 — clean analyst bootstrap (BREAKING) + bundled fixes Headlines: - Clean analyst bootstrap rewrite: web /setup → paste prompt → Claude Code in empty folder = working analyst workspace. CLI binary renamed da → agnes. See CHANGELOG ## [0.34.0] for the full breaking-change matrix. - Unified /setup flow: collapsed the admin/analyst tile split (the ?role= query parameter introduced mid-cycle is gone). Every signed-in user sees the same flow; marketplace + plugins block emitted iff caller has plugin grants. PAT scope uniform (general 90 d). - Bundled fixes: supersedes #172 (Windows console encoding), merges #174 (BigQuery materialize view fix + concurrency, schema v24 migration), closes #171 (--remote query pre-check no longer over-rejects narrow queries on partitioned tables, ~30,000x over-estimate fix). - Devin Review findings addressed throughout the cycle: query.py:464 (rewriter cross-contamination), extractor.py:166 (TTL reclaim dead code), db.py:1757 (v24 migration retry path), init.py:99 (stale on-disk token override), and more. - Operator UX: register-table now requires --bucket for materialized rows + emits first-sync and grant hints on success. agnes status sessions counter reads from ~/.claude/projects/<encoded-cwd>/. agnes init --token now wins over stale ~/.config/agnes/token.json. Open follow-ups (separate issues): - #175 sync architecture redesign (full-extract Keboola, full-file downloads, user-global sync_state) - #177 admin CLI: missing unregister-table / update-table commands - #178 agnes diagnose: introduce "info" severity tier	2026-05-04 23:13:23 +02:00
ZdenekSrotyr	0612c1e1a1	fix(schema-v24): raise on deferred migration so retry path actually runs (Devin Review on db.py:1757) Pre-fix: when v24 migration found rows to migrate but data_source.bigquery.project was empty, it logged a warning per row and returned normally. Schema_version then bumped to 24 unconditionally → next start's 'if current < 24:' gate skipped _v23_to_v24_finalize forever, leaving rows in DuckDB-flavor SQL that the new _wrap_admin_sql_for_jobs_api wrapping path rejects. Devin escalated this from advisory ("idempotent retry") to critical on rescan after my reply. The reply was wrong — the LIKE filter inside the function gives idempotency IF the function is called again, but the schema-version gate prevents that call from happening. Fix (Devin's recommended Approach 1): raise RuntimeError BEFORE the schema-version bump when rows need migration but project_id is empty. The schema_version stays at 23, so on next start the 'if current < 24:' gate fires and the migration runs again — this time with project_id configured. Side effect: a BQ-using deployment that hasn't set the project blocks startup until they do. That's the right call for a config error that would otherwise silently break all materialized tables. The error message points at the right knob (data_source.bigquery.project + restart). No-rows-no-block invariant preserved: the early 'if not rows: return' at the top of _v23_to_v24_finalize means non-BQ deployments are unaffected. Tests: - test_v24_raises_when_project_not_configured_and_rows_need_migration: asserts raise + schema_version stays at 23 (the load-bearing invariant for retry-on-next-start to work) - test_v24_skips_clean_when_no_rows_match_even_without_project: asserts non-BQ deployments don't block startup - Existing 3 tests still pass	2026-05-04 23:11:34 +02:00
ZdenekSrotyr	36012e0833	fix(admin): register-table real-world UX gaps for materialized BQ Three items from operator feedback after running the actual flow: (1) Help docstring lied: "--bucket / --source-table ignored" for materialized rows. Reality: --bucket is load-bearing because `agnes schema <name>` builds the BQ identifier as `bq.<bucket>.<source_table>`. An empty bucket registered the row but broke schema/describe with HTTP 400 "unsafe BQ identifier in registry". Fix: docstring rewritten to reflect reality, plus client-side validation rejects materialized + empty bucket with a clear error pointing at the right knob. (2) Post-register UX cliff: `agnes pull` after register-table reports "Updated 0 tables (1 total)" because registration adds a registry row but does NOT trigger a parquet build. Operators routinely assume something's broken when they need to run `agnes setup first-sync` to kick off the materialization. Hint emitted on success now points at first-sync. (3) RBAC gotcha: `agnes catalog` is RBAC-filtered via `resource_grants`, so non-admin users don't see freshly-registered rows until a grant is created. Hint emitted on success now points at `agnes admin grant create <group> table <name>`. Tests: 8/8 in test_cli_admin_materialized.py, including two new regression tests for the validation + the hint output.	2026-05-04 23:06:17 +02:00
ZdenekSrotyr	5915f92eaa	fix(query-guardrail): single-pass alternation regex (Devin Review on query.py:464) The iterative bare-name rewriter (one re.sub per name, longest-first) was vulnerable to cross-contamination when the GCP project ID contained a registered table name as a hyphen-delimited word. Concrete repro: project = 'my-ue-project' registered = ['orders', 'ue'] user SQL = 'SELECT * FROM orders JOIN ue ON ...' iter 1 (orders): produces 'FROM `my-ue-project.fin.orders` JOIN ue ...' iter 2 (ue): '\bue\b' matches 'ue' INSIDE 'my-ue-project' (hyphen creates word boundary on both sides) — corrupts the iter-1 path Fallback at query.py:576 caught the resulting BQ parse error and fell back to per-table SELECT * estimate, so impact was over-estimation, not fail-open — but the #171 partition-pruning fix silently degraded to pre-fix behavior whenever a project name shared a hyphen-segment with a registered table. Fix: single re.sub call with an alternation regex sorted longest-first. Single-pass means each source position is processed exactly once, so freshly-inserted backticked text from one match isn't re-scanned by later names in the alternation. Regression test test_rewrite_helper_does_not_corrupt_when_project_id_contains_registered_name covers the exact Devin repro.	2026-05-04 22:51:33 +02:00
ZdenekSrotyr	c432e90f62	fix(bq-materialize): TTL reclaim was dead code (Devin Review on extractor.py:166) `_try_acquire_file_lock` opened the lock file with `open(mode='w')` BEFORE the mtime check, which truncated the file and refreshed mtime to now. The subsequent age check always saw ~0, so the TTL reclaim branch was never reachable and `materialize.lock_ttl_seconds` was a silently no-op config knob. Repro: before open(w): mtime age = 100000s after open(w): mtime age = 0s Fix: stat the lock path BEFORE any open(). If pre-probe mtime is older than TTL, unlink (forcing a fresh inode for the open + flock that follows). Order is now stat-then-decide-then-probe, not probe-then-stat-then-decide. Two regression tests added in tests/test_bq_materialize_concurrency.py: - test_stale_held_lock_is_reclaimed_despite_live_holder — exercises the full reclaim path with a still-living fcntl holder. Pre-fix this returned None (in_flight forever); post-fix returns a holder fd on a new inode. - test_failed_probe_does_not_self_refresh_lock_mtime — sister test pins that a failed acquisition's mode='w' truncate doesn't pathologically loop. Residual cross-process risk (genuinely overrunning materialize past TTL races a fresh attempt — both write to the same parquet.tmp, inode-level flock independence means new acquisition succeeds while old holder is still alive) stays documented in the helper docstring. In-process threading.Lock keyed on table_id blocks the single-process race; cross-process protection relies on TTL being well above longest plausible COPY (24h default).	2026-05-04 22:36:56 +02:00
ZdenekSrotyr	ed969f5e37	docs(changelog): unified /setup flow under Unreleased Replace the analyst-vs-admin `?role=` design summary with the unified flow we're shipping: single tile, single PAT-mint shape (general / 90 d), `agnes init` mandatory for everyone, marketplace block gated by `resource_grants`, pre-flight check now validates both git and claude. The intro paragraph references the 10-task unification follow-up and the `?role=` introduction-and-removal cycle so a future operator reading the diff doesn't think they missed a release. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 9.	2026-05-04 22:19:57 +02:00
ZdenekSrotyr	8784f10a6b	fix(devin-review): stale-token override + status sessions counter + lock comment Three Devin Review findings on PR #173 addressed in one commit since they're in adjacent code paths: 1. cli/commands/init.py:99 (🔴): `agnes init --token NEW` ran step 2 verify against the OLD on-disk token because `get_token()` read `~/.config/agnes/token.json` before the env var, and `_override_server_env` only set the env var. So `agnes init --force` on a machine with a stale token.json failed 401 with a confusing 'token expired' even though the --token arg was valid. Fix: ContextVar-based override in `cli.config._token_override` checked by `get_token()` BEFORE the on-disk read. `_with_token_override` context manager scopes the override. `_override_server_env` now also sets the contextvar via `_with_token_override(token)`, so both env var and contextvar carry the override (env for back-compat with anything bypassing get_token; contextvar is the authoritative source). Async-safe (each task sees its own override) and leak-proof (resets on context exit). 2 new tests: regression on stale-disk-token + scope leak guard. 2. cli/commands/status.py:43 (🟡): sessions_pending_upload only checked legacy `<workspace>/user/sessions/` and always reported 0 in workspaces bootstrapped with `agnes init` (Claude Code writes to `~/.claude/projects/`, not the legacy path). Same bug we fixed for `agnes push` in `08e49591`. Fix: route through `cli.lib.claude_sessions.list_session_files()` so status and push agree on what counts as a pending session. 3. connectors/bigquery/extractor.py:111 (🟡): docstring claimed "a live holder still wins the second flock attempt" — incorrect on Linux. After `unlink()` + `open()`, the new file is a new inode; fcntl.flock keys per-inode, so the old holder's lock does NOT block the new acquisition. In a genuine TTL-overrun scenario two writers CAN race the parquet.tmp. Fix: documentation only. 
Comment now honestly describes the inode-recreation behavior, names the threading.Lock as the actual in-process guard, and flags pid-gating as the next-iteration fix if real corruption surfaces. The 24h default TTL is well above typical COPY durations so the practical risk is low. Tests: 17/17 across test_cli_init.py + test_lib_pull.py + the broader regression set.	2026-05-04 21:26:30 +02:00
ZdenekSrotyr	8233c3e3f9	chore(docs): replace stale `da` verbs and vendor-specific install paths Sweep operator runbooks (docs/QUICKSTART, docs/HEADLESS_USAGE, docs/architecture, docs/sample-data, docs/agent-workspace-prompt, docs/metrics/metrics.yml, dev_docs/server, dev_docs/disaster-recovery), the corporate-memory service README, the jira connector README + backfill scripts, the deploy skill, and test docstrings. Replaces `da sync` → `agnes pull`, `da analyst setup` → `agnes init`, `da metrics ...` → `agnes catalog --metrics` / `agnes admin metrics ...`, `da fetch` → `agnes snapshot create`, plus the matching docker-compose admin invocations. Vendor-specific `/opt/data-analyst/` install paths in jira backfill / consistency scripts and operator docs are replaced with the placeholder `<install-dir>` and a new `AGNES_ENV_FILE` env-var override that lets a deployment inject its actual install path without a code change. Aligns with the OSS vendor-agnostic policy in CLAUDE.md. CHANGELOG `### Internal` entry summarizes the audit and reaffirms the intentional stale-marker tuples (`_LEGACY_STRINGS`, `_OUR_COMMAND_MARKERS`) that must keep referencing `da sync` / `da fetch` / etc. for hook upgrade and override-detection logic.	2026-05-04 21:22:19 +02:00
ZdenekSrotyr	976d0c7160	fix(pull): re-download parquet when file missing despite matching hash Pre-fix `agnes pull` decided what to download from sync_state hash equality alone: if server_hash != local_hash or tid not in local_tables or not server_hash: to_download.append(tid) If the recorded local hash matched server but the actual parquet had been deleted from disk, the download was skipped. The next DuckDB view rebuild then fails on a missing file. Repro: `rm server/parquet/X.parquet && agnes pull` → 'Updated 0 tables', X still missing. Failure modes that produce hash-equal-but-file-missing: - manual `rm` of a single parquet - operator-side cleanup of `server/parquet/` - two workspaces sharing one user's `~/.config/agnes/sync_state.json` (TODO(workspace-scoped-sync-state) in pull.py): one workspace writes its parquets, the other reads sync_state and concludes 'I already have these' - disk corruption / partial restore from backup Fix: existence check runs alongside the hash compare. Missing file forces a re-download regardless of hash equality. `parquet_dir` is hoisted above the loop so the existence check is in scope when the download set is built. Tests: regression test for the hash-equal-but-missing-file case + counterpart for the fast-path (hash-equal-and-file-present must still skip).	2026-05-04 21:12:06 +02:00
ZdenekSrotyr	500db8cd3c	fix(query-guardrail): dry-run user SQL not synthetic SELECT * (#171) Closes #171. The /api/query cost guardrail used to dry-run a synthetic `SELECT * FROM <table>` for each registered remote-BQ row referenced by the user SQL — which made BigQuery estimate a full table scan, with column projection, predicate pushdown, and partition pruning all disabled. Narrow queries on big partitioned/clustered tables (the documented happy path for `agnes query --remote`) hit ~30,000× over-estimates and got rejected with 400 `remote_scan_too_large` even when BQ's own dry-run reported single-digit MB. Pavel's report on #171 traced the root cause and proposed the fix: rewrite the user SQL to BQ-native syntax and dry-run it as a single job, exactly the way `bq query --dry_run` works. Implementation: - New helper _rewrite_user_sql_for_bq_dry_run rewrites bare registered names (word-boundary, case-insensitive, longest-first to avoid prefix collisions) + bq."<ds>"."<tbl>" forms to backticked `<project>.<ds>.<tbl>` paths. - _bq_quota_and_cap_guard runs ONE dry-run on the rewritten SQL. Cap check uses the real estimate. - Fallback path: if BQ rejects with bq_bad_request (e.g. DuckDB-only syntax like ::INT casts), the guard falls back to the pre-fix per-table SELECT * approach so non-portable queries still get a (loose) cap estimate instead of fail-opening. Non-parse BQ errors (forbidden, upstream) still propagate as 502. - _bq_guardrail_inputs now also returns name_lookups so the rewriter has the (registered_name, bucket, source_table) mapping it needs. - Per-table breakdown is unavailable from a composite dry-run; total bytes are pinned to dry_run_set[0] for the post-flight record_bytes(sum(...)) call to keep returning the right total. 
Tests (7 new, 3 existing still pass): - dry-run receives rewritten user SQL with WHERE clause intact (the load-bearing assertion for #171) - single dry-run per request even with multiple registered tables (JOIN, UNION) referenced - fallback to per-table SELECT * on bq_bad_request - non-parse BQ errors (forbidden) still 502 - rewriter unit tests: bare + bq.path in same SQL, longest-name-wins on prefix collision, case-insensitive bare-name match	2026-05-04 21:08:21 +02:00
ZdenekSrotyr	e438170ade	merge: pull #174 (BQ materialize view fix + concurrency, 0.33.0) into bootstrap branch Brings in zs/materialize-sync-fix (PR #174): - BigQuery view materialize works (wrap admin SQL in bigquery_query()) - Per-table mutex + fcntl.flock for concurrent COPY corruption - Cost guardrail dry-run engages on materialized rows - Schema v23 -> v24 migration: rewrite source_query to BQ-native - Server-generated trivial source_query from bucket+source_table - Validator backtick relaxation for materialized rows - 0.33.0 release cut Conflict resolution: - CHANGELOG.md: keep our [Unreleased] (bootstrap rewrite content) ABOVE the new [0.33.0] section from #174. The bootstrap rewrite remains unreleased; it'll cut 0.34.0 (or later) when this PR merges to main. - tests/conftest.py: union — keep our analyst-bootstrap fixture re-export AND #174's bq_instance / stub_bq_extractor fixtures. - pyproject.toml auto-merged to 0.33.0 (matches the cut), correct. - src/db.py auto-merged: SCHEMA_VERSION = 24, _v23_to_v24_finalize added — no overlap with our work which left schema at v23. - CLAUDE.md auto-merged: schema-history paragraph extended with v24. Verified: 79/79 across CLI bootstrap suite + materialize suite + schema v24 migration tests pass locally on Python 3.13/macOS.	2026-05-04 20:53:00 +02:00
ZdenekSrotyr	ee83cebbda	fix(cli): Windows console crash on cs-CZ codepage (port + broaden #172) Ports Minas's PR #172 (against pre-rename `da` CLI on main) and applies the principle to the post-rename `agnes` CLI. Two distinct failure modes on Windows consoles whose default codepage is cp1250 (cs-CZ) / cp1252 (en-US): 1. `agnes pull` and other Rich-progress codepaths UnicodeEncodeError on Braille spinner glyphs. Fix: `cli/main.py` reconfigures stdout/stderr to UTF-8 with errors='replace' at import time on `sys.platform == 'win32'` so Rich's legacy-Windows render path emits decodable bytes. Wrapped in try/except so pytest's captured streams (which aren't TextIOWrapper) don't break. 2. `agnes skills list` and `agnes skills show` UnicodeDecodeError when reading skill markdown containing em-dashes / accented chars. Default `Path.read_text()` uses locale.getpreferredencoding(False), which is the broken codepage on Windows. Fix: every call site passes encoding='utf-8' explicitly. Broader scope than #172 because: - The bootstrap rewrite renamed/removed several files Minas's PR patched (`cli/commands/analyst.py` -> rolled into init.py; `cli/commands/sync.py` -> split into pull/push). Those targets no longer exist; the equivalent code lives in init.py. - Other call sites Minas didn't touch (still bare in his branch) are patched here too — config.py / update_check.py / snapshot_meta.py / setup.py / skills.py — so the codebase has zero locale-default text I/O in cli/. Side cleanup: stale `Run `da`` reference in snapshot_meta.py:88 fixed to `agnes` while touching the file.	2026-05-04 20:45:29 +02:00
ZdenekSrotyr	e323ab76cc	fix(snapshot): catch httpx transport errors in --estimate path CI failure: test_readers_in_pre_init_dir asserted no Traceback in stderr when running `agnes snapshot create x --as y --estimate` in a folder that never saw `agnes init`. The estimate-guard fix in `3d587681` let `--estimate` skip the local_db check and reach `api_post_json`, but the existing `except V2ClientError` doesn't cover transport-layer failures. With no server configured the URL defaults to http://localhost:8000; httpx raises ConnectError → ConnectError isn't a V2ClientError → the exception bubbles up through Typer/rich as a full traceback. Add `except httpx.HTTPError` next to V2ClientError so connection / DNS / TLS / timeout failures all render the friendly hint `Run `agnes init …` first` instead of leaking transport noise.	2026-05-04 20:36:30 +02:00
ZdenekSrotyr	cd3293b994	release: 0.33.0 — BQ materialize view fix + concurrency control	2026-05-04 20:30:50 +02:00
ZdenekSrotyr	08e4959185	fix(push): read sessions from ~/.claude/projects/<encoded-cwd>/ Real bug: `agnes push` was reading `<workspace>/user/sessions/`, but Claude Code writes session jsonls to `~/.claude/projects/<encoded-cwd>/` and nothing on the analyst side ever copies them across. The SessionEnd hook ran `agnes push` happily and uploaded zero sessions every time. `cli/lib/claude_sessions.py` probes both Claude Code encoding variants (older `/`→`-` keeping spaces+tildes; newer all-non-alphanumeric→`-` with collapsed runs) and unions whichever exist. Users who upgraded Claude Code mid-project end up with both encoded dirs side-by-side on disk; the union ensures no session is left behind. Same-named jsonl in both dirs → newest mtime wins. `<workspace>/user/sessions/` survives as a fallback for any setup that explicitly mirrors sessions there. Verified on real disk: helper returns 2 dirs + 8 unioned session files for the Agnes-test workspace where the previous code returned 0.	2026-05-04 20:29:59 +02:00
ZdenekSrotyr	d44cace17c	docs(changelog): clean-analyst-bootstrap rewrite (BREAKING)	2026-05-04 19:25:38 +02:00
ZdenekSrotyr	f731ee7897	feat(setup): /setup?role=analyst|admin branching with role tiles	2026-05-04 17:28:47 +02:00
ZdenekSrotyr	cf8930b593	chore(release): cut 0.32.0 — #160 da query --remote on VIEW + 4 reinforcing fixes CHANGELOG: rename [Unreleased] → [0.32.0] — 2026-05-04, prepend a new empty [Unreleased] for next-PR landing zone. pyproject.toml: version 0.31.0 → 0.32.0. Per repo discipline (memory: feedback_release_cut_with_pr.md), the release-cut commit lands as the FINAL commit of the PR that contained the user-visible behavior change — it does not get a separate PR. After merge: tag v0.32.0 on the merge commit + create a GitHub Release (memory: feedback_github_release_per_tag.md — the tag alone isn't enough; the Release prose is the operator-visible artifact). Headline: closes #160. da query --remote now resolves query_mode='remote' BQ rows whose entity is VIEW or MATERIALIZED_VIEW (the bug Pavel hit). Plus 4 reinforcing fixes — server-side cost guardrail (bq_max_scan_bytes, default 5 GiB), registry-gating of direct bq.* paths, bigquery_query() function-call backdoor closed, structured CLI render of typed BQ errors — and one operator-side admin convenience (BQ test-connection endpoint + billing_project placeholder UI). 14 issues caught and addressed across 6 iterations of Devin Review. E2E verified on agnes-zsrotyr.groupondev.com (commit `7f743d03`): - VIEW path resolves (count=23 from active_inventory_view) - VIEW aggregate parity vs filtered BASE TABLE - cost guardrail rejects with structured 400 detail - bq_path_not_registered 403 (incl. quoted "bq" variant) - bigquery_query() blocklist 400 - test-connection endpoint 200 with elapsed_ms	2026-05-04 14:37:52 +02:00
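
The cross-contamination described in commit 5915f92eaa can be reproduced with a minimal sketch. The function names and the `lookups` shape below are hypothetical and are not the repository's actual helper; only the project ID, the dataset `fin`, and the two registered names come from the commit's repro. The sketch contrasts one `re.sub` per name with a single alternation pass.

```python
import re


def rewrite_iterative(sql: str, project: str, lookups: dict[str, tuple[str, str]]) -> str:
    """Pre-fix shape (illustrative): one re.sub per registered name, longest first."""
    for name in sorted(lookups, key=len, reverse=True):
        dataset, table = lookups[name]
        path = f"`{project}.{dataset}.{table}`"
        # Each pass re-scans text inserted by earlier passes, including the project ID.
        sql = re.sub(rf"\b{re.escape(name)}\b", lambda m, p=path: p, sql, flags=re.IGNORECASE)
    return sql


def rewrite_single_pass(sql: str, project: str, lookups: dict[str, tuple[str, str]]) -> str:
    """Post-fix shape (illustrative): one alternation regex, one pass over the source."""
    if not lookups:
        return sql
    names = sorted(lookups, key=len, reverse=True)  # longest first avoids prefix collisions
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b", re.IGNORECASE)
    by_lower = {name.lower(): target for name, target in lookups.items()}

    def replace(match: re.Match) -> str:
        dataset, table = by_lower[match.group(1).lower()]
        return f"`{project}.{dataset}.{table}`"

    # re.sub never re-scans its own replacement text, so inserted paths stay intact.
    return pattern.sub(replace, sql)
```

With `project = "my-ue-project"` and registered names `orders` and `ue`, the iterative version rewrites the `ue` inside the already-inserted project ID, while the single-pass version leaves it alone. Like any word-boundary rewriter, this sketch would also touch qualified column references such as `orders.id`; the commit message does not say how the real helper treats them.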

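The stat-then-decide-then-probe ordering from commit c432e90f62 can be sketched as follows. This is an assumption-laden illustration, not the code in `extractor.py`: the function name and return convention are hypothetical, and it relies on `fcntl`, so it runs on Unix-like systems only.

```python
import fcntl
import os
import time


def try_acquire_file_lock(lock_path: str, ttl_seconds: float):
    """Return an open, flock-held file object, or None if another holder is live."""
    # 1. Stat BEFORE any open(): opening with mode "w" truncates the file and
    #    refreshes its mtime, which would hide a stale lock from the age check.
    try:
        age = time.time() - os.stat(lock_path).st_mtime
    except FileNotFoundError:
        age = None

    # 2. Decide: a lock file older than the TTL is treated as abandoned.
    #    Unlinking forces the open() below to create a fresh inode.
    if age is not None and age > ttl_seconds:
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass  # another process reclaimed it first

    # 3. Probe: open and attempt a non-blocking exclusive lock.
    handle = open(lock_path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle
```

Because the lock is held on the inode rather than the path, reclaiming a stale lock succeeds even while the old holder is still alive. That is the residual cross-process risk the commit message documents, and it is why the TTL needs to sit well above the longest plausible COPY.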

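The download decision fixed in commit 976d0c7160 combines the hash comparison with an on-disk existence check. A minimal sketch follows, assuming parquet files are stored as `<table_id>.parquet` under `parquet_dir`; the function name and the dictionary-based signature are hypothetical, and only the hash condition itself is quoted from the commit message.

```python
from pathlib import Path


def tables_to_download(
    server_hashes: dict[str, str],
    local_hashes: dict[str, str],
    parquet_dir: Path,
) -> list[str]:
    """Return table ids that must be fetched again, in server order."""
    to_download = []
    for tid, server_hash in server_hashes.items():
        local_hash = local_hashes.get(tid)
        # Hash equality alone is not enough: the recorded state can claim a
        # table is current while its parquet file is gone from disk.
        file_missing = not (parquet_dir / f"{tid}.parquet").exists()
        if (
            server_hash != local_hash
            or tid not in local_hashes
            or not server_hash
            or file_missing
        ):
            to_download.append(tid)
    return to_download
```

The fast path is preserved: a table whose hash matches and whose file exists is still skipped, which is the counterpart case the commit's second regression test pins.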
123 commits
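
The ContextVar-based token override from commit 8784f10a6b can be illustrated with a small sketch. The names mirror those in the commit message, but the bodies are assumptions: the `{"token": ...}` file layout is hypothetical, the token file path is passed in explicitly, and the environment-variable fallback is omitted.

```python
import contextlib
import contextvars
import json
from pathlib import Path

# Holds an explicitly supplied token for the duration of one scoped operation.
_token_override: contextvars.ContextVar = contextvars.ContextVar(
    "token_override", default=None
)


@contextlib.contextmanager
def with_token_override(token: str):
    """Scope an override; the previous value is restored on exit, even on error."""
    reset_handle = _token_override.set(token)
    try:
        yield
    finally:
        _token_override.reset(reset_handle)


def get_token(token_file: Path):
    """Check the override BEFORE the on-disk read, so a stale file cannot win."""
    override = _token_override.get()
    if override is not None:
        return override
    if token_file.exists():
        # Hypothetical layout: {"token": "..."}; explicit UTF-8 avoids locale defaults.
        return json.loads(token_file.read_text(encoding="utf-8")).get("token")
    return None
```

Using a `ContextVar` rather than a module-level global means each asyncio task sees its own override, and resetting through the handle returned by `set()` restores whatever value was in place before the scope was entered.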