agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	16373d6b0b	feat(cli): agnes store + agnes my-stack commands Adds CLI coverage for the new REST surface introduced in this PR: agnes store list / show / install / uninstall / upload / delete agnes my-stack show / toggle Covers 11 of the 15 new endpoints — listing, detail, install/uninstall, upload (multipart), delete, my-stack get + curated toggle. Photo / docs download endpoints intentionally skipped; analyst-side automation rarely needs raw bytes back, and the web UI already covers them. cli/v2_client.py: api_post_multipart + api_put_multipart helpers (httpx files= passthrough). api_delete + api_put_json fillers were already needed for non-multipart writes; added together. Tests: tests/test_cli_store.py — help-text smoke tests + happy-path mocked tests for list, install, upload, my-stack show, my-stack toggle. 12 new tests, all green.	2026-05-05 08:18:12 +02:00
ZdenekSrotyr	fd3c76d21b	fix(store): security + correctness blockers found in PR review (F1, F2, F4, F5) Three independent reviews of PR #180 surfaced four real defects in the new Store / my-ai-stack surface. CHANGELOG entries detail each; one-liners: - F1 video_url XSS: any authenticated user could upload a Store entity with `video_url=javascript:...` and pop XSS in any viewer's session via the `<a href=...>` "Watch video" link in store_detail.html. Jinja2 autoescape doesn't block URI schemes inside attribute values. Fixed by scheme-validating to http(s) only on create + update; 400 invalid_video_url. - F2 ZIP decompression bomb: _safe_zip_extract checked path-traversal but not declared file_size totals — a 50 MB compressed upload at 1:1000 ratio decompresses to 50 GB and can DoS the host disk. Fixed by summing zinfo.file_size across infolist() and refusing > 200 MB before extractall touches disk. 413 zip_too_large_uncompressed. - F4 admin authz parity: PUT /api/store/entities/{id} was owner-only while DELETE allowed owner OR admin; the store-detail page hid Edit/Delete buttons from admin even though DELETE was permitted. Fixed by allowing admin on PUT and passing is_admin to the template; gate is now is_owner OR is_admin everywhere. - F5 cross-owner suffix collision: sanitize_username is many-to-one (alice.smith / alice_smith both → alice-smith). Two such users uploading entities with the same display name produced identical `<name>-by-<username>` suffixes, silently colliding in the served agnes-store-bundle on-disk paths AND the manifest catalog (Claude Code dedupes by plugin.json `name`). Fixed by enforcing global uniqueness on the suffixed value at create_entity; 409 conflict_global_suffix. F3 (ZIP symlink members) was investigated and confirmed to be a false-positive — Python's stdlib ZipFile.extractall does not honor symlink mode bits, so no exploit exists. 9 new regression tests in tests/test_store_api.py::TestStoreSecurityFixes covering all four.
Test run locally: 60/60 store-related tests pass.	2026-05-05 08:18:02 +02:00
Minas Arustamyan	537ea7662b	chore(store): genericize email examples in docstring + test Per CLAUDE.md vendor-agnostic OSS guidance — replace the real groupon.com email used as a sanitize_username() example with a placeholder (alice_smith@example.com).	2026-05-05 05:48:32 +02:00
Minas Arustamyan	d5a7c9ad79	feat(store): /store + /my-ai-stack — community marketplace + per-user composition Adds a community-driven Store where any authenticated user uploads skills/agents/plugins as ZIPs, plus /my-ai-stack as the per-user composition view. The served Claude Code marketplace is now: (admin_granted ∖ opt_outs) ∪ store_installs Skill + agent installs are merged into a single `agnes-store-bundle` plugin in the served marketplace; type=plugin uploads stay standalone. Names are suffixed with `-by-<owner-username>` at upload time so two owners can use the same display name without colliding in Claude Code's flat skill/agent namespace. Schema v23 → v24 adds three tables: - store_entities — community-uploaded skills/agents/plugins - user_store_installs — what each user has chosen to install - user_plugin_optouts — opt-out overlay on top of admin grants Admin grant-delete drops every user's opt-out for that plugin so re-grant resets cleanly to enabled (no sticky personal preference). UI: - /store — e-commerce-style listing with type/category/owner filters, search, pagination, owner-aware [Install] buttons, clickable cards - /store/new — 2-step upload wizard with drag & drop, preview validation (POST /api/store/entities/preview), docs multi-upload, photo + video URL - /store/{id} — detail page with hero, file list, docs, owner actions (Edit/Delete) for the uploader - /my-ai-stack — Granted plugins (toggle opt-out) + From the Store (uninstall) sections - Admin nav: Marketplaces moved into Admin dropdown, renamed to "Curated Marketplaces" Validation hardening: type-mismatch guards reject skill ZIP uploaded as agent (or vice versa), and plugin ZIPs masquerading as skills/agents. Human-readable error messages mapped client-side from machine codes. Cross-source naming: Store entity-id-prefixed dirs (`plugins/store-<id>/`) plus the bundle (`plugins/store-bundle/`) avoid collisions with admin marketplaces (whose `store` slug is reserved by `is_valid_slug`). 
Bundle composition is content-hashed at serve time — install/uninstall or owner re-upload bumps the bundle's plugin.json `version`, so Claude Code's auto-update toggle picks up changes. Tests: 50+ new tests across naming, repositories, filter (admin ∪ store ∪ bundle), API (upload/install/uninstall/delete/preview/docs), end-to-end marketplace.zip with bundle merging.	2026-05-05 02:53:49 +02:00
ZdenekSrotyr	0612c1e1a1	fix(schema-v24): raise on deferred migration so retry path actually runs (Devin Review on db.py:1757) Pre-fix: when v24 migration found rows to migrate but data_source.bigquery.project was empty, it logged a warning per row and returned normally. Schema_version then bumped to 24 unconditionally → next start's 'if current < 24:' gate skipped _v23_to_v24_finalize forever, leaving rows in DuckDB-flavor SQL that the new _wrap_admin_sql_for_jobs_api wrapping path rejects. Devin escalated this from advisory ("idempotent retry") to critical on rescan after my reply. The reply was wrong — the LIKE filter inside the function gives idempotency IF the function is called again, but the schema-version gate prevents that call from happening. Fix (Devin's recommended Approach 1): raise RuntimeError BEFORE the schema-version bump when rows need migration but project_id is empty. The schema_version stays at 23, so on next start the 'if current < 24:' gate fires and the migration runs again — this time with project_id configured. Side effect: a BQ-using deployment that hasn't set the project blocks startup until they do. That's the right call for a config error that would otherwise silently break all materialized tables. The error message points at the right knob (data_source.bigquery.project + restart). No-rows-no-block invariant preserved: the early 'if not rows: return' at the top of _v23_to_v24_finalize means non-BQ deployments are unaffected. Tests: - test_v24_raises_when_project_not_configured_and_rows_need_migration: asserts raise + schema_version stays at 23 (the load-bearing invariant for retry-on-next-start to work) - test_v24_skips_clean_when_no_rows_match_even_without_project: asserts non-BQ deployments don't block startup - Existing 3 tests still pass	2026-05-04 23:11:34 +02:00
ZdenekSrotyr	36012e0833	fix(admin): register-table real-world UX gaps for materialized BQ Three items from operator feedback after running the actual flow: (1) Help docstring lied: "--bucket / --source-table ignored" for materialized rows. Reality: --bucket is load-bearing because `agnes schema <name>` builds the BQ identifier as `bq.<bucket>.<source_table>`. An empty bucket registered the row but broke schema/describe with HTTP 400 "unsafe BQ identifier in registry". Fix: docstring rewritten to reflect reality, plus client-side validation rejects materialized + empty bucket with a clear error pointing at the right knob. (2) Post-register UX cliff: `agnes pull` after register-table reports "Updated 0 tables (1 total)" because registration adds a registry row but does NOT trigger a parquet build. Operators routinely assume something's broken when they need to run `agnes setup first-sync` to kick off the materialization. Hint emitted on success now points at first-sync. (3) RBAC gotcha: `agnes catalog` is RBAC-filtered via `resource_grants`, so non-admin users don't see freshly-registered rows until a grant is created. Hint emitted on success now points at `agnes admin grant create <group> table <name>`. Tests: 8/8 in test_cli_admin_materialized.py, including two new regression tests for the validation + the hint output.	2026-05-04 23:06:17 +02:00
ZdenekSrotyr	5915f92eaa	fix(query-guardrail): single-pass alternation regex (Devin Review on query.py:464) The iterative bare-name rewriter (one re.sub per name, longest-first) was vulnerable to cross-contamination when the GCP project ID contained a registered table name as a hyphen-delimited word. Concrete repro: project = 'my-ue-project' registered = ['orders', 'ue'] user SQL = 'SELECT * FROM orders JOIN ue ON ...' iter 1 (orders): produces 'FROM `my-ue-project.fin.orders` JOIN ue ...' iter 2 (ue): '\bue\b' matches 'ue' INSIDE 'my-ue-project' (hyphen creates word boundary on both sides) — corrupts the iter-1 path Fallback at query.py:576 caught the resulting BQ parse error and fell back to per-table SELECT * estimate, so impact was over-estimation, not fail-open — but the #171 partition-pruning fix silently degraded to pre-fix behavior whenever a project name shared a hyphen-segment with a registered table. Fix: single re.sub call with an alternation regex sorted longest-first. Single-pass means each source position is processed exactly once, so freshly-inserted backticked text from one match isn't re-scanned by later names in the alternation. Regression test test_rewrite_helper_does_not_corrupt_when_project_id_contains_registered_name covers the exact Devin repro.	2026-05-04 22:51:33 +02:00
ZdenekSrotyr	c432e90f62	fix(bq-materialize): TTL reclaim was dead code (Devin Review on extractor.py:166) `_try_acquire_file_lock` opened the lock file with `open(mode='w')` BEFORE the mtime check, which truncated the file and refreshed mtime to now. The subsequent age check always saw ~0, so the TTL reclaim branch was never reachable and `materialize.lock_ttl_seconds` was silently a no-op config knob. Repro: before open(w): mtime age = 100000s after open(w): mtime age = 0s Fix: stat the lock path BEFORE any open(). If pre-probe mtime is older than TTL, unlink (forcing a fresh inode for the open + flock that follows). Order is now stat-then-decide-then-probe, not probe-then-stat-then-decide. Two regression tests added in tests/test_bq_materialize_concurrency.py: - test_stale_held_lock_is_reclaimed_despite_live_holder — exercises the full reclaim path with a still-living fcntl holder. Pre-fix this returned None (in_flight forever); post-fix returns a holder fd on a new inode. - test_failed_probe_does_not_self_refresh_lock_mtime — sister test pins that a failed acquisition's mode='w' truncate doesn't pathologically loop. Residual cross-process risk (genuinely overrunning materialize past TTL races a fresh attempt — both write to the same parquet.tmp, inode-level flock independence means new acquisition succeeds while old holder is still alive) stays documented in the helper docstring. In-process threading.Lock keyed on table_id blocks the single-process race; cross-process protection relies on TTL being well above longest plausible COPY (24h default).	2026-05-04 22:36:56 +02:00
ZdenekSrotyr	bc9dd5c5f0	test(setup-instructions): pin no-legacy-da-verbs invariant Adds `test_unified_flow_uses_only_agnes_verbs` that asserts no `da ` substring (with trailing space, to dodge false positives on `Darwin` / `database` / `adapter`) appears in any of the four `resolve_lines()` shapes: - bare (no plugins, no ca) - plugins only - ca only - plugins + ca Also pins the `agnes init --server-url … --token …` shape — commit 8784f10a's stale-on-disk-token fix relies on `init` receiving an explicit `--token` argument; if a future refactor drops the flag from the emitted command the test fails loudly instead of silently regressing to 401-on-stale-token in production. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 10.	2026-05-04 22:20:40 +02:00
ZdenekSrotyr	2ee529533f	refactor(setup-page): drop role query param The `/setup` route no longer accepts `?role=analyst|admin`. The route signature drops the `Literal[...] = Query(...)` parameter and the silent admin-downgrade block (`if role == "admin" and not is_admin: role = "analyst"`). The `role` ctx variable threaded into install.html also goes away — Task 6 cleans up the template's role-tile UI and the JS PAT-mint ternary. `?role=` is silently ignored by FastAPI for unknown query params, so existing bookmarks (none in production — the param was added in this PR and never shipped) just degrade to the unified layout. No RedirectResponse shim needed. Tests: drop the entire `tests/test_setup_page_roles.py` file (eight role-branching tests that no longer apply) and add `tests/test_setup_page_unified.py` with four tests: - `test_setup_page_renders_unified_layout` - `test_setup_page_ignores_role_query_param` - `test_setup_page_renders_marketplace_for_user_with_grants` - `test_install_legacy_path_redirects_to_setup` Also replace the role-aware `test_install_preview_*` tests in test_web_ui.py with unified-layout assertions. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 5.	2026-05-04 22:16:59 +02:00
ZdenekSrotyr	291079b1d2	refactor(welcome-template): drop role param; resolve plugins per-user unconditionally Removes the `role: Literal["analyst", "admin"] = "admin"` parameter from `compute_default_agent_prompt`. The same RBAC pass (`marketplace_filter.resolve_allowed_plugins`) now runs for every user — admin or not. Users with no `resource_grants` rows get the no-marketplace layout; users with grants get the marketplace block inserted. Admin-vs-analyst is no longer a layout branch. `render_agent_prompt_banner` no longer derives a `role` from `user.is_admin`; it just delegates to `compute_default_agent_prompt`. Two `compute_default_agent_prompt(...role=role)` call sites in `app/web/router.py::setup_page` are updated to drop the keyword so the route keeps rendering — Task 5 will remove the `?role=` query parameter and the silent admin-downgrade block from the route signature itself. Tests: drop role-aware assertions from test_welcome_template_renderer and test_welcome_template_api. Both files now assert the unified default contains `agnes init` + `uv tool install` and bans the legacy `agnes auth import-token` / `agnes auth whoami` verbs. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 4.	2026-05-04 22:13:46 +02:00
ZdenekSrotyr	74b7f6e254	feat(setup-instructions): preflight checks both git and claude Renames `_git_check_block` to `_preflight_block` and adds a `claude --version` check beside `git --version`. Both binaries are required by the marketplace step — git for the clone fallback, claude for `claude plugin marketplace add` / `claude plugin install` — so checking them together gives one clear failure instead of two confusing downstream errors. Install hints: `npm i -g @anthropic-ai/claude-code` for Linux / WSL plus a doc URL (https://docs.claude.com/claude-code) for the native macOS / Windows installers. We don't try to one-line a native installer; the canonical instructions live upstream. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 3.	2026-05-04 22:11:38 +02:00
ZdenekSrotyr	e16698c3cc	refactor(setup-instructions): unified layout with mandatory agnes init Adds `_step_numbers(*, has_marketplace, has_skills)` so step numbering lives in one place instead of being split across three branches in `resolve_lines`. Pins the unified layout in the tests: No plugins: 1 install, 2 init, 3 catalog, 4 diagnose, 5 skills, 6 confirm With plugins: 1, 2, 3, 4 preflight, 5 marketplace, 6 diagnose, 7 skills, 8 confirm `agnes auth import-token` / `agnes auth whoami` are now banned from the rendered prompt — `agnes init` subsumes them. The renamed `test_resolve_lines_no_plugins_unified_six_step_layout` asserts those strings are absent and that the new step headers (`Bootstrap your Agnes workspace`, `Verify the data is queryable`) are present. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 2.	2026-05-04 22:10:05 +02:00
ZdenekSrotyr	9334beed15	refactor(setup-instructions): drop role param; collapse analyst/admin into one layout Removes the `role: Literal["analyst", "admin"]` parameter from `resolve_lines` / `render_setup_instructions` and deletes the `_resolve_analyst_lines`, `_analyst_init_lines`, `_analyst_finale_lines` helpers. The unified flow now always emits `agnes init` (the workspace-rails delivery mechanism) in place of the legacy `agnes auth import-token` + `agnes auth whoami` pair, and uses `agnes catalog` as the smoke-verify step. `agnes init` already verifies the PAT internally, and `agnes catalog` doubles as a data-plane smoke check, so dropping `agnes auth whoami` costs no signal. Drops the now-redundant `tests/test_setup_instructions_analyst.py` and patches the one ordering test in `tests/test_setup_instructions.py` that referenced the old "Log in" / "Verify the login" headers. Also strips the `role=role` kwarg from `compute_default_agent_prompt`'s call into `resolve_lines` so the welcome-template render path keeps working; welcome_template.py's own role param is removed in a follow-up task. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 1.	2026-05-04 22:08:48 +02:00
ZdenekSrotyr	8784f10a6b	fix(devin-review): stale-token override + status sessions counter + lock comment Three Devin Review findings on PR #173 addressed in one commit since they're in adjacent code paths: 1. cli/commands/init.py:99 (🔴): `agnes init --token NEW` ran step 2 verify against the OLD on-disk token because `get_token()` read `~/.config/agnes/token.json` before the env var, and `_override_server_env` only set the env var. So `agnes init --force` on a machine with a stale token.json failed 401 with a confusing 'token expired' even though the --token arg was valid. Fix: ContextVar-based override in `cli.config._token_override` checked by `get_token()` BEFORE the on-disk read. `_with_token_override` context manager scopes the override. `_override_server_env` now also sets the contextvar via `_with_token_override(token)`, so both env var and contextvar carry the override (env for back-compat with anything bypassing get_token; contextvar is the authoritative source). Async-safe (each task sees its own override) and leak-proof (resets on context exit). 2 new tests: regression on stale-disk-token + scope leak guard. 2. cli/commands/status.py:43 (🟡): sessions_pending_upload only checked legacy `<workspace>/user/sessions/` and always reported 0 in workspaces bootstrapped with `agnes init` (Claude Code writes to `~/.claude/projects/`, not the legacy path). Same bug we fixed for `agnes push` in `08e49591`. Fix: route through `cli.lib.claude_sessions.list_session_files()` so status and push agree on what counts as a pending session. 3. connectors/bigquery/extractor.py:111 (🟡): docstring claimed "a live holder still wins the second flock attempt" — incorrect on Linux. After `unlink()` + `open()`, the new file is a new inode; fcntl.flock keys per-inode, so the old holder's lock does NOT block the new acquisition. In a genuine TTL-overrun scenario two writers CAN race the parquet.tmp. Fix: documentation only.
Comment now honestly describes the inode-recreation behavior, names the threading.Lock as the actual in-process guard, and flags pid-gating as the next-iteration fix if real corruption surfaces. The 24h default TTL is well above typical COPY durations so the practical risk is low. Tests: 17/17 across test_cli_init.py + test_lib_pull.py + the broader regression set.	2026-05-04 21:26:30 +02:00
ZdenekSrotyr	8233c3e3f9	chore(docs): replace stale `da` verbs and vendor-specific install paths Sweep operator runbooks (docs/QUICKSTART, docs/HEADLESS_USAGE, docs/architecture, docs/sample-data, docs/agent-workspace-prompt, docs/metrics/metrics.yml, dev_docs/server, dev_docs/disaster-recovery), the corporate-memory service README, the jira connector README + backfill scripts, the deploy skill, and test docstrings. Replaces `da sync` → `agnes pull`, `da analyst setup` → `agnes init`, `da metrics ...` → `agnes catalog --metrics` / `agnes admin metrics ...`, `da fetch` → `agnes snapshot create`, plus the matching docker-compose admin invocations. Vendor-specific `/opt/data-analyst/` install paths in jira backfill / consistency scripts and operator docs are replaced with the placeholder `<install-dir>` and a new `AGNES_ENV_FILE` env-var override that lets a deployment inject its actual install path without a code change. Aligns with the OSS vendor-agnostic policy in CLAUDE.md. CHANGELOG `### Internal` entry summarizes the audit and reaffirms the intentional stale-marker tuples (`_LEGACY_STRINGS`, `_OUR_COMMAND_MARKERS`) that must keep referencing `da sync` / `da fetch` / etc. for hook upgrade and override-detection logic.	2026-05-04 21:22:19 +02:00
ZdenekSrotyr	976d0c7160	fix(pull): re-download parquet when file missing despite matching hash Pre-fix `agnes pull` decided what to download from sync_state hash equality alone: if server_hash != local_hash or tid not in local_tables or not server_hash: to_download.append(tid) If the recorded local hash matched server but the actual parquet had been deleted from disk, the download was skipped. The next DuckDB view rebuild then fails on a missing file. Repro: `rm server/parquet/X.parquet && agnes pull` → 'Updated 0 tables', X still missing. Failure modes that produce hash-equal-but-file-missing: - manual `rm` of a single parquet - operator-side cleanup of `server/parquet/` - two workspaces sharing one user's `~/.config/agnes/sync_state.json` (TODO(workspace-scoped-sync-state) in pull.py): one workspace writes its parquets, the other reads sync_state and concludes 'I already have these' - disk corruption / partial restore from backup Fix: existence check runs alongside the hash compare. Missing file forces a re-download regardless of hash equality. `parquet_dir` is hoisted above the loop so the existence check is in scope when the download set is built. Tests: regression test for the hash-equal-but-missing-file case + counterpart for the fast-path (hash-equal-and-file-present must still skip).	2026-05-04 21:12:06 +02:00
ZdenekSrotyr	500db8cd3c	fix(query-guardrail): dry-run user SQL not synthetic SELECT * (#171) Closes #171. The /api/query cost guardrail used to dry-run a synthetic `SELECT * FROM <table>` for each registered remote-BQ row referenced by the user SQL — which made BigQuery estimate a full table scan, with column projection, predicate pushdown, and partition pruning all disabled. Narrow queries on big partitioned/clustered tables (the documented happy path for `agnes query --remote`) hit ~30,000× over-estimates and got rejected with 400 `remote_scan_too_large` even when BQ's own dry-run reported single-digit MB. Pavel's report on #171 traced the root cause and proposed the fix: rewrite the user SQL to BQ-native syntax and dry-run it as a single job, exactly the way `bq query --dry_run` works. Implementation: - New helper _rewrite_user_sql_for_bq_dry_run rewrites bare registered names (word-boundary, case-insensitive, longest-first to avoid prefix collisions) + bq."<ds>"."<tbl>" forms to backticked `<project>.<ds>.<tbl>` paths. - _bq_quota_and_cap_guard runs ONE dry-run on the rewritten SQL. Cap check uses the real estimate. - Fallback path: if BQ rejects with bq_bad_request (e.g. DuckDB-only syntax like ::INT casts), the guard falls back to the pre-fix per-table SELECT * approach so non-portable queries still get a (loose) cap estimate instead of fail-opening. Non-parse BQ errors (forbidden, upstream) still propagate as 502. - _bq_guardrail_inputs now also returns name_lookups so the rewriter has the (registered_name, bucket, source_table) mapping it needs. - Per-table breakdown is unavailable from a composite dry-run; total bytes are pinned to dry_run_set[0] for the post-flight record_bytes(sum(...)) call to keep returning the right total.
Tests (7 new, 3 existing still pass): - dry-run receives rewritten user SQL with WHERE clause intact (the load-bearing assertion for #171) - single dry-run per request even with multiple registered tables (JOIN, UNION) referenced - fallback to per-table SELECT * on bq_bad_request - non-parse BQ errors (forbidden) still 502 - rewriter unit tests: bare + bq.path in same SQL, longest-name-wins on prefix collision, case-insensitive bare-name match	2026-05-04 21:08:21 +02:00
ZdenekSrotyr	bd462187e8	test(welcome-template): tighten default-rendered assertions to new agnes verbs The renderer no longer emits the legacy "da analyst setup" verb (the analyst flow uses `agnes init`, the admin flow uses `agnes auth import-token`). The disjunction assertions ("da analyst setup" OR "agnes auth" OR "curl") were permissive and would have silently kept passing even if the renderer regressed. Replace them with role-aware assertions that match the actual emitted markers and explicitly check that no legacy verb survives.	2026-05-04 21:07:51 +02:00
ZdenekSrotyr	8890b6f09b	fix(post-merge): clean up stale `da` verbs introduced via #174 merge Four call sites where #174 (branched from main before the agnes rename fully landed in some files) emitted or referenced `da fetch`. None are operator-visible runtime crashes — but `extractor.py` logs a stale verb to the operator log and `DATA_SOURCES.md` is current docs: - connectors/bigquery/extractor.py:431,434 (operator-facing log line on unverified BQ entity_type — was suggesting `da fetch`). - docs/DATA_SOURCES.md:77,85 (current public docs, two refs to `da fetch` in the workflow + the BQ scope description). - tests/test_cli_query_render.py:7 (module docstring listed `da fetch / agnes schema / etc.` — now `agnes snapshot create / agnes schema / etc.`). - tests/test_cli_snapshot_create.py:1 (docstring referenced `(folded from `da fetch`)` — historical, removed; no value once the rename landed). Pre-existing stale `da` references elsewhere in the branch (templates, operator runbooks, internal comments) are not touched by this commit — they live outside the merge surface and are a separate cleanup task. Verified: 10/10 across the affected test files pass.	2026-05-04 20:57:36 +02:00
ZdenekSrotyr	e438170ade	merge: pull #174 (BQ materialize view fix + concurrency, 0.33.0) into bootstrap branch Brings in zs/materialize-sync-fix (PR #174): - BigQuery view materialize works (wrap admin SQL in bigquery_query()) - Per-table mutex + fcntl.flock for concurrent COPY corruption - Cost guardrail dry-run engages on materialized rows - Schema v23 -> v24 migration: rewrite source_query to BQ-native - Server-generated trivial source_query from bucket+source_table - Validator backtick relaxation for materialized rows - 0.33.0 release cut Conflict resolution: - CHANGELOG.md: keep our [Unreleased] (bootstrap rewrite content) ABOVE the new [0.33.0] section from #174. The bootstrap rewrite remains unreleased; it'll cut 0.34.0 (or later) when this PR merges to main. - tests/conftest.py: union — keep our analyst-bootstrap fixture re-export AND #174's bq_instance / stub_bq_extractor fixtures. - pyproject.toml auto-merged to 0.33.0 (matches the cut), correct. - src/db.py auto-merged: SCHEMA_VERSION = 24, _v23_to_v24_finalize added — no overlap with our work which left schema at v23. - CLAUDE.md auto-merged: schema-history paragraph extended with v24. Verified: 79/79 across CLI bootstrap suite + materialize suite + schema v24 migration tests pass locally on Python 3.13/macOS.	2026-05-04 20:53:00 +02:00
ZdenekSrotyr	e6a2c4c51d	tests: rename 'prj-grp' placeholder to 'my-project' for vendor-agnostic OSS The dashed identifier is what the test exercises (backticks required for dashed BQ project IDs); the literal string can be any synthetic value. 'prj-grp' is too close to a real customer-prefix pattern that the OSS vendor-scrub regex flags. 'my-project' matches placeholders used elsewhere in the project.	2026-05-04 20:38:47 +02:00
ZdenekSrotyr	08e4959185	fix(push): read sessions from ~/.claude/projects/<encoded-cwd>/ Real bug: `agnes push` was reading `<workspace>/user/sessions/`, but Claude Code writes session jsonls to `~/.claude/projects/<encoded-cwd>/` and nothing on the analyst side ever copies them across. The SessionEnd hook ran `agnes push` happily and uploaded zero sessions every time. `cli/lib/claude_sessions.py` probes both Claude Code encoding variants (older `/`→`-` keeping spaces+tildes; newer all-non-alphanumeric→`-` with collapsed runs) and unions whichever exist. Users who upgraded Claude Code mid-project end up with both encoded dirs side-by-side on disk; the union ensures no session is left behind. Same-named jsonl in both dirs → newest mtime wins. `<workspace>/user/sessions/` survives as a fallback for any setup that explicitly mirrors sessions there. Verified on real disk: helper returns 2 dirs + 8 unioned session files for the Agnes-test workspace where the previous code returned 0.	2026-05-04 20:29:59 +02:00
ZdenekSrotyr	92d477e422	fix(setup): default /setup to analyst, hide admin tile from non-admins Three coupled UX fixes for the analyst-onboarding flow: 1. Dashboard "Setup a new Claude Code" CTA was rendering admin paste prompt for everyone (analysts couldn't actually execute the marketplace plugin install / skills setup steps). render_agent_prompt_banner now picks role based on user.is_admin — analysts get the analyst flow. 2. /setup default role changed from admin to analyst. Most visitors are analysts; admin layout is opt-in via the admin tile or ?role=admin. 3. Admin tile is admin-only on the role-tile nav. Non-admins see only the analyst tile. Server-side: non-admin requesting ?role=admin is silently downgraded to analyst (otherwise they'd see admin paste prompt despite no tile). Tests: - New: test_setup_page_admin_tile_hidden_for_non_admin (anonymous client can't see "Admin CLI" or role=admin link) - New: test_setup_page_admin_role_downgraded_for_non_admin (anonymous ?role=admin → analyst layout, no marketplace step in clipboard) - New: test_install_preview_default_role_is_analyst (admin signing in to bare /setup gets analyst clipboard by default) - Renamed: test_setup_page_default_role_is_admin → ..._is_analyst - Updated: test_setup_page_admin_clipboard_renders_admin_layout uses FastAPI dependency_overrides to inject admin user (admin layout is now admin-gated) - Updated: test_install_preview_visible_for_signed_in_user explicitly passes ?role=admin to exercise admin layout	2026-05-04 20:20:37 +02:00
ZdenekSrotyr	d8dc7c7799	fix: update legacy-string assertions in tests + onboarding template Caught by my own broader test scope after Devin fixes — three test files asserted on user-visible strings that were renamed by the bootstrap PR but the assertions weren't updated: - tests/test_api_query_guardrail.py:110 — asserted `da fetch in suggestion` on /api/query 400 response. Renamed to `agnes snapshot create`. - tests/test_query_materialized_error_message.py:56 — asserted `da sync` in materialized-not-yet error detail. Renamed to `agnes pull`. - tests/test_cli_error_render.py:71 — fixture data + assertion both carried `da fetch`. Updated to `agnes snapshot create`. Plus an actual content miss: docs/setup/claude_settings.json (a template shipped to operators) still installed `da sync` / `da sync --upload-only` hooks. The companion test file (tests/test_setup_hooks_template.py) was asserting that legacy state. Updated both: - Template hooks: `agnes pull --quiet` / `agnes push --quiet` - Test assertions + function name match the new commands	2026-05-04 20:08:07 +02:00
ZdenekSrotyr	3d58768143	fix: address Devin Review findings — incomplete renames + estimate guard 13 Devin findings across 10 files: 🔴 Critical: - app/api/v2_catalog.py:42 — `_fetch_hint` returns `da fetch` in /api/v2/catalog responses (user-visible in every catalog list) - cli/skills/agnes-data-querying.md — 11 stale `da fetch`/`da sync` refs in the bundled skill markdown - config/claude_md_template.txt:38 — referenced `agnes pull --docs-only` flag that does NOT exist in agnes pull (removed; spec only ships --quiet/--json/ --dry-run) 🟡 Important: - app/api/admin.py:252 — `da fetch` in bq_max_scan_bytes hint - cli/commands/auth.py:119 — `da sync` in import-token docstring (--help text) - cli/commands/tokens.py:48 — "Export it so `da` can use it" prose - ARCHITECTURE.md — 4 stale rows in CLI commands table - README.md — stale paragraphs for analysts (da sync, da analyst setup) 🚩 Substantive observations addressed: - app/api/query.py:249,302,489 — server-side error/help strings still said `da sync`/`da fetch` (returned in API responses to clients) - cli/commands/snapshot.py:235-241 — DuckDB existence guard incorrectly blocked `--estimate` (server-side dry-run that never opens local DB). Added test ensuring estimate path skips the guard. Skipped (intentionally historical): - app/api/admin.py:2377,2429,2437 — historical comments describing past manifest-vs-sync_state bug; past tense, accurate to keep as `da sync`.	2026-05-04 20:05:06 +02:00
ZdenekSrotyr	5fa1c94b5c	fix(tests): smoke matrix asserts no-traceback only (per-command rc varies)	2026-05-04 19:47:18 +02:00
ZdenekSrotyr	5162c488bb	fix(tests): strip ANSI escapes from --help output before substring asserts Typer/rich emits ANSI styling in CI's --help output (e.g. `--metrics` becomes `-\x1b[0m\x1b[1;36m-metrics`), so literal substring asserts like `assert "--metrics" in result.output` fail. Locally the test runner auto-detects no-TTY and produces plain text, masking the issue. Add a small `_clean()` helper per test file that strips ANSI escape codes (`\x1b\[[0-9;]*m`) before substring containment checks.	2026-05-04 19:43:47 +02:00
ZdenekSrotyr	675f8e1909	chore(lint): drop unused imports from new test files (ruff F401)	2026-05-04 19:32:31 +02:00
ZdenekSrotyr	ce108d4c6d	fix(schema): code-review follow-ups for `fac10b29` - _v23_to_v24_finalize: wrap row-update loop in BEGIN/COMMIT/ROLLBACK to match the project's transactional-finalizer pattern (compare _v12_to_v13_finalize, _v17_to_v18_finalize, _v18_to_v19_finalize). Pre-fix a process crash mid-loop left the schema_version unchanged but partially-converted rows persisted across restart — idempotent overall but inconsistent with project convention. - _v23_to_v24_finalize: re.sub replacement now uses a function-form (lambda) instead of an f-string, so any future project_id with a backslash sequence isn't misinterpreted as a group reference. - tests: add a Keboola-source materialized row case asserting the SELECT's source_type filter prevents non-BQ rewrites.	2026-05-04 19:32:24 +02:00
ZdenekSrotyr	8403529fcd	test: clean-install integration suite (minimal/zero grants, force, pre-init)	2026-05-04 19:22:24 +02:00
ZdenekSrotyr	fac10b29e4	feat(schema): v24 — rewrite materialized BQ source_query to BQ-native Materialize now wraps admin SQL into bigquery_query('<billing>', '<inner>') which requires the inner SQL to be BigQuery-flavor (backticked identifiers, native function syntax). v24 migrates existing rows from DuckDB-flavor (bq."ds"."tbl") to (`<project>.ds.tbl`) using the configured BQ project. Idempotent on already-converted rows; logs a warning and skips when the project isn't configured (operator can configure + restart for retry).	2026-05-04 19:15:54 +02:00
ZdenekSrotyr	42e108ae5e	test: reader smoke matrix on zero-grants workspace	2026-05-04 19:15:39 +02:00
ZdenekSrotyr	a47c2be282	test: clean-bootstrap fixtures (fastapi_test_server, test_pat, zero_grants_workspace) Task 20: reusable pytest fixtures for the clean-bootstrap test suite. Tasks 21 and 22 (reader smoke matrix + init smoke matrix) consume them. - fastapi_test_server boots a real uvicorn subprocess against a tmp DATA_DIR, pre-seeded with admin@example.com (Admin group), analyst@example.com (Everyone group), and three tables (one per query_mode: local / materialized / remote). - web_session: cookie-authenticated httpx.Client for the admin user. - test_pat: minted JWT for the analyst with table grants on local + materialized. - test_pat_no_grants: same shape, zero resource_grants. - zero_grants_workspace: subprocess invocation of `agnes init` against the no-grants PAT; returns the bootstrapped workspace path. - NONEXISTENT_TABLE: module-level sentinel for the upcoming reader matrix. Subprocess uvicorn (mirrors tests/test_e2e_corporate_memory.py) instead of in-thread so DATA_DIR + module-level singletons in src.db don't bleed across tests. agnes CLI invoked via `python -m cli.main` instead of the .venv/bin/agnes shim, which depends on .pth file visibility that iCloud Drive intermittently re-hides on macOS.	2026-05-04 19:11:54 +02:00
ZdenekSrotyr	7e1dd1adba	refactor(cli): drop sync/fetch/analyst/metrics; register init/pull/push (BREAKING)	2026-05-04 18:59:51 +02:00
ZdenekSrotyr	6c0846fd17	feat(config): expose materialize.lock_ttl_seconds in server-config New top-level 'materialize' section, single field (lock_ttl_seconds). Default 86400 (24h). Backs the file-lock TTL reclaim added in the per-table-mutex change. Editable via PUT /api/admin/server-config and the /admin/server-config UI.	2026-05-04 18:52:54 +02:00
ZdenekSrotyr	ff5da0af90	feat(cli): agnes admin metrics {import,export,validate}	2026-05-04 18:39:05 +02:00
ZdenekSrotyr	3871d5320a	feat(admin): server-generate materialized source_query, allow BQ backticks When admin registers a materialized BQ row with bucket+source_table but no source_query, the server generates 'SELECT * FROM `<project>.<ds>.<tbl>`' from instance.yaml's configured BQ project. Same fallback fires on PUT when flipping to materialized. The backtick rejection guard, which was appropriate for DuckDB-flavor source_query, is relaxed for materialized rows since the new wrapping path (Task 2) runs admin SQL through BQ jobs API which uses BQ-native syntax (backticks for dashed identifiers).	2026-05-04 18:37:27 +02:00
ZdenekSrotyr	42b8d0309b	feat(cli): agnes catalog --metrics replaces da metrics list/show	2026-05-04 18:33:17 +02:00
ZdenekSrotyr	8309141705	feat(cli): agnes snapshot create (folded from da fetch); friendly exit if no DuckDB	2026-05-04 18:32:30 +02:00
ZdenekSrotyr	5e1e8c4e14	feat(cli): agnes status = workspace state; old health check moves to agnes diagnose system	2026-05-04 18:29:15 +02:00
ZdenekSrotyr	b799aa534a	fix(cli): I1+I2 review — surface manifest_unauthorized + add 3 typed-error tests	2026-05-04 18:19:35 +02:00
ZdenekSrotyr	9b70ca3069	feat(cli): agnes init orchestrator + AGNES_WORKSPACE.md template	2026-05-04 18:15:08 +02:00
ZdenekSrotyr	c7c42de0f0	feat(sync): treat MaterializeInFlightError as 'skipped, in_flight' _run_materialized_pass distinguishes due-check skips from in-flight skips and never calls state.set_error for either. summary['skipped'] becomes a list of {table, reason} dicts; the end-of-pass log line breaks out the in_flight subcount. Hoists is_table_due to module-level import so test monkeypatching of the symbol intercepts the call (the previous local import made patches a no-op).	2026-05-04 18:11:38 +02:00
ZdenekSrotyr	60b6fbed97	feat(cli): agnes push command (extracted from sync --upload-only)	2026-05-04 18:09:57 +02:00
ZdenekSrotyr	7f89e1d594	feat(cli): agnes pull command (Typer wrapper around lib.pull.run_pull)	2026-05-04 18:07:28 +02:00
ZdenekSrotyr	15004126de	fix(cli-lib): I1+I2+I3 review fixes — token-precedence note, sync-state TODO, dry-run hermeticity test	2026-05-04 18:04:56 +02:00
ZdenekSrotyr	37da602060	feat(cli-lib): cli/lib/pull.py:run_pull primitive with lazy mkdir	2026-05-04 18:00:57 +02:00
ZdenekSrotyr	dc7e27082d	fix(bq-materialize): code-review follow-ups for `16eaf7a3` - extractor._try_acquire_file_lock: close fd and re-raise on non- BlockingIOError from fcntl.flock (read-only fs, unsupported flock, fd exhaustion). Pre-fix the fd leaked silently and the underlying OSError still propagated past the caller. - extractor: reorder module-level layout so logger is bound before the new lock-related helpers reference it. Deferred import of app.instance_config inside _get_lock_ttl_seconds documented inline. - extractor: comment _table_locks unbounded-by-design rationale. - tests: docstring + monkeypatch-target rationale for the two concurrency tests where the contract isn't obvious from the body.	2026-05-04 17:59:21 +02:00
ZdenekSrotyr	5aebeabf23	feat(cli-lib): cli/lib/hooks.py:install_claude_hooks	2026-05-04 17:53:20 +02:00


283 commits