agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	d44cace17c	docs(changelog): clean-analyst-bootstrap rewrite (BREAKING)	2026-05-04 19:25:38 +02:00
ZdenekSrotyr	cc84222216	docs: clean-install manual protocol in release checklist	2026-05-04 19:23:01 +02:00
ZdenekSrotyr	8403529fcd	test: clean-install integration suite (minimal/zero grants, force, pre-init)	2026-05-04 19:22:24 +02:00
ZdenekSrotyr	42e108ae5e	test: reader smoke matrix on zero-grants workspace	2026-05-04 19:15:39 +02:00
ZdenekSrotyr	a47c2be282	test: clean-bootstrap fixtures (fastapi_test_server, test_pat, zero_grants_workspace) Task 20: reusable pytest fixtures for the clean-bootstrap test suite. Tasks 21 and 22 (reader smoke matrix + init smoke matrix) consume them. - fastapi_test_server boots a real uvicorn subprocess against a tmp DATA_DIR, pre-seeded with admin@example.com (Admin group), analyst@example.com (Everyone group), and three tables (one per query_mode: local / materialized / remote). - web_session: cookie-authenticated httpx.Client for the admin user. - test_pat: minted JWT for the analyst with table grants on local + materialized. - test_pat_no_grants: same shape, zero resource_grants. - zero_grants_workspace: subprocess invocation of `agnes init` against the no-grants PAT; returns the bootstrapped workspace path. - NONEXISTENT_TABLE: module-level sentinel for the upcoming reader matrix. Subprocess uvicorn (mirrors tests/test_e2e_corporate_memory.py) instead of in-thread so DATA_DIR + module-level singletons in src.db don't bleed across tests. agnes CLI invoked via `python -m cli.main` instead of the .venv/bin/agnes shim, which depends on .pth file visibility that iCloud Drive intermittently re-hides on macOS.	2026-05-04 19:11:54 +02:00
ZdenekSrotyr	8d9323c99e	docs(claude-md): sweep surviving-verb da X references (Task 19 follow-up)	2026-05-04 19:01:27 +02:00
ZdenekSrotyr	3990fb0d85	docs(claude-md): rewrite verbs + paths for new CLI surface	2026-05-04 19:00:31 +02:00
ZdenekSrotyr	7e1dd1adba	refactor(cli): drop sync/fetch/analyst/metrics; register init/pull/push (BREAKING)	2026-05-04 18:59:51 +02:00
ZdenekSrotyr	5551f12bb0	fix(cli): hint text 'Run: da sync' → 'Run: agnes pull'	2026-05-04 18:42:21 +02:00
ZdenekSrotyr	ff5da0af90	feat(cli): agnes admin metrics {import,export,validate}	2026-05-04 18:39:05 +02:00
ZdenekSrotyr	42b8d0309b	feat(cli): agnes catalog --metrics replaces da metrics list/show	2026-05-04 18:33:17 +02:00
ZdenekSrotyr	8309141705	feat(cli): agnes snapshot create (folded from da fetch); friendly exit if no DuckDB	2026-05-04 18:32:30 +02:00
ZdenekSrotyr	5e1e8c4e14	feat(cli): agnes status = workspace state; old health check moves to agnes diagnose system	2026-05-04 18:29:15 +02:00
ZdenekSrotyr	b799aa534a	fix(cli): I1+I2 review — surface manifest_unauthorized + add 3 typed-error tests	2026-05-04 18:19:35 +02:00
ZdenekSrotyr	9b70ca3069	feat(cli): agnes init orchestrator + AGNES_WORKSPACE.md template	2026-05-04 18:15:08 +02:00
ZdenekSrotyr	60b6fbed97	feat(cli): agnes push command (extracted from sync --upload-only)	2026-05-04 18:09:57 +02:00
ZdenekSrotyr	7f89e1d594	feat(cli): agnes pull command (Typer wrapper around lib.pull.run_pull)	2026-05-04 18:07:28 +02:00
ZdenekSrotyr	15004126de	fix(cli-lib): I1+I2+I3 review fixes — token-precedence note, sync-state TODO, dry-run hermeticity test	2026-05-04 18:04:56 +02:00
ZdenekSrotyr	37da602060	feat(cli-lib): cli/lib/pull.py:run_pull primitive with lazy mkdir	2026-05-04 18:00:57 +02:00
ZdenekSrotyr	2b3d62fbf5	chore(.gitignore): allowlist cli/lib/ from generic lib/ rule (Task 7 follow-up)	2026-05-04 17:54:00 +02:00
ZdenekSrotyr	5aebeabf23	feat(cli-lib): cli/lib/hooks.py:install_claude_hooks	2026-05-04 17:53:20 +02:00
ZdenekSrotyr	d25d075ed2	docs(claude-md-template): rewrite verbs + paths for new CLI surface (Task 6) - Verb renames (da X -> agnes X for surviving verbs; legacy verbs already absent from this default template — admin overrides with legacy verbs are caught by Task 2's _LEGACY_STRINGS scan + Task 5's admin banner). - Path renames: data/parquet/ -> server/parquet/, data/duckdb/ -> user/duckdb/, data/metadata/ removed entirely (no longer exists per spec). - Drop user/artifacts/ from directory structure (spec workspace layout drops it; surviving paths: server/parquet/, user/duckdb/, user/snapshots/, user/sessions/). - Add AGNES_WORKSPACE.md pointer near top-of-template so analysts know where to find human-readable docs. Cleans Task 0.5's missed sweep on this file (was not in cli/ tree but is user-visible via /api/welcome). 81 claude_md/welcome_template tests pass.	2026-05-04 17:51:14 +02:00
ZdenekSrotyr	a92c624dba	feat(admin): yellow banner for legacy CLI verbs in workspace-prompt override	2026-05-04 17:46:50 +02:00
ZdenekSrotyr	8091620d33	fix(setup): role-aware clipboard render + JSON-escape ROLE injection Two Task 4 review fixes for app/web/templates/install.html: 1. JSON-escape `ROLE` JS const via `{{ role \| tojson }}` (defense in depth — removes the dependency on Jinja autoescape semantics for JS contexts; FastAPI's Literal validator already constrains role values). 2. Verify the analyst tile's clipboard payload is the analyst layout. The pre-existing role-aware plumbing (compute_default_agent_prompt threading role into setup_instructions_lines, picked up by the JS SETUP_INSTRUCTIONS_TEMPLATE array) was correct; adding regression tests that pin to the JS clipboard block specifically so a future inversion would fail loudly. Tests: analyst clipboard contains `agnes init` + `agnes catalog` and NOT `agnes auth import-token` / `agnes skills`; admin clipboard is the inverse. Plus an explicit assertion that ROLE is rendered via tojson.	2026-05-04 17:43:46 +02:00
ZdenekSrotyr	44234ba3ae	test(setup): add mutation-resistant ternary-direction assertion (Task 4 polish)	2026-05-04 17:37:54 +02:00
ZdenekSrotyr	7965f8021d	fix(setup): role-aware PAT scope+TTL in setupNewClaude JS (Task 4 spec fix)	2026-05-04 17:34:30 +02:00
ZdenekSrotyr	f731ee7897	feat(setup): /setup?role=analyst\|admin branching with role tiles	2026-05-04 17:28:47 +02:00
ZdenekSrotyr	54f83c281c	test(setup): I1+I2 review fixes — AGNES_WORKSPACE.md alignment + step-number pin	2026-05-04 17:23:15 +02:00
ZdenekSrotyr	ae00945cbf	fix(setup): clean stale 'da' refs in setup_instructions.py (Task 0.5 missed sweep)	2026-05-04 17:19:55 +02:00
ZdenekSrotyr	29e28ccbd3	feat(setup): add analyst role to install-prompt renderer	2026-05-04 17:17:59 +02:00
ZdenekSrotyr	59324f9361	feat(admin): scan CLAUDE.md override for legacy strings	2026-05-04 17:10:58 +02:00
ZdenekSrotyr	68639e54cf	test(tokens): tighten scope-default + add precedence + audit + reserved-key tests	2026-05-04 17:07:02 +02:00
ZdenekSrotyr	4ee7323436	feat(tokens): add scope + ttl_seconds fields with bootstrap-analyst clamp	2026-05-04 17:00:54 +02:00
ZdenekSrotyr	8fbf4c7873	refactor: Task 0.5 amendments — README/ARCHITECTURE sweep + main.py install hint + drop dead AGNES_SERVER_URL	2026-05-04 16:55:55 +02:00
ZdenekSrotyr	ed371c84d1	refactor(docs): sweep DA_* env vars + surviving da-verbs in docs/*.md (Task 0.5 fix)	2026-05-04 16:43:15 +02:00
ZdenekSrotyr	1563b05f2e	refactor(cli): hard-cutover env vars + config dir to AGNES_* Task 0.5 of clean-analyst-bootstrap. Greenfield rewrite — no fallback, no aliases. Existing dev environments lose their cached PAT and must re-authenticate. Env var renames (hard cutover): - DA_CONFIG_DIR -> AGNES_CONFIG_DIR - DA_SERVER -> AGNES_SERVER - DA_SERVER_URL -> AGNES_SERVER_URL (test-only stale ref, not in spec) - DA_NO_UPDATE_CHECK -> AGNES_NO_UPDATE_CHECK - DA_LOCAL_DIR -> AGNES_LOCAL_DIR - DA_TOKEN -> AGNES_TOKEN - DA_STREAM_RETRIES -> AGNES_STREAM_RETRIES Config dir rename: ~/.config/da/ -> ~/.config/agnes/ (across code, comments, docstrings, error messages, install templates, dev scripts). Stale `da X` references in CLI source (and adjacent app/, tests/): swept docstrings, comments, help text, and error messages where the verb survives the rewrite (init, pull, push, catalog, status, diagnose, auth, admin, skills, query, schema, describe, explore, disk-info, snapshot, login, logout, whoami, server, setup) and replaced `da X` with `agnes X`. Intentionally kept `da sync`, `da fetch`, `da analyst`, `da metrics` — those verbs are removed in later tasks; the legacy strings will be detected by `_LEGACY_STRINGS` (added in Task 2). Test fixes: - TestCLIVersion now asserts output starts with `agnes ` (was `da `). Test results: 2675 passed, 25 skipped (full pytest run, excluding 9 pre-existing test_db.py / test_user_management.py / test_e2e_extract.py / test_cli_binary_rename.py failures unrelated to this rename).	2026-05-04 16:35:44 +02:00
ZdenekSrotyr	8c8cdf6a6a	feat(cli): rename binary from da to agnes (BREAKING)	2026-05-04 16:05:14 +02:00
ZdenekSrotyr	841dcc8447	docs(spec+plan): round-4 review fixes — rename hygiene 5 must-fixes from rename-hygiene review: - B1: test command arrays ["da", ...] -> ["agnes", ...] in spec lines 433-446, 466, 502 and plan reader smoke matrix + clean-install integration tests + readers-in-pre-init test (39 occurrences) - B2: ~/.local/bin/da -> ~/.local/bin/agnes (binary path string in spec data flow + plan AGNES_WORKSPACE.md uninstall table) - B3: CHANGELOG missing BREAKING bullet for binary rename — added in both spec and plan CHANGELOG drafts - B4: plan CHANGELOG typo "previous agnes status" -> "previous da status" (server-health overview was historically da status) - B5: spec Components table missing row for the binary rename — added Client-side row mapping pyproject.toml + cli/main.py changes to plan Task 0	2026-05-04 15:57:07 +02:00
ZdenekSrotyr	5e7fa418d1	docs(spec+plan): rename CLI binary from da to agnes (BREAKING) - Spec rev 5: branding consistency. New CLI verbs use agnes prefix (agnes init, agnes pull, agnes push, agnes catalog, agnes status, agnes snapshot create, agnes admin, …). - Plan: add Phase 0 / Task 0 — pyproject.toml [project.scripts] entry rename to "agnes = cli.main:app" + Typer(name="agnes") in cli/main.py. - Legacy command references (da sync, da fetch, da analyst setup, da metrics) keep their da prefix throughout — they're historical artifacts being removed (preserved in CHANGELOG Removed section, _LEGACY_STRINGS constant for admin override scan, etc.). Bulk rename via Python regex with verb whitelist: 286 verb refs rewritten in plan, 265 in spec; 104+72 legacy refs restored to "da" post-pass (false positives where the doc was describing the legacy flow being replaced).	2026-05-04 15:50:44 +02:00
ZdenekSrotyr	fb8f55c335	docs(plan): clean-analyst-bootstrap implementation plan 25 tasks across 6 phases: - Phase 1: server-side foundation (PAT scope/TTL, legacy-strings scan, /setup?role= branching, claude_md_template) - Phase 2: cli/lib/ shared-library tree (hooks.py, pull.py) - Phase 3: new commands (init, pull, push, status, diagnose system, snapshot create, catalog --metrics, admin metrics) - Phase 4: wiring + delete obsolete (sync, fetch, analyst, metrics) - Phase 5: test fixtures + reader smoke matrix + clean-install integration - Phase 6: CHANGELOG + final verification TDD discipline throughout: test → fail → implement → pass → commit per task.	2026-05-04 15:22:10 +02:00
ZdenekSrotyr	b2cc6517aa	docs(spec): rev 4 — round-3 review fixes; cleared for implementation Round-3 review (2 load-bearing + 5 quality-of-life): - web_session fixture endpoint: POST /auth/password/login/web (form), not POST /auth/token (which doesn't exist); PAT mint requires session cookie via require_session_token - Renderer wording: "adopt" not "reuse" — da init/da pull are first-time adopters; synthetic {"detail": {"kind": ..., "hint": ...}} pattern documented (cli/commands/query.py:152, 165) - cli/lib/__init__.py mention for Hatchling wheel inclusion - _LEGACY_STRINGS constant + _scan_legacy_strings helper home named in app/api/claude_md.py - /api/welcome?server_url= query param explicit in data flow - ttl_seconds upper bound 315_360_000 (matches existing 3650 d cap) - Conftest line range 50-82 -> 50-83	2026-05-04 15:10:40 +02:00
ZdenekSrotyr	b52a37b680	docs(spec): rev 3 — round-2 review fixes + main-sync (post-0.32.0) Round-2 review (N1-N10): - N1 Token CLI keeps current `da auth token` location (not top-level) - N2 `da catalog --metrics --show <id>` decided in Components, dropped from Open questions - N3 `_install_claude_hooks` migrated to new `cli/lib/hooks.py` module - N4 Test sentinel `__nonexistent__` documented in fixtures - N5 `web_session` fixture uses real `POST /auth/token` with seeded password - N6 `AGNES_WORKSPACE.md` content asserts (PAT not leaked, placeholders substituted) added to clean-install integration test - N7 Admin UI legacy-strings banner concretized: `legacy_strings_detected` field + yellow banner in `/admin/workspace-prompt` editor - N8 `da metrics export/validate` relocate to `da admin metrics …` alongside `import` - N9 Bootstrap PAT verify endpoint switched from `/api/health` (unauth) to `/api/catalog/tables` (PAT-validating, matches `da auth import-token`) - N10 New `cli/lib/pull.py` and `cli/lib/hooks.py` modules inventoried Main-sync (rebased on 0.32.0 / #160): - Reconsidered: keep `da skills list / show` as analyst-side discovery (skill content was strengthened by #160 cost-guardrail/registry rails) - Bigger CLAUDE.md (repo-root) rewrite scope acknowledges new sections - `cli/error_render.py` (added in 0.32.0) reused by `da init` and `da pull` for consistent typed-error UX - Test fixtures piggyback on autouse `_reset_module_caches` from tests/conftest.py:50-82 (added in 0.32.0)	2026-05-04 15:05:44 +02:00
ZdenekSrotyr	973e96e6af	docs(spec): apply round-1 review — orphan/wiring fixes 11 minimum edits + minor cleanups from first review pass: - TTL field: add `ttl_seconds` alongside existing `expires_in_days` in PAT request, force-clamp to 1h for scope=bootstrap-analyst - Drop layered per-workspace config (no defined producer; multi-instance is edge case) — moved to Open questions - CLAUDE.md producer: explicit `GET /api/welcome` flow, address admin DB-override migration - URL: `/install` → `/setup?role=...` (matches existing routing; legacy `/install` keeps 302 redirect) - `da metrics import` → relocated under `da admin metrics import` - Repo-root CLAUDE.md added to rewritten files list - CLI surface: enumerate `da explore`, `da disk-info`, `da snapshot refresh/prune`, full `da snapshot create` flag list, `da push` flags - Test fixtures section: contracts for `fastapi_test_server`, `test_pat`, `test_pat_no_grants`, `zero_grants_workspace`, `web_session`, `client` - Cleanups: line ref for `_install_claude_hooks`, admin-namespace disclaimer, `da init`-doesn't-call-`da auth login` note, audit-log consumer note for PAT scope, `{workspace_path}` placeholder usage	2026-05-04 14:59:40 +02:00
ZdenekSrotyr	8cbf0aa818	docs(spec): clean analyst bootstrap — greenfield redesign Single-paste web→Claude Code workspace bootstrap. PAT-only auth, non-interactive `da init`, slim CLI surface (init/pull/push/snapshot), lazy-mkdir contract for empty-folder discipline, AGNES_WORKSPACE.md human docs, layered per-workspace config, comprehensive clean-install verification protocol. Greenfield rewrite — drops `da analyst setup`, `da sync`, `da fetch`, `da metrics`, `da skills`. Removes legacy `data/{parquet,duckdb,metadata}/` and `user/artifacts/` workspace dirs.	2026-05-04 14:59:40 +02:00
ZdenekSrotyr	baa2efe57f	Merge pull request #168 from keboola/zs/fix-160-remote-query-view-entities fix: #160 da query --remote works for BQ VIEW/MATERIALIZED_VIEW entities + cost guardrail + closing pre-existing RBAC bypass	2026-05-04 14:48:05 +02:00
ZdenekSrotyr	cf8930b593	chore(release): cut 0.32.0 — #160 da query --remote on VIEW + 4 reinforcing fixes CHANGELOG: rename [Unreleased] → [0.32.0] — 2026-05-04, prepend a new empty [Unreleased] for next-PR landing zone. pyproject.toml: version 0.31.0 → 0.32.0. Per repo discipline (memory: feedback_release_cut_with_pr.md), the release-cut commit lands as the FINAL commit of the PR that contained the user-visible behavior change — it does not get a separate PR. After merge: tag v0.32.0 on the merge commit + create a GitHub Release (memory: feedback_github_release_per_tag.md — the tag alone isn't enough; the Release prose is the operator-visible artifact). Headline: closes #160. da query --remote now resolves query_mode='remote' BQ rows whose entity is VIEW or MATERIALIZED_VIEW (the bug Pavel hit). Plus 4 reinforcing fixes — server-side cost guardrail (bq_max_scan_bytes, default 5 GiB), registry-gating of direct bq.* paths, bigquery_query() function-call backdoor closed, structured CLI render of typed BQ errors — and one operator-side admin convenience (BQ test-connection endpoint + billing_project placeholder UI). 14 issues caught and addressed across 6 iterations of Devin Review. E2E verified on agnes-zsrotyr.groupondev.com (commit `7f743d03`): - VIEW path resolves (count=23 from active_inventory_view) - VIEW aggregate parity vs filtered BASE TABLE - cost guardrail rejects with structured 400 detail - bq_path_not_registered 403 (incl. quoted "bq" variant) - bigquery_query() blocklist 400 - test-connection endpoint 200 with elapsed_ms	2026-05-04 14:37:52 +02:00
ZdenekSrotyr	7f743d0392	fix(cli): #168 review iter 6 — render empty-string diagnostics Devin Review iter #6 found 2 issues. 🟡 BUG: cli/error_render.py filtered out empty-string values via `detail[key] not in (None, "")` and `value not in (None, "")` before they could reach `_kv_line`. But `_kv_line` was specifically designed to render empty strings as `(empty)` — the filter shadowed that branch. The hidden field happens to be the most operator-actionable one in `cross_project_forbidden`: `billing_project: ""` is the exact diagnostic confirming WHY USER_PROJECT_DENIED fires. Change filter to `is not None`. Empty strings now flow through `_kv_line` and render as `billing_project: (empty)`. 📝 ANALYSIS: CHANGELOG wording for the test-connection endpoint said "the saved data_source.bigquery config", which Devin flagged as slightly misleading because `get_bq_access` is `@functools.cache`d — "Test connection" tests the config in the running process, not the just-saved YAML overlay. The save flow already returns `restart_required: True` and the UI shows a banner, so the behavior is documented; only the CHANGELOG wording was loose. Tightened to "the process-cached BqAccess... Tests the config active in the running process — after a save the response includes restart_required; click Test AFTER restart to validate the freshly-saved values." New test: test_renders_empty_string_as_empty_marker locks in the empty-string-as-(empty) rendering for the cross_project_forbidden case so a future filter change won't silently drop the diagnostic again. 9 affected render tests pass.	2026-05-04 14:30:43 +02:00
ZdenekSrotyr	4bd1919f77	fix(query): #168 review iter 5 — forbidden-table check uses registry IDs Devin Review iter #5 flagged a pre-existing class of name/id mismatch in app/api/query.py:131-136 — the SAME root cause as the bq.* RBAC issue I fixed in iter #3 (line 332/362). Devin called it out as "NOT introduced by this PR" / "might merit follow-up", but it's exactly the same security-boundary pattern this PR is hardening, so fixing here keeps the RBAC story consistent across the handler. The `forbidden = all_views - set(allowed)` comparison mixed types: - `all_views` carries DuckDB master view names (= registry display `name` from the orchestrator's CREATE VIEW) - `set(allowed)` carries registry IDs (resource_grants.resource_id) When `id != name` (e.g. id="bq.finance.ue", name="ue"), authorized users got spurious 403s — the view name landed in `forbidden` even though the caller had a valid grant on the registry id. Build a name->id map from the registry, then the forbidden check compares apples to apples: allowed_view_names = {r["name"] for r in registry_rows if r.get("name") and r.get("id") in allowed_ids} forbidden = all_views - allowed_view_names 107 affected tests pass; 487 pass in wider RBAC/query/access/admin domain — no regressions.	2026-05-04 14:18:43 +02:00
ZdenekSrotyr	a43533ca44	fix(cli): #168 review iter 4 — render `reason` when both kind+reason set Devin Review iter #4 caught: `_format_dict` in cli/error_render.py seeded `seen = {"kind", "reason"}` to keep both out of the kv block. But the label line uses only ONE of them (`kind or reason or "error"`), so the other was silently dropped. Quota rejections at app/api/query.py:423 (daily-budget) and 488 (concurrent-slot) emit BOTH keys: `{reason: "daily_byte_cap_exceeded", kind: "daily_bytes", ...}` and `{reason: "concurrent_slot_exceeded", kind: "concurrent_scans", ...}`. Operator only saw `kind` in the label and never the more specific `reason` value. Fix: track which key actually went into the label and skip only that one. The other appears in the kv section. Verified output: Error: daily_bytes (HTTP 429) reason: daily_byte_cap_exceeded current: 99999 ... 8 affected render tests pass.	2026-05-04 14:04:16 +02:00
ZdenekSrotyr	28aba4c1f9	fix(query): #168 review iter 3 — RBAC name-vs-id, placeholder dead code Devin Review iter #3 found 3 new real bugs after iter #2's fixes landed. 🔴 RBAC check at app/api/query.py:362 used `row["name"]` against `accessible_set`, but `accessible_set` is keyed by registry IDs (`get_accessible_tables` returns `resource_grants.resource_id` — table IDs, not display names). Confirmed by `_table_blocks` projection at `app/resource_types.py:157-158`. When `id != name` (e.g. `id="bq.finance.ue", name="ue"`), non-admin users with valid grants got 403 `bq_path_access_denied`. Switch to `row["id"]`. 🚩 Bare-name pass at app/api/query.py:332 had the same name-vs-id mismatch (different impact): legitimate accessible rows were skipped from `dry_run_set`, so the cost guardrail under-counted scan bytes for non-admin users. Could let an over-cap query through and under-bill quota. Switch to `row_id` comparison. 🟡 `placeholder_from` for billing_project was dead code. `_BQ_OPTIONAL_FIELD_DEFAULTS["billing_project"] = ""` seeded an empty string into every GET payload via `_ensure_bq_optional_fields`. JS `isUnset = (value === undefined)` evaluated False, so the `(defaults to <project>)` placeholder NEVER rendered. Drop the seed — field stays in `known_fields` (UI sees it) but routes through the unset rendering path on GET, where placeholder_from fires. Tests: test_get_surfaces_bq_fields_even_when_unset assertion flipped from "billing_project IS present" to "billing_project NOT auto-seeded" to lock in the new shape. 67 affected tests pass.	2026-05-04 13:51:36 +02:00

1 2 3 4 5 ...

601 commits