agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	7d868159f2	address Devin Review on PR #264 BUG_0001 (red): config/claude_md_template.txt is the Jinja2 source for every analyst workspace's CLAUDE.md served via /api/welcome (src/claude_md.py). It still instructed the agent to use the removed --register-bq flag in 6 places — defeating the point of the PR for anyone who ran agnes init after merge. Rewritten: - ASCII routing diagram: "join with a local table" now points to "agnes snapshot create the remote side, then join locally" - "Three patterns" table → "Two patterns" (snapshot create + --remote) - "Hybrid query example" rewritten as snapshot-create + local join, with --remote called out as the escape hatch when the remote side is too big to snapshot - "When the table isn't in agnes catalog" — drop the ad-hoc --register-bq path; admins register, no analyst-side workaround - Footer cross-ref drops "hybrid-query examples" BUG_0002 (yellow): cli/error_render.py docstring line 7 said "All three previously flattened..." after I had already reduced "Three CLI paths" → "Two CLI paths" on line 3. "All three" → "Both".	2026-05-12 18:18:13 +02:00
Vojtech	a46b9dc928	/home install-hero polish: license link contrast, auto-mode reorder, Shift+Tab guidance (#243 ) * Make /home install-hero links readable against blue background The Claude license-options link added in the previous commit inherited the default `<a>` style (`var(--hp-primary)` blue), which renders as blue-on-blue and is unreadable inside the blue install-hero. Add a scoped `.install-hero a` rule that uses white with an underline (matching the existing lead-paragraph contrast pattern) so any link nested in the hero stays legible. * Reorder /home install flow: auto-mode is now Step 2, Agnes install becomes Step 3 Step 3 (was Step 2) pastes a ~20-command bash bootstrap into a fresh Claude Code session. Without auto-mode enabled first, each Bash/edit command needs a manual approve click — bad UX for first-time users. Move auto-mode from the outside-hero `<details>` reference block into the install-hero as a real Step 2, between "install Claude Code" and "install Agnes". Content is the persistent `acceptEdits` snippet (write to ~/.claude/settings.json) plus a one-liner pointing at Shift+Tab for users who are already inside a running Claude Code session. YOLO mode for full Bash auto-approve stays on /setup-advanced behind the existing link. The outside-hero `setup-collapsible[data-section="step3"]` block is dropped — auto-mode is no longer reference content, it's a real install step, and duplicating it would just diverge over time. Onboarded users no longer see the auto-mode block at all (consistent with Steps 1 + 3 also hiding post-onboarding). Completion banner copy updated: "Step 1, 2 & 3 done — Claude Code installed, auto-mode set, Agnes ready". Dashboard CTA partial and other templates don't reference step numbers for this flow, so no adaptation needed there. * Simplify /home Step 2 to Shift+Tab only — drop the JSON snippet Operator pointed out two issues with the prior Step 2: 1. The settings.json snippet is redundant. Claude Code's first Shift+Tab cycle to auto-accept mode already prompts the user whether to persist it as default — Claude writes the config itself, no manual file edit needed. 2. The snippet only showed the POSIX path `~/.claude/settings.json`, which doesn't translate to native Windows. Replace the snippet + copy button with a plain Shift+Tab instruction, explicitly call out the first-time "make this the default?" prompt, and note that Claude handles the config write itself — same flow on macOS / Linux / WSL / Windows. Adds a fallback line for users who already closed the post-OAuth session. * Tighten /home Step 2 install-note to two paragraphs Operator: drop the 'Claude writes the setting itself, so this works the same on macOS / Linux / WSL / Windows...' line plus the 'auto-approves file edits going forward; Bash commands stay gated — that's the safe default' line. Both were filler — the make-default prompt already implies persistence, and gated Bash is the obvious default users won't be surprised by. Result: paragraph 1 carries Shift+Tab + first-time make-default say-yes + closed-session fallback in one breath; paragraph 2 keeps the verbatim YOLO link. Same affordances, less vertical space.	2026-05-11 16:46:58 +00:00
minasarustamyan	19c5a7592a	Session capture queue, private session, and setup-prompt fixes (#242 ) * Capture session paths via SessionStart hook + lock parallel pushes Replace the encoding-based scan of ~/.claude/projects/<encoded-cwd>/ with a queue file populated by a new `agnes capture-session` SessionStart hook. The hook reads the documented `transcript_path` field from Claude Code's hook stdin JSON, sidestepping the cwd-to-folder encoding (which is an internal implementation detail and varies by Claude Code version). - New `agnes capture-session` subcommand appends transcript_path to <workspace>/.claude/agnes-sessions.txt. Silent on all malformed input so a hook chain failure doesn't clutter Claude Code startup. - `agnes push` now consumes the queue: atomic snapshot rename guards against hooks writing during the push window, successful uploads land in agnes-sessions-uploaded.txt (TSV: timestamp + path), failed paths are requeued. - Cross-platform single-instance lock via the filelock package (fcntl on POSIX, msvcrt on Windows). Concurrent SessionEnd hooks — common when the user closes several sessions at once — silent-exit on the losing side instead of all racing the upload. - Recovery: pre-existing snapshot files from a crashed push are picked up and processed before the live queue. - The SessionStart `agnes push` self-heal entry is dropped — it became redundant once the queue persists across runs (orphans from headless / crashed sessions ship out on the next interactive SessionEnd push). Existing workspaces auto-migrate via the marker-based replace logic. - Legacy encoding scan stays available behind `--legacy-scan` for one- off backfills of sessions predating the queue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Add /agnes-private + statusLine indicator for private sessions Users handling sensitive data inside Claude Code can now opt a session out of the Agnes upload pipeline, either proactively (right after session start) or reactively (mid-session). The `/agnes-private` slash command runs `agnes mark-private` deterministically via `!`-prefix direct bash — no AI in the loop. A workspace-installed statusLine surfaces a `🔒 agnes-private` indicator in Claude Code's status bar so the user sees the state at a glance. Authoritative source of "do not upload" is a separate file `<workspace>/.claude/agnes-sessions-private.txt` (one session_id per line). Both `capture-session` (queue writer) and `push` (queue reader) consult the list. This makes the slash-command / SessionStart-hook race impossible by construction: whichever runs first, the session is correctly filtered out. - `agnes mark-private` reads `CLAUDE_CODE_SESSION_ID` from env (set by Claude Code in every bash subprocess it spawns — stable documented API) and appends to the private list. - `agnes statusline` reads the session JSON Claude Code pipes on stdin, checks the private list, and emits the indicator or nothing. Optimized for the high call frequency of statusLine renders. - `capture-session` extracts session_id from hook stdin and skips queue write when the ID is already on the private list (race protection). - `push` filters snapshot entries by the private list and appends to a per-workspace audit log `agnes-sessions-private-skipped.txt`. - Queue format migrated from `<path>` to `<session_id>\t<path>`; legacy one-column lines still parse (empty session_id, still upload, can't be marked private retroactively — fine, they pre-date the feature). - `install_claude_hooks` writes a workspace statusLine unless the user already has a custom one (warn + preserve). Idempotent re-init. - `install_claude_commands` ships `agnes-private.md` alongside `update-agnes-plugins.md`. Per-template fallback so a missing template doesn't get clobbered with the wrong content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix setup-prompt + CLAUDE.md marketplace copy + drop skills step Three issues against the post-PR-#240 / post-PR-#237 state: 1. Setup prompt's marketplace block trailer (both has-stack and empty-stack variants) claimed the SessionStart hook keeps the marketplace clone in sync via `agnes refresh-marketplace --quiet` on every session and that admin grants land automatically — both false since PR #237 (0.47.x) moved the install/update path out of the hook into the `/update-agnes-plugins` slash command. The hook is `--check`-only: detects server-side changes, prompts the user to run the slash command, which does the full reconcile interactively with output visible in the transcript. 2. The empty-stack variant framed composition as "admin grants only", missing the actual three-source served stack: (admin RBAC ∩ /marketplace subscriptions) ∪ system-mandatory plugins (admin-pinned, auto-applied) ∪ Flea market installs (skills/agents bundled, plugins standalone) Updated copy spells out all three sources so analysts know where their stack picks live, and what the SessionStart hook actually does on change detection. 3. CLAUDE.md template's "Agnes Marketplace" section conflated eligibility (`resolve_allowed_plugins` — what's listed) with served stack (`resolve_user_marketplace` — what actually reaches Claude Code). The two are different: a user can be RBAC-eligible for a plugin without having subscribed to it on /marketplace. Rewrote the section to distinguish the eligibility set from the served stack and to describe the `--check`-only hook accurately. Plus: deleted the setup prompt's interactive Skills step (final step before Confirm). The named-opinion question — "do you want me to bulk-copy every skill into ~/.claude/skills/agnes/ or pull on-demand via `agnes skills show <name>`?" — had no obvious right answer for new users at the tail end of a wall of technical steps. On-demand lookup is the one-size-fits-all default; `agnes skills list/show` remain discoverable and the CLAUDE.md template references specific skills inline (e.g. agnes-data-querying in the BigQuery section) where they're relevant. Layout: Confirm shifts from step 9 to step 8. Tests updated, full setup/marketplace/welcome surface green (115 passed). Remaining full-suite failures are pre-existing (BQ/Keboola fixtures, Windows charmap collection error in test_v26_keboola_e2e) — verified against a clean stash, unrelated to this diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix session-queue race + snapshot PID-reuse data loss Two blocker fixes from the PR #242 review: 1. Concurrent SessionStart hooks could corrupt the queue file on Windows. Python's `open(path, "a")` is not atomic there — the CRT does not pass FILE_APPEND_DATA to CreateFile, so concurrent appenders (user opening several Claude Code windows simultaneously) could interleave bytes mid-line. The malformed lines then silently fail the parser and the entries are dropped. Fix: wrap append_to_queue, requeue_failed, and snapshot_queue in a short-lived FileLock on a dedicated `agnes-queue.lock`. Separate from `agnes-push.lock` so capture-session hooks don't block on the push command. New test_append_concurrent_threads_no_corruption reproduces the race with 4 threads x 50 appends. 2. Snapshot filenames embedded only the PID (`agnes-sessions.snapshot. <PID>.txt`). After a crashed push left a snapshot on disk and the OS recycled the PID for a new push, `os.rename` would atomically overwrite the recovery snapshot — every entry in it lost, silently. Fix: append a uuid8 hex tail (`agnes-sessions.snapshot.<PID>. <uuid8>.txt`). find_recovery_snapshots already globs the prefix so it picks up both old and new format. New test_snapshot_filename_is_unique_per_call asserts two consecutive snapshots under the same PID don't collide. Targeted tests green (47/47 in session_queue/capture_session/cli_push). Full suite failures unchanged from baseline (pre-existing BQ/Keboola fixture issues per CLAUDE.md). * Auto-refresh workspace hooks + bash-wrap all hook entries (Windows) Fixes from PR #242 second review (ZdenekSrotyr): 1. `uv.lock` regenerated to include `filelock 3.29.0` (declared in pyproject.toml but missing from the lock file — CI's lockfile-consistency check would fail; `uv pip install` on a clean cache would silently miss the dep). 2. `agnes self-upgrade` now auto-refreshes the workspace Claude Code hooks via the new `cli.lib.hooks.maybe_refresh_claude_hooks`. Closes the silent-stop migration gap: a v0.48 workspace would auto-upgrade the CLI from its existing SessionStart self-upgrade entry but never pick up the new `agnes capture-session` SessionStart hook, leaving the queue empty and `agnes push` uploading nothing. The refresh fires on both the "info is None" fast path (CLI already current — catches the second SessionStart after a prior upgrade) and the install-success path. Guarded by `workspace_has_agnes_hooks` so it never writes `.claude/settings.json` into directories that aren't Agnes workspaces (e.g. `agnes self-upgrade` invoked from `~/`). Errors are surfaced on stderr but never flip the upgrade exit code. 3. All Agnes-managed hooks are now wrapped in `bash -c "..."`. The self-upgrade+pull chained SessionStart entry was the only one still shipping unwrapped — Claude Code on Windows runs hook commands directly without a shell, so the `;` chain + `2>/dev/null` + `\|\| true` shell syntax silently no-op'd on native Windows installs without Git Bash on PATH. Workspaces still on the old form auto-upgrade via the refresh path above. Tests: +12 in test_lib_hooks.py (guard semantics, v0.48→v0.49 migration end-to-end, third-party-hook preservation, bash-wrap invariant). +5 in test_self_upgrade.py (refresh fires on info=None, fires on install success, skipped on failure, skipped on --check-only, refresh failure never flips exit code). 130 targeted tests green. The 2 pre-existing Windows path-separator failures in `test_smoke_test_detects_version_mismatch[uv\|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes` in test asserts, pre-PR baseline). * CHANGELOG: document PR-242 main features Closes ZdenekSrotyr #4: the [Unreleased] block was missing entries for the PR's primary surface — only the post-merge fix bullets and the unrelated setup-prompt copy change were captured. Adds: - ### Added: 6 bullets covering the session capture queue + new `agnes capture-session` subcommand, `/agnes-private` slash + `agnes mark-private`, `agnes statusline` + statusLine wiring, `--legacy-scan` opt-in fallback, single-instance push lock, and the new `filelock` runtime dep. - ### Changed: BREAKING bullet on the SessionStart / SessionEnd hook wire format change (capture-session as first SessionStart entry, push self-heal removed, SessionEnd push detached via nohup, all entries bash-wrapped). Folds the prior standalone bash-wrap bullet into this consolidated entry — Z's review flagged the layout shift as BREAKING, and grouping the related sub-changes makes the migration story readable in one place. - Operator migration is auto-handled by `maybe_refresh_claude_hooks` invoked from `agnes self-upgrade` (separate Changed entry below). No `agnes init` re-run required. Pre-queue session jsonls on upgrading workspaces still need a one-off `agnes push --legacy-scan` — flagged in the BREAKING bullet. No code change; doc only. * Drop permanent 4xx uploads instead of requeueing forever Closes ZdenekSrotyr #5. Previously the push retry path requeued any non-200 response except the literal "file not found on disk", so 401 (token expired), 403 (RBAC denial), 413 (payload too large), 400 (server-side validation) cycled through every push run forever — the queue grew without bound and each run re-bombarded the server with the same deterministically-failing upload. Now 4xx (except 408 Request Timeout + 429 Too Many Requests, which the HTTP spec marks as transient) is dropped and audit-logged to `<workspace>/.claude/agnes-sessions-failed.txt`: <iso_ts>\t<session_id>\t<status>\t<transcript_path> 5xx and network errors continue to requeue — those reflect server / transport state that can change between runs, so retry is the right behavior. The audit log piggybacks on the push single-instance lock (agnes-push.lock) — push is the only writer to this file, same as the existing `mark_uploaded` and `mark_private_skipped` paths, so no separate filelock is needed. `agnes push --json` surfaces a new `dropped_permanent` counter; non- quiet stdout mentions the audit-log path so operators tailing the output have a pointer to the forensic trail. Tests: +7 in test_cli_push.py (401/400/403/413 → drop; 408/429 → requeue; 500/502/503 → requeue; network exception → requeue; --json `dropped_permanent` counter; stdout audit-log pointer). +1 in test_session_queue.py (mark_failed_permanent TSV format). 127/129 targeted tests green. The 2 pre-existing Windows path-separator failures in `test_smoke_test_detects_version_mismatch [uv\|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes` in test asserts, pre-PR baseline). * Catch OSError in push lock acquisition Closes ZdenekSrotyr #8. `acquire_or_skip` in `cli/lib/push_lock.py` previously caught only `filelock.Timeout`. Any `OSError` from `FileLock.acquire` — read-only filesystem, permission denied on `.claude/`, disk full, hardware I/O failure — propagated as an unhandled traceback. Two visible failure modes: - SessionEnd hook: `\|\| true` in the wrapper swallowed the error, so daily pushes silently never ran. Operator had no signal. - Manual `agnes push`: ugly Python traceback dumped to the terminal instead of a clean exit. Now `OSError` is treated the same as `Timeout` — yield `None`, caller returns cleanly with rc=0. The operator's environment in these scenarios has bigger problems than missing session uploads, so we swallow rather than retry-loop or surface a noisy warning. Test: `test_push_silent_exit_when_filelock_raises_oserror` patches the `FileLock` used inside `push_lock` to raise OSError on acquire, verifies push exits 0 with no traceback and the queue is preserved for the next attempt. * Address remaining S2 items from PR-242 review Four items from ZdenekSrotyr's S2 list: S2.10 — `_install_statusline` truthy check (cli/lib/hooks.py): replace `if existing:` with explicit `if existing is None or existing == "":`. Documents and tests the behavior for both edge cases (explicit-null and empty-string `statusLine`) — both treated as "not configured" rather than "explicit user opt-out", so we install ours. Two new tests in test_lib_hooks.py pin the contract. S2.6 — onboarding docs for /agnes-private. New "Private sessions" subsection in `config/claude_md_template.txt` (next to Data Sync) covering the slash command, statusbar indicator, and audit-log location. One-line tip in `app/web/setup_instructions.py` so the feature is discoverable at onboarding. S2.9 — e2e privacy test (tests/test_e2e_privacy.py). Wires capture_session → mark_private → push against a recording fake api_post and asserts zero session uploads for the marked one. Three cases: mark-before-capture (queue write skipped), mark-after-capture (push-side filter catches it + audit-logs), control (unmarked sessions upload normally). David #8 — `--legacy-scan` help text now documents the private-list gap (legacy entries carry empty session_id, so the filter is not consulted). The practical impact is bounded — pre-queue sessions cannot have been marked private since the private list is a queue-era feature — but the disclaimer in the help text means an operator running a backfill is not surprised. 68 targeted tests green (3 new e2e + 2 new truthy edge tests + existing). 2 pre-existing Windows path-separator failures in test_smoke_test_detects_version_mismatch[uv\|pip] unchanged. Remaining S2 items (statusline mkdir push-back, capture-session silent-fail follow-up) handled in PR comment + follow-up issue respectively. * Address remaining S2 follow-ups (David #8, S2.7, David #11) Three items left over from Mina's bbf63472 batch — that commit addressed S2.6/S2.9/S2.10 + documented David #8 in help text but deferred the actual implementations of S2.7, David #11, and the real David #8 fix to follow-ups. This commit closes them. David #8 — `agnes push --legacy-scan` now consults the private list. Claude Code names jsonls `<session-id>.jsonl`, so the file stem IS the session id; the legacy-scan path can apply the same private filter the queue path uses. Both the dry-run and live-upload code paths fixed. Help text updated (no longer warns the filter is bypassed). Two new tests in test_cli_push.py cover the upload-skip path + the dry-run `would_skip_private` segregation. S2.7 — `statusline`/`is_private` no longer mkdir-pollutes arbitrary workdirs. Split `_claude_dir` into `_claude_dir_writable` (used only from `add_private`) and `_claude_dir_readonly` (no mkdir). The read-only public helpers (`private_list_path`, `read_all_private`, `is_private`) compose the no-mkdir variant by default; `add_private` opts in via `writable=True`. Added a process-local mtime-keyed cache around `read_all_private` so in-process callers (push doing one stat per upload candidate, future `agnes diagnose`) don't re-parse the file on every check. Cache eviction on `add_private` so a sub-second write+read sequence doesn't see stale data even on coarse-mtime filesystems. Two new tests pin the no-mkdir contract + the in-same-second add+read consistency. David #11 — `agnes capture-session` writes a breadcrumb log on every invocation. New `<workspace>/.claude/agnes-capture-session.log` TSV: `<iso_ts>\t<outcome>\t<detail>` where outcome covers every silent- exit path (`ok`, `private_skip`, `empty_stdin`, `bad_json`, `not_object`, `no_transcript_path`, `stdin_read_error`, `write_error`). Gives operators a signal to detect "hook fires but queue stays empty" — without it, an upstream Claude Code stdin- contract change is invisible because the hook always exits 0. Log rolls at 256 KiB so it doesn't grow unbounded on long-lived workspaces. Best-effort: a breadcrumb-write failure is itself swallowed so the hook contract stays "exit 0 always". Skipped in non-Agnes workdirs (no `.claude/` exists) so opening Claude Code in `~/` doesn't pollute it. Five new tests in test_capture_session.py cover the success / bad_json / no_transcript_path / private_skip / no-pollute paths. 115 targeted tests green (test_cli_push, test_capture_session, test_private_list, test_session_queue, test_e2e_privacy, test_lib_hooks, test_statusline, test_mark_private). --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-11 13:31:16 +00:00
Vojtech	41829e8a45	Setup-prompt + bootstrap fixes from 2026-05-10 init report (#240 ) * Setup-prompt + bootstrap fixes from David's 2026-05-10 init report Three issues from clean-machine bootstrap evidence: 1. `agnes refresh-marketplace --bootstrap` failed to recover when the local clone existed but Claude Code's marketplace registry had lost the `agnes` entry. Bootstrap path now parses `claude plugin marketplace list`, re-runs `claude plugin marketplace add ~/.agnes/marketplace` when missing, and treats `add` failures as fatal (was warn-and-continue, root cause of the cascade into "Marketplace 'agnes' not found" plugin install errors). 2. Setup prompt now always emits the marketplace-registration block, even when the operator has zero plugin grants. Pre-wires the SessionStart hook so future admin grants land automatically without re-running setup. Block copy adapts: empty list shows "no plugins granted yet", populated list shows "install plugins". 3. Setup prompt registers the Atlassian Remote MCP server unattended (`claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse`). Hosted Remote MCP, OAuth handled automatically by Claude Code on first use. Asana / GWS stay on the /home connector cards (PAT/keychain flows don't fit unattended bootstrap). Confirm step nudges the user toward the /home connector cards for the PAT-flow services. CLAUDE.md template renames the marketplace section to "Agnes Marketplace" and documents that all plugins are addressed as `<plugin>@agnes` regardless of upstream slug. Layout: Confirm shifts from step 6/8 to step 9 across all variants (preflight, marketplace, MCP all unconditional). Tests updated. * Link Claude license options from /home install pane Step-1 Claude install on /home pointed users to OAuth without explaining what to do if they don't have a Pro/Max subscription. Add a one-line follow-up link to the plan-tier section on /setup-advanced (new `#claude-plan` anchor) so first-time users discover the subscription tiers rather than bouncing on the OAuth screen. * Add idempotent + no-TLS-bypass guardrails to /home connector prompts The Asana / Google Workspace / Atlassian connector prompts on /home already shipped a precheck step that short-circuits when the service is already wired, but they didn't carry the same idempotency + surface-errors-verbatim + don't-disable-TLS-verification guardrails the bash bootstrap prompt has. Add a one-paragraph 'Ground rules' block at the top of each prompt so a connector failure doesn't tempt the model into bypass workarounds, matching the same posture David's 2026-05-10 init report flagged for the bash flow. * skip Source: lines in marketplace registry detector `claude plugin marketplace list` prints a `Source: <local path>` line under each registered marketplace; the local clone almost always lives under a path containing the marketplace name itself (`~/.agnes/marketplace`). A naive \\bagnes\\b match over the full stdout therefore false-positives whenever ANY unrelated marketplace sits under `~/.agnes-…/` or similar. Filter Source: lines out before matching so the recovery path actually re-adds when needed instead of silently falling through to a broken `marketplace update agnes`. Adds regression test covering the substring-only case. * drop customer-specific tokens from CHANGELOG entries Per CLAUDE.md vendor-agnostic OSS rule ("nothing customer-specific ... in changelogs"): - "agnes-vrysanek.groupondev.com" -> "a private-CA Agnes deployment" - "Groupon Marketplace / groupon-marketplace" -> "<Org> Marketplace / <org>-marketplace" (placeholder example) - Removed "David flagged" attribution language; init-report context stays intact, just stripped of the named host + brand --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-10 20:24:00 +02:00
ZdenekSrotyr	28423907fd	feat: clean CLI errors + init progress + skip-materialize + claude.md catalog pointer Three first-try-failure-surface fixes from Pavel's #185 trace + the template guidance question, all under PR #188's umbrella so they land together with the file_server / parallel pull / Tier 1 work. 1. CLI clean-error wrapper — new AgnesTransportError raised by the api_*/stream_download helpers when httpx times out / drops / refuses, plus a top-level Typer wrapper (cli/main.py) that prints one-line "Error: …" + actionable hint and exits non-zero. Full traceback goes to ~/.config/agnes/last-error.log for support forwarding. Unhandled Exceptions are caught at the same boundary so no Python traceback ever leaks to the analyst's terminal. Pavel's #185 Phase 3B: a 30-frame httpx traceback from a slow BQ --remote query made it look like a CLI bug. Now: clean message + hint pointing at `agnes snapshot create` / partition-column guidance. Entry point in pyproject.toml flipped from `cli.main:app` → `cli.main:_run_with_clean_errors` so the wrapper actually runs under the installed `agnes` binary. 2. agnes init / agnes pull --skip-materialize + progress bar. --skip-materialize omits query_mode='materialized' rows from the download set so a first init doesn't spend 44 minutes silently pulling a single 6 GB parquet (Pavel's #185 Phase 1). Rich-driven per-file progress bar with label/bytes/rate/ETA renders to stderr when not --quiet and not --json. Aggregates across the parallel ThreadPoolExecutor workers added earlier in this PR. 3. config/claude_md_template.txt: explicit one-line snippet pointing at `agnes catalog --json \| jq '.tables[] \| select(.id=="<id>")'` for per-table descriptions + restated invariant: "the description field on each catalog row is the authoritative business-rules text — re-read live, never copy into this file." Resolves the regression-or-feature debate between Pavel (wants annotations) and the user feedback that landed in the prior commit (don't embed table-specific content; tables change). Catalog command stays the source of truth.	2026-05-05 18:11:59 +02:00
ZdenekSrotyr	7a72ea9c37	fix: Devin Review on #188 — try_files fallback + auto-upgrade ordering Two bugs Devin caught: 1. Caddy `try_files A B C` rewrites the URI to its LAST entry when no file matches (per Caddy docs). Without an explicit "back to original URI" fallback, a parquet missing from all three known static paths would get rewritten to `/jira/data/<id>.parquet`, and the reverse_proxy below would forward THAT rewritten URI to app:8000 → 404. The PR's documented "missed → falls through to app handler" promise didn't actually hold for legacy / future connectors. Append `/api/data/<id>/download` as the final try_files entry so the reverse_proxy receives the analyst-facing URI. 2. agnes-auto-upgrade.sh's TLS-overlay decision (which checks Caddyfile existence) ran BEFORE the config re-fetch loop. If a tick's fetch added a previously-missing Caddyfile, this tick's docker compose would still omit `--profile tls` until the next 5-min tick — a window where the recreate uses the wrong overlay set. Move the COMPOSE_FILES tls extension AFTER the fetch. Also strip the workspace prompt of table-list / metric-count enumerations (per user feedback): those are dynamic snapshots that go stale; replace with explicit "use `agnes catalog` / `agnes schema` / `agnes describe` to discover" guidance plus a note about `rough_size_hint` semantics. The Available Datasets `{% for t in tables %}` loop is gone — analysts use the live CLI instead.	2026-05-05 17:24:42 +02:00
ZdenekSrotyr	30e81a15b9	feat(workspace-prompt): decision tree + size-hint so analyst Claude gets it right first try Three concrete changes addressing the "analyst Claude misuses the CLI" class of bugs (image.png table — issues #3, #5, plus the recurrent "how big is this table" guesswork): 1. config/claude_md_template.txt — the template agnes init writes to <workspace>/CLAUDE.md. Surfaces every catalog-row field with a why, adds a query_mode-based decision tree, explicit --estimate scoping (snapshot create ONLY — was the #1 first-try error), an agnes fetch → agnes snapshot create rename note, and a 6-row failure-mode table that maps each common error wording to its right next step. 2. app/api/v2_catalog.py — populate rough_size_hint for local + materialized rows from the on-disk parquet size, bucketed small/medium/large/very_large. Was hardcoded null with a TODO; AI couldn't tell "is this 6.8 GB" without a failed --remote round-trip. 3. cli/update_check.py — the [update] banner survived the da→agnes rename and printed "[update] da X is out of date" on every command, training analysts to associate the binary with the old name. Verified by rendering the template against representative contexts (33/33 tests pass) and running every use case from the original screenshot through the real CLI against a dev VM.	2026-05-05 16:44:24 +02:00
ZdenekSrotyr	3d58768143	fix: address Devin Review findings — incomplete renames + estimate guard 13 Devin findings across 10 files: 🔴 Critical: - app/api/v2_catalog.py:42 — `_fetch_hint` returns `da fetch` in /api/v2/catalog responses (user-visible in every catalog list) - cli/skills/agnes-data-querying.md — 11 stale `da fetch`/`da sync` refs in the bundled skill markdown - config/claude_md_template.txt:38 — referenced `agnes pull --docs-only` flag that does NOT exist in agnes pull (removed; spec only ships --quiet/--json/ --dry-run) 🟡 Important: - app/api/admin.py:252 — `da fetch` in bq_max_scan_bytes hint - cli/commands/auth.py:119 — `da sync` in import-token docstring (--help text) - cli/commands/tokens.py:48 — "Export it so `da` can use it" prose - ARCHITECTURE.md — 4 stale rows in CLI commands table - README.md — stale paragraphs for analysts (da sync, da analyst setup) 🚩 Substantive observations addressed: - app/api/query.py:249,302,489 — server-side error/help strings still said `da sync`/`da fetch` (returned in API responses to clients) - cli/commands/snapshot.py:235-241 — DuckDB existence guard incorrectly blocked `--estimate` (server-side dry-run that never opens local DB). Added test ensuring estimate path skips the guard. Skipped (intentionally historical): - app/api/admin.py:2377,2429,2437 — historical comments describing past manifest-vs-sync_state bug; past tense, accurate to keep as `da sync`.	2026-05-04 20:05:06 +02:00
ZdenekSrotyr	d25d075ed2	docs(claude-md-template): rewrite verbs + paths for new CLI surface (Task 6) - Verb renames (da X -> agnes X for surviving verbs; legacy verbs already absent from this default template — admin overrides with legacy verbs are caught by Task 2's _LEGACY_STRINGS scan + Task 5's admin banner). - Path renames: data/parquet/ -> server/parquet/, data/duckdb/ -> user/duckdb/, data/metadata/ removed entirely (no longer exists per spec). - Drop user/artifacts/ from directory structure (spec workspace layout drops it; surviving paths: server/parquet/, user/duckdb/, user/snapshots/, user/sessions/). - Add AGNES_WORKSPACE.md pointer near top-of-template so analysts know where to find human-readable docs. Cleans Task 0.5's missed sweep on this file (was not in cli/ tree but is user-visible via /api/welcome). 81 claude_md/welcome_template tests pass.	2026-05-04 17:51:14 +02:00
ZdenekSrotyr	a2157ee807	fix(claude_md): restore full default content (BQ cost guard, hybrid example, ad-hoc table, deeper guidance)	2026-05-04 05:48:04 +02:00
ZdenekSrotyr	f01eb4143d	feat(db,repo,renderer): schema v23 + claude_md_template + ClaudeMd renderer - Bump SCHEMA_VERSION 22 → 23; add claude_md_template singleton table to _SYSTEM_SCHEMA and _V22_TO_V23_MIGRATIONS; wire migration + fresh-install seed - src/repositories/claude_md_template.py: ClaudeMdTemplateRepository (get/set/reset) mirroring WelcomeTemplateRepository; defensive re-seed in get() - src/claude_md.py: compute_default_claude_md / render_claude_md / build_claude_md_context — rich renderer with RBAC-filtered tables, metrics, and marketplaces; reads override from claude_md_template or falls back to config/claude_md_template.txt; raises TemplateError on broken override - config/claude_md_template.txt: default Jinja2 markdown template restored from PR #167 history (tables, metrics, marketplaces, BQ guidance, corporate memory, directory structure, per-user footer)	2026-05-03 22:43:56 +02:00
ZdenekSrotyr	c4d23cf235	feat(admin-prompt): update editor UX + docs for banner context - admin_welcome.html: update subtitle, description, placeholder cheatsheet (drop tables/metrics/marketplaces/sync_interval; add user-null note and security note). Textarea initial value is now empty (no default template to show). Preview pane uses innerHTML (HTML output). refreshStatus sets editor to empty when no override. Preview pane styled as light surface. Reset modal copy updated (no banner shown, not "OSS-shipped template"). - config/claude_md_template.txt: deleted (markdown template is gone; default is now no banner). - docs/agent-setup-prompt.md: rewritten for variant C — describes the /setup banner, smaller placeholder table, security/sanitization notes, anonymous-user guard, example HTML snippet.	2026-05-03 16:12:13 +02:00
ZdenekSrotyr	d055417377	feat(config): default welcome template in jinja2 + sync_interval	2026-05-03 16:10:48 +02:00
Vojtech	bd7b8c3233	fix(analyst): document BigQuery remote-query capability in bootstrap CLAUDE.md template (#154 ) * fix(analyst): document BigQuery remote-query capability in bootstrap CLAUDE.md template Closes #153. The CLAUDE.md template generated by `da analyst bootstrap` (config/claude_md_template.txt) covered metrics, sync, corporate memory, and directory layout — but had ZERO mention of query_mode: "remote", da fetch, da query --remote, or --register-bq. Result: the AI analyst running in a freshly-bootstrapped workspace had no idea BigQuery-backed tables existed, no path to fetch unsynced data, and no fallback for tables not in the catalog. Validated against /Users/<user>/foundry-ai/foundryai-data-analyst/CLAUDE.md on 2026-05-01: section confirmed missing. Workspace-level (parent-dir) CLAUDE.md carried legacy SSH-heredoc instructions but the analyst-level file (which Claude reads as primary project context) had nothing. ## Changes ### config/claude_md_template.txt (+83) Added a `## Remote Queries (BigQuery)` section covering: - Discovery first — `da catalog --json \| jq '...'` to see all tables with their query_mode, then `da schema` and `da describe` for shape. - Three query patterns: - `da fetch` (preferred) — materialize a filtered subset locally, query the snapshot, drop when done. - `da query --remote` — one-shot server-side execution (cheap probes). - `da query --register-bq` — hybrid joins between local + ad-hoc BQ. - `da fetch` estimate-first discipline — rules of thumb on --select / --where / --estimate / snapshot reuse. - BigQuery SQL flavor cheat sheet for `--where` (DATE literal, DATE_SUB, REGEXP_CONTAINS, CAST AS INT64). - Unknown-table fallback: when a table isn't in `da catalog` at all, use ad-hoc `--register-bq` if the agnes server SA has BQ access, or ask admin to register with `query_mode: "remote"` for ongoing use. - Pointer to `da skills show agnes-data-querying` for deeper guidance. ### docs/setup/claude_md_template.txt (deleted) Stale 359-line template that documented the deprecated SSH-heredoc remote_query.sh protocol. No code references it (verified via grep across .py / .sh / .yml / .md). Removing eliminates two failure modes: 1. A future refactor accidentally pulling it into a workspace and shipping deprecated guidance to analyst Claude sessions. 2. Reviewer confusion over which template is canonical. ### CHANGELOG.md `### Fixed` and `### Removed` entries under [Unreleased]. ## Tested - Manually walked the diff against `da skills show agnes-data-querying` output on a live VM (foundryai-development) — patterns + flags match the modern CLI exactly. - Re-bootstrap test deferred: requires network round-trip; pattern is identical to existing template substitution path so render is not at risk. ## Out of scope - The companion gap that data_description.md often only enumerates query_mode: "local" tables (no signal that other modes exist) — separate concern, fix likely belongs in the metadata generator on the server side, not in the analyst template. - Encouraging admins to register frequently-queried BQ tables as `query_mode: "remote"` in the registry — workflow improvement, not a code bug. * chore(release): cut 0.28.0 --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-01 12:06:41 +02:00
PavelDo	e1108b6112	feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72 ) Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).	2026-04-29 07:16:22 +02:00
ZdenekSrotyr	c825ead209	feat: add CLAUDE.md template for analyst bootstrap	2026-04-10 19:41:07 +02:00

16 commits