agnes-the-ai-analyst

Author	SHA1	Message	Date
Vojtech	79a958ec26	feat(setup): configurable instance brand + connector setup overhaul (#268 ) - instance.brand (env AGNES_INSTANCE_BRAND, default "Agnes") + instance.workspace_dir replace hard-coded "Agnes" / "~/Agnes" across /home, /setup, /setup-advanced, /login, /install, /me/debug, and the Claude Code clipboard setup script. Terraform-friendly env override; defaults preserve existing Agnes branding. - Explicit "create workspace folder" step on /home (OS-tabbed mkdir+cd) + same step baked into the clipboard script as step 2. Drops the implicit assumption that `agnes init --workspace .` lands in a sensibly-cd'd shell. - Final "Restart Claude Code" step in the setup script (unconditional, between connectors and Confirm) so freshly-installed plugins, MCP servers, and SessionStart hooks load on the next Claude Code session. - Asana reverted from hosted Remote MCP back to PAT + raw REST against app.asana.com/api/1.0. MCP envelope shape consumed ~5x tokens per call; the PAT path lets the agent read flat REST fields. Existing MCP registration is detected and the user is asked whether to remove it (default Y, with benefits listed: token cost, no third-party hop, no OAuth refresh dance, deterministic envelope shape). - Atlassian connector instructs picking the longest API-token expiry (today "1 year") to cut re-mint friction. No public query-parameter hook exists on id.atlassian.com to pre-select expiry, so the prompt documents the manual click and acknowledges that limitation. - Uniform ✅ / ❌ per-connector marker contract (Asana, GWS, Atlassian) for the Confirm summary to grep. Each connector now ends with a Claude-driven end-to-end test that uses Claude Code's own bash to exercise the stored credential and prints "✅ <Connector> integration verified — ..." (or the failure variant).	2026-05-12 17:10:08 +02:00
Vojtech	c09c85d13a	fix(cta): clipboard fallback + fold Atlassian MCP into connectors (#249 ) * fix(cta): fall back to textarea+execCommand when Clipboard API rejects The "Setup a new Claude Code" CTA fetches /auth/tokens, parses the JSON response, renders the setup script, THEN calls `navigator.clipboard.writeText()`. Modern browsers (Safari, Firefox, and Chrome on stricter configurations) reject `writeText` with NotAllowedError when transient user activation has been consumed by an intervening `await` — which is exactly the case here. Users perceived this as "the browser blocked the copy" and got the manual-paste fallback modal even though the textarea + `document.execCommand('copy')` path WOULD have worked synchronously without needing fresh user activation. `copyToClipboard` now: - prefers the modern Clipboard API (unchanged for the happy path) - on writeText rejection, falls back to `copyViaTextarea` instead of surfacing the rejection to the caller's catch block. `copyViaTextarea` is the previously-inline textarea fallback factored out into a named helper, with two small hardening touches: - `readonly` + `tabindex=-1` so the hidden textarea doesn't steal focus or pop the virtual keyboard on mobile. - explicit `setSelectionRange(0, text.length)` to belt-and-braces the selection on iOS Safari (where `.select()` alone sometimes selects zero chars on touch-focused textareas). Only the CTA button needed this — the Step-1 install-command and the connector-copy buttons all call `writeText` synchronously inside the click handler (no awaits in between), so they keep their existing user-gesture context and didn't hit the same rejection. No template changes there. * refactor(home): fold Atlassian MCP registration into connectors block The standalone "Register the Atlassian MCP server" step (was step 6 in the unified setup script) moves INTO the Atlassian connector's prompt body so all Atlassian-related setup lives in one logical group. Same intent that #247 carried for connectors, applied one level deeper: the hosted Remote MCP registration is part of "set up Atlassian", not its own ungrouped step. What changed: - `app/web/connector_prompts.py` — the Atlassian prompt's step 5 replaces the speculative "Register the on-demand Atlassian MCP under .claude/mcp/atlassian" line with the actual hosted Remote MCP registration: `claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse \|\| true`. The `\|\| true` keeps re-runs idempotent and the body explains the OAuth-on-first-use contract. Both /home's Atlassian tile and the inlined setup-script Atlassian sub-block emit this line — single source of truth holds. - `app/web/setup_instructions.py` — `_mcp_servers_block` deleted; the `mcp_servers` step is removed from `_step_numbers`; resolve_lines no longer calls it. - Renumbering: install (1), init (2), catalog (3), preflight (4), marketplace (5), diagnose (6), connectors (7), confirm (8). Was: 6 = mcp_servers, 7 = diagnose, 8 = connectors, 9 = confirm. - `tests/test_setup_instructions.py` — Confirm step 9→8, Connect 8→7, diagnose 7→6, mcp_servers references dropped. `test_step_numbering_with_connectors_step` now asserts `"mcp_servers" not in steps`. Stray-Confirm assertion lists shift by one position. - `tests/test_setup_page_unified.py` + `tests/test_web_ui.py` — same step-number shifts in the rendered /setup preview assertions. The `claude mcp add` line is still the Atlassian Remote-MCP path that the 2026-05-10 init-report Fix C added — only its position in the flow changes. /home Atlassian tile copying continues to install the MCP too (the prompt body the tile pastes contains the same line). 112 tests pass. * feat(atlassian): operator-overrideable base URL via AGNES_ATLASSIAN_BASE_URL Adds an env var / YAML key the operator (Terraform module, customer-VM template, OSS instance.yaml) can set to bake the Atlassian Cloud site root into the connector prompt — so end users don't have to guess / paste their org's `https://<myorg>.atlassian.net`. When set, the Atlassian connector prompt (rendered on both /home tile and inlined into the setup-script step 7 Atlassian sub-block) replaces step 1's "Ask me for my Atlassian Cloud site URL and email" with a one-line note that the URL is already provisioned by the operator and asks only for the email. Step 4's helper-script body has the `BASE_URL='<the site URL I gave you>'` placeholder substituted with the literal value. When unset (empty), the existing "ask the user" flow remains — no regression for OSS instances. Resolution + normalization in `get_atlassian_base_url()`: - env `AGNES_ATLASSIAN_BASE_URL` > yaml `instance.atlassian.base_url` > "" - strips trailing slash + trailing `/wiki` so the canonical value is the bare site root. Matches the per-user helper script's normalization at storage time (atlassian_prompt step 4 guard 2), so the literal baked in by the operator stays consistent with what the user's helper script would have computed from their input. Plumbing: - `app/instance_config.py`: new `get_atlassian_base_url()` resolver. - `app/web/connector_prompts.py`: - `atlassian_prompt(, base_url: str = "")` — string-replace two explicit placeholder phrases when base_url is truthy; otherwise return the prompt unchanged. - `all_connector_prompts(..., atlassian_base_url: str = "")` — forwards the kwarg. - `app/web/router.py` (`_build_context`): reads `get_atlassian_base_url()` and passes it through to `all_connector_prompts(...)` so both the /home tile context AND the inlined-script `resolve_lines(...)` call use the same value. - `src/welcome_template.py` (`compute_default_agent_prompt`): same threading via the existing import-on-demand path. Tests (`tests/test_home_route_resolution.py`): - `get_atlassian_base_url` resolver: default empty, env override, trailing-slash strip, trailing-`/wiki` strip. - `atlassian_prompt(base_url=...)`: literal URL baked in, ask-step removed, placeholder replaced, operator-baked-in copy appears. - `atlassian_prompt(base_url="")`: existing ask-the-user flow unchanged. - `all_connector_prompts(atlassian_base_url=...)`: kwarg threads through to the rendered atlassian prompt. 135 tests pass. feat(asana): register hosted Asana Remote MCP in connector prompt The Asana connector prompt only stored a PAT in the OS keychain + ran a curl verify against /api/1.0/users/me. That set Claude Code up for direct `curl` calls but didn't actually wire Asana into Claude's tool list — so the user couldn't ask Claude to "find my open Asana tasks" and have it work. Symmetric oversight to the Atlassian connector's original speculative `.claude/mcp/atlassian` line that this branch already replaced with `claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse`. Adds a new step 5 that registers Asana's hosted Remote MCP: claude mcp add --transport http asana https://mcp.asana.com/mcp \|\| true This is the V2 endpoint (streamable HTTP transport, launched February 2026). The V1 SSE endpoint at https://mcp.asana.com/sse was deprecated 2026-05-11 (today) and must NOT be used — calling it out explicitly in the prompt body so a future operator who finds an old reference doesn't paste the dead URL. OAuth is handled by Claude Code at first use, same model as the Atlassian MCP step. The PAT stored in step 3 stays for direct `curl` calls (precheck + ad-hoc scripts) — the MCP path uses its own OAuth grant, not the PAT. Old step 5 (revoke instructions) renumbers to step 6 and adds the `claude mcp remove asana` cleanup hint. Same single-source-of-truth invariant holds: /home Asana tile + the inlined Asana sub-block in the setup script (step 7 connectors) both emit identical text from `asana_prompt()`. 71 tests pass. * feat(asana): drive MCP OAuth login + end-to-end validation post-register `claude mcp add --transport http asana ...` only registers the server in Claude Code's local config — it does NOT trigger OAuth. The browser tab opens the first time any `mcp__asana__` tool gets invoked. So the previous step 5 left a user looking at a "registered" MCP that, in practice, hadn't authed yet and would fail on first real use. Same blind spot Atlassian's prompt also has, but Asana was the one called out in the latest review pass. Adds a new step 6 between MCP registration (step 5) and the revoke instructions (now step 7): a. Tell the user verbatim what's about to happen — a low-impact read through the MCP will pop the OAuth browser tab; sign in with the same account whose PAT they stored in step 3 and approve. Frames the OAuth as one-time so users don't wait for it on every later call. b. Drive an actual MCP read. Don't prescribe the exact tool name because the Asana MCP's exposed surface (`mcp__asana__`) is versioned upstream and we don't want to pin to a name that gets renamed. Instead: tell Claude to pick the lightest read from its surfaced tool list (users-me / list-workspaces / equivalent). Document the recovery path when Claude Code times out waiting for the OAuth tool use: `claude mcp list` to confirm registration before retrying. c. Print a single one-line proof that combines wiring + auth: "Asana MCP connected as <name> — <N> workspace(s) visible." Explicit anti-echo callout for tokens, task content, comments. On failure, surface the exact Claude-Code error and stop — no silent pass. d. Sanity-check that the MCP OAuth identity and the PAT identity reference the same Asana account. Easy mistake to make when the user has multiple Asana accounts — flag only on mismatch, keep quiet when they match. Recovery: `claude mcp remove asana && claude logout asana` then redo step 5. Step 7 (revoke) absorbs both the keychain delete + the `claude logout asana` line so users have a single place to undo everything. 43 tests pass. * fix(init): clear stale CA env vars on Windows before any TLS handshake Reported by the 2026-05-11 Windows test pass: after `agnes init` the gws connector failed with `UnknownIssuer` TLS errors because `SSL_CERT_FILE` and `REQUESTS_CA_BUNDLE` were still set in Windows User scope pointing at `C:\Users\localadmin\.config\agnes\ca-bundle.pem` — a file that did not exist on the test host. Past Agnes installs (the setup-prompt trust block + older bootstrap helpers) write those pointers when they materialize a combined Agnes-CA bundle; when the bundle file later disappears (re-init on a new VM, machine swap, the ~/.agnes dir wiped), the pointers go stale and every native Windows TLS handshake fails before Agnes itself runs. SSL_CERT_FILE in particular REPLACES (not appends to) the trust store, so a stale pointer is silently catastrophic. `agnes init` now clears stale pointers in two layers before the first server roundtrip: 1. Current-process env (os.environ) — what the immediately-following `api_get` to /api/catalog/tables actually reads. Without this, init itself blows up before it gets to step 2. 2. Windows User-scope env via PowerShell `[Environment]::SetEnvironmentVariable(name, $null, 'User')` — what every future shell + every native tool (gws, claude.exe, pip, uv) inherits. The 2026-05-11 reporter expected this exact cleanup ("init was supposed to clear these but they persisted"). The cleanup is best-effort and conservative: - Only deletes a var when its value points at a path that does NOT exist on disk. Intentional operator config (e.g. SSL_CERT_FILE pointing at a corp certifi bundle) stays put. - PowerShell missing / restricted execution policy / WSL-without-pwsh: swallowed silently. The current-process leg still runs, which unblocks init even on hosts where the User-scope leg cannot fire. Tests (`tests/test_init_ca_cleanup.py`, 6 cases): - Stale pointers → removed from process env. - Real-path pointers → preserved. - Non-Windows hosts: PowerShell is not invoked. - Windows hosts: PowerShell IS invoked with a script that checks all three vars + uses Test-Path + SetEnvironmentVariable. - PowerShell FileNotFoundError: cleanup swallows it, does not raise. - `_is_windows_host()` reflects sys.platform. * refactor(asana): MCP-first flow — drop PAT storage, precheck via `claude mcp list` The Asana hosted MCP at https://mcp.asana.com/mcp authenticates via OAuth (Claude Code holds the grant; browser tab pops on first tool use). The earlier prompt walked the user through creating + keychain- storing an Asana Personal Access Token AND registering the MCP — two parallel auth surfaces for one connector. Once the MCP works, the PAT has no consumer: the precheck/verify steps that used `curl $BASE/api/1.0/users/me` are just redundant proof that Asana itself is reachable, which the OAuth handshake already establishes. Removed: - Step 0 keychain probe + curl verify against /users/me with PAT. - Step 1 open developer-console / create PAT. - Step 2 click "+ New access token", warn shown-ONCE. - Step 3 helper-script for keychain-storage (per-OS bodies: macOS `security add-generic-password`, Linux `secret-tool store`, Windows `cmdkey /generic`). - Step 4 PAT-side `users/me` verify. - Step 5's split that kept the PAT around for direct curl scripts. - Step 6d's "MCP vs PAT identity sanity check" — there is no PAT anymore, nothing to mismatch against. New flow (3 steps total): - Step 0 precheck: `claude mcp list \| grep ^asana` — if found, the server is registered AND Claude Code is holding its OAuth grant (otherwise prior failure would have removed it); print "Asana MCP already registered — skipping setup" and stop. Tells the user the explicit reset command (`claude mcp remove asana && claude logout asana`) so a re-register stays one paste. - Step 1: `claude mcp add --transport http asana https://mcp.asana.com/mcp` — no `\|\| true` because step 0 should have caught the "already exists" case. Step explains the V2-vs-V1 endpoint distinction (V1 SSE deprecated 2026-05-11) and the abort-clean recovery if the precheck somehow missed the existing server. - Step 2: same OAuth + low-impact-read validation pattern as before. - Step 3: revoke instructions (mcp remove + logout + Asana-side app revoke at app.asana.com/Settings → Apps). Both surfaces (the /home Asana tile and the inlined Asana sub-block in the setup script's step 7) emit the new text from the same asana_prompt() — single-source-of-truth invariant intact. 77 tests pass.	2026-05-11 21:54:51 +02:00
Vojtech	a46b9dc928	/home install-hero polish: license link contrast, auto-mode reorder, Shift+Tab guidance (#243 ) * Make /home install-hero links readable against blue background The Claude license-options link added in the previous commit inherited the default `<a>` style (`var(--hp-primary)` blue), which renders as blue-on-blue and is unreadable inside the blue install-hero. Add a scoped `.install-hero a` rule that uses white with an underline (matching the existing lead-paragraph contrast pattern) so any link nested in the hero stays legible. * Reorder /home install flow: auto-mode is now Step 2, Agnes install becomes Step 3 Step 3 (was Step 2) pastes a ~20-command bash bootstrap into a fresh Claude Code session. Without auto-mode enabled first, each Bash/edit command needs a manual approve click — bad UX for first-time users. Move auto-mode from the outside-hero `<details>` reference block into the install-hero as a real Step 2, between "install Claude Code" and "install Agnes". Content is the persistent `acceptEdits` snippet (write to ~/.claude/settings.json) plus a one-liner pointing at Shift+Tab for users who are already inside a running Claude Code session. YOLO mode for full Bash auto-approve stays on /setup-advanced behind the existing link. The outside-hero `setup-collapsible[data-section="step3"]` block is dropped — auto-mode is no longer reference content, it's a real install step, and duplicating it would just diverge over time. Onboarded users no longer see the auto-mode block at all (consistent with Steps 1 + 3 also hiding post-onboarding). Completion banner copy updated: "Step 1, 2 & 3 done — Claude Code installed, auto-mode set, Agnes ready". Dashboard CTA partial and other templates don't reference step numbers for this flow, so no adaptation needed there. * Simplify /home Step 2 to Shift+Tab only — drop the JSON snippet Operator pointed out two issues with the prior Step 2: 1. The settings.json snippet is redundant. Claude Code's first Shift+Tab cycle to auto-accept mode already prompts the user whether to persist it as default — Claude writes the config itself, no manual file edit needed. 2. The snippet only showed the POSIX path `~/.claude/settings.json`, which doesn't translate to native Windows. Replace the snippet + copy button with a plain Shift+Tab instruction, explicitly call out the first-time "make this the default?" prompt, and note that Claude handles the config write itself — same flow on macOS / Linux / WSL / Windows. Adds a fallback line for users who already closed the post-OAuth session. * Tighten /home Step 2 install-note to two paragraphs Operator: drop the 'Claude writes the setting itself, so this works the same on macOS / Linux / WSL / Windows...' line plus the 'auto-approves file edits going forward; Bash commands stay gated — that's the safe default' line. Both were filler — the make-default prompt already implies persistence, and gated Bash is the obvious default users won't be surprised by. Result: paragraph 1 carries Shift+Tab + first-time make-default say-yes + closed-session fallback in one breath; paragraph 2 keeps the verbatim YOLO link. Same affordances, less vertical space.	2026-05-11 16:46:58 +00:00
minasarustamyan	19c5a7592a	Session capture queue, private session, and setup-prompt fixes (#242 ) * Capture session paths via SessionStart hook + lock parallel pushes Replace the encoding-based scan of ~/.claude/projects/<encoded-cwd>/ with a queue file populated by a new `agnes capture-session` SessionStart hook. The hook reads the documented `transcript_path` field from Claude Code's hook stdin JSON, sidestepping the cwd-to-folder encoding (which is an internal implementation detail and varies by Claude Code version). - New `agnes capture-session` subcommand appends transcript_path to <workspace>/.claude/agnes-sessions.txt. Silent on all malformed input so a hook chain failure doesn't clutter Claude Code startup. - `agnes push` now consumes the queue: atomic snapshot rename guards against hooks writing during the push window, successful uploads land in agnes-sessions-uploaded.txt (TSV: timestamp + path), failed paths are requeued. - Cross-platform single-instance lock via the filelock package (fcntl on POSIX, msvcrt on Windows). Concurrent SessionEnd hooks — common when the user closes several sessions at once — silent-exit on the losing side instead of all racing the upload. - Recovery: pre-existing snapshot files from a crashed push are picked up and processed before the live queue. - The SessionStart `agnes push` self-heal entry is dropped — it became redundant once the queue persists across runs (orphans from headless / crashed sessions ship out on the next interactive SessionEnd push). Existing workspaces auto-migrate via the marker-based replace logic. - Legacy encoding scan stays available behind `--legacy-scan` for one- off backfills of sessions predating the queue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Add /agnes-private + statusLine indicator for private sessions Users handling sensitive data inside Claude Code can now opt a session out of the Agnes upload pipeline, either proactively (right after session start) or reactively (mid-session). The `/agnes-private` slash command runs `agnes mark-private` deterministically via `!`-prefix direct bash — no AI in the loop. A workspace-installed statusLine surfaces a `🔒 agnes-private` indicator in Claude Code's status bar so the user sees the state at a glance. Authoritative source of "do not upload" is a separate file `<workspace>/.claude/agnes-sessions-private.txt` (one session_id per line). Both `capture-session` (queue writer) and `push` (queue reader) consult the list. This makes the slash-command / SessionStart-hook race impossible by construction: whichever runs first, the session is correctly filtered out. - `agnes mark-private` reads `CLAUDE_CODE_SESSION_ID` from env (set by Claude Code in every bash subprocess it spawns — stable documented API) and appends to the private list. - `agnes statusline` reads the session JSON Claude Code pipes on stdin, checks the private list, and emits the indicator or nothing. Optimized for the high call frequency of statusLine renders. - `capture-session` extracts session_id from hook stdin and skips queue write when the ID is already on the private list (race protection). - `push` filters snapshot entries by the private list and appends to a per-workspace audit log `agnes-sessions-private-skipped.txt`. - Queue format migrated from `<path>` to `<session_id>\t<path>`; legacy one-column lines still parse (empty session_id, still upload, can't be marked private retroactively — fine, they pre-date the feature). - `install_claude_hooks` writes a workspace statusLine unless the user already has a custom one (warn + preserve). Idempotent re-init. - `install_claude_commands` ships `agnes-private.md` alongside `update-agnes-plugins.md`. Per-template fallback so a missing template doesn't get clobbered with the wrong content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix setup-prompt + CLAUDE.md marketplace copy + drop skills step Three issues against the post-PR-#240 / post-PR-#237 state: 1. Setup prompt's marketplace block trailer (both has-stack and empty-stack variants) claimed the SessionStart hook keeps the marketplace clone in sync via `agnes refresh-marketplace --quiet` on every session and that admin grants land automatically — both false since PR #237 (0.47.x) moved the install/update path out of the hook into the `/update-agnes-plugins` slash command. The hook is `--check`-only: detects server-side changes, prompts the user to run the slash command, which does the full reconcile interactively with output visible in the transcript. 2. The empty-stack variant framed composition as "admin grants only", missing the actual three-source served stack: (admin RBAC ∩ /marketplace subscriptions) ∪ system-mandatory plugins (admin-pinned, auto-applied) ∪ Flea market installs (skills/agents bundled, plugins standalone) Updated copy spells out all three sources so analysts know where their stack picks live, and what the SessionStart hook actually does on change detection. 3. CLAUDE.md template's "Agnes Marketplace" section conflated eligibility (`resolve_allowed_plugins` — what's listed) with served stack (`resolve_user_marketplace` — what actually reaches Claude Code). The two are different: a user can be RBAC-eligible for a plugin without having subscribed to it on /marketplace. Rewrote the section to distinguish the eligibility set from the served stack and to describe the `--check`-only hook accurately. Plus: deleted the setup prompt's interactive Skills step (final step before Confirm). The named-opinion question — "do you want me to bulk-copy every skill into ~/.claude/skills/agnes/ or pull on-demand via `agnes skills show <name>`?" — had no obvious right answer for new users at the tail end of a wall of technical steps. On-demand lookup is the one-size-fits-all default; `agnes skills list/show` remain discoverable and the CLAUDE.md template references specific skills inline (e.g. agnes-data-querying in the BigQuery section) where they're relevant. Layout: Confirm shifts from step 9 to step 8. Tests updated, full setup/marketplace/welcome surface green (115 passed). Remaining full-suite failures are pre-existing (BQ/Keboola fixtures, Windows charmap collection error in test_v26_keboola_e2e) — verified against a clean stash, unrelated to this diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix session-queue race + snapshot PID-reuse data loss Two blocker fixes from the PR #242 review: 1. Concurrent SessionStart hooks could corrupt the queue file on Windows. Python's `open(path, "a")` is not atomic there — the CRT does not pass FILE_APPEND_DATA to CreateFile, so concurrent appenders (user opening several Claude Code windows simultaneously) could interleave bytes mid-line. The malformed lines then silently fail the parser and the entries are dropped. Fix: wrap append_to_queue, requeue_failed, and snapshot_queue in a short-lived FileLock on a dedicated `agnes-queue.lock`. Separate from `agnes-push.lock` so capture-session hooks don't block on the push command. New test_append_concurrent_threads_no_corruption reproduces the race with 4 threads x 50 appends. 2. Snapshot filenames embedded only the PID (`agnes-sessions.snapshot. <PID>.txt`). After a crashed push left a snapshot on disk and the OS recycled the PID for a new push, `os.rename` would atomically overwrite the recovery snapshot — every entry in it lost, silently. Fix: append a uuid8 hex tail (`agnes-sessions.snapshot.<PID>. <uuid8>.txt`). find_recovery_snapshots already globs the prefix so it picks up both old and new format. New test_snapshot_filename_is_unique_per_call asserts two consecutive snapshots under the same PID don't collide. Targeted tests green (47/47 in session_queue/capture_session/cli_push). Full suite failures unchanged from baseline (pre-existing BQ/Keboola fixture issues per CLAUDE.md). * Auto-refresh workspace hooks + bash-wrap all hook entries (Windows) Fixes from PR #242 second review (ZdenekSrotyr): 1. `uv.lock` regenerated to include `filelock 3.29.0` (declared in pyproject.toml but missing from the lock file — CI's lockfile-consistency check would fail; `uv pip install` on a clean cache would silently miss the dep). 2. `agnes self-upgrade` now auto-refreshes the workspace Claude Code hooks via the new `cli.lib.hooks.maybe_refresh_claude_hooks`. Closes the silent-stop migration gap: a v0.48 workspace would auto-upgrade the CLI from its existing SessionStart self-upgrade entry but never pick up the new `agnes capture-session` SessionStart hook, leaving the queue empty and `agnes push` uploading nothing. The refresh fires on both the "info is None" fast path (CLI already current — catches the second SessionStart after a prior upgrade) and the install-success path. Guarded by `workspace_has_agnes_hooks` so it never writes `.claude/settings.json` into directories that aren't Agnes workspaces (e.g. `agnes self-upgrade` invoked from `~/`). Errors are surfaced on stderr but never flip the upgrade exit code. 3. All Agnes-managed hooks are now wrapped in `bash -c "..."`. The self-upgrade+pull chained SessionStart entry was the only one still shipping unwrapped — Claude Code on Windows runs hook commands directly without a shell, so the `;` chain + `2>/dev/null` + `\|\| true` shell syntax silently no-op'd on native Windows installs without Git Bash on PATH. Workspaces still on the old form auto-upgrade via the refresh path above. Tests: +12 in test_lib_hooks.py (guard semantics, v0.48→v0.49 migration end-to-end, third-party-hook preservation, bash-wrap invariant). +5 in test_self_upgrade.py (refresh fires on info=None, fires on install success, skipped on failure, skipped on --check-only, refresh failure never flips exit code). 130 targeted tests green. The 2 pre-existing Windows path-separator failures in `test_smoke_test_detects_version_mismatch[uv\|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes` in test asserts, pre-PR baseline). * CHANGELOG: document PR-242 main features Closes ZdenekSrotyr #4: the [Unreleased] block was missing entries for the PR's primary surface — only the post-merge fix bullets and the unrelated setup-prompt copy change were captured. Adds: - ### Added: 6 bullets covering the session capture queue + new `agnes capture-session` subcommand, `/agnes-private` slash + `agnes mark-private`, `agnes statusline` + statusLine wiring, `--legacy-scan` opt-in fallback, single-instance push lock, and the new `filelock` runtime dep. - ### Changed: BREAKING bullet on the SessionStart / SessionEnd hook wire format change (capture-session as first SessionStart entry, push self-heal removed, SessionEnd push detached via nohup, all entries bash-wrapped). Folds the prior standalone bash-wrap bullet into this consolidated entry — Z's review flagged the layout shift as BREAKING, and grouping the related sub-changes makes the migration story readable in one place. - Operator migration is auto-handled by `maybe_refresh_claude_hooks` invoked from `agnes self-upgrade` (separate Changed entry below). No `agnes init` re-run required. Pre-queue session jsonls on upgrading workspaces still need a one-off `agnes push --legacy-scan` — flagged in the BREAKING bullet. No code change; doc only. * Drop permanent 4xx uploads instead of requeueing forever Closes ZdenekSrotyr #5. Previously the push retry path requeued any non-200 response except the literal "file not found on disk", so 401 (token expired), 403 (RBAC denial), 413 (payload too large), 400 (server-side validation) cycled through every push run forever — the queue grew without bound and each run re-bombarded the server with the same deterministically-failing upload. Now 4xx (except 408 Request Timeout + 429 Too Many Requests, which the HTTP spec marks as transient) is dropped and audit-logged to `<workspace>/.claude/agnes-sessions-failed.txt`: <iso_ts>\t<session_id>\t<status>\t<transcript_path> 5xx and network errors continue to requeue — those reflect server / transport state that can change between runs, so retry is the right behavior. The audit log piggybacks on the push single-instance lock (agnes-push.lock) — push is the only writer to this file, same as the existing `mark_uploaded` and `mark_private_skipped` paths, so no separate filelock is needed. `agnes push --json` surfaces a new `dropped_permanent` counter; non- quiet stdout mentions the audit-log path so operators tailing the output have a pointer to the forensic trail. Tests: +7 in test_cli_push.py (401/400/403/413 → drop; 408/429 → requeue; 500/502/503 → requeue; network exception → requeue; --json `dropped_permanent` counter; stdout audit-log pointer). +1 in test_session_queue.py (mark_failed_permanent TSV format). 127/129 targeted tests green. The 2 pre-existing Windows path-separator failures in `test_smoke_test_detects_version_mismatch [uv\|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes` in test asserts, pre-PR baseline). * Catch OSError in push lock acquisition Closes ZdenekSrotyr #8. `acquire_or_skip` in `cli/lib/push_lock.py` previously caught only `filelock.Timeout`. Any `OSError` from `FileLock.acquire` — read-only filesystem, permission denied on `.claude/`, disk full, hardware I/O failure — propagated as an unhandled traceback. Two visible failure modes: - SessionEnd hook: `\|\| true` in the wrapper swallowed the error, so daily pushes silently never ran. Operator had no signal. - Manual `agnes push`: ugly Python traceback dumped to the terminal instead of a clean exit. Now `OSError` is treated the same as `Timeout` — yield `None`, caller returns cleanly with rc=0. The operator's environment in these scenarios has bigger problems than missing session uploads, so we swallow rather than retry-loop or surface a noisy warning. Test: `test_push_silent_exit_when_filelock_raises_oserror` patches the `FileLock` used inside `push_lock` to raise OSError on acquire, verifies push exits 0 with no traceback and the queue is preserved for the next attempt. * Address remaining S2 items from PR-242 review Four items from ZdenekSrotyr's S2 list: S2.10 — `_install_statusline` truthy check (cli/lib/hooks.py): replace `if existing:` with explicit `if existing is None or existing == "":`. Documents and tests the behavior for both edge cases (explicit-null and empty-string `statusLine`) — both treated as "not configured" rather than "explicit user opt-out", so we install ours. Two new tests in test_lib_hooks.py pin the contract. S2.6 — onboarding docs for /agnes-private. New "Private sessions" subsection in `config/claude_md_template.txt` (next to Data Sync) covering the slash command, statusbar indicator, and audit-log location. One-line tip in `app/web/setup_instructions.py` so the feature is discoverable at onboarding. S2.9 — e2e privacy test (tests/test_e2e_privacy.py). Wires capture_session → mark_private → push against a recording fake api_post and asserts zero session uploads for the marked one. Three cases: mark-before-capture (queue write skipped), mark-after-capture (push-side filter catches it + audit-logs), control (unmarked sessions upload normally). David #8 — `--legacy-scan` help text now documents the private-list gap (legacy entries carry empty session_id, so the filter is not consulted). The practical impact is bounded — pre-queue sessions cannot have been marked private since the private list is a queue-era feature — but the disclaimer in the help text means an operator running a backfill is not surprised. 68 targeted tests green (3 new e2e + 2 new truthy edge tests + existing). 2 pre-existing Windows path-separator failures in test_smoke_test_detects_version_mismatch[uv\|pip] unchanged. Remaining S2 items (statusline mkdir push-back, capture-session silent-fail follow-up) handled in PR comment + follow-up issue respectively. * Address remaining S2 follow-ups (David #8, S2.7, David #11) Three items left over from Mina's bbf63472 batch — that commit addressed S2.6/S2.9/S2.10 + documented David #8 in help text but deferred the actual implementations of S2.7, David #11, and the real David #8 fix to follow-ups. This commit closes them. David #8 — `agnes push --legacy-scan` now consults the private list. Claude Code names jsonls `<session-id>.jsonl`, so the file stem IS the session id; the legacy-scan path can apply the same private filter the queue path uses. Both the dry-run and live-upload code paths fixed. Help text updated (no longer warns the filter is bypassed). Two new tests in test_cli_push.py cover the upload-skip path + the dry-run `would_skip_private` segregation. S2.7 — `statusline`/`is_private` no longer mkdir-pollutes arbitrary workdirs. Split `_claude_dir` into `_claude_dir_writable` (used only from `add_private`) and `_claude_dir_readonly` (no mkdir). The read-only public helpers (`private_list_path`, `read_all_private`, `is_private`) compose the no-mkdir variant by default; `add_private` opts in via `writable=True`. Added a process-local mtime-keyed cache around `read_all_private` so in-process callers (push doing one stat per upload candidate, future `agnes diagnose`) don't re-parse the file on every check. Cache eviction on `add_private` so a sub-second write+read sequence doesn't see stale data even on coarse-mtime filesystems. Two new tests pin the no-mkdir contract + the in-same-second add+read consistency. David #11 — `agnes capture-session` writes a breadcrumb log on every invocation. New `<workspace>/.claude/agnes-capture-session.log` TSV: `<iso_ts>\t<outcome>\t<detail>` where outcome covers every silent- exit path (`ok`, `private_skip`, `empty_stdin`, `bad_json`, `not_object`, `no_transcript_path`, `stdin_read_error`, `write_error`). Gives operators a signal to detect "hook fires but queue stays empty" — without it, an upstream Claude Code stdin- contract change is invisible because the hook always exits 0. Log rolls at 256 KiB so it doesn't grow unbounded on long-lived workspaces. Best-effort: a breadcrumb-write failure is itself swallowed so the hook contract stays "exit 0 always". Skipped in non-Agnes workdirs (no `.claude/` exists) so opening Claude Code in `~/` doesn't pollute it. Five new tests in test_capture_session.py cover the success / bad_json / no_transcript_path / private_skip / no-pollute paths. 115 targeted tests green (test_cli_push, test_capture_session, test_private_list, test_session_queue, test_e2e_privacy, test_lib_hooks, test_statusline, test_mark_private). --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-11 13:31:16 +00:00
Vojtech	41829e8a45	Setup-prompt + bootstrap fixes from 2026-05-10 init report (#240 ) * Setup-prompt + bootstrap fixes from David's 2026-05-10 init report Three issues from clean-machine bootstrap evidence: 1. `agnes refresh-marketplace --bootstrap` failed to recover when the local clone existed but Claude Code's marketplace registry had lost the `agnes` entry. Bootstrap path now parses `claude plugin marketplace list`, re-runs `claude plugin marketplace add ~/.agnes/marketplace` when missing, and treats `add` failures as fatal (was warn-and-continue, root cause of the cascade into "Marketplace 'agnes' not found" plugin install errors). 2. Setup prompt now always emits the marketplace-registration block, even when the operator has zero plugin grants. Pre-wires the SessionStart hook so future admin grants land automatically without re-running setup. Block copy adapts: empty list shows "no plugins granted yet", populated list shows "install plugins". 3. Setup prompt registers the Atlassian Remote MCP server unattended (`claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse`). Hosted Remote MCP, OAuth handled automatically by Claude Code on first use. Asana / GWS stay on the /home connector cards (PAT/keychain flows don't fit unattended bootstrap). Confirm step nudges the user toward the /home connector cards for the PAT-flow services. CLAUDE.md template renames the marketplace section to "Agnes Marketplace" and documents that all plugins are addressed as `<plugin>@agnes` regardless of upstream slug. Layout: Confirm shifts from step 6/8 to step 9 across all variants (preflight, marketplace, MCP all unconditional). Tests updated. * Link Claude license options from /home install pane Step-1 Claude install on /home pointed users to OAuth without explaining what to do if they don't have a Pro/Max subscription. Add a one-line follow-up link to the plan-tier section on /setup-advanced (new `#claude-plan` anchor) so first-time users discover the subscription tiers rather than bouncing on the OAuth screen. * Add idempotent + no-TLS-bypass guardrails to /home connector prompts The Asana / Google Workspace / Atlassian connector prompts on /home already shipped a precheck step that short-circuits when the service is already wired, but they didn't carry the same idempotency + surface-errors-verbatim + don't-disable-TLS-verification guardrails the bash bootstrap prompt has. Add a one-paragraph 'Ground rules' block at the top of each prompt so a connector failure doesn't tempt the model into bypass workarounds, matching the same posture David's 2026-05-10 init report flagged for the bash flow. * skip Source: lines in marketplace registry detector `claude plugin marketplace list` prints a `Source: <local path>` line under each registered marketplace; the local clone almost always lives under a path containing the marketplace name itself (`~/.agnes/marketplace`). A naive \\bagnes\\b match over the full stdout therefore false-positives whenever ANY unrelated marketplace sits under `~/.agnes-…/` or similar. Filter Source: lines out before matching so the recovery path actually re-adds when needed instead of silently falling through to a broken `marketplace update agnes`. Adds regression test covering the substring-only case. * drop customer-specific tokens from CHANGELOG entries Per CLAUDE.md vendor-agnostic OSS rule ("nothing customer-specific ... in changelogs"): - "agnes-vrysanek.groupondev.com" -> "a private-CA Agnes deployment" - "Groupon Marketplace / groupon-marketplace" -> "<Org> Marketplace / <org>-marketplace" (placeholder example) - Removed "David flagged" attribution language; init-report context stays intact, just stripped of the named host + brand --------- Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-10 20:24:00 +02:00
minasarustamyan	4fb2818a19	Add /marketplace browse page + Model B opt-in stack composition (#230 ) * Add /marketplace browse page + Model B opt-in stack composition New /marketplace browse surface unifies the curated marketplaces (admin-managed git mirrors) and the community Flea Market behind three tabs — Curated / Flea / My Stack — with per-tab category filter, search across both sources with scope checkboxes, and numeric pagination, all driven by URL query state. Plugin detail at /marketplace/curated/<slug>/<plugin> and /marketplace/flea/<id>; nested skill / agent detail at /marketplace/curated/<slug>/<plugin>/ {skill,agent}/<name> and the flea-side single-page detail. Model B opt-in: an RBAC grant on a curated plugin is now only eligibility. The user must click "Add to my stack" for it to enter their served Claude Code marketplace. Composition flips from (rbac ∖ opt_outs) ∪ store_installs to (rbac ∩ subscriptions) ∪ store_installs. The legacy user_plugin_optouts table is renamed user_curated_subscriptions (schema v27) — same table shape, inverted semantic, repository methods become subscribe / unsubscribe / is_subscribed. UX vocabulary: Install → Add to my stack, Installed → In your stack, card "Installed" badge → "In stack" (amber pill), tab "My Subscriptions" → "My Stack". Bridges the two-step model (server-side bookmark vs. on-laptop install) the previous label hid. Click triggers an inline post-add hint panel under the description with the agnes refresh-marketplace recipe + Copy chip, dismissible per-browser via localStorage. Per-tab info blocks above the filter row: - Curated: trust signal — "Each plugin here has a named curator accountable for it." (blue accent + See-all-curators link) - Flea: open-shelf signal — "Anyone in the company can upload here." (purple accent + Tips-for-sharing link) - My Stack: personal-shelf orientation — "Your AI stack — everything you've added." (slate accent, no link) Tabs carry per-tab Heroicons (shield-check / building-storefront / rectangle-stack) tinted to match each tab's accent; flips white when the tab is active for contrast. Hero illustration anchored to the right of the blue hero panel (absolute, 47% wide, behind the search row content). Hidden under 900px viewport. Action-row CTAs realigned to publication intent: curated "How to add new content" → "Submit a plugin" (links to the guide page); flea button removed since +Upload sits next to it. Empty-state CTAs match. /marketplace/guide/{curated,flea} routes now host publication-flow guide pages with placeholder ledes — full copy to be authored separately. Categories: Heroicons-based icons mapped per category in src/category_icons.py (zero new dependencies; SVG path strings inlined). Marketplace cards, filter pills, and detail pages read from the same source. API endpoints under /api/marketplace: - GET /items per-tab listing (curated / flea / my) - GET /categories per-tab non-zero counts - GET /curated/{slug}/{plugin} plugin detail - POST/DELETE /curated/{slug}/{plugin}/install subscribe toggle - GET /curated/{slug}/{plugin}/{skill,agent}/{name} inner item The tab=my branch reads directly from user_curated_subscriptions ∪ user_store_installs (not resolve_user_marketplace, which bundles flea skills/agents into a single store-bundle synthetic entry useful for serving the Claude Code marketplace ZIP/git but wrong for browsing where each item should appear as its own card). Detail pages: plugin detail surfaces inner skills/agents as clickable nested cards; commands/hooks/MCPs render as plain name lists. Skill/agent detail mirrors the plugin layout with kind-tinted accents (skill = green, agent = purple), Description + Details sidebar, Files + Docs sections, and the "How to call it" copy-able invocation chip showing /<plugin>:<inner-name> exactly as Claude Code namespaces it post-install. Curated nested has no install button — links back to the parent plugin. Navbar: standalone "My AI Stack" relabelled "My Stack" and points at /marketplace?tab=my; "Store" link removed (Store flow is reachable via the Flea Market tab's +Upload button). The standalone /my-ai-stack and /store routes still work for old bookmarks. Tests cover the new browse / categories / install / RBAC paths under tests/test_marketplace_api.py; existing marketplace and store tests updated for Model B (explicit subscribe in fixtures). Schema bumped v26 → v27 with idempotent migration that wipes existing user_plugin_optouts rows on flip and adds marketplace_plugins.created_at with registered_at backfill. * Fix v28 migration + post-rebase test fallout v28 ALTER TABLE marketplace_plugins ADD COLUMN created_at conflicted with _SYSTEM_SCHEMA's earlier CREATE that already includes the column on fresh installs (test fixtures starting at any pre-v28 version trip on it). Switch to ADD COLUMN IF NOT EXISTS — same idiom as the upstream v27 Keboola sync-strategy migration on the same ladder. Two test patches needed after the rebase bumped SCHEMA_VERSION 27 → 28: - test_keboola_v27_migration.py: test_schema_version_constant_is_27 was pinning ==27. Loosened to >=27 (the test's purpose is to verify the v27 Keboola migration, not to pin the current SCHEMA_VERSION). - test_setup_page_unified.py: was monkeypatching resolve_allowed_plugins but compute_default_agent_prompt now reads from resolve_user_marketplace (Model B-aware). Stub the right function so the test exercises the v28 served-set path. * Harden curated skill/agent inner endpoints against path traversal `_read_inner`, the `skill_dir` walk in `curated_skill_detail`, and the `agent_path.stat` in `curated_agent_detail` joined URL path-params onto `plugin_root` without verifying the resolved candidate stayed inside it. Starlette's `[^/]+` on `{skill_name}` / `{agent_name}` blocks the direct URL exploit (encoded `/` 404s before the handler), but a curator-planted symlink inside a curated marketplace's git mirror could still dereference outside the plugin tree on read. Adds `_safe_join(plugin_root, *parts)` doing `Path.resolve(strict=True)` + `relative_to(plugin_root.resolve())`, used by all three call sites so the boundary is enforced once and consistently. Tests cover the helper directly (normal path resolves, escaping `..` returns None, escaping symlink returns None, missing file returns None) plus an end-to-end check that the symlink case actually 404s on the HTTP endpoint. Symlink tests skip on Windows where symlink creation needs elevated permissions; they run on Linux CI. --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>	2026-05-08 14:22:19 +02:00
Minas Arustamyan	50e0463501	feat(marketplace): clone-based plugin setup + auto-refresh SessionStart hook Adds end-to-end flow for installing and keeping the per-user filtered Claude Code marketplace in sync with the user's Agnes stack (admin RBAC grants \ MyAIStack opt-outs U /store installs). Setup (one-liner in install prompt step 5): `agnes refresh-marketplace --bootstrap` clones the per-user marketplace bare repo to ~/.agnes/marketplace, strips PAT from the cloned origin URL, registers the local path with Claude Code, and installs every plugin in the served manifest at --scope project. Replaces a 15-line inline shell sequence that tripped Claude Code's agent-driven `rm -rf` permission gate. Auto-refresh (SessionStart hook installed by `agnes init`): `agnes refresh-marketplace --quiet` runs every Claude Code session, fetches+resets the clone (server rebuilds as orphan commits, so pull --ff-only is impossible), and version-aware reconciles: - missing in workspace -> claude plugin install <name>@agnes --scope project - version differs -> claude plugin update <name>@agnes - matches -> skip Don't auto-uninstall plugins that disappeared from the manifest -- a transient empty manifest from the server would wipe the stack. Hook output: when --quiet AND something actually changed, emits Claude Code hook JSON on stdout -- `systemMessage` (transient toast) and `hookSpecificOutput.additionalContext` (model-side system reminder), both carrying the change summary plus a "/exit + restart Claude Code" instruction (Claude only scans plugins at session start). Windows hook compatibility: the refresh-marketplace hook command is wrapped in `bash -c "..."` because Claude Code on Windows runs hook commands directly without invoking a shell, so `2>/dev/null \|\| true` would otherwise be passed as literal argv tokens. Cross-cutting: - cli/lib/marketplace.py: shared CLONE_DIR + MARKETPLACE_NAME constants. - cli/lib/hooks.py: SessionStart now has two independent entries (pull + refresh-marketplace) so a failure in one doesn't suppress the other; legacy `da sync` and prior single-pull layouts upgrade cleanly on re-init. - PAT injection on every git fetch via per-invocation credential helper (token in \$AGNES_TOKEN env, never in argv or .git/config). - Pre-snapshot of installed plugins captured BEFORE `claude plugin marketplace update` so silent auto-applied version bumps still fire notifications. - scripts/dev/agnes-client-reset.sh: cleans ~/.claude/plugins/marketplaces/agnes, ~/.claude/plugins/cache/agnes, drops uv build cache, documents workspace-scoped residue that can't be enumerated from the script. - app/web/setup_instructions.py: legacy AGNES_DEBUG_AUTH path also uses clone (direct HTTPS marketplace add is broken end-to-end on every Claude Code distribution -- stores response as single file, plugin source paths then 404). 28 new tests (test_cli_refresh_marketplace.py) + extended hook + setup template tests cover bootstrap, fetch+reset ordering, version-aware reconcile, project-path filtering, hook JSON shape, and the bash-c Windows wrapper invariant.	2026-05-07 06:59:13 +02:00
ZdenekSrotyr	2ee529533f	refactor(setup-page): drop role query param The `/setup` route no longer accepts `?role=analyst\|admin`. The route signature drops the `Literal[...] = Query(...)` parameter and the silent admin-downgrade block (`if role == "admin" and not is_admin: role = "analyst"`). The `role` ctx variable threaded into install.html also goes away — Task 6 cleans up the template's role-tile UI and the JS PAT-mint ternary. `?role=` is silently ignored by FastAPI for unknown query params, so existing bookmarks (none in production — the param was added in this PR and never shipped) just degrade to the unified layout. No RedirectResponse shim needed. Tests: drop the entire `tests/test_setup_page_roles.py` file (eight role-branching tests that no longer apply) and add `tests/test_setup_page_unified.py` with three tests: - `test_setup_page_renders_unified_layout` - `test_setup_page_ignores_role_query_param` - `test_setup_page_renders_marketplace_for_user_with_grants` - `test_install_legacy_path_redirects_to_setup` Also replace the role-aware `test_install_preview_*` tests in test_web_ui.py with unified-layout assertions. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 5.	2026-05-04 22:16:59 +02:00

8 commits