# Changelog All notable changes to Agnes AI Data Analyst. Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html), pre-1.0 — public surface (CLI flags, REST endpoints, `instance.yaml` schema, `extract.duckdb` contract) may shift between minor versions; breaking changes called out under **Changed** or **Removed** with the **BREAKING** marker. CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every CI build; semver tags (`v0.X.Y`) are cut at release boundaries and reference the same commit as a `stable-*` tag from the same day. --- ## [Unreleased] ## [0.50.0] — 2026-05-12 ### Added - Skill and agent detail pages (`/marketplace/curated///{skill,agent}/`) now render the same rich curator-authored content as the plugin detail page. New optional per-item fields in `marketplace-metadata.json` under `plugins..skills.` and `plugins..agents.`: `display_name`, `tagline`, `category` (per-item override; falls back to parent plugin's category when absent), `description` (markdown body for "Description" panel), `use_cases[]` ("When to use it" cards), `sample_interaction` (Claude Code-style dark transcript Q&A panel — same Catppuccin Mocha treatment as plugin detail), `when_to_use` (markdown disambiguation block "When to use this", typically referencing alternative skills/agents), and `invocation` (curator-provided literal command string, e.g. `/my-plugin:tool ` or `@my-agent:role` — overrides the computed `:` chip when set, and works correctly for both `/` skill prefix and `@` agent prefix). - Plugin detail page and listing card render rich curator-authored content from `marketplace-metadata.json`. New optional plugin-level fields: `display_name` (overrides the technical plugin id on the hero h1 + listing card name + mac-window titlebar label), `tagline` (1-line value prop replacing the verbose marketplace.json description on cards and the hero subtitle), `description` (multi-paragraph markdown body rendered into the "What it does" panel as sanitized HTML), `use_cases[]` (each entry `{title, description, prompt}` — drives a new "When to use it" 3-column card grid), and `sample_interaction` (`{user, assistant}` — drives a new "Example" Q&A panel with the assistant side rendered as safe markdown). All fields are optional; sections only render when the curator has filled them, so un-enriched plugins look exactly like before. Read on-demand from the working tree (cached by mtime per marketplace), so curator edits land at the next request without waiting for a sync cycle. Server-side markdown render via `markdown-it-py` + `nh3` sanitizer with a description-scoped tag allowlist (no iframes, no images, no inline HTML). See `docs/curated-marketplace-format.md` for the schema reference. ### Changed - Plugin detail hero (`/marketplace/{curated,flea}/.../...`) and skill/agent detail hero (`/marketplace/curated/.../skill|agent/...`, `/marketplace/flea/.../...`) get a redesigned cover area: the 160×160 square is replaced with a macOS-style window frame (3 traffic-light dots + a centered titlebar label showing the entity name — plugin's `manifest_name`, or skill/agent name), and the cover body is constrained to the 715:310 aspect ratio so curator-uploaded covers no longer crop to a square. Window is 380px wide; meta column (h1, tagline, curator, pills/badges) and the absolutely-positioned install/remove actions in the top-right are unchanged. Fallback when no `cover_photo_url` is set is identical to before (translucent gradient + initials — `PL` for plugin, `SK` for skill, `AG` for agent), just inside the window body. - Inner skill/agent cards in the plugin detail's "Internal structure" section also adopt the 715:310 cover aspect ratio (previously fixed 78px tall). No window chrome on inner cards — just the matching proportions, so cover photos read consistently across hero and grid tiles. - `/marketplace?tab=my` (My Stack) gains the same category + type (plugin / skill / agent / all) filter pills the Flea tab has. The items endpoint already supported both filters on `tab=my`; the categories endpoint now also excludes curated subscriptions from category counts when the type filter is set to `skill` or `agent` (curated plugins are always `type=plugin`), so the pill counts stay in sync with what the grid actually shows. Curated browse stays plugin-only and continues to hide the type filter. - **BREAKING** Curated marketplace enrichment file renamed from `.claude-plugin/agnes-metadata.json` to `.claude-plugin/marketplace-metadata.json`. **Curators of upstream marketplace repos must rename the file in their repo** — Agnes no longer reads the old filename (clean cut, no fallback). **Operator-side note:** running instances with the old file already cached under `marketplaces//.claude-plugin/agnes-metadata.json` will see plugin enrichment disappear from the UI until the upstream curator renames + the next nightly sync overwrites the working tree. To force the refresh sooner, hit `POST /api/marketplaces/{id}/sync` (admin) or `POST /api/marketplaces/sync-all` once the rename is upstream. The Python API renames in lockstep: `read_agnes_metadata` → `read_marketplace_metadata`, `AGNES_METADATA_REL` → `MARKETPLACE_METADATA_REL`, `AGNES_METADATA_MAX_BYTES` → `MARKETPLACE_METADATA_MAX_BYTES`. The synth Claude Code marketplace's strip rule (`.agnes/**` + the metadata file) follows the new filename. See `docs/curated-marketplace-format.md`. ### Fixed (PR #251 follow-ups) - **Cache eviction was unbounded under marketplace count growth (review must-fix).** `app/api/marketplace.py::_read_metadata_cached`'s eviction predicate only swept stale entries for the CURRENT marketplace; with N>100 distinct marketplaces each carrying one mtime key, the cap silently failed and memory grew linearly. Replaced with a bounded `OrderedDict` LRU (cap = 256 entries) that drops the oldest insert on overflow regardless of marketplace_id. Cache stress test pinned in `test_marketplace_metadata.py`. - **Curator-controlled markdown could dominate request CPU (review must-fix).** `render_safe` runs on every plugin / inner-detail request via pure-Python `markdown-it-py`. A curator commit of a 1 MiB `description` (under the file-level cap) × QPS = curator-controlled CPU burn. Resolver now enforces a per-field cap of 64 KiB on `description` / `when_to_use` / `sample_interaction.assistant` via `MARKETPLACE_METADATA_FIELD_MAX_BYTES`, with safe UTF-8-boundary truncation + warning log when the cap fires. - **Inner-detail endpoints bypassed the metadata cache (review must-fix).** `_curated_inner_enrichment`, `_curated_inner_cover`, and `curated_detail` (skill/agent grid enrichment) called `read_marketplace_metadata` directly, defeating the mtime cache that the plugin listing already shared. Routed all three through `_read_metadata_cached`. Skill/agent detail hits are now O(1) re-parses per marketplace per mtime instead of O(QPS). - **Truthy-vs-presence trap in plugin/inner enrichment merge (review should-fix).** API-layer enrichment writers used `if resolved.get(k):` which silently dropped any future falsy-but-valid resolver field (`bool featured=False`, `int priority=0`, `str category=''`) — the parent value would inherit through `{**parent, **enrichment}` merge instead. Switched API-layer writers to presence check (`if k in resolved`) so the resolver's contract is the authority on field presence. ### Internal - Vendor-agnostic OSS cleanup: removed operator-specific token references (`/grpn-eng:` / `@grpn-eng:` / `.foundryai/`) from `src/marketplace_metadata.py` docstring, `app/web/templates/marketplace_item_detail.html` JS comment, `docs/curated-marketplace-format.md`, and `tests/test_marketplace_metadata.py` fixtures. Replaced with generic `/my-plugin:tool` / `@my-agent:role` / `.example/` placeholders. - New tests for the must-fixes above (cache stress at >256 entries, per-field byte cap with UTF-8 boundary preservation, truthy-vs-presence resolver contract) plus XSS regression coverage on `render_safe` for `javascript:` autolinks (raw + reference + mixed-case), `data:`, `vbscript:` schemes, and positive-coverage for `http`/`https`/`mailto` allowlist + `noopener noreferrer` rel attribute. ## [0.49.1] — 2026-05-11 ### Added - **`instance.admin_email` operator config knob** (env `AGNES_INSTANCE_ADMIN_EMAIL` > YAML `instance.admin_email` > unset). When set, the `/home` Google Workspace connector tile renders an "Email admin" mailto button so analysts whose operator hasn't pre-provisioned a shared OAuth app can request one without leaving the workspace. Empty default cleanly hides the button. - **Connector setup folded into the main install script (step 8).** New `app/web/connector_prompts.py` is the single source of truth for the Asana / Google Workspace / Atlassian per-tool prompts; `_connectors_block` in `setup_instructions.py` inlines them under per-connector default-yes asks (empty/Enter installs; only "no" skips). Same prompts power the `/home` tile cards via `{{ connector_prompts. }}` so editing one place updates both surfaces. Resolves the "extra paste step" friction surfaced by the 2026-05-09 onboarding test — fresh install becomes one paste end-to-end (Agnes + skills + connectors). Note: see #246 for the planned move of the connector prompt set into the operator-side overlay (so non-Atlassian/Asana/GWS shops aren't bound to this opinion). ### Changed - **`/home` install hero polish** — license-options link contrast against the blue gradient (white + underline; matches lead-paragraph pattern), step reorder so auto-mode (Shift+Tab) becomes step 2 and Agnes install shifts to step 3 (auto-mode must be on BEFORE the ~20-command bash bootstrap so each Bash/edit doesn't need a manual approve click), step-2 simplification (Shift+Tab-only — Claude Code prompts to persist as default; no `~/.claude/settings.json` snippet to maintain). Onboarded users no longer see the auto-mode block. Completion banner reads "Step 1, 2 & 3 done — Claude Code installed, auto-mode set, Agnes ready". - **`/home` onboarding friction fixes from internal usability testing** — improved hero copy clarity, connector tile gating notes (so users understand why some tiles are disabled), Asana / GWS / Atlassian prompt-correctness fixes (Atlassian three-guard structure: length floor → URL normalization → Jira-then-Confluence verify with 401 short-circuit; GWS `127.0.0.1` → `localhost` correction grounded in `strings` analysis of the `gws` binary), step layout clarification, and post-OAuth-session fallback line for users who closed the OAuth window before saving. - **Setup script step layout: connectors becomes step 8, Confirm shifts to step 9.** Skills step deleted in #242 (on-demand `agnes skills show ` is the default; bulk-copying skills was an opinion question). Layout now: install (1), init (2), catalog (3), preflight (4), marketplace (5), mcp_servers (6), diagnose (7), connectors (8), confirm (9). ### Removed - **BREAKING: `/corporate-memory` page + dashboard widget + nav link restricted to admins.** The `/corporate-memory` route now requires `require_admin` (was `get_current_user`); non-admin users hitting it see 403 (was 200). The Memory link in the top nav and the corporate-memory widget on `/dashboard` are hidden via `{% if session.user.is_admin %}` guards. **Asymmetry:** the underlying `/api/memory/*` endpoints stay on `get_current_user` so CLI / agent flows that POST a knowledge item or fetch `/api/memory` keep working; the gating is web-UI-only. Operators who relied on non-admin web access need to either grant Admin to those users or use the API. ## [0.49.0] — 2026-05-11 ### Fixed (PR #242 follow-ups) - **`/agnes-private` legacy-scan gap closed (David #8 from PR review).** `agnes push --legacy-scan` now consults the private list using the jsonl file stem as the session id (Claude Code names them `.jsonl`). Previously legacy-scan entries carried an empty session_id, so `--legacy-scan` would upload every transcript on disk regardless of whether the user later marked it private. - **`statusline`/`is_private` no longer mkdir-pollutes arbitrary workdirs (S2.7 from PR review).** Read paths now use a side-effect- free helper that returns the `.claude/` path WITHOUT creating it; only `add_private` materializes the dir. Adds a process-local mtime-keyed cache around `read_all_private` so in-process callers (push doing one stat per upload candidate, `agnes diagnose` scanning workspaces) don't re-parse the file every time. - **`agnes capture-session` writes an operability breadcrumb log (David #11 from PR review).** Every invocation appends one TSV line to `/.claude/agnes-capture-session.log` with the outcome (`ok`, `private_skip`, `bad_json`, `no_transcript_path`, …). Gives operators a signal to detect "hook fires but queue stays empty" — without it, an upstream Claude Code stdin-contract change is invisible because the hook always exits 0. Log rolls at 256 KiB. Best-effort: a breadcrumb-write failure is swallowed so the hook contract stays "exit 0 always". Skipped in non-Agnes workdirs (no `.claude/`) so opening Claude Code in `~/` doesn't pollute it. ### Added - **Session capture queue + new `agnes capture-session` SessionStart subcommand.** Replaces the previous encoding-based scan of `~/.claude/projects/` for session jsonls (which depended on Claude Code's cwd-to-folder encoding — a moving target across versions). The hook reads Claude Code's documented stdin JSON (`transcript_path`) and appends `\t` to `/.claude/agnes-sessions.txt`. `agnes push` then atomically renames that queue to a snapshot, processes it, and re-queues failed uploads. Recovery snapshots from a crashed push are picked up on the next run. Concurrent SessionStart hooks (multiple Claude Code windows opening at once) are serialized by a short-lived `agnes-queue.lock` so the queue is race-free on every OS. - **`/agnes-private` slash command + `agnes mark-private` subcommand.** Mark the current Claude Code session as private — its transcript is skipped by `agnes push` and audit-logged to `/.claude/agnes-sessions-private-skipped.txt` instead. The slash command runs deterministically via `!`-prefix bash (no AI in the loop). State lives in `/.claude/agnes-sessions-private.txt` (one session_id per line) and is the authoritative source — both `capture-session` and `push` consult it, so the slash-command-before-capture and capture-before-slash-command races both resolve safely without an ordering dependency. Requires the `CLAUDE_CODE_SESSION_ID` environment variable that Claude Code sets in every bash subprocess it spawns; `agnes mark-private` exits 1 if missing (defends against accidental invocations from a regular terminal). - **`agnes statusline` subcommand + statusLine wiring.** Renders `🔒 agnes-private` in the Claude Code status bar when the current session is marked private; empty string otherwise. `agnes init` wires it to Claude Code's `statusLine` setting. Polite to existing customizations — if the workspace `settings.json` already has a `statusLine`, the install preserves it untouched and emits a one-line stderr warning instructing the operator how to compose `agnes statusline` into their own command. - **`agnes push --legacy-scan` opt-in fallback** scans `~/.claude/projects/` via the pre-queue encoding-based path. Use for one-off backfill of session jsonls that pre-date the queue mechanism on workspaces upgrading from < v0.49. Note: legacy-scanned entries have empty `session_id`, so the `/agnes-private` list filter never matches — backfill uploads bypass the private list. Document this gap before running a backfill on a workspace that has previously-marked-private sessions in the encoding-based location. - **Single-instance lock for `agnes push`** (cross-platform via `filelock`: `fcntl.flock` on POSIX, `msvcrt.locking` on Windows). When the user closes several Claude Code sessions simultaneously, every SessionEnd hook fires its own `agnes push` — exactly one acquires `/.claude/agnes-push.lock` and runs, the rest silent-exit. Prevents concurrent uploads from each other's queues and matches the existing `bash -c "( nohup ... & ) ; true"` SessionEnd wrapping (push must survive Claude Code's ~1s SIGTERM in `-p` headless mode). - **New `filelock>=3.13,<4` runtime dependency.** Backs both the push single-instance lock and the queue-write serialization above. ### Changed - **BREAKING: SessionStart / SessionEnd hook wire format.** `agnes init` (and the new `agnes self-upgrade` auto-refresh path) write a different hook layout than v0.48: - SessionStart gains `agnes capture-session` as the very first entry — feeds the new session-capture queue that powers `agnes push`. Must run before any other SessionStart hook so the `transcript_path` is captured even if a later hook fails. - SessionStart's previous `agnes push` self-heal entry is removed — the queue persists across runs so orphan jsonls from headless / crashed sessions ship out on the next SessionEnd push naturally. Workspaces upgrading from < v0.49 with sessions that pre-date the queue mechanism need a one-off `agnes push --legacy-scan` to backfill them; see `--legacy-scan` entry above. - SessionEnd `agnes push` is wrapped in a `nohup` subshell so the upload survives Claude Code's `-p` headless SIGTERM (~1s after hook fires) and completes the full upload cycle. The synchronous form would lose 5-30s of uploads to the kill. - All entries are wrapped in `bash -c "..."` for Windows compatibility — Claude Code on Windows runs hook commands directly without a shell, so any `;` chain / `2>/dev/null` redirection / `|| true` short-circuit silently no-op'd previously. Existing workspaces auto-migrate to the new layout on the next session-start via `maybe_refresh_claude_hooks` invoked from `agnes self-upgrade` (see separate Changed entry). No operator action required. - **`agnes self-upgrade` now auto-refreshes the workspace Claude Code hooks** so an existing Agnes workspace picks up the new SessionStart / SessionEnd layout the moment its CLI is upgraded — no need to re-run `agnes init` after a release. Without this, an existing v0.48 workspace would auto-upgrade the CLI via its own SessionStart self-upgrade entry, but the new `agnes capture-session` hook (added in this release) would never get installed, the queue would stay empty, and `agnes push` would silently stop uploading sessions. The refresh fires on both the "info is None" fast path (CLI already current — handles the second SessionStart after a prior upgrade) and after a successful install. Guarded by `cli.lib.hooks.workspace_has_agnes_hooks` so it never writes `.claude/settings.json` into directories that aren't Agnes workspaces (e.g. `agnes self-upgrade` from `~/`). Failures are best-effort — they're surfaced on stderr but never flip the upgrade exit code. ### Added - **Onboarding docs for the `/agnes-private` privacy feature.** `config/claude_md_template.txt` gains a short "Private sessions" subsection (next to "Data Sync") covering the slash command, statusbar indicator, and audit-log location. The web-served setup prompt (`app/web/setup_instructions.py`) gets a one-line mention so analysts learn the feature exists at onboarding instead of by accident. ### Changed - **`_install_statusline` distinguishes explicit `null` / empty-string `statusLine` from absent key.** Previously the `if existing:` truthy check silently took the same path for all three cases. The new `existing is None or existing == ""` branch documents and tests the behavior (install ours — treated as "not configured" rather than "explicit user opt-out"). Two new tests pin both edge cases. ### Fixed - **`agnes push --legacy-scan` help text documents the private-list gap.** Legacy-scan entries carry an empty `session_id`, so the `/agnes-private` filter is not consulted. The practical impact is bounded — pre-queue sessions cannot have been marked private (the private list is a queue-era feature) — but the help text now spells out the gap so an operator running a backfill is not surprised. - **`agnes push` no longer crashes on filesystem errors when acquiring the single-instance lock.** `acquire_or_skip` in `cli/lib/push_lock.py` now treats `OSError` (read-only filesystem, permission denied on `.claude/`, disk full, hardware I/O failure) the same as `filelock.Timeout` — yields `None`, push exits cleanly. Previously the `OSError` propagated as an unhandled traceback; invisible in the SessionEnd hook context (the `|| true` wrapper swallowed it), but ugly in a manual `agnes push` invocation. - **`agnes push` no longer infinite-loops on permanent 4xx failures.** Previously any non-200 response except the literal `file not found on disk` was re-queued, so 401 (token expired), 403 (RBAC denial), 413 (payload too large), 400 (server-side validation error) cycled through every push run forever — the queue grew without bound and each run re-bombarded the server with the same failing upload. 4xx (except 408 Request Timeout + 429 Too Many Requests, which the HTTP spec marks as transient) is now dropped + audit-logged to `/.claude/agnes-sessions-failed.txt` instead (TSV: `\t\t\t`). 5xx and network errors continue to re-queue (genuinely transient — server or transport state can change between runs). `agnes push --json` surfaces a new `dropped_permanent` counter; non-quiet stdout mentions the audit-log path so operators tailing the output have a pointer to the forensic trail. - **Session capture queue: concurrent SessionStart hooks no longer corrupt the queue file on Windows.** `append_to_queue`, `requeue_failed`, and `snapshot_queue` in `cli/lib/session_queue.py` now hold a short-lived `agnes-queue.lock` (filelock) while writing. Previously the code assumed Python's `open(path, "a")` is atomic on NTFS for small writes; it isn't — the Windows CRT does not pass `FILE_APPEND_DATA` to `CreateFile`, so concurrent appenders (e.g. user opens several Claude Code windows simultaneously) could interleave bytes mid-line and the parser would silently drop the malformed entries. The lock is separate from `agnes-push.lock` — capture-session hooks don't block on the push command. - **Session capture queue: snapshot filenames now include a uuid8 tail so a recycled OS PID cannot silently overwrite a recovery snapshot left behind by a crashed push.** `snapshot_queue` previously named files `agnes-sessions.snapshot..txt`; after a crash + PID reuse (Linux default `kernel.pid_max=32768`), `os.rename` atomically replaces the recovery file with the new snapshot, losing every entry in it. New format: `agnes-sessions.snapshot...txt`; `find_recovery_snapshots` already uses a glob so the change is backward-compatible with snapshots written by older CLI versions. ### Changed - **Setup prompt + CLAUDE.md template: marketplace copy now reflects the actual three-source served stack composition + `--check`-only SessionStart hook.** Previous text (shipped in 0.48.0 / PR #240) said the SessionStart hook keeps the marketplace clone in sync via `agnes refresh-marketplace --quiet` on every session, and that admin grants land automatically without re-running setup — both false since PR #237 (0.47.x) moved the install/update path out of the hook into the `/update-agnes-plugins` slash command. The hook is `--check`-only: it detects server-side changes and prompts the user to run the slash command, which does the full reconcile interactively with output visible in the transcript. Updated copy spells out the real composition of the served stack — `(admin RBAC ∩ /marketplace subscriptions) ∪ system-mandatory plugins ∪ Flea market installs` — rather than the admin-grants-only framing the previous copy implied. Affects: `app/web/setup_instructions.py:_marketplace_block` (both trailer variants) and `config/claude_md_template.txt` (Agnes Marketplace section). ### Removed - **Setup prompt's interactive Skills step deleted.** The final step before Confirm used to ask the user verbatim whether to bulk-copy every `agnes skills` markdown file into `~/.claude/skills/agnes/` or pull them on-demand via `agnes skills show `. The named-opinion question with no obvious right answer was confusing for new users at the tail end of a wall of technical steps. On-demand lookup via `agnes skills show ` is the one-size-fits-all default — the CLI knowledge base remains discoverable through `agnes skills list` and the CLAUDE.md template references specific skills (e.g. `agnes-data-querying`) inline where they're relevant. Layout: Confirm shifts from step 9 to step 8 across all variants. ## [0.48.0] — 2026-05-10 ### Fixed - **`agnes refresh-marketplace --bootstrap` now recovers when the local marketplace clone exists but Claude Code's registry has lost the `agnes` entry** (fresh Claude Code install on the same machine, manual `claude plugin marketplace remove agnes`, or an earlier interrupted bootstrap). The previous behaviour skipped `_bootstrap_clone` whenever `~/.agnes/marketplace/.git` existed and fell straight through to `claude plugin marketplace update agnes`, which failed with `Marketplace 'agnes' not found. Available marketplaces: claude-plugins-official` and cascaded into per-plugin install errors. The bootstrap path now parses `claude plugin marketplace list`, calls `claude plugin marketplace add ~/.agnes/marketplace` when `agnes` isn't registered, and only then proceeds with fetch + reset + reconcile. Idempotent: a second bootstrap run with `agnes` already registered is a no-op. In the same path, `claude plugin marketplace add` failures are now fatal instead of `warn:`-and-continue. The previous warn-and-continue was the root cause of the cascade above — the operator never saw the real error from `add`, only the downstream "Marketplace not found" symptoms. Source: 2026-05-10 init report from a clean-machine bootstrap against a private-CA Agnes deployment. ### Added - **Setup prompt always registers the `agnes` Claude Code marketplace**, even when the operator has zero plugin grants. Registering the per-user marketplace clone pre-wires the SessionStart hook so future admin grants land automatically on the next Claude Code session without re-running setup. The marketplace block's copy adapts: empty plugin list shows "no plugins granted yet", populated list shows "install plugins". Steps 4 (preflight) + 5 (marketplace) are now always emitted; Confirm shifts from step 6 to step 9 across the full layout. - **Setup prompt registers the Atlassian Remote MCP server unattended** via `claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse` (Fix C in the 2026-05-10 init-report response). Hosted Remote MCP, so Claude Code handles OAuth automatically the first time the operator asks it to read a Jira ticket or Confluence page — no PAT/keychain dance. Idempotent across re-runs (`|| true` swallows the "server already exists" exit). Asana and Google Workspace stay on the /home connector cards because their PAT/CLI flows don't fit an unattended bootstrap. - **Setup prompt's Confirm step nudges the user toward connector cards on /home** for Asana / Google Workspace / Atlassian PAT flows that the bash script can't automate. Surfaces the cards so analysts don't finish bootstrap thinking they're fully wired. - **System plugin tier (schema v39).** Admins can now mark a curated marketplace plugin as a system plugin via a new toggle in the Details modal on `/admin/marketplaces`. Marking materializes a `resource_grants` row for every existing user_group and a `user_plugin_optouts` (subscription) row for every existing user, so the plugin lands in every user's stack from day one. Hooks on user-create (Google OAuth, email magic-link, admin-create, scheduler token) and group-create (admin POST + Google Workspace sync ensure) fan out the same materialization to new principals. The resolver itself is unchanged — system semantics emerge from the materialized rows. UI locks the corresponding controls: `/admin/access` checkbox is checked + disabled with a SYSTEM pill; `/marketplace` browse cards show a "Required" badge and the detail-page install button reads "✓ Required by your org"; `/my-ai-stack` toggle is disabled with a System pill. Backend guards return 409 on the bypass paths (`DELETE /api/admin/grants` for system grants, `PUT /api/my-stack/curated/.../{enabled:false}`, `DELETE /api/marketplace/curated/.../install`). Unmark flips the flag only — materialized rows persist so admins curate cleanup at their leisure via the now-unlocked `/admin/access` checkboxes. Endpoints: `POST` / `DELETE /api/marketplaces/{id}/plugins/{name}/system`. - **`/update-agnes-plugins` slash command** — installed automatically by `agnes init` into `/.claude/commands/`. Runs `agnes refresh-marketplace` (the chatty default mode) so the user sees install/update progress streamed into the Claude Code transcript and can react to errors interactively, instead of having a full reconcile happen silently behind a SessionStart hook. - **`agnes refresh-marketplace --check`** — lightweight detector mode for the SessionStart hook. Runs `git fetch` only, compares local `HEAD` with remote `FETCH_HEAD`, and emits a Claude Code hook JSON message pointing the user at `/update-agnes-plugins` when there are remote changes. Silent when up to date. No `git reset`, no `claude plugin marketplace update`, no plugin install/update side effects. - **Flea-market entity edit feature with version history (schema v38).** Owner + admin can now edit a store entity from a real Edit page at `/marketplace/flea/{id}/edit` (replaces the prior "coming soon" placeholder). Editable fields: display name, description, category, video URL, cover photo, and an optional new bundle. Type is locked (400 `type_locked` on change attempt). Display-name change renames the on-disk slug for both the live `plugin/` dir and the version dir, mirroring the rename-on-archive flow. Each bundle update creates a new version: bytes bake into `${DATA_DIR}/store//versions/v/plugin/`, run the standard guardrails pipeline. **Deferred promotion:** the live `plugin/` dir and `entity.version_no` stay at the prior approved version through the LLM review window, so existing installers keep receiving the previously approved bundle while the new version is being validated. Promotion (live swap + version_no/version/file_size bump) happens only on LLM approval; if the new version is blocked, installers continue serving the prior approved version indefinitely. The entity row carries `version_no` (current served index) and `version_history` JSON (append-only per-version metadata: hash, sha256, size, submission_id, created_at, created_by). Existing entities backfill to v1 with a single-entry history seeded from the row's current `version` hash. **Block-while-pending:** an in-flight LLM review blocks any further edit with 409 `prior_version_pending`. Owner waits ~5-30s; the detail page Edit button renders disabled in the same window. **Rollback:** new endpoint `POST /api/store/entities/{id}/versions/{n}/restore` (owner + admin) copies a prior version's bundle forward as v and re-runs guardrails. Forward-only history — the original row keeps its verdict; the new copy gets a fresh one. Detail page renders a Versions card with restore buttons for owner/admin only. **Admin queue** gains a `v#` column (with "current" badge) and a separate Hash column. Submission detail page surfaces Version + Bundle hash rows. Activity timeline splits into per-submission + entity-wide cards so admins can tell version-scoped events apart from entity-wide ones; entity-wide rows render `vN` chips when the audit row's params reference a version. ### Changed - **CLAUDE.md template renames the marketplace section to "Agnes Marketplace — plugins available to you"** and clarifies that Claude Code addresses every plugin as `@agnes` regardless of upstream marketplace slug — the per-user aggregated marketplace name is always `agnes`. Resolves the naming-drift confusion flagged in the 2026-05-10 init report (CLAUDE.md previously rendered upstream marketplace registry names like ` Marketplace` / `-marketplace` without explaining the typed name is always `agnes`). Upstream marketplace names still render as nested bullets so admins see what's been folded in. - **SessionStart marketplace hook is now read-only.** The hook installed by `agnes init` was previously `agnes refresh-marketplace --quiet`, which performed a full fetch+reset+install cycle on every session start (slow, invisible to the user, not interactively recoverable). It now runs `agnes refresh-marketplace --check` — detect-only — and surfaces a hint to run `/update-agnes-plugins` when updates are available. Existing workspaces auto-upgrade on next `agnes init` (the substring marker `agnes refresh-marketplace` matches both the old and new entry shapes, so the idempotent-replace path correctly rewrites them). - **Marketplace "Added to your stack" hint points at `/update-agnes-plugins`.** The post-install green panel on plugin and skill/agent detail pages used to suggest `agnes refresh-marketplace` in a shell prompt and reference the SessionStart auto-install. With the hook now being detect-only, that text was outdated. The hint is condensed to a single instruction — open a new Claude Code session and run `/update-agnes-plugins` — with the slash command in a copy chip. Affects `marketplace_plugin_detail.html` and `marketplace_item_detail.html`. - **Plugin / skill / agent detail page install button split into two elements when in stack.** The single button that morphed between `+ Add to my stack` and `✓ In your stack` did not communicate the uninstall affordance — clicking the green "In your stack" button silently removed the plugin with no visible signal that the click meant "remove". The installed state now renders an inline white status label `✓ In your stack` *before* a separate red-bordered `✕ Remove from stack` button on the same row. Both buttons share the install button's exact height to avoid layout shift on toggle. System plugins still render the locked amber pill `✓ Required by your org` with no Remove button (API refuses uninstall with 409). The post-action hint panel now also fires on remove with the title flipped to `✓ Removed from your stack` — Claude Code needs the same `/update-agnes-plugins` refresh either way. - **`/admin/marketplaces` Details modal "Mark as system" toggle redesigned.** The toggle button was previously near-invisible — same border + neutral-gray text as surrounding row metadata. It now renders as a balanced amber-toned chip with a shield icon: outlined white when the plugin is off-system (calls attention without shouting), tinted amber-50 when on-system (reads as "currently active, click to revert"). The native `confirm()` dialog is replaced with a structured modal that summarizes the fanout consequences (RBAC grants for every group, subscriptions for every user, locked in user-facing UI, new principals inherit it). ### Removed - **BREAKING: `/store` and `/my-ai-stack` page routes deleted.** Both surfaces are fully replaced by `/marketplace?tab=flea` and `/marketplace?tab=my` respectively, which already render the same data via the unified marketplace tabs. Hard delete with no redirects — stale bookmarks 404. The upload wizard at `/store/new`, the flea detail/edit at `/marketplace/flea/{id}[/edit]`, the admin queue at `/admin/store/submissions`, and all `/api/store/*` + `/api/my-stack` endpoints stay untouched. The `agnes my-stack` CLI subcommand and `agnes store` are unaffected. Internal hard-coded hrefs (advanced setup page, store upload-wizard Cancel button, admin marketplaces modal copy, navbar active-state guard) repointed to the new tab URLs. - **BREAKING: `agnes refresh-marketplace --quiet` flag.** Replaced by `--check` (detect-only) and the new `/update-agnes-plugins` slash command (interactive update). Existing SessionStart hooks calling `--quiet` will silent-noop after the CLI upgrade — the hook's `2>/dev/null || true` swallows the unknown-flag error — until the user re-runs `agnes init`, which rewrites the hook to use `--check` and installs the slash command. Dashboard `/setup` flow re-runs `agnes init` automatically on next paste. - **BREAKING: legacy `git config --global http..sslVerify=false` downgrade in the install setup prompt.** The marketplace step (step 5) used to emit this line on `AGNES_DEBUG_AUTH=1` instances when no `ca_pem` was readable from `AGNES_TLS_FULLCHAIN_PATH` (default `/data/state/certs/fullchain.pem`). It tripped Claude Code auto-mode classifiers ("do not disable TLS verification" rule) and silently masked operator misconfigurations — a debug-auth instance without a fullchain on disk would fall through to a TLS-disabled clone instead of surfacing the missing cert. With this change there is exactly one trust-bootstrap path: the cross-platform step 0 trust block (gated on `_read_agnes_ca_pem` returning a PEM). Operators serving a self-signed or private-CA cert MUST place the fullchain at the configured path so step 0 picks it up; publicly-trusted certs need no trust block at all. The `self_signed_tls` parameter on `app.web.setup_instructions.resolve_lines` and `render_setup_instructions` is also dropped (was only consumed by the deleted block). ### Fixed - **`v34→v35` migration is now idempotent under partial-rebuild recovery.** The original list-form `_V34_TO_V35_MIGRATIONS` ran four ALTER statements in sequence: `ADD _vis_v35` → `UPDATE _vis_v35 = visibility_status` → `DROP visibility_status` → `RENAME _vis_v35 TO visibility_status`. If the RENAME failed for any reason after the DROP succeeded (DuckDB lock contention at startup, scheduler-vs-app race opening `system.duckdb`, container kill mid-migration, …), the DB was stranded with `_vis_v35` populated and `visibility_status` missing — and `schema_version` never bumped because the UPDATE at the bottom of the migration ladder only runs when *every* step succeeds. Subsequent restarts then hit `DROP visibility_status` again with no `IF EXISTS` guard and looped on the same error; the only recovery was hand-editing the DB. The migration is rewritten as a Python function `_v34_to_v35_migrate` that inspects the table's columns up front and dispatches into one of three paths: clean v34 (run the full rebuild), partial v35 with `_vis_v35` only (finish the RENAME alone), or both columns present (drop the temp). The audit columns (`archived_at`, `archived_by`) ship first behind `IF NOT EXISTS` so they're safe in all states. Operators stranded by the original bug recover automatically on next startup. Tests cover the three direct paths plus an end-to-end scenario where `_ensure_schema` walks a `schema_version=32` DB with the half-applied state up through to v36. ### Security - **Prompt-injection hardening for store guardrails LLM review (#1).** `SYSTEM_PROMPT` is now passed via the Anthropic SDK's dedicated `system=` parameter instead of being concatenated into the user message. Bundle file contents are wrapped in `...` sentinels that the system prompt declares data-only; literal sentinel strings appearing in user content are escaped (`<_bundle_>`) so an adversarial README can't forge a closing tag and inject instructions. The system prompt explicitly tells the reviewer to flag injection attempts inside `` rather than follow them. See `tests/test_store_guardrails_prompt_injection.py` for the corpus. - **Static security scan documented as signal, not gate (#6 partial).** Module docstring + admin-queue copy + `docs/STORE_GUARDRAILS.md` call out that substring matches are suggestive only — the LLM verdict carries the safety determination. Documentation files (`.md`, `.txt`, `.rst`, `.html`, `.json`, `.yaml`, `.yml`, `.toml`) now skip static scan to avoid false positives on prose that legitimately discusses `eval`/`exec`. AST-mode for Python source is tracked as a follow-up. ### Added - **Stuck-review reaper (schema v35 + new endpoint).** `POST /api/admin/run-reap-stuck-reviews` flips submissions stuck at `status='pending_llm'` past the configured grace (`guardrails.stuck_review_grace_seconds`, default 1800s) to `review_error`. Scheduler invokes every 15 min. Without this a worker crash between status flip and verdict write left rows pending forever. Set the knob to 0 to disable. - **PUT /api/store/entities/{id} atomic rename (#2).** Bundle updates now bake into a sibling `plugin.staging-/` dir, run inline checks against the staging copy, then atomic- rename onto the live path on success. Failed checks leave the live tree byte-for-byte intact. Pre-fix the bake wrote into the live path BEFORE checks ran; concurrent GETs could see partial / unverified content. - **Schema v35 → v36** re-applies `NOT NULL` + `DEFAULT 'pending'` on `store_entities.visibility_status` (lost in the v34→v35 column rebuild). Value-list invariant remains application-side enforced via the repo whitelist (DuckDB `ADD CHECK` on existing columns is not supported). ### Changed - **BG-task verdict-vs-archive race fixed (#3).** `StoreEntitiesRepository.set_visibility_if_pending` flips visibility only when the row is still in the review window (`pending` / `hidden`). When an admin archives an entity while the LLM review is in flight, the BG verdict no longer clobbers the archive — admin's decision wins. Skipped flips emit a `store.submission.bg_verdict_skipped` audit row so admins can see why an "approved" verdict didn't publish. - **Quota counter widened to all reject states (#9).** `count_blocked_for_submitter_since` now counts `blocked_inline`, `blocked_llm`, AND `review_error` against the per-submitter daily cap. Pre-fix a bot triggering only LLM-blocked verdicts was unbounded. - **Un-archive clears archive metadata (#11).** `set_visibility` nulls `archived_at` + `archived_by` when transitioning OUT of `'archived'` so a future read doesn't show stale archive forensics on an approved row. - **Missing `risk_level` surfaces as `review_error` (#10).** An LLM response that omits or empties `risk_level` no longer defaults to `medium` (which looked like a model decision and silently blocked); it persists as `review_error` with `error='missing_risk_level'` so the admin gets a real Retry button. - **Sort-key whitelist for admin queue (#23).** `/api/admin/store/submissions?sort=…` rejects unknown keys with HTTP 400 `invalid_sort_key`. Pre-fix a substring-replace chain could drop column references silently when one column name was a substring of another. - **FSM doc comment in `_SYSTEM_SCHEMA` corrected (#12).** Explicit insert/transition/lifecycle sections describe the actual status machine instead of the misleading `pending → pending_llm → ...` chain. `pending_inline` clarified as reserved-but-unused. - **Soft delete (Archive) for store entities (schema v35).** `DELETE /api/store/entities/{id}` is now soft by default — flips `visibility_status='archived'` + stamps `archived_at` / `archived_by`. Bundle stays on disk, existing `user_store_installs` continue serving the bundle through `marketplace.zip` / `.git` so already-installed users don't lose the plugin. Browse listings hide archived entries from everyone (including the owner — admins triage). New installs refused. My AI Stack still shows installed-but-archived entries with a subtle *"Archived by owner"* badge. **Hard delete** moves to `DELETE /api/store/entities/{id}?hard=true` — admin-only. Drops the bundle bytes + cascades to remove `user_store_installs` (existing users lose the plugin on next sync). Use only for legal / privacy removals where the bytes have to go. Detail-page UX: owner of an approved entity sees an **Archive** button. Admin sees both **Archive** and a separate red **Hard delete (admin)** button with an install-count warning in the confirm dialog. Quarantined (pending / blocked) entities lock both buttons for the owner — admin still sees both. **Visibility-leak gates (similar audit):** `/api/store/owners` + `/api/marketplace/categories?tab=flea` now filter to `visibility_status='approved'` for non-admin callers (admin sees all). Without this, owner identity + per-category counts of quarantined or archived entries leaked through the public dropdown / filter chips. ### Changed - **Rename-on-archive frees the name for re-upload.** Archiving an entity now appends `__archived__` to `store_entities.name` in the same UPDATE that flips `visibility_status='archived'`. The on-disk skill / agent / plugin subdir is renamed in lockstep (`skills//` → `skills//`) and SKILL.md / agent.md / plugin.json frontmatter `name` is rewritten so consumers' Claude Code resolves the new slug after their next sync. The `(owner_user_id, name)` UNIQUE slot AND the global `-by-` invocation slot free up, so the same owner can re-upload under the original name without picking a new one. Admin un-archive (set_visibility from 'archived' to 'approved') strips the suffix; if the original slot is taken by a re-upload, the un-archived row gets `-restored-N`. Display layer (admin queue, my-stack, marketplace cards / detail) strips the suffix so users see the original label with an "Archived" badge instead of the marker. Trade-off: existing installers see the plugin renamed on next pull and need to re-add (one-tap recovery via the My AI Stack card; same data, new slug). `audit_log.params['original_name']` preserves forensic traceability. - **Admin submissions queue: Archived chip filters live entity visibility via LEFT JOIN, not denormalized submission status.** Verdict (`store_submissions.status`) is immutable forensic record; lifecycle (`store_entities.visibility_status`) is the live source of truth. Any code path that flips visibility now surfaces in the queue immediately — no denormalization to drift. *Deleted* chip still filters `entity_id IS NULL AND status='deleted'` (entity row is gone after hard delete; explicit marker required). The submission detail page renders Status (verdict) and Entity lifecycle side by side. Closes the bug where archiving an entity outside the soft-delete API didn't surface under `?status=archived`. - **Consolidated `/store/{id}` into `/marketplace/flea/{id}`.** The legacy detail surface is gone; the unified marketplace detail page is the canonical home for every flea entity. Three in-tree callers (upload-success redirect, My AI Stack card href, /store browse card href) now point straight at the new URL — no redirect hop. Stale external `/store/{id}` bookmarks 404. The marketplace detail templates (`marketplace_plugin_detail.html` + `marketplace_item_detail.html`) gained the **quarantine banner** (extracted into a shared `_quarantine_banner.html` partial), an **owner-actions strip** (Edit "coming soon" + Delete with locked variants), and the **install-button gating** (gray inert when non-approved). The marketplace listing now surfaces a small **"Under review" / "Quarantined"** corner badge on the submitter's own non-approved cards (only visible to them; everyone else still sees only approved entries). ### Added - **Visibility gate on `/marketplace/flea/{id}` + `/api/marketplace/flea/{id}/detail`.** Non-owner non-admin gets 404 (not 403, no leak) on any non-approved entity — closes the bypass where guessing an entity_id pulled the bundle metadata through the marketplace JSON feed even though the entity was excluded from the public listing. - **`StoreEntitiesRepository.list(include_owner_id=…)`.** When set, the WHERE expands to `(visibility_status IN (...) OR owner_user_id = :uid)` so the caller's own non-approved entries surface alongside everyone's approved ones. Used by `/api/store/entities` and `/api/marketplace/items?tab=flea`. ### Removed - **`/store/{id}` route + `store_detail.html` template.** Replaced by the consolidated marketplace detail surface above. ### Removed - **`store_submissions.retry_count` column (schema v34).** Counter mixed two unrelated things (LLM error count + admin rescan count), was asymmetric (Retry LLM didn't bump but Rescan did), and is fully redundant with the audit_log activity timeline now rendered on the detail page — every rescan / retry / review_error is a row there with timestamp + actor. Removed from schema, repo signatures, admin endpoints, and the detail-page metadata. ### Internal - Migrate `src/marketplace_asset_mirror.py` from `urllib.request` to `httpx` (PR #234 review #16). The asset mirror was the only HTTP call site in Agnes still using `urllib.request`; every other module (CLI, Jira / OpenMetadata / OpenAI connectors, scheduler, Telegram bot) already used `httpx`. Following the existing convention has three concrete benefits here: (a) the SSRF defence collapses from five urllib classes (`_PinnedHTTPConnection`, `_PinnedHTTPSConnection`, `_PinnedHTTPHandler`, `_PinnedHTTPSHandler`, `_SafeRedirectHandler`) into a single `_SSRFGuardTransport` because httpx invokes `handle_request()` on every redirect hop, so re-validation is automatic; (b) the per-leg URL host is rewritten to the SSRF-validated IP and the original hostname is preserved in the `Host` header + `sni_hostname` extension, defeating DNS rebinding without subclassing `HTTPConnection` / `HTTPSConnection`; (c) error handling collapses from `URLError` + `HTTPError` + manual unwrap into one `httpx.HTTPError` catch + specific subclasses for timeout / too-many-redirects, matching the `_translate_transport_error` shape from `cli/client.py`. The shared `httpx.Client` is built lazily at module load (same pattern as `cli/client.py:_get_shared_client`) with `follow_redirects=True`, `max_redirects=5`, and our custom transport. Externally observable behaviour is unchanged: same `FetchOutcome` statuses (ok / not_modified / failed / rejected), same manifest format, same conditional GET semantics. Tests migrated from `urllib`-shaped fakes to `httpx`-shaped (`status_code`, `iter_bytes`, context manager); five urllib-specific tests replaced with httpx equivalents (transport unit tests + DNS-rebinding integration test). - Maintainability cleanup batch (PR #234 review #10, #14, #11). **#10:** dropped `_path_under` from `app/api/marketplace.py` — it was a byte-equivalent clone of `_safe_join` (same `Path.resolve(strict=True) + relative_to()` containment check), so the three callers in the v32 asset / doc / mirrored endpoints now share the existing helper. **#14:** renamed `src/marketplace_assets.py` → `src/marketplace_asset_validation.py` so the file's purpose (image / doc magic-byte validators + Content-Type allowlist + agnes-metadata parsers) is obvious from the name and the previous overlap with `src/marketplace_asset_mirror.py` is gone; six call-site imports updated in lockstep. **#11:** consolidated the three URL builders that resolve `/api/marketplace/curated///{asset,doc,mirrored}/...` paths — `_internal_asset_url` / `_internal_doc_url` / `_mirrored_asset_url` lived in `src/marketplace.py`, while a copy named `_mirrored_url` lived in `app/api/marketplace.py` with a "must stay aligned" comment. The new module `src/marketplace_urls.py` is the single source of truth; both call sites import from it. The route-handler endpoints themselves still own the path string literals — keeping the builders identical to the route declarations remains a checklist item. - Consolidate marketplace detail-page video embeds + format-guide CSS (PR #234 review #12, #13). The YouTube nocookie / Vimeo / `