# Changelog

All notable changes to Agnes AI Data Analyst.

Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html), pre-1.0 — public surface (CLI flags, REST endpoints, `instance.yaml` schema, `extract.duckdb` contract) may shift between minor versions; breaking changes called out under **Changed** or **Removed** with the **BREAKING** marker.

CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every CI build; semver tags (`v0.X.Y`) are cut at release boundaries and reference the same commit as a `stable-*` tag from the same day.

---

## [Unreleased]

### Fixed

- **`agnes refresh-marketplace --bootstrap` now recovers when the local marketplace clone exists but Claude Code's registry has lost the `agnes` entry** (fresh Claude Code install on the same machine, manual `claude plugin marketplace remove agnes`, or an earlier interrupted bootstrap). The previous behaviour skipped `_bootstrap_clone` whenever `~/.agnes/marketplace/.git` existed and fell straight through to `claude plugin marketplace update agnes`, which failed with `Marketplace 'agnes' not found. Available marketplaces: claude-plugins-official` and cascaded into per-plugin install errors. The bootstrap path now parses `claude plugin marketplace list`, calls `claude plugin marketplace add ~/.agnes/marketplace` when `agnes` isn't registered, and only then proceeds with fetch + reset + reconcile. Idempotent: a second bootstrap run with `agnes` already registered is a no-op.

  In the same path, `claude plugin marketplace add` failures are now fatal instead of `warn:`-and-continue. The previous warn-and-continue was the root cause of the cascade above — the operator never saw the real error from `add`, only the downstream "Marketplace not found" symptoms. Source: 2026-05-10 init report from a clean-machine bootstrap against a private-CA Agnes deployment.
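The registry check in the bootstrap fix above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: `is_registered`, `ensure_marketplace_registered`, and the injectable `run` callable are hypothetical names, and the parsing assumes the marketplace name appears as a standalone token in the `claude plugin marketplace list` output, which may not match the real format. Treating a failed `list` call as fatal is likewise an assumption.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Hypothetical runner type so the CLI call can be swapped out in tests.
Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def _run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    # Capture output and never raise here; the caller decides what is fatal.
    return subprocess.run(list(cmd), capture_output=True, text=True, check=False)


def is_registered(list_output: str, name: str = "agnes") -> bool:
    # ASSUMPTION: a registered marketplace shows up as a standalone token
    # (optionally decorated with list punctuation) in the `list` output.
    for line in list_output.splitlines():
        for token in line.split():
            if token.strip("*-•:,") == name:
                return True
    return False


def ensure_marketplace_registered(
    clone_dir: Path, run: Runner = _run, name: str = "agnes"
) -> bool:
    """Return True if `add` was invoked, False if already registered (no-op)."""
    listed = run(["claude", "plugin", "marketplace", "list"])
    if listed.returncode != 0:
        # ASSUMPTION: a failing `list` is treated as fatal in this sketch.
        raise RuntimeError(f"marketplace list failed: {listed.stderr.strip()}")
    if is_registered(listed.stdout, name):
        return False
    added = run(["claude", "plugin", "marketplace", "add", str(clone_dir)])
    if added.returncode != 0:
        # Fatal instead of warn-and-continue, so the real error is visible.
        raise RuntimeError(f"marketplace add failed: {added.stderr.strip()}")
    return True
```

Injecting `run` keeps the sketch testable without the Claude Code CLI installed; a second call with `agnes` already listed returns `False` without invoking `add`.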
### Added

- **Setup prompt always registers the `agnes` Claude Code marketplace**, even when the operator has zero plugin grants. Registering the per-user marketplace clone pre-wires the SessionStart hook so future admin grants land automatically on the next Claude Code session without re-running setup. The marketplace block's copy adapts: empty plugin list shows "no plugins granted yet", populated list shows "install plugins". Steps 4 (preflight) + 5 (marketplace) are now always emitted; Confirm shifts from step 6 to step 9 across the full layout.
- **Setup prompt registers the Atlassian Remote MCP server unattended** via `claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse` (Fix C in the 2026-05-10 init-report response). Hosted Remote MCP, so Claude Code handles OAuth automatically the first time the operator asks it to read a Jira ticket or Confluence page — no PAT/keychain dance. Idempotent across re-runs (`|| true` swallows the "server already exists" exit). Asana and Google Workspace stay on the /home connector cards because their PAT/CLI flows don't fit an unattended bootstrap.
- **Setup prompt's Confirm step nudges the user toward connector cards on /home** for Asana / Google Workspace / Atlassian PAT flows that the bash script can't automate. Surfaces the cards so analysts don't finish bootstrap thinking they're fully wired.
- **`/update-agnes-plugins` slash command** — installed automatically by `agnes init` into `/.claude/commands/`. Runs `agnes refresh-marketplace` (the chatty default mode) so the user sees install/update progress streamed into the Claude Code transcript and can react to errors interactively, instead of having a full reconcile happen silently behind a SessionStart hook.
- **`agnes refresh-marketplace --check`** — lightweight detector mode for the SessionStart hook.
Runs `git fetch` only, compares local `HEAD` with remote `FETCH_HEAD`, and emits a Claude Code hook JSON message pointing the user at `/update-agnes-plugins` when there are remote changes. Silent when up to date. No `git reset`, no `claude plugin marketplace update`, no plugin install/update side effects.
- **Flea-market entity edit feature with version history (schema v38).** Owner + admin can now edit a store entity from a real Edit page at `/marketplace/flea/{id}/edit` (replaces the prior "coming soon" placeholder). Editable fields: display name, description, category, video URL, cover photo, and an optional new bundle. Type is locked (400 `type_locked` on change attempt). Display-name change renames the on-disk slug for both the live `plugin/` dir and the version dir, mirroring the rename-on-archive flow. Each bundle update creates a new version: bytes bake into `${DATA_DIR}/store//versions/v/plugin/`, run the standard guardrails pipeline.

  **Deferred promotion:** the live `plugin/` dir and `entity.version_no` stay at the prior approved version through the LLM review window, so existing installers keep receiving the previously approved bundle while the new version is being validated. Promotion (live swap + version_no/version/file_size bump) happens only on LLM approval; if the new version is blocked, installers continue serving the prior approved version indefinitely. The entity row carries `version_no` (current served index) and `version_history` JSON (append-only per-version metadata: hash, sha256, size, submission_id, created_at, created_by). Existing entities backfill to v1 with a single-entry history seeded from the row's current `version` hash.

  **Block-while-pending:** an in-flight LLM review blocks any further edit with 409 `prior_version_pending`. Owner waits ~5-30s; the detail page Edit button renders disabled in the same window.
**Rollback:** new endpoint `POST /api/store/entities/{id}/versions/{n}/restore` (owner + admin) copies a prior version's bundle forward as v and re-runs guardrails. Forward-only history — the original row keeps its verdict; the new copy gets a fresh one. Detail page renders a Versions card with restore buttons for owner/admin only.

  **Admin queue** gains a `v#` column (with "current" badge) and a separate Hash column. Submission detail page surfaces Version + Bundle hash rows. Activity timeline splits into per-submission + entity-wide cards so admins can tell version-scoped events apart from entity-wide ones; entity-wide rows render `vN` chips when the audit row's params reference a version.

### Changed

- **CLAUDE.md template renames the marketplace section to "Agnes Marketplace — plugins available to you"** and clarifies that Claude Code addresses every plugin as `@agnes` regardless of upstream marketplace slug — the per-user aggregated marketplace name is always `agnes`. Resolves the naming-drift confusion flagged in the 2026-05-10 init report (CLAUDE.md previously rendered upstream marketplace registry names like ` Marketplace` / `-marketplace` without explaining the typed name is always `agnes`). Upstream marketplace names still render as nested bullets so admins see what's been folded in.
- **SessionStart marketplace hook is now read-only.** The hook installed by `agnes init` was previously `agnes refresh-marketplace --quiet`, which performed a full fetch+reset+install cycle on every session start (slow, invisible to the user, not interactively recoverable). It now runs `agnes refresh-marketplace --check` — detect-only — and surfaces a hint to run `/update-agnes-plugins` when updates are available. Existing workspaces auto-upgrade on next `agnes init` (the substring marker `agnes refresh-marketplace` matches both the old and new entry shapes, so the idempotent-replace path correctly rewrites them).
- **Marketplace "Added to your stack" hint points at `/update-agnes-plugins`.** The post-install green panel on plugin and skill/agent detail pages used to suggest `agnes refresh-marketplace` in a shell prompt and reference the SessionStart auto-install. With the hook now being detect-only, that text was outdated. The hint is condensed to a single instruction — open a new Claude Code session and run `/update-agnes-plugins` — with the slash command in a copy chip. Affects `marketplace_plugin_detail.html` and `marketplace_item_detail.html`.

### Removed

- **BREAKING: `agnes refresh-marketplace --quiet` flag.** Replaced by `--check` (detect-only) and the new `/update-agnes-plugins` slash command (interactive update). Existing SessionStart hooks calling `--quiet` will silent-noop after the CLI upgrade — the hook's `2>/dev/null || true` swallows the unknown-flag error — until the user re-runs `agnes init`, which rewrites the hook to use `--check` and installs the slash command. Dashboard `/setup` flow re-runs `agnes init` automatically on next paste.
- **BREAKING: legacy `git config --global http..sslVerify=false` downgrade in the install setup prompt.** The marketplace step (step 5) used to emit this line on `AGNES_DEBUG_AUTH=1` instances when no `ca_pem` was readable from `AGNES_TLS_FULLCHAIN_PATH` (default `/data/state/certs/fullchain.pem`). It tripped Claude Code auto-mode classifiers ("do not disable TLS verification" rule) and silently masked operator misconfigurations — a debug-auth instance without a fullchain on disk would fall through to a TLS-disabled clone instead of surfacing the missing cert. With this change there is exactly one trust-bootstrap path: the cross-platform step 0 trust block (gated on `_read_agnes_ca_pem` returning a PEM). Operators serving a self-signed or private-CA cert MUST place the fullchain at the configured path so step 0 picks it up; publicly-trusted certs need no trust block at all.
The `self_signed_tls` parameter on `app.web.setup_instructions.resolve_lines` and `render_setup_instructions` is also dropped (was only consumed by the deleted block).

### Fixed

- **`v34→v35` migration is now idempotent under partial-rebuild recovery.** The original list-form `_V34_TO_V35_MIGRATIONS` ran four ALTER statements in sequence: `ADD _vis_v35` → `UPDATE _vis_v35 = visibility_status` → `DROP visibility_status` → `RENAME _vis_v35 TO visibility_status`. If the RENAME failed for any reason after the DROP succeeded (DuckDB lock contention at startup, scheduler-vs-app race opening `system.duckdb`, container kill mid-migration, …), the DB was stranded with `_vis_v35` populated and `visibility_status` missing — and `schema_version` never bumped because the UPDATE at the bottom of the migration ladder only runs when *every* step succeeds. Subsequent restarts then hit `DROP visibility_status` again with no `IF EXISTS` guard and looped on the same error; the only recovery was hand-editing the DB.

  The migration is rewritten as a Python function `_v34_to_v35_migrate` that inspects the table's columns up front and dispatches into one of three paths: clean v34 (run the full rebuild), partial v35 with `_vis_v35` only (finish the RENAME alone), or both columns present (drop the temp). The audit columns (`archived_at`, `archived_by`) ship first behind `IF NOT EXISTS` so they're safe in all states. Operators stranded by the original bug recover automatically on next startup. Tests cover the three direct paths plus an end-to-end scenario where `_ensure_schema` walks a `schema_version=32` DB with the half-applied state up through to v36.

### Security

- **Prompt-injection hardening for store guardrails LLM review (#1).** `SYSTEM_PROMPT` is now passed via the Anthropic SDK's dedicated `system=` parameter instead of being concatenated into the user message.
Bundle file contents are wrapped in `...` sentinels that the system prompt declares data-only; literal sentinel strings appearing in user content are escaped (`<_bundle_>`) so an adversarial README can't forge a closing tag and inject instructions. The system prompt explicitly tells the reviewer to flag injection attempts inside `` rather than follow them. See `tests/test_store_guardrails_prompt_injection.py` for the corpus.
- **Static security scan documented as signal, not gate (#6 partial).** Module docstring + admin-queue copy + `docs/STORE_GUARDRAILS.md` call out that substring matches are suggestive only — the LLM verdict carries the safety determination. Documentation files (`.md`, `.txt`, `.rst`, `.html`, `.json`, `.yaml`, `.yml`, `.toml`) now skip static scan to avoid false positives on prose that legitimately discusses `eval`/`exec`. AST-mode for Python source is tracked as a follow-up.

### Added

- **Stuck-review reaper (schema v35 + new endpoint).** `POST /api/admin/run-reap-stuck-reviews` flips submissions stuck at `status='pending_llm'` past the configured grace (`guardrails.stuck_review_grace_seconds`, default 1800s) to `review_error`. Scheduler invokes every 15 min. Without this a worker crash between status flip and verdict write left rows pending forever. Set the knob to 0 to disable.
- **PUT /api/store/entities/{id} atomic rename (#2).** Bundle updates now bake into a sibling `plugin.staging-/` dir, run inline checks against the staging copy, then atomic-rename onto the live path on success. Failed checks leave the live tree byte-for-byte intact. Pre-fix the bake wrote into the live path BEFORE checks ran; concurrent GETs could see partial / unverified content.
- **Schema v35 → v36** re-applies `NOT NULL` + `DEFAULT 'pending'` on `store_entities.visibility_status` (lost in the v34→v35 column rebuild). Value-list invariant remains application-side enforced via the repo whitelist (DuckDB `ADD CHECK` on existing columns is not supported).
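The stuck-review reaper rule in the list above can be sketched as a small pure function. This is a minimal illustration under stated assumptions: the `Submission` dataclass, its `pending_since` field, and the `reap_stuck_reviews` name are hypothetical stand-ins (the source does not say which timestamp the real reaper compares against), and the real implementation presumably runs against the database rather than over in-memory rows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass
class Submission:
    id: str
    status: str
    # Hypothetical field: when the row entered `pending_llm`.
    pending_since: datetime


def reap_stuck_reviews(
    rows: Iterable[Submission], now: datetime, grace_seconds: int = 1800
) -> List[str]:
    """Flip rows stuck at `pending_llm` past the grace to `review_error`.

    Returns the ids of the rows that were flipped.
    """
    # A grace of 0 disables the reaper; treating negative values the same
    # way is an assumption of this sketch.
    if grace_seconds <= 0:
        return []
    cutoff = now - timedelta(seconds=grace_seconds)
    reaped: List[str] = []
    for row in rows:
        if row.status == "pending_llm" and row.pending_since < cutoff:
            row.status = "review_error"
            reaped.append(row.id)
    return reaped
```

Only rows still at `pending_llm` are touched, so a verdict that landed in the meantime is left alone.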
### Changed

- **BG-task verdict-vs-archive race fixed (#3).** `StoreEntitiesRepository.set_visibility_if_pending` flips visibility only when the row is still in the review window (`pending` / `hidden`). When an admin archives an entity while the LLM review is in flight, the BG verdict no longer clobbers the archive — admin's decision wins. Skipped flips emit a `store.submission.bg_verdict_skipped` audit row so admins can see why an "approved" verdict didn't publish.
- **Quota counter widened to all reject states (#9).** `count_blocked_for_submitter_since` now counts `blocked_inline`, `blocked_llm`, AND `review_error` against the per-submitter daily cap. Pre-fix a bot triggering only LLM-blocked verdicts was unbounded.
- **Un-archive clears archive metadata (#11).** `set_visibility` nulls `archived_at` + `archived_by` when transitioning OUT of `'archived'` so a future read doesn't show stale archive forensics on an approved row.
- **Missing `risk_level` surfaces as `review_error` (#10).** An LLM response that omits or empties `risk_level` no longer defaults to `medium` (which looked like a model decision and silently blocked); it persists as `review_error` with `error='missing_risk_level'` so the admin gets a real Retry button.
- **Sort-key whitelist for admin queue (#23).** `/api/admin/store/submissions?sort=…` rejects unknown keys with HTTP 400 `invalid_sort_key`. Pre-fix a substring-replace chain could drop column references silently when one column name was a substring of another.
- **FSM doc comment in `_SYSTEM_SCHEMA` corrected (#12).** Explicit insert/transition/lifecycle sections describe the actual status machine instead of the misleading `pending → pending_llm → ...` chain. `pending_inline` clarified as reserved-but-unused.
- **Soft delete (Archive) for store entities (schema v35).** `DELETE /api/store/entities/{id}` is now soft by default — flips `visibility_status='archived'` + stamps `archived_at` / `archived_by`.
Bundle stays on disk, existing `user_store_installs` continue serving the bundle through `marketplace.zip` / `.git` so already-installed users don't lose the plugin. Browse listings hide archived entries from everyone (including the owner — admins triage). New installs refused. My AI Stack still shows installed-but-archived entries with a subtle *"Archived by owner"* badge.

  **Hard delete** moves to `DELETE /api/store/entities/{id}?hard=true` — admin-only. Drops the bundle bytes + cascades to remove `user_store_installs` (existing users lose the plugin on next sync). Use only for legal / privacy removals where the bytes have to go.

  Detail-page UX: owner of an approved entity sees an **Archive** button. Admin sees both **Archive** and a separate red **Hard delete (admin)** button with an install-count warning in the confirm dialog. Quarantined (pending / blocked) entities lock both buttons for the owner — admin still sees both.

  **Visibility-leak gates (similar audit):** `/api/store/owners` + `/api/marketplace/categories?tab=flea` now filter to `visibility_status='approved'` for non-admin callers (admin sees all). Without this, owner identity + per-category counts of quarantined or archived entries leaked through the public dropdown / filter chips.

### Changed

- **Rename-on-archive frees the name for re-upload.** Archiving an entity now appends `__archived__` to `store_entities.name` in the same UPDATE that flips `visibility_status='archived'`. The on-disk skill / agent / plugin subdir is renamed in lockstep (`skills//` → `skills//`) and SKILL.md / agent.md / plugin.json frontmatter `name` is rewritten so consumers' Claude Code resolves the new slug after their next sync. The `(owner_user_id, name)` UNIQUE slot AND the global `-by-` invocation slot free up, so the same owner can re-upload under the original name without picking a new one.
Admin un-archive (set_visibility from 'archived' to 'approved') strips the suffix; if the original slot is taken by a re-upload, the un-archived row gets `-restored-N`. Display layer (admin queue, my-stack, marketplace cards / detail) strips the suffix so users see the original label with an "Archived" badge instead of the marker. Trade-off: existing installers see the plugin renamed on next pull and need to re-add (one-tap recovery via the My AI Stack card; same data, new slug). `audit_log.params['original_name']` preserves forensic traceability.
- **Admin submissions queue: Archived chip filters live entity visibility via LEFT JOIN, not denormalized submission status.** Verdict (`store_submissions.status`) is immutable forensic record; lifecycle (`store_entities.visibility_status`) is the live source of truth. Any code path that flips visibility now surfaces in the queue immediately — no denormalization to drift. *Deleted* chip still filters `entity_id IS NULL AND status='deleted'` (entity row is gone after hard delete; explicit marker required). The submission detail page renders Status (verdict) and Entity lifecycle side by side. Closes the bug where archiving an entity outside the soft-delete API didn't surface under `?status=archived`.
- **Consolidated `/store/{id}` into `/marketplace/flea/{id}`.** The legacy detail surface is gone; the unified marketplace detail page is the canonical home for every flea entity. Three in-tree callers (upload-success redirect, My AI Stack card href, /store browse card href) now point straight at the new URL — no redirect hop. Stale external `/store/{id}` bookmarks 404. The marketplace detail templates (`marketplace_plugin_detail.html` + `marketplace_item_detail.html`) gained the **quarantine banner** (extracted into a shared `_quarantine_banner.html` partial), an **owner-actions strip** (Edit "coming soon" + Delete with locked variants), and the **install-button gating** (gray inert when non-approved).
The marketplace listing now surfaces a small **"Under review" / "Quarantined"** corner badge on the submitter's own non-approved cards (only visible to them; everyone else still sees only approved entries).

### Added

- **Visibility gate on `/marketplace/flea/{id}` + `/api/marketplace/flea/{id}/detail`.** Non-owner non-admin gets 404 (not 403, no leak) on any non-approved entity — closes the bypass where guessing an entity_id pulled the bundle metadata through the marketplace JSON feed even though the entity was excluded from the public listing.
- **`StoreEntitiesRepository.list(include_owner_id=…)`.** When set, the WHERE expands to `(visibility_status IN (...) OR owner_user_id = :uid)` so the caller's own non-approved entries surface alongside everyone's approved ones. Used by `/api/store/entities` and `/api/marketplace/items?tab=flea`.

### Removed

- **`/store/{id}` route + `store_detail.html` template.** Replaced by the consolidated marketplace detail surface above.

### Removed

- **`store_submissions.retry_count` column (schema v34).** Counter mixed two unrelated things (LLM error count + admin rescan count), was asymmetric (Retry LLM didn't bump but Rescan did), and is fully redundant with the audit_log activity timeline now rendered on the detail page — every rescan / retry / review_error is a row there with timestamp + actor. Removed from schema, repo signatures, admin endpoints, and the detail-page metadata.

### Internal

- Migrate `src/marketplace_asset_mirror.py` from `urllib.request` to `httpx` (PR #234 review #16). The asset mirror was the only HTTP call site in Agnes still using `urllib.request`; every other module (CLI, Jira / OpenMetadata / OpenAI connectors, scheduler, Telegram bot) already used `httpx`.
Following the existing convention has three concrete benefits here: (a) the SSRF defence collapses from five urllib classes (`_PinnedHTTPConnection`, `_PinnedHTTPSConnection`, `_PinnedHTTPHandler`, `_PinnedHTTPSHandler`, `_SafeRedirectHandler`) into a single `_SSRFGuardTransport` because httpx invokes `handle_request()` on every redirect hop, so re-validation is automatic; (b) the per-leg URL host is rewritten to the SSRF-validated IP and the original hostname is preserved in the `Host` header + `sni_hostname` extension, defeating DNS rebinding without subclassing `HTTPConnection` / `HTTPSConnection`; (c) error handling collapses from `URLError` + `HTTPError` + manual unwrap into one `httpx.HTTPError` catch + specific subclasses for timeout / too-many-redirects, matching the `_translate_transport_error` shape from `cli/client.py`. The shared `httpx.Client` is built lazily at module load (same pattern as `cli/client.py:_get_shared_client`) with `follow_redirects=True`, `max_redirects=5`, and our custom transport. Externally observable behaviour is unchanged: same `FetchOutcome` statuses (ok / not_modified / failed / rejected), same manifest format, same conditional GET semantics. Tests migrated from `urllib`-shaped fakes to `httpx`-shaped (`status_code`, `iter_bytes`, context manager); five urllib-specific tests replaced with httpx equivalents (transport unit tests + DNS-rebinding integration test).
- Maintainability cleanup batch (PR #234 review #10, #14, #11).

  **#10:** dropped `_path_under` from `app/api/marketplace.py` — it was a byte-equivalent clone of `_safe_join` (same `Path.resolve(strict=True) + relative_to()` containment check), so the three callers in the v32 asset / doc / mirrored endpoints now share the existing helper.
**#14:** renamed `src/marketplace_assets.py` → `src/marketplace_asset_validation.py` so the file's purpose (image / doc magic-byte validators + Content-Type allowlist + agnes-metadata parsers) is obvious from the name and the previous overlap with `src/marketplace_asset_mirror.py` is gone; six call-site imports updated in lockstep.

  **#11:** consolidated the three URL builders that resolve `/api/marketplace/curated///{asset,doc,mirrored}/...` paths — `_internal_asset_url` / `_internal_doc_url` / `_mirrored_asset_url` lived in `src/marketplace.py`, while a copy named `_mirrored_url` lived in `app/api/marketplace.py` with a "must stay aligned" comment. The new module `src/marketplace_urls.py` is the single source of truth; both call sites import from it. The route-handler endpoints themselves still own the path string literals — keeping the builders identical to the route declarations remains a checklist item.
- Consolidate marketplace detail-page video embeds + format-guide CSS (PR #234 review #12, #13). The YouTube nocookie / Vimeo / `