agnes-the-ai-analyst/tests/test_cli_refresh_marketplace.py
minasarustamyan c6c72b9c00
feat(flea): marketplace refactor — data model, attribution, UI unification (#342)
* feat(flea): phase-1 — title, tagline, synthetic_name columns + upload UX

Schema v49 adds three user-facing metadata columns to store_entities:

- title (NOT NULL) — humanized display name shown on marketplace
  surfaces in later phases. Acronym-aware humanizer in
  src/store_naming.py (27 entries: MCP, API, OAuth, S3, …) shared
  with the frontend via Jinja-injected dict so JS pre-fill and
  Python backfill produce identical output.
- tagline (NULL, ≤200 chars) — optional short description for card
  listings. Long-form `description` stays.
- synthetic_name (NOT NULL) — deterministic `<name>-by-<owner_username>`
  stored as a column for indexing and as the single source of truth
  for attribution lookups in later phases. Today's bundle bake still
  uses suffixed_name() at the same call sites.
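
As a rough illustration of the two derived columns above, the sketch below shows an acronym-aware humanizer and the synthetic-name formula. It is not the code in src/store_naming.py: the `ACRONYMS` table here is a hypothetical four-entry subset (the real one reportedly has 27 entries), and the function bodies are assumptions based only on the description in this message.

```python
# Hypothetical subset of the acronym table; the real dict is larger.
ACRONYMS = {"mcp": "MCP", "api": "API", "oauth": "OAuth", "s3": "S3"}


def humanize_name(name: str) -> str:
    """Turn a kebab-case slug into a display title, keeping known acronyms."""
    words = [w for w in name.split("-") if w]
    return " ".join(ACRONYMS.get(w.lower(), w.capitalize()) for w in words)


def synthetic_name(name: str, owner_username: str) -> str:
    """Deterministic `<name>-by-<owner_username>` identifier."""
    return f"{name}-by-{owner_username}"


if __name__ == "__main__":
    print(humanize_name("mcp-api-client"))
    print(synthetic_name("alpha", "alice"))
```

Keeping the acronym table as plain data is what would let the same mapping be injected into the frontend, so the JS pre-fill and the Python backfill agree.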

Migration (_v48_to_v49_migrate, Python function — humanize has no
SQL equivalent) backfills existing rows: title from
humanize_name(strip_archive_suffix(name)), synthetic from the concat
formula; tagline stays NULL. Idempotent (ADD COLUMN IF NOT EXISTS +
SET NOT NULL no-op on re-run).

Upload form (store_upload.html step 2) reorders fields: Title
(pre-filled from server-side humanize, JS keeps it in sync until
the user edits manually) → Name + dark synthetic preview on one
row (matches marketplace_item_detail.html dark code styling, no
copy button — preview only) → Short description with character
counter → Description (unchanged). Edit form (store_edit.html)
mirrors the layout with pre-filled values from the entity row.

API:

- POST /api/store/entities/preview returns `title` (humanized
  fallback) for upload form pre-fill.
- POST + PUT /api/store/entities accept `title` and `tagline` form
  fields with 100/200-char validation; PUT recomputes
  synthetic_name when `name` changes (caller responsibility per
  repo contract).
- StoreEntityResponse exposes all three new fields.

Repository:

- create() takes title + tagline + synthetic_name as optional
  kwargs with derived defaults (humanize_name(name) / concat) so
  existing test fixtures don't need to thread them.
- update() supports partial updates on all three; tagline empty
  string clears via NULL sentinel.
- archive() recomputes synthetic_name on rename to the archived
  slug so the column stays consistent with name.

Tests:

- New test_schema_v48_to_v49_migration.py: fresh install,
  populated-row backfill (incl. archived row strip), idempotence,
  NOT NULL constraint verification.
- test_store_naming.py: 14 humanize parametrize cases + acronym
  dict invariants.
- test_store_api.py::TestStoreV49Metadata: preview humanize, POST
  with explicit + fallback title, 100/200-char rejects, PUT
  partial update + synthetic recompute on rename.
- Schema version assertion bumps (48 → 49) in test_db_schema_version,
  test_home_stats, test_schema_v42_migration, test_schema_v46_migration.

Phase 1 only — surface rendering on cards / detail pages and
Claude Code bundle propagation come in later phases.

* feat(flea): phase-2 — wire title/tagline/owner through marketplace cards + detail pages

Phase 1 (7f4cfcbb) populated the three new columns on store_entities;
phase 2 surfaces them across the web presentation layer so the kebab-
case slug + bare username no longer leak into user-facing copy.

API:

- `_flea_to_item` now takes `conn` (both callsites updated) and sets
  `display_name=entity.title`, `tagline=entity.tagline`, `owner=
  _resolve_owner_display(conn, owner_user_id, owner_username)` —
  matches the chain the curated path already uses (users.name →
  users.email → fallback). The card JS chain `it.display_name ||
  it.name` then renders the friendly form; `name` stays at the
  suffixed slug as the technical identifier JS uses for fallbacks.
- `flea_detail` adds `display_name` + `tagline` to PluginDetailResponse
  so the standalone skill/agent + plugin detail heroes pick them up
  through the existing `d.display_name` / `d.tagline` chains.
- `_flea_inner_parent_fields` swaps `parent_display_name` from
  `strip_archive_suffix(name)` to `entity.title or strip_archive_suffix(
  name)`. Drives parent-plugin label in four surfaces at once:
  breadcrumb 3rd segment, hero "part of <plugin>" meta-row,
  helper "This skill is part of <plugin>" panel, and the Details
  sidebar's "Parent plugin" row.
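
The owner fallback chain described above (users.name → users.email → fallback) can be sketched as a small pure function. This is a simplified stand-in: the real `_resolve_owner_display` takes a DB connection and a user id, whereas the hypothetical `resolve_owner_display` below takes an already-loaded user row so the example stays self-contained.

```python
from typing import Optional


def resolve_owner_display(user_row: Optional[dict], owner_username: str) -> str:
    """Pick the first non-empty value in the order name, email, username."""
    if user_row:
        return user_row.get("name") or user_row.get("email") or owner_username
    # No matching users row at all: fall back to the stored username slug.
    return owner_username


if __name__ == "__main__":
    print(resolve_owner_display({"name": "Alice Example", "email": "a@example.com"}, "alice"))
    print(resolve_owner_display({"name": None, "email": "a@example.com"}, "alice"))
    print(resolve_owner_display(None, "alice"))
```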

Templates — `marketplace_item_detail.html`:

- Pre-render: browser title, hero h1, and hero-window-label read
  `(entity.title if entity else None) or inner_name or item_name or
  plugin_name` so the SSR shell shows the friendly title before the
  JS fetch lands (no flash of kebab-case).
- Breadcrumb last segment for flea standalone drops the `d.manifest_name
  || heroTitle` fallback in favour of just `heroTitle` — manifest_name
  is the suffixed slug and users explicitly didn't want it in the path.
- Hero meta-row for flea standalone is now hidden. The prior "by
  <author> · N installed · <size>" line duplicated install count
  (hero telemetry chip below), owner + bundle size (Details sidebar).

Templates — `marketplace_plugin_detail.html`:

- Same SSR pre-render swap (title, h1, window-label, crumb-name).
- Hero tagline element starts hidden; JS shows it only when
  `d.tagline` is truthy. Pre-fix it fell back to `d.description`
  (long-form text), which read awkwardly under the h1 and pulled the
  hero too tall. Description still renders in the "What it does"
  panel below the hero.
- Initial "Loading…" placeholder removed so entities without a
  tagline don't flash that text mid-fetch.

Tests:

- New `TestFleaPhase2Presentation` class in test_marketplace_api.py
  (6 cases): card title + tagline + full-name owner, owner fallback
  chain when users.name is NULL, flea_detail exposes title + tagline,
  tagline null when omitted, inner skill parent_display_name uses
  entity.title (explicit + humanize-fallback variants).
- Updated `TestListItems.test_flea_lists_uploads` to assert both
  `display_name == "Alpha"` (humanized) and `name ==
  "alpha-by-alice"` (suffixed slug compat).
- Updated `TestWebPages.test_marketplace_flea_detail_page_renders`
  to look for the humanized title ("Page Skill") in the SSR shell
  instead of the kebab-case `page-skill`.

* feat(flea): phase-3 — read synthetic_name from DB, suffixed_name() only on write

Phase 1 added the column + backfill, repo write paths keep it in sync.
Phase 3 routes every READ callsite through `store_entities.synthetic_name`
directly instead of recomputing `<name>-by-<owner_username>` on the fly,
and switches the collision query off the inline string concat. The
`suffixed_name()` primitive now lives exclusively in write flows.

Read callsites updated (all read `entity["synthetic_name"]` directly,
no fallback — the column is NOT NULL and a missing value would be a
real bug worth surfacing as KeyError):

- app/api/marketplace.py:_flea_to_item — card MarketplaceItem.name.
- app/api/marketplace.py:flea_detail — PluginDetailResponse.manifest_name.
- app/api/store.py:_entity_to_response — StoreEntityResponse.invocation_name.
- app/api/store.py PUT bundle re-bake — `suffixed` passed to
  `_bake_plugin_tree`; entity is loaded pre-rename, so its
  synthetic_name is the OLD value `_bake_plugin_tree` expects.
- app/api/store.py PUT rename — `old_suffix` for `_rename_baked_tree`.
- app/api/my_stack.py — StoreInstallEntry.invocation_name.
- src/marketplace_filter.py — manifest_name in served plugin entry.

`suffixed_name` imports removed from marketplace.py, my_stack.py, and
marketplace_filter.py (no remaining callsites). store.py keeps the
import for its write paths:

- POST create (`suffixed = suffixed_name(final_name, username)` →
  passed to `_bake_plugin_tree` and `repo.create(synthetic_name=...)`).
- PUT rename collision check (`new_suffixed`).
- PUT rename `new_suffix` for `_rename_baked_tree` (proposed value).
- PUT rename `new_synthetic` for `repo.update(synthetic_name=...)`.
- Archive `old_suffix` + `new_suffix` for `_rename_baked_tree`
  (retro-compute pre-archive value after `repo.archive` already
  overwrote the DB row with the post-archive synthetic).

Collision SQL — `_suffixed_already_taken`:

  WHERE name || '-by-' || owner_username = ?   (before)
  WHERE synthetic_name = ?                     (after)

Same matches today (phase 1 backfill + NOT NULL invariant + write
paths in sync); indexable + single source of truth going forward.
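
To make the equivalence claim concrete, here is a minimal sketch that runs both predicates against the same rows. It uses the stdlib `sqlite3` module as a stand-in for the project's actual database, and the three-column table is a reduced, hypothetical version of `store_entities`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store_entities ("
    " name TEXT NOT NULL,"
    " owner_username TEXT NOT NULL,"
    " synthetic_name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO store_entities VALUES (?, ?, ?)",
    [("alpha", "alice", "alpha-by-alice"), ("xlsx", "bob", "xlsx-by-bob")],
)

BEFORE = "SELECT COUNT(*) FROM store_entities WHERE name || '-by-' || owner_username = ?"
AFTER = "SELECT COUNT(*) FROM store_entities WHERE synthetic_name = ?"


def taken(sql: str, candidate: str) -> bool:
    """Return True when the candidate suffixed name already exists."""
    return conn.execute(sql, (candidate,)).fetchone()[0] > 0


if __name__ == "__main__":
    for candidate in ("alpha-by-alice", "alpha-by-bob"):
        print(candidate, taken(BEFORE, candidate), taken(AFTER, candidate))
```

The two queries agree only as long as every write path keeps `synthetic_name` in sync with `name` and `owner_username`, which is the invariant this phase relies on.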

Repository:

- UserStoreInstallsRepository.list_for_user explicit SELECT extended
  with `se.title`, `se.tagline`, `se.synthetic_name` so my_stack and
  marketplace_filter callers can read them off the joined row.

Tests:

- test_store_api.py::test_invocation_name_reads_from_synthetic_column —
  upload entity, manually override the column with a non-canonical
  value, verify GET response returns the override (proves read path
  consumes the column, not recomputes).
- test_marketplace_api.py::test_flea_card_and_detail_read_synthetic_name_from_db —
  same proof for `MarketplaceItem.name` (card) and
  `PluginDetailResponse.manifest_name` (detail).

* feat(flea): phase-4 — rename agnes-store-bundle → flea (synthetic plugin)

The synthetic plugin that wraps loose flea-market skills + agents into
one Claude Code plugin is renamed from `agnes-store-bundle` to `flea`.
Plugin-type flea uploads (their own standalone plugin entry) are
unaffected.

Constants:
- src/marketplace_filter.py:
  - BUNDLE_PLUGIN_NAME: "agnes-store-bundle" → "flea"  (Claude Code
    plugin manifest name + .claude-plugin/plugin.json name)
  - BUNDLE_PREFIXED_NAME: "store-bundle" → "flea"      (on-disk ZIP /
    git tree path, now plugins/flea/...)

Attribution layer (services/session_processors/usage_lib.py):
- FLEA_BUNDLE_PREFIX: "agnes-store-bundle" → "flea". The JSONL
  invocation identifier going forward is `flea:<skill-name>`.
- New `_LEGACY_FLEA_BUNDLE_PREFIXES = ("agnes-store-bundle",)`.
  `MarketplaceItemLookup.resolve()` + `_attribute_event()` accept BOTH
  the new and the legacy prefix so historic usage_events (~90-day
  retention) continue attributing to source='flea'. The tuple becomes
  a no-op once the rename has been live past the retention window —
  a follow-up commit can drop it then.
- USAGE_PROCESSOR_VERSION bumped 6 → 7 so the session-pipeline reprocess
  loop re-runs attribution with the new + legacy prefix branches.

User-facing copy:
- /api/store/bundle.zip Content-Disposition filename: agnes-store-bundle.zip → flea.zip
- `agnes admin store pull` default --out: agnes-store-bundle.zip → flea.zip
- Docstrings + JS comment + welcome template comment updated.

Tests:
- skill_flea.jsonl fixture identifier updated to flea:flea-skill.
- New skill_flea_legacy.jsonl with the legacy prefix for backward-compat
  coverage.
- New test `test_legacy_agnes_store_bundle_prefix_resolves` replays the
  legacy fixture and asserts source='flea' attribution still lands.
- All other test assertions / mocks substituted mechanically:
  test_session_processor_usage.py, test_usage_rollups.py,
  test_marketplace_filter_store.py, test_store_api.py,
  test_cli_refresh_marketplace.py.
- `_seed_flea_entity` (test_usage_rollups.py) + `_seed_attribution`
  (test_session_processor_usage.py) helpers now supply the NOT NULL
  `title` + `synthetic_name` columns from phase 1, since they INSERT
  directly bypassing the repo's create() fallback.

Client rollover note (CHANGELOG): `agnes refresh-marketplace` will
install the new `flea@agnes` plugin and the local marketplace clone's
`plugins/store-bundle/` source folder is removed via `git reset --hard`.
Whether Claude Code itself auto-prunes the orphan
`agnes-store-bundle@agnes` registry entry is undocumented and still has
to be verified empirically on the dev VM. If the orphan entry lingers, a
follow-up will add targeted cleanup; until then users can manually run
`claude plugin uninstall agnes-store-bundle@agnes`.

Verified locally: 98 passed (session_processor_usage + usage_rollups +
marketplace_filter_store + cli_refresh_marketplace) + 228 passed/2
skipped (store_api + marketplace_api + admin_store_submissions +
store_entity_versions + store_repositories).

* fix(flea): phase-5 — attribution keyspace mismatch (closes #335)

Pre-fix every flea skill/agent invocation silently fell through to
`usage_events.source = 'builtin'`. Root cause: lookup tables in
`services/session_processors/usage_lib.py` keyed `_flea_entities` (and
the derived `_flea_plugins` set) by `store_entities.name` — the
un-suffixed display name. Claude Code writes invocations as
`flea:<synthetic_name>` (e.g. `flea:xlsx-by-c-marustamyan`), so
`dict.get(local)` always missed and the resolver fell through to
builtin. Result: marketplace cards, detail telemetry chips, admin
group-by-source all showed 0 flea invocations even when the raw
JSONL stream was correct.

Phase 1 added the `synthetic_name` column + backfill; phase 4 renamed
the bundle prefix to `flea`; phase 5 finally flips the lookup
keyspace to match what JSONL writes.

usage_lib.py:
- `MarketplaceItemLookup.__init__` preload: `SELECT synthetic_name,
  type FROM store_entities` (was `SELECT name, type`). `_flea_plugins`
  set derived from those keys, so it now carries synthetic_names
  too — matches what Claude Code writes when invoking a skill nested
  inside a flea plugin (`<synthetic>:<inner>`).
- `rebuild_rollups` preload: same SELECT change; also derives
  `flea_plugins` and threads it through `_aggregate_events` /
  `_rebuild_window`.
- `_attribute_event`: signature extended with `flea_plugins`; new
  branch `if prefix in flea_plugins: return ("flea", default_type,
  prefix, local)` for flea-plugin-nested skills/agents. This branch
  was added to `MarketplaceItemLookup.resolve()` in v6 (commit
  e076ebbe) but the rollup builder's helper was never updated to
  match, so nested skills inside flea plugins silently dropped out
  of the daily/window fact tables.
- `USAGE_PROCESSOR_VERSION`: 7 → 8. Forces the session-pipeline
  reprocess loop to re-attribute existing usage_events rows with
  the corrected lookup so rollup tables fill correctly on the next
  tick.
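
The keyspace mismatch and the nested-plugin branch can be sketched as follows. This is a simplified illustration, not the code in usage_lib.py: the `attribute` function, its argument shapes, and the `default_type` default are assumptions made for the example.

```python
from typing import Optional

FLEA_BUNDLE_PREFIX = "flea"


def attribute(
    identifier: str,
    flea_entities: dict,
    flea_plugins: set,
    default_type: str = "skill",
) -> tuple:
    """Return (source, type, parent_plugin, name) for an invocation id."""
    prefix, _, local = identifier.partition(":")
    # Standalone flea skill/agent: `flea:<synthetic_name>`.
    if prefix == FLEA_BUNDLE_PREFIX and local in flea_entities:
        return ("flea", flea_entities[local], "", local)
    # Skill nested inside a flea plugin: `<plugin synthetic_name>:<inner>`.
    if prefix in flea_plugins:
        return ("flea", default_type, prefix, local)
    fallback_type: Optional[str] = None
    return ("builtin", fallback_type, "", identifier)


if __name__ == "__main__":
    invocation = "flea:xlsx-by-c-marustamyan"
    keyed_by_name = {"xlsx": "skill"}                        # pre-fix keyspace
    keyed_by_synthetic = {"xlsx-by-c-marustamyan": "skill"}  # post-fix keyspace
    print(attribute(invocation, keyed_by_name, set()))
    print(attribute(invocation, keyed_by_synthetic, set()))
```

With the lookup keyed by the un-suffixed name, the first branch never matches and the event falls through to builtin, which is the failure mode described above.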

marketplace.py — 4 API stats lookup callsites switched from
`entity["name"]` to `entity["synthetic_name"]`:
- `_flea_to_item` (card stats lookup)
- `flea_detail` (`_build_telemetry` + `_load_inner_items_stats_by_parent`)
- `flea_skill_detail` (inner detail `parent_plugin` key)
- `flea_agent_detail` (inner detail `parent_plugin` key)

Tests:
- `skill_flea.jsonl` invocation: `flea:flea-skill` →
  `flea:flea-skill-by-alice` (mirrors what Claude Code writes after
  phase 1/4 — the suffixed synthetic_name).
- `test_flea_skill_attributed_with_empty_parent` assertion: rollup
  `name` column now carries the synthetic_name.

No legacy `agnes-store-bundle` prefix backward compat — clean cut per
user direction (dev phase, no production data worth preserving).

Verified locally: 53 passed targeted (session_processor_usage +
usage_rollups + marketplace_filter_store) + 215 passed/2 skipped
broader (store_api + marketplace_api + admin_store_submissions +
store_entity_versions).

* fix(flea): phase-6 — plugin-level rollup aggregation parity for flea

Flea plugin entity cards + detail pages showed 0 invocations even
though nested skills had correct rollup rows. Root cause: the
plugin-level aggregation pass in `_aggregate_events` was hardcoded
to `source='curated'` only:

    if source != "curated" or not parent:
        continue
    if group_by_day:
        pkey = (day, "curated", "plugin", "", parent)
    else:
        pkey = ("curated", "plugin", "", parent)

So flea plugin entities never got a synthetic
`(source='flea', type='plugin', parent_plugin='', name=<synth>)`
row aggregating nested invocations. `_load_invocation_stats('flea')`
filters `parent_plugin = ''` and returned no row for flea plugin
entity cards, so `stats.get(entity["synthetic_name"])` missed and
the API exposed 0/0.

Triggered by empirical observation on the dev VM —
`codex-second-opinion-by-c-marustamyan` plugin showed 0 calls in
the listing card while its three inner skills (codex-setup ×3,
codex-review ×1, codex-second-opinion ×1) had the expected child
rollup rows.

Fix:

- Extend the guard to `source in ("curated", "flea")`.
- Replace the hardcoded `"curated"` in the `pkey` tuple with the
  loop's `source` variable, so flea aggregation lands as `source=
  'flea'` and curated aggregation continues landing as
  `source='curated'`.
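
A minimal sketch of the widened plugin-level pass, assuming a flat event tuple of `(source, type, parent_plugin, name, user_id)`. The event shape, the `aggregate` helper, and the plugin name `codex-by-alice` are hypothetical; the real `_aggregate_events` also handles per-day grouping, which is omitted here.

```python
from collections import defaultdict

# Hypothetical events: one user invokes two skills, a second user invokes one.
EVENTS = [
    ("flea", "skill", "codex-by-alice", "codex-setup", "u1"),
    ("flea", "skill", "codex-by-alice", "codex-review", "u1"),
    ("flea", "skill", "codex-by-alice", "codex-setup", "u2"),
    ("builtin", "skill", "", "help", "u1"),
]


def aggregate(events):
    """Return {(source, type, parent_plugin, name): (count, distinct_users)}."""
    counts = defaultdict(int)
    users = defaultdict(set)
    for source, type_, parent, name, user in events:
        key = (source, type_, parent, name)
        counts[key] += 1
        users[key].add(user)
        # Plugin-level pass: the guard used to admit only source == "curated".
        if source not in ("curated", "flea") or not parent:
            continue
        pkey = (source, "plugin", "", parent)
        counts[pkey] += 1
        users[pkey].add(user)
    return {key: (counts[key], len(users[key])) for key in counts}


if __name__ == "__main__":
    for key, value in sorted(aggregate(EVENTS).items()):
        print(key, value)
```

Tracking users as a set per key is what makes the plugin row report a union of distinct users instead of a sum over its children.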

API path unchanged: `_load_invocation_stats('flea')` already filters on
`parent_plugin = ''`, so it picks up the new aggregated row alongside
standalone skill/agent rows. Rollup `name` field carries
the synthetic_name keyspace; no collision between standalone entity
synthetic and plugin entity synthetic (global suffix uniqueness
enforced by `_suffixed_already_taken`).

`USAGE_PROCESSOR_VERSION` bumped 8 → 9 to force a reprocess pass so
historic nested-invocation data fills the new plugin-level rows on
the next tick (instead of waiting for the next live invocation).

Tests:

- New `test_flea_plugin_row_aggregates_children` mirrors the existing
  `test_curated_plugin_row_aggregates_children`: seeds a flea plugin
  entity, three nested events (one user invoking two skills, a
  second user invoking one) → asserts the aggregated plugin row
  carries count=3, distinct_users=2 (union, not sum), plus the child
  rows survive alongside.

Verified locally: 43 passed (session_processor_usage + usage_rollups)
+ 82 passed/2 skipped broader (+ marketplace_filter_store +
marketplace_api).

* refactor(marketplace): phase-7 — unify Details sidebar across detail surfaces

Five marketplace detail surfaces (curated plugin, flea plugin, curated
inner skill/agent, flea inner skill/agent, flea standalone skill/agent)
had drifted on which Details rows they show and what order — the same
field landed in different positions, some fields duplicated hero info,
and the flea plugin Owner row leaked the kebab-case `owner_username`
slug instead of the user's real name. This commit aligns all five
surfaces on a single scan order driven by UX priority:

  identity → life-stage → telemetry → debug-tier

Concretely:

  1. Curator / Owner          (first scan signal — trust)
  2. Parent plugin            (inner skill/agent only)
  3. Released                 (top-level only — plugins + flea standalone)
  4. Last used                (recency)
  5. Active days              (engagement consistency)
  6. Version                  (flea standalone only — content hash)
  7. Bundle size              (debug-tier)

Dropped:

  - Slug field on plugin detail surfaces (`marketplace_id` for curated,
    `entity_id` for flea). Pure debug info, never user-relevant; URL
    already carries it.
  - Category + Installs on flea standalone skill/agent detail.
    Category is already shown as a hero badge; install count is in
    the hero telemetry chip — sidebar duplication added noise.

Owner display:

  - Flea plugin Owner row now reads `d.owner_display` (resolved through
    `users.name → users.email → owner_username` by `_resolve_owner_display`
    in `app/api/marketplace.py:1491`) instead of the raw `d.author_name`
    (which is `owner_username`, the kebab-case slug). API field already
    populated from phase 2; templates just consume it.
  - Curated Curator row continues to read `d.author_name` from
    marketplace-metadata.json; `owner_todo` placeholder behavior
    preserved.

Files:

  - app/web/templates/marketplace_plugin_detail.html — rewrote the
    Details render loop (lines 1364-1427 area). Slug row removed,
    rows reordered, Owner branch reads `d.owner_display`.
  - app/web/templates/marketplace_item_detail.html — both branches of
    the Details sidebar (inner skill/agent + flea standalone) re-laid
    around the same scan order. Telemetry helper unchanged, just
    repositioned. Category + Installs rows removed from the
    standalone branch.

No new tests — no existing test asserts the precise order of Details
rows or references the dropped fields in a sidebar context (grep
confirmed). API surface unchanged.

Verified locally: 84 passed / 2 skipped on `test_marketplace_api.py`
+ `test_store_api.py`.

* fix(flea): post-review hardening — N+1, v50 UNIQUE, docs, test cleanup

Addresses 5 critical findings from PR #342 code review:

1. N+1 query in `_flea_to_item` — owner-display resolution previously
   ran one `SELECT … FROM users WHERE id = ?` per item in the listing
   comprehension. Now batched via `_load_users_display` IN-query
   prefetch; 50 items drops 51 user queries to 2. Regression-guarded
   by `TestFleaOwnerDisplayBatched` (spies `_resolve_owner_display`
   and asserts it's not called inside the list path).

2. Misleading comment in `src/marketplace_filter.py` claimed the
   attribution layer accepts both `agnes-store-bundle` and `flea`
   prefixes — it doesn't (clean cut per CHANGELOG). Rewrote to match
   reality.

3. CHANGELOG `[Unreleased]` had two `### Changed` blocks. Merged into
   one (BREAKING bullet first).

4. New v49→v50 migration adds `UNIQUE INDEX
   idx_store_entities_synthetic_name`. v49 made `synthetic_name` the
   canonical attribution key but uniqueness was only app-enforced;
   v50 promotes the invariant to the DB layer. Migration pre-checks
   for existing duplicates and raises `RuntimeError` listing them
   rather than letting `CREATE UNIQUE INDEX` fail mid-way. v48→v49
   migration gained an `is_nullable='YES'` guard on its `SET NOT NULL`
   ALTERs so re-runs on a fully-migrated DB don't trip DuckDB's
   "cannot alter entry … entries depend on it" block (the new index
   counts as such an entry). Index is created by the migration only —
   keeping it out of `_SYSTEM_SCHEMA` preserves fresh-install ordering
   (CREATE TABLE → v49 ALTERs → v50 CREATE INDEX).

5. Deleted three redundant version-pinned schema asserts whose names
   lied about their bodies (`test_schema_version_is_42` asserting
   `== 49`, etc.). Canonical assert lives in
   `test_db_schema_version.py`, renamed to
   `test_schema_version_matches_constant`.
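
The duplicate pre-check from item 4 can be sketched like this. It uses stdlib `sqlite3` as a stand-in for DuckDB, the helper name `add_unique_synthetic_index` is hypothetical, and the `IF NOT EXISTS` clause is an assumption added so the sketch is safe to re-run; the actual migration may be structured differently.

```python
import sqlite3


def add_unique_synthetic_index(conn: sqlite3.Connection) -> None:
    """Fail loudly on duplicates, then create the unique index."""
    dupes = conn.execute(
        "SELECT synthetic_name, COUNT(*) FROM store_entities"
        " GROUP BY synthetic_name HAVING COUNT(*) > 1"
    ).fetchall()
    if dupes:
        names = ", ".join(name for name, _count in dupes)
        raise RuntimeError(f"duplicate synthetic_name values: {names}")
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_store_entities_synthetic_name"
        " ON store_entities (synthetic_name)"
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE store_entities (synthetic_name TEXT NOT NULL)")
    conn.execute("INSERT INTO store_entities VALUES ('alpha-by-alice')")
    add_unique_synthetic_index(conn)
    print("index created")
```

Checking first means the operator sees the offending names in one error message instead of a generic constraint failure from the index build.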

* fix(db): gate v34→v38 store_entities ALTER COLUMN steps on column state

CI on Linux failed `test_v17_to_v18_drops_*` after the v50 UNIQUE INDEX
landed. Root cause: those tests open a DB at the full target version,
seed fixtures, then reset `schema_version` to 17 and reopen — forcing
the ladder to re-run from 17 → current. With the v50 index now in place,
DuckDB blocks intermediate `ALTER COLUMN` steps on `store_entities`
("Cannot drop this column: an index depends on a column after it!" /
"Cannot alter entry because there are entries that depend on it"),
because `synthetic_name` (the indexed column) sits positionally after
the columns those steps touch.

Fix: convert the three SQL-list migrations that hit store_entities into
defensive Python functions:

- `_v34_to_v35_migrate` short-circuits when `synthetic_name` already
  exists (post-v49 shape — the visibility_status rebuild is moot and
  the DROP COLUMN would be blocked by the index).
- `_v35_to_v36_migrate` gates the `visibility_status SET NOT NULL` +
  `SET DEFAULT` on `is_nullable='YES'` so it's a true no-op when the
  column is already constrained.
- `_v37_to_v38_migrate` gates the `version_no SET NOT NULL` step the
  same way.

Forward-roll path (real installs that never reset schema_version) is
unchanged: the gates fire `YES` → ALTERs run. The fix only changes
behavior for the "DB is already at v50 shape but version row says 17"
scenario the tests construct.

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
2026-05-19 02:32:41 +02:00


"""Tests for `agnes refresh-marketplace` Typer wrapper."""
from __future__ import annotations

import json
import re
import subprocess
from pathlib import Path
from typing import Optional

import pytest
from typer.testing import CliRunner

from cli.commands import refresh_marketplace as rm_module
from cli.commands.refresh_marketplace import refresh_marketplace_app

# CI-safety: Typer/rich emits ANSI escapes in --help output. Strip before asserts.
_ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def _clean(s: str) -> str:
    return _ANSI_RE.sub("", s)


runner = CliRunner()


# --- Test fixtures and helpers --------------------------------------------------


class _RecordedCall:
    """Captures a single subprocess.run invocation for assertion."""

    def __init__(self, cmd: list[str], env: Optional[dict] = None) -> None:
        self.cmd = cmd
        self.env = env or {}


class _SubprocessRecorder:
    """Replaces subprocess.run with a recording stub. Each scripted result
    is matched by command-prefix against incoming calls."""

    def __init__(self) -> None:
        self.calls: list[_RecordedCall] = []
        self.scripts: list[tuple[tuple[str, ...], subprocess.CompletedProcess]] = []

    def script(self, prefix: tuple[str, ...], returncode: int = 0,
               stdout: str = "", stderr: str = "") -> None:
        """Register a scripted response. Calls whose cmd starts with
        ``prefix`` get this CompletedProcess. Most-specific (longest)
        prefixes match first, so a ``claude plugin list --json`` script
        wins over a generic ``claude`` fallback."""
        self.scripts.append(
            (prefix, subprocess.CompletedProcess(args=list(prefix), returncode=returncode,
                                                 stdout=stdout, stderr=stderr))
        )

    def run(self, cmd, *args, env=None, capture_output=False, text=False, check=False, **kwargs):
        self.calls.append(_RecordedCall(cmd=list(cmd), env=dict(env) if env else {}))
        # Match longest prefix first so more specific scripts beat generic ones.
        sorted_scripts = sorted(self.scripts, key=lambda s: -len(s[0]))
        for prefix, scripted in sorted_scripts:
            if tuple(cmd[:len(prefix)]) == prefix:
                return scripted
        return subprocess.CompletedProcess(args=list(cmd), returncode=0, stdout="", stderr="")


@pytest.fixture
def recorder(monkeypatch) -> _SubprocessRecorder:
    rec = _SubprocessRecorder()
    monkeypatch.setattr(rm_module.subprocess, "run", rec.run)
    return rec


@pytest.fixture
def with_clone(tmp_path, monkeypatch) -> Path:
    """Materialize a fake `~/.agnes/marketplace/` with `.git/` and an empty
    marketplace.json so the reconcile step has something to parse."""
    clone = tmp_path / "marketplace"
    (clone / ".git").mkdir(parents=True)
    (clone / ".claude-plugin").mkdir(parents=True)
    (clone / ".claude-plugin" / "marketplace.json").write_text(
        json.dumps({"name": "agnes", "plugins": []}),
        encoding="utf-8",
    )
    monkeypatch.setattr(rm_module, "CLONE_DIR", clone)
    return clone


@pytest.fixture
def with_token(tmp_path, monkeypatch) -> str:
    cfg_dir = tmp_path / "_cfg"
    cfg_dir.mkdir(parents=True)
    (cfg_dir / "token.json").write_text(
        json.dumps({"access_token": "test-pat-1234", "email": "dev@localhost"}),
        encoding="utf-8",
    )
    monkeypatch.setenv("AGNES_CONFIG_DIR", str(cfg_dir))
    return "test-pat-1234"


@pytest.fixture
def claude_in_path(monkeypatch):
    monkeypatch.setattr(rm_module.shutil, "which", lambda name: "/fake/claude" if name == "claude" else None)


@pytest.fixture
def claude_not_in_path(monkeypatch):
    monkeypatch.setattr(rm_module.shutil, "which", lambda name: None)


def _set_marketplace_manifest(clone: Path, plugins: list[dict]) -> None:
    """Rewrite the local marketplace.json with the given plugin list.

    Each entry must have at least ``name`` and ``version`` (the reconcile
    flow ignores entries without a version since it can't compare)."""
    manifest = {"name": "agnes", "plugins": plugins}
    (clone / ".claude-plugin" / "marketplace.json").write_text(
        json.dumps(manifest), encoding="utf-8",
    )


def _plugin_list_json(entries: list[dict]) -> str:
    return json.dumps(entries)


# --- Tests ----------------------------------------------------------------------


def test_refresh_marketplace_help():
    result = runner.invoke(refresh_marketplace_app, ["--help"])
    assert result.exit_code == 0
    cleaned = _clean(result.output)
    # --check is the SessionStart-hook-friendly detector mode (replaced
    # --quiet, which used to perform a full reconcile silently).
    assert "--check" in cleaned
    assert "--bootstrap" in cleaned
    # --quiet was removed in favour of --check + the /update-agnes-plugins
    # slash command. --auto-upgrade was removed earlier (version-aware
    # reconcile is the default).
    assert "--quiet" not in cleaned
    assert "--auto-upgrade" not in cleaned


def test_refresh_marketplace_no_clone_is_silent_noop_with_check(tmp_path, monkeypatch, recorder):
    monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "nonexistent")
    result = runner.invoke(refresh_marketplace_app, ["--check"])
    assert result.exit_code == 0
    assert _clean(result.output) == ""
    assert recorder.calls == []


def test_refresh_marketplace_no_clone_explains_in_manual_mode(tmp_path, monkeypatch, recorder):
    monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "nonexistent")
    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0
    assert "No marketplace clone" in _clean(result.output)
    assert recorder.calls == []


def test_no_clone_short_circuits_before_token_check(tmp_path, monkeypatch, recorder):
    """The no-clone no-op path must NOT require a token.

    The SessionStart hook (`agnes refresh-marketplace --check`) runs in
    every workspace that has the hook installed, including ones where no
    agnes token is configured (e.g. a fresh CI checkout, a workspace
    that never went through `agnes init`, a project sharing the user's
    SessionStart settings.json without sharing their agnes config dir).
    Forcing token resolution before the no-op short-circuit would surface
    spurious auth_failed errors on those legitimate no-marketplace setups.

    Regression: an earlier rev moved the token check above the clone-
    exists check (needed it for --bootstrap), which broke CI on the
    silent-noop tests that don't seed a token.
    """
    # No token on disk, no AGNES_TOKEN env var, no clone.
    cfg_dir = tmp_path / "_cfg_empty"
    cfg_dir.mkdir()
    monkeypatch.setenv("AGNES_CONFIG_DIR", str(cfg_dir))
    monkeypatch.delenv("AGNES_TOKEN", raising=False)
    monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "nonexistent")

    # --check (hook context).
    result = runner.invoke(refresh_marketplace_app, ["--check"])
    assert result.exit_code == 0, (
        f"hook context should silent-noop without a token; got exit "
        f"{result.exit_code} and output {result.output!r}"
    )
    assert _clean(result.output) == ""
    assert recorder.calls == []

    # Manual mode (no flags): hint, but still exit 0 + no token resolution.
    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0
    assert "No marketplace clone" in _clean(result.output)
    assert recorder.calls == []


def test_refresh_marketplace_no_token_friendly_exit(with_clone, tmp_path, monkeypatch, recorder):
cfg_dir = tmp_path / "_cfg_empty"
cfg_dir.mkdir()
monkeypatch.setenv("AGNES_CONFIG_DIR", str(cfg_dir))
monkeypatch.delenv("AGNES_TOKEN", raising=False)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 1
assert "Traceback" not in (_clean(result.output) + _clean(result.stderr or ""))
assert recorder.calls == []
def test_refresh_marketplace_uses_fetch_plus_reset_not_pull(
with_clone, with_token, claude_in_path, recorder,
):
"""Server-side bare repos rebuild as orphan commits, so `git pull --ff-only`
cannot reconcile. Refresh must `git fetch + reset --hard FETCH_HEAD`."""
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
git_calls = [c for c in recorder.calls if c.cmd and c.cmd[0] == "git"]
assert len(git_calls) >= 2
fetch = git_calls[0]
assert "-c" in fetch.cmd
assert fetch.cmd[fetch.cmd.index("-c") + 1].startswith("credential.helper=")
assert "fetch" in fetch.cmd and "origin" in fetch.cmd
for arg in fetch.cmd:
assert with_token not in arg
assert fetch.env.get("AGNES_TOKEN") == with_token
reset = git_calls[1]
assert "reset" in reset.cmd and "--hard" in reset.cmd and "FETCH_HEAD" in reset.cmd
assert not any("pull" in c.cmd for c in git_calls)
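# Illustrative sketch only, not the module's actual code: one way to satisfy
# the assertions above is an inline shell credential helper that reads the PAT
# from the environment, so the token never appears in argv. The helper text,
# the "x" username, and the name `_sketch_fetch_invocation` are assumptions
# made for this example.
def _sketch_fetch_invocation(token, base_env):
    """Return (cmd, env) for a `git fetch origin` that keeps the PAT out of argv."""
    # A credential.helper value starting with "!" is run by git as a shell snippet.
    helper = '!f() { echo "username=x"; echo "password=$AGNES_TOKEN"; }; f'
    cmd = ["git", "-c", f"credential.helper={helper}", "fetch", "origin"]
    # The token travels only through the child process environment.
    env = {**base_env, "AGNES_TOKEN": token}
    return cmd, env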
def test_refresh_marketplace_calls_claude_marketplace_update_after_fetch(
with_clone, with_token, claude_in_path, recorder,
):
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
update_calls = [c for c in recorder.calls
if c.cmd[:4] == ["claude", "plugin", "marketplace", "update"]]
assert update_calls
assert update_calls[0].cmd[4] == rm_module.MARKETPLACE_NAME
def test_refresh_marketplace_skips_claude_when_not_in_path(
with_clone, with_token, claude_not_in_path, recorder,
):
"""Claude not on PATH → git fetch+reset still runs, claude steps skipped
with stderr warning, exit 0."""
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
assert any(c.cmd[:1] == ["git"] for c in recorder.calls)
assert not any(c.cmd[:1] == ["claude"] for c in recorder.calls)
assert "claude" in _clean(result.output).lower()
def test_refresh_marketplace_git_fetch_failure_exits_nonzero(
with_clone, with_token, claude_in_path, recorder,
):
recorder.script(("git", "-c"), returncode=1, stderr="fatal: unable to access ...")
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 1
assert not any(c.cmd[:1] == ["claude"] for c in recorder.calls)
# --- Version-aware reconciliation -----------------------------------------------
def test_reconcile_installs_missing_plugins(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Plugin in manifest but not installed in this workspace → install."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [
{"name": "grpn-eng", "version": "1.0.0"},
{"name": "grpn-fin", "version": "0.5.0"}, # new
])
recorder.script(
("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([
{"id": "grpn-eng@agnes", "version": "1.0.0", "projectPath": str(workspace)},
]),
)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
install_targets = sorted(
c.cmd[3] for c in recorder.calls
if c.cmd[:3] == ["claude", "plugin", "install"]
)
assert install_targets == [f"grpn-fin@{rm_module.MARKETPLACE_NAME}"]
# No update calls (version of grpn-eng matches).
update_calls = [c for c in recorder.calls if c.cmd[:3] == ["claude", "plugin", "update"]]
assert update_calls == []
def test_reconcile_updates_when_manifest_version_differs(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Plugin already installed but at older version than the manifest →
update. Critical for the /store skill+agent bundle whose version is
a content hash that bumps on every skill add/remove without changing
the plugin set."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [
{"name": "grpn-eng", "version": "1.1.0"}, # admin pushed new version
{"name": "flea", "version": "deadbeefcafef00d"}, # bundle bumped
])
recorder.script(
("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([
{"id": "grpn-eng@agnes", "version": "1.0.0", "projectPath": str(workspace)},
{"id": "flea@agnes", "version": "0123456789abcdef",
"projectPath": str(workspace)},
]),
)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
update_targets = sorted(
c.cmd[3] for c in recorder.calls
if c.cmd[:3] == ["claude", "plugin", "update"]
)
assert update_targets == [
f"flea@{rm_module.MARKETPLACE_NAME}",
f"grpn-eng@{rm_module.MARKETPLACE_NAME}",
]
# No installs (both already present).
assert not any(c.cmd[:3] == ["claude", "plugin", "install"] for c in recorder.calls)
def test_reconcile_noop_when_versions_match(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Versions all match → no install/update calls (just fetch + claude
marketplace update)."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [
{"name": "grpn-eng", "version": "1.0.0"},
])
recorder.script(
("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([
{"id": "grpn-eng@agnes", "version": "1.0.0", "projectPath": str(workspace)},
]),
)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
assert not any(c.cmd[:3] == ["claude", "plugin", "install"] for c in recorder.calls)
assert not any(c.cmd[:3] == ["claude", "plugin", "update"] for c in recorder.calls)
def test_reconcile_filters_by_project_path(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""A plugin installed in a SIBLING workspace doesn't count as installed
here — must trigger install in this workspace."""
workspace = tmp_path / "ws"
workspace.mkdir()
sibling = tmp_path / "sibling"
sibling.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [
{"name": "grpn-eng", "version": "1.0.0"},
])
recorder.script(
("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([
{"id": "grpn-eng@agnes", "version": "1.0.0", "projectPath": str(sibling)},
]),
)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
install_targets = sorted(
c.cmd[3] for c in recorder.calls
if c.cmd[:3] == ["claude", "plugin", "install"]
)
assert install_targets == [f"grpn-eng@{rm_module.MARKETPLACE_NAME}"]
def test_reconcile_skips_third_party_marketplace(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Plugins from non-agnes marketplaces must be ignored entirely
(not counted as installed, not considered for install/update)."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [
{"name": "grpn-eng", "version": "1.0.0"},
])
recorder.script(
("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([
{"id": "third-party-thing@some-other", "version": "1.0.0",
"projectPath": str(workspace)},
]),
)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
# grpn-eng must be installed (not seen as already-present).
install_targets = sorted(
c.cmd[3] for c in recorder.calls
if c.cmd[:3] == ["claude", "plugin", "install"]
)
assert install_targets == [f"grpn-eng@{rm_module.MARKETPLACE_NAME}"]
# third-party plugin must NOT be touched in any way.
assert not any(
c.cmd[:3] == ["claude", "plugin", "update"]
and c.cmd[3].startswith("third-party-thing")
for c in recorder.calls
)
def test_reconcile_handles_empty_marketplace(
with_clone, with_token, claude_in_path, recorder,
):
"""Empty manifest plugins array → no install/update calls, no warning."""
# with_clone fixture seeds an empty manifest by default.
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
assert not any(c.cmd[:3] == ["claude", "plugin", "install"] for c in recorder.calls)
assert not any(c.cmd[:3] == ["claude", "plugin", "update"] for c in recorder.calls)
def test_reconcile_warns_when_plugin_list_unparseable(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""If `claude plugin list --json` returns garbage, warn and skip
reconcile rather than fail. The fetch+reset already happened, so
Claude Code will pick up the changes naturally on next session."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [{"name": "grpn-eng", "version": "1.0.0"}])
recorder.script(("claude", "plugin", "list", "--json"),
returncode=0, stdout="not json at all")
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
assert not any(c.cmd[:3] == ["claude", "plugin", "install"] for c in recorder.calls)
assert not any(c.cmd[:3] == ["claude", "plugin", "update"] for c in recorder.calls)
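# Illustrative sketch of the reconcile rules pinned by the tests above, under
# the assumption that the plugin list has already been parsed into dicts with
# `id`, `version`, and `projectPath` keys. `_sketch_plan_reconcile` is a
# hypothetical name, and the "agnes" default stands in for
# rm_module.MARKETPLACE_NAME. The unparseable-JSON case (warn and skip) is
# deliberately left out.
def _sketch_plan_reconcile(manifest_plugins, installed, workspace, marketplace="agnes"):
    """Return (installs, updates) as sorted lists of `<name>@<marketplace>` ids."""
    suffix = f"@{marketplace}"
    # Only plugins from this marketplace AND installed in this workspace count.
    here = {
        entry["id"]: entry.get("version")
        for entry in installed
        if entry.get("id", "").endswith(suffix)
        and entry.get("projectPath") == str(workspace)
    }
    installs, updates = [], []
    for plugin in manifest_plugins:
        plugin_id = f"{plugin['name']}{suffix}"
        if plugin_id not in here:
            installs.append(plugin_id)  # missing here → install
        elif here[plugin_id] != plugin["version"]:
            updates.append(plugin_id)  # version differs from manifest → update
    return sorted(installs), sorted(updates)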
# --- Reload hint (default + slash-command chatty path) -------------------------
def test_manual_mode_prints_reload_hint_when_anything_changed(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""When `agnes refresh-marketplace` runs without --quiet AND something
actually got installed/updated, the operator needs to know they should
`/reload-plugins` in Claude Code to pick up the change. Print the hint
at end of run."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [{"name": "grpn-fin", "version": "0.5.0"}])
recorder.script(("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([]))
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
out = _clean(result.output)
assert "/reload-plugins" in out
def test_manual_mode_no_change_does_not_print_reload_hint(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Manual `agnes refresh-marketplace` over an already-up-to-date stack
must NOT spam the reload hint, because there is nothing to reload.
"Up to date" now also means the workspace `enabledPlugins` map already
matches the stack; without that seed the enable step would otherwise
flip a missing entry to `true` and legitimately request a reload.
"""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
settings_dir = workspace / ".claude"
settings_dir.mkdir()
(settings_dir / "settings.json").write_text(
json.dumps({"enabledPlugins": {"grpn-eng@agnes": True}}), encoding="utf-8",
)
_set_marketplace_manifest(with_clone, [{"name": "grpn-eng", "version": "1.0.0"}])
recorder.script(
("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([
{"id": "grpn-eng@agnes", "version": "1.0.0", "projectPath": str(workspace)},
]),
)
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
out = _clean(result.output)
assert "/reload-plugins" not in out
def test_manual_mode_does_not_emit_hook_json(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Default mode (no flags) emits human-readable text — never a JSON envelope.
Hook JSON is reserved for `--check`. The slash command runs the
default chatty path, so its output is plain prose for the user."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [{"name": "grpn-fin", "version": "0.5.0"}])
recorder.script(("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([]))
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
out = _clean(result.output)
assert "grpn-fin" in out
assert not out.strip().startswith("{"), \
f"manual mode should not emit JSON envelope; got: {out.strip()[:200]!r}"
# --- --bootstrap flag (initial install path) ------------------------------------
def test_bootstrap_flag_appears_in_help():
result = runner.invoke(refresh_marketplace_app, ["--help"])
assert result.exit_code == 0
assert "--bootstrap" in _clean(result.output)
def test_no_bootstrap_no_clone_is_noop_default(
tmp_path, monkeypatch, with_token, recorder,
):
"""Without --bootstrap, missing clone → silent no-op (manual mode hint).
No git/claude calls happen."""
monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "nonexistent")
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0
assert "No marketplace clone" in _clean(result.output)
# No subprocess calls — we exited before fetch+reset.
assert recorder.calls == []
def test_bootstrap_with_no_existing_clone_clones_and_registers(
tmp_path, monkeypatch, with_token, claude_in_path, recorder,
):
"""--bootstrap on a fresh machine (no clone yet) must:
1. git clone https://x:<PAT>@host/marketplace.git/ to CLONE_DIR
2. git remote set-url origin <token-stripped URL>
3. claude plugin marketplace add <CLONE_DIR>
4. then proceed to the normal fetch+reset+reconcile flow
PAT must be in the clone URL (HTTP Basic in user-info, the only
auth path raw `git clone` understands), but stripped from the
origin URL after the clone so it doesn't sit at rest in
.git/config."""
# `with_token` fixture already wrote token.json + set AGNES_CONFIG_DIR;
# just append the server URL config so bootstrap can read it.
cfg_dir = tmp_path / "_cfg"
(cfg_dir / "config.yaml").write_text(
"server: https://agnes.example.com\n", encoding="utf-8",
)
clone_target = tmp_path / "fresh_marketplace"
monkeypatch.setattr(rm_module, "CLONE_DIR", clone_target)
# Create the .git/ dir as a side effect of the scripted clone so the
# subsequent fetch+reset path sees a "cloned" state.
real_run = recorder.run
def fake_run(cmd, *args, **kwargs):
if cmd[:2] == ["git", "clone"]:
(clone_target / ".git").mkdir(parents=True, exist_ok=True)
(clone_target / ".claude-plugin").mkdir(parents=True, exist_ok=True)
(clone_target / ".claude-plugin" / "marketplace.json").write_text(
json.dumps({"name": "agnes", "plugins": []}),
encoding="utf-8",
)
return real_run(cmd, *args, **kwargs)
monkeypatch.setattr(rm_module.subprocess, "run", fake_run)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 0, result.output
# 1. git clone with embedded PAT.
clone_calls = [c for c in recorder.calls if c.cmd[:2] == ["git", "clone"]]
assert len(clone_calls) == 1
clone = clone_calls[0]
assert any(
with_token in arg and "agnes.example.com/marketplace.git/" in arg
for arg in clone.cmd
), f"PAT-bearing clone URL must be in argv, got: {clone.cmd}"
assert str(clone_target) in clone.cmd
# 2. remote set-url (PAT-stripped URL).
set_url_calls = [
c for c in recorder.calls
if c.cmd[:5] == ["git", "-C", str(clone_target), "remote", "set-url"]
]
assert len(set_url_calls) == 1
new_url = set_url_calls[0].cmd[6]
assert "agnes.example.com/marketplace.git/" in new_url
assert with_token not in new_url
assert "x:" not in new_url
# 3. claude plugin marketplace add <clone_target>.
add_calls = [
c for c in recorder.calls
if c.cmd[:4] == ["claude", "plugin", "marketplace", "add"]
]
assert len(add_calls) == 1
assert add_calls[0].cmd[4] == str(clone_target)
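# Illustrative sketch of step 2 in the docstring above (token-stripped origin
# URL), using only the standard library. `_sketch_strip_userinfo` is a
# hypothetical helper for this example; the real module may derive the URL
# differently.
def _sketch_strip_userinfo(url):
    """Drop any `user:password@` prefix from the URL's network location."""
    from urllib.parse import urlsplit, urlunsplit

    parts = urlsplit(url)
    # Everything after the last "@" is host[:port]; no "@" leaves netloc as-is.
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=host))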
def test_bootstrap_clone_failure_exits_nonzero(
tmp_path, monkeypatch, with_token, claude_in_path, recorder,
):
"""If `git clone` fails during bootstrap, exit non-zero and don't
proceed to fetch+reset."""
# `with_token` fixture already created _cfg + token.json; just add
# the server URL config so the bootstrap path can read it.
cfg_dir = tmp_path / "_cfg"
(cfg_dir / "config.yaml").write_text(
"server: https://agnes.example.com\n", encoding="utf-8",
)
monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "fresh_marketplace")
recorder.script(("git", "clone"), returncode=1, stderr="fatal: TLS error")
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 1
# The fetch+reset step should NOT have run (we exit on bootstrap failure).
fetch_calls = [c for c in recorder.calls if "fetch" in c.cmd and "origin" in c.cmd]
assert fetch_calls == []
def test_bootstrap_with_existing_clone_skips_clone_proceeds_to_refresh(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""--bootstrap on a machine that already has a clone must NOT re-clone
(idempotent). It just falls through to the normal fetch+reset path."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 0
# No git clone (clone already existed).
clone_calls = [c for c in recorder.calls if c.cmd[:2] == ["git", "clone"]]
assert clone_calls == []
# But fetch+reset DID happen.
fetch_calls = [c for c in recorder.calls if "fetch" in c.cmd and "origin" in c.cmd]
assert fetch_calls
reset_calls = [c for c in recorder.calls if "reset" in c.cmd and "--hard" in c.cmd]
assert reset_calls
# --- --check flag (SessionStart-hook detector mode) -----------------------------
def _stage_rev_parse(monkeypatch, recorder, *, head: str, remote_head: str) -> None:
"""Wrap recorder.run so `git rev-parse HEAD` returns the local SHA
and `git ls-remote origin HEAD` returns the remote SHA, while every
other command falls through to the recorder's normal handling.
Used by --check tests to drive the local-HEAD vs remote-HEAD
comparison independently of the (mocked) git invocation.
"""
real_run = recorder.run
def staged_run(cmd, *args, **kwargs):
if "rev-parse" in cmd:
recorder.calls.append(
_RecordedCall(cmd=list(cmd), env=dict(kwargs.get("env") or {}))
)
return subprocess.CompletedProcess(
args=list(cmd), returncode=0, stdout=head + "\n", stderr="",
)
if "ls-remote" in cmd:
recorder.calls.append(
_RecordedCall(cmd=list(cmd), env=dict(kwargs.get("env") or {}))
)
return subprocess.CompletedProcess(
args=list(cmd), returncode=0,
stdout=f"{remote_head}\tHEAD\n", stderr="",
)
return real_run(cmd, *args, **kwargs)
monkeypatch.setattr(rm_module.subprocess, "run", staged_run)
def test_check_emits_hook_json_when_remote_changed(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""`--check` + local HEAD differs from remote HEAD →
Claude Code hook JSON on stdout pointing the user at
`/update-agnes-plugins`. The hook never installs anything itself."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_stage_rev_parse(monkeypatch, recorder, head="abc123", remote_head="def456")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 0
out = _clean(result.output).strip()
assert out, "--check must emit hook JSON when remote has changes"
payload = json.loads(out)
assert "/update-agnes-plugins" in payload["systemMessage"], payload
assert "marketplace" in payload["systemMessage"].lower(), payload
assert payload["hookSpecificOutput"]["hookEventName"] == "SessionStart"
assert "/update-agnes-plugins" in payload["hookSpecificOutput"]["additionalContext"]
def test_check_silent_when_remote_unchanged(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""`--check` + local HEAD == remote HEAD → silent exit 0, no JSON
output. Avoids spamming the user with "updates available" on every
session start when nothing actually changed."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_stage_rev_parse(monkeypatch, recorder, head="samehash", remote_head="samehash")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 0
assert _clean(result.output).strip() == ""
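# Illustrative sketch of the comparison the two tests above exercise.
# `git ls-remote origin HEAD` prints `<sha>\tHEAD`, which is the shape
# `_stage_rev_parse` fakes. `_sketch_remote_changed` is a hypothetical name,
# and treating empty output as "unchanged" is an assumption of this example.
def _sketch_remote_changed(local_head, ls_remote_stdout):
    """True when the remote HEAD SHA is known and differs from the local HEAD."""
    lines = ls_remote_stdout.strip().splitlines()
    if not lines:
        return False  # nothing to compare against
    remote_sha = lines[0].split("\t", 1)[0].strip()
    return bool(remote_sha) and remote_sha != local_head.strip()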
def test_check_does_not_call_claude_plugin_anything(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""`--check` must NOT call `claude plugin install/update` or
`claude plugin marketplace update`. Those side effects belong to
the `/update-agnes-plugins` slash command, which the user runs
interactively when they're ready."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
# Even WITH a remote diff, --check must stay read-only.
_stage_rev_parse(monkeypatch, recorder, head="abc", remote_head="def")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 0
forbidden_prefixes = (
["claude", "plugin", "install"],
["claude", "plugin", "update"],
["claude", "plugin", "marketplace", "update"],
)
for prefix in forbidden_prefixes:
assert not any(c.cmd[: len(prefix)] == prefix for c in recorder.calls), (
f"--check must not invoke {' '.join(prefix)}; got: "
f"{[c.cmd for c in recorder.calls]!r}"
)
def test_check_does_not_git_reset(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""`--check` is read-only against the git tree. Must NOT call
`git reset --hard` — that would silently apply remote changes the
user hasn't agreed to yet."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_stage_rev_parse(monkeypatch, recorder, head="abc", remote_head="def")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 0
reset_calls = [c for c in recorder.calls if "reset" in c.cmd]
assert reset_calls == [], (
f"--check must not call git reset; got: {[c.cmd for c in reset_calls]!r}"
)
def test_check_runs_git_ls_remote_not_fetch(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""`--check` must use `git ls-remote origin HEAD` — one HTTPS
round-trip, no objects downloaded — and must NOT run `git fetch`.
This is the whole point of the SessionStart-hook detector: ~0.51 s
instead of ~8 s. If somebody regresses this back to fetch, this
test catches it."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_stage_rev_parse(monkeypatch, recorder, head="abc", remote_head="abc")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 0
ls_remote_calls = [
c for c in recorder.calls
if c.cmd and c.cmd[0] == "git" and "ls-remote" in c.cmd
and "origin" in c.cmd and "HEAD" in c.cmd
]
assert ls_remote_calls, (
f"--check must run `git ls-remote origin HEAD`; got: "
f"{[c.cmd for c in recorder.calls]!r}"
)
# Same credential helper wiring as the default mode — PAT in env, not argv.
ls_remote = ls_remote_calls[0]
assert "-c" in ls_remote.cmd
assert ls_remote.cmd[ls_remote.cmd.index("-c") + 1].startswith("credential.helper=")
assert ls_remote.env.get("AGNES_TOKEN") == with_token
# No `git fetch` — that's the slow path we replaced.
fetch_calls = [
c for c in recorder.calls
if c.cmd and c.cmd[0] == "git" and "fetch" in c.cmd
]
assert fetch_calls == [], (
f"--check must NOT run `git fetch` (slow path); got: "
f"{[c.cmd for c in fetch_calls]!r}"
)
def test_check_no_clone_silent_exit_zero(tmp_path, monkeypatch, with_token, recorder):
"""`--check` on a workspace without a marketplace clone → silent
exit 0 (matches the old --quiet hook no-op semantics, so workspaces
that never bootstrapped don't spam "no clone" warnings on every
session start)."""
monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "nonexistent")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 0
assert _clean(result.output).strip() == ""
assert recorder.calls == []
def test_check_ls_remote_failure_exits_one(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""A failed `git ls-remote` (network down, auth rejected, etc.) →
exit 1 so the surrounding `|| true` in the hook command swallows it
cleanly. No hook JSON is emitted (we don't know if the remote
changed)."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
# `("git", "-c")` matches the credential-helper wiring shared by
# ls-remote and fetch — fine here since ls-remote is the only git
# subprocess --check runs.
recorder.script(("git", "-c"), returncode=1, stderr="fatal: unable to access ...")
result = runner.invoke(refresh_marketplace_app, ["--check"])
assert result.exit_code == 1
# No hook JSON on failure: the `|| true` surrounding the hook command
# swallows the non-zero exit, so users don't see a half-written message.
assert not _clean(result.output).strip().startswith("{")
def test_check_and_bootstrap_are_mutually_exclusive(
tmp_path, monkeypatch, with_token, recorder,
):
"""Mixing the two modes makes no sense (one is read-only detector,
the other is destructive clone-and-reconcile). Reject the combo
with a non-zero exit instead of silently picking one."""
monkeypatch.setattr(rm_module, "CLONE_DIR", tmp_path / "fresh_marketplace")
result = runner.invoke(refresh_marketplace_app, ["--check", "--bootstrap"])
assert result.exit_code == 2
assert recorder.calls == []
# --- --bootstrap recovery: clone-exists-but-CC-not-registered -------------------
def test_bootstrap_recovers_when_clone_exists_but_cc_marketplace_missing(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Clone survived but Claude Code's registry doesn't list `agnes`
(fresh Claude Code install on the same box, manual remove, etc.).
`--bootstrap` must re-register the clone with `claude plugin
marketplace add CLONE_DIR` BEFORE falling through to fetch+reset+
`marketplace update agnes` — otherwise the update fails with
"Marketplace 'agnes' not found", which is the bug from David's
2026-05-10 init report."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
# `claude plugin marketplace list` returns ONLY the upstream Anthropic
# marketplace — no `agnes` entry. This is the state on a clean Claude
# Code install where the prior `agnes` registration got wiped.
recorder.script(
("claude", "plugin", "marketplace", "list"),
stdout=(
"Configured marketplaces:\n"
"\n"
" claude-plugins-official\n"
" Source: GitHub (anthropics/claude-plugins-official)\n"
),
)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 0, result.output
add_calls = [
c for c in recorder.calls
if c.cmd[:4] == ["claude", "plugin", "marketplace", "add"]
]
assert len(add_calls) == 1, (
f"--bootstrap with existing clone but missing CC registration must "
f"call `claude plugin marketplace add`; got: {[c.cmd for c in recorder.calls]!r}"
)
assert add_calls[0].cmd[4] == str(with_clone)
def test_bootstrap_skips_register_when_cc_marketplace_already_present(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Clone exists AND Claude Code already has `agnes` registered →
`--bootstrap` must NOT re-add (idempotent). A redundant add would
surface the `Marketplace 'agnes' already exists` error and abort
the recovery path uselessly."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
recorder.script(
("claude", "plugin", "marketplace", "list"),
stdout=(
"Configured marketplaces:\n"
"\n"
" agnes\n"
" Source: Local path (/Users/x/.agnes/marketplace)\n"
" claude-plugins-official\n"
" Source: GitHub (anthropics/claude-plugins-official)\n"
),
)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 0, result.output
add_calls = [
c for c in recorder.calls
if c.cmd[:4] == ["claude", "plugin", "marketplace", "add"]
]
assert add_calls == [], (
f"--bootstrap must not re-add when `agnes` is already registered; "
f"got: {[c.cmd for c in add_calls]!r}"
)
def test_bootstrap_does_not_false_positive_on_source_path_substring(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Regression: registry detector must not match the marketplace name
when it appears only inside a `Source: …` line of an UNRELATED
marketplace. Real-world trigger: an earlier `claude plugin marketplace
add ~/.agnes/some-other-clone` registers a different marketplace whose
Source line still mentions `.agnes`, which a naive `\\bagnes\\b` over
the full stdout would treat as `agnes` already registered. Recovery
path then skips the add and falls through to a guaranteed-broken
`marketplace update agnes`."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
# `agnes` (our marketplace name) appears ONLY in the Source path,
# never as a registered marketplace header. Recovery must add it.
recorder.script(
("claude", "plugin", "marketplace", "list"),
stdout=(
"Configured marketplaces:\n"
"\n"
" third-party-fork\n"
" Source: Local path (/Users/x/.agnes-related/marketplace)\n"
" claude-plugins-official\n"
" Source: GitHub (anthropics/claude-plugins-official)\n"
),
)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 0, result.output
add_calls = [
c for c in recorder.calls
if c.cmd[:4] == ["claude", "plugin", "marketplace", "add"]
]
assert len(add_calls) == 1, (
f"--bootstrap must not be fooled by `agnes` substring inside an "
f"unrelated `Source:` line; expected one add call, got: "
f"{[c.cmd for c in recorder.calls]!r}"
)
assert add_calls[0].cmd[4] == str(with_clone)
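# Illustrative sketch of a detector that passes the three registry tests above:
# it reads marketplace names from header lines only and never looks inside
# `Source:` lines. It assumes the `claude plugin marketplace list` layout used
# in the scripted stdout of these tests; `_sketch_registered_marketplaces` is a
# hypothetical name, not the module's function.
def _sketch_registered_marketplaces(list_stdout):
    """Return the set of marketplace names listed as headers in the output."""
    names = set()
    for raw in list_stdout.splitlines():
        line = raw.strip()
        # Skip blanks, the "Configured marketplaces:" banner, and Source lines.
        if not line or line.endswith(":") or line.startswith("Source:"):
            continue
        names.add(line)
    return names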
def test_bootstrap_marketplace_add_failure_is_fatal_on_fresh_clone(
tmp_path, monkeypatch, with_token, claude_in_path, recorder,
):
"""`claude plugin marketplace add` failure during fresh-clone bootstrap
must be fatal — silent warn-and-continue is the bug that caused David's
init report to cascade into 4× `Marketplace 'agnes' not found` plugin
install errors. Returning non-zero with the actual `add` stderr is the
signal operators need to fix their machine state."""
cfg_dir = tmp_path / "_cfg"
(cfg_dir / "config.yaml").write_text(
"server: https://agnes.example.com\n", encoding="utf-8",
)
clone_target = tmp_path / "fresh_marketplace"
monkeypatch.setattr(rm_module, "CLONE_DIR", clone_target)
real_run = recorder.run
def fake_run(cmd, *args, **kwargs):
if cmd[:2] == ["git", "clone"]:
(clone_target / ".git").mkdir(parents=True, exist_ok=True)
(clone_target / ".claude-plugin").mkdir(parents=True, exist_ok=True)
(clone_target / ".claude-plugin" / "marketplace.json").write_text(
json.dumps({"name": "agnes", "plugins": []}),
encoding="utf-8",
)
return real_run(cmd, *args, **kwargs)
monkeypatch.setattr(rm_module.subprocess, "run", fake_run)
recorder.script(
("claude", "plugin", "marketplace", "add"),
returncode=1,
stderr="error: filesystem path is not readable",
)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 1, result.output
# Fetch+reset must NOT have run after the fatal add failure.
fetch_calls = [c for c in recorder.calls if "fetch" in c.cmd and "origin" in c.cmd]
assert fetch_calls == [], (
f"bootstrap must abort on `add` failure; fetch should not run, got: "
f"{[c.cmd for c in fetch_calls]!r}"
)
def test_bootstrap_recovery_add_failure_is_fatal_on_existing_clone(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""When the recovery path (clone exists, CC registry empty) tries to
re-add and `claude plugin marketplace add` fails, exit non-zero
instead of pressing on to a guaranteed-broken `marketplace update`."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
recorder.script(
("claude", "plugin", "marketplace", "list"),
stdout="Configured marketplaces:\n\n claude-plugins-official\n",
)
recorder.script(
("claude", "plugin", "marketplace", "add"),
returncode=1,
stderr="error: not a directory",
)
result = runner.invoke(refresh_marketplace_app, ["--bootstrap"])
assert result.exit_code == 1
# `marketplace update agnes` must NOT have run — that's the cascade we're
# cutting off.
update_calls = [
c for c in recorder.calls
if c.cmd[:4] == ["claude", "plugin", "marketplace", "update"]
]
assert update_calls == [], (
f"recovery must abort before `marketplace update` when add fails; got: "
f"{[c.cmd for c in update_calls]!r}"
)
# --- enabledPlugins workspace-settings write -----------------------------------
#
# Refresh's reconcile step doesn't just register plugins in the global
# `~/.claude/plugins/installed_plugins.json`; it also has to write
# `enabledPlugins["<name>@agnes"] = true` into the workspace
# `.claude/settings.json`. Without that entry, Claude Code treats the
# plugin as disabled regardless of registry presence. These tests pin the
# helper's contract end-to-end through the Typer command, since the helper
# touches the filesystem and is easier to verify via the real settings.json
# state than via additional mocking.
def _read_workspace_settings(workspace: Path) -> dict:
settings_path = workspace / ".claude" / "settings.json"
return json.loads(settings_path.read_text(encoding="utf-8"))
def test_enable_writes_missing_key_to_workspace_settings(
with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
"""Fresh workspace with no `.claude/settings.json` → refresh creates the
file with `enabledPlugins` populated from the manifest."""
workspace = tmp_path / "ws"
workspace.mkdir()
monkeypatch.chdir(workspace)
_set_marketplace_manifest(with_clone, [
{"name": "grpn", "version": "1.0.0"},
{"name": "grpn-data", "version": "1.1.0"},
])
recorder.script(("claude", "plugin", "list", "--json"),
stdout=_plugin_list_json([]))
result = runner.invoke(refresh_marketplace_app, [])
assert result.exit_code == 0, result.output
settings = _read_workspace_settings(workspace)
assert settings.get("enabledPlugins") == {
"grpn@agnes": True,
"grpn-data@agnes": True,
}


def test_enable_writes_to_existing_settings_preserving_other_keys(
    with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
    """Workspace already has settings.json with hooks/model/permissions.
    Refresh must add `enabledPlugins` without disturbing existing keys."""
    workspace = tmp_path / "ws"
    workspace.mkdir()
    monkeypatch.chdir(workspace)
    settings_dir = workspace / ".claude"
    settings_dir.mkdir()
    pre_existing = {
        "model": "sonnet",
        "permissions": {"allow": ["Read", "Bash"]},
        "hooks": {
            "SessionStart": [
                {"hooks": [{"type": "command", "command": "echo hi"}]},
            ],
        },
    }
    (settings_dir / "settings.json").write_text(
        json.dumps(pre_existing, indent=2), encoding="utf-8",
    )

    _set_marketplace_manifest(with_clone, [{"name": "grpn", "version": "1.0.0"}])
    recorder.script(("claude", "plugin", "list", "--json"),
                    stdout=_plugin_list_json([]))

    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0, result.output

    settings = _read_workspace_settings(workspace)
    assert settings["model"] == "sonnet"
    assert settings["permissions"] == {"allow": ["Read", "Bash"]}
    assert settings["hooks"] == pre_existing["hooks"]
    assert settings["enabledPlugins"] == {"grpn@agnes": True}


def test_enable_overrides_local_false_back_to_true(
    with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
    """User locally `claude plugin disable`-d a stack plugin (enabledPlugins
    has `false`). Stack is source of truth → refresh re-enables it."""
    workspace = tmp_path / "ws"
    workspace.mkdir()
    monkeypatch.chdir(workspace)
    settings_dir = workspace / ".claude"
    settings_dir.mkdir()
    (settings_dir / "settings.json").write_text(
        json.dumps({"enabledPlugins": {"grpn@agnes": False}}), encoding="utf-8",
    )

    _set_marketplace_manifest(with_clone, [{"name": "grpn", "version": "1.0.0"}])
    recorder.script(("claude", "plugin", "list", "--json"),
                    stdout=_plugin_list_json([
                        {"id": "grpn@agnes", "version": "1.0.0",
                         "projectPath": str(workspace)},
                    ]))

    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0, result.output

    settings = _read_workspace_settings(workspace)
    assert settings["enabledPlugins"] == {"grpn@agnes": True}
    # Re-enabled → reload hint should fire (even though no install/update).
    assert "/reload-plugins" in _clean(result.output)


def test_enable_is_idempotent_when_already_true(
    with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
    """Every plugin in manifest already `true` in settings → refresh must
    not rewrite the file (mtime stable) and must not advertise enable
    events."""
    workspace = tmp_path / "ws"
    workspace.mkdir()
    monkeypatch.chdir(workspace)
    settings_dir = workspace / ".claude"
    settings_dir.mkdir()
    settings_path = settings_dir / "settings.json"
    settings_path.write_text(
        json.dumps({"enabledPlugins": {"grpn@agnes": True}}, indent=2),
        encoding="utf-8",
    )
    mtime_before = settings_path.stat().st_mtime_ns

    _set_marketplace_manifest(with_clone, [{"name": "grpn", "version": "1.0.0"}])
    recorder.script(("claude", "plugin", "list", "--json"),
                    stdout=_plugin_list_json([
                        {"id": "grpn@agnes", "version": "1.0.0",
                         "projectPath": str(workspace)},
                    ]))

    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0, result.output

    settings = _read_workspace_settings(workspace)
    assert settings["enabledPlugins"] == {"grpn@agnes": True}
    assert settings_path.stat().st_mtime_ns == mtime_before, (
        "no-op refresh must not rewrite settings.json"
    )
    # No install/update/enable changes → no reload hint.
    assert "/reload-plugins" not in _clean(result.output)
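

# Illustrative sketch only (hypothetical helper, not the production code):
# one way to satisfy the mtime assertion above is to compare the parsed
# on-disk JSON with the desired state and skip the write when they match.
# It relies on the `json` and `Path` imports this module already uses; the
# indent and trailing newline are assumptions, not the real file format.
def _sketch_write_settings_if_changed(settings_path, desired):
    """Write ``desired`` as JSON only when it differs; return True if written."""
    if settings_path.exists():
        current = json.loads(settings_path.read_text(encoding="utf-8"))
        if current == desired:
            return False  # no-op: file untouched, so its mtime stays stable
    # Fresh workspaces may not have a `.claude` directory yet.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(
        json.dumps(desired, indent=2) + "\n", encoding="utf-8",
    )
    return True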


def test_enable_preserves_non_agnes_plugins_in_map(
    with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
    """Workspace's `enabledPlugins` contains entries from other marketplaces
    (e.g. coupons-team-skills). Refresh must not touch those keys; it only
    adds/sets `@agnes` entries."""
    workspace = tmp_path / "ws"
    workspace.mkdir()
    monkeypatch.chdir(workspace)
    settings_dir = workspace / ".claude"
    settings_dir.mkdir()
    (settings_dir / "settings.json").write_text(
        json.dumps({"enabledPlugins": {
            "coupons-skills@coupons-team-skills": True,
            "platform-tools@coupons-team-skills": False,  # user disabled
        }}),
        encoding="utf-8",
    )

    _set_marketplace_manifest(with_clone, [{"name": "grpn", "version": "1.0.0"}])
    recorder.script(("claude", "plugin", "list", "--json"),
                    stdout=_plugin_list_json([]))

    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0, result.output

    settings = _read_workspace_settings(workspace)
    assert settings["enabledPlugins"] == {
        "coupons-skills@coupons-team-skills": True,
        "platform-tools@coupons-team-skills": False,
        "grpn@agnes": True,
    }


def test_enable_runs_regardless_of_override_sentinel(
    with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
    """`refresh-marketplace` is a runtime command: it ignores the
    Initial Workspace Template sentinel and updates `enabledPlugins`
    even in admin-templated (override: true) workspaces. The sentinel
    governs `agnes init` skip only; runtime must keep the workspace in
    sync with the user's current marketplace stack."""
    workspace = tmp_path / "ws"
    workspace.mkdir()
    monkeypatch.chdir(workspace)
    settings_dir = workspace / ".claude"
    settings_dir.mkdir()
    # Admin-managed sentinel; it must NOT block runtime enable.
    (settings_dir / "init-complete").write_text(
        "completed_at: 2026-05-13T14:32:00Z\n"
        "agnes_version: 0.53.0\n"
        "override: true\n",
        encoding="utf-8",
    )
    # No pre-existing settings.json: refresh creates one with enabledPlugins.

    _set_marketplace_manifest(with_clone, [{"name": "grpn", "version": "1.0.0"}])
    recorder.script(("claude", "plugin", "list", "--json"),
                    stdout=_plugin_list_json([]))

    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0, result.output

    settings = _read_workspace_settings(workspace)
    assert settings.get("enabledPlugins") == {"grpn@agnes": True}


def test_reload_hint_printed_when_only_enable_changes(
    with_clone, with_token, claude_in_path, recorder, monkeypatch, tmp_path,
):
    """Nothing to install/update, but enable map had a stale `false` entry
    → refresh flips it to `true` and prints the /reload-plugins hint so
    the user knows to reload the running session."""
    workspace = tmp_path / "ws"
    workspace.mkdir()
    monkeypatch.chdir(workspace)
    settings_dir = workspace / ".claude"
    settings_dir.mkdir()
    (settings_dir / "settings.json").write_text(
        json.dumps({"enabledPlugins": {"grpn@agnes": False}}), encoding="utf-8",
    )

    _set_marketplace_manifest(with_clone, [{"name": "grpn", "version": "1.0.0"}])
    recorder.script(("claude", "plugin", "list", "--json"),
                    stdout=_plugin_list_json([
                        {"id": "grpn@agnes", "version": "1.0.0",
                         "projectPath": str(workspace)},
                    ]))

    result = runner.invoke(refresh_marketplace_app, [])
    assert result.exit_code == 0, result.output

    out = _clean(result.output)
    assert "/reload-plugins" in out
    # No install or update should have been triggered.
    assert not any(c.cmd[:3] == ["claude", "plugin", "install"] for c in recorder.calls)
    assert not any(c.cmd[:3] == ["claude", "plugin", "update"] for c in recorder.calls)