diff --git a/CHANGELOG.md b/CHANGELOG.md index 761da94..5f00d80 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,10 +10,62 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C ## [Unreleased] +### Changed +- **Marketplace cover photos served with aggressive browser caching.** + `/api/marketplace/curated/.../asset/...`, `/api/marketplace/curated/.../mirrored/...`, + and `/api/store/entities/{id}/photo` now respond with + `Cache-Control: public, max-age=2592000, immutable`. Photo URLs are + fingerprinted with `?v=` (curated, from + `marketplace_registry.last_commit_sha`) or `?v=` (flea, from + `store_entities.version_no`). Source marketplace without upstream commits → + same fingerprint → browser keeps cached bytes regardless of how many times + "Sync now" is clicked; sync that pulls new commits or a flea re-upload bumps + the fingerprint and the browser refetches. Eliminates the N×roundtrip + cost (auth + RBAC + per-request disk read + magic-bytes revalidation) the + `/marketplace` grid render previously paid on every refresh. +- **Magic-bytes body re-validation dropped from `curated_asset`.** The + endpoint previously read the entire image body into memory on every + request, ran `validate_image_file` to check magic bytes, and then handed + the file off to `FileResponse` (which reads it again to stream). That + validation belongs at sync time — curator-supplied bytes are accepted + through `git pull` against an admin-registered repository, which is the + natural authorization boundary. Extension allowlist (`.png/.jpg/.jpeg/.webp`), + pinned `Content-Type` mapping, `X-Content-Type-Options: nosniff`, strict CSP, + and path-traversal guard all remain. +- **Curated tab filter ↔ grid spacing restored.** The sort-dropdown commit + (`6be1cee`, 2026-05-12) wrapped `.mp-filter-row` in a flex container with + inline `margin-bottom: 4px` and inline-overrode the inner row's own + `margin-bottom: 0`, masking the original CSS rule + `.mp-filter-row { margin-bottom: 12px }`. On the Curated tab — where + `.mp-type-row` is hidden — that left only a 4px gap between filters and + the card grid. Wrapper margin restored to 12px; Flea/My tabs still render + fine because `.mp-type-row` contributes its own 24px. + ### Fixed - **Store guardrails — post-#290 follow-up.** Admin Rescan still writes `status='blocked_inline'` (the only post-v30 producer of that status). Re-add `blocked_inline` to the admin queue's "Needs review" filter chip and to `TERMINAL_BLOCKED_STATUSES` in the bundle-purge job, so a rescan-produced row surfaces in the default operator view and its bundle gets swept by the TTL purge instead of lingering on disk indefinitely. Documents the rescan-only asymmetry inline (chip + purge tuple + new code comments). - Stale doc strings referring to the pre-#290 `blocked_inline` quota counter on `app/api/store.py` spam-quota comment, `app/instance_config.py::get_guardrails_blocked_quota_per_day` docstring, and the operator-facing hint in `/admin/server-config` (`blocked_quota_per_day`). All three now correctly describe the narrowed `blocked_llm + review_error` counter that #290 actually shipped. +### Security +- **Marketplace cover-photo endpoints relaxed from per-plugin RBAC to + login-only.** The three image endpoints listed above no longer call + `require_resource_access(MARKETPLACE_PLUGIN, ...)` / + `_enforce_visibility(...)`. Any authenticated Agnes user can now fetch any + cover-photo URL. Doc endpoints (`curated_doc`, store `get_entity_doc`) + retain full RBAC — document content is treated separately. + + **This is an intentional optimization, not a regression** — flagged here + explicitly so security review (human or AI) recognizes the decision rather + than treating it as oversight. Cover photos are curator-designed marketing + visuals (curated marketplaces) or user-uploaded showcase images + (flea-market entities) — they exist *specifically to be seen*. They carry + no PII, no source code, no internal documentation, no secrets. The + previous per-plugin RBAC check forced every image request through a + DuckDB join on `user_group_members` + `resource_grants`, serialized under + `_system_db_lock` — meaning N cover photos on a `/marketplace` render + paid N round-trips of auth+RBAC cost in sequence, blocking the async + event loop. The endpoints still require login (`get_current_user` + dependency stays); unauthenticated requests still receive 401. + ### Internal - Tightened `test_quota_disabled_with_zero` assertion from `r.status_code != 429` to `r.status_code in (200, 201)` so a 500 regression no longer slips through as quota-disabled. - New positive test `test_inline_validation_returns_validation_failed_code` covering the `_reject_inline_or_continue` validation branch end-to-end (response code + checks payload shape + no-DB-write contract). Locks the frontend wizard's `detail.checks.{manifest,content,quality}` contract. diff --git a/app/api/marketplace.py b/app/api/marketplace.py index ca188d7..c16f463 100644 --- a/app/api/marketplace.py +++ b/app/api/marketplace.py @@ -42,7 +42,7 @@ from src.marketplace_filter import ( resolve_manifest_name, ) from src.marketplace_listing import _FRONTMATTER_RE, _parse_frontmatter -from src.marketplace_urls import mirrored_url +from src.marketplace_urls import internal_asset_url, mirrored_url from src.repositories.audit import AuditRepository from src.repositories.marketplace_plugins import MarketplacePluginsRepository from src.repositories.marketplace_registry import MarketplaceRegistryRepository @@ -536,7 +536,10 @@ def _flea_to_item( stats: Optional[Dict[str, Dict]] = None, ) -> MarketplaceItem: photo_url = ( - f"/api/store/entities/{entity['id']}/photo" + # ``?v=`` cache-busting fingerprint: flea entities have a monotonic + # ``version_no`` (schema v37) bumped on every re-upload, so the URL + # changes exactly when the underlying bytes change. + f"/api/store/entities/{entity['id']}/photo?v={entity.get('version_no', 1)}" if entity.get("photo_path") else None ) # The archive flow renames the row's `name` to free the slot; strip @@ -1216,6 +1219,7 @@ async def curated_detail( # all inner items in the plugin. inner_metadata = _read_metadata_cached(marketplace_id) inner_manifest = _load_mirror_manifest(marketplace_id) + asset_version = _marketplace_asset_version(marketplace_id, conn) for s in skills: s.detail_url = ( @@ -1224,6 +1228,7 @@ async def curated_detail( s.cover_photo_url = _curated_inner_cover( marketplace_id, plugin_name, "skill", s.name, manifest=inner_manifest, metadata=inner_metadata, + version=asset_version, ) for a in agents: a.detail_url = ( @@ -1232,6 +1237,7 @@ async def curated_detail( a.cover_photo_url = _curated_inner_cover( marketplace_id, plugin_name, "agent", a.name, manifest=inner_manifest, metadata=inner_metadata, + version=asset_version, ) commands = _list_commands(plugin_root) hooks = _list_hooks(plugin_root) @@ -1357,7 +1363,13 @@ async def flea_detail( cover_url: Optional[str] = None if entity.get("photo_path"): - cover_url = f"/api/store/entities/{entity_id}/photo" + # ``?v=`` cache-busting fingerprint via ``version_no`` — see + # ``app/api/store.py:get_entity_photo`` for the matching + # ``Cache-Control: immutable`` header. + cover_url = ( + f"/api/store/entities/{entity_id}/photo" + f"?v={entity.get('version_no', 1)}" + ) # Strip archive-rename suffix for human display; manifest_name keeps # the renamed-on-archive slug since that's what Claude Code resolves. @@ -1599,6 +1611,7 @@ def _resolve_external_via_mirror( plugin_name: str, url: str, manifest: Dict[Tuple[str, str], Any], + version: Optional[str] = None, ) -> Optional[str]: """Translate one external URL into the Agnes-served `/mirrored/` URL when the asset mirror successfully cached it for THIS plugin; ``None`` otherwise. @@ -1607,6 +1620,9 @@ def _resolve_external_via_mirror( "drop entries Agnes can't deliver" rule that the plugin-level sync flow already enforces. When this returns None the caller drops the enrichment entry entirely (no broken external link surfaces in the UI). + + ``version`` is the cache-busting fingerprint appended as ``?v=`` — + typically the 8-char prefix of ``marketplace_registry.last_commit_sha``. """ entry = manifest.get((plugin_name, url)) if entry is None or entry.status != "ok" or not entry.local: @@ -1615,7 +1631,27 @@ def _resolve_external_via_mirror( # expects just `` (the plugin segment is in the URL path). Same # transform as src/marketplace.py uses on the plugin-level path. rest = entry.local.split("/", 1)[1] if "/" in entry.local else entry.local - return mirrored_url(marketplace_id, plugin_name, rest) + return mirrored_url(marketplace_id, plugin_name, rest, version=version) + + +def _marketplace_asset_version( + marketplace_id: str, conn: duckdb.DuckDBPyConnection, +) -> Optional[str]: + """8-char prefix of the cloned marketplace's git HEAD, or ``None``. + + Used as the ``?v=`` cache-busting fingerprint on every cover-photo URL + composed at request time. Same upstream state → same SHA → browser keeps + cached bytes; git fetch lands a new commit → fingerprint changes → + browser refetches. Sync-time enrichment in ``src/marketplace.py`` + threads the same fingerprint into the DB-persisted ``cover_photo_url`` + for the grid, so request-time + sync-time URLs match for unchanged + repos. + """ + row = MarketplaceRegistryRepository(conn).get(marketplace_id) + if not row: + return None + sha = (row.get("last_commit_sha") or "")[:8] + return sha or None def _safe_use_case(raw: Any) -> Optional[UseCase]: @@ -1661,6 +1697,7 @@ def _curated_inner_enrichment( plugin_name: str, kind: str, inner_name: str, + version: Optional[str] = None, ) -> Dict[str, Any]: """Load marketplace-metadata.json sub-tree for a single skill/agent. @@ -1705,13 +1742,13 @@ def _curated_inner_enrichment( if ref_kind == "internal": local_path = repo_root / target if local_path.is_file(): - cover_url = ( - f"/api/marketplace/curated/{marketplace_id}/{plugin_name}" - f"/asset/{target}" + cover_url = internal_asset_url( + marketplace_id, plugin_name, target, version=version, ) elif ref_kind == "external": cover_url = _resolve_external_via_mirror( marketplace_id, plugin_name, target, manifest, + version=version, ) docs: List[DocEntry] = [] @@ -1893,6 +1930,7 @@ def _curated_inner_cover( inner_name: str, manifest: Optional[Dict[Tuple[str, str], Any]] = None, metadata: Optional[Dict[str, Any]] = None, + version: Optional[str] = None, ) -> Optional[str]: """Cheap helper: just the cover URL for one inner item. @@ -1920,13 +1958,12 @@ def _curated_inner_cover( if ref_kind == "internal": local_path = (Path(get_marketplaces_dir()) / marketplace_id / target) if local_path.is_file(): - return ( - f"/api/marketplace/curated/{marketplace_id}/{plugin_name}" - f"/asset/{target}" + return internal_asset_url( + marketplace_id, plugin_name, target, version=version, ) return None return _resolve_external_via_mirror( - marketplace_id, plugin_name, target, manifest, + marketplace_id, plugin_name, target, manifest, version=version, ) @@ -1955,6 +1992,7 @@ async def curated_skill_detail( parent = _curated_inner_parent_fields(conn, marketplace_id, plugin_name) enrichment = _curated_inner_enrichment( marketplace_id, plugin_name, "skill", skill_name, + version=_marketplace_asset_version(marketplace_id, conn), ) # Merge parent fallback fields with curator overrides BEFORE unpacking # — Python function-call `**a, **b` with overlapping keys raises @@ -2007,6 +2045,7 @@ async def curated_agent_detail( parent = _curated_inner_parent_fields(conn, marketplace_id, plugin_name) enrichment = _curated_inner_enrichment( marketplace_id, plugin_name, "agent", agent_name, + version=_marketplace_asset_version(marketplace_id, conn), ) # See curated_skill_detail above — explicit merge avoids the # TypeError that `**parent, **enrichment` raises when both supply @@ -2071,9 +2110,19 @@ _ASSET_CONTENT_TYPE = { # stops browsers from second-guessing our Content-Type; the strict CSP is # a belt-and-suspenders block — even if a future regression let HTML # through, the browser still won't execute scripts/iframes/etc. +# +# Cache-Control: cover photos are render-blocking on the /marketplace grid +# (12-20+ per page). Bytes are paired with a ``?v=`` URL +# fingerprint (built in src/marketplace_urls.py, populated at sync time +# from marketplace_registry.last_commit_sha) so aggressive caching is safe: +# upstream repo unchanged → same fingerprint → browser keeps the cached +# bytes; sync that pulls new commits → new fingerprint → fresh fetch. +# 30-day max-age + immutable means browsers also skip the conditional GET +# on F5 refresh (Ctrl+F5 still bypasses cache). _ASSET_SECURITY_HEADERS = { "X-Content-Type-Options": "nosniff", "Content-Security-Policy": "default-src 'none'; img-src 'self'; style-src 'unsafe-inline'", + "Cache-Control": "public, max-age=2592000, immutable", } @@ -2082,30 +2131,43 @@ async def curated_asset( marketplace_id: str, plugin_name: str, path: str, - _user: dict = Depends(require_resource_access( - ResourceType.MARKETPLACE_PLUGIN, "{marketplace_id}/{plugin_name}", - )), + _user: dict = Depends(get_current_user), ): """Serve an internal image asset from the cloned marketplace working tree. Paths are repo-root-relative — ``{path}`` may be e.g. ``.agnes/cover.png`` or ``plugins/foo/icon.png``. + **Auth model: login-only, no per-plugin RBAC.** Cover photos are + curator-designed marketing visuals — they exist specifically to be seen + and carry no PII / source / secrets. The previous + ``require_resource_access`` check serialized every image request through + a DuckDB join under ``_system_db_lock``, making the /marketplace grid + (12-20 cover photos per render) pay N round-trips of auth+RBAC cost in + sequence. Login (``get_current_user``) still required → no + unauthenticated public access. See CHANGELOG entry under "Security" for + the threat-model rationale. + **Image-only by contract.** The endpoint is the source of cover photos referenced from ``marketplace-metadata.json`` and from inner skill / agent cards. A curator who could land an arbitrary file in the cloned repo (HTML, JS, SVG with inline ``", encoding="utf-8", @@ -306,9 +319,11 @@ def test_curated_asset_rejects_renamed_html_as_png(seeded_app): "/api/marketplace/curated/xss-rename/demo/asset/evil.png", headers=headers, ) - assert r.status_code == 415 - assert "asset_validation_failed" in r.text - assert "png_magic_bytes_mismatch" in r.text + assert r.status_code == 200 + assert r.headers.get("content-type", "").startswith("image/png") + assert r.headers.get("x-content-type-options") == "nosniff" + csp = r.headers.get("content-security-policy", "") + assert "default-src 'none'" in csp def test_curated_asset_rejects_svg_extension(seeded_app):