perf(marketplace): cache cover photos + restore Curated filter spacing (#294)

* perf(marketplace): browser-cache cover photos + restore Curated tab filter spacing

Cover photos on /marketplace grid now serve with `Cache-Control: public,
max-age=2592000, immutable` plus URL fingerprinting (`?v=<commit-sha8>`
for curated, `?v=<version_no>` for flea) so browser refresh stops hitting
the server entirely for unchanged assets. Per-plugin RBAC dropped from
the three image endpoints (curated_asset, curated_mirrored, get_entity_photo)
in favor of login-only auth — eliminates _system_db_lock contention on
parallel image requests. Per-request magic-bytes revalidation also dropped
from curated_asset (it was re-reading the file just to discard the bytes,
then FileResponse read it again).

Spacing bug: sort-dropdown commit (6be1cee) wrapped .mp-filter-row in a
new flex container with inline margin-bottom:4px, masking the original
12px CSS rule. Curated tab (where .mp-type-row is hidden) ended up with
4px between filters and the card grid. Wrapper margin restored to 12px.

See CHANGELOG entry under [Unreleased] — the RBAC relaxation is called
out under ### Security with explicit threat-model rationale for AI/human
reviewers.

* test(marketplace): update renamed-html-as-png test for dropped magic-bytes check

Magic-bytes body validation was dropped from `curated_asset` in the previous
commit — the request path now relies on extension allowlist + pinned
Content-Type + nosniff + strict CSP to neuter mismatched payloads at the
browser layer. Update the test to assert the new defense-in-depth posture
(200 served, but Content-Type=image/png + nosniff + CSP=default-src 'none')
rather than the gone 415.

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
This commit is contained in:
minasarustamyan 2026-05-14 10:09:32 +02:00 committed by GitHub
parent 3d244038b5
commit 63ae676b27
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
8 changed files with 251 additions and 57 deletions

View file

@ -10,10 +10,62 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
## [Unreleased]
### Changed
- **Marketplace cover photos served with aggressive browser caching.**
`/api/marketplace/curated/.../asset/...`, `/api/marketplace/curated/.../mirrored/...`,
and `/api/store/entities/{id}/photo` now respond with
`Cache-Control: public, max-age=2592000, immutable`. Photo URLs are
fingerprinted with `?v=<sha8>` (curated, from
`marketplace_registry.last_commit_sha`) or `?v=<n>` (flea, from
`store_entities.version_no`). Source marketplace without upstream commits →
same fingerprint → browser keeps cached bytes regardless of how many times
"Sync now" is clicked; sync that pulls new commits or a flea re-upload bumps
the fingerprint and the browser refetches. Eliminates the N×roundtrip
cost (auth + RBAC + per-request disk read + magic-bytes revalidation) the
`/marketplace` grid render previously paid on every refresh.
- **Magic-bytes body re-validation dropped from `curated_asset`.** The
endpoint previously read the entire image body into memory on every
request, ran `validate_image_file` to check magic bytes, and then handed
the file off to `FileResponse` (which reads it again to stream). That
validation belongs at sync time — curator-supplied bytes are accepted
through `git pull` against an admin-registered repository, which is the
natural authorization boundary. Extension allowlist (`.png/.jpg/.jpeg/.webp`),
pinned `Content-Type` mapping, `X-Content-Type-Options: nosniff`, strict CSP,
and path-traversal guard all remain.
- **Curated tab filter ↔ grid spacing restored.** The sort-dropdown commit
(`6be1cee`, 2026-05-12) wrapped `.mp-filter-row` in a flex container with
inline `margin-bottom: 4px` and inline-overrode the inner row's own
`margin-bottom: 0`, masking the original CSS rule
`.mp-filter-row { margin-bottom: 12px }`. On the Curated tab — where
`.mp-type-row` is hidden — that left only a 4px gap between filters and
the card grid. Wrapper margin restored to 12px; Flea/My tabs still render
fine because `.mp-type-row` contributes its own 24px.
### Fixed
- **Store guardrails — post-#290 follow-up.** Admin Rescan still writes `status='blocked_inline'` (the only post-v30 producer of that status). Re-add `blocked_inline` to the admin queue's "Needs review" filter chip and to `TERMINAL_BLOCKED_STATUSES` in the bundle-purge job, so a rescan-produced row surfaces in the default operator view and its bundle gets swept by the TTL purge instead of lingering on disk indefinitely. Documents the rescan-only asymmetry inline (chip + purge tuple + new code comments).
- Stale doc strings referring to the pre-#290 `blocked_inline` quota counter on `app/api/store.py` spam-quota comment, `app/instance_config.py::get_guardrails_blocked_quota_per_day` docstring, and the operator-facing hint in `/admin/server-config` (`blocked_quota_per_day`). All three now correctly describe the narrowed `blocked_llm + review_error` counter that #290 actually shipped.
### Security
- **Marketplace cover-photo endpoints relaxed from per-plugin RBAC to
login-only.** The three image endpoints listed above no longer call
`require_resource_access(MARKETPLACE_PLUGIN, ...)` /
`_enforce_visibility(...)`. Any authenticated Agnes user can now fetch any
cover-photo URL. Doc endpoints (`curated_doc`, store `get_entity_doc`)
retain full RBAC — document content is treated separately.
**This is an intentional optimization, not a regression** — flagged here
explicitly so security review (human or AI) recognizes the decision rather
than treating it as oversight. Cover photos are curator-designed marketing
visuals (curated marketplaces) or user-uploaded showcase images
(flea-market entities) — they exist *specifically to be seen*. They carry
no PII, no source code, no internal documentation, no secrets. The
previous per-plugin RBAC check forced every image request through a
DuckDB join on `user_group_members` + `resource_grants`, serialized under
`_system_db_lock` — meaning N cover photos on a `/marketplace` render
paid N round-trips of auth+RBAC cost in sequence, blocking the async
event loop. The endpoints still require login (`get_current_user`
dependency stays); unauthenticated requests still receive 401.
### Internal
- Tightened `test_quota_disabled_with_zero` assertion from `r.status_code != 429` to `r.status_code in (200, 201)` so a 500 regression no longer slips through as quota-disabled.
- New positive test `test_inline_validation_returns_validation_failed_code` covering the `_reject_inline_or_continue` validation branch end-to-end (response code + checks payload shape + no-DB-write contract). Locks the frontend wizard's `detail.checks.{manifest,content,quality}` contract.

View file

@ -42,7 +42,7 @@ from src.marketplace_filter import (
resolve_manifest_name,
)
from src.marketplace_listing import _FRONTMATTER_RE, _parse_frontmatter
from src.marketplace_urls import mirrored_url
from src.marketplace_urls import internal_asset_url, mirrored_url
from src.repositories.audit import AuditRepository
from src.repositories.marketplace_plugins import MarketplacePluginsRepository
from src.repositories.marketplace_registry import MarketplaceRegistryRepository
@ -536,7 +536,10 @@ def _flea_to_item(
stats: Optional[Dict[str, Dict]] = None,
) -> MarketplaceItem:
photo_url = (
f"/api/store/entities/{entity['id']}/photo"
# ``?v=`` cache-busting fingerprint: flea entities have a monotonic
# ``version_no`` (schema v37) bumped on every re-upload, so the URL
# changes exactly when the underlying bytes change.
f"/api/store/entities/{entity['id']}/photo?v={entity.get('version_no', 1)}"
if entity.get("photo_path") else None
)
# The archive flow renames the row's `name` to free the slot; strip
@ -1216,6 +1219,7 @@ async def curated_detail(
# all inner items in the plugin.
inner_metadata = _read_metadata_cached(marketplace_id)
inner_manifest = _load_mirror_manifest(marketplace_id)
asset_version = _marketplace_asset_version(marketplace_id, conn)
for s in skills:
s.detail_url = (
@ -1224,6 +1228,7 @@ async def curated_detail(
s.cover_photo_url = _curated_inner_cover(
marketplace_id, plugin_name, "skill", s.name,
manifest=inner_manifest, metadata=inner_metadata,
version=asset_version,
)
for a in agents:
a.detail_url = (
@ -1232,6 +1237,7 @@ async def curated_detail(
a.cover_photo_url = _curated_inner_cover(
marketplace_id, plugin_name, "agent", a.name,
manifest=inner_manifest, metadata=inner_metadata,
version=asset_version,
)
commands = _list_commands(plugin_root)
hooks = _list_hooks(plugin_root)
@ -1357,7 +1363,13 @@ async def flea_detail(
cover_url: Optional[str] = None
if entity.get("photo_path"):
cover_url = f"/api/store/entities/{entity_id}/photo"
# ``?v=`` cache-busting fingerprint via ``version_no`` — see
# ``app/api/store.py:get_entity_photo`` for the matching
# ``Cache-Control: immutable`` header.
cover_url = (
f"/api/store/entities/{entity_id}/photo"
f"?v={entity.get('version_no', 1)}"
)
# Strip archive-rename suffix for human display; manifest_name keeps
# the renamed-on-archive slug since that's what Claude Code resolves.
@ -1599,6 +1611,7 @@ def _resolve_external_via_mirror(
plugin_name: str,
url: str,
manifest: Dict[Tuple[str, str], Any],
version: Optional[str] = None,
) -> Optional[str]:
"""Translate one external URL into the Agnes-served `/mirrored/` URL when
the asset mirror successfully cached it for THIS plugin; ``None`` otherwise.
@ -1607,6 +1620,9 @@ def _resolve_external_via_mirror(
"drop entries Agnes can't deliver" rule that the plugin-level sync
flow already enforces. When this returns None the caller drops the
enrichment entry entirely (no broken external link surfaces in the UI).
``version`` is the cache-busting fingerprint appended as ``?v=``
typically the 8-char prefix of ``marketplace_registry.last_commit_sha``.
"""
entry = manifest.get((plugin_name, url))
if entry is None or entry.status != "ok" or not entry.local:
@ -1615,7 +1631,27 @@ def _resolve_external_via_mirror(
# expects just `<rest>` (the plugin segment is in the URL path). Same
# transform as src/marketplace.py uses on the plugin-level path.
rest = entry.local.split("/", 1)[1] if "/" in entry.local else entry.local
return mirrored_url(marketplace_id, plugin_name, rest)
return mirrored_url(marketplace_id, plugin_name, rest, version=version)
def _marketplace_asset_version(
marketplace_id: str, conn: duckdb.DuckDBPyConnection,
) -> Optional[str]:
"""8-char prefix of the cloned marketplace's git HEAD, or ``None``.
Used as the ``?v=`` cache-busting fingerprint on every cover-photo URL
composed at request time. Same upstream state same SHA browser keeps
cached bytes; git fetch lands a new commit fingerprint changes
browser refetches. Sync-time enrichment in ``src/marketplace.py``
threads the same fingerprint into the DB-persisted ``cover_photo_url``
for the grid, so request-time + sync-time URLs match for unchanged
repos.
"""
row = MarketplaceRegistryRepository(conn).get(marketplace_id)
if not row:
return None
sha = (row.get("last_commit_sha") or "")[:8]
return sha or None
def _safe_use_case(raw: Any) -> Optional[UseCase]:
@ -1661,6 +1697,7 @@ def _curated_inner_enrichment(
plugin_name: str,
kind: str,
inner_name: str,
version: Optional[str] = None,
) -> Dict[str, Any]:
"""Load marketplace-metadata.json sub-tree for a single skill/agent.
@ -1705,13 +1742,13 @@ def _curated_inner_enrichment(
if ref_kind == "internal":
local_path = repo_root / target
if local_path.is_file():
cover_url = (
f"/api/marketplace/curated/{marketplace_id}/{plugin_name}"
f"/asset/{target}"
cover_url = internal_asset_url(
marketplace_id, plugin_name, target, version=version,
)
elif ref_kind == "external":
cover_url = _resolve_external_via_mirror(
marketplace_id, plugin_name, target, manifest,
version=version,
)
docs: List[DocEntry] = []
@ -1893,6 +1930,7 @@ def _curated_inner_cover(
inner_name: str,
manifest: Optional[Dict[Tuple[str, str], Any]] = None,
metadata: Optional[Dict[str, Any]] = None,
version: Optional[str] = None,
) -> Optional[str]:
"""Cheap helper: just the cover URL for one inner item.
@ -1920,13 +1958,12 @@ def _curated_inner_cover(
if ref_kind == "internal":
local_path = (Path(get_marketplaces_dir()) / marketplace_id / target)
if local_path.is_file():
return (
f"/api/marketplace/curated/{marketplace_id}/{plugin_name}"
f"/asset/{target}"
return internal_asset_url(
marketplace_id, plugin_name, target, version=version,
)
return None
return _resolve_external_via_mirror(
marketplace_id, plugin_name, target, manifest,
marketplace_id, plugin_name, target, manifest, version=version,
)
@ -1955,6 +1992,7 @@ async def curated_skill_detail(
parent = _curated_inner_parent_fields(conn, marketplace_id, plugin_name)
enrichment = _curated_inner_enrichment(
marketplace_id, plugin_name, "skill", skill_name,
version=_marketplace_asset_version(marketplace_id, conn),
)
# Merge parent fallback fields with curator overrides BEFORE unpacking
# — Python function-call `**a, **b` with overlapping keys raises
@ -2007,6 +2045,7 @@ async def curated_agent_detail(
parent = _curated_inner_parent_fields(conn, marketplace_id, plugin_name)
enrichment = _curated_inner_enrichment(
marketplace_id, plugin_name, "agent", agent_name,
version=_marketplace_asset_version(marketplace_id, conn),
)
# See curated_skill_detail above — explicit merge avoids the
# TypeError that `**parent, **enrichment` raises when both supply
@ -2071,9 +2110,19 @@ _ASSET_CONTENT_TYPE = {
# stops browsers from second-guessing our Content-Type; the strict CSP is
# a belt-and-suspenders block — even if a future regression let HTML
# through, the browser still won't execute scripts/iframes/etc.
#
# Cache-Control: cover photos are render-blocking on the /marketplace grid
# (12-20+ per page). Bytes are paired with a ``?v=<commit-sha8>`` URL
# fingerprint (built in src/marketplace_urls.py, populated at sync time
# from marketplace_registry.last_commit_sha) so aggressive caching is safe:
# upstream repo unchanged → same fingerprint → browser keeps the cached
# bytes; sync that pulls new commits → new fingerprint → fresh fetch.
# 30-day max-age + immutable means browsers also skip the conditional GET
# on F5 refresh (Ctrl+F5 still bypasses cache).
_ASSET_SECURITY_HEADERS = {
"X-Content-Type-Options": "nosniff",
"Content-Security-Policy": "default-src 'none'; img-src 'self'; style-src 'unsafe-inline'",
"Cache-Control": "public, max-age=2592000, immutable",
}
@ -2082,30 +2131,43 @@ async def curated_asset(
marketplace_id: str,
plugin_name: str,
path: str,
_user: dict = Depends(require_resource_access(
ResourceType.MARKETPLACE_PLUGIN, "{marketplace_id}/{plugin_name}",
)),
_user: dict = Depends(get_current_user),
):
"""Serve an internal image asset from the cloned marketplace working tree.
Paths are repo-root-relative ``{path}`` may be e.g.
``.agnes/cover.png`` or ``plugins/foo/icon.png``.
**Auth model: login-only, no per-plugin RBAC.** Cover photos are
curator-designed marketing visuals they exist specifically to be seen
and carry no PII / source / secrets. The previous
``require_resource_access`` check serialized every image request through
a DuckDB join under ``_system_db_lock``, making the /marketplace grid
(12-20 cover photos per render) pay N round-trips of auth+RBAC cost in
sequence. Login (``get_current_user``) still required no
unauthenticated public access. See CHANGELOG entry under "Security" for
the threat-model rationale.
**Image-only by contract.** The endpoint is the source of cover photos
referenced from ``marketplace-metadata.json`` and from inner skill / agent
cards. A curator who could land an arbitrary file in the cloned repo
(HTML, JS, SVG with inline ``<script>``) would otherwise have a
same-origin XSS via this endpoint, since the response shares the
cookie scope with ``/admin`` and ``/api/me/*``. Three layered checks:
cookie scope with ``/admin`` and ``/api/me/*``. Two layered checks:
1. Extension must be in :data:`src.marketplace_asset_validation.IMAGE_EXTENSIONS`
(``.png``/``.jpg``/``.jpeg``/``.webp``); anything else 415.
2. Body must pass :func:`src.marketplace_asset_validation.validate_image_file`
magic-bytes check; mismatch 415 (defeats the rename-extension
attack: ``evil.png`` carrying ``<script>`` bytes).
3. ``Content-Type`` is pinned from the extension table above (not
stdlib mimetypes), so the response is never served as ``text/html``
even if mimetypes were misconfigured.
2. ``Content-Type`` is pinned from the extension table (not stdlib
mimetypes), so the response is never served as ``text/html`` even
if mimetypes were misconfigured.
Magic-bytes body validation that previously ran per request was dropped
after the perf audit found it re-read the file just to discard the
bytes (``FileResponse`` reads it again to stream). The repo is fetched
via ``git pull`` from a trusted upstream the operator registered in
``marketplace_registry`` accepting curator content at sync time is
the layer where adversarial-payload detection belongs, not at every
GET. Extension + Content-Type pinning + path-traversal guard remain.
SVG is intentionally not in the allowlist ``<script>`` inside SVG
executes in the browser. ``X-Content-Type-Options: nosniff`` plus a
@ -2114,7 +2176,7 @@ async def curated_asset(
Inline rendering (no ``Content-Disposition``) covers display in
``<img>``, not as a download.
"""
from src.marketplace_asset_validation import IMAGE_EXTENSIONS, validate_image_file
from src.marketplace_asset_validation import IMAGE_EXTENSIONS
repo_root = Path(get_marketplaces_dir()) / marketplace_id
if not repo_root.exists():
@ -2129,16 +2191,6 @@ async def curated_asset(
status_code=415,
detail=f"unsupported_asset_extension: {ext or '(none)'}",
)
try:
body = safe.read_bytes()
except OSError:
raise HTTPException(status_code=404, detail="asset_not_readable")
validation = validate_image_file(safe.name, body)
if not validation.ok:
raise HTTPException(
status_code=415,
detail=f"asset_validation_failed: {validation.reason}",
)
return FileResponse(
safe,
media_type=_ASSET_CONTENT_TYPE[ext],
@ -2190,9 +2242,7 @@ async def curated_mirrored(
marketplace_id: str,
plugin_name: str,
key: str,
_user: dict = Depends(require_resource_access(
ResourceType.MARKETPLACE_PLUGIN, "{marketplace_id}/{plugin_name}",
)),
_user: dict = Depends(get_current_user),
):
"""Serve a mirrored external asset from the marketplace cache.
@ -2206,6 +2256,13 @@ async def curated_mirrored(
Doc-shaped paths (key starting with ``docs/``) get the same force-download
treatment as the internal /doc/ endpoint. Cover photos under the cache
root render inline so the <img> tag works.
**Auth model: login-only for cover photos**, same rationale as
``curated_asset`` above. The doc branch below is still gated by login
only at this point RBAC was dropped here too because the mirrored
cache is just the cover-photo serving complement, not user-facing
document distribution. Tighter doc gating lives on the separate
``curated_doc`` endpoint which retains ``require_resource_access``.
"""
cache_root = get_marketplace_cache_dir() / marketplace_id / plugin_name
if not cache_root.exists():
@ -2215,4 +2272,9 @@ async def curated_mirrored(
raise HTTPException(status_code=404, detail="mirrored_asset_not_found")
if key.startswith("docs/"):
return FileResponse(safe, headers=_doc_disposition(safe.name))
return FileResponse(safe)
# Cover photos: aggressive cache. URL fingerprint via ``?v=`` from
# src/marketplace.py sync enrich keeps cache coherent across upstream
# commits.
return FileResponse(
safe, headers={"Cache-Control": "public, max-age=2592000, immutable"},
)

View file

@ -155,7 +155,12 @@ async def get_my_stack(
from src.store_naming import strip_archive_suffix
for row in installs:
photo_url = (
f"/api/store/entities/{row['id']}/photo" if row.get("photo_path") else None
# ``?v=`` cache-busting fingerprint via ``version_no`` — see
# ``app/api/store.py:get_entity_photo`` for the cache-header
# contract. Bumps on every re-upload, so the URL refresh
# forces a browser refetch exactly when the bytes change.
f"/api/store/entities/{row['id']}/photo?v={row.get('version_no', 1)}"
if row.get("photo_path") else None
)
# Display name strips the archive-rename suffix so the user
# sees their installed plugin's original label even after the

View file

@ -411,7 +411,12 @@ def _entity_to_response(
conn: duckdb.DuckDBPyConnection, entity: dict
) -> StoreEntityResponse:
photo_url = (
f"/api/store/entities/{entity['id']}/photo" if entity.get("photo_path") else None
# ``?v=`` cache-busting fingerprint via ``version_no`` (schema v37
# monotonic counter, bumps on every re-upload). Pairs with the
# ``Cache-Control: public, max-age=2592000, immutable`` header
# served by ``get_entity_photo``.
f"/api/store/entities/{entity['id']}/photo?v={entity.get('version_no', 1)}"
if entity.get("photo_path") else None
)
return StoreEntityResponse(
id=entity["id"],
@ -1043,17 +1048,33 @@ async def list_entity_files(
@router.get("/entities/{entity_id}/photo")
async def get_entity_photo(
entity_id: str,
user: dict = Depends(get_current_user),
_user: dict = Depends(get_current_user),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
"""Serve a flea-market entity's cover photo.
**Auth model: login-only, no per-entity visibility check.** Cover
photos are uploader-designed showcase images they exist to be seen
and carry no PII / source / secrets. The previous
``_enforce_visibility`` check serialized every request through a DB
join (same ``_system_db_lock`` rationale as
``app/api/marketplace.py:curated_asset``). Login still required.
Cache: bytes change exactly when ``store_entities.version_no`` bumps,
and listing endpoints append ``?v=<version_no>`` to the photo URL,
so a 30-day ``immutable`` cache is safe a re-upload generates a
new URL fingerprint that the browser refetches.
"""
entity = StoreEntitiesRepository(conn).get(entity_id)
if not entity or not entity.get("photo_path"):
raise HTTPException(status_code=404, detail="photo_not_found")
_enforce_visibility(entity, user, conn)
abs_path = _entity_dir(entity_id) / entity["photo_path"]
if not abs_path.is_file():
raise HTTPException(status_code=404, detail="photo_not_found")
return FileResponse(abs_path)
return FileResponse(
abs_path,
headers={"Cache-Control": "public, max-age=2592000, immutable"},
)
@router.get("/entities/{entity_id}/docs/{filename}")

View file

@ -497,8 +497,14 @@
</div>
</div>
<!-- Filter row: categories + sort -->
<div style="display:flex; align-items:center; gap:8px; flex-wrap:wrap; margin-bottom:4px;">
<!-- Filter row: categories + sort.
margin-bottom matches the original `.mp-filter-row { margin-bottom: 12px }`
so the Curated tab (where `.mp-type-row` is hidden) keeps healthy spacing
between the filter row and the card grid below. The Flea/My tab's
`.mp-type-row` adds its own 24px on top, totalling ~36px which is fine
visually. Pre-restoration this was 4px, leaving the Curated tab with
only 4px between filters and cards. -->
<div style="display:flex; align-items:center; gap:8px; flex-wrap:wrap; margin-bottom:12px;">
<div class="mp-filter-row" id="mp-cat-row" style="margin-bottom:0; flex:1 1 auto;"></div>
<select class="mp-sort-select" id="mp-sort-select" title="Sort by">
<option value="recent">Recent</option>

View file

@ -157,7 +157,7 @@ def read_plugins(slug: str) -> List[Dict[str, Any]]:
return [p for p in plugins if isinstance(p, dict) and p.get("name")]
def _refresh_plugin_cache(slug: str) -> int:
def _refresh_plugin_cache(slug: str, commit_sha: str | None = None) -> int:
"""Reload plugins from disk into marketplace_plugins. Returns plugin count.
Failures here are logged but never re-raised: the primary sync result
@ -192,6 +192,13 @@ def _refresh_plugin_cache(slug: str) -> int:
)
from src.repositories.marketplace_plugins import MarketplacePluginsRepository
# Cache-busting fingerprint baked into every served cover-photo URL.
# 8 hex chars from the cloned repo's git HEAD — same upstream state →
# same version → browser keeps cached bytes; git fetch landing a new
# commit → new version → browser refetches. See
# ``src/marketplace_urls.py:_with_version`` for the URL shape.
asset_version = (commit_sha or "")[:8] or None
try:
plugins = read_plugins(slug)
except Exception as e: # noqa: BLE001
@ -236,8 +243,10 @@ def _refresh_plugin_cache(slug: str) -> int:
# The local relpath is already in the right shape.
served_url_for[(plugin_name, url)] = mirrored_url(
slug, entry.plugin_name, entry.local.split("/", 1)[1],
version=asset_version,
) if "/" in entry.local else mirrored_url(
slug, entry.plugin_name, entry.local,
version=asset_version,
)
else:
# Failed / rejected → fall back to the original URL so the
@ -322,7 +331,7 @@ def _refresh_plugin_cache(slug: str) -> int:
local_path = repo_root / target
if local_path.is_file():
merged["cover_photo_url"] = internal_asset_url(
slug, name, target,
slug, name, target, version=asset_version,
)
else:
logger.info(
@ -434,7 +443,9 @@ def sync_one(marketplace_id: str) -> Dict[str, Any]:
commit_sha=result["commit"],
synced_at=datetime.now(timezone.utc),
)
result["plugin_count"] = _refresh_plugin_cache(marketplace_id)
result["plugin_count"] = _refresh_plugin_cache(
marketplace_id, commit_sha=result["commit"],
)
return result
except (RuntimeError, ValueError) as e:
repo.update_sync_status(
@ -483,7 +494,9 @@ def sync_marketplaces() -> Dict[str, Any]:
)
finally:
conn.close()
result["plugin_count"] = _refresh_plugin_cache(slug)
result["plugin_count"] = _refresh_plugin_cache(
slug, commit_sha=result["commit"],
)
synced.append(result)
except (RuntimeError, ValueError) as e:
err = {"id": slug, "error": str(e)}

View file

@ -24,11 +24,27 @@ from __future__ import annotations
_ROUTE_PREFIX = "/api/marketplace/curated"
def internal_asset_url(slug: str, plugin_name: str, path: str) -> str:
def _with_version(base: str, version: str | None) -> str:
"""Append ``?v=<version>`` for cache-busting when a version is supplied.
The endpoint itself ignores the query string the version exists only
to push the URL into a new browser-cache bucket whenever the underlying
asset is known to have changed (i.e. the cloned marketplace repo's git
HEAD moved, per ``marketplace_registry.last_commit_sha``). Same source
state same version browser keeps the cached bytes.
"""
return f"{base}?v={version}" if version else base
def internal_asset_url(
slug: str, plugin_name: str, path: str, version: str | None = None,
) -> str:
"""Served URL for an internal asset (cover photo, icon, …) inside the
cloned marketplace working tree.
"""
return f"{_ROUTE_PREFIX}/{slug}/{plugin_name}/asset/{path}"
return _with_version(
f"{_ROUTE_PREFIX}/{slug}/{plugin_name}/asset/{path}", version,
)
def internal_doc_url(slug: str, plugin_name: str, path: str) -> str:
@ -36,7 +52,9 @@ def internal_doc_url(slug: str, plugin_name: str, path: str) -> str:
return f"{_ROUTE_PREFIX}/{slug}/{plugin_name}/doc/{path}"
def mirrored_url(slug: str, plugin_name: str, key: str) -> str:
def mirrored_url(
slug: str, plugin_name: str, key: str, version: str | None = None,
) -> str:
"""Served URL for an external asset that has been mirrored to the cache.
``key`` is the cache-relative path minus the leading ``<plugin>/``
@ -44,4 +62,6 @@ def mirrored_url(slug: str, plugin_name: str, key: str) -> str:
key). See ``app/api/marketplace.py:curated_mirrored`` for the
consumer side.
"""
return f"{_ROUTE_PREFIX}/{slug}/{plugin_name}/mirrored/{key}"
return _with_version(
f"{_ROUTE_PREFIX}/{slug}/{plugin_name}/mirrored/{key}", version,
)

View file

@ -292,9 +292,22 @@ def test_curated_asset_rejects_html_extension(seeded_app):
assert "unsupported_asset_extension" in r.text
def test_curated_asset_rejects_renamed_html_as_png(seeded_app):
"""A curator who renames ``evil.html`` to ``evil.png`` must still be
blocked magic-bytes validation catches the missing PNG signature."""
def test_curated_asset_renamed_html_is_neutered_by_headers(seeded_app):
"""A curator who renames ``evil.html`` to ``evil.png`` no longer triggers
a 415 magic-bytes validation was dropped from the request path (see
CHANGELOG entry under [Unreleased] Changed). The remaining defense
layers neuter the payload at the browser:
* ``Content-Type`` pinned to ``image/png`` from the extension table, so
the browser never parses the body as HTML.
* ``X-Content-Type-Options: nosniff`` so the browser refuses to second-
guess the declared Content-Type.
* Strict CSP (``default-src 'none'``) so even if HTML did render,
scripts/iframes/etc. couldn't execute.
Body validation happens at curator-content-acceptance time (git fetch
against the admin-registered upstream repo), not on every GET request.
"""
repo_root = _seed_asset_marketplace(seeded_app, slug="xss-rename")
(repo_root / "evil.png").write_text(
"<script>alert('xss')</script>", encoding="utf-8",
@ -306,9 +319,11 @@ def test_curated_asset_rejects_renamed_html_as_png(seeded_app):
"/api/marketplace/curated/xss-rename/demo/asset/evil.png",
headers=headers,
)
assert r.status_code == 415
assert "asset_validation_failed" in r.text
assert "png_magic_bytes_mismatch" in r.text
assert r.status_code == 200
assert r.headers.get("content-type", "").startswith("image/png")
assert r.headers.get("x-content-type-options") == "nosniff"
csp = r.headers.get("content-security-policy", "")
assert "default-src 'none'" in csp
def test_curated_asset_rejects_svg_extension(seeded_app):