* feat(store): flea-market upload guardrails + soft delete + JOIN-based admin queue
Adds an end-to-end guardrails pipeline for store uploads (manifest +
static-security + LLM review), persists blocked bundles for forensics,
introduces soft-delete (Archive) semantics, consolidates the legacy
/store/{id} surface into /marketplace/flea/{id}, and reworks the admin
queue so lifecycle filters read live entity visibility via LEFT JOIN
rather than a denormalized submission column.
Schema v29 → v35:
* v29 store_submissions table + store_entities.visibility_status
* v30 file_size, bundle_sha256, bundle_purged_at on submissions
* v31 reshape store_submissions (drop legacy unique on entity_id)
* v32 store_entities.archived_at/by + 'archived' visibility value
* v33 drop store_submissions.retry_count (unused)
* v34 ensure idx_store_submissions_entity exists post column-drop
* v35 broaden visibility_status enum + JOIN architecture cutover
Pipeline (src/store_guardrails/):
* Inline checks: manifest_check, static_scan, quality_check
* LLM review configurable haiku|sonnet|opus (default haiku)
* BackgroundTasks-driven async path with structured-output JSON
* Per-submitter daily quota (default 50)
* 30-day TTL purge job (POST /api/admin/run-blocked-purge)
* Bundle SHA256 + size persisted; sha256 survives purge for forensics
Visibility model:
* pending | approved | hidden | archived
* _enforce_visibility returns 404 (no leak) for non-owner non-admin
* Owner sees own non-approved entries via include_owner_id widening
* Install refused with 409 entity_not_approved when not approved
Soft-delete (DELETE /api/store/entities/{id}):
* Default = soft (visibility_status='archived'); existing installs
keep getting served the bundle so users don't lose the plugin
* ?hard=true admin-only: drops bundle + cascades user_store_installs
* Hard-delete preserves entity_id on submission as tombstone so
audit_log linkage survives for the activity timeline
Admin queue lifecycle (the JOIN refactor):
* Verdict (store_submissions.status) is immutable forensic record
* Lifecycle (store_entities.visibility_status) is live state
* /admin/store/submissions Archived chip translates to
`e.visibility_status='archived'` via LEFT JOIN — any path that
flips visibility surfaces in the queue immediately
* Detail page renders Status (verdict) and Entity lifecycle side by
side so admins see "approved at review, now archived" at a glance
URL consolidation:
* /store/{id} deleted (no redirect, stale bookmarks 404)
* /marketplace/flea/{id} is the canonical detail surface
* Three in-tree callers (upload-success, my-stack card, store
listing card) updated to point at the new URL
* Quarantine banner extracted to _quarantine_banner.html partial,
self-guarded, included from both flea detail templates
* Banner JS auto-refreshes when the verdict lands by polling
/api/marketplace/flea/{id}/detail (visibility_status +
submission_status — the latter is needed because blocked_llm
keeps the entity at visibility_status='pending')
Audit log resource format:
* runner.py emits prefixed `store_submission:{id}` (post-fix)
* Detail-page timeline query handles three patterns: prefixed
submission, helper-emitted `store_entity:{sub_id}`, and bare-id
legacy rows — all surface in the activity timeline
UX fixes:
* Owner sees Under review / Quarantined / Hidden banner with status
* Install button gray-disabled (not blue) when non-approved
* Owner cannot delete quarantined entries (403); admin can
* Admin queue: filter chips, sortable columns, paging, page-size
* Auto-refresh queue every 5s while pending rows are visible
* Store upload page file picker no longer opens twice (label →
input default action collided with explicit JS handler)
Tests: 168 passed across the guardrails suites (admin submissions,
store API, inline / LLM / purge guardrails, store repositories,
marketplace filter, schema version). New regression coverage
includes: archive surfaces via JOIN even when API path is bypassed;
deleted submission renders activity timeline (tombstone); flea
detail surfaces submission_status only for owner/admin; detail page
renders Entity lifecycle row; audit log resource format covers both
helper and runner paths.
* fix(store-guardrails): PR #233 follow-up — prompt injection, atomic PUT, BG race, schema, reaper, sort whitelist
Addresses 9 of the 23 findings from the PR #233 review (spec at
docs/superpowers/specs/2026-05-09-pr233-guardrails-fixes-spec.md).
Merge-gate items #1-#6 plus high-value mediums #7, #9-#12, #23.
Architectural items (#8 enum split, #14 factory) and pure
maintainability (#15-#22) deferred to follow-ups.
Security:
* #1 prompt injection — SYSTEM_PROMPT now passed via the SDK's
dedicated system= parameter; bundle wrapped in <bundle>...</bundle>
sentinels declared data-only by the system prompt; literal
sentinel strings in user content are escaped so an adversarial
README can't forge a close tag.
* #6 static scan honesty — module docstring + admin copy + docs
declare static scan as signal not gate; .md/.txt/.rst/.html/.json/
.yaml/.yml/.toml skipped to avoid false positives on prose.
AST mode for Python deferred (separate flag, FP comparison work).
Correctness:
* #2 PUT atomicity — bundles bake into plugin.staging-<rand>/
alongside live, atomic-rename on success; failed checks leave
live tree byte-for-byte intact.
* #3 BG-task race — set_visibility_if_pending guards verdict flips
to the (pending, hidden) review window; admin archives during
review survive; skipped flips audit-logged.
* #4 v35 NOT NULL/DEFAULT — schema v35→v36 re-applies them on
store_entities.visibility_status. CHECK constraint enforced
application-side (DuckDB ADD CHECK on existing column unsupported).
* #7 stuck-review reaper — reap_stuck_llm_reviews flips pending_llm
rows older than guardrails.stuck_review_grace_seconds (default
1800) to review_error. Scheduler runs every 15 min via new
/api/admin/run-reap-stuck-reviews. Set knob to 0 to disable.
* #9 quota counter — count_blocked_for_submitter_since now counts
blocked_inline + blocked_llm + review_error so a submitter
triggering only LLM-blocked verdicts is bounded.
* #10 missing risk_level — surfaces as review_error with
error='missing_risk_level' instead of silently defaulting to
'medium' (which looked like a model-decided block).
* #11 archived_at clear — set_visibility nulls archived_at +
archived_by when transitioning out of 'archived' so a future
read doesn't show stale archive forensics on an approved row.
Maintainability:
* #12 FSM doc comment — accurate insert/transition/lifecycle
description in src/db.py near store_submissions schema.
* #23 sort-key whitelist — admin queue rejects unknown sort keys
with 400 invalid_sort_key; substring-replace footgun removed.
Deferred (separate PRs):
* #5 quota race — proper fix requires asyncio.Lock spanning the
full pipeline; threading.Lock blocks event loop, DuckDB MVCC
doesn't help. API-level slowapi bounds worst case for now.
* #6 part 3 (AST static scan), #8 (enum split), #13 (import
bundle docs), #14 (factory consolidation), #15-#22 (maint).
Tests:
* New: tests/test_store_guardrails_prompt_injection.py (corpus +
trust-boundary invariants), tests/test_store_put_atomic.py,
tests/test_store_guardrails_reaper.py.
* Extended: test_store_guardrails_llm.py (system param, missing
risk_level, BG race), test_admin_store_submissions.py (quota
counter widening, sort whitelist 400), test_store_repositories.py
(un-archive metadata clear), test_db_schema_version.py (v36).
* Full suite: 3738 passed; 17 pre-existing baseline failures
unchanged (db migration tests, cli binary rename, catalog export,
user mgmt v5 backfill — confirmed by stash + rerun on clean tree).
127 lines
5 KiB
Python
127 lines
5 KiB
Python
"""Naming helpers for Store entities.
|
|
|
|
The marketplace served to Claude Code is flat: skill / agent / plugin names
|
|
must be globally unique within a user's view, otherwise Claude Code resolves
|
|
the second-loaded entity over the first. To prevent collisions across
|
|
different Store owners uploading entities with the same display name, every
|
|
Store-derived plugin is suffixed with the owner's sanitized email-local-part
|
|
(``-by-<username>``) at upload time.
|
|
|
|
The username is **snapshotted on the entity row** at upload — it does not
|
|
auto-update if the owner's email changes later. Per product spec, emails are
|
|
stable in this deployment; we don't refactor on email rename.
|
|
"""
|
|
|
|
from __future__ import annotations
|
|
|
|
import hashlib
|
|
import re
|
|
from pathlib import Path
|
|
from typing import Iterable
|
|
|
|
_SANITIZE_RE = re.compile(r"[^a-z0-9-]+")
|
|
_DASH_COLLAPSE_RE = re.compile(r"-+")
|
|
|
|
|
|
def sanitize_username(email: str) -> str:
|
|
"""Convert an email to a Claude-Code-safe username slug.
|
|
|
|
Takes the local-part (everything before the first ``@``), lowercases it,
|
|
replaces every run of non-``[a-z0-9-]`` characters with a single ``-``,
|
|
collapses repeats, and trims leading/trailing dashes.
|
|
|
|
sanitize_username("alice_smith@example.com") -> "alice-smith"
|
|
sanitize_username("john.doe+claude@acme.com") -> "john-doe-claude"
|
|
sanitize_username("USER@example.com") -> "user"
|
|
|
|
Raises ``ValueError`` if the local-part sanitizes to an empty string —
|
|
callers (the upload endpoint) translate that to a 400.
|
|
|
|
Note: this mapping is **many-to-one** — ``alice.smith@x`` and
|
|
``alice_smith@x`` both yield ``alice-smith``. The Store namespace is
|
|
flat in Claude Code, so two such users uploading entities with the
|
|
same display name would produce identical ``<name>-by-<username>``
|
|
suffixes and collide in the served marketplace + bundle. The upload
|
|
endpoint enforces global uniqueness on the suffixed value via
|
|
``app.api.store._suffixed_already_taken`` and rejects the second one
|
|
with 409 ``conflict_global_suffix``; the per-owner UNIQUE on
|
|
``store_entities(owner_user_id, name)`` alone does not catch this.
|
|
"""
|
|
local = email.split("@", 1)[0].lower()
|
|
s = _SANITIZE_RE.sub("-", local)
|
|
s = _DASH_COLLAPSE_RE.sub("-", s).strip("-")
|
|
if not s:
|
|
raise ValueError(f"email local-part sanitizes to empty: {email!r}")
|
|
return s
|
|
|
|
|
|
def suffixed_name(original_name: str, username: str) -> str:
|
|
"""``<original-name>-by-<username>`` — the display+invocation name baked
|
|
into Store-derived plugin/skill/agent files at upload time.
|
|
"""
|
|
return f"{original_name}-by-{username}"
|
|
|
|
|
|
# v36+: archive renames the entity's `name` to free the (owner, name)
|
|
# slot and the global suffix slot for re-upload. The marker is a
|
|
# fixed-token plus epoch suffix so display-strip and is-archived
|
|
# detection are deterministic. Uploaders are blocked from picking
|
|
# a literal name matching this pattern via _NAME_RE in
|
|
# app/api/store.py.
|
|
_ARCHIVE_MARKER = "__archived__"
|
|
_ARCHIVE_NAME_RE = re.compile(rf"{_ARCHIVE_MARKER}\d+$")
|
|
|
|
|
|
def make_archive_name(original_name: str, archived_at_epoch: int) -> str:
|
|
"""Compute the suffixed name to write into ``store_entities.name``
|
|
when an entity transitions to ``visibility_status='archived'``.
|
|
|
|
The suffix frees the per-owner (owner_user_id, name) UNIQUE slot
|
|
AND the global ``<name>-by-<username>`` slug slot, so the owner
|
|
can re-upload under the original name without picking a new one.
|
|
Existing installers see the renamed slug on the next sync.
|
|
"""
|
|
return f"{original_name}{_ARCHIVE_MARKER}{int(archived_at_epoch)}"
|
|
|
|
|
|
def is_archived_name(name: str) -> bool:
|
|
"""Whether ``name`` carries the archive-rename suffix."""
|
|
return bool(_ARCHIVE_NAME_RE.search(name or ""))
|
|
|
|
|
|
def strip_archive_suffix(name: str) -> str:
|
|
"""Return the display form of a possibly-archived ``name``.
|
|
|
|
No-op when the input doesn't carry the archive suffix. Used by
|
|
admin queue + my-stack templates so the user-facing label shows
|
|
the original name with an "Archived" badge instead of the ugly
|
|
suffix.
|
|
"""
|
|
if not name:
|
|
return name
|
|
return _ARCHIVE_NAME_RE.sub("", name)
|
|
|
|
|
|
def compute_entity_version(plugin_dir: Path) -> str:
|
|
"""Content-addressed version for a Store entity's plugin tree.
|
|
|
|
Hashes every regular file under ``plugin_dir`` in sorted-relative-path
|
|
order, including each file's relative path in the digest so that a rename
|
|
counts as a content change. Returns the first 16 hex chars of the SHA-256
|
|
— short enough for human-readable use in plugin.json ``version`` and
|
|
audit messages, long enough to be collision-free in practice.
|
|
"""
|
|
h = hashlib.sha256()
|
|
for f in _iter_files(plugin_dir):
|
|
rel = f.relative_to(plugin_dir).as_posix()
|
|
h.update(rel.encode("utf-8"))
|
|
h.update(b"\x00")
|
|
h.update(f.read_bytes())
|
|
h.update(b"\x00")
|
|
return h.hexdigest()[:16]
|
|
|
|
|
|
def _iter_files(root: Path) -> Iterable[Path]:
|
|
if not root.is_dir():
|
|
return []
|
|
return sorted(p for p in root.rglob("*") if p.is_file())
|