From cd030287766d65d9a1349ed3f1ee59bc3d681c9e Mon Sep 17 00:00:00 2001
From: Vojtech <119944107+cvrysanek@users.noreply.github.com>
Date: Sat, 16 May 2026 09:12:29 +0400
Subject: [PATCH] fix(store): restore reuses prior approved verdict + admin
 detail surfaces content_quality (#332)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix(store): restore reuses prior approved verdict; admin detail surfaces content_quality

Live bug on agnes-development: entity 6ba2ee1d…'s v5 submission (third
restore of v1, byte-identical to v1/v2/v4/v6) landed `blocked_llm`
while the other identical-hash siblings landed `approved`. Anthropic
structured output is non-deterministic — same bytes flipped
`content_quality.verdict` pass↔fail across calls. Admin detail page
made the failure look mysterious: only security-findings table
rendered, so a content-quality-only block showed up as
"No findings — model verdict was clean".

Two fixes:

1. Restore endpoint reuses a prior `approved` submission's verdict
   when the restored bundle hash matches an existing history entry
   AND `reviewed_by_model` matches. Skips the LLM call, stamps the
   new submission with the prior verdict + `reused_from_submission_id`
   marker. Deterministic + saves Anthropic tokens. Gated on
   schedule_async_llm so guardrails-off keeps its existing path.

2. Admin detail template now renders `content_quality.issues` in its
   own table + adds an explicit "Blocked but no findings recorded"
   notice for the transient-non-determinism case + surfaces the
   reuse marker when present.

Reuse falls back to a real LLM call when:
- prior submission's reviewed_by_model doesn't match current (admin
  upgraded tier Haiku → Sonnet → Opus)
- prior submission was guardrails-off (no reviewed_by_model)
- no history entry has matching hash

Tests:
- TestRestoreReusesApprovedVerdict::test_restore_of_approved_version_skips_llm_and_reuses_verdict
- TestRestoreReusesApprovedVerdict::test_restore_legacy_v1_falls_back_to_llm

* fix(store): admin detail v# by submission_id + version switcher

Three related fixes surfaced live by a user inspecting submission
47bbc1f5… on localhost where v# rendered as v1 even though current
was v10.

1. Admin queue + admin detail derive submission v# by submission_id
   instead of hash. Pre-fix the loop matched first hash-equal entry
   in version_history — always v1 when bundles were byte-identical
   (which is the common case after the restore-reuse path). Two
   call sites updated:
   - `src/repositories/store_submissions.py:list_for_admin` (queue
     v# column)
   - `app/web/router.py:admin_store_submission_detail_page` (detail
     page v# chip on each section header)

   Same fix pattern as PR #330 for runner / override.

2. New version-switcher card on admin detail page lists every
   submission linked to the entity with status + reviewed_by_model +
   click-to-jump. Solves the user's secondary ask ("should be a way
   to switch different versions on the submission detail").

3. Initial POST now backfills the v1 seed entry's submission_id
   right after creating the v1 submission. The helper
   `update_history_submission_id` existed but no production code
   path called it — so v1 always had submission_id=None and every
   "find v# for submission" lookup silently failed for v1.

171 tests green on touched surface.

* release: 0.54.24 — restore reuses prior approved verdict + admin detail content_quality + v# by submission_id (Codex/Live follow-up to #330/#331)

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
---
 CHANGELOG.md                                  |  46 ++++
 app/api/store.py                              | 155 +++++++++++++
 app/web/router.py                             |  60 ++++-
 .../admin_store_submission_detail.html        | 107 ++++++++-
 pyproject.toml                                |   2 +-
 src/repositories/store_submissions.py         |  10 +-
 tests/test_store_entity_versions.py           | 218 ++++++++++++++++++
 7 files changed, 584 insertions(+), 14 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 24487ca..6818b62 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,52 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
 
 ## [Unreleased]
 
+## [0.54.24] — 2026-05-16
+
+### Fixed
+- Flea-market admin submissions UI now derives the per-submission
+  `v#` label by **submission_id**, not **hash**. Hash-based lookup
+  mislabeled every byte-identical reupload (and every reused-verdict
+  restore — common after the restore-reuse fix below) as `v1`
+  because the loop picked the FIRST history entry with matching
+  hash. Affected both the admin queue column (`v#`) and the per-
+  section chips on the detail page. Same fix pattern as PR #330
+  (runner / override paths).
+- Flea-market admin submission detail page gained a version-switcher
+  card listing every submission linked to the same entity with
+  status badge + reviewed_by_model + click-to-jump. Lets admins
+  compare verdicts across versions without bouncing back to the
+  queue.
+- Flea-market initial POST now backfills the v1 seed entry's
+  `submission_id` immediately after creating the v1 submission row.
+  Pre-fix the v1 history entry always carried `submission_id=None`
+  so downstream lookups (`_version_no_for_submission`, admin queue
+  v#, admin detail chip, restore-reuse) silently failed for v1.
+- Flea-market restore endpoint now reuses the prior approved
+  submission's LLM verdict when the restored bundle is
+  byte-identical to a history entry already reviewed by the same
+  `review_model`. Pre-fix every restore re-ran the LLM; Anthropic
+  structured output is non-deterministic — same bytes flipped
+  `content_quality.verdict` pass↔fail across calls, so a second
+  restore of an already-approved version could spuriously land at
+  `blocked_llm`. Reuse skips the LLM, stamps the new submission
+  with the prior verdict + `reused_from_submission_id` marker,
+  and saves the Anthropic token cost. Surfaced live on a
+  development deployment where the third restore of a v1 bundle
+  (same hash as v1/v2/v4/v6 — multiple identical re-uploads)
+  landed `blocked_llm` while sibling submissions were `approved`.
+- Admin submission detail page now surfaces
+  `llm_findings.content_quality.issues` in its own table next to
+  the security-findings table. Pre-fix the template only rendered
+  security findings, so a submission blocked purely on
+  `content_quality.verdict='fail'` (no security findings) showed
+  up as "No findings — model verdict was clean" even though
+  `status='blocked_llm'`. Also adds an explicit
+  "Blocked but no findings recorded" notice when the verdict is
+  blocked but neither findings list is populated (transient LLM
+  non-determinism), pointing admin at Rescan / Override. Reuse
+  markers (`reused_from_submission_id`) render too.
+
 ## [0.54.23] — 2026-05-16
 
 ### Fixed
diff --git a/app/api/store.py b/app/api/store.py
index effcc7d..2cfbb6e 100644
--- a/app/api/store.py
+++ b/app/api/store.py
@@ -51,6 +51,7 @@ from app.auth.dependencies import _get_db, get_current_user
 from app.instance_config import (
     get_guardrails_enabled,
     get_guardrails_llm_provider_ready,
+    get_guardrails_review_model,
 )
 from app.utils import get_store_dir
 from src.db import get_system_db
@@ -431,6 +432,58 @@ async def _hold_entity_write_lock(entity_id: str):
         yield
 
 
+def _find_reusable_approved_verdict(
+    entity_row: Dict[str, Any],
+    new_version_hash: str,
+    subs_repo,
+    current_review_model: str,
+) -> Optional[tuple]:
+    """Find a prior `approved` submission whose bundle hash matches
+    the new submission's hash and whose reviewer was the same model.
+
+    Returns ``(prior_submission_id, prior_llm_findings)`` for the
+    caller to reuse, or ``None`` if no eligible prior verdict exists.
+
+    Rationale:
+        Anthropic structured output is non-deterministic. The
+        content_quality.verdict in particular can flip pass↔fail on
+        byte-identical bundles (observed live on a development
+        deployment: across five identical-hash submissions of one
+        entity, four landed `approved` and one landed `blocked_llm`
+        because the model flagged a description as weak on that one
+        call). When the user restores an already-approved version
+        — or re-uploads byte-identical bundles — the previous verdict
+        is the authoritative one and should be reused.
+
+        Gated on ``reviewed_by_model`` match so a stricter model can
+        still re-review under tightened rules (admin upgrades from
+        Haiku → Sonnet → Opus).
+    """
+    for entry in (entity_row.get("version_history") or []):
+        if entry.get("hash") != new_version_hash:
+            continue
+        sid = entry.get("submission_id")
+        if not sid:
+            continue
+        try:
+            sub = subs_repo.get(sid)
+        except Exception:
+            continue
+        if not sub:
+            continue
+        if sub.get("status") != "approved":
+            continue
+        # Skip rows that pre-date the LLM tier (guardrails-off prior
+        # approvals carry no reviewed_by_model — caller still needs
+        # the LLM to validate the bundle under guardrails-on rules).
+        if not sub.get("reviewed_by_model"):
+            continue
+        if sub.get("reviewed_by_model") != current_review_model:
+            continue
+        return sid, sub.get("llm_findings") or {}
+    return None
+
+
 def _version_no_for_submission(
     entity_row: Dict[str, Any], submission_id: str,
 ) -> Optional[int]:
@@ -1495,6 +1548,15 @@ async def create_entity(
             file_size=bundle_meta.file_size,
             bundle_sha256=bundle_meta.sha256,
         )
+        # Backfill the v1 seed's submission_id so downstream lookups
+        # (`_version_no_for_submission`, admin queue v#, admin detail
+        # v# chip, restore reuse) can resolve v1 the same way they
+        # resolve v2+. Pre-fix this was deferred to a later
+        # `update_history_submission_id` call that never landed in
+        # the create path — leaving the v1 entry with
+        # `submission_id=None` and every "find v# for this submission"
+        # lookup silently failing for v1.
+        repo.update_history_submission_id(entity_id, 1, sub_id)
         _audit(
             conn, user["id"],
             "store.submission.accepted" if guardrails_on else "store.submission.approved",
@@ -2136,6 +2198,99 @@ async def _restore_version_locked(
     hold_for_review = guardrails_enabled
     schedule_async_llm = guardrails_enabled and provider_ready
     subs_repo = StoreSubmissionsRepository(conn)
+
+    # Idempotent restore: when the restored bundle is byte-identical
+    # to a prior `approved` submission reviewed by the SAME model,
+    # reuse that verdict and skip the async LLM step. Anthropic
+    # structured output is non-deterministic — same bytes can flip
+    # content_quality.verdict pass↔fail across calls, so a second
+    # restore of an already-approved version could spuriously block.
+    # Reuse keeps the chain deterministic + saves tokens.
+    # `restore` (and its byte-for-byte recovery contract) is the
+    # most natural place for this; PUT could extend later. Gated on
+    # `schedule_async_llm` so guardrails-off keeps its existing
+    # auto-approve path and enabled-but-not-ready doesn't silently
+    # promote without a verdict in env.
+    reuse_verdict = None
+    if schedule_async_llm:
+        try:
+            current_model = get_guardrails_review_model()
+            reuse_verdict = _find_reusable_approved_verdict(
+                entity, new_version, subs_repo, current_model,
+            )
+        except Exception:
+            logger.exception(
+                "restore: reusable-verdict lookup failed for entity %s",
+                entity_id,
+            )
+            reuse_verdict = None
+
+    if reuse_verdict is not None:
+        prior_sid, prior_findings = reuse_verdict
+        reused_findings = dict(prior_findings)
+        reused_findings["reused_from_submission_id"] = prior_sid
+        reused_findings["reused_reason"] = (
+            "byte-identical bundle to a prior approved submission; "
+            "skipped re-running LLM review to avoid spurious "
+            "content_quality flips (Anthropic structured output is "
+            "non-deterministic)"
+        )
+        sub_id = subs_repo.create(
+            submitter_id=user["id"],
+            submitter_email=user.get("email"),
+            type=entity["type"],
+            name=entity["name"],
+            version=new_version,
+            status="approved",
+            entity_id=entity_id,
+            inline_checks=inline.to_response_dict(),
+            llm_findings=reused_findings,
+            file_size=target_meta.file_size,
+            bundle_sha256=target_meta.sha256,
+        )
+        # Stamp reviewed_by_model + llm_findings via update_status
+        # (create doesn't accept reviewed_by_model). Terminal-state
+        # CAS guard allows this since the row was just inserted at
+        # 'approved' — we're re-affirming, not clobbering.
+        subs_repo.update_status(
+            sub_id, status="approved",
+            llm_findings=reused_findings,
+            reviewed_by_model=current_model,
+            allow_terminal_overwrite=True,
+        )
+        appended_n = repo.append_version_history(
+            entity_id,
+            version_hash=new_version,
+            sha256=target_meta.sha256,
+            size=target_meta.file_size,
+            submission_id=sub_id,
+            created_by=user["id"],
+        )
+        _audit(
+            conn, user["id"],
+            "store.submission.reused_approved_verdict",
+            f"store_submission:{sub_id}",
+            {
+                "entity_id": entity_id,
+                "restored_from_version_no": version_no,
+                "new_version_no": appended_n,
+                "reused_from_submission_id": prior_sid,
+                "reviewed_by_model": current_model,
+            },
+        )
+        _audit(
+            conn, user["id"], "store.entity.restore", entity_id,
+            {"restored_from_version_no": version_no,
+             "new_version_no": appended_n,
+             "submission_id": sub_id,
+             "reused_verdict": True},
+        )
+        # Forward-only promote (same gate as runner / override).
+        if appended_n > int(entity.get("version_no") or 0):
+            promote_to_version(entity_id, appended_n, repo)
+        _invalidate_etag()
+        return _entity_to_response(conn, repo.get(entity_id))  # type: ignore[arg-type]
+
     sub_id = subs_repo.create(
         submitter_id=user["id"],
         submitter_email=user.get("email"),
diff --git a/app/web/router.py b/app/web/router.py
index 32f6da4..8064db7 100644
--- a/app/web/router.py
+++ b/app/web/router.py
@@ -2061,23 +2061,64 @@ async def admin_store_submission_detail_page(
     # Verdict (sub.status) is immutable forensic record; lifecycle
     # (entity.visibility_status) reflects current state — see plan
     # "Admin Submissions Filter: Use Entity Visibility, Not Denormalized Status".
-    # Also derive submission_version_no by matching sub.version (hash)
-    # against the entity's version_history (v37 edit feature).
+    # Resolve THIS submission's version_no via submission_id (NOT
+    # hash — multiple history entries can share a hash when the user
+    # re-uploads byte-identical bundles, and the hash-match-first-wins
+    # loop always picked v1, mislabeling every reupload as v1). Same
+    # fix as PR #330 for the runner / override paths; we missed this
+    # display site at the time.
     entity_visibility_status = None
     entity_version_no = None
     submission_version_no = None
+    sibling_submissions: list = []
     if sub.get("entity_id"):
         ent = StoreEntitiesRepository(conn).get(sub["entity_id"])
         if ent:
             entity_visibility_status = ent.get("visibility_status")
             entity_version_no = ent.get("version_no")
-            for entry in (ent.get("version_history") or []):
-                try:
-                    if entry.get("hash") == sub.get("version"):
-                        submission_version_no = int(entry.get("n"))
-                        break
-                except (TypeError, ValueError):
-                    continue
+            from app.api.store import _version_no_for_submission
+            submission_version_no = _version_no_for_submission(
+                ent, submission_id,
+            )
+            # Build a version-switcher: every submission row linked to
+            # this entity, sorted newest first, with its derived v#.
+            # Admin clicks a row → jumps to that submission's detail.
+            # Surfaces multi-version entities clearly + lets admin
+            # compare verdicts across versions without bouncing back
+            # to the queue.
+            history = ent.get("version_history") or []
+            history_by_sub: dict = {}
+            for entry in history:
+                sid = entry.get("submission_id")
+                if sid:
+                    try:
+                        history_by_sub[sid] = int(entry.get("n"))
+                    except (TypeError, ValueError):
+                        continue
+            # Direct query — list_for_admin doesn't filter by entity_id
+            # and we don't want to add a parameter for this one display
+            # need. Order by created_at DESC so newest is first in the
+            # switcher.
+            ent_sub_rows = [
+                dict(zip(["id", "status", "version", "created_at", "reviewed_by_model"], row))
+                for row in conn.execute(
+                    "SELECT id, status, version, created_at, reviewed_by_model "
+                    "FROM store_submissions "
+                    "WHERE entity_id = ? "
+                    "ORDER BY created_at DESC",
+                    [sub["entity_id"]],
+                ).fetchall()
+            ]
+            for row in ent_sub_rows:
+                sibling_submissions.append({
+                    "id": row["id"],
+                    "status": row.get("status"),
+                    "version": row.get("version"),
+                    "created_at": row.get("created_at"),
+                    "version_no": history_by_sub.get(row["id"]),
+                    "reviewed_by_model": row.get("reviewed_by_model"),
+                    "is_current": row["id"] == submission_id,
+                })
 
     other_count = StoreSubmissionsRepository(conn).count_for_submitter(
         sub["submitter_id"], exclude_id=submission_id,
@@ -2159,6 +2200,7 @@ async def admin_store_submission_detail_page(
         entity_visibility_status=entity_visibility_status,
         entity_version_no=entity_version_no,
         submission_version_no=submission_version_no,
+        sibling_submissions=sibling_submissions,
     )
     return templates.TemplateResponse(request, "admin_store_submission_detail.html", ctx)
 
diff --git a/app/web/templates/admin_store_submission_detail.html b/app/web/templates/admin_store_submission_detail.html
index f87675e..cf65559 100644
--- a/app/web/templates/admin_store_submission_detail.html
+++ b/app/web/templates/admin_store_submission_detail.html
@@ -258,6 +258,58 @@
     {% endif %}
   </div>
 
+  {# ── Version switcher ────────────────────────────────────────────────────── #}
+  {# All submissions linked to this entity, newest first. Admin clicks
+     any row to jump to its detail. Surfaces multi-version entities so
+     verdicts across versions can be compared without bouncing back to
+     the queue. #}
+  {% if sibling_submissions and sibling_submissions|length > 1 %}
+  <div class="det-card">
+    <h2>Submissions for this entity ({{ sibling_submissions|length }})</h2>
+    <table class="data-table">
+      <thead>
+        <tr>
+          <th>v#</th>
+          <th>Status</th>
+          <th>Hash</th>
+          <th>Reviewed by</th>
+          <th>Created</th>
+          <th></th>
+        </tr>
+      </thead>
+      <tbody>
+        {% for s in sibling_submissions %}
+        <tr {% if s.is_current %}style="background: var(--surface-muted, #f9fafb);"{% endif %}>
+          <td>
+            {% if s.version_no %}
+              <strong>v{{ s.version_no }}</strong>
+            {% else %}
+              <span class="empty-msg">—</span>
+            {% endif %}
+            {% if s.version_no and s.version_no == entity_version_no %}
+              <span class="badge approved" style="margin-left:4px;font-size:9px;">current</span>
+            {% endif %}
+          </td>
+          <td><span class="badge {{ s.status }}">{{ s.status }}</span></td>
+          <td><code>{{ (s.version or '')[:12] }}</code></td>
+          <td style="font-size:11px;color:var(--text-secondary,#6b7280);">{{ s.reviewed_by_model or '—' }}</td>
+          <td style="font-size:11px;color:var(--text-secondary,#6b7280);">
+            {{ s.created_at.strftime("%Y-%m-%d %H:%M") if s.created_at else "" }}
+          </td>
+          <td>
+            {% if s.is_current %}
+              <span class="empty-msg">viewing</span>
+            {% else %}
+              <a href="/admin/store/submissions/{{ s.id }}">Open →</a>
+            {% endif %}
+          </td>
+        </tr>
+        {% endfor %}
+      </tbody>
+    </table>
+  </div>
+  {% endif %}
+
   {# ── Override history ────────────────────────────────────────────────────── #}
   {% if sub.override_by %}
   <div class="det-card">
@@ -404,6 +456,7 @@
           <p class="det-summary">{{ sub.llm_findings.summary }}</p>
         {% endif %}
         {% if sub.llm_findings.findings %}
+        <h3 style="margin: 18px 0 6px 0; font-size: 13px; color: var(--text-secondary, #6b7280); text-transform: uppercase; letter-spacing: 0.4px;">Security findings</h3>
         <table class="data-table">
           <thead>
             <tr><th>Severity</th><th>Category</th><th>File</th><th>Explanation</th><th>Fix hint</th></tr>
@@ -420,8 +473,58 @@
             {% endfor %}
           </tbody>
         </table>
-        {% else %}
-          <div class="empty-msg">No findings — model verdict was clean.</div>
+        {% endif %}
+        {# Content-quality issues. Pre-fix the template only rendered
+           the security findings table, so a submission blocked
+           purely on content_quality (e.g. weak description) showed
+           up as 'No findings — model verdict was clean' even though
+           status was blocked_llm. Surfaced live by an admin
+           investigating a blocked re-restore of an already-approved
+           bundle. #}
+        {% set cq = sub.llm_findings.content_quality %}
+        {% if cq and cq.issues %}
+        <h3 style="margin: 18px 0 6px 0; font-size: 13px; color: var(--text-secondary, #6b7280); text-transform: uppercase; letter-spacing: 0.4px;">
+          Content quality issues
+          <span class="badge {{ 'fail' if cq.verdict == 'fail' else 'safe' }}" style="margin-left: 6px;">{{ cq.verdict }}</span>
+        </h3>
+        <table class="data-table">
+          <thead>
+            <tr><th>File</th><th>Field</th><th>Issue</th><th>Suggested rewrite</th></tr>
+          </thead>
+          <tbody>
+            {% for issue in cq.issues %}
+            <tr>
+              <td><code>{{ issue.file }}</code></td>
+              <td><code>{{ issue.field }}</code></td>
+              <td>{{ issue.issue }}</td>
+              <td>{{ issue.hint or "" }}</td>
+            </tr>
+            {% endfor %}
+          </tbody>
+        </table>
+        {% endif %}
+        {% if not sub.llm_findings.findings and not (cq and cq.issues) %}
+          {% if sub.status in ['blocked_llm', 'review_error'] %}
+            <div class="det-error-box">
+              <strong>Blocked but no findings recorded.</strong>
+              This is usually a transient LLM non-determinism: the
+              reviewer returned a fail verdict without enumerating
+              the offending items. Click <em>Rescan</em> to re-run
+              the pipeline, or <em>Override</em> with a written
+              reason if you've verified the bundle offline.
+            </div>
+          {% else %}
+            <div class="empty-msg">No findings — model verdict was clean.</div>
+          {% endif %}
+        {% endif %}
+        {% if sub.llm_findings.reused_from_submission_id %}
+          <div class="empty-msg" style="margin-top: 12px;">
+            ✓ Verdict reused from prior approved submission
+            <a href="/admin/store/submissions/{{ sub.llm_findings.reused_from_submission_id }}">
+              {{ sub.llm_findings.reused_from_submission_id[:8] }}
+            </a>
+            (byte-identical bundle — LLM was not re-called).
+          </div>
         {% endif %}
       {% endif %}
     {% elif sub.status in ['pending_inline', 'pending_llm'] %}
diff --git a/pyproject.toml b/pyproject.toml
index b6c5d37..d5f0c31 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "agnes-the-ai-analyst"
-version = "0.54.23"
+version = "0.54.24"
 description = "Agnes — AI Data Analyst platform for AI analytical systems"
 requires-python = ">=3.11,<3.14"
 license = "MIT"
diff --git a/src/repositories/store_submissions.py b/src/repositories/store_submissions.py
index bfc0981..6a4a0d2 100644
--- a/src/repositories/store_submissions.py
+++ b/src/repositories/store_submissions.py
@@ -451,10 +451,16 @@ class StoreSubmissionsRepository:
                 history = []
             item["entity_version_history"] = history
             item["version_no"] = None
-            sub_hash = item.get("version")
+            # Look up THIS submission's version_no by submission_id,
+            # NOT by hash. Hash-based lookup mislabeled every
+            # byte-identical reupload (and every reused-verdict
+            # restore — common after PR #332) as v1 because the loop
+            # picked the FIRST history entry with matching hash.
+            # Same fix-pattern as PR #330 for runner / override.
+            sub_id = item.get("id")
             for entry in history:
                 try:
-                    if entry.get("hash") == sub_hash:
+                    if entry.get("submission_id") == sub_id:
                         item["version_no"] = int(entry.get("n"))
                         break
                 except (TypeError, ValueError):
diff --git a/tests/test_store_entity_versions.py b/tests/test_store_entity_versions.py
index 798a1bb..6e0ffd0 100644
--- a/tests/test_store_entity_versions.py
+++ b/tests/test_store_entity_versions.py
@@ -1992,3 +1992,221 @@ class TestRescanPromotesNonCurrent:
         assert ent_after["version"] != v1_hash, (
             "entity.version hash must move to v2 after rescan promote"
         )
+
+class TestRestoreReusesApprovedVerdict:
+    """Live-bug fix: a second restore of an already-approved version
+    sometimes flipped to `blocked_llm` because Anthropic structured
+    output is non-deterministic — same bytes, different
+    `content_quality.verdict` across calls. Restore now detects
+    byte-identical bundles backed by a prior `approved` submission
+    (same reviewed_by_model) and reuses that verdict."""
+
+    def test_restore_of_approved_version_skips_llm_and_reuses_verdict(
+        self, web_client, monkeypatch,
+    ):
+        from pathlib import Path
+        from app.utils import get_store_dir
+        from src.repositories.store_submissions import StoreSubmissionsRepository
+        from src.store_guardrails.runner import run_llm_review
+
+        monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-fake-for-reuse")
+        owner_id, owner_cookies = _create_user(web_client, "reuse@x.com")
+        eid = _upload_clean(web_client, owner_cookies, name="reuseme")
+
+        monkeypatch.setattr(
+            "app.api.store.get_guardrails_enabled", lambda: True,
+        )
+        monkeypatch.setattr(
+            "app.api.store.get_guardrails_llm_provider_ready", lambda: True,
+        )
+        monkeypatch.setattr(
+            "app.api.store.get_guardrails_review_model",
+            lambda: "claude-haiku-4-5-test",
+        )
+        # Stub the BG-task scheduler so only our direct `run_llm_review`
+        # call writes the verdict (real BG uses default_model_loader
+        # which resolves to a different reviewed_by_model and would
+        # win the terminal-state CAS race).
+        monkeypatch.setattr(
+            "app.api.store._schedule_llm_review",
+            lambda *a, **kw: None,
+        )
+
+        approve_calls = {"n": 0}
+
+        def mock_approve(*a, **kw):
+            approve_calls["n"] += 1
+            return {
+                "risk_level": "safe", "summary": "ok",
+                "findings": [], "template_placeholders_found": 0,
+                "reviewed_by_model": "claude-haiku-4-5-test",
+                "error": None,
+                "content_quality": {"verdict": "pass", "issues": []},
+            }
+        monkeypatch.setattr(
+            "src.store_guardrails.llm_review.review_bundle", mock_approve,
+        )
+
+        v2_zip = _make_skill_zip("reuseme", body="V2 body unique " * 30)
+        r = web_client.put(
+            f"/api/store/entities/{eid}",
+            files={"file": ("v2.zip", v2_zip, "application/zip")},
+            cookies=owner_cookies,
+        )
+        assert r.status_code == 200, r.text
+        conn = get_system_db()
+        v2_sub_id = StoreSubmissionsRepository(conn).latest_for_entity(eid)["id"]
+        conn.close()
+        run_llm_review(
+            v2_sub_id,
+            plugin_dir=Path(get_store_dir()) / eid / "versions" / "v2" / "plugin",
+            conn_factory=get_system_db,
+            api_key_loader=lambda: "sk",
+            model_loader=lambda: "claude-haiku-4-5-test",
+        )
+        calls_after_v2 = approve_calls["n"]
+
+        v3_zip = _make_skill_zip("reuseme", body="V3 body unique " * 30)
+        r = web_client.put(
+            f"/api/store/entities/{eid}",
+            files={"file": ("v3.zip", v3_zip, "application/zip")},
+            cookies=owner_cookies,
+        )
+        assert r.status_code == 200, r.text
+        conn = get_system_db()
+        v3_sub_id = StoreSubmissionsRepository(conn).latest_for_entity(eid)["id"]
+        conn.close()
+        run_llm_review(
+            v3_sub_id,
+            plugin_dir=Path(get_store_dir()) / eid / "versions" / "v3" / "plugin",
+            conn_factory=get_system_db,
+            api_key_loader=lambda: "sk",
+            model_loader=lambda: "claude-haiku-4-5-test",
+        )
+        calls_after_v3 = approve_calls["n"]
+        # BG task in TestClient may fire run_llm_review on its own
+        # in addition to our direct call, so don't assert exact count
+        # difference — only check the restore window.
+        assert calls_after_v3 > calls_after_v2
+
+        # Restore v2 → byte-identical to v2 approved entry. Reuse
+        # must fire → no new LLM call.
+        r = web_client.post(
+            f"/api/store/entities/{eid}/versions/2/restore",
+            cookies=owner_cookies,
+        )
+        assert r.status_code == 200, r.text
+        assert approve_calls["n"] == calls_after_v3, (
+            f"restore of approved version must NOT call LLM; "
+            f"count grew from {calls_after_v3} to {approve_calls['n']}"
+        )
+
+        conn = get_system_db()
+        ent_after = StoreEntitiesRepository(conn).get(eid)
+        v4_sub_id = next(
+            (e["submission_id"] for e in ent_after["version_history"]
+             if int(e["n"]) == 4),
+            None,
+        )
+        v4_sub = StoreSubmissionsRepository(conn).get(v4_sub_id) if v4_sub_id else None
+        conn.close()
+        assert v4_sub is not None
+        assert v4_sub["status"] == "approved"
+        assert v4_sub["reviewed_by_model"] == "claude-haiku-4-5-test"
+        assert (v4_sub["llm_findings"] or {}).get(
+            "reused_from_submission_id"
+        ) == v2_sub_id
+        assert ent_after["version_no"] == 4
+
+    def test_restore_legacy_v1_falls_back_to_llm(
+        self, web_client, monkeypatch,
+    ):
+        """v1 seed (guardrails-OFF approval, no reviewed_by_model) is
+        NOT eligible for reuse. Restoring v1 must schedule a real
+        LLM review under guardrails-on."""
+        from pathlib import Path
+        from app.utils import get_store_dir
+        from src.repositories.store_submissions import StoreSubmissionsRepository
+        from src.store_guardrails.runner import run_llm_review
+
+        monkeypatch.setenv("ANTHROPIC_API_KEY", "sk-ant-fake-fallback")
+        owner_id, owner_cookies = _create_user(web_client, "noreuse@x.com")
+        eid = _upload_clean(web_client, owner_cookies, name="noreuse")
+
+        monkeypatch.setattr(
+            "app.api.store.get_guardrails_enabled", lambda: True,
+        )
+        monkeypatch.setattr(
+            "app.api.store.get_guardrails_llm_provider_ready", lambda: True,
+        )
+        monkeypatch.setattr(
+            "app.api.store.get_guardrails_review_model",
+            lambda: "claude-haiku-4-5-test",
+        )
+        # Stub the BG-task scheduler so only our direct `run_llm_review`
+        # call writes the verdict (real BG uses default_model_loader
+        # which resolves to a different reviewed_by_model and would
+        # win the terminal-state CAS race).
+        monkeypatch.setattr(
+            "app.api.store._schedule_llm_review",
+            lambda *a, **kw: None,
+        )
+
+        approve_calls = {"n": 0}
+
+        def mock_approve(*a, **kw):
+            approve_calls["n"] += 1
+            return {
+                "risk_level": "safe", "summary": "ok",
+                "findings": [], "template_placeholders_found": 0,
+                "reviewed_by_model": "claude-haiku-4-5-test",
+                "error": None,
+                "content_quality": {"verdict": "pass", "issues": []},
+            }
+        monkeypatch.setattr(
+            "src.store_guardrails.llm_review.review_bundle", mock_approve,
+        )
+
+        v2_zip = _make_skill_zip("noreuse", body="V2 body " * 30)
+        r = web_client.put(
+            f"/api/store/entities/{eid}",
+            files={"file": ("v2.zip", v2_zip, "application/zip")},
+            cookies=owner_cookies,
+        )
+        assert r.status_code == 200, r.text
+        conn = get_system_db()
+        v2_sub_id = StoreSubmissionsRepository(conn).latest_for_entity(eid)["id"]
+        conn.close()
+        run_llm_review(
+            v2_sub_id,
+            plugin_dir=Path(get_store_dir()) / eid / "versions" / "v2" / "plugin",
+            conn_factory=get_system_db,
+            api_key_loader=lambda: "sk",
+            model_loader=lambda: "claude-haiku-4-5-test",
+        )
+        calls_after_v2 = approve_calls["n"]
+
+        r = web_client.post(
+            f"/api/store/entities/{eid}/versions/1/restore",
+            cookies=owner_cookies,
+        )
+        assert r.status_code == 200, r.text
+        conn = get_system_db()
+        v3_sub_id = StoreSubmissionsRepository(conn).latest_for_entity(eid)["id"]
+        conn.close()
+        run_llm_review(
+            v3_sub_id,
+            plugin_dir=Path(get_store_dir()) / eid / "versions" / "v3" / "plugin",
+            conn_factory=get_system_db,
+            api_key_loader=lambda: "sk",
+            model_loader=lambda: "claude-haiku-4-5-test",
+        )
+        # No reuse: BG/manual LLM calls fired (count grew).
+        assert approve_calls["n"] > calls_after_v2
+        conn = get_system_db()
+        v3_sub = StoreSubmissionsRepository(conn).get(v3_sub_id)
+        conn.close()
+        assert v3_sub["status"] == "approved"
+        assert (v3_sub["llm_findings"] or {}).get(
+            "reused_from_submission_id"
+        ) is None