* feat(store-guardrails): enforce per-component description quality
Two-tier hard guardrail on flea-market submissions. Empty / placeholder /
single-word descriptions now block before any LLM call; vague-but-passes-
floor descriptions block on the substantive LLM review layer.
Tier 1 — inline mechanical check (src/store_guardrails/content_check.py).
Walks the baked plugin tree, evaluates each component (plugin manifest,
agents, skills, commands) plus the submission-level form description
against a 60-char / 25-char (commands) / 5-distinct-word / 200-char-body
floor with a placeholder denylist (TODO, TBD, {{var}}, etc.). Floors
calibrated against real ecosystem norms: Claude / superpowers /
compound-engineering skill packs cluster 150–220 chars, npm / Docker /
VS Code at 100–120. InlineResult.passed now ANDs in content.status.
Tier 2 — LLM review extension (prompts.py + llm_review.py). System
prompt gains a content-quality criterion; REVIEW_JSON_SCHEMA carries a
content_quality {verdict, issues[]} object alongside the existing
security findings. is_safe() requires content_quality.verdict == 'pass'.
Single LLM call covers both dimensions. MAX_RESPONSE_TOKENS bumped
2000 → 2500 for the extra payload. Verdicts missing content_quality
treated as pass (backwards compat with already-recorded rows).
Submitter UX:
- /store/new wizard now carries a "Before you upload — what passes
review" collapsible disclosure on both step 1 and step 2 with the
bar + patterns that work. Live char counter on the description
field. Per-component preview table (green/red dots from the new
summarize_for_preview helper) renders after the ZIP /preview round
trip, scoping each finding to its file.
- New /store/examples page with rejected/passes pairs for skill /
agent / plugin / command plus a "Why these limits" research table.
Anchored sections (#skill / #agent / #plugin / #command) so the
rejection banner can deep-link by component_type.
- Quarantine banner _content_findings.html groups findings by file
(one "See <type> example ↗" per component, not per field) and
translates field codes (frontmatter.description / body / etc.) to
plain-English labels. _content_howto_fix.html surfaces a static
"Re-upload as new version" + "See examples" action row beneath any
content failure on the entity detail page.
- _parse_frontmatter moved to src/store_guardrails/_frontmatter.py so
the new check module shares the parser without inverting the
app → src dependency direction.
Tests:
- New tests/test_store_guardrails_content.py (29 cases) covering
every failure code per component type plus submission-level checks
and the summarize_components / summarize_for_preview helpers.
- Extended test_store_guardrails_inline.py for the new
InlineResult.content field + aggregate behaviour.
- Extended test_store_guardrails_llm.py for the new
content_quality verdict pathways (fail blocks, missing field passes).
- Backfilled fixture descriptions across test_store_api.py,
test_store_entity_versions.py, test_store_put_atomic.py,
test_admin_store_submissions.py, test_marketplace_api.py,
test_marketplace_v32_endpoints.py so existing happy-path tests
clear the new 60-char floor.
* fix(content-guardrail): align agents walker with preview + drop import-time .format()
Two cleanups from the takeover review on #276 (vr/guardrails-content).
1) `_iter_components` for agents now skips files lacking frontmatter
(no `name` AND no `description`). Pre-fix the walker greedily
evaluated every `*.md` under `agents/` — `agents/README.md` and
helper docs got flagged as "frontmatter.description empty"
rejections. Worse: `summarize_for_preview` for `type=agent` ALREADY
filters the same shape, so the upload preview gave a green dot
while the post-bake check gave a red rejection on submit. Two new
regression tests in TestAgentsWalkerSkipsNonAgentFiles pin both
shapes (README + _NOTES.md) so the preview/check parity stays
aligned.
2) `body_too_short` hints now use the same runtime-kwarg substitution
pattern as every other hint in the table. Pre-fix the skill +
agent body_too_short hints called `.format(min_chars=_MIN_BODY_CHARS)`
at module-load time, but the call site `_hint_for(type_,
"body_too_short")` didn't pass `min_chars=`, so the format() was
just baking the constant at import. Cosmetic inconsistency; pass
`min_chars=_MIN_BODY_CHARS` at the call site instead and let
`_hint_for` do the substitution like it does for `too_short`.
Verified end-to-end:
- New TestAgentsWalkerSkipsNonAgentFiles cases fail on the unfixed
walker (verified by reverting to the pre-fix file and re-running);
pass cleanly after the fix.
- Full content-guardrail suite: 25/25 (23 existing + 2 new).
- Full pytest: 4189 passed, 25 skipped.
* release: 0.53.5 — content guardrail (flea-market submitter UX) + catalog ENTITY column + BQ hint dispatch
Bundles three threads landed in [Unreleased]:
- Vojta's flea-market content guardrail (two-tier mechanical + LLM)
- Zdeněk's `agnes catalog` ENTITY column replacement for FLAVOR
- Zdeněk's `/api/query` remote_estimate_failed hint dispatch fix
Plus the takeover hygiene from #276 review (agents walker preview/check
parity + body_too_short hint runtime kwarg consistency) and the
backslash-escape fix follow-up to v0.53.4 #275.
No DB migration; no API change. Patch upgrade lands transparently.
Upload form's new "Before you upload" disclosure + per-component preview
table appear on the next dev-VM auto-pull. Quarantine banner now groups
findings by file with "See <type> example ↗" deep-links to the new
/store/examples reference page.
---------
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
450 lines
18 KiB
Python
450 lines
18 KiB
Python
"""LLM-review wiring tests — Anthropic call mocked.
|
|
|
|
These tests cover the runner's persistence behavior on each verdict
|
|
shape (safe / risky / error). The actual prompt engineering lives in
|
|
``src/store_guardrails/prompts.py`` and is exercised at integration
|
|
time, not here.
|
|
"""
|
|
|
|
from __future__ import annotations
|
|
|
|
import shutil
|
|
import tempfile
|
|
from pathlib import Path
|
|
from unittest.mock import patch
|
|
|
|
import duckdb
|
|
import pytest
|
|
|
|
from connectors.llm.exceptions import LLMTimeoutError
|
|
from src import db as src_db
|
|
from src.repositories.store_entities import StoreEntitiesRepository
|
|
from src.repositories.store_submissions import StoreSubmissionsRepository
|
|
from src.store_guardrails.runner import LlmResult, run_llm_review
|
|
|
|
|
|
@pytest.fixture
|
|
def conn(tmp_path, monkeypatch):
|
|
monkeypatch.setenv("DATA_DIR", str(tmp_path))
|
|
src_db._system_db_conn = None
|
|
src_db._system_db_path = None
|
|
c = src_db.get_system_db()
|
|
yield c
|
|
c.close()
|
|
|
|
|
|
@pytest.fixture
|
|
def plugin_dir():
|
|
d = Path(tempfile.mkdtemp(prefix="agnes_llm_test_"))
|
|
(d / "SKILL.md").write_text("# Test\nbody " * 30)
|
|
yield d
|
|
shutil.rmtree(d, ignore_errors=True)
|
|
|
|
|
|
def _seed_pending_submission(conn, plugin_dir: Path) -> tuple[str, str]:
|
|
"""Stage a store_entities row + a pending_llm submission.
|
|
|
|
Returns ``(entity_id, submission_id)`` so the test can assert against
|
|
final state.
|
|
"""
|
|
from src.repositories.users import UserRepository
|
|
UserRepository(conn).create(id="u1", email="alice@x.com", name="alice")
|
|
ents = StoreEntitiesRepository(conn)
|
|
ents.create(
|
|
id="e1", owner_user_id="u1", owner_username="alice",
|
|
type="skill", name="probe", description="probe skill",
|
|
category=None, version="1.0.0", file_size=10,
|
|
visibility_status="pending",
|
|
)
|
|
subs = StoreSubmissionsRepository(conn)
|
|
sub_id = subs.create(
|
|
submitter_id="u1", submitter_email="alice@x.com",
|
|
type="skill", name="probe", version="1.0.0",
|
|
status="pending_llm", entity_id="e1",
|
|
inline_checks={"manifest": {"status": "pass"}},
|
|
)
|
|
return "e1", sub_id
|
|
|
|
|
|
# The runner closes its own cursor in `finally`. Hand it a *fresh* cursor
|
|
# each call (mirrors the production `get_system_db` behavior) so closing
|
|
# it doesn't poison the test's primary cursor used for assertions.
|
|
def _conn_factory(_unused):
|
|
def _f():
|
|
return src_db.get_system_db()
|
|
return _f
|
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
# run_llm_review outcomes
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
|
class TestLlmReviewRunner:
|
|
def test_safe_verdict_approves_entity(self, conn, plugin_dir):
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
verdict = {
|
|
"risk_level": "safe", "summary": "OK", "findings": [],
|
|
"template_placeholders_found": 0, "reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
result = run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
assert isinstance(result, LlmResult)
|
|
assert result.passed
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "approved"
|
|
ent = StoreEntitiesRepository(conn).get(eid)
|
|
assert ent["visibility_status"] == "approved"
|
|
|
|
def test_high_risk_verdict_blocks(self, conn, plugin_dir):
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
verdict = {
|
|
"risk_level": "high", "summary": "exfil",
|
|
"findings": [{
|
|
"severity": "high", "category": "exfiltration",
|
|
"file": "run.py", "explanation": "ships token to remote",
|
|
"fix_hint": "remove the POST",
|
|
}],
|
|
"template_placeholders_found": 0, "reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "blocked_llm"
|
|
ent = StoreEntitiesRepository(conn).get(eid)
|
|
# Entity stays pending — not visible until override.
|
|
assert ent["visibility_status"] == "pending"
|
|
|
|
def test_low_risk_with_high_finding_blocks(self, conn, plugin_dir):
|
|
"""Pass condition requires BOTH risk_level<=low AND no high findings.
|
|
A 'low' verdict with a high finding still blocks."""
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
verdict = {
|
|
"risk_level": "low", "summary": "mostly ok",
|
|
"findings": [{
|
|
"severity": "high", "category": "credentials",
|
|
"file": "creds.py", "explanation": "key in source",
|
|
}],
|
|
"template_placeholders_found": 0, "reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "blocked_llm"
|
|
|
|
def test_content_quality_fail_blocks(self, conn, plugin_dir):
|
|
"""Safe security verdict but content_quality.verdict=fail must still
|
|
block. The LLM substantive layer is a hard gate — descriptions that
|
|
clear the mechanical floor can still get flagged for vagueness."""
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
verdict = {
|
|
"risk_level": "safe", "summary": "OK",
|
|
"findings": [],
|
|
"template_placeholders_found": 0,
|
|
"content_quality": {
|
|
"verdict": "fail",
|
|
"issues": [{
|
|
"file": "skills/probe/SKILL.md",
|
|
"field": "frontmatter.description",
|
|
"issue": "describes WHAT the skill is, not WHEN to invoke it",
|
|
"hint": "Rewrite as 'Use when reviewing PRs to flag missing tests.'",
|
|
}],
|
|
},
|
|
"reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "blocked_llm"
|
|
# The content_quality verdict + issues persisted on the submission
|
|
# so the quarantine banner can render the rewrite hints.
|
|
assert sub["llm_findings"]["content_quality"]["verdict"] == "fail"
|
|
assert len(sub["llm_findings"]["content_quality"]["issues"]) == 1
|
|
|
|
def test_content_quality_missing_treated_as_pass(self, conn, plugin_dir):
|
|
"""Backward compat — older recorded verdicts without content_quality
|
|
must not retroactively block. The wire format adds the field; absent
|
|
means pass."""
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
verdict = {
|
|
"risk_level": "safe", "summary": "OK", "findings": [],
|
|
"template_placeholders_found": 0,
|
|
# content_quality intentionally absent
|
|
"reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "approved"
|
|
|
|
def test_medium_finding_with_safe_risk_passes(self, conn, plugin_dir):
|
|
"""Medium findings shouldn't block when overall risk is safe — that's
|
|
the 'noise but no exploit' band the operator opted into when picking
|
|
Haiku as the tier. Operators who want stricter pin Sonnet/Opus."""
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
verdict = {
|
|
"risk_level": "safe", "summary": "noise",
|
|
"findings": [{
|
|
"severity": "medium", "category": "code_quality",
|
|
"file": "x.py", "explanation": "could be cleaner",
|
|
}],
|
|
"template_placeholders_found": 1, "reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "approved"
|
|
|
|
def test_review_error_keeps_pending(self, conn, plugin_dir):
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
# llm_review.review_bundle catches LLMError and returns a dict with
|
|
# error set; the runner records review_error.
|
|
verdict = {
|
|
"risk_level": None, "summary": None, "findings": [],
|
|
"template_placeholders_found": 0, "reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": "LLMTimeoutError: Anthropic connection error",
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "review_error"
|
|
ent = StoreEntitiesRepository(conn).get(eid)
|
|
assert ent["visibility_status"] == "pending"
|
|
|
|
def test_missing_plugin_dir_records_review_error(self, conn, tmp_path):
|
|
eid, sub_id = _seed_pending_submission(conn, tmp_path / "exists")
|
|
# Point at a path that doesn't exist.
|
|
ghost = tmp_path / "ghost-plugin-dir"
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle"
|
|
) as m:
|
|
run_llm_review(
|
|
sub_id, plugin_dir=ghost,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
m.assert_not_called()
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "review_error"
|
|
|
|
def test_safe_verdict_skipped_when_admin_archived_during_review(self, conn, plugin_dir):
|
|
"""#3 — BG-task race: admin archives entity while LLM review is
|
|
in flight. Pre-fix the verdict's set_visibility('approved')
|
|
clobbered the archive. Post-fix the guarded variant refuses
|
|
and audit-logs the skip."""
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
# Admin archives BEFORE the LLM verdict lands.
|
|
StoreEntitiesRepository(conn).archive(eid, by_user_id="admin")
|
|
|
|
verdict = {
|
|
"risk_level": "safe", "summary": "OK", "findings": [],
|
|
"template_placeholders_found": 0,
|
|
"reviewed_by_model": "claude-haiku-4-5-20251001",
|
|
"error": None,
|
|
}
|
|
with patch(
|
|
"src.store_guardrails.runner.llm_review.review_bundle",
|
|
return_value=verdict,
|
|
):
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=lambda: "sk-test",
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
ent = StoreEntitiesRepository(conn).get(eid)
|
|
assert ent["visibility_status"] == "archived", (
|
|
"BG verdict must NOT clobber an admin archive"
|
|
)
|
|
# Submission still flips to 'approved' (verdict is forensic
|
|
# record); the lifecycle stays archived because the admin won.
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "approved"
|
|
|
|
# And we wrote an audit row explaining the skip.
|
|
audits = conn.execute(
|
|
"SELECT action, params FROM audit_log "
|
|
"WHERE resource = ? AND action = ?",
|
|
[f"store_submission:{sub_id}",
|
|
"store.submission.bg_verdict_skipped"],
|
|
).fetchall()
|
|
assert audits, "missing skip audit row"
|
|
|
|
def test_config_loader_failure_records_review_error(self, conn, plugin_dir):
|
|
eid, sub_id = _seed_pending_submission(conn, plugin_dir)
|
|
|
|
def boom():
|
|
raise RuntimeError("no api key")
|
|
|
|
run_llm_review(
|
|
sub_id, plugin_dir=plugin_dir,
|
|
conn_factory=_conn_factory(conn),
|
|
api_key_loader=boom,
|
|
model_loader=lambda: "claude-haiku-4-5-20251001",
|
|
)
|
|
|
|
sub = StoreSubmissionsRepository(conn).get(sub_id)
|
|
assert sub["status"] == "review_error"
|
|
|
|
|
|
# ---------------------------------------------------------------------------
|
|
# llm_review.review_bundle — single-shot transport-error path
|
|
# ---------------------------------------------------------------------------
|
|
|
|
|
|
class TestReviewBundleErrorTransport:
|
|
def test_anthropic_timeout_returns_error_dict(self, plugin_dir):
|
|
from src.store_guardrails import llm_review
|
|
|
|
with patch(
|
|
"src.store_guardrails.llm_review.AnthropicExtractor"
|
|
) as MockEx:
|
|
inst = MockEx.return_value
|
|
inst.extract_json.side_effect = LLMTimeoutError("connection error")
|
|
result = llm_review.review_bundle(
|
|
plugin_dir, type_="skill", name="x", version="1.0.0",
|
|
description="x" * 30,
|
|
api_key="sk-test", model="claude-haiku-4-5-20251001",
|
|
)
|
|
assert result["error"]
|
|
assert result["risk_level"] is None
|
|
assert result["reviewed_by_model"] == "claude-haiku-4-5-20251001"
|
|
|
|
def test_missing_risk_level_surfaces_as_review_error(self, plugin_dir):
|
|
"""#10 — model returns a structured response with no/empty
|
|
risk_level. Pre-fix this defaulted to 'medium' and looked like a
|
|
model-decided block. Post-fix it surfaces as
|
|
``error='missing_risk_level'`` so the runner persists
|
|
``status='review_error'`` (admin gets the retry button)."""
|
|
from src.store_guardrails import llm_review
|
|
|
|
with patch(
|
|
"src.store_guardrails.llm_review.AnthropicExtractor"
|
|
) as MockEx:
|
|
inst = MockEx.return_value
|
|
inst.extract_json.return_value = {
|
|
"summary": "model didn't fill risk_level",
|
|
"findings": [],
|
|
"template_placeholders_found": 0,
|
|
}
|
|
result = llm_review.review_bundle(
|
|
plugin_dir, type_="skill", name="x", version="1.0.0",
|
|
description="x" * 30,
|
|
api_key="sk-test", model="claude-haiku-4-5-20251001",
|
|
)
|
|
assert result["risk_level"] is None
|
|
assert result["error"] == "missing_risk_level"
|
|
|
|
def test_system_prompt_passed_via_dedicated_parameter(self, plugin_dir):
|
|
"""#1 — the system prompt MUST be passed via the SDK's separate
|
|
``system=`` parameter, not concatenated into user content. This
|
|
is the trust boundary that keeps a crafted README inside the
|
|
bundle from overriding the reviewer rules."""
|
|
from src.store_guardrails import llm_review
|
|
from src.store_guardrails.prompts import SYSTEM_PROMPT
|
|
|
|
with patch(
|
|
"src.store_guardrails.llm_review.AnthropicExtractor"
|
|
) as MockEx:
|
|
inst = MockEx.return_value
|
|
inst.extract_json.return_value = {
|
|
"risk_level": "safe", "summary": "ok", "findings": [],
|
|
"template_placeholders_found": 0,
|
|
}
|
|
llm_review.review_bundle(
|
|
plugin_dir, type_="skill", name="x", version="1.0.0",
|
|
description="x" * 30,
|
|
api_key="sk-test", model="claude-haiku-4-5-20251001",
|
|
)
|
|
call = inst.extract_json.call_args
|
|
assert call.kwargs.get("system") == SYSTEM_PROMPT, (
|
|
"SYSTEM_PROMPT must be passed via system= so it lands "
|
|
"in the SDK's system role — not in user content"
|
|
)
|
|
user_payload = call.kwargs.get("prompt") or ""
|
|
assert "<bundle>" in user_payload and "</bundle>" in user_payload, (
|
|
"user payload must wrap the bundle files in trust-boundary "
|
|
"sentinel tags"
|
|
)
|
|
assert SYSTEM_PROMPT not in user_payload, (
|
|
"SYSTEM_PROMPT must NOT be inlined into user content"
|
|
)
|