feat(store-guardrails): admin-configurable content thresholds (#281)

* feat(store-guardrails): admin-configurable content thresholds

Adds the flea-market content guardrail floors to the /admin/server-config
editor so operators can tune the bar without code changes. Defaults are
unchanged (60 chars description, 25 chars command, 5 distinct words, 200
chars body) — patching guardrails.* in instance.yaml or via the admin UI
overrides any of them and the next inline check picks up the new value.

src/store_guardrails/content_check.py now resolves the four floors via
helper functions (_min_desc_chars / _min_command_desc_chars /
_min_distinct_words / _min_body_chars) that read app.instance_config at
call time. Module-level _DEFAULT_* constants remain as fallbacks if
the import fails (defensive — keeps the guardrail module loadable
without the app package on its path).

app/instance_config.py grows four matching getters returning the live
value with sane defaults + integer coercion.

app/api/admin.py registers 'guardrails' as an editable section + ships
nine known-fields entries (min_description_chars,
min_command_description_chars, min_distinct_words, min_body_chars,
enabled, review_model, blocked_quota_per_day, blocked_bundle_ttl_days,
stuck_review_grace_seconds) with operator-facing hint copy explaining
what each knob does.

app/web/templates/admin_server_config.html gets a SECTION_META entry
so the section renders as 'Flea-market guardrails' with a help string
instead of a bare section ID.

app/web/router.py threads the live thresholds into /store/new and
/store/examples via a small _guardrail_thresholds() helper so the
disclosure copy, char counter, and "Why these limits" table render
the configured value (not a hardcoded 60). End-to-end smoke verified:
PATCH guardrails.min_description_chars=90 → /store/new immediately
renders "90 characters" + JS DESC_MIN=90 on the next request, no
restart required (helpers read live config per call).

* chore(store-guardrails): address PR review safe-fix findings

Code-review safe_auto findings on PR #281 (review run
20260513-100126-64052520):

- CHANGELOG: add Unreleased entry covering the new
  /admin/server-config Flea-market guardrails section, the four live
  threshold getters, and the route-helper rendering knobs. Required by
  the project's non-negotiable "Changelog discipline" rule.
- content_check.py: narrow `except Exception` to `except ImportError`
  on the four `_min_*()` resolver helpers. Surface-level TypeError /
  ValueError on a malformed YAML value belongs to the
  instance_config getters' own try/except — the resolvers should only
  defend against the in-tree import itself failing, not silently
  swallow real bugs in the getters.
- store_upload.html: refresh the stale "30-char threshold" comment to
  reflect the configurable floor (default 60), and add `|default(60)`
  / `|default(25)` / `|default(5)` filters to the disclosure-copy
  bindings so the upload form matches store_examples.html's
  belt-and-suspenders rendering if a future route ever renders the
  template without populating the `guardrail` context.
- router.py: tighten `_guardrail_thresholds()` return annotation
  from bare `dict` to `dict[str, int]`.

Residual work (left for separate change after operator direction):
- Add round-trip test (PATCH guardrails -> next inline check uses
  new value) — primary testing gap.
- Decide policy on `min_*=0` (currently coerced to 1 via
  `max(1, int(val))`) vs treating 0 as a disable sentinel like
  neighbour getters (`blocked_quota_per_day`,
  `blocked_bundle_ttl_days`).
- Add POST-time integer validation for `guardrails.*` so a typo'd
  YAML value (bool / string / float) errors loudly instead of
  silently falling back to the default.

* test(store-guardrails): cover admin-configurable thresholds + PATCH round-trip

Closes the "primary testing gap" Vojta noted in the safe-fix commit
on PR #281 — the four new `get_guardrails_min_*` getters and the
PATCH-takes-effect-on-next-check live-config flow had no direct
coverage.

10 new tests in `tests/test_store_guardrails_admin_config.py`:

- TestGuardrailGetterDefaults (4 tests) — each new getter returns the
  documented default (60 / 25 / 5 / 200) when nothing is configured.
- TestGuardrailGetterOverlay (5 tests) — overlay-driven overrides win,
  string values that look numeric coerce via int(), garbage strings
  fall back to default via the (TypeError, ValueError) branch, and the
  `max(1, int(val))` floor pins zero/negative inputs to 1.
- TestPatchRoundTrip (1 test) — PATCH `/api/admin/server-config`
  `guardrails.min_description_chars=90`, then call content_check
  against a 75-char description that previously passed: must now fail
  with `too_short`. Then PATCH back to 60 and verify the next check
  passes again. Closes the cache-invalidation contract Vojta relies on
  for the "no app restart" claim — broken without the
  reset_cache() bracket in /api/admin/server-config.

The TestGuardrailGetterOverlay.test_zero_or_negative_floored_to_one
test pins the current `max(1, int(val))` policy. Vojta's safe-fix
commit explicitly left "policy on min_*=0 vs disable-sentinel" as
residual work — pinning the current behavior here ensures any future
change to use 0 as a disable sentinel must update this test (and the
reviewer sees the policy decision).

Verified: 4509 tests pass locally (4499 existing + 10 new).

* release: 0.54.2 — admin-configurable flea-market guardrail thresholds + tests

Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.1 →
0.54.2) bundling Vojta's admin-configurable thresholds for the
flea-market content guardrail (9 knobs in /admin/server-config) plus
the test coverage closing the "primary testing gap" he punted in the
safe-fix commit.

No DB migration; defaults unchanged from PR #276 — instances that
don't set `guardrails.*` keep the original bar transparently.

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
Co-authored-by: ZdenekSrotyr <139972147+ZdenekSrotyr@users.noreply.github.com>
This commit is contained in:
Vojtech 2026-05-13 13:20:55 +04:00 committed by GitHub
parent 8f54ca3742
commit 50a974f196
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
10 changed files with 492 additions and 29 deletions

View file

@ -10,6 +10,28 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
## [Unreleased] ## [Unreleased]
## [0.54.2] — 2026-05-13
### Added
- **Admin-configurable flea-market content guardrail thresholds.**
`/admin/server-config` gains a new **Flea-market guardrails** section
exposing nine knobs: `min_description_chars` (default 60),
`min_command_description_chars` (default 25), `min_distinct_words`
(default 5), `min_body_chars` (default 200), `enabled` (master
kill-switch), `review_model` (haiku / sonnet / opus),
`blocked_quota_per_day` (default 50), `blocked_bundle_ttl_days`
(default 30), `stuck_review_grace_seconds` (default 1800). Each
field carries an operator-facing hint string. The four mechanical
floors are read from `app.instance_config` on every inline check,
so a `/admin/server-config` PATCH takes effect on the next request
without restarting uvicorn. `/store/new` (live char counter +
disclosure copy) and `/store/examples` (the "Why these limits"
table) render the configured values via a small
`_guardrail_thresholds()` helper threaded into the route context.
Defaults are unchanged — instances that don't set
`guardrails.*` keep the original PR #276 bar.
## [0.54.1] — 2026-05-13 ## [0.54.1] — 2026-05-13
### Added ### Added

View file

@ -258,6 +258,7 @@ _EDITABLE_SECTIONS: tuple[str, ...] = (
"desktop", "desktop",
"corporate_memory", "corporate_memory",
"materialize", "materialize",
"guardrails",
) )
# "Danger-zone" sections — flipping these can lock operators out (auth.*) or # "Danger-zone" sections — flipping these can lock operators out (auth.*) or
@ -703,6 +704,100 @@ _KNOWN_FIELDS: dict[str, dict[str, dict]] = {
), ),
}, },
}, },
"guardrails": {
"min_description_chars": {
"kind": "int",
"default": 60,
"hint": (
"Minimum character floor for skill / agent / plugin "
"descriptions on flea-market uploads (the inline content "
"guardrail). Real-world Claude skill descriptions cluster "
"150220 chars; the default 60 is the bottom of the bar "
"to catch placeholders. Bump to 100+ to push submitters "
"closer to the ecosystem norm. Min 1."
),
},
"min_command_description_chars": {
"kind": "int",
"default": 25,
"hint": (
"Minimum character floor for slash-command descriptions. "
"Tighter than skills because commands are one-verb "
"actions (\"run tests\", \"format code\"). Default 25. Min 1."
),
},
"min_distinct_words": {
"kind": "int",
"default": 5,
"hint": (
"Minimum number of DISTINCT words in any description "
"string. Defends against padding-only descriptions like "
"\"description description description\" that hit the "
"character count but say nothing. Default 5. Min 1."
),
},
"min_body_chars": {
"kind": "int",
"default": 200,
"hint": (
"Minimum body-content floor for skill / agent files "
"(the markdown after the YAML frontmatter). Real skill "
"bodies run 5002000 chars; the default 200 is a "
"\"one paragraph\" floor that catches stubs. Min 1."
),
},
"enabled": {
"kind": "bool",
"default": True,
"hint": (
"Master kill-switch for the LLM guardrail tier. When "
"False (or when ANTHROPIC_API_KEY / LLM_API_KEY is "
"absent), uploads still run the inline mechanical "
"checks but skip the LLM security + content-quality "
"review and auto-approve. Default True."
),
},
"review_model": {
"kind": "select",
"default": "haiku",
"options": ["haiku", "sonnet", "opus"],
"hint": (
"Anthropic model tier for the LLM security + content "
"review. Haiku is the cheapest and fastest; Sonnet / "
"Opus catch subtler prompt-injection + vague descriptions "
"at proportionally higher per-upload cost."
),
},
"blocked_quota_per_day": {
"kind": "int",
"default": 50,
"hint": (
"Per-submitter cap on `blocked_inline` rows in the "
"trailing 24h. Bounds the worst case where a bot loops "
"on malformed ZIPs. 0 disables the quota. Default 50."
),
},
"blocked_bundle_ttl_days": {
"kind": "int",
"default": 30,
"hint": (
"How many days to keep a blocked bundle's bytes on disk. "
"The submission row + sha256 + size always survive; only "
"the bytes get removed. 0 disables the purge entirely. "
"Default 30."
),
},
"stuck_review_grace_seconds": {
"kind": "int",
"default": 1800,
"hint": (
"How long a submission may stay at `status='pending_llm'` "
"before the reaper flips it to `review_error`. Default "
"1800 (30 min) comfortably exceeds Sonnet / Opus p99 "
"wall time. 0 disables the reaper."
),
},
},
} }
# Keys whose values must be redacted from the audit diff. We match # Keys whose values must be redacted from the audit diff. We match

View file

@ -449,6 +449,62 @@ def get_guardrails_stuck_review_grace_seconds() -> int:
return 1800 return 1800
def get_guardrails_min_description_chars() -> int:
"""Minimum character floor for skill / agent / plugin descriptions.
Reads ``guardrails.min_description_chars`` (default 60). Set the
floor low (e.g. 30) to relax the inline content check; set high
(e.g. 120) to push submitters closer to the Claude-skill-ecosystem
norm of 150220 chars per description.
"""
val = get_value("guardrails", "min_description_chars", default=60)
try:
return max(1, int(val))
except (TypeError, ValueError):
return 60
def get_guardrails_min_command_description_chars() -> int:
"""Minimum character floor for slash-command descriptions.
Reads ``guardrails.min_command_description_chars`` (default 25).
Commands are typically one-verb actions kept tighter than skills.
"""
val = get_value("guardrails", "min_command_description_chars", default=25)
try:
return max(1, int(val))
except (TypeError, ValueError):
return 25
def get_guardrails_min_distinct_words() -> int:
"""Minimum distinct-word count for any description string.
Reads ``guardrails.min_distinct_words`` (default 5). Defends against
"padding hits the char count but says nothing" cases like
`"description description description description"`.
"""
val = get_value("guardrails", "min_distinct_words", default=5)
try:
return max(1, int(val))
except (TypeError, ValueError):
return 5
def get_guardrails_min_body_chars() -> int:
"""Minimum body-content floor for skill / agent files.
Reads ``guardrails.min_body_chars`` (default 200). Body = the
markdown after the YAML frontmatter. 200 chars is a "one paragraph"
floor that catches stubs; real skill bodies run 5002000 chars.
"""
val = get_value("guardrails", "min_body_chars", default=200)
try:
return max(1, int(val))
except (TypeError, ValueError):
return 200
def get_guardrails_enabled() -> bool: def get_guardrails_enabled() -> bool:
"""Master kill-switch for the guardrail pipeline. """Master kill-switch for the guardrail pipeline.

View file

@ -1150,6 +1150,27 @@ async def install_redirect(request: Request):
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
def _guardrail_thresholds() -> dict[str, int]:
"""Live admin-configurable thresholds surfaced into the upload UI.
Each render reads the current value so the disclosure / counter /
examples-table copy stays in lock-step with the
/admin/server-config patch no app restart required.
"""
from app.instance_config import (
get_guardrails_min_body_chars,
get_guardrails_min_command_description_chars,
get_guardrails_min_description_chars,
get_guardrails_min_distinct_words,
)
return {
"min_description_chars": get_guardrails_min_description_chars(),
"min_command_description_chars": get_guardrails_min_command_description_chars(),
"min_distinct_words": get_guardrails_min_distinct_words(),
"min_body_chars": get_guardrails_min_body_chars(),
}
@router.get("/store/new", response_class=HTMLResponse) @router.get("/store/new", response_class=HTMLResponse)
async def store_new( async def store_new(
request: Request, request: Request,
@ -1157,7 +1178,11 @@ async def store_new(
conn: duckdb.DuckDBPyConnection = Depends(_get_db), conn: duckdb.DuckDBPyConnection = Depends(_get_db),
): ):
from src.store_categories import STORE_CATEGORIES from src.store_categories import STORE_CATEGORIES
ctx = _build_context(request, user=user, categories=list(STORE_CATEGORIES)) ctx = _build_context(
request, user=user,
categories=list(STORE_CATEGORIES),
guardrail=_guardrail_thresholds(),
)
return templates.TemplateResponse(request, "store_upload.html", ctx) return templates.TemplateResponse(request, "store_upload.html", ctx)
@ -1173,7 +1198,7 @@ async def store_examples(
whose bundle failed review can see what 'good' looks like whose bundle failed review can see what 'good' looks like
side-by-side with the rule that bit them. side-by-side with the rule that bit them.
""" """
ctx = _build_context(request, user=user) ctx = _build_context(request, user=user, guardrail=_guardrail_thresholds())
return templates.TemplateResponse(request, "store_examples.html", ctx) return templates.TemplateResponse(request, "store_examples.html", ctx)

View file

@ -222,6 +222,10 @@ const SECTION_META = {
title: "Materialize", title: "Materialize",
help: "Concurrency safety net for the materialize path. Controls the file-lock TTL used to detect and reclaim stale locks from hard-killed processes.", help: "Concurrency safety net for the materialize path. Controls the file-lock TTL used to detect and reclaim stale locks from hard-killed processes.",
}, },
guardrails: {
title: "Flea-market guardrails",
help: "Per-component content quality thresholds for store uploads. Lower the min-* knobs to relax the bar; raise to push submitters toward longer, more useful descriptions. The LLM tier (review_model + enabled) governs the second-stage substantive review.",
},
}; };
const DANGER_SECTIONS = new Set(["auth", "server"]); const DANGER_SECTIONS = new Set(["auth", "server"]);

View file

@ -97,21 +97,21 @@
<tbody> <tbody>
<tr> <tr>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Skill / agent / plugin description</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Skill / agent / plugin description</td>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);"><strong>60 chars</strong> · 5 distinct words</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);"><strong>{{ guardrail.min_description_chars|default(60) }} chars</strong> · {{ guardrail.min_distinct_words|default(5) }} distinct words</td>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">120220 chars (one full sentence)</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">120220 chars (one full sentence)</td>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Tells the assistant when to use the component and what it does. Showed on the marketplace tile so others can pick it.</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Tells the assistant when to use the component and what it does. Showed on the marketplace tile so others can pick it.</td>
</tr> </tr>
<tr> <tr>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Command description</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Command description</td>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);"><strong>25 chars</strong> · 5 distinct words</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);"><strong>{{ guardrail.min_command_description_chars|default(25) }} chars</strong> · {{ guardrail.min_distinct_words|default(5) }} distinct words</td>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">40100 chars</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">40100 chars</td>
<td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Commands are one-verb actions ("run tests", "format code"). A short clear sentence is enough.</td> <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Commands are one-verb actions ("run tests", "format code"). A short clear sentence is enough.</td>
</tr> </tr>
<tr> <tr>
<td style="padding: 8px 10px;">Skill / agent content body</td> <td style="padding: 8px 10px;">Skill / agent content body</td>
<td style="padding: 8px 10px;"><strong>200 chars</strong></td> <td style="padding: 8px 10px;"><strong>{{ guardrail.min_body_chars|default(200) }} chars</strong></td>
<td style="padding: 8px 10px;">5002000 chars</td> <td style="padding: 8px 10px;">5002000 chars</td>
<td style="padding: 8px 10px;">The body explains what the component does once used: inputs it expects, outputs it produces, edge cases. 200 chars is the bare-minimum "one paragraph" floor.</td> <td style="padding: 8px 10px;">The body explains what the component does once used: inputs it expects, outputs it produces, edge cases. The minimum is a "one paragraph" floor that catches stubs.</td>
</tr> </tr>
</tbody> </tbody>
</table> </table>

View file

@ -382,8 +382,8 @@
</p> </p>
<p><strong>The bar:</strong></p> <p><strong>The bar:</strong></p>
<ul> <ul>
<li>Each description ≥ <strong>60 characters</strong> (commands ≥ 25). Aim for one complete sentence — too short and the assistant can't tell when to use it.</li> <li>Each description ≥ <strong>{{ guardrail.min_description_chars|default(60) }} characters</strong> (commands ≥ {{ guardrail.min_command_description_chars|default(25) }}). Aim for one complete sentence — too short and the assistant can't tell when to use it.</li>
<li>At least 5 distinct words</li> <li>At least {{ guardrail.min_distinct_words|default(5) }} distinct words</li>
<li>No <code>TODO</code> / <code>TBD</code> / template placeholders</li> <li>No <code>TODO</code> / <code>TBD</code> / template placeholders</li>
<li>Action-oriented, names the trigger condition</li> <li>Action-oriented, names the trigger condition</li>
</ul> </ul>
@ -479,8 +479,8 @@
</p> </p>
<p><strong>The bar:</strong></p> <p><strong>The bar:</strong></p>
<ul> <ul>
<li>Each description ≥ <strong>60 characters</strong> (commands ≥ 25). Aim for one complete sentence — too short and the assistant can't tell when to use it.</li> <li>Each description ≥ <strong>{{ guardrail.min_description_chars|default(60) }} characters</strong> (commands ≥ {{ guardrail.min_command_description_chars|default(25) }}). Aim for one complete sentence — too short and the assistant can't tell when to use it.</li>
<li>At least 5 distinct words</li> <li>At least {{ guardrail.min_distinct_words|default(5) }} distinct words</li>
<li>No <code>TODO</code> / <code>TBD</code> / template placeholders</li> <li>No <code>TODO</code> / <code>TBD</code> / template placeholders</li>
<li>Action-oriented, names the trigger condition</li> <li>Action-oriented, names the trigger condition</li>
</ul> </ul>
@ -915,10 +915,11 @@ function renderDocs() {
}); });
} }
// Description char counter — turns green at the 30-char threshold so the // Description char counter — turns green at the configured floor so the
// submitter gets immediate feedback that they're past the floor. The // submitter gets immediate feedback that they're past the bar. The server
// server is the source of truth; this is purely UX. // is the source of truth; this is purely UX. Floor is operator-configurable
const DESC_MIN = 60; // Matches src/store_guardrails/content_check.py // via /admin/server-config (default 60).
const DESC_MIN = {{ guardrail.min_description_chars|default(60) }}; // Live config from /admin/server-config
const descField = document.getElementById('description'); const descField = document.getElementById('description');
const descCounter = document.getElementById('desc-counter'); const descCounter = document.getElementById('desc-counter');
function updateDescCounter() { function updateDescCounter() {

View file

@ -1,6 +1,6 @@
[project] [project]
name = "agnes-the-ai-analyst" name = "agnes-the-ai-analyst"
version = "0.54.1" version = "0.54.2"
description = "Agnes — AI Data Analyst platform for AI analytical systems" description = "Agnes — AI Data Analyst platform for AI analytical systems"
requires-python = ">=3.11,<3.14" requires-python = ">=3.11,<3.14"
license = "MIT" license = "MIT"

View file

@ -32,16 +32,51 @@ from ._frontmatter import frontmatter_body_offset, parse_frontmatter
# Criteria constants # Criteria constants
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Hard floors. Rationale: 60 chars + 5 distinct words is the bottom of # Hard-floor defaults. Calibrated against real ecosystem norms (Claude
# ecosystem norms (npm 60-120, Docker Hub 100, VS Code 120, GitHub # skill packs cluster 150220 chars per description; npm / Docker Hub /
# 80-200) and well below real Claude/Anthropic skill examples (mean # VS Code 100120). 60 chars + 5 distinct words catches the "didn't
# ~190 chars, range 159-224 across superpowers / compound-engineering
# / octo skill packs). The 60-char floor catches the obvious "didn't
# bother" cases; the LLM tier judges substantive quality on top. # bother" cases; the LLM tier judges substantive quality on top.
_MIN_DESC_CHARS = 60 #
_MIN_COMMAND_DESC_CHARS = 25 # Defaults are overridable via `instance.yaml.guardrails.*` keys —
_MIN_DISTINCT_WORDS = 5 # operators tune the floor without code changes. Resolution helpers
_MIN_BODY_CHARS = 200 # below read the live config on every call so /admin/server-config
# patches take effect on the next request (no app restart needed).
_DEFAULT_MIN_DESC_CHARS = 60
_DEFAULT_MIN_COMMAND_DESC_CHARS = 25
_DEFAULT_MIN_DISTINCT_WORDS = 5
_DEFAULT_MIN_BODY_CHARS = 200
def _min_desc_chars() -> int:
try:
from app.instance_config import get_guardrails_min_description_chars
return get_guardrails_min_description_chars()
except ImportError:
return _DEFAULT_MIN_DESC_CHARS
def _min_command_desc_chars() -> int:
try:
from app.instance_config import get_guardrails_min_command_description_chars
return get_guardrails_min_command_description_chars()
except ImportError:
return _DEFAULT_MIN_COMMAND_DESC_CHARS
def _min_distinct_words() -> int:
try:
from app.instance_config import get_guardrails_min_distinct_words
return get_guardrails_min_distinct_words()
except ImportError:
return _DEFAULT_MIN_DISTINCT_WORDS
def _min_body_chars() -> int:
try:
from app.instance_config import get_guardrails_min_body_chars
return get_guardrails_min_body_chars()
except ImportError:
return _DEFAULT_MIN_BODY_CHARS
# Case-insensitive substring matches that mark an unfilled template. # Case-insensitive substring matches that mark an unfilled template.
_PLACEHOLDER_PHRASES = ( _PLACEHOLDER_PHRASES = (
@ -201,7 +236,7 @@ def check_submission_description(description: Optional[str]) -> Dict[str, Any]:
single synthetic ``file`` = ``<submission>``. single synthetic ``file`` = ``<submission>``.
""" """
issues = _evaluate_description_string( issues = _evaluate_description_string(
description, min_chars=_MIN_DESC_CHARS, description, min_chars=_min_desc_chars(),
component_kind="submission", component_kind="submission",
) )
if issues: if issues:
@ -326,7 +361,7 @@ def _read_text(p: Path) -> str:
def _evaluate(comp: Dict[str, Any]) -> List[Dict[str, Any]]: def _evaluate(comp: Dict[str, Any]) -> List[Dict[str, Any]]:
"""Return the issue list for one component (empty when it passes).""" """Return the issue list for one component (empty when it passes)."""
type_ = comp["type"] type_ = comp["type"]
min_chars = _MIN_COMMAND_DESC_CHARS if type_ == "command" else _MIN_DESC_CHARS min_chars = _min_command_desc_chars() if type_ == "command" else _min_desc_chars()
issues = _evaluate_description_string( issues = _evaluate_description_string(
comp.get("description"), comp.get("description"),
@ -344,12 +379,12 @@ def _evaluate(comp: Dict[str, Any]) -> List[Dict[str, Any]]:
# commands often legitimately have a one-line body). # commands often legitimately have a one-line body).
if type_ in {"skill", "agent"}: if type_ in {"skill", "agent"}:
body = (comp.get("body") or "").strip() body = (comp.get("body") or "").strip()
if len(body) < _MIN_BODY_CHARS: if len(body) < _min_body_chars():
issues.append({ issues.append({
"file": comp["file"], "file": comp["file"],
"field": "body", "field": "body",
"code": "body_too_short", "code": "body_too_short",
"hint": _hint_for(type_, "body_too_short", min_chars=_MIN_BODY_CHARS), "hint": _hint_for(type_, "body_too_short", min_chars=_min_body_chars()),
"name": comp.get("name"), "name": comp.get("name"),
"component_type": type_, "component_type": type_,
}) })
@ -413,7 +448,7 @@ def _evaluate_description_string(
for t in raw.split() for t in raw.split()
] ]
distinct = {t for t in tokens if t} distinct = {t for t in tokens if t}
if len(distinct) < _MIN_DISTINCT_WORDS: if len(distinct) < _min_distinct_words():
return [{ return [{
"code": "low_word_count", "code": "low_word_count",
"hint": _hint_for(component_kind, "low_word_count"), "hint": _hint_for(component_kind, "low_word_count"),

View file

@ -0,0 +1,225 @@
"""Tests for the admin-configurable flea-market content-guardrail thresholds (#281).
Covers:
1. The four new `get_guardrails_min_*` getters in app/instance_config.py:
defaults, overlay-driven overrides, type coercion, and the
`max(1, int(val))` floor.
2. The round-trip: POST to /api/admin/server-config patches
`guardrails.min_description_chars`, the next inline content check
uses the new floor (closes the "primary testing gap" Vojta noted in
the PR #281 safe-fix commit message).
These tests close the only real gap surfaced in the PR #281 takeover
review every other reviewer finding was either already addressed in
Vojta's safe-fix commit or intentionally deferred (operator-direction
decisions on the `min_*=0` semantics + POST-time integer validation).
"""
from __future__ import annotations
import json
import shutil
import tempfile
from pathlib import Path
import pytest
import yaml as _yaml
def _auth(token: str) -> dict:
return {"Authorization": f"Bearer {token}"}
def _reset_cache() -> None:
import app.instance_config as ic
ic._instance_config = None
# ---------------------------------------------------------------------------
# Unit tests for the four new getters
# ---------------------------------------------------------------------------
class TestGuardrailGetterDefaults:
"""Each getter returns the documented default when nothing is configured."""
def test_min_description_chars_default_60(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("TESTING", "1")
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
_reset_cache()
from app.instance_config import get_guardrails_min_description_chars
assert get_guardrails_min_description_chars() == 60
def test_min_command_description_chars_default_25(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("TESTING", "1")
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
_reset_cache()
from app.instance_config import get_guardrails_min_command_description_chars
assert get_guardrails_min_command_description_chars() == 25
def test_min_distinct_words_default_5(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("TESTING", "1")
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
_reset_cache()
from app.instance_config import get_guardrails_min_distinct_words
assert get_guardrails_min_distinct_words() == 5
def test_min_body_chars_default_200(self, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("TESTING", "1")
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
_reset_cache()
from app.instance_config import get_guardrails_min_body_chars
assert get_guardrails_min_body_chars() == 200
class TestGuardrailGetterOverlay:
"""Operator-supplied overlay values win over defaults."""
def _seed_overlay(self, tmp_path, monkeypatch, payload: dict) -> None:
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("TESTING", "1")
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
(state / "instance.yaml").write_text(_yaml.dump(payload))
_reset_cache()
def test_overlay_overrides_min_description_chars(self, tmp_path, monkeypatch):
self._seed_overlay(tmp_path, monkeypatch, {
"guardrails": {"min_description_chars": 90},
})
from app.instance_config import get_guardrails_min_description_chars
assert get_guardrails_min_description_chars() == 90
def test_overlay_overrides_min_body_chars(self, tmp_path, monkeypatch):
self._seed_overlay(tmp_path, monkeypatch, {
"guardrails": {"min_body_chars": 500},
})
from app.instance_config import get_guardrails_min_body_chars
assert get_guardrails_min_body_chars() == 500
def test_string_value_coerced_to_int(self, tmp_path, monkeypatch):
# An operator hand-editing the YAML can leave a string that's still
# numeric — int() accepts it. Documented defensively in the getter.
self._seed_overlay(tmp_path, monkeypatch, {
"guardrails": {"min_distinct_words": "8"},
})
from app.instance_config import get_guardrails_min_distinct_words
assert get_guardrails_min_distinct_words() == 8
def test_garbage_value_falls_back_to_default(self, tmp_path, monkeypatch):
# Bool / non-numeric string / other garbage hits the
# `(TypeError, ValueError)` branch and returns the documented default.
self._seed_overlay(tmp_path, monkeypatch, {
"guardrails": {"min_command_description_chars": "not-a-number"},
})
from app.instance_config import get_guardrails_min_command_description_chars
assert get_guardrails_min_command_description_chars() == 25
def test_zero_or_negative_floored_to_one(self, tmp_path, monkeypatch):
# `max(1, int(val))` — operator setting 0 to "disable" doesn't
# actually disable; it's silently coerced to 1. Documented behavior;
# this test pins the contract so a future change to use 0-as-sentinel
# has to update this test (and reviewers see the policy decision).
self._seed_overlay(tmp_path, monkeypatch, {
"guardrails": {"min_description_chars": 0},
})
from app.instance_config import get_guardrails_min_description_chars
assert get_guardrails_min_description_chars() == 1
# Negative integer hits the same floor.
self._seed_overlay(tmp_path, monkeypatch, {
"guardrails": {"min_body_chars": -50},
})
from app.instance_config import get_guardrails_min_body_chars
assert get_guardrails_min_body_chars() == 1
# ---------------------------------------------------------------------------
# Round-trip: PATCH /api/admin/server-config → next inline check uses new floor
# ---------------------------------------------------------------------------
class TestPatchRoundTrip:
"""The "primary testing gap" Vojta flagged: an admin PATCH to
`guardrails.min_description_chars` must take effect on the very next
`content_check` call, with no app restart. The cache is invalidated
by /api/admin/server-config's reset_cache() bracket.
"""
def _write_skill(self, plugin_dir: Path, *, description: str) -> None:
target = plugin_dir / "skills" / "test-skill"
target.mkdir(parents=True, exist_ok=True)
body = "Body content explaining the skill in enough words to clear the body floor. " * 4
(target / "SKILL.md").write_text(
f"---\nname: test-skill\ndescription: {description}\n---\n\n{body}\n",
encoding="utf-8",
)
def test_patch_min_description_chars_takes_effect_next_check(
self, seeded_app, monkeypatch, tmp_path,
):
# 75-char description: passes default floor (60) but fails after
# we PATCH the floor to 90.
mid_length = "Use when validating the round-trip live config thresholds end to end now."
assert 60 <= len(mid_length) < 90, len(mid_length)
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
_reset_cache()
plugin_dir = Path(tempfile.mkdtemp(prefix="agnes_admin_config_test_"))
try:
self._write_skill(plugin_dir, description=mid_length)
# Step 1: at default floor (60), the description passes.
from src.store_guardrails.content_check import check as content_check
result = content_check(plugin_dir)
assert result["status"] == "pass", (
f"description {len(mid_length)} chars should pass default floor 60, "
f"got: {result}"
)
# Step 2: PATCH the floor to 90 via the admin API.
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={"sections": {"guardrails": {"min_description_chars": 90}}},
)
assert r.status_code in (200, 204), r.text
# Step 3: same description, same content_check — must now fail
# with too_short. Cache invalidation done inside the admin POST
# handler; no test-side reset_cache() call is needed (or
# acceptable — that would be testing the test, not the system).
result_after = content_check(plugin_dir)
assert result_after["status"] == "fail", (
f"after PATCH to floor 90, {len(mid_length)}-char description "
f"must fail; got: {result_after}"
)
codes = {issue["code"] for issue in result_after["issues"]}
assert "too_short" in codes, (
f"expected too_short in issue codes, got: {codes}"
)
# Step 4: PATCH the floor back to 60 (fixture hygiene + extra
# confirmation that subsequent PATCHes also propagate).
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={"sections": {"guardrails": {"min_description_chars": 60}}},
)
assert r.status_code in (200, 204), r.text
assert content_check(plugin_dir)["status"] == "pass", (
"PATCH-back-to-default did not propagate"
)
finally:
shutil.rmtree(plugin_dir, ignore_errors=True)
_reset_cache()