From 50a974f1969eff81dce4de1d13d0ae309cf4073a Mon Sep 17 00:00:00 2001 From: Vojtech <119944107+cvrysanek@users.noreply.github.com> Date: Wed, 13 May 2026 13:20:55 +0400 Subject: [PATCH] feat(store-guardrails): admin-configurable content thresholds (#281) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(store-guardrails): admin-configurable content thresholds Adds the flea-market content guardrail floors to the /admin/server-config editor so operators can tune the bar without code changes. Defaults are unchanged (60 chars description, 25 chars command, 5 distinct words, 200 chars body) — patching guardrails.* in instance.yaml or via the admin UI overrides any of them and the next inline check picks up the new value. src/store_guardrails/content_check.py now resolves the four floors via helper functions (_min_desc_chars / _min_command_desc_chars / _min_distinct_words / _min_body_chars) that read app.instance_config at call time. Module-level _DEFAULT_* constants remain as fallbacks if the import fails (defensive — keeps the guardrail module loadable without the app package on its path). app/instance_config.py grows four matching getters returning the live value with sane defaults + integer coercion. app/api/admin.py registers 'guardrails' as an editable section + ships nine known-fields entries (min_description_chars, min_command_description_chars, min_distinct_words, min_body_chars, enabled, review_model, blocked_quota_per_day, blocked_bundle_ttl_days, stuck_review_grace_seconds) with operator-facing hint copy explaining what each knob does. app/web/templates/admin_server_config.html gets a SECTION_META entry so the section renders as 'Flea-market guardrails' with a help string instead of a bare section ID. app/web/router.py threads the live thresholds into /store/new and /store/examples via a small _guardrail_thresholds() helper so the disclosure copy, char counter, and "Why these limits" table render the configured value (not a hardcoded 60). End-to-end smoke verified: PATCH guardrails.min_description_chars=90 → /store/new immediately renders "90 characters" + JS DESC_MIN=90 on the next request, no restart required (helpers read live config per call). * chore(store-guardrails): address PR review safe-fix findings Code-review safe_auto findings on PR #281 (review run 20260513-100126-64052520): - CHANGELOG: add Unreleased entry covering the new /admin/server-config Flea-market guardrails section, the four live threshold getters, and the route-helper rendering knobs. Required by the project's non-negotiable "Changelog discipline" rule. - content_check.py: narrow `except Exception` to `except ImportError` on the four `_min_*()` resolver helpers. Surface-level TypeError / ValueError on a malformed YAML value belongs to the instance_config getters' own try/except — the resolvers should only defend against the in-tree import itself failing, not silently swallow real bugs in the getters. - store_upload.html: refresh the stale "30-char threshold" comment to reflect the configurable floor (default 60), and add `|default(60)` / `|default(25)` / `|default(5)` filters to the disclosure-copy bindings so the upload form matches store_examples.html's belt-and-suspenders rendering if a future route ever renders the template without populating the `guardrail` context. - router.py: tighten `_guardrail_thresholds()` return annotation from bare `dict` to `dict[str, int]`. Residual work (left for separate change after operator direction): - Add round-trip test (PATCH guardrails -> next inline check uses new value) — primary testing gap. - Decide policy on `min_*=0` (currently coerced to 1 via `max(1, int(val))`) vs treating 0 as a disable sentinel like neighbour getters (`blocked_quota_per_day`, `blocked_bundle_ttl_days`). - Add POST-time integer validation for `guardrails.*` so a typo'd YAML value (bool / string / float) errors loudly instead of silently falling back to the default. * test(store-guardrails): cover admin-configurable thresholds + PATCH round-trip Closes the "primary testing gap" Vojta noted in the safe-fix commit on PR #281 — the four new `get_guardrails_min_*` getters and the PATCH-takes-effect-on-next-check live-config flow had no direct coverage. 10 new tests in `tests/test_store_guardrails_admin_config.py`: - TestGuardrailGetterDefaults (4 tests) — each new getter returns the documented default (60 / 25 / 5 / 200) when nothing is configured. - TestGuardrailGetterOverlay (5 tests) — overlay-driven overrides win, string values that look numeric coerce via int(), garbage strings fall back to default via the (TypeError, ValueError) branch, and the `max(1, int(val))` floor pins zero/negative inputs to 1. - TestPatchRoundTrip (1 test) — PATCH `/api/admin/server-config` `guardrails.min_description_chars=90`, then call content_check against a 75-char description that previously passed: must now fail with `too_short`. Then PATCH back to 60 and verify the next check passes again. Closes the cache-invalidation contract Vojta relies on for the "no app restart" claim — broken without the reset_cache() bracket in /api/admin/server-config. The TestGuardrailGetterOverlay.test_zero_or_negative_floored_to_one test pins the current `max(1, int(val))` policy. Vojta's safe-fix commit explicitly left "policy on min_*=0 vs disable-sentinel" as residual work — pinning the current behavior here ensures any future change to use 0 as a disable sentinel must update this test (and the reviewer sees the policy decision). Verified: 4509 tests pass locally (4499 existing + 10 new). * release: 0.54.2 — admin-configurable flea-market guardrail thresholds + tests Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.1 → 0.54.2) bundling Vojta's admin-configurable thresholds for the flea-market content guardrail (9 knobs in /admin/server-config) plus the test coverage closing the "primary testing gap" he punted in the safe-fix commit. No DB migration; defaults unchanged from PR #276 — instances that don't set `guardrails.*` keep the original bar transparently. --------- Co-authored-by: ZdenekSrotyr Co-authored-by: ZdenekSrotyr <139972147+ZdenekSrotyr@users.noreply.github.com> --- CHANGELOG.md | 22 ++ app/api/admin.py | 95 +++++++++ app/instance_config.py | 56 +++++ app/web/router.py | 29 ++- app/web/templates/admin_server_config.html | 4 + app/web/templates/store_examples.html | 8 +- app/web/templates/store_upload.html | 17 +- pyproject.toml | 2 +- src/store_guardrails/content_check.py | 63 ++++-- tests/test_store_guardrails_admin_config.py | 225 ++++++++++++++++++++ 10 files changed, 492 insertions(+), 29 deletions(-) create mode 100644 tests/test_store_guardrails_admin_config.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 68784ed..8d3eff8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,28 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C ## [Unreleased] +## [0.54.2] — 2026-05-13 + +### Added + +- **Admin-configurable flea-market content guardrail thresholds.** + `/admin/server-config` gains a new **Flea-market guardrails** section + exposing nine knobs: `min_description_chars` (default 60), + `min_command_description_chars` (default 25), `min_distinct_words` + (default 5), `min_body_chars` (default 200), `enabled` (master + kill-switch), `review_model` (haiku / sonnet / opus), + `blocked_quota_per_day` (default 50), `blocked_bundle_ttl_days` + (default 30), `stuck_review_grace_seconds` (default 1800). Each + field carries an operator-facing hint string. The four mechanical + floors are read from `app.instance_config` on every inline check, + so a `/admin/server-config` PATCH takes effect on the next request + without restarting uvicorn. `/store/new` (live char counter + + disclosure copy) and `/store/examples` (the "Why these limits" + table) render the configured values via a small + `_guardrail_thresholds()` helper threaded into the route context. + Defaults are unchanged — instances that don't set + `guardrails.*` keep the original PR #276 bar. + ## [0.54.1] — 2026-05-13 ### Added diff --git a/app/api/admin.py b/app/api/admin.py index 6b69cfe..f4b1040 100644 --- a/app/api/admin.py +++ b/app/api/admin.py @@ -258,6 +258,7 @@ _EDITABLE_SECTIONS: tuple[str, ...] = ( "desktop", "corporate_memory", "materialize", + "guardrails", ) # "Danger-zone" sections — flipping these can lock operators out (auth.*) or @@ -703,6 +704,100 @@ _KNOWN_FIELDS: dict[str, dict[str, dict]] = { ), }, }, + "guardrails": { + "min_description_chars": { + "kind": "int", + "default": 60, + "hint": ( + "Minimum character floor for skill / agent / plugin " + "descriptions on flea-market uploads (the inline content " + "guardrail). Real-world Claude skill descriptions cluster " + "150–220 chars; the default 60 is the bottom of the bar " + "to catch placeholders. Bump to 100+ to push submitters " + "closer to the ecosystem norm. Min 1." + ), + }, + "min_command_description_chars": { + "kind": "int", + "default": 25, + "hint": ( + "Minimum character floor for slash-command descriptions. " + "Tighter than skills because commands are one-verb " + "actions (\"run tests\", \"format code\"). Default 25. Min 1." + ), + }, + "min_distinct_words": { + "kind": "int", + "default": 5, + "hint": ( + "Minimum number of DISTINCT words in any description " + "string. Defends against padding-only descriptions like " + "\"description description description\" that hit the " + "character count but say nothing. Default 5. Min 1." + ), + }, + "min_body_chars": { + "kind": "int", + "default": 200, + "hint": ( + "Minimum body-content floor for skill / agent files " + "(the markdown after the YAML frontmatter). Real skill " + "bodies run 500–2000 chars; the default 200 is a " + "\"one paragraph\" floor that catches stubs. Min 1." + ), + }, + "enabled": { + "kind": "bool", + "default": True, + "hint": ( + "Master kill-switch for the LLM guardrail tier. When " + "False (or when ANTHROPIC_API_KEY / LLM_API_KEY is " + "absent), uploads still run the inline mechanical " + "checks but skip the LLM security + content-quality " + "review and auto-approve. Default True." + ), + }, + "review_model": { + "kind": "select", + "default": "haiku", + "options": ["haiku", "sonnet", "opus"], + "hint": ( + "Anthropic model tier for the LLM security + content " + "review. Haiku is the cheapest and fastest; Sonnet / " + "Opus catch subtler prompt-injection + vague descriptions " + "at proportionally higher per-upload cost." + ), + }, + "blocked_quota_per_day": { + "kind": "int", + "default": 50, + "hint": ( + "Per-submitter cap on `blocked_inline` rows in the " + "trailing 24h. Bounds the worst case where a bot loops " + "on malformed ZIPs. 0 disables the quota. Default 50." + ), + }, + "blocked_bundle_ttl_days": { + "kind": "int", + "default": 30, + "hint": ( + "How many days to keep a blocked bundle's bytes on disk. " + "The submission row + sha256 + size always survive; only " + "the bytes get removed. 0 disables the purge entirely. " + "Default 30." + ), + }, + "stuck_review_grace_seconds": { + "kind": "int", + "default": 1800, + "hint": ( + "How long a submission may stay at `status='pending_llm'` " + "before the reaper flips it to `review_error`. Default " + "1800 (30 min) comfortably exceeds Sonnet / Opus p99 " + "wall time. 0 disables the reaper." + ), + }, + }, } # Keys whose values must be redacted from the audit diff. We match diff --git a/app/instance_config.py b/app/instance_config.py index d5eabcc..6dff3c1 100644 --- a/app/instance_config.py +++ b/app/instance_config.py @@ -449,6 +449,62 @@ def get_guardrails_stuck_review_grace_seconds() -> int: return 1800 +def get_guardrails_min_description_chars() -> int: + """Minimum character floor for skill / agent / plugin descriptions. + + Reads ``guardrails.min_description_chars`` (default 60). Set the + floor low (e.g. 30) to relax the inline content check; set high + (e.g. 120) to push submitters closer to the Claude-skill-ecosystem + norm of 150–220 chars per description. + """ + val = get_value("guardrails", "min_description_chars", default=60) + try: + return max(1, int(val)) + except (TypeError, ValueError): + return 60 + + +def get_guardrails_min_command_description_chars() -> int: + """Minimum character floor for slash-command descriptions. + + Reads ``guardrails.min_command_description_chars`` (default 25). + Commands are typically one-verb actions — kept tighter than skills. + """ + val = get_value("guardrails", "min_command_description_chars", default=25) + try: + return max(1, int(val)) + except (TypeError, ValueError): + return 25 + + +def get_guardrails_min_distinct_words() -> int: + """Minimum distinct-word count for any description string. + + Reads ``guardrails.min_distinct_words`` (default 5). Defends against + "padding hits the char count but says nothing" cases like + `"description description description description"`. + """ + val = get_value("guardrails", "min_distinct_words", default=5) + try: + return max(1, int(val)) + except (TypeError, ValueError): + return 5 + + +def get_guardrails_min_body_chars() -> int: + """Minimum body-content floor for skill / agent files. + + Reads ``guardrails.min_body_chars`` (default 200). Body = the + markdown after the YAML frontmatter. 200 chars is a "one paragraph" + floor that catches stubs; real skill bodies run 500–2000 chars. + """ + val = get_value("guardrails", "min_body_chars", default=200) + try: + return max(1, int(val)) + except (TypeError, ValueError): + return 200 + + def get_guardrails_enabled() -> bool: """Master kill-switch for the guardrail pipeline. diff --git a/app/web/router.py b/app/web/router.py index 0bd0bb0..09aa2e3 100644 --- a/app/web/router.py +++ b/app/web/router.py @@ -1150,6 +1150,27 @@ async def install_redirect(request: Request): # --------------------------------------------------------------------------- +def _guardrail_thresholds() -> dict[str, int]: + """Live admin-configurable thresholds surfaced into the upload UI. + + Each render reads the current value so the disclosure / counter / + examples-table copy stays in lock-step with the + /admin/server-config patch — no app restart required. + """ + from app.instance_config import ( + get_guardrails_min_body_chars, + get_guardrails_min_command_description_chars, + get_guardrails_min_description_chars, + get_guardrails_min_distinct_words, + ) + return { + "min_description_chars": get_guardrails_min_description_chars(), + "min_command_description_chars": get_guardrails_min_command_description_chars(), + "min_distinct_words": get_guardrails_min_distinct_words(), + "min_body_chars": get_guardrails_min_body_chars(), + } + + @router.get("/store/new", response_class=HTMLResponse) async def store_new( request: Request, @@ -1157,7 +1178,11 @@ async def store_new( conn: duckdb.DuckDBPyConnection = Depends(_get_db), ): from src.store_categories import STORE_CATEGORIES - ctx = _build_context(request, user=user, categories=list(STORE_CATEGORIES)) + ctx = _build_context( + request, user=user, + categories=list(STORE_CATEGORIES), + guardrail=_guardrail_thresholds(), + ) return templates.TemplateResponse(request, "store_upload.html", ctx) @@ -1173,7 +1198,7 @@ async def store_examples( whose bundle failed review can see what 'good' looks like side-by-side with the rule that bit them. """ - ctx = _build_context(request, user=user) + ctx = _build_context(request, user=user, guardrail=_guardrail_thresholds()) return templates.TemplateResponse(request, "store_examples.html", ctx) diff --git a/app/web/templates/admin_server_config.html b/app/web/templates/admin_server_config.html index 1a11615..56ff75d 100644 --- a/app/web/templates/admin_server_config.html +++ b/app/web/templates/admin_server_config.html @@ -222,6 +222,10 @@ const SECTION_META = { title: "Materialize", help: "Concurrency safety net for the materialize path. Controls the file-lock TTL used to detect and reclaim stale locks from hard-killed processes.", }, + guardrails: { + title: "Flea-market guardrails", + help: "Per-component content quality thresholds for store uploads. Lower the min-* knobs to relax the bar; raise to push submitters toward longer, more useful descriptions. The LLM tier (review_model + enabled) governs the second-stage substantive review.", + }, }; const DANGER_SECTIONS = new Set(["auth", "server"]); diff --git a/app/web/templates/store_examples.html b/app/web/templates/store_examples.html index b789889..4ddb514 100644 --- a/app/web/templates/store_examples.html +++ b/app/web/templates/store_examples.html @@ -97,21 +97,21 @@ Skill / agent / plugin description - 60 chars · 5 distinct words + {{ guardrail.min_description_chars|default(60) }} chars · {{ guardrail.min_distinct_words|default(5) }} distinct words 120–220 chars (one full sentence) Tells the assistant when to use the component and what it does. Showed on the marketplace tile so others can pick it. Command description - 25 chars · 5 distinct words + {{ guardrail.min_command_description_chars|default(25) }} chars · {{ guardrail.min_distinct_words|default(5) }} distinct words 40–100 chars Commands are one-verb actions ("run tests", "format code"). A short clear sentence is enough. Skill / agent content body - 200 chars + {{ guardrail.min_body_chars|default(200) }} chars 500–2000 chars - The body explains what the component does once used: inputs it expects, outputs it produces, edge cases. 200 chars is the bare-minimum "one paragraph" floor. + The body explains what the component does once used: inputs it expects, outputs it produces, edge cases. The minimum is a "one paragraph" floor that catches stubs. diff --git a/app/web/templates/store_upload.html b/app/web/templates/store_upload.html index 9d17831..7b8e6c6 100644 --- a/app/web/templates/store_upload.html +++ b/app/web/templates/store_upload.html @@ -382,8 +382,8 @@

The bar:

@@ -479,8 +479,8 @@

The bar:

@@ -915,10 +915,11 @@ function renderDocs() { }); } -// Description char counter — turns green at the 30-char threshold so the -// submitter gets immediate feedback that they're past the floor. The -// server is the source of truth; this is purely UX. -const DESC_MIN = 60; // Matches src/store_guardrails/content_check.py +// Description char counter — turns green at the configured floor so the +// submitter gets immediate feedback that they're past the bar. The server +// is the source of truth; this is purely UX. Floor is operator-configurable +// via /admin/server-config (default 60). +const DESC_MIN = {{ guardrail.min_description_chars|default(60) }}; // Live config from /admin/server-config const descField = document.getElementById('description'); const descCounter = document.getElementById('desc-counter'); function updateDescCounter() { diff --git a/pyproject.toml b/pyproject.toml index 4a63dfe..a839283 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "agnes-the-ai-analyst" -version = "0.54.1" +version = "0.54.2" description = "Agnes — AI Data Analyst platform for AI analytical systems" requires-python = ">=3.11,<3.14" license = "MIT" diff --git a/src/store_guardrails/content_check.py b/src/store_guardrails/content_check.py index 1ee7cee..63a780a 100644 --- a/src/store_guardrails/content_check.py +++ b/src/store_guardrails/content_check.py @@ -32,16 +32,51 @@ from ._frontmatter import frontmatter_body_offset, parse_frontmatter # Criteria constants # --------------------------------------------------------------------------- -# Hard floors. Rationale: 60 chars + 5 distinct words is the bottom of -# ecosystem norms (npm 60-120, Docker Hub 100, VS Code 120, GitHub -# 80-200) and well below real Claude/Anthropic skill examples (mean -# ~190 chars, range 159-224 across superpowers / compound-engineering -# / octo skill packs). The 60-char floor catches the obvious "didn't +# Hard-floor defaults. Calibrated against real ecosystem norms (Claude +# skill packs cluster 150–220 chars per description; npm / Docker Hub / +# VS Code 100–120). 60 chars + 5 distinct words catches the "didn't # bother" cases; the LLM tier judges substantive quality on top. -_MIN_DESC_CHARS = 60 -_MIN_COMMAND_DESC_CHARS = 25 -_MIN_DISTINCT_WORDS = 5 -_MIN_BODY_CHARS = 200 +# +# Defaults are overridable via `instance.yaml.guardrails.*` keys — +# operators tune the floor without code changes. Resolution helpers +# below read the live config on every call so /admin/server-config +# patches take effect on the next request (no app restart needed). +_DEFAULT_MIN_DESC_CHARS = 60 +_DEFAULT_MIN_COMMAND_DESC_CHARS = 25 +_DEFAULT_MIN_DISTINCT_WORDS = 5 +_DEFAULT_MIN_BODY_CHARS = 200 + + +def _min_desc_chars() -> int: + try: + from app.instance_config import get_guardrails_min_description_chars + return get_guardrails_min_description_chars() + except ImportError: + return _DEFAULT_MIN_DESC_CHARS + + +def _min_command_desc_chars() -> int: + try: + from app.instance_config import get_guardrails_min_command_description_chars + return get_guardrails_min_command_description_chars() + except ImportError: + return _DEFAULT_MIN_COMMAND_DESC_CHARS + + +def _min_distinct_words() -> int: + try: + from app.instance_config import get_guardrails_min_distinct_words + return get_guardrails_min_distinct_words() + except ImportError: + return _DEFAULT_MIN_DISTINCT_WORDS + + +def _min_body_chars() -> int: + try: + from app.instance_config import get_guardrails_min_body_chars + return get_guardrails_min_body_chars() + except ImportError: + return _DEFAULT_MIN_BODY_CHARS # Case-insensitive substring matches that mark an unfilled template. _PLACEHOLDER_PHRASES = ( @@ -201,7 +236,7 @@ def check_submission_description(description: Optional[str]) -> Dict[str, Any]: single synthetic ``file`` = ````. """ issues = _evaluate_description_string( - description, min_chars=_MIN_DESC_CHARS, + description, min_chars=_min_desc_chars(), component_kind="submission", ) if issues: @@ -326,7 +361,7 @@ def _read_text(p: Path) -> str: def _evaluate(comp: Dict[str, Any]) -> List[Dict[str, Any]]: """Return the issue list for one component (empty when it passes).""" type_ = comp["type"] - min_chars = _MIN_COMMAND_DESC_CHARS if type_ == "command" else _MIN_DESC_CHARS + min_chars = _min_command_desc_chars() if type_ == "command" else _min_desc_chars() issues = _evaluate_description_string( comp.get("description"), @@ -344,12 +379,12 @@ def _evaluate(comp: Dict[str, Any]) -> List[Dict[str, Any]]: # commands often legitimately have a one-line body). if type_ in {"skill", "agent"}: body = (comp.get("body") or "").strip() - if len(body) < _MIN_BODY_CHARS: + if len(body) < _min_body_chars(): issues.append({ "file": comp["file"], "field": "body", "code": "body_too_short", - "hint": _hint_for(type_, "body_too_short", min_chars=_MIN_BODY_CHARS), + "hint": _hint_for(type_, "body_too_short", min_chars=_min_body_chars()), "name": comp.get("name"), "component_type": type_, }) @@ -413,7 +448,7 @@ def _evaluate_description_string( for t in raw.split() ] distinct = {t for t in tokens if t} - if len(distinct) < _MIN_DISTINCT_WORDS: + if len(distinct) < _min_distinct_words(): return [{ "code": "low_word_count", "hint": _hint_for(component_kind, "low_word_count"), diff --git a/tests/test_store_guardrails_admin_config.py b/tests/test_store_guardrails_admin_config.py new file mode 100644 index 0000000..5ae212a --- /dev/null +++ b/tests/test_store_guardrails_admin_config.py @@ -0,0 +1,225 @@ +"""Tests for the admin-configurable flea-market content-guardrail thresholds (#281). + +Covers: +1. The four new `get_guardrails_min_*` getters in app/instance_config.py: + defaults, overlay-driven overrides, type coercion, and the + `max(1, int(val))` floor. +2. The round-trip: POST to /api/admin/server-config patches + `guardrails.min_description_chars`, the next inline content check + uses the new floor (closes the "primary testing gap" Vojta noted in + the PR #281 safe-fix commit message). + +These tests close the only real gap surfaced in the PR #281 takeover +review — every other reviewer finding was either already addressed in +Vojta's safe-fix commit or intentionally deferred (operator-direction +decisions on the `min_*=0` semantics + POST-time integer validation). +""" + +from __future__ import annotations + +import json +import shutil +import tempfile +from pathlib import Path + +import pytest +import yaml as _yaml + + +def _auth(token: str) -> dict: + return {"Authorization": f"Bearer {token}"} + + +def _reset_cache() -> None: + import app.instance_config as ic + ic._instance_config = None + + +# --------------------------------------------------------------------------- +# Unit tests for the four new getters +# --------------------------------------------------------------------------- + + +class TestGuardrailGetterDefaults: + """Each getter returns the documented default when nothing is configured.""" + + def test_min_description_chars_default_60(self, tmp_path, monkeypatch): + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + monkeypatch.setenv("TESTING", "1") + monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!") + _reset_cache() + from app.instance_config import get_guardrails_min_description_chars + assert get_guardrails_min_description_chars() == 60 + + def test_min_command_description_chars_default_25(self, tmp_path, monkeypatch): + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + monkeypatch.setenv("TESTING", "1") + monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!") + _reset_cache() + from app.instance_config import get_guardrails_min_command_description_chars + assert get_guardrails_min_command_description_chars() == 25 + + def test_min_distinct_words_default_5(self, tmp_path, monkeypatch): + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + monkeypatch.setenv("TESTING", "1") + monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!") + _reset_cache() + from app.instance_config import get_guardrails_min_distinct_words + assert get_guardrails_min_distinct_words() == 5 + + def test_min_body_chars_default_200(self, tmp_path, monkeypatch): + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + monkeypatch.setenv("TESTING", "1") + monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!") + _reset_cache() + from app.instance_config import get_guardrails_min_body_chars + assert get_guardrails_min_body_chars() == 200 + + +class TestGuardrailGetterOverlay: + """Operator-supplied overlay values win over defaults.""" + + def _seed_overlay(self, tmp_path, monkeypatch, payload: dict) -> None: + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + monkeypatch.setenv("TESTING", "1") + monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!") + state = tmp_path / "state" + state.mkdir(parents=True, exist_ok=True) + (state / "instance.yaml").write_text(_yaml.dump(payload)) + _reset_cache() + + def test_overlay_overrides_min_description_chars(self, tmp_path, monkeypatch): + self._seed_overlay(tmp_path, monkeypatch, { + "guardrails": {"min_description_chars": 90}, + }) + from app.instance_config import get_guardrails_min_description_chars + assert get_guardrails_min_description_chars() == 90 + + def test_overlay_overrides_min_body_chars(self, tmp_path, monkeypatch): + self._seed_overlay(tmp_path, monkeypatch, { + "guardrails": {"min_body_chars": 500}, + }) + from app.instance_config import get_guardrails_min_body_chars + assert get_guardrails_min_body_chars() == 500 + + def test_string_value_coerced_to_int(self, tmp_path, monkeypatch): + # An operator hand-editing the YAML can leave a string that's still + # numeric — int() accepts it. Documented defensively in the getter. + self._seed_overlay(tmp_path, monkeypatch, { + "guardrails": {"min_distinct_words": "8"}, + }) + from app.instance_config import get_guardrails_min_distinct_words + assert get_guardrails_min_distinct_words() == 8 + + def test_garbage_value_falls_back_to_default(self, tmp_path, monkeypatch): + # Bool / non-numeric string / other garbage hits the + # `(TypeError, ValueError)` branch and returns the documented default. + self._seed_overlay(tmp_path, monkeypatch, { + "guardrails": {"min_command_description_chars": "not-a-number"}, + }) + from app.instance_config import get_guardrails_min_command_description_chars + assert get_guardrails_min_command_description_chars() == 25 + + def test_zero_or_negative_floored_to_one(self, tmp_path, monkeypatch): + # `max(1, int(val))` — operator setting 0 to "disable" doesn't + # actually disable; it's silently coerced to 1. Documented behavior; + # this test pins the contract so a future change to use 0-as-sentinel + # has to update this test (and reviewers see the policy decision). + self._seed_overlay(tmp_path, monkeypatch, { + "guardrails": {"min_description_chars": 0}, + }) + from app.instance_config import get_guardrails_min_description_chars + assert get_guardrails_min_description_chars() == 1 + + # Negative integer hits the same floor. + self._seed_overlay(tmp_path, monkeypatch, { + "guardrails": {"min_body_chars": -50}, + }) + from app.instance_config import get_guardrails_min_body_chars + assert get_guardrails_min_body_chars() == 1 + + +# --------------------------------------------------------------------------- +# Round-trip: PATCH /api/admin/server-config → next inline check uses new floor +# --------------------------------------------------------------------------- + + +class TestPatchRoundTrip: + """The "primary testing gap" Vojta flagged: an admin PATCH to + `guardrails.min_description_chars` must take effect on the very next + `content_check` call, with no app restart. The cache is invalidated + by /api/admin/server-config's reset_cache() bracket. + """ + + def _write_skill(self, plugin_dir: Path, *, description: str) -> None: + target = plugin_dir / "skills" / "test-skill" + target.mkdir(parents=True, exist_ok=True) + body = "Body content explaining the skill in enough words to clear the body floor. " * 4 + (target / "SKILL.md").write_text( + f"---\nname: test-skill\ndescription: {description}\n---\n\n{body}\n", + encoding="utf-8", + ) + + def test_patch_min_description_chars_takes_effect_next_check( + self, seeded_app, monkeypatch, tmp_path, + ): + # 75-char description: passes default floor (60) but fails after + # we PATCH the floor to 90. + mid_length = "Use when validating the round-trip live config thresholds end to end now." + assert 60 <= len(mid_length) < 90, len(mid_length) + + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + state = tmp_path / "state" + state.mkdir(parents=True, exist_ok=True) + _reset_cache() + + plugin_dir = Path(tempfile.mkdtemp(prefix="agnes_admin_config_test_")) + try: + self._write_skill(plugin_dir, description=mid_length) + + # Step 1: at default floor (60), the description passes. + from src.store_guardrails.content_check import check as content_check + result = content_check(plugin_dir) + assert result["status"] == "pass", ( + f"description {len(mid_length)} chars should pass default floor 60, " + f"got: {result}" + ) + + # Step 2: PATCH the floor to 90 via the admin API. + c = seeded_app["client"] + token = seeded_app["admin_token"] + r = c.post( + "/api/admin/server-config", + headers=_auth(token), + json={"sections": {"guardrails": {"min_description_chars": 90}}}, + ) + assert r.status_code in (200, 204), r.text + + # Step 3: same description, same content_check — must now fail + # with too_short. Cache invalidation done inside the admin POST + # handler; no test-side reset_cache() call is needed (or + # acceptable — that would be testing the test, not the system). + result_after = content_check(plugin_dir) + assert result_after["status"] == "fail", ( + f"after PATCH to floor 90, {len(mid_length)}-char description " + f"must fail; got: {result_after}" + ) + codes = {issue["code"] for issue in result_after["issues"]} + assert "too_short" in codes, ( + f"expected too_short in issue codes, got: {codes}" + ) + + # Step 4: PATCH the floor back to 60 (fixture hygiene + extra + # confirmation that subsequent PATCHes also propagate). + r = c.post( + "/api/admin/server-config", + headers=_auth(token), + json={"sections": {"guardrails": {"min_description_chars": 60}}}, + ) + assert r.status_code in (200, 204), r.text + assert content_check(plugin_dir)["status"] == "pass", ( + "PATCH-back-to-default did not propagate" + ) + finally: + shutil.rmtree(plugin_dir, ignore_errors=True) + _reset_cache()