agnes-the-ai-analyst/tests/test_store_guardrails_admin_config.py
Vojtech 50a974f196
feat(store-guardrails): admin-configurable content thresholds (#281)
* feat(store-guardrails): admin-configurable content thresholds

Adds the flea-market content guardrail floors to the /admin/server-config
editor so operators can tune the bar without code changes. Defaults are
unchanged (60 chars description, 25 chars command, 5 distinct words, 200
chars body) — patching guardrails.* in instance.yaml or via the admin UI
overrides any of them and the next inline check picks up the new value.

src/store_guardrails/content_check.py now resolves the four floors via
helper functions (_min_desc_chars / _min_command_desc_chars /
_min_distinct_words / _min_body_chars) that read app.instance_config at
call time. Module-level _DEFAULT_* constants remain as fallbacks if
the import fails (defensive — keeps the guardrail module loadable
without the app package on its path).

app/instance_config.py grows four matching getters returning the live
value with sane defaults + integer coercion.
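
A minimal sketch of the "sane defaults + integer coercion" behaviour, assuming the getters share logic along these lines. The helper name `coerce_floor` is hypothetical; the `max(1, int(val))` floor and the `(TypeError, ValueError)` fallback are the parts named elsewhere in this PR.

```python
from typing import Any


def coerce_floor(raw: Any, default: int) -> int:
    """Hypothetical helper: coerce a raw config value to an int floor >= 1."""
    if raw is None:
        # Nothing configured: use the documented default.
        return default
    try:
        # Numeric strings such as "8" coerce; zero and negatives pin to 1.
        return max(1, int(raw))
    except (TypeError, ValueError):
        # Non-numeric garbage falls back to the default.
        return default
```

For example, `coerce_floor("8", 5)` gives 8, `coerce_floor("not-a-number", 25)` gives 25, and `coerce_floor(0, 60)` gives 1.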

app/api/admin.py registers 'guardrails' as an editable section + ships
nine known-fields entries (min_description_chars,
min_command_description_chars, min_distinct_words, min_body_chars,
enabled, review_model, blocked_quota_per_day, blocked_bundle_ttl_days,
stuck_review_grace_seconds) with operator-facing hint copy explaining
what each knob does.

app/web/templates/admin_server_config.html gets a SECTION_META entry
so the section renders as 'Flea-market guardrails' with a help string
instead of a bare section ID.

app/web/router.py threads the live thresholds into /store/new and
/store/examples via a small _guardrail_thresholds() helper so the
disclosure copy, char counter, and "Why these limits" table render
the configured value (not a hardcoded 60). End-to-end smoke verified:
PATCH guardrails.min_description_chars=90 → /store/new immediately
renders "90 characters" + JS DESC_MIN=90 on the next request, no
restart required (helpers read live config per call).

* chore(store-guardrails): address PR review safe-fix findings

Code-review safe_auto findings on PR #281 (review run
20260513-100126-64052520):

- CHANGELOG: add Unreleased entry covering the new
  /admin/server-config Flea-market guardrails section, the four live
  threshold getters, and the route-helper rendering knobs. Required by
  the project's non-negotiable "Changelog discipline" rule.
- content_check.py: narrow `except Exception` to `except ImportError`
  on the four `_min_*()` resolver helpers. Surface-level TypeError /
  ValueError on a malformed YAML value belongs to the
  instance_config getters' own try/except — the resolvers should only
  defend against the in-tree import itself failing, not silently
  swallow real bugs in the getters.
- store_upload.html: refresh the stale "30-char threshold" comment to
  reflect the configurable floor (default 60), and add `|default(60)`
  / `|default(25)` / `|default(5)` filters to the disclosure-copy
  bindings so the upload form matches store_examples.html's
  belt-and-suspenders rendering if a future route ever renders the
  template without populating the `guardrail` context.
- router.py: tighten `_guardrail_thresholds()` return annotation
  from bare `dict` to `dict[str, int]`.

Residual work (left for separate change after operator direction):
- Add round-trip test (PATCH guardrails -> next inline check uses
  new value) — primary testing gap.
- Decide policy on `min_*=0` (currently coerced to 1 via
  `max(1, int(val))`) vs treating 0 as a disable sentinel like
  neighbour getters (`blocked_quota_per_day`,
  `blocked_bundle_ttl_days`).
- Add POST-time integer validation for `guardrails.*` so a typo'd
  YAML value (bool / string / float) errors loudly instead of
  silently falling back to the default.
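
One possible shape for the POST-time validation listed as residual work, shown only as a sketch. The function name and error wording are hypothetical, and the range policy (including the open `min_*=0` question) is intentionally left out.

```python
def validate_guardrail_int(name: str, value: object) -> int:
    """Hypothetical strict check: accept only real integers."""
    # bool is a subclass of int in Python, so it has to be rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(
            f"guardrails.{name} must be an integer, got {value!r}"
        )
    return value
```

With a check like this, a typo'd value such as `"90"` or `90.0` would be rejected when it is submitted instead of being coerced or replaced by the default later.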

* test(store-guardrails): cover admin-configurable thresholds + PATCH round-trip

Closes the "primary testing gap" Vojta noted in the safe-fix commit
on PR #281 — the four new `get_guardrails_min_*` getters and the
PATCH-takes-effect-on-next-check live-config flow had no direct
coverage.

10 new tests in `tests/test_store_guardrails_admin_config.py`:

- TestGuardrailGetterDefaults (4 tests) — each new getter returns the
  documented default (60 / 25 / 5 / 200) when nothing is configured.
- TestGuardrailGetterOverlay (5 tests) — overlay-driven overrides win,
  string values that look numeric coerce via int(), garbage strings
  fall back to default via the (TypeError, ValueError) branch, and the
  `max(1, int(val))` floor pins zero/negative inputs to 1.
- TestPatchRoundTrip (1 test) — PATCH `/api/admin/server-config`
  `guardrails.min_description_chars=90`, then call content_check
  against a 75-char description that previously passed: must now fail
  with `too_short`. Then PATCH back to 60 and verify the next check
  passes again. Closes the cache-invalidation contract Vojta relies on
  for the "no app restart" claim — broken without the
  reset_cache() bracket in /api/admin/server-config.
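
The cache-invalidation contract exercised by the round-trip test can be illustrated with a small self-contained model. Everything here is an assumption made for the sketch: the file location, the JSON format (used instead of YAML to stay within the standard library), and the function names other than `reset_cache`.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

# Module-level cache, mirroring the _instance_config attribute the tests reset.
_instance_config: Optional[dict] = None
# Hypothetical config location for this sketch.
_CONFIG_PATH = Path(tempfile.mkdtemp()) / "instance.json"


def load_config() -> dict:
    """Read the config file once and cache the parsed result."""
    global _instance_config
    if _instance_config is None:
        if _CONFIG_PATH.exists():
            _instance_config = json.loads(_CONFIG_PATH.read_text())
        else:
            _instance_config = {}
    return _instance_config


def reset_cache() -> None:
    """Drop the cached config so the next read reloads from disk."""
    global _instance_config
    _instance_config = None


def save_section(section: str, values: dict) -> None:
    """Persist a patch, then invalidate the cache."""
    current = dict(load_config())
    current[section] = {**current.get(section, {}), **values}
    _CONFIG_PATH.write_text(json.dumps(current))
    reset_cache()


def min_description_chars() -> int:
    """Getter that reads through the cache on every call."""
    return int(load_config().get("guardrails", {}).get("min_description_chars", 60))
```

Without the `reset_cache()` call at the end of `save_section`, the getter would keep returning the value cached before the write, which is the failure mode the round-trip test is meant to catch.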

The TestGuardrailGetterOverlay.test_zero_or_negative_floored_to_one
test pins the current `max(1, int(val))` policy. Vojta's safe-fix
commit explicitly left "policy on min_*=0 vs disable-sentinel" as
residual work — pinning the current behavior here ensures any future
change to use 0 as a disable sentinel must update this test (and the
reviewer sees the policy decision).

Verified: 4509 tests pass locally (4499 existing + 10 new).

* release: 0.54.2 — admin-configurable flea-market guardrail thresholds + tests

Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.1 →
0.54.2) bundling Vojta's admin-configurable thresholds for the
flea-market content guardrail (9 knobs in /admin/server-config) plus
the test coverage closing the "primary testing gap" he punted in the
safe-fix commit.

No DB migration; defaults unchanged from PR #276 — instances that
don't set `guardrails.*` keep the original bar transparently.

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
Co-authored-by: ZdenekSrotyr <139972147+ZdenekSrotyr@users.noreply.github.com>
2026-05-13 09:20:55 +00:00

225 lines
9.8 KiB
Python

"""Tests for the admin-configurable flea-market content-guardrail thresholds (#281).

Covers:

1. The four new `get_guardrails_min_*` getters in app/instance_config.py:
   defaults, overlay-driven overrides, type coercion, and the
   `max(1, int(val))` floor.
2. The round-trip: POST to /api/admin/server-config patches
   `guardrails.min_description_chars`, the next inline content check
   uses the new floor (closes the "primary testing gap" Vojta noted in
   the PR #281 safe-fix commit message).

These tests close the only real gap surfaced in the PR #281 takeover
review; every other reviewer finding was either already addressed in
Vojta's safe-fix commit or intentionally deferred (operator-direction
decisions on the `min_*=0` semantics + POST-time integer validation).
"""

from __future__ import annotations

import json
import shutil
import tempfile
from pathlib import Path

import pytest
import yaml as _yaml


def _auth(token: str) -> dict:
    return {"Authorization": f"Bearer {token}"}


def _reset_cache() -> None:
    import app.instance_config as ic

    ic._instance_config = None


# ---------------------------------------------------------------------------
# Unit tests for the four new getters
# ---------------------------------------------------------------------------


class TestGuardrailGetterDefaults:
    """Each getter returns the documented default when nothing is configured."""

    def test_min_description_chars_default_60(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
        _reset_cache()
        from app.instance_config import get_guardrails_min_description_chars

        assert get_guardrails_min_description_chars() == 60

    def test_min_command_description_chars_default_25(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
        _reset_cache()
        from app.instance_config import get_guardrails_min_command_description_chars

        assert get_guardrails_min_command_description_chars() == 25

    def test_min_distinct_words_default_5(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
        _reset_cache()
        from app.instance_config import get_guardrails_min_distinct_words

        assert get_guardrails_min_distinct_words() == 5

    def test_min_body_chars_default_200(self, tmp_path, monkeypatch):
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
        _reset_cache()
        from app.instance_config import get_guardrails_min_body_chars

        assert get_guardrails_min_body_chars() == 200


class TestGuardrailGetterOverlay:
    """Operator-supplied overlay values win over defaults."""

    def _seed_overlay(self, tmp_path, monkeypatch, payload: dict) -> None:
        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        monkeypatch.setenv("TESTING", "1")
        monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
        state = tmp_path / "state"
        state.mkdir(parents=True, exist_ok=True)
        (state / "instance.yaml").write_text(_yaml.dump(payload))
        _reset_cache()

    def test_overlay_overrides_min_description_chars(self, tmp_path, monkeypatch):
        self._seed_overlay(tmp_path, monkeypatch, {
            "guardrails": {"min_description_chars": 90},
        })
        from app.instance_config import get_guardrails_min_description_chars

        assert get_guardrails_min_description_chars() == 90

    def test_overlay_overrides_min_body_chars(self, tmp_path, monkeypatch):
        self._seed_overlay(tmp_path, monkeypatch, {
            "guardrails": {"min_body_chars": 500},
        })
        from app.instance_config import get_guardrails_min_body_chars

        assert get_guardrails_min_body_chars() == 500

    def test_string_value_coerced_to_int(self, tmp_path, monkeypatch):
        # An operator hand-editing the YAML can leave a string that's still
        # numeric; int() accepts it. Documented defensively in the getter.
        self._seed_overlay(tmp_path, monkeypatch, {
            "guardrails": {"min_distinct_words": "8"},
        })
        from app.instance_config import get_guardrails_min_distinct_words

        assert get_guardrails_min_distinct_words() == 8

    def test_garbage_value_falls_back_to_default(self, tmp_path, monkeypatch):
        # Bool / non-numeric string / other garbage hits the
        # `(TypeError, ValueError)` branch and returns the documented default.
        self._seed_overlay(tmp_path, monkeypatch, {
            "guardrails": {"min_command_description_chars": "not-a-number"},
        })
        from app.instance_config import get_guardrails_min_command_description_chars

        assert get_guardrails_min_command_description_chars() == 25

    def test_zero_or_negative_floored_to_one(self, tmp_path, monkeypatch):
        # `max(1, int(val))`: an operator setting 0 to "disable" doesn't
        # actually disable; it's silently coerced to 1. Documented behavior;
        # this test pins the contract so a future change to use 0-as-sentinel
        # has to update this test (and reviewers see the policy decision).
        self._seed_overlay(tmp_path, monkeypatch, {
            "guardrails": {"min_description_chars": 0},
        })
        from app.instance_config import get_guardrails_min_description_chars

        assert get_guardrails_min_description_chars() == 1

        # Negative integer hits the same floor.
        self._seed_overlay(tmp_path, monkeypatch, {
            "guardrails": {"min_body_chars": -50},
        })
        from app.instance_config import get_guardrails_min_body_chars

        assert get_guardrails_min_body_chars() == 1


# ---------------------------------------------------------------------------
# Round-trip: PATCH /api/admin/server-config → next inline check uses new floor
# ---------------------------------------------------------------------------


class TestPatchRoundTrip:
    """The "primary testing gap" Vojta flagged: an admin PATCH to
    `guardrails.min_description_chars` must take effect on the very next
    `content_check` call, with no app restart. The cache is invalidated
    by /api/admin/server-config's reset_cache() bracket.
    """

    def _write_skill(self, plugin_dir: Path, *, description: str) -> None:
        target = plugin_dir / "skills" / "test-skill"
        target.mkdir(parents=True, exist_ok=True)
        body = "Body content explaining the skill in enough words to clear the body floor. " * 4
        (target / "SKILL.md").write_text(
            f"---\nname: test-skill\ndescription: {description}\n---\n\n{body}\n",
            encoding="utf-8",
        )

    def test_patch_min_description_chars_takes_effect_next_check(
        self, seeded_app, monkeypatch, tmp_path,
    ):
        # Mid-length description (between 60 and 90 chars): passes the
        # default floor (60) but fails after we PATCH the floor to 90.
        mid_length = "Use when validating the round-trip live config thresholds end to end now."
        assert 60 <= len(mid_length) < 90, len(mid_length)

        monkeypatch.setenv("DATA_DIR", str(tmp_path))
        state = tmp_path / "state"
        state.mkdir(parents=True, exist_ok=True)
        _reset_cache()

        plugin_dir = Path(tempfile.mkdtemp(prefix="agnes_admin_config_test_"))
        try:
            self._write_skill(plugin_dir, description=mid_length)

            # Step 1: at default floor (60), the description passes.
            from src.store_guardrails.content_check import check as content_check

            result = content_check(plugin_dir)
            assert result["status"] == "pass", (
                f"description {len(mid_length)} chars should pass default floor 60, "
                f"got: {result}"
            )

            # Step 2: PATCH the floor to 90 via the admin API.
            c = seeded_app["client"]
            token = seeded_app["admin_token"]
            r = c.post(
                "/api/admin/server-config",
                headers=_auth(token),
                json={"sections": {"guardrails": {"min_description_chars": 90}}},
            )
            assert r.status_code in (200, 204), r.text

            # Step 3: same description, same content_check. It must now fail
            # with too_short. Cache invalidation is done inside the admin POST
            # handler; no test-side reset_cache() call is needed (or
            # acceptable; that would be testing the test, not the system).
            result_after = content_check(plugin_dir)
            assert result_after["status"] == "fail", (
                f"after PATCH to floor 90, {len(mid_length)}-char description "
                f"must fail; got: {result_after}"
            )
            codes = {issue["code"] for issue in result_after["issues"]}
            assert "too_short" in codes, (
                f"expected too_short in issue codes, got: {codes}"
            )

            # Step 4: PATCH the floor back to 60 (fixture hygiene + extra
            # confirmation that subsequent PATCHes also propagate).
            r = c.post(
                "/api/admin/server-config",
                headers=_auth(token),
                json={"sections": {"guardrails": {"min_description_chars": 60}}},
            )
            assert r.status_code in (200, 204), r.text
            assert content_check(plugin_dir)["status"] == "pass", (
                "PATCH-back-to-default did not propagate"
            )
        finally:
            shutil.rmtree(plugin_dir, ignore_errors=True)
            _reset_cache()