* feat(store-guardrails): admin-configurable content thresholds

Adds the flea-market content guardrail floors to the /admin/server-config editor so operators can tune the bar without code changes. Defaults are unchanged (60 chars description, 25 chars command, 5 distinct words, 200 chars body). Patching guardrails.* in instance.yaml or via the admin UI overrides any of them, and the next inline check picks up the new value.

src/store_guardrails/content_check.py now resolves the four floors via helper functions (_min_desc_chars / _min_command_desc_chars / _min_distinct_words / _min_body_chars) that read app.instance_config at call time. Module-level _DEFAULT_* constants remain as fallbacks if the import fails (defensive: keeps the guardrail module loadable without the app package on its path).

app/instance_config.py grows four matching getters returning the live value with sane defaults + integer coercion.

app/api/admin.py registers 'guardrails' as an editable section + ships nine known-fields entries (min_description_chars, min_command_description_chars, min_distinct_words, min_body_chars, enabled, review_model, blocked_quota_per_day, blocked_bundle_ttl_days, stuck_review_grace_seconds) with operator-facing hint copy explaining what each knob does.

app/web/templates/admin_server_config.html gets a SECTION_META entry so the section renders as 'Flea-market guardrails' with a help string instead of a bare section ID.

app/web/router.py threads the live thresholds into /store/new and /store/examples via a small _guardrail_thresholds() helper so the disclosure copy, char counter, and "Why these limits" table render the configured value (not a hardcoded 60).

End-to-end smoke verified: PATCH guardrails.min_description_chars=90 → /store/new immediately renders "90 characters" + JS DESC_MIN=90 on the next request, no restart required (helpers read live config per call).

* chore(store-guardrails): address PR review safe-fix findings

Code-review safe_auto findings on PR #281 (review run 20260513-100126-64052520):

- CHANGELOG: add Unreleased entry covering the new /admin/server-config Flea-market guardrails section, the four live threshold getters, and the route-helper rendering knobs. Required by the project's non-negotiable "Changelog discipline" rule.
- content_check.py: narrow `except Exception` to `except ImportError` on the four `_min_*()` resolver helpers. Surface-level TypeError / ValueError on a malformed YAML value belongs to the instance_config getters' own try/except. The resolvers should only defend against the in-tree import itself failing, not silently swallow real bugs in the getters.
- store_upload.html: refresh the stale "30-char threshold" comment to reflect the configurable floor (default 60), and add `|default(60)` / `|default(25)` / `|default(5)` filters to the disclosure-copy bindings so the upload form matches store_examples.html's belt-and-suspenders rendering if a future route ever renders the template without populating the `guardrail` context.
- router.py: tighten `_guardrail_thresholds()` return annotation from bare `dict` to `dict[str, int]`.

Residual work (left for separate change after operator direction):

- Add round-trip test (PATCH guardrails -> next inline check uses new value). This is the primary testing gap.
- Decide policy on `min_*=0` (currently coerced to 1 via `max(1, int(val))`) vs treating 0 as a disable sentinel like neighbour getters (`blocked_quota_per_day`, `blocked_bundle_ttl_days`).
- Add POST-time integer validation for `guardrails.*` so a typo'd YAML value (bool / string / float) errors loudly instead of silently falling back to the default.

* test(store-guardrails): cover admin-configurable thresholds + PATCH round-trip

Closes the "primary testing gap" Vojta noted in the safe-fix commit on PR #281: the four new `get_guardrails_min_*` getters and the PATCH-takes-effect-on-next-check live-config flow had no direct coverage.

10 new tests in `tests/test_store_guardrails_admin_config.py`:

- TestGuardrailGetterDefaults (4 tests): each new getter returns the documented default (60 / 25 / 5 / 200) when nothing is configured.
- TestGuardrailGetterOverlay (5 tests): overlay-driven overrides win, string values that look numeric coerce via int(), garbage strings fall back to default via the (TypeError, ValueError) branch, and the `max(1, int(val))` floor pins zero/negative inputs to 1.
- TestPatchRoundTrip (1 test): PATCH `/api/admin/server-config` `guardrails.min_description_chars=90`, then call content_check against a 75-char description that previously passed: must now fail with `too_short`. Then PATCH back to 60 and verify the next check passes again. Closes the cache-invalidation contract Vojta relies on for the "no app restart" claim, which is broken without the reset_cache() bracket in /api/admin/server-config.

The TestGuardrailGetterOverlay.test_zero_or_negative_floored_to_one test pins the current `max(1, int(val))` policy. Vojta's safe-fix commit explicitly left "policy on min_*=0 vs disable-sentinel" as residual work. Pinning the current behavior here ensures any future change to use 0 as a disable sentinel must update this test (and the reviewer sees the policy decision).

Verified: 4509 tests pass locally (4499 existing + 10 new).

* release: 0.54.2 - admin-configurable flea-market guardrail thresholds + tests

Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.1 → 0.54.2) bundling Vojta's admin-configurable thresholds for the flea-market content guardrail (9 knobs in /admin/server-config) plus the test coverage closing the "primary testing gap" he punted in the safe-fix commit.

No DB migration; defaults unchanged from PR #276. Instances that don't set `guardrails.*` keep the original bar transparently.

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
Co-authored-by: ZdenekSrotyr <139972147+ZdenekSrotyr@users.noreply.github.com>
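
For readers who want a concrete picture of the getter behavior the tests pin down, here is a minimal sketch. It is not the code in app/instance_config.py: the names `coerce_min_threshold` and `get_min_description_chars`, and the plain-dict config shape, are assumptions made for illustration. Only the default of 60, the `max(1, int(val))` floor, and the (TypeError, ValueError) fallback come from the commit messages above.

```python
from typing import Any, Mapping

# Documented default for guardrails.min_description_chars.
DEFAULT_MIN_DESCRIPTION_CHARS = 60


def coerce_min_threshold(raw: Any, default: int) -> int:
    """Coerce a configured value to a positive int, or fall back to the default.

    Hypothetical helper: mirrors the policy described in the commit message
    (int() coercion, floor of 1, default on unparseable input).
    """
    if raw is None:
        # Nothing configured: keep the shipped default.
        return default
    try:
        # Zero and negative values are pinned to 1 rather than disabling the check.
        return max(1, int(raw))
    except (TypeError, ValueError):
        # Garbage such as "abc" or a list falls back to the default.
        return default


def get_min_description_chars(config: Mapping[str, Any]) -> int:
    """Read the floor from a config mapping at call time (assumed shape)."""
    guardrails = config.get("guardrails") or {}
    return coerce_min_threshold(
        guardrails.get("min_description_chars"),
        DEFAULT_MIN_DESCRIPTION_CHARS,
    )
```

Because the getter reads the mapping on every call, a caller that swaps the configured value sees the new floor on the next check, which is the property the PATCH round-trip test exercises against the real config cache.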
339 lines
13 KiB
HTML
{% extends "base.html" %}
{% block title %}Submission examples — {{ config.INSTANCE_NAME }}{% endblock %}
{% block content %}
<style>
  .examples-page { max-width: 980px; }
  .examples-page h1 {
    font-size: 26px; font-weight: 700; margin: 0 0 6px;
    color: var(--text-primary, #111827);
  }
  .examples-page .sub {
    font-size: 14px; color: var(--text-secondary, #6b7280);
    margin-bottom: 24px; line-height: 1.55;
  }
  .ex-section {
    background: var(--surface, #fff);
    border: 1px solid var(--border, #e5e7eb);
    border-radius: 10px;
    padding: 20px 22px;
    margin-bottom: 18px;
  }
  .ex-section h2 {
    margin: 0 0 4px; font-size: 17px; font-weight: 600;
    color: var(--text-primary, #111827);
  }
  .ex-section .why {
    font-size: 13px; color: var(--text-secondary, #6b7280);
    margin-bottom: 12px; line-height: 1.6;
  }
  .ex-block {
    font-family: var(--font-mono); font-size: 12.5px;
    background: #f6f8fa;
    border: 1px solid var(--border, #e5e7eb);
    border-radius: 8px;
    padding: 12px 14px;
    line-height: 1.5;
    overflow-x: auto;
    white-space: pre-wrap;
    word-break: break-word;
    color: #1f2937;
  }
  .ex-label {
    display: inline-block; font-size: 11px; font-weight: 600;
    padding: 2px 8px; border-radius: 4px;
    background: rgba(16, 185, 129, 0.12); color: #047857;
    text-transform: uppercase; letter-spacing: 0.5px;
    margin-bottom: 6px;
  }
  .ex-label.bad { background: rgba(220, 38, 38, 0.10); color: #b91c1c; }
  .ex-compare { display: grid; grid-template-columns: 1fr 1fr; gap: 14px; }
  @media (max-width: 720px) { .ex-compare { grid-template-columns: 1fr; } }
  .ex-tips {
    background: var(--background, #f9fafb);
    border-left: 3px solid var(--primary, #0073D1);
    padding: 10px 14px; border-radius: 4px;
    font-size: 13px; margin-top: 12px; line-height: 1.55;
  }
  .ex-tips strong { color: var(--text-primary, #111827); }
  .top-actions { margin-bottom: 16px; }
  .top-actions a {
    color: var(--primary, #0073D1); text-decoration: none;
    font-size: 13px;
  }
</style>

<div class="examples-page page-shell">
  <div class="top-actions">
    <a href="/store/new">← Back to upload</a>
  </div>

  <h1>Submission examples</h1>
  <p class="sub">
    Each component (plugin, agent, skill, command) needs a description
    that names the trigger condition AND the action. Skills are the
    strictest case because Claude reads the description verbatim when
    deciding whether to invoke the skill. Examples below show the
    minimum bar plus what a strong submission looks like.
  </p>

  <!-- ───── Why these limits ───────────────────────────────────────── -->
  <div class="ex-section">
    <h2>Why these limits?</h2>
    <div class="why">
      Descriptions that are too short don't give the assistant enough
      to go on. Below is the minimum bar plus what a well-written
      submission usually looks like.
    </div>
    <table style="width:100%; border-collapse: collapse; font-size: 13px;">
      <thead>
        <tr style="background: var(--background, #f9fafb); text-align: left;">
          <th style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Field</th>
          <th style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Minimum</th>
          <th style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Recommended</th>
          <th style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">What it does</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Skill / agent / plugin description</td>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);"><strong>{{ guardrail.min_description_chars|default(60) }} chars</strong> · {{ guardrail.min_distinct_words|default(5) }} distinct words</td>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">120–220 chars (one full sentence)</td>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Tells the assistant when to use the component and what it does. Shown on the marketplace tile so others can pick it.</td>
        </tr>
        <tr>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Command description</td>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);"><strong>{{ guardrail.min_command_description_chars|default(25) }} chars</strong> · {{ guardrail.min_distinct_words|default(5) }} distinct words</td>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">40–100 chars</td>
          <td style="padding: 8px 10px; border-bottom: 1px solid var(--border, #e5e7eb);">Commands are one-verb actions ("run tests", "format code"). A short clear sentence is enough.</td>
        </tr>
        <tr>
          <td style="padding: 8px 10px;">Skill / agent content body</td>
          <td style="padding: 8px 10px;"><strong>{{ guardrail.min_body_chars|default(200) }} chars</strong></td>
          <td style="padding: 8px 10px;">500–2000 chars</td>
          <td style="padding: 8px 10px;">The body explains what the component does once used: inputs it expects, outputs it produces, edge cases. The minimum is a "one paragraph" floor that catches stubs.</td>
        </tr>
      </tbody>
    </table>
    <div class="ex-tips" style="margin-top: 12px;">
      <strong>Two-step review.</strong> First we check length and word
      count to catch placeholders like <code>TODO</code> or
      <code>description</code>. Submissions that pass the length check
      then go to a substantive reviewer that judges whether the
      description is genuinely useful or just padded filler that hit
      the character count by accident.
    </div>
  </div>

  <!-- ───── Skill ──────────────────────────────────────────────────── -->
  <div class="ex-section" id="skill">
    <h2>Skill</h2>
    <div class="why">
      <code>skills/&lt;name&gt;/SKILL.md</code> with YAML frontmatter and a
      body. The frontmatter <code>description</code> IS the trigger
      string Claude reads; the body explains how the skill works once
      invoked.
    </div>

    <div class="ex-compare">
      <div>
        <span class="ex-label bad">Rejected</span>
        <pre class="ex-block">---
name: code-review
description: A reviewer skill
---

Reviews code.
</pre>
        <div class="ex-tips">
          <strong>Why it fails:</strong> description is 16 chars
          (floor {{ guardrail.min_description_chars|default(60) }}), restates the name, body is 13 chars
          (floor {{ guardrail.min_body_chars|default(200) }}). Claude can't decide when to invoke it.
        </div>
      </div>
      <div>
        <span class="ex-label">Passes</span>
        <pre class="ex-block">---
name: code-review
description: Use when reviewing pull requests to flag missing tests,
  weak assertions, brittle implementation-coupled tests, and edge
  cases the implementation forgot.
---

# Code review skill

Run this skill against the diff of an open pull request. It walks
the changed files and surfaces three categories of issues:

1. **Missing tests**: new functions / endpoints / migrations
   without corresponding test coverage.
2. **Weak assertions**: tests that exist but only assert truthy
   values, status codes, or shape without verifying the actual
   behavior contract.
3. **Brittle coupling**: tests that depend on private state,
   internal call counts, or implementation details that will break
   on refactor without catching real regressions.

## Inputs

The skill expects to be invoked from a git repo with a PR branch
checked out. It reads `git diff &lt;base&gt;..HEAD` to scope the review.

## Output

Markdown comment grouped by file, with line refs and one fix
suggestion per finding.
</pre>
        <div class="ex-tips">
          <strong>Why it passes:</strong> description names WHEN (PR
          review) and WHAT it does, body explains inputs/outputs.
        </div>
      </div>
    </div>
  </div>

  <!-- ───── Agent ──────────────────────────────────────────────────── -->
  <div class="ex-section" id="agent">
    <h2>Agent</h2>
    <div class="why">
      Single <code>.md</code> file with YAML frontmatter. The
      <code>description</code> drives dispatch: when a parent agent
      decides which subagent to spawn, this is the string it reads.
    </div>

    <div class="ex-compare">
      <div>
        <span class="ex-label bad">Rejected</span>
        <pre class="ex-block">---
name: debugger
description: A debugger
---

Helps with debugging.
</pre>
        <div class="ex-tips">
          <strong>Why it fails:</strong> description is 10 chars
          (floor {{ guardrail.min_description_chars|default(60) }}), body is 21 chars (floor {{ guardrail.min_body_chars|default(200) }}), neither names a
          dispatch criterion.
        </div>
      </div>
      <div>
        <span class="ex-label">Passes</span>
        <pre class="ex-block">---
name: debugger
description: Diagnoses test failures, runtime errors, and unexpected
  behavior. Use when a build is failing, a stack trace lands in the
  conversation, or the user describes a symptom without a known
  root cause.
---

# Debugger subagent

Triage flow:

1. Reproduce the failure deterministically (capture the exact
   command + working directory + relevant env).
2. Reduce: strip the case down to the smallest input that still
   fails. Confirm the reduction reproduces.
3. Bisect or trace, depending on whether the regression is recent
   or the behavior was never correct.
4. Propose ONE root-cause hypothesis and a minimal fix. Don't
   shotgun-fix multiple symptoms.

Report the hypothesis + fix path back to the dispatcher; do not
apply the fix yourself unless explicitly asked.
</pre>
        <div class="ex-tips">
          <strong>Why it passes:</strong> description names WHAT it
          does (diagnose) AND WHEN to dispatch (test failures, stack
          traces, symptoms).
        </div>
      </div>
    </div>
  </div>

  <!-- ───── Plugin ─────────────────────────────────────────────────── -->
  <div class="ex-section" id="plugin">
    <h2>Plugin</h2>
    <div class="why">
      <code>.claude-plugin/plugin.json</code> at the bundle root. The
      <code>description</code> is the marketplace tile copy, the first
      thing a user sees when browsing.
    </div>

    <div class="ex-compare">
      <div>
        <span class="ex-label bad">Rejected</span>
        <pre class="ex-block">{
  "name": "review-tools",
  "description": "Tools for code review",
  "version": "0.1.0"
}
</pre>
        <div class="ex-tips">
          <strong>Why it fails:</strong> 21 chars (floor {{ guardrail.min_description_chars|default(60) }}),
          generic enough to apply to any plugin.
        </div>
      </div>
      <div>
        <span class="ex-label">Passes</span>
        <pre class="ex-block">{
  "name": "review-tools",
  "description": "Code review automation: PR diff analysis, test coverage gaps, and a configurable rule pack for Rails / Python / TypeScript projects.",
  "version": "0.1.0"
}
</pre>
        <div class="ex-tips">
          <strong>Why it passes:</strong> names the audience (PR
          reviewers), the value (gaps + rule pack), and the scope
          (three named stacks).
        </div>
      </div>
    </div>
  </div>

  <!-- ───── Command ────────────────────────────────────────────────── -->
  <div class="ex-section" id="command">
    <h2>Command</h2>
    <div class="why">
      <code>commands/&lt;name&gt;.md</code>. Shown in <code>/help</code>
      and slash-command lists. Lower {{ guardrail.min_command_description_chars|default(25) }}-char floor since commands tend
      to be one-verb actions.
    </div>

    <div class="ex-compare">
      <div>
        <span class="ex-label bad">Rejected</span>
        <pre class="ex-block">---
name: run-tests
description: Runs tests
---
</pre>
        <div class="ex-tips">
          <strong>Why it fails:</strong> 10 chars (floor {{ guardrail.min_command_description_chars|default(25) }}),
          restates the name.
        </div>
      </div>
      <div>
        <span class="ex-label">Passes</span>
        <pre class="ex-block">---
name: run-tests
description: Run the project test suite and print failures grouped
  by file with the first 5 lines of the traceback inline.
---
</pre>
        <div class="ex-tips">
          <strong>Why it passes:</strong> states the action, the
          shape of the output, and the trimming behavior.
        </div>
      </div>
    </div>
  </div>

  <div class="top-actions" style="margin-top: 20px;">
    <a href="/store/new">← Back to upload</a>
  </div>
</div>
{% endblock %}