* feat(web): value-first /home reskin (CEO mock palette + pillars + first-session)
Restructures `/home` to lead with product value instead of install steps,
matching the CEO mock proposed for the homepage:
- New intro hero on top — eyebrow `Welcome, {{ display_name }}`, H1
`{{ instance_brand }} is your team's AI workspace`, lede framing the
product as an "AI Chief of Staff", two CTAs (`Set up in ~15 min →`
jumps to the wizard, `Just browse — no install needed` jumps to
`#look-around`), and a four-pillar row (Data packages · Plugins ·
Skills · Memory). Renders for both onboarded and not-onboarded users
so the value framing is consistent across visits.
- New `first-session` narrative — five-beat walkthrough (launch → pick
project → memory loads → ask → close) with mock terminal frames
carrying traffic-light dots, prompts, and dimmed system output.
- Setup wizard chrome — progress chip (`Step 1 of N · ~15 min ·
One-time · Reversible`), thin progress bar, and per-step number
badges on each `.install-block` so the wizard reads as bounded
instead of an open-ended scroll.
- Palette shift from blue to green/navy: `--hp-primary` aliases
`#2ea877` (mint), `--hp-hero-bg` is navy `#0f1b3a`, code panels stay
near-black `#0c1224` with warm-yellow `#ffd866` accents. The token
alias is reused so downstream rules pick up the new accent
automatically; instance theme overrides via
`config.theme_overrides()` still win.
- VS Code surface tile carries a `Recommended` pill; the existing
"Want to look around first?" section is renamed to `Explore your
workspace` and gets the `#look-around` anchor.
All test-pinned class names and IDs (`install-hero`, `install-block`,
`home-mock`, `self-mark-btn`, `setupClaudeBtn`, `offboard-strip`,
`home-getting-started`, `home-gs-item`, `home-overview`,
`home-usage`) preserved as structural anchors; new visual language
overlays via additional classes. Existing onboarded/not-onboarded
branching, `/api/me/onboarded` POST, status frame gating, post-CTA
modal, and OS-tab switching JS unchanged. Stray `~/FoundryAI`
comment swapped for `~/{{ workspace_dir }}` to honor the
vendor-agnostic OSS rule.
51 home tests pass without modification.
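The "structural anchors preserved" claim above can be checked mechanically. A minimal sketch, assuming the rendered page is available as an HTML string; `missing_anchors` and the sample fragment are hypothetical helpers for illustration, not part of the existing test suite:

```python
import re

# Class names and IDs the commit message lists as test-pinned.
PINNED_ANCHORS = [
    "install-hero", "install-block", "home-mock", "self-mark-btn",
    "setupClaudeBtn", "offboard-strip", "home-getting-started",
    "home-gs-item", "home-overview", "home-usage",
]


def missing_anchors(html: str, anchors=PINNED_ANCHORS) -> list[str]:
    """Return the anchors that appear in no class="..." or id="..." attribute."""
    tokens: set[str] = set()
    for value in re.findall(r'(?:class|id)\s*=\s*"([^"]*)"', html):
        tokens.update(value.split())
    return [name for name in anchors if name not in tokens]


# Hypothetical fragment: a new visual class sits next to a pinned one.
sample = '<section class="install-hero setup-card" id="setupClaudeBtn"></section>'
print(missing_anchors(sample, ["install-hero", "setupClaudeBtn", "home-usage"]))
# ['home-usage']
```

Because the check splits attribute values into tokens, adding overlay classes next to a pinned class does not trip it; only removing or renaming the pinned name does.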
* fix(web): /home palette inversion — dark intro hero on top, light setup card below
Previous reskin commit kept the install-hero as a dark navy gradient and
rendered the new intro hero as a light surface — opposite of what the CEO
mock specifies. Playwright comparison vs `data/ceo_home.html` confirmed:
- CEO mock: dark navy hero at TOP (with white pillars on navy), LIGHT
white setup card BELOW with light step rows and dark code panels
inset.
- Previous: light intro hero on top, dark setup card below. Inverted.
This patch flips both:
- `.home-hero-intro` now: dark navy gradient `#0f1b3a → #1a2a5f`, green
radial glow in the corner, green eyebrow, white H1 (`accent` span
green), rgba-white lede, green pill primary CTA, translucent-white
secondary CTA, pillars row separated by hairline border-top with
green square-dot bullets in front of each pillar header.
- `.install-hero` and `.install-block` now: white surface card with
thin green accent strip across the top, light step rows split by
hairline borders, green-tinted step-number circles (`#e6f9f0` bg,
`#1f8a5e` ink), green progress chip + bar. Code panels
(`.install-cmd`) and terminal frames stay dark — they're the "type
this" surfaces.
- All previously-rgba-white descendants of `.install-hero`
(close button, eyebrow, h1, lead, links, code chips, OS tabs,
install notes, setup-CTA button, self-mark fallback, auto-detect
badge, terminal-howto disclosure) re-skinned for light surface.
All 12 home page tests still pass (no markup changes, only CSS).
* fix(web): /home parity polish — system font + mock sizes + blue info hint + gray step-num
After v2 palette flip, user comparison vs CEO mock surfaced three
remaining gaps in the wizard area:
- Font stack mismatch: Agnes inherits Inter via `style-custom.css`,
but the CEO mock uses the platform system stack (San Francisco on
macOS, Segoe UI on Windows). The rendered weight/letterforms read
noticeably different. `.home-mock` now declares
`-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif`
for itself and all descendants, with the monospace stack reserved
for `code`/`kbd`/`pre`, `.install-cmd`, and `.terminal-body`.
- Step number badges were green-tinted; mock uses neutral gray
(`#f0f2f6` bg, `#4a5168` ink) — green is reserved for the "done"
state. Switched to `--hp-surface-dim` + `--hp-text-secondary`.
- "Don't have a terminal open?" disclosure was an amber/yellow
variant left over from the old dark-hero palette. Mock uses a
blue info-hint vocabulary (`--info-bg: #eef3ff`,
`--info-line: #4f7cf2`, `--info-ink: #1c3994`) with white kbd
chips. Added the info-* tokens to the `:root` block and re-skinned
`details.terminal-howto` (incl. summary, body, kbd) to match.
Step-body type sizes also brought in line with the mock spec —
`.install-block .label` (step h3 equivalent) is now 17px / 700 with
6px gap; `.install-note` body type is 14px / 1.55.
`--hp-info-bg / --hp-info-ink / --hp-info-line / --hp-warn-bg /
--hp-warn-ink / --hp-warn-line / --hp-surface-dim` added as
first-class tokens so future hint/warn callouts pick the same colors
without a duplicate vocabulary.
12/12 home tests pass.
* feat(web): centralize design tokens + reword /home wizard to 6 steps (CEO mock parity)
Two intertwined changes that touch both global design + /home structure:
GLOBAL TOKEN SHIFT (app/web/static/style-custom.css)
- `--primary` flipped from blue `#0073D1` to green `#2ea877` — same brand
alias the rest of the app referenced, so every page picks up the new
accent automatically. Old `--primary-dark` / `--primary-light` recolored
to match.
- New tokens added: `--brand-accent`, `--hero-bg`, `--hero-ink`,
`--surface-dim`, `--info-bg/ink/line`, `--warn-bg/ink/line`. Brings
the global vocabulary in line with the CEO mock's `:root` block so
callouts and hero surfaces don't have to invent local tokens.
- `--font-primary` switched from Inter-led stack to the system stack
(`-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Inter",
system-ui, sans-serif`) so weight/letterforms render identically on
macOS (San Francisco) and Windows (Segoe UI) — matches the mock and
avoids a font-loading flash for analysts without Inter installed.
- Shadow tints re-cast in navy `rgba(15,27,58,...)`; focus ring uses
the new green `rgba(46,168,119,0.25)`.
- `.app-nav-link` font-size 13px → 14px, padding 6px 12px → 8px 14px,
hover bg → `--primary-light` (mint), color → `--primary-dark`.
`.app-nav-menu-item.is-active` re-tinted to the same green system.
- Sweep across 26 files (style-custom.css + 25 template files)
replacing every hardcoded `#0073D1` / `#005BA3` / `#E6F3FC` /
`rgba(0,115,209,…)` / `rgba(0,86,163,…)` with token references or
the new green hexes — 175 occurrences total. Pages that styled their
own buttons / borders / shadows pick up the new brand color without
per-page overrides.
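The sweep described above could be scripted roughly as follows. This is a sketch only: the hex-to-token mapping is an assumption (the commit does not say which occurrences became token references and which became green hexes), and `sweep` is a hypothetical helper.

```python
import re

# Assumed mapping, for illustration; keys are upper-case for lookup.
REPLACEMENTS = {
    "#0073D1": "var(--primary)",
    "#005BA3": "var(--primary-dark)",
    "#E6F3FC": "var(--primary-light)",
}

# Case-insensitive match; the lookahead skips longer hexes such as #0073D1FF.
_PATTERN = re.compile(
    "(?:" + "|".join(re.escape(h) for h in REPLACEMENTS) + ")(?![0-9A-Fa-f])",
    re.IGNORECASE,
)


def sweep(css: str) -> tuple[str, int]:
    """Replace hardcoded brand hexes; return the new text and the hit count."""
    return _PATTERN.subn(lambda m: REPLACEMENTS[m.group(0).upper()], css)


sample = ".btn { background: #0073d1; border-color: #005BA3; }"
print(sweep(sample))
# ('.btn { background: var(--primary); border-color: var(--primary-dark); }', 2)
```

Summing the returned counts per file is one way to arrive at a total like the 175 occurrences quoted above.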
/HOME WIZARD: 6 STEPS PER MOCK (app/web/templates/home_not_onboarded.html)
- Step 1 reworded `Install Claude Code on your computer` + `~3 min`
subhead (mock copy).
- Step 2 renamed `Pick a folder for {{ instance_brand }}` (was
`create your workspace folder`) — same `mkdir` command, mock-aligned
framing.
- NEW Step 3 `Open a terminal inside that folder` — no shell command,
just the "you are standing in the right directory" reassurance with
a Finder/PowerShell/file-manager howto disclosure. Mirrors the CEO
mock's Step 3.
- Step 4 (was Step 3, gated by `home_automode.show`) renamed
`Launch Claude with auto-approve on`. Body copy lightly updated so
it references "the next step" instead of "Step 4".
- Step 5 (was Step 4) renamed `Get the install script and paste it
into Claude`. The setup-cta-lead now explicitly says
"pasting the script into Claude Code will install {{ instance_brand
}}…" so existing test assertions pinning the `install Agnes`
substring still match.
- NEW Step 6 `Optional: create a one-word shortcut for next time` —
prints an `echo 'alias {{workspace_dir|lower}}=…' >> ~/.zshrc`
one-liner for Unix and an `Add-Content $PROFILE …` equivalent for
Windows. OS tabs + copy buttons reuse the existing wizard chrome.
- Progress chip dynamic: `Step 1 of 6` when home_automode is on,
`Step 1 of 5` when off. Progress bar fill `100 // total_steps` so
the bar sits at 16-20 % on first paint.
- `.step-lede` token added for the new short body copy beneath each
step label (14.5px / ink-soft).
- `macOS / Linux / WSL` tab labels changed to `macOS / Linux` per
user instruction. Terminal-howto `WSL:` paragraph dropped; the
paste-shortcut hint now reads `(Linux)` instead of `(Linux/WSL)`.
Functional WSL handling in `connector_prompts.py` (it's a Linux
detection fallback, not user-facing label) preserved.
- `setup_instructions.py` Claude Code install hint:
`npm (Linux / WSL)` → `npm (Linux)`.
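The progress-chip arithmetic above is small enough to sketch. The function name is hypothetical; only the step totals and the `100 // total_steps` expression come from the commit.

```python
def first_paint_progress(automode_on: bool) -> tuple[str, int]:
    """Return the chip label and bar fill (percent) shown on first paint."""
    # The auto-approve step only exists when home_automode is on.
    total_steps = 6 if automode_on else 5
    label = f"Step 1 of {total_steps}"
    fill_percent = 100 // total_steps  # integer division, as in the commit
    return label, fill_percent


print(first_paint_progress(True))   # ('Step 1 of 6', 16)
print(first_paint_progress(False))  # ('Step 1 of 5', 20)
```

Integer division is what produces the 16-20 % range: 100 // 6 rounds down to 16 rather than 16.67.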
SURFACES — 4 CARDS PER MOCK
- Replaced the 3-tile `.home-usage-grid` with a 4-card grid:
- VS Code (Recommended) — `.surface-card.feature`, green ring,
DAILY USE eyebrow + 5-step numbered list + `Open VS Code setup
guide →` link to `/setup-advanced#vscode`.
- Terminal — QUICK ACCESS eyebrow + 4-step list.
- Claude Code (Desktop app) — CONNECT IT eyebrow + 4-step list.
- Cowork (claude.ai) — `.surface-card.incomplete`, warn-tinted
border + `Instructions needed` badge + a TODO callout describing
the missing content. The card is intentionally honest about the
gap rather than hiding it.
TEST UPDATES
- `test_web_home_page.py` negative onboarded-state assertions
rebased on the new step labels (6 entries instead of 4).
- `test_home_route_resolution.py` `test_home_renders_automode_block_by_default`
+ its `_when_env_off` counterpart now check the new
`Step 4 — Launch Claude with auto-approve on` label.
* fix(web): /home section content + layout — verbatim mock match
User comparison flagged several remaining gaps; this patch rewrites
the three lower sections of /home to match the CEO mock spec exactly:
FIRST-SESSION (5 beats)
- h2 28px / 700 / -.5px tracking (was 19px / 600).
- lede 18px ink-soft (was 13.5px secondary).
- `.session-walk` wrapper, 36px gap between beats (mock spec).
- `.session-step` grid 48px / 1fr, gap 22px — number circle on
the left, content on the right.
- `.session-num` 40 × 40 circle with SOLID GREEN bg (`--primary`)
and WHITE text + soft green shadow (was 28px mint pill w/
dark-green text).
- `.session-content h3` 18px / 600 (was 14.5px / 600).
- `.session-content > p` 15px.
- `.session-content .annotation` 13.5px ink-muted body type with
`strong` for highlighting (replaces the upper-case "WHAT'S
HAPPENING" eyebrow pattern that didn't match the mock).
- `.session-intro` callout card (white surface + mint icon block)
framing the "five beats" tagline.
- `.session-tldr` summary box (brand-light bg + brand-dark left
border) wrapping up the loop.
- Terminal frames re-skinned: `#0c1224` body / `#182241` bar /
real macOS traffic-light colors `#ff5f57` / `#febc2e` / `#28c840`.
- Terminal body 13px / 1.65 line-height with mock-spec class
vocabulary: `.you` (yellow input), `.ai-name` (brand bold),
`.path` (light blue), `.dim` (translucent code-ink), `.caret`
(blinking cursor).
- Five beats rewritten with mock's exact narrative flow (launch →
menu → pick → ask → close), vendor-agnostic project names
(`RevenueAnalysis`, `Onboarding`, etc.) replacing the customer-
specific `GRPN_*` examples in the mock. Templated `{{
instance_brand }}` / `{{ workspace_dir }}` / `{{ workspace_dir |
lower }}` (the shortcut alias) everywhere.
SURFACES (4 cards)
- The section is no longer wrapped in a white rectangle; the
`.home-usage` class loses its bg + border + padding (mock has the
cards directly on the page bg).
- h2 28px (was 22px). Eyebrow 12px / 1.5px tracking / brand-dark.
- `.surface-card.feature` (VS Code) now uses 2px green border +
vertical brand-light → white gradient (was 1px ring).
- `.surface-card.incomplete` (Cowork) uses 2px red border (`#e35e5e`)
+ vertical red-tint → white gradient (was yellow flat bg).
- `.surface-card .steps` panel: inner surface-dim bg + 8px radius
+ 13px font.
- `.surface-foot` top-border + ink-muted (mock spec).
- `.badge-warn` now a solid red box (`#e35e5e` bg + white ink + 4px
radius) instead of a yellow pill, matching the mock.
- Header layout fixed: the globally inherited `header { display: flex;
justify-content: space-between }` rule was pushing the h2 to the
right of the eyebrow; an explicit `display: block` override on
`.home-mock section > header` puts the title on the LEFT under
the eyebrow, as in the mock.
BROWSE — Explore your workspace
- Wrapped in `<section class="browse-section">` with proper
eyebrow + h2 + lede (was a bare `.section-label` div).
- `.browse-grid` 5-col grid (was responsive auto-fill, 4-card
layout). Skills tile added as a 5th card linking to
`/marketplace?type=skills`.
- `.browse-card` mock-spec: 22px 20px padding, 28px icon, 15px title,
12.5px ink-muted desc, hover lifts -2px with brand border +
shadow-md.
Section wrappers (`.home-usage`, `.first-session`) no longer carry
the white card chrome — they sit directly on the page bg, matching
the mock. Only Getting Started + Overview keep their white cards.
GLOBAL eyebrow vocabulary (`.home-hero-intro .eyebrow`,
`.first-session > .eyebrow`, `.surfaces > header .eyebrow`,
`.browse-section .eyebrow`) all aligned to mock spec: 12px / 700 /
1.5px tracking / brand-dark color / 14px bottom margin.
Hero h1 bumped to 44px / 800 / -1px tracking (was 32px / 600).
51/51 home tests pass.
* fix(web): /home session-intro card + terminal-body verbatim mock match
User comparison flagged three remaining /home gaps; this patch
addresses each:
- `.session-intro` rule was missing — the "five beats" tagline
rendered as a bare line with no card chrome. Added the mock-
spec card: white surface, 14px radius, 20×24 padding, 1px
border + shadow-sm, with a 44×44 brand-light icon block on the
left.
- Beat 1 terminal-title was `~/{{ workspace_dir }} — zsh` (mock-
style shell-pwd format), but the user wants every terminal
frame across all 5 beats to read `claude — {{ instance_brand }}`.
Updated.
- Terminal-body line structure for beats 2-5 rewritten verbatim
from the CEO mock:
- `<span class="prompt">></span><span class="you">…</span>`
now has no space between the prompt and user input (mock
pattern: zero gap, the .prompt's `margin-right: 8px` provides
the visual separation).
- Beat 2 menu items use `<strong>[N]</strong>` numbering with
project entries on indented lines, each project name followed
by a `<span class="dim">(N ago)</span>` timestamp at a fixed
column — instead of my prior single-line concatenation.
- Beat 3 narrative split into 4 stanzas separated by blank lines
(matches mock): the "Switched to <strong>X</strong>" status,
then dim Loaded/Last-session lines, then a stand-alone "One
unprocessed input detected:" pair, then the "Want me to
process …" question. My prior version dim-wrapped the entire
block, which looked off.
- Beat 4 narrative split into headline summary + risks section
with <strong> heads + bullet lists separated by blank lines,
matching the mock's "Q1 close summary" / "Open risks" rhythm.
The Q1 question carries the mock's manual line-break + 2-
space continuation indent inside the `.you` span — without
that, terminal-body's `white-space: pre-wrap` would auto-wrap
awkwardly at a different column than the mock.
- Beat 5 exit narrative uses two separate dim lines + a
standalone `.ai-name` "See you next time." line, then prompt
+ caret. My prior version collapsed everything into one dim
block.
- Project names changed from customer-specific (`GRPN_*`) to
generic (RevenueAnalysis, WeeklyReview, Onboarding, OpsDb,
HRHandShake) so the OSS distribution stays vendor-agnostic
per CLAUDE.md.
- `Marketing plan` examples replaced with `Q1 close` so the
narrative stays plausible for an analyst audience.
12/12 home tests pass.
* fix(web): /home surfaces verbatim mock — VS Code thumb, Terminal expected-output, NEW badge
User comparison flagged three remaining surface-section gaps:
- VS Code surface card was rendering a generic "Screenshot pending"
placeholder; the mock has a labeled inline mockup
(`<a class="vscode-thumb">` w/ `.thumb-fallback`) showing the
recommended 4-pane layout (EXPLORER yellow, TERMINAL 1 purple,
TERMINAL 2 green, TERMINAL 3 orange) on a dark navy bg + a
"Recommended layout" caption pill. CSS `.vscode-thumb` block
added — uses gradient-strip backgrounds to draw the colored
panel bars without needing a base64 image.
- "Recommended" badge was a pill (999px radius) with
`--brand-accent` bg + navy text. Mock uses `.badge` instead of
`.recommend-pill` — solid `--primary` (brand-dark green) bg
with WHITE text and 4px radius. Replaced the class + CSS rule
so the badge reads as a tag, not a pill.
- Terminal surface card was missing the "What you should see"
subsection — mock has an `.expected-output` block showing a
sample of the welcome menu inside a dim dashed panel. Added the
block with the mock's exact rendered output (templated to
`{{ instance_brand }}` + generic project names instead of
customer-specific GRPN entries) plus the `.expected-output`
CSS (surface-dim bg + dashed border + `::before` "WHAT YOU
SHOULD SEE" eyebrow per mock spec).
Also addressed the explore-section feedback:
- Skills browse-card now carries the `new` class so it picks up
the `.browse-card.new::after` corner badge ("NEW", green bg,
white text, 10px / 700 / 0.5px tracking) per mock.
- Browse cards align same height via `align-self: stretch` (grid
default) + `flex-grow: 1` on `.browse-desc` so descriptions
fill remaining vertical space; previously the Skills tile sat
shorter because its desc text was longer than others'.
Structural HTML changes to all four surface cards: dropped the
inner `<div class="surface-card-head">` wrapper + `<p
class="surface-pitch">` class in favor of mock's flat layout
(`.what` + `.steps` + `.when-to-use`). `<ol class="surface-steps">`
replaced with `<div class="steps"><strong
class="steps-eyebrow">DAILY USE / QUICK ACCESS / CONNECT IT</strong>
<ol>...</ol></div>` so the eyebrow + numbered list share the
mock's tinted surface-dim panel.
12/12 home tests pass.
* fix(web): align /home setup walkthrough to design spec
- Setup-section header (eyebrow + heading + lede) floats above the
install hero; install card has no accent strip; step labels drop
`Step N —` prefix; closing strip is single flex row.
- VS Code surface card renders recommended-layout screenshot from
`/static/img/vscode-layout.png` with click-to-enlarge lightbox.
- Workspace install path cascades to `~/Desktop/{workspace_dir}` in
every step, surface card, first-session annotation, and shortcut.
- Step 1 verify text restores Enterprise — Finance and Legal option.
- Step 6 shortcut installs a shell function with arg forwarding
(`"$@"` unix / `@args` windows) and a user-facing Auto / YOLO
permission-mode toggle.
- Step 5 manual-fallback details inline on the CTA row; description
reads at step-lede size, not 13px chip.
- Setup-section heading no longer right-aligns (was inheriting
`header { display: flex; justify-content: space-between }` from
the legacy stylesheet; wrapper changed to `<div>`).
- Getting Started `<details>` block removed (duplicated links).
* test(web): align /home tests with restructured setup wizard
- Replace test_getting_started_card_renders_on_home with
test_setup_section_renders_for_not_onboarded — asserts the new
setup-section-header floats above the install hero and Getting
Started markup is absent (block removed in the prior commit).
- Update automode-block test to match labels without the
`Step N —` prefix.
- Update setup-CTA partial test to match the relabeled
"Copy install script to clipboard" button.
Drop orphaned CSS for `.home-getting-started`, `.home-gs-summary*`,
and `.home-gs-item` — selectors had no matching markup after the
Getting Started block was removed.
Also: Step 3 `pwd` expected-output uses an absolute path
(`/Users/yourname/Desktop/{workspace_dir}`) instead of the
tilde-prefixed form, matching what the command actually prints.
* fix(web): repaint home_onboarded + setup_advanced; align CTA label
- home_onboarded + setup_advanced still carried the retired blue
`#0056A3` as both `--hp-primary-dark` and the hero gradient
endpoint. Both reference `var(--primary-dark)` now so the green
palette cascades.
- setup_advanced YOLO snippet was the old `alias` form (no cd, no
arg forwarding). Replaced with the shell function variant from
/home Step 6, which drops into ~/Desktop/{workspace_dir} and forwards
"$@" (unix) / @args (Windows).
- setup_advanced ~/{workspace_dir} path references cascaded to
~/Desktop/{workspace_dir} so install story matches /home.
- Dashboard's "Setup a new Claude Code" button label aligned to the
canonical "Copy install script to clipboard" — matches /home and
the new docstring in _claude_setup_cta.jinja, which now mandates
this label across consumers.
* fix(web): keep base brand blue; scope green palette to /home redesign
User noticed login + dashboard had turned green when the /home
redesign flipped --primary from blue (#0073D1) to green (#2ea877)
in commit 278f202e. The brand-wide flip went further than the
redesign needed — only /home, /home (onboarded), and /setup-advanced
intentionally use the green/navy spec; every other page (login,
dashboard, catalog, marketplace, admin, profile) was just inheriting
the green because --primary cascaded everywhere.
Revert the global brand colour to blue and lock the green into the
two outstanding redesign scopes:
- style-custom.css: --primary back to #0073D1, --primary-light back
to rgba(0,115,209,0.1), --primary-dark back to #005BA3,
--brand-accent back to a lighter blue.
- home_onboarded.html: .home-mock now sets --hp-primary,
--hp-primary-dark, --hp-primary-light to explicit green hex
(matching home_not_onboarded), so the hero stays green regardless
of the global brand.
- setup_advanced.html: same lock — .advanced-mock pins the green
palette in-scope.
Hero gradients on both pages now reference the local --hp-primary
chain (not the global --primary), so any future palette tweak inside
either scope cascades correctly without disturbing the rest of the app.
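Why the scoped lock keeps the hero green while the rest of the app stays blue can be shown with a toy model of custom-property lookup. This is a deliberate simplification (nearest declaring scope wins, no specificity or fallbacks); `lookup` and the dict layout are hypothetical.

```python
# Each scope is a dict of custom properties, ordered from :root (outermost)
# to the element's nearest scope class (innermost).
ROOT = {"--primary": "#0073D1", "--primary-dark": "#005BA3"}
HOME_MOCK = {"--hp-primary": "#2ea877"}  # pinned inside .home-mock


def lookup(token: str, scope_chain: list[dict]) -> "str | None":
    """Return the value from the innermost scope that declares the token."""
    for declarations in reversed(scope_chain):
        if token in declarations:
            return declarations[token]
    return None


# Inside .home-mock the hero reads the local green token.
print(lookup("--hp-primary", [ROOT, HOME_MOCK]))  # #2ea877
# A page outside the scope reads the global brand blue.
print(lookup("--primary", [ROOT]))  # #0073D1
# The scoped token is not visible outside its scope class.
print(lookup("--hp-primary", [ROOT]))  # None
```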
* refactor(web): hoist --hp-* into shared design-tokens.css (--ds-*)
PR 2 of the design-system extraction ladder. Pure mechanical rename
+ dedup; no visual diff on any rendered page (verified on /home,
/dashboard).
- New app/web/static/css/design-tokens.css declares the full token
set on :root: brand surface (green primary, primary-dark, mint
light, brand-accent), hero (navy bg + ink), code-panel (near-black
bg + cool ink + warm-yellow), light surfaces (bg/surface/border),
text (primary/secondary/muted), orange accent, info + warn
callout vocabularies, navy-tinted elevation shadows, system font
stack + mono.
- base.html loads it alongside style-custom.css so the tokens are
globally available.
- Rename --hp-* -> --ds-* in home_not_onboarded (313 refs),
home_onboarded (15), setup_advanced (39). 367 token references
pointed at one of three local blocks; now all point at the
global :root.
- Drop the three local token blocks. Each scope class
(.home-mock / .advanced-mock) only keeps its base ink + font-size
+ line-height rules.
The legacy --primary family stays canonical for the blue base
brand — login, dashboard, catalog, marketplace, admin still read
blue. The design system is opt-in via the scope class.
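A mechanical rename of this kind is typically a one-line regex with a replacement count. A minimal sketch, assuming the prefix only ever appears as a custom-property name; `rename_tokens` is a hypothetical helper:

```python
import re

# Match the --hp- prefix only when a token name follows it.
_HP_TOKEN = re.compile(r"--hp-(?=[a-z])")


def rename_tokens(source: str) -> tuple[str, int]:
    """Rename --hp-* tokens to --ds-*; return the new text and the ref count."""
    return _HP_TOKEN.subn("--ds-", source)


sample = ".home-mock h1 { color: var(--hp-primary); background: var(--hp-hero-bg); }"
renamed, count = rename_tokens(sample)
print(renamed)
# .home-mock h1 { color: var(--ds-primary); background: var(--ds-hero-bg); }
print(count)  # 2

# The per-template counts quoted above add up to the stated total.
print(313 + 15 + 39)  # 367
```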
* refactor(web): extract shared components.css; migrate /home markup
PR 3 of the design-system extraction ladder. First batch of
reusable components lifted out of home_not_onboarded.html into a
new shared stylesheet; markup migrated to consume them.
- New app/web/static/css/components.css with five components, all
reusable on any page that loads design-tokens.css:
.callout-rec — amber lightbulb recommendation box
.callout-hint — blue info hint box
.code-output — "WHAT YOU SHOULD SEE" terminal output block
.lightbox — full-bleed image enlarge overlay
.setup-section-header — wizard header (eyebrow + h2 + lede)
- base.html loads components.css after design-tokens.css.
- home_not_onboarded.html markup renamed:
class="rec" -> class="callout-rec"
class="hint" -> class="callout-hint"
class="expected-output" -> class="code-output"
- Local CSS rules removed from home_not_onboarded.html for each of
the extracted components — ~150 lines down to 5-line "extracted to
components.css" comments. The bespoke wizard-specific styles
(.install-cmd, .os-tabs, .mode-tabs, .terminal-frame) stay
template-local for now since they only have one consumer.
Visual regression check: /home install hero renders the amber rec
callout, blue hint callout, dashed code-output block, green section
header, and click-to-enlarge VS Code thumb identically to the
pre-extraction render. 43 home tests pass.
* fix(web): unify page-headers — activity-center full-width, marketplace shares box
- /activity-center audit-log hero rendered as half-width because the
_page_hero include was inside <header class="obs-topbar">, a flex
row that pinned the time-range + auto-refresh controls next to it.
The hero is now a sibling rendered before the <header>, so it
spans the full container width like every other admin page; the
controls keep their flex row underneath.
- Marketplace hero unified with .page-header--hero. Markup is now
<section class="page-header page-header--hero mp-hero"> so the
shared box drives padding/radius/gradient/max-width/shadow; the
.mp-hero override block only carries the right-anchored cover
image and the rules for the search row + scope checkboxes (which
the canonical hero doesn't have). Inner text uses the canonical
.page-header__eyebrow / __title / __subtitle classes.
- .page-header--hero shadow tint now follows the brand blue
(rgba(0, 115, 209, 0.2)) instead of the leftover green from the
prior palette flip; same depth highlight everywhere the gradient
is blue.
* fix(web): unify remaining page heroes — admin, profile, install, store, stack
Sweep across pages that carried bespoke gradient hero markup so
every page-hero shares the canonical `.page-header--hero`
dimensions (padding 28/32/24, border-radius 14, max-width
var(--width-app), navy-tinted shadow, gradient with --primary →
--primary-dark). Inner text uses the .page-header__eyebrow /
__title / __subtitle classes so typography matches across the app.
- admin_tables: migrated to _page_hero.html include.
- admin_tokens: kept .tokens-hero wrapper for the counts-chip row
but added the canonical class on the same element; stripped
duplicate gradient + padding + typography rules.
- install: same pattern (kept hero-meta pill row).
- profile: migrated to _page_hero.html include.
- store_upload: kept .upload-hero wrapper for the .meta chip row;
composite class with the canonical hero.
- setup_advanced: .advanced-mock .ad-hero now matches canonical
dimensions; green palette retained via --ds-primary/dark.
- stack_card.css: .stack-hero (catalog + corporate-memory search
hero) uses canonical gradient + padding + max-width.
The detail-page heroes (marketplace_plugin_detail,
marketplace_item_detail, catalog_*_detail, store_edit,
admin_group_detail, admin_store_submission_detail) stay bespoke
for now — they're rich detail headers with photos, badges, install
actions; converting them would lose contract context. Same applies
to dashboard.html env-setup-cta (it's a CTA card, not a page hero).
* fix(web): canonicalise .container — single page shell every page inherits
Previously each admin page set its own `.container:has(.<page>)
{max-width: none}` + `.<page>-page {max-width: 1400px}` override,
and per-page hero markup either nested inside flex toolbars (which
pinned the hero next to filter controls and squeezed it half-width)
or self-constrained with a different max-width than the page. /home,
/dashboard, /marketplace, and /admin/* ended up at different widths
with different nav-to-hero gaps.
- style-custom.css `.container` now carries the canonical 1280px
max-width + `16px 32px 48px` padding so every page inherits the
same nav-to-hero gap and side gutters. `.container > main` is
margin/padding 0 so the container is the sole owner of gutters.
- `.page-header--hero` drops its self-constraining max-width and
auto-centering margin — the container provides the width, so the
hero sits flush with the table/toolbar below it.
- `.stack-hero` (catalog + corporate-memory) and `.advanced-mock
.ad-hero` (/setup-advanced) follow the same pattern: container
owns the width.
- Per-page max-width overrides stripped from admin_users,
admin_access, admin_groups, admin_marketplaces, admin_welcome,
admin_workspace_prompt.
- _page_hero include extracted from inside flex toolbars on
admin_users, admin_access, admin_groups, admin_marketplaces,
admin_server_config, admin_welcome, admin_workspace_prompt,
admin_sessions, admin_session_detail, admin_usage,
activity_center. The toolbar (`.users-toolbar`, `.gp-toolbar`,
etc.) keeps only the filter + action controls; hero renders
before it as a sibling.
- _page_chrome.html trimmed to just the page-background tint for
the redesign scopes; the duplicate `.container` rules it carried
are now redundant.
Verified: /home, /admin/marketplaces, /admin/users all render
container width 1280px with hero top at 88px (16px below the
72px-tall sticky nav). Same spacing as /home design spec.
* fix(web): admin_tables + admin_corporate_memory inherit canonical .container
Both pages were overriding `{% block layout %}` from base.html,
which bypasses the canonical `.container` wrapper. Result: the hero
spanned the full viewport (1596px on a wide screen) while the inner
content sat at a narrower max-width, so hero and content didn't
align, and the nav-to-hero gap differed from every other admin
page.
Switched both templates to `{% block content %}` so they render
inside the canonical `.container` from base.html — same path as
admin_groups, admin_users, admin_marketplaces, etc.
- admin_tables: dropped local `.page-title { max-width: 1600px }`
+ `.content { max-width: 1600px }` overrides (kept typography +
inner gutter rules) and the mobile padding overrides that paired
with them. Container now owns the gutters.
- admin_corporate_memory: only the block keyword needed changing;
the template already had a clean inner structure (no max-width
override on `.container-memory`).
Verified on /admin/tables and /admin/corporate-memory:
- .container width 1280, padding 16/32/48
- Hero top 88 (nav 72 + container padding-top 16)
- Hero + content both 1216px wide, both at left 190 — perfect
alignment with /admin/groups.
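The verified numbers follow directly from the container rules. A sketch of the arithmetic, assuming `box-sizing: border-box` on `.container` (so the 1280px max-width includes the side padding), a centred container, and a 1596px-wide viewport with no scrollbar offset; `hero_box` is a hypothetical helper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    top: int
    left: int
    width: int


def hero_box(viewport_width: int, nav_height: int = 72, container_max: int = 1280,
             pad_top: int = 16, pad_x: int = 32) -> Box:
    """Predict the hero's position inside the centred canonical container."""
    container_width = min(viewport_width, container_max)
    container_left = (viewport_width - container_width) // 2
    return Box(
        top=nav_height + pad_top,            # nav 72 + container padding-top 16
        left=container_left + pad_x,         # centring offset + side gutter
        width=container_width - 2 * pad_x,   # container minus both gutters
    )


print(hero_box(1596))  # Box(top=88, left=190, width=1216)
```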
* fix(web): drop .page-shell padding override + admin_tables stale :root
Two regressions discovered after the canonical-container unification:
1. `.container:has(.page-shell)` still set `padding: 28px 32px 48px`
while the canonical `.container` had moved to `16px 32px 48px`.
Every page-shell consumer (/admin/sessions, /admin/sessions/<id>,
/admin/usage, /marketplace, /dashboard, marketplace detail pages,
/me/activity, /store/*, /admin/store-submissions) was rendering
with a 28px nav-to-hero gap while /admin/users + /admin/groups
rendered with 16px. Same width, mismatched vertical rhythm.
The opt-in rule is now a no-op marker: canonical container
already provides 1280px + 16/32/48 + main margin/padding 0.
2. admin_tables.html had a stale `<style>` block that re-declared
`:root { --primary: var(--primary); ... }`. A custom property that
references itself forms a dependency cycle and is invalid at
computed-value time, so the page-header hero's
`linear-gradient(135deg, var(--primary), var(--primary-dark))`
collapsed to no background and the hero appeared as a pale ghost without
colour. The entire shadow `:root` block was a stale copy of the
design tokens that style-custom.css already provides. Dropped
it; tokens now resolve from the global `:root`.
After both fixes /admin/sessions, /admin/tables, and every other
page-shell consumer match /admin/groups exactly: container 1280px,
container padding-top 16px, hero at top 88px / left 190px / width
1216px.
* fix(web): drop /admin/tokens .tokens-page width + padding override
`.tokens-page` carried its own `max-width: 1280px; margin: 0 auto;
padding: 28px 8px 48px` block — the canonical `.container` already
provides width + 16/32/48 padding, so the nested wrapper was
adding 28px on top of the container's 16px (= 44px nav-to-hero
gap, vs 16px on every other admin page) and shrinking the hero
sideways by 8px on each side (1200px vs the canonical 1216px).
After: container owns the layout; `.tokens-page` is just a
font-family scope. /admin/tokens hero now sits at top 88, left 190,
width 1216 — same numbers as /admin/groups / /admin/users.
* fix(web): hero links readable on blue; /admin/access Groups link href
- New `.page-header--hero a` rule in style-custom.css forces any
anchor inside a gradient hero to render white + underlined so
links stay readable on the blue background. Previously links
inherited the global `var(--primary)` blue, which disappeared
on top of the matching blue gradient. No per-page class needed —
drop a plain `<a>` in any hero subtitle and it just works.
- /admin/access hero subtitle was Jinja-passing the inline link
with HTML-entity-encoded quotes (`href=&quot;...&quot;`). The
entities decoded to literal `"` characters inside the rendered
href, producing `/admin/%22/admin/groups%22`, a 404. Switched
the `set` to a block-set (`{% set page_hero_subtitle %}...{% endset %}`)
so the inline `<a href="/admin/groups">Groups</a>` survives
unescaped through `_page_hero.html`. Also stripped the now-redundant
inline `style="color:#fff;text-decoration:underline;"` — the new
shared rule handles it.
* fix(web): /dashboard top padding matches every other page
`.main` on /dashboard had `padding: 28px 32px 48px` while every
other page now uses `16px 32px 48px` via the canonical
`.container`. Dashboard bypasses `.container` (overrides
base.html's `layout` block to render a full-width `<main>`
directly), so the padding lives on `.main` itself — bumped the
top to 16px to match.
After: first child top = 88, left = 190, width = 1216 — same
numbers as /admin/groups / /admin/users / /admin/marketplaces.
* fix(web): green eyebrow + white title on .page-header--hero (matches /home)
`.page-header--hero .page-header__eyebrow` was faint white
(rgba(255,255,255,0.75)) — readable but unbranded against the blue
gradient. Changed to `var(--ds-brand-accent)` (mint green #54d3a0)
so every page hero pairs a green eyebrow with white title +
subtitle, echoing /home's setup-section header (green eyebrow,
dark heading combo). One CSS rule applies everywhere — no
per-page styling needed.
Also bumped the eyebrow to font-weight 700 / letter-spacing 1.2px
so the green stands out cleanly against the gradient.
* fix(web): page-header--hero + stack-hero use /home navy gradient
`.page-header--hero` and `.stack-hero` were on the brand-blue
gradient (`var(--primary)` → `var(--primary-dark)`) while
/home's hero (`.home-hero-intro`) sits on the deeper navy
gradient (`#0f1b3a` → `#1a2a5f`). Every other page-hero now
uses that same navy gradient so /home, /marketplace, /catalog,
/corporate-memory, /admin/*, /profile, /install, /dashboard,
/setup-advanced share one brand surface. Shadow tint adjusted
to the navy depth (rgba(15, 27, 58, 0.22)).
Brand blue stays the link/CTA colour everywhere else; only the
hero box itself is navy.
* fix(web): primary buttons green; marketplace tabs navy translucent
Two parity tweaks pulling the rest of the app toward /home's
visual language.
- `.btn-primary` (both rules in style-custom.css) now uses
`var(--ds-primary)` / `var(--ds-primary-dark)` green fill,
matching the "Copy install script to clipboard" button on
/home. Brand-blue `--primary` still drives link colour and the
accent surface; only the filled button background flipped to
green. Every page with a `.btn-primary` (admin "+Add user",
"+Add marketplace", catalog, marketplace actions, dashboard,
modals) now reads as the same "do it" affordance.
- `.mp-tabs` (Curated Marketplace / Flea Market / My Stack tab
group) now sits on the navy `--ds-hero-bg` with translucent
white pills (rgba(255,255,255,0.10) inactive, 0.18 active) —
same translucent-white-on-navy treatment as the "Just browse —
no install needed" pill on /home. Icons render as soft white;
per-tab colour-coding dropped in favour of the unified surface.
* fix(web): catalog/memory tabs + empty-state CTA + admin action buttons
Bring /catalog and /memory in line with /home + /marketplace:
- `.stack-tabs` (Browse / My Stack / Recipes on /catalog,
Browse / My Stack on /memory) now uses the navy `--ds-hero-bg`
container with translucent-white-on-navy pills, mirroring the
`.mp-tabs` treatment and /home's "Just browse — no install
needed" CTA pill. Per-tab icon colour-coding dropped — icons
render as soft white on the navy fill.
- `.stack-tabs-row__actions .btn` (right-slot "+New Recipe",
"+New Data Package" admin CTAs) now uses green primary fill
(`--ds-primary`), matching `.btn-primary` and /home's
"Copy install script to clipboard" button.
- `.stack-empty .cta a` (empty-state action button — the
"Open /admin/tables →" CTA on /catalog and equivalent on
/memory) flipped from blue `--primary` to green `--ds-primary`
so the colour aligns with every other primary button in the app.
* fix(web): marketplace Search button green (--ds-primary) matching other CTAs
* fix(web): unify Search button + admin-action button across browse pages
- Added Search button (`<button class="stack-hero__search-btn">`)
to /catalog and /memory heroes — same green pill as /marketplace.
Wired to the existing live-filter pipeline (button click runs
`applyFilters()` and refocuses the input). All three browse pages
now wear the identical search bar UI.
- `.stack-hero__search-btn` shares `--ds-primary` fill with
`.mp-hero .search-btn`.
- `.mp-actions .btn` ("Submit a skill or plugin" CTA on /marketplace)
flipped from the legacy blue-outline to the same green primary
fill + dimensions (`display: inline-flex; line-height: 1;
padding: 9px 16px; gap: 6px`) as `.stack-tabs-row__actions .btn`
on /catalog and /memory. All three right-slot action buttons
render at identical height now.
- `.stack-tabs-row__actions .btn` got `inline-flex` + `line-height: 1`
+ `gap: 6px` so a `<button class="btn">` and a `<a class="btn">`
both render at exactly 33px high — the embedded
`.admin-only-hint` chip no longer pushes one variant taller
than the other.
* fix(web): marketplace guide CTAs green (fastpath + primary); drop flea purple
* fix(web): dashboard CTA hero on navy; readable <code> chips in hero
- `.env-setup-cta` on /dashboard ("Set up a new Claude Code"
card) flipped from the brand-blue gradient + green-tinted shadow
to the canonical navy gradient (`--ds-hero-bg` → `#1a2a5f`) with
navy-tinted shadow + 14px radius + 28/32/24 padding, matching
`.page-header--hero` and /home's `.home-hero-intro`. Dashboard's
top CTA now sits on the same brand surface as every other hero.
- Added `.page-header--hero code` rule — translucent white pill +
warm-yellow ink (#ffd866) so `<code>` chips embedded in hero
subtitles read as code samples against the navy gradient. The
global `code` rule sets `color: var(--text-primary)` (dark),
which turned in-hero chips into invisible dark-on-white-on-navy
ghosts (e.g. the `-by-dev` suffix on /store/new).
- /store/new's `.page-header__subtitle code` dropped its inline
style override — the shared rule handles it now.
* feat(web): two-theme switching via data-theme + admin toggle
Introduces a theme system that flips the entire UI palette between
"navy" (current design, default) and "blue" (pre-redesign palette)
via a single `<html data-theme="...">` attribute. Page markup, class
names, and component styles don't change — only the `--ds-*` token
values flip.
Backend
- New `app/instance_config.py::get_instance_theme()` resolves the
active theme from `AGNES_INSTANCE_THEME` env > `instance.theme`
in instance.yaml > default "navy". Unrecognised values clamp to
"navy" so a typo doesn't break the page.
- `app/web/router.py::_build_context` injects `instance_theme`
alongside `instance_brand` etc. so every template inherits it.
- `app/web/templates/base.html` renders
`<html lang="en" data-theme="{{ instance_theme | default('navy') }}">`.
CSS
- `app/web/static/css/design-tokens.css` adds two new tokens to
the default `:root` set: `--ds-hero-shadow` (drop-shadow tint
on hero boxes) and `--ds-hero-eyebrow` (eyebrow accent colour).
Plus a `:root[data-theme="blue"]` override block that flips
eight tokens: `--ds-primary`, `--ds-primary-dark`,
`--ds-primary-light`, `--ds-brand-accent`, `--ds-hero-bg`,
`--ds-hero-bg-deep`, `--ds-hero-shadow`, `--ds-hero-eyebrow`.
The blue theme aliases the brand surface tokens back to the
legacy `--primary` family.
- `.page-header--hero`, `.stack-hero`, `.env-setup-cta`,
`.home-mock .home-hero-intro` now reference the new
`--ds-hero-shadow` and `--ds-hero-bg-deep` tokens instead of
hard-coding `rgba(15, 27, 58, 0.22)` and `#1a2a5f` — gradient +
shadow now flip with the theme.
- `.page-header--hero .page-header__eyebrow` uses
`var(--ds-hero-eyebrow)` so the eyebrow goes mint-green on
navy and translucent-white on blue (mint on blue reads poorly).
Admin
- `app/api/admin.py::_KNOWN_FIELDS["instance"]` now registers a
`theme` field of kind `select` with options `["navy", "blue"]`
and a `hint` explaining the trade-off. The existing
/admin/server-config UI auto-renders a select for this — no
template changes needed.
Defaults
- Default value is "navy" so existing instances see no visual
change. Admins flip to "blue" via /admin/server-config to
restore the pre-redesign look.
Restart note: uvicorn must reload to pick up the Python changes
(new getter, new template-context key, new known-field). CSS
changes hot-reload via browser refresh.
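A minimal sketch of the resolution order described under Backend (env var, then instance.yaml, then default, with unknown values clamped). The function name `resolve_instance_theme`, its parameters, and the trim/lowercase step are assumptions for illustration; the real `get_instance_theme()` signature is not shown in this commit.

```python
import os

KNOWN_THEMES = ("navy", "blue")
DEFAULT_THEME = "navy"


def resolve_instance_theme(env=None, instance_cfg=None):
    """Hypothetical stand-in: env var > instance.yaml value > default."""
    env = os.environ if env is None else env
    instance_cfg = instance_cfg or {}
    # A non-empty AGNES_INSTANCE_THEME wins over the YAML `instance.theme` value.
    raw = env.get("AGNES_INSTANCE_THEME") or instance_cfg.get("theme") or DEFAULT_THEME
    # Normalising case and whitespace is an assumption, not confirmed by the commit.
    theme = str(raw).strip().lower()
    # Unrecognised values clamp to the default so a typo cannot break the page.
    return theme if theme in KNOWN_THEMES else DEFAULT_THEME
```

Passing plain dicts for `env` and `instance_cfg` keeps the sketch testable without touching the process environment.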
* fix(web): blue theme — home hero eyebrow + CTA contrast
`.home-hero-intro .eyebrow` and `.btn-intro-primary` referenced
`--ds-brand-accent` directly, which on the blue theme resolves to
the lighter brand-accent blue (#4F9DEB). Result: light-blue eyebrow
on the blue gradient ("WELCOME, ADMIN" barely readable) and a
light-blue button with darker-blue text ("Set up in ~15 min")
that all sat in the same hue range.
Introduces three new theme-aware CTA tokens and retunes one existing token:
- `--ds-hero-eyebrow` already existed; blue theme bumped opacity
to 0.92 so the eyebrow reads as full white.
- `--ds-hero-cta-bg` + `--ds-hero-cta-fg` + `--ds-hero-cta-bg-hover`
(new) flip the primary hero CTA: mint-green on navy (default),
white-on-blue under `data-theme="blue"`.
`.home-hero-intro .eyebrow` now uses `--ds-hero-eyebrow` (mint on
navy / white on blue) and `.btn-intro-primary` uses the CTA token
trio.
Recommended palette on blue theme:
- Eyebrow: white at 92% opacity (clear on the blue gradient).
- Primary CTA pill: white background, brand-blue dark text
(`--primary-dark` = #005BA3) for AAA-level contrast.
- Secondary CTA: translucent white pill (unchanged).
* fix(web): blue theme — callout-hint info bg/border/ink re-tinted to brand blue (was indigo, clashed with brand-blue hero)
* fix(store): promote-on-approve looks up version_no by submission_id
Live bug observed on agnes-development: an entity had 5+
version_history rows sharing the same `hash` (user re-uploaded
byte-identical bundles as v2/v4/v6 of the same skill — the LLM and
inline checks happily approved each one). The runner's
promote-on-approve path looked up the submission's version_no by
hash:
    for entry in entity.version_history:
        if entry["hash"] == sub_hash:
            target = int(entry["n"])
            break
The loop matched the FIRST hash collision — always v1, n=1. With
current=1, the forward-only `target > current` guard then skipped
the promote, leaving the entity stuck at v1 even though the new
submission's status flipped to `approved`. UI kept showing v1 as
"current".
Fix: look up by submission_id via the existing
`_version_no_for_submission` helper (already used by retry / rescan
/ download paths). Same lookup applied in
`admin_override_store_submission` which had the identical hash-match
loop.
Test: TestPromoteLookupByByteIdenticalBundles uploads v1 + a
byte-identical v2, drives the LLM with mock-approve, asserts
entity.version_no advances to 2.
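The failure mode can be reproduced in a few lines. This is only a sketch: both function names and the `submission_id` key on each history entry are hypothetical, and the real `_version_no_for_submission` helper may look the number up differently.

```python
def version_no_by_hash(version_history, sub_hash):
    """Old lookup: returns the first entry whose hash matches."""
    for entry in version_history:
        if entry["hash"] == sub_hash:
            return int(entry["n"])
    return None


def version_no_by_submission(version_history, submission_id):
    """New lookup (hypothetical shape): ids are unique per upload."""
    for entry in version_history:
        if entry.get("submission_id") == submission_id:
            return int(entry["n"])
    return None


history = [
    {"n": 1, "hash": "abc", "submission_id": "sub-1"},
    {"n": 2, "hash": "abc", "submission_id": "sub-2"},  # byte-identical re-upload
]
current = 1

target_old = version_no_by_hash(history, "abc")           # collides with v1
target_new = version_no_by_submission(history, "sub-2")   # finds v2

# The forward-only guard only promotes when target > current.
promoted_old = target_old is not None and target_old > current
promoted_new = target_new is not None and target_new > current
```

With the hash lookup the guard sees `1 > 1` and skips the promote; with the submission lookup it sees `2 > 1` and proceeds.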
* fix: bundle #329 reviewer-Important follow-ups + post-merge polish
Bundled with Vojtech's commit ahead of this (the promote-on-approve
`version_no` lookup-by-submission_id fix) since #330 is the next
release-cut PR and the four #329 follow-ups would otherwise need a
standalone release-cut PR — prohibited by docs/RELEASING.md §
"Release-cut belongs to the PR".
Fixed:
- src/usage_ask.py — SCHEMA_DIGEST + SYSTEM_PROMPT referenced the
dropped `usage_plugin_daily` table. The admin
`POST /api/admin/telemetry/ask` endpoint ships SYSTEM_PROMPT to
the LLM, so any model-emitted SQL against `usage_plugin_daily`
would fail with a DuckDB binder error post-#329 merge. Both now
describe the new v48 rollups (`usage_marketplace_item_daily` /
`_window`), and rule 5 of the prompt points at them.
Internal:
- CHANGELOG.md [0.54.20] section restored to its canonical content
from the v0.54.20 git tag. The #329 self-merge carried 226 lines
of author's pre-rebase bullets that ended up mis-attributed; the
published v0.54.20 GitHub Release (FTS BM25 + batch bar) now
matches the CHANGELOG section verbatim. Also fills in [Unreleased]
with this PR's bullets (Fixed + Internal).
- tests/conftest.py — dropped the unused
`conn_with_usage_schema_and_attribution` fixture that INSERTed
into the now-removed `usage_attribution_*` tables. Zero callers
today, but a tripwire — the first future test to request it would
have failed with a binder error.
- app/web/templates/marketplace.html — replaced a customer-specific
token (`groupon-marketplace`) in the Most Popular sort-tiebreaker
comment with a generic `<customer>-marketplace` placeholder per
CLAUDE.md § Vendor-agnostic OSS. Also scrubbed an `agnes-development`
reference in app/api/admin.py and src/store_guardrails/runner.py
(cherry-picked from Vojtech's commit) on the same hygiene rule.
* release: 0.54.22 — flea-market promote-by-submission_id fix + #329 reviewer follow-ups
---------
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
* feat(telemetry): marketplace item rollup refactor (schema v46)
Replace the v42 attribution layer with prefix-split + live lookup against
marketplace_plugins / store_entities. The v42 design had a latent bug —
AttributionLookup keyed on bare skill names while Claude Code writes
`<plugin>:<local>` in JSONL, so lookups never matched and
usage_plugin_daily stayed empty in every deployment.
Schema (v46 migration):
- Drop usage_attribution_skills / _agents / _commands (mapping tables,
derivable from marketplace_plugins + plugin tree).
- Drop usage_plugin_daily (always empty in production due to the bug above).
- Create usage_marketplace_item_daily — per-day fact (count, distinct_users,
error_count), composite PK on (day, source, type, parent_plugin, name).
- Create usage_marketplace_item_window — sliding-window snapshot with
true cross-window distinct user counts; period_label='last_7d' refreshes
every tick, 'last_30d' refreshes hourly (tracked via session_processor_state).
- Mark usage_tool_daily as candidate for removal (no product-UI consumer).
Attribution flow:
- MarketplaceItemLookup replaces AttributionLookup. Preloads
marketplace_plugins.name + store_entities.name into memory once per
UsageProcessor tick, then per-event splits identifier on ':',
matches prefix, writes resolved source / parent_plugin into
usage_events. agnes-store-bundle prefix routes to flea entities.
Slash commands with `plugin:` prefix count as type='skill' in rollup.
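A rough sketch of the prefix-split step, assuming the preloaded names are held in two in-memory sets. `ItemLookupSketch`, `Resolved`, and the `"curated"` source label are illustrative names, not the real `MarketplaceItemLookup` API; the `agnes-store-bundle` routing and the handling of bare names are left out because the commit does not spell them out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resolved:
    source: Optional[str]          # "curated", "flea", or None when unmatched
    parent_plugin: Optional[str]
    name: str


class ItemLookupSketch:
    """Hypothetical stand-in for the per-tick in-memory lookup."""

    def __init__(self, curated_plugins, flea_entities):
        # Preloaded once per tick in the real processor; plain sets here.
        self.curated = set(curated_plugins)
        self.flea = set(flea_entities)

    def resolve(self, identifier):
        # Claude Code writes `<plugin>:<local>`; split on the first ':'.
        prefix, sep, local = identifier.partition(":")
        if not sep:
            # Assumption: a bare name carries no plugin attribution.
            return Resolved(None, None, identifier)
        if prefix in self.curated:
            return Resolved("curated", prefix, local)
        if prefix in self.flea:
            return Resolved("flea", prefix, local)
        return Resolved(None, None, identifier)
```

Matching on the prefix rather than the bare skill name is what the v42 lookup missed.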
API:
- BREAKING: MarketplaceItem.unique_users_30d renamed to distinct_users_30d
(now a true distinct count from the window snapshot, not sum-of-daily).
- InnerDetailResponse gains a telemetry field — invocations_30d +
distinct_users_30d surfaced on curated inner skill / agent detail pages.
- Card chip hidden pending UX finalisation; data stays in the response.
Backfill: scripts/backfill_marketplace_rollup.py — one-shot rebuild over
historic usage_events after deploy, idempotent.
USAGE_PROCESSOR_VERSION bumped 4 → 5 so the reprocess loop re-attributes
existing events to the new source/ref_id semantics on the next tick.
Tests rewritten: test_session_processor_usage, test_usage_rollups,
test_marketplace_telemetry, test_api_admin_usage_reprocess,
test_db_schema_version, test_home_stats, test_schema_v42_migration.
New: test_backfill_marketplace_rollup.
* fix(marketplace): refresh Most Popular on search + category changes
`loadMostPopular()` early-exits when `state.q` or `state.category` is
set, but the search + category handlers only called `loadItems()` —
so once the section was visible, typing a query or filtering by
category didn't re-run the hide check and the cards stayed on screen
out of scope. Tab + sort handlers already chained the call.
Add the call to runSearch + category pill click handlers (All +
per-category) so the visibility contract holds for every state
mutation that can flip the early-exit condition.
* feat(marketplace): All-plugins section + 7-day Most Popular
Listing layout:
- Always-visible "All plugins" / "All items" / "Your stack" section
header (label swaps per tab) wrapped in `#mp-all-section` so its
margin-collapse mirrors the sibling `#mp-popular-section` and the
spacing from the filter row stays consistent in both layouts.
- Sort dropdown moved from the filter row into the All-* header,
pinned right via `margin-left: auto`. Anchored to its section so
the relationship between sort + grid is obvious.
- `.mp-section-header` gets `min-height: 32px` + `align-items: center`
so the bare-text Most Popular row matches the dropdown-bearing
All-* row.
- `.mp-section-header` margin tightened 24px → 20px on top.
Most Popular:
- Capacity reduced 8 → 4 cards.
- Now reflects a 7-day window (was 30-day). Backend surfaces
`invocations_7d` + `distinct_users_7d` on `MarketplaceItem`
alongside the existing 30d fields; the loader pulls a wider page
(server still sorts by 30d) and re-sorts + filters client-side
on `invocations_7d > 0` so the strip stays "hot right now".
- Section label updated to "Last 7 days".
- Section now renders on both `curated` and `flea` tabs (was
curated-only). Hidden on `my` and whenever search / category
filter is active. Refresh hooks wired into search + category
click handlers so visibility flips immediately on state change.
Backend (`_load_invocation_stats`):
- Single SELECT pulls both `last_30d` and `last_7d` rows from
`usage_marketplace_item_window`; the result dict carries
invocations + distinct_users for both windows.
- Trend (recent_7 vs prior_7) kept on the daily fact table so it
stays independent of the window snapshot's freshness.
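The single-SELECT shape might look like the sketch below. It uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the column names `invocations` and `distinct_users` plus the reduced table layout are assumptions; only the table name and the two `period_label` values come from the commit text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed layout; the real table carries more columns.
conn.execute(
    """
    CREATE TABLE usage_marketplace_item_window (
        period_label TEXT, name TEXT, invocations INTEGER, distinct_users INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO usage_marketplace_item_window VALUES (?, ?, ?, ?)",
    [
        ("last_30d", "alpha", 120, 9),
        ("last_7d", "alpha", 30, 4),
        ("last_30d", "beta", 15, 2),
    ],
)


def load_invocation_stats(conn):
    """One query for both windows, pivoted into a per-item dict."""
    stats = {}
    rows = conn.execute(
        """
        SELECT name, period_label, invocations, distinct_users
        FROM usage_marketplace_item_window
        WHERE period_label IN ('last_30d', 'last_7d')
        """
    )
    for name, label, invocations, users in rows:
        suffix = "30d" if label == "last_30d" else "7d"
        # Items missing a window row keep zeros for that window.
        entry = stats.setdefault(name, {
            "invocations_30d": 0, "distinct_users_30d": 0,
            "invocations_7d": 0, "distinct_users_7d": 0,
        })
        entry[f"invocations_{suffix}"] = invocations
        entry[f"distinct_users_{suffix}"] = users
    return stats


stats = load_invocation_stats(conn)
```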
* feat(marketplace): Most adopted sort + hide Trending when no trend data
Add a fourth sort option to the All-items dropdown — "Most adopted
(30d)", keyed on `MarketplaceItem.distinct_users_30d` (true 30d
distinct user count from `usage_marketplace_item_window`). Protects
the listing from power-user skew that `most_used` is susceptible to:
one user × 100 invokes can't beat 10 different users × 1 invoke
under adoption sort.
Hide Trending option when the response has no trend data. User
reported `sort=trending` returning an empty grid because every
plugin's `trend_pct` was None (prior-week threshold of >= 3
invocations didn't clear anywhere). Empty grids on a user-selected
sort are worse UX than just not offering the sort — surface what
works, hide what doesn't.
Backend (`app/api/marketplace.py`):
- `_apply_sort` gains a `most_adopted` branch (DESC distinct_users_30d,
ties by name ASC).
- `sort` Literal extended.
- `ItemListResponse.available_sorts` lists the sort keys the UI
should expose for this response. recent/most_used/most_adopted
always; trending only when at least one item in the tab's stats
carries a non-null trend_pct.
- `_available_sorts(stats_dicts)` helper centralises the rule —
curated and flea branches pass one stats dict, my-tab passes both
(option is available when either source has trend data).
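The two backend rules above can be sketched as follows. The function names drop the leading underscore to mark them as illustrations, and the item and stats dict shapes are assumptions; the sort keys themselves are the ones named in this commit.

```python
def apply_most_adopted(items):
    """DESC distinct_users_30d, ties broken by name ASC."""
    return sorted(items, key=lambda it: (-it["distinct_users_30d"], it["name"]))


def available_sorts(*stats_dicts):
    """Sort keys the UI should expose for one response."""
    # These three are always offered.
    sorts = ["recent", "most_used", "most_adopted"]
    # Trending only when at least one item carries a non-null trend_pct.
    has_trend = any(
        item_stats.get("trend_pct") is not None
        for stats in stats_dicts
        for item_stats in stats.values()
    )
    if has_trend:
        sorts.append("trending")
    return sorts
```

Accepting any number of stats dicts lets the my-tab caller pass both sources while curated and flea pass one.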
Frontend (`app/web/templates/marketplace.html`):
- New `<option value="most_adopted">Most adopted (30d)</option>`
between Most used and Trending.
- URL state allowlist extended so `?sort=most_adopted` round-trips.
- `applyAvailableSorts(available)` runs after each list fetch:
hides options not in the response's available_sorts; if the user
is on a now-unavailable sort, resets to 'recent' and re-fetches.
Search-mode fan-out unions availability across the curated + flea
responses so a hit on either side keeps the option visible.
* feat(marketplace): funnel chip on cards + deterministic Most Popular sort
Card chip — funnel telemetry between description and footer:
[stack-icon] N installed · [user-icon] N active · [bolt-icon] N calls · ↑/↓ N%
- stack_count (new MarketplaceItem field): for curated it's COUNT(*)
on user_plugin_optouts (post-v28 row PRESENCE = subscribed; system
plugins are fanned out to every user via fanout_system_for_user so
the count includes them naturally). For flea it reuses the existing
store_entities.install_count (bumped on install/uninstall).
- distinct_users_30d (existing) — active users in the 30d window.
- invocations_30d (existing) — call volume.
- trend_pct (existing) — week-over-week, both directions: green ↑ /
red ↓, magnitude only (sign in the arrow). Hidden when null.
Backend additions in app/api/marketplace.py:
- MarketplaceItem.stack_count field.
- _load_curated_stack_counts() — one SELECT per render, GROUP BY
(marketplace_id, plugin_name). Wired into the curated + my-tab
branches; flea reads install_count off the entity row directly.
Frontend (app/web/templates/marketplace.html):
- Heroicons solid 24×24 inlined (one helper per icon, all
fill="currentColor" so per-segment colour tokens apply): rectangle-
stack (mirrors the My Stack tab icon), user, bolt, arrow-trending-
up/down.
- Per-segment colour: installed=amber #F59F0A (My Stack accent),
active=green #0e9b6a, calls=orange #f97316. Text stays neutral so
the chip still reads as metadata, the leading glyph carries the
visual cue. Trend pill keeps the full-segment green/red colour.
- Zero state: chip hidden when stack_count == 0 AND invocations_30d
== 0 — brand-new cards aren't visually penalised by a "0·0·0" row.
- Tooltips on every segment via title="…" so hover explains the
number's meaning to anyone uncertain about the icon.
Most Popular section — deterministic ordering:
Previously sorted by invocations_7d DESC with no tie-breakers, so
several cards with identical 7d call counts would swap places on
refresh (JS stable sort fell back on backend order, and the backend's
own tie-breaker for `most_used` was just name ASC: six `grpn`
plugins from six test marketplaces collapsed to the same name and
became indeterminate via list_with_filters' created_at order).
New cascading hierarchy (chosen primary now matches what "most
popular" really means — wide adoption, not power-user volume):
1. distinct_users_7d DESC ← adoption / social proof
2. invocations_7d DESC ← volume at equal adoption
3. distinct_users_30d DESC ← broader adoption fallback
4. invocations_30d DESC ← broader volume fallback
5. name ASC ← deterministic textual order
6. marketplace_slug ASC ← splits duplicate plugin names across
marketplaces
Six levels guarantee any two items end at a different sort key, so
the strip is stable across refreshes.
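The cascade above maps naturally onto a tuple sort key. The real ordering runs client-side in the template's JavaScript; this Python version is only an illustration, and the item dict shape is assumed from the `MarketplaceItem` field names mentioned in this log.

```python
def most_popular_key(item):
    """Six-level key: numeric fields negated for DESC, text fields ASC."""
    return (
        -item["distinct_users_7d"],    # 1. adoption / social proof
        -item["invocations_7d"],       # 2. volume at equal adoption
        -item["distinct_users_30d"],   # 3. broader adoption fallback
        -item["invocations_30d"],      # 4. broader volume fallback
        item["name"],                  # 5. deterministic textual order
        item["marketplace_slug"],      # 6. splits duplicate plugin names
    )


def sort_most_popular(items):
    return sorted(items, key=most_popular_key)
```

Because the last two levels are textual, two cards with identical counts and the same plugin name still land in a fixed order regardless of the order the backend returned them in.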
* fix(marketplace): unify Most Popular on 30d + right-align installed chip
Most Popular section was sorting on the 7d window while its cards
rendered 30d numbers — header label promised one thing, cards showed
another. Unified everything on 30d so a card means the same data
everywhere on the page.
- Dropped the "Last 7 days" meta from the Most Popular header.
- Sort cascade now starts on distinct_users_30d, then invocations_30d,
with 7d adoption/volume as recency-aware fallbacks before the name +
marketplace_slug deterministic tail. Six levels guarantee identical
sort keys never produce indeterminate order across refreshes.
- Filter switched from invocations_7d > 0 to invocations_30d > 0 to
match the new horizon.
- Most Popular now only renders on page 1 of the listing. Past initial
discovery, a top-of-list popularity strip on page 2+ would shadow the
results the user paged into. Pager click handler refreshes the
section so navigating back to page 1 re-mounts it.
Chip layout — split engagement vs adoption visually:
[user] N active · [bolt] N calls · [↑/↓] N% [stack] N installed
└────────── LEFT (time-bounded engagement) ────┘ └── RIGHT (all-time) ──┘
- Installed (stack_count) is all-time, decremented on uninstall. Alone
it says little ("12 people installed it") without the engagement
context next to it ("…but did anyone actually use it?"). Visually
separating the two groups makes that distinction obvious — left
group answers "is it used", right answers "does anyone have it".
- Implemented via flex with margin-left:auto on .seg-installed so
installed drifts to the trailing edge.
- Installed tooltip now reads "Currently installed by N users" — the
count is a real-time net (uninstall drops it), and saying "currently"
makes that explicit. Helps when a card shows 0: signals "nobody has
this in their stack right now", not "data missing".
* feat(plugin-detail): telemetry chip in hero, derived rows in sidebar
Surface the same telemetry funnel the listing card carries on the
curated plugin detail page, so clicking through from /marketplace
keeps a single mental model — figures match, semantics match. The
detail sidebar drops the two raw numbers that used to live there
(Invocations 30d / Users 30d — duplicated by the chip now) and
replaces them with two *derived* signals only the daily series can
provide: Active days + Last used.
Backend (app/api/marketplace.py):
- PluginDetailResponse.stack_count — curated reads via
_load_curated_stack_counts(), flea reuses install_count. Frontend
treats both sources uniformly.
- _build_telemetry() always returns a dict (never None). Frontend
decides chip visibility from stack_count + invocations_30d the
same way the listing card does. daily_series is always 30 entries
(zero-padded) so "Active days" and "Last used" derivations on the
sidebar are trivial array filters.
Frontend (app/web/templates/marketplace_plugin_detail.html):
- New .hero-telemetry slot at the bottom of the hero meta column,
between the pills row and the action buttons. Renders the four
funnel segments — active · calls · trend · installed — joined by
` · `. No left/right split: the hero has space, so a single
coherent metadata strip reads cleaner than the card's split layout.
- Heroicons solid inlined (user / bolt / arrow-trending-up,-down /
rectangle-stack) recoloured against the dark hero — icons in
lighter tokens (mint #6ee7b7, peach #fdba74, cream #fde68a), trend
pill keeps the saturated green/red because direction-coding earns
its own colour.
- Tooltip on installed reads "Currently installed by N users" — the
count is a real-time net (drops on uninstall), and "currently"
makes that explicit when a card shows 0.
- fmtNum helper added so 1.2k / 14M renderings match the card's
format exactly.
- Sidebar swap: Invocations + Users rows removed, replaced by
Active days → "N of 30"
Last used → fmtRelative of the latest non-zero day
Both derived from telemetry.daily_series — engagement consistency
+ recency, neither of which the hero chip exposes on its own.
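A sketch of how the two sidebar rows could be derived from a zero-padded 30-entry series. The entry shape (`{"day": ..., "count": ...}`), the oldest-to-newest ordering, and all three function names are assumptions; the real derivation lives in the template's JavaScript.

```python
from datetime import date, timedelta


def zero_padded_series(counts_by_day, end, days=30):
    """Always `days` entries, oldest first, with 0 for days without activity."""
    start = end - timedelta(days=days - 1)
    series = []
    for offset in range(days):
        day = (start + timedelta(days=offset)).isoformat()
        series.append({"day": day, "count": counts_by_day.get(day, 0)})
    return series


def active_days(daily_series):
    """Number of days with at least one invocation."""
    return sum(1 for entry in daily_series if entry["count"] > 0)


def last_used(daily_series):
    """Latest non-zero day, or None when the item was never used."""
    for entry in reversed(daily_series):
        if entry["count"] > 0:
            return entry["day"]
    return None


series = zero_padded_series({"2025-01-05": 3, "2025-01-28": 1}, end=date(2025, 1, 30))
label = f"{active_days(series)} of {len(series)}"
```

Because the series is always full length, neither helper needs to handle gaps or missing days.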
* feat(item-detail): telemetry chip in hero for curated skill/agent
Bring the funnel chip the plugin detail page got in 4cf38d40 to the
curated inner skill/agent detail page — clicking through from the
listing card now keeps the same metadata strip from grid to plugin
page to inner item page.
Backend (app/api/marketplace.py):
- _load_inner_item_stats() rewritten:
* always returns a dict (never None) so the frontend can decide
chip visibility client-side, same contract as _build_telemetry
* adds trend_pct, computed the same way as plugin level
(recent_7 vs prior_7 from usage_marketplace_item_daily, ≥3
prior-week threshold)
* adds daily_series (30 entries, zero-padded) so the sidebar can
derive Active days + Last used
- InnerDetailResponse.parent_stack_count — new field. Skills/agents
don't have a per-item subscription model, so the hero shows the
*parent plugin's* stack count under a "Plugin:" prefix. The
funnel: "12 installed the plugin → 2 actually use this skill".
- curated_skill_detail + curated_agent_detail handlers load
_load_curated_stack_counts() once and pass the parent's value.
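The trend rule mentioned above (recent_7 vs prior_7 with a prior-week threshold of 3) could be sketched like this. The percentage formula, the rounding, and the function signature are assumptions; the commit only states which two windows are compared and that a prior week below the threshold yields no trend.

```python
def trend_pct(daily_counts, min_prior=3):
    """Week-over-week change in percent, or None when it cannot be computed.

    `daily_counts` is assumed to be ordered oldest to newest.
    """
    if len(daily_counts) < 14:
        return None
    recent_7 = sum(daily_counts[-7:])
    prior_7 = sum(daily_counts[-14:-7])
    # Below the threshold the ratio is too noisy, so no trend is reported.
    if prior_7 < min_prior:
        return None
    # Assumed formula: relative change against the prior week.
    return round((recent_7 - prior_7) / prior_7 * 100, 1)
```

Returning `None` rather than 0 lets the caller hide the trend segment instead of rendering a misleading flat value.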
Frontend (app/web/templates/marketplace_item_detail.html):
- New .item-detail .hero .hero-telemetry slot beneath the badges
row. CSS mirrors plugin-detail's colour tokens (mint/peach/cream
Heroicons solid + saturated trend pill) so the two surfaces read
as one visual family.
- Installed segment uses a "Plugin:" label rendered with reduced
opacity to signal the metric describes the parent, not the item
itself. Tooltip: "Parent plugin (<plugin_name>) currently
installed by N users".
- Sidebar Invocations + Users rows removed (chip carries them).
Active days + Last used derived from telemetry.daily_series replace
them; only rendered when activeDays > 0 so a brand-new skill
doesn't show "0 of 30" / "Last used —".
- "Type" row dropped from the sidebar — duplicates the hero badge.
- fmtNum helper added (matches listing card + plugin detail).
Plugin detail (app/web/templates/marketplace_plugin_detail.html):
- Hero "Curator: …" line removed. The Details sidebar already
carries that info; duplicating it under the h1 was visual noise.
- Sidebar "Owner" row renamed to "Curator" — for curated plugins
it's a person who curates inclusion in this Agnes instance, not
the upstream code owner. "Owner" was a hold-over label.
* feat(item-detail): unify hero with plugin detail — pills + breadcrumb + cleaner sidebar
- Inner skill/agent hero now uses the same `.pills` / `.pill.cat / .curated /
.flea / .muted` class names + CSS as the plugin detail page; the only
item-only addition is `.pill.type` (Skill / Agent uppercase, plugin detail
has no kind axis).
- Hero `Updated` moved out of the meta-row into a muted pill (mirrors the
plugin detail hero), removed from the Details sidebar to avoid duplication.
- Details sidebar slimmed: dropped Marketplace, Path, Updated rows; Parent
plugin now shows the curator-friendly display name
(`parent_display_name || manifest_name || slug`) instead of the slug.
- Breadcrumb extended to full path: Marketplace > <marketplace_name> >
<plugin display name> > <self>, mirroring the plugin detail breadcrumb.
- Backend: new `InnerDetailResponse.parent_display_name` field, populated via
`_curated_plugin_enrichment` from marketplace-metadata.json — same source
plugin detail hero already uses.
* feat(marketplace): flea inner skill/agent detail + breadcrumb polish
- Flea inner skill/agent detail page parity with curated:
* GET /api/marketplace/flea/{id}/skill/{name} + /agent/{name}
returning InnerDetailResponse (mirror of curated_skill_detail).
* /marketplace/flea/{id}/skill|agent/{name} web routes that render
marketplace_item_detail.html with source='flea' + innerName context.
* Frontend apiURL grows a third branch for flea-inner; breadcrumb
grows to 4 segments (Marketplace > Flea Market > <plugin display
name> > <self>) when innerName is set.
* Telemetry attribution: MarketplaceItemLookup resolves
<flea_plugin>:<inner> prefixes to (source='flea',
parent_plugin=<plugin name>) so nested invocations land in the
same rollups curated nested skills use. USAGE_PROCESSOR_VERSION
bumped 5 -> 6 so the reprocess loop re-attributes historic events.
- Breadcrumb 2nd segment is now a generic clickable "Curated
Marketplace" / "Flea Market" link to /marketplace?tab=... instead
of the opaque per-instance marketplace_name. Applied on both plugin
detail and inner item detail.
- Inner item hero telemetry chip works for both sources: installedCount
branches on parent_stack_count (curated) vs install_count (flea),
installed segment drops the "Plugin:" prefix for flea standalone /
inner items.
- Updated row dropped from Details sidebar on item detail — the hero
pill already carries the value, sidebar row was duplicate.
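A rough sketch of the `<flea_plugin>:<inner>` prefix resolution described under Telemetry attribution above. The real `MarketplaceItemLookup` interface is not shown in this log, so the function name, the `flea_plugins` set, and the returned dict shape are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def resolve_flea_inner(item_name: str, flea_plugins: Set[str]) -> Optional[Dict[str, str]]:
    """Map '<flea_plugin>:<inner>' to a flea attribution record.

    Hypothetical helper: returns None when the name has no prefix or
    the prefix is not a known flea plugin.
    """
    if ":" not in item_name:
        return None
    prefix, inner = item_name.split(":", 1)
    if prefix not in flea_plugins or not inner:
        return None
    # Nested invocations are attributed to the parent flea plugin.
    return {"source": "flea", "parent_plugin": prefix, "name": inner}
```

Splitting on the first colon only keeps any further colons inside the inner name intact.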
* feat(item-detail): block stack-install on flea inner items (mirror curated)
Inner skills/agents nested inside a flea plugin can no longer be added
to a user's stack on their own — adoption only happens at the plugin
level, same rule curated nested items have followed since launch.
- Hero action: when innerName is set (curated nested OR flea nested),
render "Open parent plugin →" link + helper text instead of the
install/remove buttons. Flea standalone entities (no innerName) keep
the normal install UX.
- Meta-row: same branch now serves curated + flea inner — "part of
<parent plugin display name> · by <author>" with the parent link
pointing at the right detail page per source.
No API gate change needed: POST /api/store/entities/{id}/install only
accepts existing entity ids (plugin-level), inner items have no entity
id of their own so the endpoint cannot target them directly.
* feat(marketplace): telemetry chip on inner cards + fix flea hero chip visibility
Inner skill/agent cards on the plugin detail page now carry the same
four-segment funnel chip the marketplace listing cards show (N active
· N calls · trend · N installed), for both curated nested skills and
flea nested skills. Plus two fixes that were keeping the hero chip
hidden on flea plugin / flea inner detail pages.
- Backend `_load_inner_items_stats_by_parent(conn, source, parent_plugin)`
bulk loader: one query per plugin against usage_marketplace_item_window
+ one against _daily, returning {(name, type): stats}. Avoids N+1
per-card lookups.
- `InnerItemSummary` gains invocations_30d / distinct_users_30d /
trend_pct / parent_stack_count fields. `curated_detail` and
`flea_detail` (in the entity.type=='plugin' branch) enrich the
skills / agents lists after the existing cover-photo enrichment loop.
- `marketplace_plugin_detail.html`: new `.plugin-detail .inner-card
.inv-chip*` CSS lifted from marketplace.html with the listing-card
rules, new buildInnerCardChip() helper, buildCardSection appends
the chip to each card body. Same gate as the listing card (hidden
on parent_stack==0 && calls==0).
- fix(flea): flea_detail forgot to populate PluginDetailResponse.stack_count
from entity.install_count (listing card does this on line 851; detail
endpoint didn't). Hero chip gate `stackCount===0 && calls===0` then
always hid the chip even when the entity had installs. Now mirrors
listing card semantics: stack_count == install_count for flea.
- fix(flea inner): renderInnerHeroTelemetry was reading `d.install_count`
for any non-curated source. InnerDetailResponse has no install_count
field — it has parent_stack_count (populated server-side from the
parent flea plugin's install_count). Gate + label now read
parent_stack_count for both curated nested AND flea nested scenarios;
install_count remains the flea standalone path.
* fix(marketplace): Owner label on flea + parent-centric sidebar for flea inner
- Plugin detail Details sidebar — authorship row label now tracks the
source: curated bundles get `Curator` (existing behaviour), flea
bundles get `Owner`. The `owner_todo` reminder placeholder stays on
the curated branch only; flea falls through silently.
- Inner item detail Details sidebar — flea-inner (skill/agent nested
inside a flea plugin) now shares the curated nested layout: Parent
plugin / Bundle size / Active days / Last used / Owner. Drops the
flea-standalone shape's `Category`, `Version`, `Installs`, `Released`
rows that didn't apply to a nested item. Active days + Last used were
already wired (telemetryRows) — they just weren't on the flea-inner
branch.
* fix(tests): bump SCHEMA_VERSION assertions 47 -> 48 post-rebase
The marketplace telemetry migration was renamed _v46_to_v47 -> _v47_to_v48
during the rebase onto main (collision with #326 FTS BM25 migration that
took the v47 slot). Two test files still asserted the pre-rebase value:
- tests/test_home_stats.py::test_schema_version_constant_is_46 (CI red)
- tests/test_schema_v46_migration.py::test_schema_version_is_46
Renames the helper fn name + bumps the assertion. The other two test
files (test_db_schema_version.py, test_schema_v42_migration.py) were
already updated in the rebase resolution.
* fix(telemetry): _build_telemetry returns None when invocations_30d == 0
The follow-up commit that introduced the always-return-dict shape broke
the test contract from the original v46 PR (commit b603e998):
tests/test_marketplace_telemetry.py::TestDetailTelemetry::
test_detail_endpoint_telemetry_absent_when_no_data
AssertionError: assert {'daily_series': [...], ...} is None
Both `PluginDetailResponse.telemetry` and `InnerDetailResponse.telemetry`
are declared `Optional[Dict] = None`, and the frontend renderers are null-safe
(`d.telemetry || {}` guard + `if (!d.telemetry || ...)` on daily_series),
so dropping the dict on zero activity is the cleaner default.
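A minimal sketch of the contract this commit restores. The real `_build_telemetry` signature is not shown in this log; the function name, the `stats` argument, and every key other than `invocations_30d` and `daily_series` are assumptions.

```python
from typing import Any, Dict, Optional


def build_telemetry_sketch(stats: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return a telemetry dict, or None when there is no activity.

    Hypothetical stand-in for `_build_telemetry`.
    """
    if not stats or stats.get("invocations_30d", 0) == 0:
        # Zero activity: return None so the Optional[Dict] field stays unset.
        return None
    return {
        "invocations_30d": stats["invocations_30d"],
        "daily_series": list(stats.get("daily_series", [])),
    }
```

Returning None rather than an empty dict keeps the "absent when no data" test meaningful, since callers can tell "no telemetry" apart from "telemetry with zeros".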
* release: 0.54.21 — marketplace telemetry refactor (schema v48) + flea inner detail parity + listing UX polish
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
Three coordinated tweaks to the publication discovery surface:
1. Action-row CTA on /marketplace?tab=curated reads 'Submit a skill
or plugin' instead of 'Submit a plugin'. Skills are first-class
citizens of the curated shelf; the old wording made them feel
like an afterthought. Same rename in the empty-state JS innerHTML
so the two paths can't drift.
2. Curated guide page (/marketplace/guide/curated) expanded from a
4-line stub into a 3-step ordered list documenting the Named
Curator handoff (find curator → handoff → publish + lifecycle).
New '.guide-fastpath' callout block points users at the Flea
Market when they want a lighter review bar or a faster path. Primary
CTA at the bottom of the curated guide now links to the flea
guide too, so users who skim past the fast-path callout still
see the escape hatch.
3. Flea guide page (/marketplace/guide/flea) expanded from a
3-line stub into a 4-step ordered list (package → upload via
form → automated review → published). Documents the actual
/store/new flow + the automated guardrails (manifest, content
quality, prompt-injection scan) so users know what 'self-service'
actually means before they upload.
Route titles updated to match: 'Submit a skill or plugin to
Curated Marketplace'.
New file: tests/test_web_marketplace_guide.py — three tests covering
the CTA rename, the curated guide's structural elements (Named
Curators lede, 3 steps, fastpath callout, primary-CTA href), and
the flea guide's structural elements (4 steps, no fastpath
asymmetry, /store/new primary CTA).
* docs(plan): design-system unification plan (post-review revisions)
Plan covers consolidating two CSS files into one, introducing
canonical primitives (.btn family, .search-input, .filter-bar,
.page-header, .data-table, .empty-state, .toast, .stat-card,
.tab-strip), unifying the top-nav Admin trigger with sibling
links, and migrating 41 templates that today carry inline
<style> blocks.
Post-review revisions: nav fix moved to first commit (user
complaint lands first); sticky-header and dark-mode skeleton
tasks dropped (defer to follow-up PRs); contract test class
detection tokenizes class="..." attributes properly; baseline
screenshot loop added to Task 0; vendor-token grep widened.
* fix(nav): unify Admin trigger with sibling nav links
The top-nav Admin entry is a <button class="app-nav-link
app-nav-menu-trigger">, siblings are <a class="app-nav-link">.
.app-nav-menu-trigger used to override .app-nav-link with
"color: inherit; font: inherit", resetting font-size from 13px
back to body default and color from --text-secondary to body
color. Active state diverged too: .is-active on links used
--primary blue, [aria-expanded=true] on the button used
--border-light grey.
Fix: expand .app-nav-link so it covers <button>-element resets
(font-family: inherit, border: 0, background: transparent,
cursor: pointer, display: inline-flex for chevron alignment).
Add [aria-expanded="true"] as another active-state selector
so the dropdown's open state highlights identically to .is-active
on links. Delete the now-redundant .app-nav-menu-trigger rules
that stripped button chrome.
Extract the inline <script> from _app_header.html into a new
app/web/static/app.js (loaded by base.html only — base_login.html
has no nav). Sets up window.appUI.wireDropdown for both the user
menu and the Admin dropdown via DOMContentLoaded.
* style(css): consolidate style.css into style-custom.css + add cache-bust
One stylesheet for the whole web UI:
- style.css (1086 lines, legacy Google-inspired tokens + components)
absorbed into style-custom.css under a labeled block, placed after
the modern :root + body so style-custom's component rules continue
to override the legacy ones (preserves the original cascade order
that came from loading style.css first).
- style.css deleted; <link> dropped from base.html + base_login.html.
- static_url() now appends ?v=<mtime> to /static/<path>. Cheap
per-request os.stat — auto-invalidates browser + proxy caches on
redeploy without operator intervention. The mtime stays stable across
uvicorn restarts as long as the file is not rewritten.
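A minimal sketch of the mtime-based cache-bust described in the `static_url()` bullet above. The real helper's signature is not shown here; the `static_root` parameter and the bare-URL fallback for missing files are assumptions.

```python
import os


def static_url_sketch(path: str, static_root: str) -> str:
    """Build /static/<path>?v=<mtime> for cache busting.

    Hypothetical stand-in for `static_url()`.
    """
    rel = path.lstrip("/")
    url = "/static/" + rel
    try:
        # One os.stat per call; the integer mtime becomes the version tag.
        mtime = int(os.stat(os.path.join(static_root, rel)).st_mtime)
    except OSError:
        # Missing file: emit the bare URL rather than fail the render.
        return url
    return f"{url}?v={mtime}"
```

Because the query string changes only when the file is rewritten, browsers and proxies can cache the asset aggressively between deploys.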
Legacy classes (.btn, .card, .login-*, .badge, .code-block, .flash,
.form-group, .username-box, .btn-copy, .auth-tabs, .divider, etc.)
still render — they live in style-custom.css now. Login pages,
error page, password setup, and the dashboard's Claude Code Setup
card all kept working in browser smoke.
* test(design): contract test for design-system invariants
7 structural invariants enforced from this commit onwards:
- style.css must stay deleted
- no template links style.css via static_url
- exactly one bare :root block in style-custom.css
- canonical primitives declared (.btn, .btn-primary, .search-input,
.filter-bar, .page-header, .data-table, .empty-state, .toast, …)
- no deprecated class names in templates (.users-table, .gp-table,
.marketplaces-table, .audit-table, .users-search, .marketplaces-search,
.modal-btn, .btn-primary-v2, …)
- app.js loaded by base.html, NOT by base_login.html
- 3 helper-level unit tests for the class-attribute tokenizer
(multi-line attrs, Jinja-conditional fragments, false-positive prose)
Two of the assertions intentionally start FAILING after this commit
(missing primitives + legacy class refs in 7 admin templates) and
will turn green as Tasks 4–7 add primitives and Tasks 8–15 migrate
the templates.
* feat(css): canonical button family + legacy token aliases
Adds at top of :root: legacy token aliases (--bg, --card-bg, --text,
--text-light, --secondary, --radius) pointing at modern equivalents.
Absorbed style.css rules referenced these names; without aliases
they fell back to 'unset'. Aliases live until Task 16 alongside
their absorbed rules.
Appends canonical .btn variants at end of file (last cascade):
.btn-primary + .btn-primary-v2 + .modal-btn.primary (alias group)
.btn-secondary + .btn-secondary-v2 + .modal-btn:not(.primary):not(.danger)
.btn-ghost + .btn-ghost-v2
.btn-danger + .modal-btn.danger
.btn-lg
.btn:disabled + .btn:focus-visible (focus ring via --focus-ring)
Existing absorbed .btn, .btn-primary, .btn-secondary, .btn-sm rules
remain — the canonical block adds the missing variants + selector-list
aliases so .modal-btn and v2 markup keep rendering until migration
tasks swap them out.
Contract test: .btn-danger now declared (one less missing primitive).
Browser smoke: /admin/tokens hero + filter pills + empty state render
correctly with the absorbed style.css rules now backed by real tokens.
* feat(css): form-control primitives — .search-input + .filter-bar + .filter-pill + .form-input
Canonical filter bar shape: 36px-height inputs (matches button height
for vertical rhythm), 28px pills with .is-active state, consistent
focus ring via --focus-ring token.
Selector-list aliases for legacy per-page classes:
- .users-search / .marketplaces-search / .kb-search → .search-input
- .filters-card → .filter-bar
- .pill[aria-pressed="true"] also matches the .filter-pill active state
.form-input added as a sibling of .search-input for forms — same
baseline height + radius + focus treatment, with textarea.form-input
auto-sizing to min 96px and using the mono font (matches CSV/SQL
pasted-snippet patterns on /admin/agent-prompt + /admin/workspace-prompt).
Contract test: .search-input + .filter-bar + .filter-pill now declared.
* feat(css): .page-header primitive + variants + .tab-strip
Canonical page-header pattern with title (22px) + optional subtitle +
optional eyebrow + right-aligned actions slot. Two modifiers:
- .page-header--hero: gradient background (primary→primary-dark),
28px white title, semi-transparent subtitle/eyebrow. For
/marketplace, /store, /profile-style pages that already use this
layout via per-page inline <style>. Migration tasks delete the
duplicated rules.
- .page-header--compact: 18px title for dense admin index pages.
.tab-strip + .tab-strip__item — the secondary tab row pattern used by
/marketplace?tab=flea and similar. .is-active / [aria-selected=true]
both flip the active treatment (primary color + bottom border).
Contract test: .page-header / __title / __subtitle / __actions all
now declared (4 fewer missing primitives).
* feat(css+js): .data-table + .empty-state + .toast + .stat-card primitives
Last primitive batch. All 8 canonical-primitives invariants in
test_design_system_contract.py now green; only the template-migration
test fails (expected — Tasks 8–15).
.data-table (+ --compact modifier): selector-list aliases for legacy
per-page table classes (.users-table, .gp-table, .marketplaces-table,
.audit-table) so existing markup keeps rendering until migration.
Compact modifier shrinks padding + font for dense lists (audit log).
.empty-state with __icon / __title / __description / __actions —
replaces the ad-hoc 'no results' rendering scattered across pages
(corporate_memory, admin_users, admin_marketplaces, etc.).
.toast / .toast-container — paired with window.appToast({kind, msg,
timeout}) appended to app.js. Bottom-right stacked, click-to-dismiss,
auto-dismiss after 4s by default. Kind 'success' / 'warning' / 'error'
/ 'info' shows a 3px colored left border.
.stat-card (+ --accent variant) + .stat-row grid — for the dashboard
metric tile row.
* style(templates): migrate 8 templates off deprecated class names
Mechanical class-attribute rewrite via tokenizer (preserves Jinja
conditionals + multi-line attrs):
modal-btn primary -> btn btn-primary
modal-btn danger -> btn btn-danger
modal-btn -> btn btn-secondary
users-table -> data-table
gp-table -> data-table
marketplaces-table -> data-table
audit-table -> data-table
users-search -> search-input
marketplaces-search -> search-input
8 templates touched: admin_groups, admin_marketplaces, admin_tokens,
admin_users, admin_welcome, admin_workspace_prompt, my_tokens,
corporate_memory_admin. 43 lines updated total.
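A simplified sketch of the class-attribute rewrite listed above. It is not the migration script itself: it handles only plain `class="..."` attributes, collapses whitespace inside them, and does not preserve Jinja conditionals or multi-line layout the way the real tokenizer is described as doing. The function names are hypothetical.

```python
import re

# Single-token renames listed in the commit message.
RENAMES = {
    "users-table": "data-table",
    "gp-table": "data-table",
    "marketplaces-table": "data-table",
    "audit-table": "data-table",
    "users-search": "search-input",
    "marketplaces-search": "search-input",
}

CLASS_ATTR = re.compile(r'class="([^"]*)"')


def rewrite_tokens(tokens):
    """Apply the rename table to one class attribute's token list."""
    out = [RENAMES.get(t, t) for t in tokens]
    if "modal-btn" in out:
        out.remove("modal-btn")
        if "primary" in out:
            out.remove("primary")
            variant = "btn-primary"
        elif "danger" in out:
            out.remove("danger")
            variant = "btn-danger"
        else:
            variant = "btn-secondary"
        out = ["btn", variant] + out
    return out


def rewrite_classes(html: str) -> str:
    """Rewrite every plain class="..." attribute in the given markup."""
    def repl(match):
        tokens = match.group(1).split()
        return 'class="' + " ".join(rewrite_tokens(tokens)) + '"'
    return CLASS_ATTR.sub(repl, html)
```

Working on tokens inside the attribute, rather than on raw substrings, is what keeps prose mentions of a class name untouched.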
Inline <style> blocks in these templates still define rules for the
old class names — those rules no longer match anything and become
dead code, removed in Task 16's alias cleanup along with the
selector-list aliases in style-custom.css.
Contract test (tests/test_design_system_contract.py) now fully green:
9/9 invariants enforced from this commit onward.
* feat(css): extend .data-table selector list to 13 more bespoke -table classes
Visual unification of remaining tables across the codebase without
per-template edits. The .data-table baseline rules (uppercase header
tracking, 12px padding, hover state, border-radius) now apply to:
.ad-table / .ea-table / .md-table / .members-table /
.obs-table / .overview-stats-table / .registry-table /
.sample-table / .sched-table / .sess-table / .sub-table /
.subs-table / .ud-table
These class names live in 12 templates (activity_center, admin_access,
admin_group_detail, admin_scheduler_runs, admin_sessions,
admin_store_submissions, admin_tables, admin_usage, admin_user_detail,
catalog, me_debug, profile_sessions) that have their own per-page
<style> blocks. Per-page rules with higher specificity still win for
their custom needs (column widths, etc.) — this commit only sets a
shared baseline so every table renders with the same chrome.
Contract test stays green: 9/9 invariants enforced.
* style(css): remove now-unused legacy class aliases
Phase A renamed 8 templates off these names; no markup references
them any more, so the selector-list memberships are dead weight.
Removed from style-custom.css:
.btn-primary-v2 / .btn-secondary-v2 / .btn-ghost-v2
.modal-btn / .modal-btn.primary / .modal-btn.danger /
.modal-btn:not(.primary):not(.danger)
.users-search / .marketplaces-search / .kb-search
.users-table / .gp-table / .marketplaces-table / .audit-table
.filters-card
37 lines smaller. Contract test catches any reintroduction.
KEPT aliases (still in untouched template markup):
- .pill (marketplace_plugin_detail.html, marketplace.html — these
pages weren't part of Phase A's deprecated-class sweep; their
own .pill CSS rules still apply)
- All .data-table family extensions (.ad-table, .ea-table, .md-table,
.members-table, .obs-table, .overview-stats-table, .registry-table,
.sample-table, .sched-table, .sess-table, .sub-table, .subs-table,
.ud-table) — these still render data tables in 12 templates;
selector-list aliasing keeps them visually unified with .data-table
baseline.
- Legacy token aliases (--bg / --text / --text-light / --secondary /
--card-bg / --radius) — still resolve absorbed style.css rules.
Templates' inline <style> blocks still contain dead rules for the
renamed classes (.users-search, .modal-btn, etc.); they are harmless but
add bloat. Optional follow-up: a separate sweep can drop them.
* docs(changelog): design-system unification under [Unreleased]
* feat(css): unify page-shell width — .container baseline 1280px + modifiers
Inventory found 30+ unique max-width values across templates (280px
login → 1600px admin/tables). The legacy .container default was 800px,
which made every admin page set its own wider inline override —
30+ ad-hoc widths drifted as a result.
Canonical: .container max-width = var(--width-app) (1280px). Pages
that need a different shape opt in via modifiers:
.container--narrow → var(--width-narrow) (800px) — long-form text,
setup wizards
.container--wide → var(--width-wide) (1400px) — admin lists,
marketplace grids
.container--full → max-width: none — hero / landing
Pages that already set a NARROWER inline max-width (setup, login flows
inside .login-card, etc.) still render at their narrower size — the
inline override beats the new canonical 1280px. The visible change
hits the ~20 admin pages currently rendering at 800px via the legacy
default, which jump to 1280px and pick up consistent breathing room.
Spacing also normalized: padding 24px 20px → var(--space-6) var(--space-5).
* fix(home+catalog): gut dashboard sections + remove confusing toggle + fix table count
Dashboard /home cleanup:
- Remove 'Your Data' card — Data Packages is already a top-nav entry,
so duplicating data sources on the landing page just adds noise.
- Remove 'Account' card — group memberships + scripts + last sync
belong on /profile, not on the welcome screen.
- Remove entire right-column (Corporate Memory + Activity Center
widgets) — both surfaces have dedicated admin pages reachable from
the Admin dropdown.
- Keep stats row (Tables/Columns/Rows/Data Size/Unstructured),
env-setup-CTA, and Notifications card.
/catalog cleanup:
- Strip the 'Always included' badge + the locked toggle-switch from
Core Business Data and Business Metrics cards. The toggle was
always 'checked disabled' — it visually looked like a switch but
could not be toggled, which was confusing. The 'Always included'
copy itself was redundant once the toggle was gone. Agnes Internal
already rendered without these, so the three cards are now visually
consistent.
Catalog data_stats fix:
- 'total_tables' was len(sync_state) — counted only tables that had
ever synced, so a 30-row table_registry with 0 ever synced rendered
as '0 tables'. Switched to len(tables) — the registered
business-data table list — so the count reflects what's actually
available, not what's been touched.
* fix(home): real stat numbers + drop unstructured tile + cleanup dead CSS
Dashboard stats were hardcoded zeros (columns: 0, size_display:
'0 MB', unstructured_display: '0 MB') and the table counter pulled
from sync_state (synced) instead of table_registry (registered).
On a fresh deployment with 30 registered tables and 0 ever synced,
the page rendered '0 / 0 / 0 / 0 MB / 0 MB' — useless.
Now:
- Tables: COUNT(*) FROM table_registry WHERE source_type != 'internal'.
Matches the /catalog Core Business Data counter.
- Columns: SUM(sync_state.columns). Zero only when nothing's synced yet.
- Rows: unchanged (SUM(sync_state.rows), already correct).
- Data Size: SUM(sync_state.file_size_bytes), human-formatted via
inline _fmt_bytes helper (KB/MB/GB).
- Unstructured: tile dropped — was always '0 MB' and had no source.
- last_updated: now derived from sync_state max(last_sync), wasn't set
before so the 'Synced …' tag never rendered.
Dashboard.html cleanup: ~725 lines of orphan inline <style> removed —
.section-title, .data-source*, .toggle-switch*, .catalog-cta*,
.memory-card / .memory-stat / .memory-description / .memory-footer
/ .btn-memory, .activity-card / .activity-stat / .activity-text
/ .btn-activity, .account-grid / .account-row / .account-scripts
/ .badge-role / .badge-group / .cron-line, .badge-included /
.badge-beta / .badge-demo. All matched markup deleted in the
previous commit; the CSS was dead code until now.
* ui(catalog): rename page heading 'Data Catalog' → 'Data Packages'
The top-nav entry says 'Data Packages' but the page itself said
'Data Catalog' — confusing two-name product. Aligns the heading and
<title> with the nav label. Subtitle trimmed too: 'manage your
subscriptions' was a vestige of the toggle UI that just got removed,
replaced with a one-liner describing what the page is for.
Two other 'Data Catalog' strings stay: they live inside the table-
profiler overlay JS and refer to an EXTERNAL catalog system (e.g.
OpenMetadata / Atlan) that an operator may link to per table — that
is a generic term for any external data-catalog product, not our
page name.
* fix(nav): dropdown clicks always work + mutual-exclusion close
Two bugs in the wireDropdown helper:
1. Clicking trigger B while trigger A's menu was open left both open.
e.stopPropagation() in trigger.click prevented the document-click
handler from firing, so trigger A's open menu had no way to learn
that something else was clicked. Net effect: state diverged across
the two dropdowns the more you clicked.
2. The target-vs-trigger equality check (e.target !== trigger) was
strict. Clicking the chevron <svg> inside the button reports the
svg or its <path> child as e.target — not the button — so removing
stopPropagation alone would trip the close branch in the same
click that just opened the panel.
Fix both at once: drop e.stopPropagation() AND switch the doc-handler
guard to trigger.contains(e.target). Now any click outside both the
trigger subtree and the panel subtree closes; any click on another
trigger closes via the OTHER dropdown's doc handler; clicks inside
the trigger (button OR svg child) are fully ignored by the doc
handler and only the trigger's own toggle handler fires.
* feat(ui): canonical blue-gradient hero on every admin page
The UI had a per-page hero pattern on ~10 onboarding/marketing pages
(admin_tokens / profile / install / setup_advanced / marketplace /
my_tokens / store_upload / home_*), each with its own ad-hoc CSS
(.tokens-hero, .profile-hero, .install-hero, .upload-hero, …). The
admin section's index + detail pages had plain H1/H2 with their own
.users-title / .gp-title / .obs-title / .cfg-title / … inline styling.
Net effect: half the app felt like a product, half felt like a
spreadsheet.
Now:
- .page-header--hero CSS upgraded to match the look analysts already
liked from admin_tokens: 28px/32px/24px padding, 14px radius, soft
primary-tinted box-shadow (0 4px 16px rgba(0,115,209,0.2)), 28px
semibold title, optional uppercase eyebrow + 13.5px subtitle.
Narrow-viewport breakpoint included.
- New _page_hero.html partial wraps the boilerplate. Usage:
{% set page_hero_eyebrow = "Users & Access" %}
{% set page_hero_title = "Users" %}
{% set page_hero_subtitle = "…" %}
{% include "_page_hero.html" %}
- 15 admin templates migrated to it: admin_users / admin_groups /
admin_marketplaces / admin_access / admin_sessions /
admin_session_detail / admin_store_submissions /
admin_scheduler_runs / admin_usage / admin_user_detail /
admin_welcome / admin_workspace_prompt / admin_server_config /
activity_center / admin/news_editor. Each gets a grouped eyebrow
(Users & Access / Data / Agent Experience / Activity Center /
Server) matching the Admin dropdown sections so the page identity
is unambiguous at a glance.
Legacy *-title H2/H1 + adjacent subtitle paragraphs deleted; their
per-page CSS rules are dead now (harmless, retire in a follow-up
sweep alongside other inline-style cleanup the reviewers flagged).
admin_tables.html intentionally NOT migrated — it's a standalone
HTML page that doesn't extend base.html; a separate refactor.
Test: test_admin_users_page_renders_for_admin assertion updated
from .users-title to .page-header__title + .page-header--hero (the
canonical pair). All other web/template tests stay green.
* refactor(ui): dedup _humanbytes, drop 267 lines of dead inline CSS
(1) _humanbytes consolidation:
- Add TB branch + optional precision param (default 2 preserves existing
Store detail callers; dashboard uses precision=1 for headline tiles).
- Delete inline _fmt_bytes from dashboard handler — was a copy of
_humanbytes with different rounding. One canonical helper now.
(2) Dead inline-CSS sweep across 17 migrated templates:
- Conservative regex: a CSS rule is deleted only when its primary class
matches one of the known-dead names AND that name is NOT referenced
from any class= attribute in the same file's markup.
- Per-file 'in-use' guard saved several false positives that the deny
list would have nuked (e.g. .users-toolbar, .gp-search, .obs-subtitle,
.marketplaces-toolbar are still in use; only .users-table, .users-search,
.users-title, .modal-btn, etc. that have NO markup left went away).
- Removed: -267 lines across admin_users (-42), admin_marketplaces (-45),
admin_groups (-31), my_tokens (-38), admin_tokens (-29), admin_access
(-9), admin_user_detail (-6), admin_welcome (-8), admin_workspace_prompt
(-8), admin_server_config (-2), admin_sessions (-1), admin_session_detail
(-1), admin_usage (-1), admin_store_submissions (-3), admin_scheduler_runs
(-3), activity_center (-4), corporate_memory_admin (-36).
Contract test stays green (9/9); all web/template/render/user_management
tests pass.
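A minimal sketch of the consolidated byte formatter from item (1), with the TB branch and the optional precision parameter. The real `_humanbytes` body is not shown in this log; the unit labels, the 1024 base, and the integer rendering for plain bytes are assumptions.

```python
def humanbytes_sketch(num_bytes: float, precision: int = 2) -> str:
    """Format a byte count using B/KB/MB/GB/TB units (1024-based).

    Hypothetical stand-in for `_humanbytes`.
    """
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)
    unit = units[0]
    for unit in units:
        # Stop at the first unit that fits, or at TB as the ceiling.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == "B":
        return f"{int(value)} B"
    return f"{value:.{precision}f} {unit}"
```

With this shape, existing callers keep the default of two decimals while headline tiles can pass `precision=1`.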
* feat(ui): canonical hero on /catalog (Data Packages)
Same .page-header--hero treatment as the admin pages — Data eyebrow,
Data Packages title, Browse-the-data-sources subtitle. Removes the
ad-hoc .page-title block (h1 / p / wrapper-div) and its CSS rules
(now dead, 3 rule blocks deleted).
* fix(nav): load app.js from _app_header.html — works on standalone pages
The previous nav-fix commit moved the inline dropdown script from
_app_header.html into app/web/static/app.js + added <script src=…>
to base.html. That broke EVERY page that includes _app_header.html
WITHOUT extending base.html (catalog, corporate_memory*,
admin_tables, install). They got the nav markup but no JS → both
Admin and AD dropdowns dead on those pages.
Fix: emit the <script src=app.js defer> directly inside the
_app_header.html partial. Any page that includes the header now
gets the script automatically — base.html-extenders AND standalone
HTML pages alike. base.html's duplicate <script> line removed.
Also fixes the wide-hero on /catalog: .page-header--hero now sets
its own max-width: var(--width-app) (1280px) so standalone pages
without a .container parent don't render the gradient edge-to-edge.
catalog's .source-cards bumped from 900px → 1280px to match the
hero, otherwise the page reads two-tier (wide blue band, narrow
content) which the user flagged.
Verified locally via agent-browser: Admin + AD dropdowns now click
through on /catalog, /admin/tables, /corporate-memory.
* docs(plan): standalone pages → base.html framework migration plan
Plan + Plan-agent review (8 must-fix items applied) for converting
the 5 templates that ship their own <html><head><body> scaffold
(catalog, install, corporate_memory, corporate_memory_admin,
admin_tables) to extend base.html. Root cause of yesterday's
'dropdown dead on /catalog' regression: shared infrastructure in
base.html doesn't propagate to standalones.
* feat(base): body_attrs block + migrate install.html to extend base
base.html: new {% block body_attrs %}{% endblock %} slot so pages
that need <body> attributes (admin_tables has data-source-type)
can carry them through extends.
install.html: convert from standalone <html><head><body> scaffold
to {% extends "base.html" %} with title / body_attrs / head_extra
/ layout / scripts blocks. Drops:
- <!DOCTYPE>, <html>, </html>, <head>, </head>
- <meta charset>, <meta viewport>
- Duplicate <link rel="stylesheet" href="...style-custom.css">
(base.html already provides one)
- <body> opening + closing tags
- Leading _app_header.html include + _version_badge.html include
(base.html handles both)
Preserves per-page CSS (in head_extra), per-page JS (in scripts),
the Inter font preconnect (kept inline; not hoisted to base in
this PR — separate decision).
Pilots the migration recipe before the 4 larger pages.
* refactor(memory): extend base.html
Same recipe as install.html. corporate_memory.html now inherits
<html>/<head>/<body> + nav + app.js script tag from base.html.
Page-specific CSS and JS preserved in head_extra + scripts blocks.
* refactor(memory-admin): extend base.html
Same recipe as install/corporate_memory. Curation page now in the
shared rendering pipeline.
* refactor(catalog): extend base.html
catalog.html had the most complexity: 7 head-level assets (chart.js,
Prism, prism-sql, metric_modal.css link + 2 preconnects + Inter
stylesheet), 5 body-level <script> blocks including a <script type=
"module"> for the metric modal, 2 duplicate style-custom.css links
in <head>. The migration script preserved all of them — head-level
externals hoisted to {% block head_extra %} in source order, body
scripts relocated to {% block scripts %} in source order (so chart.js
loads before the IIFE that builds Chart instances), duplicate
style-custom.css links dropped (base.html provides one).
* refactor(admin-tables): extend base.html + carry data-source-type
The biggest of the 5 standalones at 3563 lines. <body data-source-
type="{{ data_source_type }}"> attribute carried through via the
new {% block body_attrs %} slot (admin_tables JS reads
document.body.dataset.sourceType to switch between keboola and
bigquery rendering paths).
* release: 0.54.10 — UI design system unification + homepage status frame + initial workspace override + store guardrails
Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com>
* refactor(web): migrate remaining templates to canonical design primitives
- admin_group_detail: .data-table, .btn family, appToast(), remove duplicate table/button/toast CSS
- admin_store_submission_detail: .data-table, .btn family, appToast(), remove duplicate btn/toast CSS
- profile_sessions: .data-table, _page_hero.html, remove duplicate table/title CSS
- me_debug: .data-table, .btn family, remove duplicate table/button CSS
- marketplace: .btn-primary/.btn-secondary, remove duplicate button CSS
- store_edit: remove duplicate .btn-primary/.btn-link CSS, canonical button classes
- store_upload: remove duplicate .btn-primary/.btn-secondary/.btn-link CSS
Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com>
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
* perf(marketplace): browser-cache cover photos + restore Curated tab filter spacing
Cover photos on /marketplace grid now serve with `Cache-Control: public,
max-age=2592000, immutable` plus URL fingerprinting (`?v=<commit-sha8>`
for curated, `?v=<version_no>` for flea) so browser refresh stops hitting
the server entirely for unchanged assets. Per-plugin RBAC dropped from
the three image endpoints (curated_asset, curated_mirrored, get_entity_photo)
in favor of login-only auth — eliminates _system_db_lock contention on
parallel image requests. Per-request magic-bytes revalidation also dropped
from curated_asset (it was re-reading the file just to discard the bytes,
then FileResponse read it again).
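A minimal sketch of the caching scheme described above, assuming a hypothetical `cover_url` helper and an illustrative asset path (neither is taken from the codebase). The point is that the URL changes whenever the version changes, so an `immutable` response can be cached for the full 30 days without serving stale images.

```python
# 30 days in seconds, matching the max-age quoted in the commit message.
IMMUTABLE_CACHE = "public, max-age=2592000, immutable"


def cover_url(path, version):
    """Append a version fingerprint so a changed asset gets a new URL.

    `version` would be the commit sha prefix for curated plugins or the
    version number for flea plugins (hypothetical helper, not the real API).
    """
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}v={version}"


def cover_headers():
    """Headers attached to every cover-photo response."""
    return {"Cache-Control": IMMUTABLE_CACHE}


# Illustrative values only.
url = cover_url("/marketplace/demo-plugin/cover.png", "abc12345")
```

Because the browser treats each fingerprinted URL as a distinct resource, publishing a new version only requires rendering a new `?v=` value in the grid markup.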
Spacing bug: sort-dropdown commit (6be1cee) wrapped .mp-filter-row in a
new flex container with inline margin-bottom:4px, masking the original
12px CSS rule. Curated tab (where .mp-type-row is hidden) ended up with
4px between filters and the card grid. Wrapper margin restored to 12px.
See CHANGELOG entry under [Unreleased] — the RBAC relaxation is called
out under ### Security with explicit threat-model rationale for AI/human
reviewers.
* test(marketplace): update renamed-html-as-png test for dropped magic-bytes check
Magic-bytes body validation was dropped from `curated_asset` in the previous
commit — the request path now relies on extension allowlist + pinned
Content-Type + nosniff + strict CSP to neuter mismatched payloads at the
browser layer. Update the test to assert the new defense-in-depth posture
(200 served, but Content-Type=image/png + nosniff + CSP=default-src 'none')
rather than the former 415.
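The posture the updated test asserts can be sketched roughly as follows. The allowlist contents and the `asset_headers` name are assumptions for illustration; only the three header values come from the commit message.

```python
from pathlib import PurePosixPath

# Hypothetical allowlist: file extension -> pinned Content-Type.
ALLOWED_IMAGE_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def asset_headers(filename):
    """Return pinned response headers, or None when the extension is not allowed.

    The body is never inspected: a renamed HTML file with a .png extension is
    still served, but the browser is told it is image/png, may not sniff it,
    and may not load anything from it.
    """
    extension = PurePosixPath(filename).suffix.lower()
    content_type = ALLOWED_IMAGE_TYPES.get(extension)
    if content_type is None:
        return None
    return {
        "Content-Type": content_type,
        "X-Content-Type-Options": "nosniff",
        "Content-Security-Policy": "default-src 'none'",
    }
```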
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
* docs(spec): admin observability spec + Activity Center MVP plan
Parent spec (480 lines) + executable plan (2295 lines, 14 TDD tasks).
Covers Activity Center rebuild (/admin/activity), with /admin/sessions
and /admin/feedback deferred to follow-up plans.
Already incorporates reviewer-pass revisions across three angles
(security, production resilience, code architecture):
- _get_db import path corrected to app.auth.dependencies
- Test fixtures aligned with seeded_app / admin_user / get_system_db
- All new audit writes wrapped in try/except + logger.exception
- Filename sanitization on session uploads
- DuckDB DESC index behavior documented; upgrade window flagged
- Migration idempotency + evolved-DB test cases
- reveal_raw + shared-cache multi-worker explicitly deferred
Targets schema v40 (audit_log gains params_before, client_ip,
client_kind, correlation_id + 3 indices).
* feat(db): schema v40 — audit_log gains params_before, client_ip, client_kind, correlation_id + 3 indices
* chore(test): clean up Task 1 — drop unused import, rename stale test
* feat(audit): AuditRepository.log() accepts params_before/client_ip/client_kind/correlation_id
* test(audit): strengthen params_before assertion to round-trip JSON content
* feat(audit): AuditRepository.query() rich filters + keyset cursor pagination
* feat(sync): SyncStateRepository.list_recent() cross-table feed
* feat(audit): POST /api/sync/trigger writes audit_log row
* feat(audit): POST /api/scripts/run-due writes audit_log row
* feat(audit): POST /api/upload/sessions writes audit_log row + sanitizes filename
* feat(audit): GET /api/data/{table_id}/download writes audit_log row
* feat(activity): /api/admin/activity timeline + /health + /sync endpoints
* feat(ui): /admin/activity rebuilt — health pulse, timeline, sync grid; /activity-center → 308 redirect
BREAKING: removed demo executive-pulse / maturity-roadmap content from activity_center.html.
The page now reflects real audit_log + sync_history data.
* feat(ui): admin nav + dashboard widget point at /admin/activity
* feat(activity): recursive-audit suppression for AC read endpoints (60s window per actor+filter)
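One way to picture the suppression window: remember when each (actor, filter) pair was last audited and skip the write if that was under 60 seconds ago. The class and method names below are hypothetical, and the in-memory dict is an assumption; the commit does not say where the state lives.

```python
import time


class AuditSuppressor:
    """Suppress repeat audit rows for the same actor + filter within a window."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._last_logged = {}

    def should_log(self, actor_id, filter_key):
        key = (actor_id, filter_key)
        now = self._clock()
        last = self._last_logged.get(key)
        if last is not None and now - last < self._window:
            return False  # same actor + filter seen recently: skip the audit row
        self._last_logged[key] = now
        return True
```

A page that polls the Activity Center read endpoints every few seconds would then produce one audit row per minute per filter instead of one per poll. A production version would also need to evict old keys so the dict does not grow without bound.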
* feat(activity): emit PostHog events when integration enabled (no-op default)
* fix(audit): move v40 indices out of _SYSTEM_SCHEMA + update test_repositories to unpack query() tuple
_SYSTEM_SCHEMA CREATE INDEX on audit_log(timestamp) failed when migration
tests hand-roll a bare audit_log (id, action) without the timestamp column.
Fix: remove indices from _SYSTEM_SCHEMA; add ADD COLUMN IF NOT EXISTS guards
for timestamp and other pre-v40 columns in _v39_to_v40() so the upgrade path
is safe on any hand-rolled schema; call _v39_to_v40 explicitly in the
fresh-install (current==0) path to restore index creation there.
Also unpack the (rows, next_cursor) tuple from AuditRepository.query() in
the three TestAuditRepository tests that still treated it as a list.
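For readers unfamiliar with the (rows, next_cursor) shape, here is a minimal keyset-pagination sketch. It uses the standard-library sqlite3 module as a stand-in for DuckDB, and the function name, column set, and cursor encoding are assumptions rather than the real AuditRepository.query() signature.

```python
import sqlite3


def query_page(conn, limit, cursor=None):
    """Return (rows, next_cursor), newest first.

    `cursor` is the (timestamp, id) of the last row on the previous page;
    the WHERE clause resumes strictly after it instead of using OFFSET.
    """
    sql = "SELECT timestamp, id, action FROM audit_log"
    params = []
    if cursor is not None:
        sql += " WHERE timestamp < ? OR (timestamp = ? AND id < ?)"
        params.extend([cursor[0], cursor[0], cursor[1]])
    sql += " ORDER BY timestamp DESC, id DESC LIMIT ?"
    params.append(limit + 1)  # fetch one extra row to detect a further page
    rows = conn.execute(sql, params).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    next_cursor = (rows[-1][0], rows[-1][1]) if has_more else None
    return rows, next_cursor
```

Callers that previously iterated over the return value directly need to unpack it first, e.g. `rows, next_cursor = query_page(conn, 50)`.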
* docs: CHANGELOG entry for Activity Center MVP
* chore: refresh stale module docstring in app/api/activity.py
* feat(cli): agnes admin activity — terminal access to Activity Center (timeline + health + sync)
* fix(db): _v39_to_v40 — add IF NOT EXISTS guard for 'action' column
The v39→v40 ladder step adds defensive ADD COLUMN IF NOT EXISTS for
every audit_log column so a hand-rolled bare audit_log (id only) is
safe through the ladder. 'action' was missing from the guard list,
causing CREATE INDEX idx_audit_action_time to fail on tests that
stub audit_log with only an id column (tests/test_e2e_extract.py::
TestSchemaMigration::test_migration_preserves_and_extends).
Local 6/6 schema tests + the previously-failing CI test pass.
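The guard pattern can be sketched as below. This is not the real `_v39_to_v40`: it uses sqlite3 from the standard library, which has no `ADD COLUMN IF NOT EXISTS`, so the existence check is emulated with `PRAGMA table_info`. The column list and the index's column order are assumptions.

```python
import sqlite3

# Hypothetical subset of the audit_log columns the index step depends on.
REQUIRED_COLUMNS = {"action": "TEXT", "timestamp": "TEXT", "correlation_id": "TEXT"}


def ensure_columns(conn, table, columns):
    """Add any missing column; safe to run repeatedly."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    for name, sql_type in columns.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")


def upgrade(conn):
    """Guard every column first, then create the index that needs them."""
    ensure_columns(conn, "audit_log", REQUIRED_COLUMNS)
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_audit_action_time "
        "ON audit_log(action, timestamp)"
    )
```

Running `upgrade` against a table that has only an `id` column succeeds, and a second run is a no-op, which is the property the failing test was exercising.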
* docs(spec): platform telemetry epic — Boss directive + Activity Monitoring plan rebased onto v40 (stacked on zs/spec-activity-center)
* feat(db): schema v41 — 7 usage_* tables for telemetry (events, summary, rollups, attribution)
* chore(db): tighten v41 — usage_session_summary.session_id NOT NULL + upgrade test asserts all 7 tables
* feat(usage): UsageAttributionRepository — replace/delete/lookup over usage_attribution_* tables
* refactor(marketplace): extract list_inner_skills/agents/commands to src/marketplace_listing.py for reuse
* feat(usage): explode plugin attribution on marketplace sync + store entity write; backfill script
* refactor(marketplace): finish src/marketplace_listing.py extraction — drop duplicate _list_inner_* + _parse_frontmatter from app/api/marketplace.py
* feat(usage): promote attribution helpers to src/usage_attribution_helpers.py; hook update_entity rename + bundle-swap; clarify best-effort semantics
* feat(usage): UsageProcessor real extraction + rollup rebuild + 10 fixture-driven tests
* fix(usage): include tool_id in event hash + executemany + rollup transaction (critical multi-tool-turn drop fix)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
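To illustrate why the hash needed tool_id: when one assistant turn issues several tool calls, a hash built only from session and turn fields is identical for all of them, so a dedup-on-hash insert keeps just one. The field names here are hypothetical; the real hash inputs are not listed in the commit.

```python
import hashlib


def event_hash(session_id, turn_index, event_type, tool_id=None):
    """Stable dedup key for one usage event (illustrative field set)."""
    parts = [session_id, str(turn_index), event_type]
    if tool_id is not None:
        parts.append(tool_id)
    # Unit separator avoids ambiguity between e.g. ("a", "bc") and ("ab", "c").
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


# Two tool calls issued in the same turn of the same session.
without_id = {event_hash("s1", 4, "tool_use") for _ in ("toolu_a", "toolu_b")}
with_id = {event_hash("s1", 4, "tool_use", t) for t in ("toolu_a", "toolu_b")}
```

Here `without_id` collapses to a single key while `with_id` keeps both, which is the difference between dropping and retaining the second tool call.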
* feat(marketplace): popularity stats — invocations_30d + trend + sort=most_used|trending + Most Popular section
* feat(admin): /admin/users/<id> Sessions section — list + single-file + bulk-zip downloads (audit-logged)
* feat(usage): admin export endpoint + CLI — csv/json/parquet streaming, filters, audit-logged
* feat(usage): agnes admin ask — LLM Text-to-SQL over usage_events with SELECT-only validator (audit-logged)
* feat(usage): reprocess + prune endpoints + scheduler daily prune job + CLI
* docs: PLATFORM_SETUP.md operator playbook + HOWTO/ cookbook (5 guides + index)
Adds docs/PLATFORM_SETUP.md as a consolidated operator playbook covering
bootstrap, TLS, marketplaces (curated + flea), scheduler env vars, telemetry
extraction/export/ask/prune, privacy posture, and daily routine.
Adds docs/HOWTO/ with 5 analyst cookbook guides: first query, snapshots for
remote tables, private sessions, feedback + admin ask, and customizing skills.
Existing setup docs (QUICKSTART, DEPLOYMENT, ONBOARDING, HEADLESS_USAGE)
get a one-line cross-reference at the top pointing to PLATFORM_SETUP.md.
* docs(changelog): platform telemetry epic — usage_* foundation + surfaces + admin access + docs
Comprehensive [Unreleased] entry covering: usage_events/session_summary/
tool_daily/plugin_daily tables (v41), attribution lookup tables, backfill
script, marketplace Most Popular + invocation chips + sort, admin Sessions
section, export/ask/reprocess/prune endpoints + CLI mirrors, Activity Center
(v40), PLATFORM_SETUP.md + HOWTO/ docs, and operations notes for v41 upgrade.
* fix(security): block DuckDB read_*/http_*/glob functions in usage_ask validator + symlink escape guard in session zip + clarify mark-private semantics
* fix(admin): parquet export tempfile cleanup on COPY failure + correct processed-first sort on /admin/users/<id>/sessions
* feat(audit): close 8 production audit gaps — query (local/remote/hybrid), catalog/schema/sample, snapshot estimate/create, check-access
* feat(ui): /admin/usage summary dashboard + per-user activity tab on /admin/users/<id>
* fix(audit): cap error messages at 200 chars + audit user_activity reads + recursion guard on usage.summary
* fix(audit): catalog.list audits on error path + clean up deferred json import
* fix(ux): client_kind=cli for PAT auth + timeline empty state + email-instead-of-uuid + nav reorder + help text + loading indicators + ask doc
* feat(observability): unify /admin/activity into single page with saved views
- KPI cards (events, users, error rate, p95) clickable as quick-filters
- Faceted filter dropdowns populated from audit_log in the current window
- Sortable audit table, cursor pagination, per-row JSON side panel
- Saved views (schema v43: user_observability_views) — per-user state
- Top bar: window selector + 30s Live toggle + saved views dropdown
- /admin/scheduler-runs → 308 redirect (source=scheduler filter)
- New endpoints: /api/admin/observability/{facets,kpis,views}
* test: update activity + scheduler-runs tests for unified page
- test_admin_activity_page_renders asserts new structural anchors
- test_admin_scheduler_runs_page_admin_only asserts 308 redirect
* fix(observability): respect [hidden] on modal + side panel
CSS `display: flex` on .obs-modal beat the [hidden] attribute's UA
display:none, so the save-view modal rendered on page load and Cancel
clicks couldn't dismiss it. Gate the modal's flex layout on
:not([hidden]); add the same display:none guard prophylactically to
.obs-panel and .obs-views-panel.
* feat(observability): user enrichment in audit + interactive /admin/usage
Activity:
- /api/admin/activity now joins users for user_email + user_name per row
- User column renders "name (id-prefix)" or "email (id-prefix)" instead
of an opaque truncated UUID; falls back to id when the user record is
missing
Usage:
- /admin/usage rewritten as the same filter/group-by/search pattern as
/admin/activity. Faceted dropdowns (User / Tool / Source / Event type)
populated from usage_events; debounced free-text search across
tool_name / skill_name / subagent_type / command_name
- New endpoints /api/admin/usage/{facets,kpis,query}; the query endpoint
supports group_by in {day, username, tool_name, source, ref_id} with
sort + offset pagination, plus an ungrouped raw-events mode
- 4 KPI cards (events, distinct users, distinct tools, error rate) are
clickable quick-filters; clicking a grouped row applies the bucket as
a filter
- Old static `?window=7d|30d|all` server preload removed; all state is
client-side via since_minutes + group_by + filters in the URL
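Since group_by ends up selecting a column, it is worth validating against the fixed set before it reaches SQL. A rough sketch of parsing the URL state, with a hypothetical function name and the assumption that a missing group_by means the ungrouped raw-events mode:

```python
from urllib.parse import parse_qs

# The five group_by values listed in the commit message.
ALLOWED_GROUP_BY = {"day", "username", "tool_name", "source", "ref_id"}


def parse_usage_state(query_string):
    """Parse since_minutes + group_by from a URL query string."""
    params = parse_qs(query_string)
    group_by = params.get("group_by", [None])[0]
    if group_by is not None and group_by not in ALLOWED_GROUP_BY:
        raise ValueError(f"unsupported group_by: {group_by!r}")
    raw_since = params.get("since_minutes", [None])[0]
    since_minutes = int(raw_since) if raw_since is not None else None
    if since_minutes is not None and since_minutes <= 0:
        raise ValueError("since_minutes must be positive")
    return {"group_by": group_by, "since_minutes": since_minutes}
```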
* fix(observability): clearer labels, all-column sort, drop saved views UI
- Rename page titles: "Activity" → "Server activity", "Usage" → "Tool usage"
with a one-line subtitle on each explaining what the page covers and
linking the other one. The two pages source different data (audit_log
vs usage_events) and the previous labels conflated them.
- Drop the saved-views dropdown + save modal from /admin/activity. The
modal pop-open bug was the trigger; the value wasn't there yet. The
/api/admin/observability/views CRUD + DuckDB table stay in place.
- Rename "Live (30s)" to "Auto-refresh (30s)" with a tooltip clarifying
that it's the re-fetch rate, not the time range. Time range now
labeled "Time range" instead of "Window".
- All audit-table columns are sortable (User, Source, Action, Resource,
Result added); sort is page-local with a Jinja comment explaining the
trade-off. Same for raw usage rows.
- Fix duplicate sort-arrow bug — the literal "▼" in the Time th HTML was
rendering alongside the CSS ::before arrow. Removed the literal; CSS
is the single source of truth.
* feat(observability): global Sessions browser + transcript viewer + CLI
Web:
- /admin/sessions — list every collected session JSONL across all users
with time-range, user, model, errors-only and free-text filters. Default
sort surfaces error-heavy sessions first. KPI cards (sessions, distinct
users, sessions w/ errors, tool error rate) clickable as quick-filters.
- /admin/sessions/<username>/<file> — transcript viewer rendering the
JSONL chronologically: user prompts, assistant text, tool calls (with
JSON input) and tool results (with flattened output). Errors get a red
border + chip and a "Next error" navigation button at the top.
- Admin dropdown gains a "Sessions" link.
API:
- GET /api/admin/sessions/{list,kpis,facets} — filtered cross-user reads
off usage_session_summary
- GET /api/admin/sessions/{username}/{file}/transcript — parses JSONL via
the existing services.session_pipeline.lib, returns chronological events
- GET /api/admin/sessions/{username}/{file}/download — JSONL stream, same
path-safety guards as the per-user endpoint, audit-logged
CLI:
- `agnes admin sessions list [--user X] [--errors] [--since 7d]` — table
output with `!` prefix on rows that hit a tool error
- `agnes admin sessions show <username> <file>` — transcript dump, with
`--errors` to print only the failed tool_result blocks
- `agnes admin sessions download <username> <file> [-o path]`
- `agnes admin sessions kpis` — top-level numbers
* feat(internal): expose telemetry tables to agnes query with row-level RBAC
Three new registered tables backed by system.duckdb, queryable through
the same /api/query plumbing analysts use for Keboola / BigQuery /
local sources:
agnes_sessions → usage_session_summary (filter: username)
agnes_usage → usage_events (filter: username)
agnes_audit → audit_log (filter: user_id)
RBAC is per-row, not per-table: admins see every user's rows; non-admins
see only their own. The filter is built server-side from the auth user
dict; non-admin filter values are regex-validated before SQL interpolation.
Implementation:
- new connector connectors/internal/ with access (filter+exec) + registry
(idempotent table_registry seed at startup)
- /api/query detects internal table refs and short-circuits to a CTE
wrapper that prepends "WITH agnes_x AS (SELECT * FROM <src> WHERE …),
…" then "SELECT * FROM (<user_sql>) AS _q". DuckDB cursor on the
shared system.duckdb handle — opening parallel handles / ATTACH on the
same file is blocked process-wide.
- mixing internal + BQ / registered local tables in one SELECT is
rejected (v1 limitation)
- src.rbac.can_access_table waves internal tables through for all
authenticated users; row scoping is the actual security control
- /api/v2/schema and /api/v2/sample gained internal branches; sample
intentionally skips its cache because rows are RBAC-scoped per caller
- audit row written as action='query.internal' with is_admin flag
Tests: connectors/internal/access — RBAC, filter clause, schema, CTE
wrapper coexistence with user-supplied aggregations, unsafe-username
rejection. 16/16 passing.
Motivating queries this enables:
SELECT tool_name, COUNT(*) FROM agnes_usage
WHERE is_error GROUP BY 1 ORDER BY 2 DESC
-- analyst self-introspection: which tools fail for me?
SELECT user_id, COUNT(*) FROM agnes_audit
WHERE action = 'session.transcript_view' GROUP BY 1
-- admin: who's been looking at whose session transcripts?
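The CTE wrapper can be sketched as follows. This is a simplified illustration, not the connector's code: the validation pattern is hypothetical, a single filter value is used for every alias, and sqlite3 can stand in for DuckDB when trying it out.

```python
import re
import sqlite3

# Alias -> (physical table, filter column), as listed in the commit message.
INTERNAL_TABLES = {
    "agnes_sessions": ("usage_session_summary", "username"),
    "agnes_usage": ("usage_events", "username"),
    "agnes_audit": ("audit_log", "user_id"),
}
_SAFE_VALUE = re.compile(r"[A-Za-z0-9._@-]+")  # hypothetical pattern


def wrap_query(user_sql, aliases, filter_value, is_admin):
    """Prepend one row-scoped CTE per alias, then run the user SQL as a subquery."""
    ctes = []
    for alias in aliases:
        source, column = INTERNAL_TABLES[alias]
        where = ""
        if not is_admin:
            # Value is interpolated, so it must pass validation first.
            if not _SAFE_VALUE.fullmatch(filter_value):
                raise ValueError("unsafe filter value")
            where = f" WHERE {column} = '{filter_value}'"
        ctes.append(f"{alias} AS (SELECT * FROM {source}{where})")
    return f"WITH {', '.join(ctes)} SELECT * FROM ({user_sql}) AS _q"
```

Because the alias is defined as a CTE, the analyst's own aggregations run unchanged on top of the already-filtered rows.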
* feat(admin): group dropdown into 5 named sections + internal tables in /catalog
Admin dropdown gains section headers so admins can land on the right
page without re-reading the full menu:
Activity Center Server activity / Tool usage / Sessions
Users & Access Users / Groups / Resource access / Tokens
Data Tables
Agent Experience Curated Marketplaces / Flea Submissions /
Agent Setup Prompt / Agent Workspace Prompt
Server Server config
"Agent Experience" frames the curated content + prompts as one cluster
— it's all admin-controlled material that shapes what an analyst's AI
agent encounters. "Configuration" → "Server" since only one item lives
there now.
Renamed the section's first two items:
"Activity" → "Server activity" (matches page H1)
"Usage" → "Tool usage"
Also fixes /catalog visibility of the internal tables (agnes_sessions /
_usage / _audit) for non-admin users: ``app.auth.access.can_access``
short-circuits to True for resource_type='table' + an internal-table id.
Without this, non-admins saw the tables in /api/v2/catalog (which uses
the same RBAC bypass) but not on the /catalog HTML page (which calls
can_access directly, requiring a resource_grants row internal tables
don't have).
CSS for `.app-nav-menu-section`: small caps, muted, non-clickable; first
section trims top padding so the panel doesn't open with an awkward gap.
* refactor(admin): move corporate memory into Admin > Agent Experience
Memory link was the only admin-only entry in the primary nav (gated by
session.user.is_admin). Moves it into the Admin dropdown under Agent
Experience, alongside Curated Marketplaces / Flea Submissions / Prompts
— all admin-curated content that shapes what an analyst's AI agent
encounters.
Renamed the nav label to "Shared Knowledge" to match what the page
actually is (admin-curated organisational knowledge from session
verification, surfaced to agents). URL stays at /corporate-memory; the
route still gates on require_admin per the existing comment.
Side effect: primary nav (Home / Marketplace / Data Packages) is now
uniform for every authenticated user — no conditional admin-only entry.
* ui: rename admin entries to Curated Knowledge / Init Prompt / Workspace Prompt
- "Shared Knowledge" → "Curated Knowledge" (parallel with "Curated
Marketplaces" in the same Agent Experience section; "curated" tells
the admin what they do there — review + approve)
- "Agent Setup Prompt" → "Init Prompt" (matches the `agnes init` flow
it actually drives)
- "Agent Workspace Prompt" → "Workspace Prompt" (the "Agent" prefix
was redundant — every item in the section is agent-facing)
Renames page titles + H1s on /admin/agent-prompt and
/admin/workspace-prompt to match.
* refactor: rename Usage → Telemetry across user-facing surfaces
External surfaces all switch; internal Python module / file names and the
physical DB tables (usage_events, usage_session_summary, usage_tool_daily,
usage_plugin_daily) stay — renaming them would force a schema migration
+ a redo of the LLM Text-to-SQL prompt for no analyst-visible win.
Changes:
- Admin dropdown: "Tool usage" → "Telemetry"
- Page H1 / <title>: same
- URL: /admin/usage → /admin/telemetry; old URL 308-redirects
- API prefix: /api/admin/usage/* → /api/admin/telemetry/*
- CLI: primary command `agnes admin telemetry …`; `agnes admin usage` kept
as a deprecated alias so existing operator scripts keep working
- Internal data-source table id: agnes_usage → agnes_telemetry. The
registry seed now evicts any stale internal-source row whose id no
longer matches INTERNAL_TABLES, so the old `agnes_usage` row is
removed from table_registry on next app boot
- All tests + JS endpoint paths updated
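The seed-and-evict behaviour can be approximated like this. The table_registry schema, the display names, and the function name are assumptions, and sqlite3 stands in for DuckDB.

```python
import sqlite3

# Hypothetical canonical roster: id -> display name.
INTERNAL_TABLES = {
    "agnes_sessions": "sessions",
    "agnes_telemetry": "telemetry events",
    "agnes_audit": "audit log",
}


def seed_internal_tables(conn, tables):
    """Upsert every canonical internal table, then evict stale internal rows."""
    for table_id, display_name in tables.items():
        conn.execute(
            "INSERT INTO table_registry (id, source_type, display_name) "
            "VALUES (?, 'internal', ?) "
            "ON CONFLICT (id) DO UPDATE SET display_name = excluded.display_name",
            (table_id, display_name),
        )
    placeholders = ", ".join("?" for _ in tables)
    # Only rows owned by the internal source are candidates for eviction.
    conn.execute(
        "DELETE FROM table_registry "
        f"WHERE source_type = 'internal' AND id NOT IN ({placeholders})",
        list(tables),
    )
```

Restricting the delete to `source_type = 'internal'` is what keeps operator-registered tables untouched when the roster changes.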
* test(rbac): include auto-appended internal tables in expectations
get_accessible_tables now appends agnes_sessions / agnes_telemetry /
agnes_audit to every authenticated user's accessible-tables list so the
internal data source shows up in /catalog. The two existing rbac tests
asserted hardcoded list shapes that pre-dated the change.
Rewritten to assert "granted tables + the canonical internal-table set"
instead of literal lists, so the test stays correct if the internal
table roster changes again later.
* ui: visual dividers between admin-dropdown sections
Adds a 1px top border + 6px top margin to every section header except
the first, so the five named groups (Activity Center, Users & Access,
Data, Agent Experience, Server) read as visually separated clusters.
The header itself stays small-caps + muted as before — the border is
additive.
* ui(memory): match obs-topbar visual on /corporate-memory
The Curated Knowledge page (linked from the admin dropdown's Agent
Experience section) opened straight into the stats bar — no title,
no subtitle, no shared chrome with the other admin pages. Adds an
obs-topbar-style header at the top of .container-memory:
- H1 "Curated Knowledge"
- subtitle explaining what the page is + how AI agents pull from it
The `.ck-*` class set duplicates the inline obs-* styles from
/admin/activity etc. for this one page; promoting the obs-* class set
to style-custom.css for shared reuse is the obvious next step (4 pages
already inline the same CSS), tracked as a follow-up.
Page <title> also renamed from "Corporate Memory" → "Curated Knowledge".
* ui(tables): list Agnes internal tables in /admin/tables + group in /catalog
/admin/tables previously rendered three per-source-type listings
(BQ / Keboola / Jira) and dropped any row whose source_type didn't
match — so the agnes_sessions / agnes_telemetry / agnes_audit rows
seeded into table_registry were invisible. Adds a fourth read-only
section "Agnes internal tables" that filters source_type === 'internal'
and renders the same registry-table layout the other sections use,
with two changes:
- no Register button (these rows are seeded on every app boot from
connectors/internal/registry.py)
- Edit + Delete actions hidden (any change would be reverted on the
next start). Manage access stays so admins can still inspect.
Mode badge picks up a new mode-internal CSS class (teal accent) so the
display doesn't lie and call it "local".
In /catalog, internal tables now group under an "agnes" accordion
section (bucket="agnes" on seed) instead of falling into the catch-all
"default". Single source of truth for which tables exist; admins find
them where they expect.
* ui(tables): Agnes internal as a 4th tab next to BQ/Keboola/Jira
Previous iteration mounted the internal-table listing as a separate
standalone card under the tab strip. Reshapes it to a proper
tab-content section so admins switch between data sources via one
consistent nav (BigQuery / Keboola / Jira / Agnes internal).
- New tab button "Agnes internal" in the tab-nav.
- The listing card becomes <section id="tab-content-internal"
class="tab-content">; switchTab() already routes by id so no JS
change beyond extending the hash allowlist for direct #internal
links.
- Tab content keeps the read-only treatment from the previous commit
(no Register button, no Edit / Delete in renderRegistryListing).
* ui: rename Curated Knowledge → Curated Memory
Settles the naming back on "Curated Memory" — parallel structure with
"Curated Marketplaces" in the same Agent Experience section, and zero
rename ripple: URL (/corporate-memory), API (/api/memory/*), CLI
(agnes admin memory), and Python modules all stay on "memory" so the
admin label finally lines up with the underlying surfaces.
The "Curated" prefix still tells admins what they do on the page
(review pending → approve / mandate / reject) and reads as a sibling
of "Curated Marketplaces" right next to it in the dropdown.
Touches: admin dropdown label, page <title>, page H1. DB tables stay
on knowledge_* (already the canonical naming for the data shape).
* ui: rename "Server activity" → "Audit log"
"Audit log" is what the page actually is — server-side audit_log table
rendered with KPI cards + filter bar + sortable table. The "Server
activity" label confused the term with Claude Code session telemetry
(Telemetry page) and didn't make the source/concept clear.
Touches:
- Admin dropdown nav label
- /admin/activity page H1 + subtitle
- /admin/telemetry subtitle cross-link
- test_activity_api page-renders assertion
URL (/admin/activity) and API (/api/admin/activity/*) stay — the
"activity" name has stuck at the route layer for a year; rerouting
those would churn dashboards/bookmarks for zero analyst-visible win.
* ui(admin-nav): gray band on each section header for clearer separation
Previous iteration used a 1px top border between section labels — the
labels still blended into the items above/below at a glance. Switches
to a light gray background band per section header, extended edge-to-
edge inside the panel via negative horizontal margins. Bolder
font-weight (700) reinforces the separation; bumping the font color
isn't needed because the band itself does the work.
First section's header tucks into the panel's top border-radius so the
band reaches the corners without a gap.
* ui(catalog): rename internal-table category to "Agnes Internal"
`bucket` is what /catalog renders as the accordion category header
verbatim — "agnes" lowercase didn't read as a real category name and
got confused with a system identifier. Bumps to "Agnes Internal".
Seed re-applies on every app boot so existing rows pick up the new
bucket value via `ON CONFLICT (id) DO UPDATE`.
* ui(catalog): split Agnes Internal into its own card on /catalog
Previously the three internal tables landed inside the "Core Business
Data" card under an "Agnes Internal" accordion alongside Keboola / BQ
buckets — readers conflated system telemetry with business datasets,
and the data_stats header counter ("3 tables · ~X rows total") only
ever counted synced rows so internal tables looked invisible.
Split the catalog page into two cards:
- Core Business Data: only non-internal source_types (Keboola, BQ,
Jira). Accordions group by bucket as before. Stats counter reflects
this card's tables.
- Agnes Internal: a dedicated card with its own visual treatment
(teal accent matching the mode-internal badge in /admin/tables).
Flat list (no accordion — only 3 rows, never grows here), each
row carries the canonical `agnes query` snippet. Read-only — no
profiler click, no In-stack toggle, no sync metadata.
Route adds `internal_card` context object; template renders the new
card only when it's non-None.
* fix(rbac): hide internal tables from /admin/access + drop "my" framing
Two related cleanups for the Agnes-internal tables:
1. /admin/access (resource grants) no longer lists them. The
`can_access` check has a hardcoded internal-table bypass — security
is row-level (per-request view filter), so a table-grain
`resource_grants` row would do nothing. Surfacing them in the UI
let admins set up grants that silently no-op. Filter at the
`_table_blocks` projection so the UI tree never sees them.
2. Display names drop the analyst-perspective "my" framing:
"Agnes — my sessions" → "Agnes sessions"
"Agnes — my telemetry events" → "Agnes telemetry events"
"Agnes — my audit log" → "Agnes audit log"
The "my" only makes sense from the querying analyst's seat
(`SELECT … FROM agnes_sessions` returns *their* rows); on /admin/*
pages where admin sees / configures them across users, the
pronoun was misleading. Description text now spells out the
row-level RBAC contract explicitly.
Display names update via TableRegistryRepository.register's ON CONFLICT
UPDATE on next app boot; no manual cleanup needed.
* ui: subtitle notes about agnes_* tables on each Activity Center page
The recursive observability story — Agnes serves its own audit /
telemetry / session data through the same `agnes query` plumbing
analysts use for business data — wasn't surfaced anywhere on the
admin pages that show that data. Three pages get a one-liner with
the canonical `agnes query` snippet + the RBAC contract (analysts
see their own rows, admin sees all):
- /admin/activity (Audit log) → agnes_audit
- /admin/telemetry (Telemetry) → agnes_telemetry
- /admin/sessions → agnes_sessions
Sets up the discovery moment for admins: they're reading the page,
they see "you can query this from Claude Code", they remember it
when an analyst asks "how do I find my own failed tool calls?".
* ui(tables): explain "Show log" empty-state on /admin/tables
Cache warmup log <pre> renders with a dark background and is only
populated by the SSE stream during a Re-warm all run. Opening the
page cold + clicking Show log just revealed a black bar with no
context — admins couldn't tell what they were looking at.
Adds an inline paragraph above the <pre> explaining what the log is,
the row format, when it fills in, and where to find the historical
audit trail (/admin/activity). The actual <pre> stays empty until
SSE events arrive, but the surrounding copy carries the meaning.
* ui(tables): auto-open cache-warmup log on Re-warm all click
A Re-warm all run takes ~24s per remote BQ row. With the <details>
collapsed by default, operators saw the button disable, watched a
quiet ~24s pass, and assumed nothing had happened — the streaming
log was hidden behind a closed disclosure.
Two small JS tweaks:
- cacheWarmupRun() opens the details on click, so streamed lines
appear without an extra interaction
- cacheWarmupOnStart() hides the inline hint paragraph the moment
real log content lands, so the dark log block isn't competing
with redundant context
Hint paragraph also clarifies that only `query_mode='remote'` BQ
rows are warmed — operators with only materialized/internal tables
would see total=0 and the page would "do nothing" by spec.
* ui: trim Agnes internal copy across surfaces
Descriptions had grown to explain the extraction pipeline ("parsed
out of session JSONLs"), the underlying table ("Backed by
usage_session_summary"), the RBAC mechanic ("row-level RBAC at query
time — analysts see their own; admin sees all"), and the SQL snippet.
Every implementation detail meant another rewrite on the next iteration.
Strips to one stable line per surface: what the data is, plus
"Also available locally for analysis". Mechanics live in code +
docs; the page copy says what the user needs to know.
Touched:
- connectors/internal/access.py: INTERNAL_TABLES descriptions
- activity_center.html / admin_usage.html / admin_sessions.html
subtitles
- catalog.html Agnes Internal card description + row strip
- admin_tables.html "Agnes internal" tab hint
* fix(internal): is_user_admin arity bugs + saved-view payload cap
Round-1 code review (PR #278) caught two blocking bugs and three nits.
Blocking — both `is_user_admin(user)` (single dict arg) calls raised
TypeError. is_user_admin signature is `(user_id, conn)`. Affected:
- app/api/query.py:_run_internal_query — every POST /api/query that
references agnes_sessions / agnes_telemetry / agnes_audit blew up
with a 500. The headline analyst-facing feature of this PR was
unusable through the API.
- app/api/v2_sample.py — same shape; `GET /api/v2/sample/agnes_*`
returned 500.
Both fixed to call `is_user_admin(user.get("id"), conn)`. Added two
FastAPI-level tests in test_internal_data_source.py that go through
the TestClient — the existing unit tests on `execute_internal_query`
and `build_filter_clause` skipped the request-handler layer where the
bugs lived, which is why the bugs landed.
Nits also closed:
- connectors/internal/access.py: `+` allowed in _USERNAME_RE /
_USER_ID_RE so RFC 5321 email local-parts (alice+test@x) resolve
correctly without hitting InternalAccessError.
- app/api/observability.py: saved-view payload capped at 64 KiB to
prevent an admin from bloating system.duckdb with a malformed save.
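The two nits can be illustrated with a short sketch. The pattern below is a guess at the shape of the real `_USERNAME_RE` (which is not shown in the commit), and `check_view_payload` is a hypothetical helper name; only the `+` allowance and the 64 KiB limit come from the message.

```python
import json
import re

# Hypothetical pattern: the real _USERNAME_RE is not shown in the commit.
USERNAME_RE = re.compile(r"[A-Za-z0-9._@+-]+")
MAX_VIEW_PAYLOAD_BYTES = 64 * 1024  # 64 KiB


def is_safe_username(value):
    """True when the value can be interpolated into the row filter."""
    return USERNAME_RE.fullmatch(value) is not None


def check_view_payload(payload):
    """Serialize a saved view and reject it if it exceeds the cap."""
    encoded = json.dumps(payload).encode("utf-8")
    if len(encoded) > MAX_VIEW_PAYLOAD_BYTES:
        raise ValueError("saved view payload exceeds 64 KiB")
    return encoded
```

Measuring the encoded bytes rather than the character count keeps the cap meaningful for payloads containing multi-byte characters.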
* fix(security): close non-admin data-leak via underlying-table refs
PR #278 R2 review surfaced a non-admin-exploitable bypass: SQL whose
string literal contains 'agnes_sessions' routed into the privileged
internal-query path, then queried the underlying physical table
(usage_session_summary / usage_events / audit_log) directly, escaping
the CTE wrapper's row filter. Two reinforcing defenses, plus tests:
1. find_internal_refs() now strips single-quoted string literals
before scanning for alias names — a literal alone no longer
routes the request into the privileged code path.
2. execute_internal_query() rejects non-admin SQL that references
the underlying physical tables (usage_*, audit_log). The CTE
wrapper only scopes the agnes_* aliases; a direct FROM on the
base table — or a shadowing inner WITH that still has to read
the base table — bypasses RBAC. Block before execution with an
actionable error pointing to the agnes_* alias. Admins are
unaffected (god-mode short-circuit on the filter clause).
3. tests/test_internal_data_source.py — three new negative tests
covering literal-only matches, direct-table refs, and CTE
shadow attempts.
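The two defenses can be sketched roughly as follows. This is an illustration under assumptions, not the shipped code: the regexes, the alias set, and the helper `references_base_table` are hypothetical, and only the names `find_internal_refs`, the `agnes_*` aliases, and the `usage_*` / `audit_log` tables come from the text above.

```python
import re

# Alias names taken from the commit text; the real registry may hold more.
INTERNAL_ALIASES = {"agnes_sessions", "agnes_telemetry", "agnes_audit"}

# A single-quoted SQL literal, with '' as the escaped quote.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Underlying physical tables that non-admin SQL must not name directly.
_BASE_TABLE = re.compile(r"\b(?:usage_[a-z0-9_]+|audit_log)\b", re.IGNORECASE)


def _without_literals(sql: str) -> str:
    """Blank out string literals so their contents are never scanned."""
    return _STRING_LITERAL.sub("''", sql)


def find_internal_refs(sql: str) -> set[str]:
    """Return alias names used as identifiers, ignoring string literals."""
    tokens = {tok.lower() for tok in _IDENTIFIER.findall(_without_literals(sql))}
    return tokens & INTERNAL_ALIASES


def references_base_table(sql: str) -> bool:
    """True when the SQL names a physical table the CTE wrapper does not scope."""
    return _BASE_TABLE.search(_without_literals(sql)) is not None
```

A literal-only match no longer routes into the internal path, and a shadowing `WITH agnes_sessions AS (SELECT * FROM usage_events)` is caught because it still has to name the base table.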
Also tightens usage_ask.py's SELECT-only validator: pragma_table_info,
pragma_storage_info, pragma_database_*, and duckdb_tables / columns /
views / indexes / schemas are reflection functions that leak metadata
the analyst question shouldn't reach. \bPRAGMA\b in _FORBIDDEN never
matched the function-call form (there is no word boundary between `A` and `_`).
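The word-boundary miss is easy to reproduce. In the sketch below, `KEYWORD_ONLY` mirrors the `\bPRAGMA\b` guard described above, while `WITH_FUNCTIONS` is an assumed shape for a tightened pattern (the actual `_FORBIDDEN` entries are not shown in this log):

```python
import re

# The original keyword guard: PRAGMA as a standalone word.
KEYWORD_ONLY = re.compile(r"\bPRAGMA\b", re.IGNORECASE)

# Assumed replacement: also catch pragma_* and duckdb_* reflection functions.
WITH_FUNCTIONS = re.compile(
    r"\b(?:PRAGMA\b|pragma_\w+|duckdb_(?:tables|columns|views|indexes|schemas)\b)",
    re.IGNORECASE,
)

statement_form = "PRAGMA table_info('users')"
function_form = "SELECT * FROM pragma_table_info('users')"

# '_' is a word character, so no \b exists between 'pragma' and '_table_info'.
keyword_hits = [bool(KEYWORD_ONLY.search(s)) for s in (statement_form, function_form)]
function_hits = [bool(WITH_FUNCTIONS.search(s)) for s in (statement_form, function_form)]
```

Here `keyword_hits` is `[True, False]` and `function_hits` is `[True, True]`.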
* fix(security): dynamic denylist for non-admin internal queries
R3 review (PR #278) caught a wider data-leak than R2: the underlying-
physical-table guard listed only the 7 usage_* + audit_log tables,
but system.duckdb has 30+ other sensitive tables — users (emails +
ids), personal_access_tokens, resource_grants, user_groups,
user_observability_views, store_*, marketplace_*, knowledge_*, etc.
A non-admin SQL like
SELECT * FROM agnes_sessions
UNION ALL SELECT email, id, … FROM users LIMIT 1
would leak every user's row.
Replaces the hardcoded denylist with a **dynamic allowlist** —
non-admin SQL may reference ONLY the registered agnes_* aliases.
Every other table in `information_schema.tables` (main schema) is
rejected. Future migrations that add a new sensitive table are
automatically covered without re-editing this module.
Also strips SQL comments (`/* */` and `--`) before the identifier
scan so a comment-wrapped table name (`/**/users/**/`) can't slip
past the regex.
Four new negative tests pin: `users`, `personal_access_tokens`,
block-comment wrap, line-comment wrap.
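A rough sketch of the allowlist check, under stated assumptions: `strip_comments` and `forbidden_tables` are hypothetical names, and `all_tables` stands in for whatever the module reads from `information_schema.tables`. A token scan like this can flag a column that shares a table's name, and it does not understand `--` inside a string literal, so treat it as an illustration of the idea only.

```python
import re

_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"--[^\n]*")
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def strip_comments(sql: str) -> str:
    """Replace comments with a space so adjacent tokens stay separated."""
    return _LINE_COMMENT.sub(" ", _BLOCK_COMMENT.sub(" ", sql))


def forbidden_tables(sql: str, all_tables: set[str], allowed: set[str]) -> set[str]:
    """Return every known table the SQL names that is not on the allowlist."""
    tokens = {tok.lower() for tok in _IDENTIFIER.findall(strip_comments(sql))}
    return (tokens & all_tables) - allowed
```

Because `all_tables` is read at request time rather than hardcoded, a table added by a later migration lands in the rejected set without any change to this check.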
Plus: per-user view-count cap (100) on /api/admin/observability/views
so an admin can't fill system.duckdb with thousands of saved views.
* release: 0.54.0 — Activity Center + Telemetry + Sessions + internal datasource
Cuts the work shipped across this PR (Activity Center build, recursive
internal data source) into a versioned release. Bumps pyproject.toml
to 0.54.0; renames the top of CHANGELOG.md from [Unreleased] to
[0.54.0] — 2026-05-12 with a header summary; opens a fresh
[Unreleased] section for the next round.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Rename agnes-metadata.json to marketplace-metadata.json
Curated marketplace enrichment file (.claude-plugin/agnes-metadata.json)
becomes marketplace-metadata.json. Clean cut, no fallback — curators of
upstream marketplace repos must rename the file on their side.
Python API renames mirror the file rename: read_agnes_metadata →
read_marketplace_metadata, AGNES_METADATA_REL → MARKETPLACE_METADATA_REL,
AGNES_METADATA_MAX_BYTES → MARKETPLACE_METADATA_MAX_BYTES. Synth Claude
Code marketplace strip rule (.agnes/** + the metadata file) follows the
new filename.
* Marketplace detail polish: window cover + 715:310 aspect + helper alignment
- Plugin & item (skill/agent) detail hero: 160x160 square cover replaced
with a macOS-style window frame (3 traffic-light dots + titlebar label
showing the entity name). Body is constrained to 715:310 so curator-
uploaded covers no longer crop to a square. Window is 380px wide; meta
column and absolutely-positioned top-right install/remove actions stay
put. Fallback when no cover_photo_url (translucent gradient + PL/SK/AG
initials) is unchanged, just inside the window body.
- Inner skill/agent cards in the plugin detail's Internal structure
section adopt the same 715:310 aspect (was fixed 78px tall). No window
chrome on inner cards — just the matching proportions so covers read
consistently across hero, grid tiles, and listing cards.
- Curated nested item helper text ("This skill is part of ... — add the
bundle to your stack to use it") now stacks UNDER the "Open parent
plugin" button instead of being a side-by-side flex sibling in the
actions-row. Added align-self: flex-end so the 260px helper box
anchors at the right edge of the 300px actions column, matching the
button's right edge.
* Marketplace My tab: surface the same category + type filters as Flea
- Frontend: mp-cat-row and mp-type-row now show on tab=my (previously
hidden — type was flea-only, category was flea/curated-only). Curated
browse stays plugin-only and continues to hide the type pills.
fetchOne() sends the `type` param for tab=my too, so the items
endpoint's existing my-branch filter actually receives it.
- Backend categories endpoint, tab=my branch: when the type filter is
set to skill/agent, skip counting curated subscriptions. Curated
plugins are always type='plugin', so they wouldn't survive the items
endpoint's type filter; including them in the category counts made
the pill numbers overstate what users could actually see in the
grid. type=None or type='plugin' keeps the previous behaviour.
- CHANGELOG entry under [Unreleased].
* Marketplace plugin detail: render rich content from marketplace-metadata.json
Adds five optional plugin-level fields to marketplace-metadata.json and
renders them on the curated plugin detail page + listing card:
* display_name — friendly h1 / listing-card name / mac-window titlebar
label (overrides the technical plugin id)
* tagline — punchy 1-line value prop for the hero subtitle and the
listing card description (replacing the verbose marketplace.json
description on cards)
* description — multi-paragraph markdown body, server-side rendered
through markdown-it-py and sanitized through nh3 with a
description-scoped allowlist (no iframes / no raw HTML / no
javascript: links). Powers the "What it does" panel.
* use_cases[] — {title, description, prompt} entries that render as a
3-column "When to use it" card grid; each card shows the literal
prompt as a code chip so users can copy-paste into Claude Code.
* sample_interaction — {user, assistant} dialog rendered in a Claude
Code-style dark Catppuccin Mocha transcript panel: monospace user
row with a green ">" prompt indicator + sans-serif assistant body
with markdown formatting (peach bold, yellow italic, pink inline
code, mantle-dark fenced code blocks).
All five fields are optional; UI sections only render when populated,
so plugins without enrichment look identical to before. Fields are
read on-demand from the working tree (cached by mtime per marketplace
slug) so curator edits land at the next request without waiting for
a sync cycle — same pattern as the existing inner-skill/agent
enrichment path. No DB schema bump.
Skill / agent rich-content rendering is deferred to a later phase
(needs a source-of-truth decision: extend plugin.yml? LLM-generate
from SKILL.md / agent.md?). The schema accepts the same fields at
skill/agent level today for forward compatibility but the UI ignores
them for now.
Also: stripped a stale `background-color: var(--bg)` from the global
`code` rule in style.css (was making inline code visually disappear
on the page background).
* Skill / agent detail: render rich content from marketplace-metadata.json
Brings the skill/agent detail pages to parity with the plugin detail
page. Same rich-content schema (display_name, tagline, description as
markdown, use_cases[], sample_interaction) plus two per-item additions:
* invocation — curator-provided literal command string. When set,
overrides the computed "<manifest_name>:<inner_name>" chip and
cleanly supports both "/" skill prefix and "@" agent prefix (the
hardcoded "/" in the chip markup is hidden when the curator provides
the invocation, so /grpn-eng:query <q> and @grpn-eng:cto-architect
both render correctly).
* when_to_use — markdown disambiguation block ("Use this for X. For
similar Y, see /other-skill") rendered into a new "When to use this"
panel below the Example section.
Skill / agent category is now per-item overridable in
marketplace-metadata.json. When absent, the API keeps the parent
plugin's category as the badge so existing items don't lose their
category until curators opt in to per-item categorization.
The new "Example" Q&A panel uses the same Claude Code-style dark
Catppuccin Mocha transcript treatment as the plugin detail —
monospace user row with a green ">" prompt indicator + sans-serif
assistant body with markdown formatting.
All new fields are optional and read on-demand from the working tree.
Skills / agents whose marketplace-metadata.json doesn't carry rich
content render exactly the same way they did before (frontmatter
description + computed slash command + cover from existing v32
enrichment). No DB schema bump.
* Fix TypeError in skill / agent detail when curator sets per-item category
`curated_skill_detail` and `curated_agent_detail` were passing both
`**parent` (from `_curated_inner_parent_fields`, which returns the
parent plugin's category as a fallback) and `**enrichment` (from
`_curated_inner_enrichment`, which returns the per-item category
override when the curator set one) into `InnerDetailResponse(...)`.
Python function-call kwargs unpacking with overlapping keys raises
`TypeError: got multiple values for keyword argument 'category'`
— it doesn't merge like a literal dict does. The bug only surfaced
when the marketplace-metadata.json carried a `category` field at
skill / agent level (curator opting into per-item categorization);
items without that override hit the endpoint cleanly because only
parent provided the key.
Fix: build `merged = {**parent, **enrichment}` first (literal-dict
syntax DOES merge, with the right-hand-side winning) and unpack the
merged dict. Curator override still wins via the merge order, and
the same pattern is future-proof for any other field that lands in
both layers later.
Plus a regression test in test_marketplace_metadata.py asserting
that the inner-resolver carries `category` for downstream merging.
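The difference between the two unpacking forms can be reproduced in isolation. `build_response` is a hypothetical stand-in for `InnerDetailResponse(...)`, and the field values are made up:

```python
def build_response(**fields):
    """Stand-in for InnerDetailResponse(...); just returns its kwargs."""
    return fields


parent = {"category": "Engineering", "plugin": "my-plugin"}
enrichment = {"category": "Data", "tagline": "Query things"}

overlap_error = None
try:
    # Two ** unpackings in one call do not merge; the duplicate key raises.
    build_response(**parent, **enrichment)
except TypeError as exc:
    overlap_error = str(exc)

# A dict display does merge, and the right-hand side wins on overlap.
merged = {**parent, **enrichment}
response = build_response(**merged)
```

After this runs, `response["category"]` is `"Data"` (the curator override) and `overlap_error` carries the `multiple values for keyword argument 'category'` message.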
* Marketplace detail: tolerate partial curator JSON
Server constructed UseCase / SampleInteraction via raw dict indexing
(uc["title"], sample["assistant"]), so a curator commit missing any
required Pydantic field crashed the whole plugin / skill / agent detail
endpoint with a 500. Route both constructions through _safe_use_case /
_safe_sample_interaction helpers — partial input silently drops the
malformed card / section instead of breaking the page.
Regression test in test_marketplace_api.py covers the three shapes:
use_case missing a key, use_case with an empty string, and
sample_interaction with only user (no assistant). Sibling rich fields
still render.
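A sketch of the tolerant-construction idea. The real helpers build Pydantic models; the dataclass and the function names below are simplified assumptions used only to show the drop-on-partial-input behaviour:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UseCase:
    title: str
    description: str
    prompt: str


def safe_use_case(raw: Any) -> Optional[UseCase]:
    """Build a UseCase, or return None when a required field is missing or blank."""
    if not isinstance(raw, dict):
        return None
    values = [raw.get(key) for key in ("title", "description", "prompt")]
    if not all(isinstance(v, str) and v.strip() for v in values):
        return None
    return UseCase(*values)


def safe_use_cases(raw_list: Any) -> list[UseCase]:
    """Keep only the well-formed cards; malformed ones are dropped silently."""
    if not isinstance(raw_list, list):
        return []
    return [uc for uc in map(safe_use_case, raw_list) if uc is not None]
```

One malformed card then costs the page that card only, instead of a 500 for the whole endpoint.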
* Address PR-251 review (must-fixes + S2/S3 polish) + release-cut 0.50.0
Five must-fixes from the review pass (3 from @cvrysanek's two-stage
review, 2 from my independent pass), plus the 0.50.0 release-cut as the
last commit on this PR per CLAUDE.md (the "Release-cut belongs to
the PR" rule added in v0.49.1).
Must-fixes
----------
1. Cache eviction: bounded LRU instead of per-marketplace predicate.
The previous predicate (`k[0] == marketplace_id and k[1] != mtime_ns`)
only swept stale entries for the CURRENT marketplace; with N>100
distinct marketplaces each holding one mtime key, the cap silently
failed and memory grew linearly. Replaced with OrderedDict-backed
bounded LRU at cap=256, drop oldest insert on overflow.
Cache stress test pinned in test_marketplace_metadata.py.
2. Render CPU cap: per-field byte cap on description / when_to_use /
sample_interaction.assistant via MARKETPLACE_METADATA_FIELD_MAX_BYTES
(= 64 KiB). Without this, a 1 MiB curator markdown body × QPS =
curator-controlled CPU burn through pure-Python markdown-it-py.
Truncation respects UTF-8 boundaries and logs a warning so the
curator sees the cap fire on the next sync. Test for cap +
UTF-8-boundary preservation.
3. Inner-detail bypassed the metadata cache. _curated_inner_enrichment,
_curated_inner_cover, and curated_detail all called
read_marketplace_metadata directly, defeating the mtime cache the
plugin listing already shared. Routed all three through
_read_metadata_cached so skill/agent detail hits are O(1) re-parses
per marketplace per mtime instead of O(QPS).
4. Truthy-vs-presence trap in plugin/inner enrichment merge. API-layer
writers used `if resolved.get(k):` which silently dropped any
future falsy-but-valid resolver field (bool featured=False, int
priority=0, str category=''). Switched to presence check
(`if k in resolved`) so the resolver is the authority on field
presence; `{**parent, **enrichment}` merge respects whatever the
resolver decided to ship.
5. Vendor-agnostic OSS cleanup. Removed operator-specific token
references (/grpn-eng:, @grpn-eng:, .foundryai/) from
src/marketplace_metadata.py docstring, app/web/templates/
marketplace_item_detail.html JS comment, docs/curated-marketplace-
format.md, and tests/test_marketplace_metadata.py fixtures. Replaced
with generic /my-plugin:tool / @my-agent:role / .example/ placeholders.
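Must-fixes 1 and 2 can be sketched with the standard library alone. `BoundedCache` and `truncate_utf8` are hypothetical names; the shipped cache keys, logging, and constant wiring are not shown in this log and may differ:

```python
from collections import OrderedDict


class BoundedCache:
    """Least-recently-used cache bounded at `cap` entries."""

    def __init__(self, cap: int = 256) -> None:
        self.cap = cap
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        return None

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.cap:
            self._data.popitem(last=False)  # evict the least recently used

    def __len__(self) -> int:
        return len(self._data)


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the partial multi-byte sequence left at the cut.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

The cap holds regardless of how many distinct marketplaces contribute keys, which is the property the per-marketplace predicate lacked.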
CHANGELOG
---------
- New "### Fixed (PR #251 follow-ups)" section documenting all 4
code-side must-fixes
- New "### Internal" section noting the vendor cleanup + new tests
- BREAKING bullet for the file rename now covers operator-side
migration: running instances see plugin enrichment disappear from
the UI until upstream curator renames + nightly sync overwrites the
working tree; POST /api/marketplaces/{id}/sync forces refresh sooner
- Stripped /grpn-eng: leaks from the existing skill/agent rich-content
bullet
Tests
-----
128 targeted tests pass (test_marketplace_metadata, test_marketplace_api,
test_marketplace, test_markdown_render, test_marketplace_synth_strip,
test_marketplace_filter). New tests added:
- 6 XSS regression tests on render_safe (javascript:/data:/vbscript:
schemes via autolink, reference link, and mixed-case + positive
http/https/mailto + noopener noreferrer rel)
- 3 byte-cap tests (truncation + UTF-8 boundary + under-cap pass-through)
- 1 cache eviction stress test (>256 marketplaces -> bounded at cap)
- 1 truthy-vs-presence resolver-contract test
Release-cut
-----------
- pyproject.toml 0.49.1 -> 0.50.0 (minor; BREAKING file rename per
pre-1.0 CHANGELOG note: "breaking changes called out under Changed
or Removed with the BREAKING marker")
- CHANGELOG [Unreleased] -> [0.50.0] - 2026-05-12, new empty
[Unreleased] on top.
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
* System plugin tier with mark/unmark fanout (schema v39)
Adds a mandatory plugin tier so admins can pin a small set of curated
plugins into every user's stack from day one. Marking a plugin via the
new toggle on /admin/marketplaces materializes resource_grants for every
group and user_plugin_optouts subscriptions for every user, so the
existing resolver pulls the plugin into every served set without a new
filter layer. Hooks on user-create (Google OAuth, magic-link, admin
POST, scheduler) and group-create propagate the same materialization to
new principals. UI locks: /admin/access disables the checkbox with a
SYSTEM pill; /marketplace cards swap the "In stack" green pill for an
amber "Required" badge with shield icon; the plugin detail install
button reads "Required by your org"; /my-ai-stack toggle is disabled.
Bypass paths return 409 (DELETE /api/admin/grants for system grants,
PUT /api/my-stack/curated/.../{enabled:false}, DELETE
/api/marketplace/curated/.../install). Unmark only flips the flag —
materialized rows persist so admins curate cleanup at their leisure
through the now-unlocked /admin/access checkboxes.
* Marketplace UX polish + drop legacy /store and /my-ai-stack pages
Two-part cleanup post-v39:
(1) Page deletion. /store and /my-ai-stack were already replaced by
/marketplace?tab=flea and /marketplace?tab=my respectively, but the
standalone routes lingered. Hard delete in dev mode — no redirects,
stale bookmarks 404. The /store/new upload wizard, the flea
detail/edit pages, the admin queue, and all /api/store/* +
/api/my-stack endpoints (CLI consumers) stay. Internal hardcoded
hrefs in the upload wizard's Cancel button and the advanced-setup
page repointed to the marketplace tabs.
(2) Detail-page install button rework. The single button that morphed
between "+ Add to my stack" and "✓ In your stack" did not
communicate uninstall affordance. The installed state now renders an
inline white status label *before* a separate red-bordered
"✕ Remove from stack" button on the same row, both at identical
height to avoid layout shift. System plugins keep their locked amber
"✓ Required by your org" pill (no Remove button, since the API refuses with 409).
The post-action hint panel now fires on remove too with the title
flipped to "✓ Removed from your stack" — Claude Code needs the same
/update-agnes-plugins refresh either way.
Also: /admin/marketplaces Details modal "Mark as system" toggle
redesigned. The button was near-invisible (matched neutral row
metadata). It's now a balanced amber-toned chip with shield icon
and a structured confirm modal replacing the native confirm() dialog
that summarizes fanout consequences before commit.
* Move stack-hint inside hero with glass-on-gradient styling
The post-action hint card ("✓ Added to your stack" with the
/update-agnes-plugins recipe) used to live below the hero in
panel-what (gray card on white page body). Clicking add/remove
inserted/removed it between the hero and content, shifting the
panels below — a noticeable scroll jump.
The hint is now anchored inside the hero's top-right corner alongside
the install/remove buttons, both as flex children of an absolutely
positioned .actions container. The card uses a translucent
white-on-glass treatment that adopts the hero's kind color (blue for
plugin, green for skill, purple for agent) without per-kind branching.
Hero is always tall enough (160px photo) to contain the action+hint
stack without overflow, so toggling the hint visibility doesn't grow
the hero or shift body content.
The hero-head grid reserves a third 300px column for the absolute
actions overlay so meta gets the proper 1fr free space instead of
being squeezed by a padding-right hack. Responsive breakpoint at
1100px reflows the actions stack below hero-head when the viewport
isn't wide enough to keep meta + actions side-by-side comfortably.
* Add optional -DataPath bind mount to run-local-dev.ps1
When the operator wants to inspect DuckDB files (system.duckdb, extracts,
marketplaces, store/, …) directly from Windows Explorer, the named volume
inside the Docker Desktop WSL VM isn't reachable. The new -DataPath param
generates a transient compose override that rebinds /data on app, scheduler,
extract (and Caddy's /srv:ro mirror) to a Windows host folder.
Fully additive — when -DataPath is omitted everything behaves exactly as
before: no override file is generated, $composeFiles array is unchanged,
finally cleanup is a no-op. Existing positional invocations
(.\run-local-dev.ps1 up | down | logs) keep binding to $Action because
$DataPath is a named-only parameter with no Position attribute.
The override is written via [System.IO.File]::WriteAllText so the YAML is
BOM-less across PS 5.1 / 7+ — Compose rejects BOM-prefixed YAML on Windows.
The override file is unique per PID and removed in the script's finally
block so concurrent invocations and crashes don't leak files.
* factor mark_system fanout into UserCuratedSubscriptionsRepository
The endpoint imported UserCuratedSubscriptionsRepository, ignored it
(noqa: F841), then duplicated the user-side fanout SQL inline. Adds
fanout_system_for_plugin() symmetric to the existing
fanout_system_for_user() and routes mark_plugin_system through it —
removes the dead import + 14 lines of inline SQL, returns the same
`affected_users` delta count, no behavior change.
* drop customer-specific path from .ps1 example
Per CLAUDE.md vendor-agnostic OSS rule: replaced
C:\\Business\\Groupon\\Agnes\\agnes-data with the generic
C:\\Users\\<you>\\agnes-data placeholder so the docstring
example reads cleanly on any reviewer's box.
* release: 0.48.0 + parallelize Release-workflow pytest
Cuts the release shipped via #228, #230, #231, #232, #233, #234, #236, #237, #238, #239, #240 plus this PR (#241). Major changes:
- System plugin tier (schema v39) — admins mark a plugin mandatory; fans
out RBAC grants + subscriptions to every existing user/group plus
hooks for new principals
- BREAKING: removed standalone /store + /my-ai-stack page routes
(replaced by /marketplace?tab=flea + /marketplace?tab=my)
- Setup-prompt + bootstrap recovery fixes (#240)
- DuckDB CHECKPOINT-on-shutdown + 60s compose grace (#235)
- Marketplace + flea-market UX polish, agnes-metadata.json enrichment
Bonus: switch release.yml test step to `-n auto` (matches ci.yml).
Single-threaded was 15-20 min and frequently the bottleneck on PR
mergeability — now ~6 min.
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
* Curated marketplace enrichment via agnes-metadata.json + curator metadata
Adds a second well-defined metadata file `.claude-plugin/agnes-metadata.json`
that upstream marketplace repos can opt into, providing per-plugin (and
per-skill / per-agent) cover photo, demo video URL, doc links, and
category override. The Claude Code marketplace contract is untouched —
agnes-metadata.json + the convention `.agnes/` directory are stripped
from the synthetic Claude Code marketplace served via /marketplace.zip
and /marketplace.git/*, so user instances see a clean Claude Code repo
with no Agnes-only metadata.
Highlights:
- DB schema v32 — adds curator_name + curator_email on marketplace_registry,
cover_photo_url + video_url + doc_links on marketplace_plugins.
- Mandatory curator at marketplace registration, editable later through
the admin UI; surfaces on cards + detail pages in place of owner_todo.
- External-asset mirror cache at ${DATA_DIR}/marketplace-cache/<slug>/
with conditional GET, 60s timeout, 10 MB body cap, SSRF guards, and
Wikipedia-policy-compliant User-Agent.
- Strict drop semantics — anything Agnes can't deliver as a real PDF /
Markdown / plain text doc, or a real PNG / JPEG / WebP cover, is
dropped from the served metadata; UI looks identical to no-entry case
(gradient placeholder for missing covers, no row in the doc list).
- Doc allowlist + image allowlist enforced on both the curated mirror
flow and the Flea upload flow (/store/new); shared module
src/marketplace_assets.py.
- New /api/marketplace/curated/{mp}/{plugin}/{asset,doc,mirrored}/...
endpoints with path-traversal guards + RBAC + Content-Disposition
attachment for docs.
- Curator-focused format guide at /marketplace/format-guide; canonical
source is docs/curated-marketplace-format.md, also linked from the
admin /admin/marketplaces page next to + Add Marketplace.
See CHANGELOG.md under [Unreleased] for the full breakdown.
* Fix format-guide test assertion to match shortened disclaimer
The 'Flea Market' phrase was trimmed out of the disclaimer in
docs/curated-marketplace-format.md after the curator-focused rewrite.
Update the rendered-HTML test to assert the channel-scoping phrase
that's actually present ('Curated Marketplace channel only') rather
than the 'Flea Market' contrast that's no longer in the doc.
* Drop unused 'version' field from agnes-metadata.json schema
The parser never read it; it was a YAGNI placeholder for future
schema evolution. Curators don't need to wonder what to put there
when adding the file for the first time. Will be re-added if and
when we actually introduce a backwards-incompatible schema change.
* Harden asset mirror against SSRF via redirect + DNS rebinding
The pre-flight _is_safe_url check validated only the initial URL;
urllib.request.urlopen then followed redirects and re-resolved DNS for
the actual connection — both bypassable. Attacker-controlled origin
could 302 to http://169.254.169.254/... and exfil cloud metadata;
attacker-controlled DNS could return public IP first / 127.0.0.1 second.
Replace urlopen call with a shared OpenerDirector wired through three
custom handlers: _SafeRedirectHandler re-runs SSRF allowlist on every
redirect Location (max 5 hops, down from urllib's 10), and
_PinnedHTTPHandler / _PinnedHTTPSHandler connect to the IP that passed
validation rather than re-resolving the hostname. TLS SNI + cert verify
stay bound to the original hostname.
_resolve_safe returns the validated IP (the existing _is_safe_url
2-tuple wrapper stays for backwards compatibility) and rejects round-
robin DNS that mixes a public + private record. _UnsafeRedirectError
is a typed exception so _fetch_url can map redirect blocks to terminal
'rejected' status (not transient 'failed'). _http_open is the single
call site so tests can mock at one well-defined seam.
Tests cover redirect blocking (link-local, loopback), redirect-error
unwrapping inside URLError, pinned-IP connection target, and the
end-to-end DNS-rebinding scenario. Existing tests that mocked
urllib.request.urlopen are migrated to mock _http_open.
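The mixed-record rejection can be sketched on its own. This is a simplified assumption of what `_resolve_safe` does, with an injectable resolver so it can be exercised without network access; the redirect handler and the pinned HTTP/HTTPS handlers are not reproduced here:

```python
import ipaddress
import socket
from typing import Callable, Iterable, Optional


def _default_resolver(host: str) -> Iterable[str]:
    """Resolve a hostname to every address the system resolver returns."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def resolve_safe(
    host: str, resolver: Callable[[str], Iterable[str]] = _default_resolver
) -> Optional[str]:
    """Return one public IP to pin the connection to, or None to reject."""
    try:
        addresses = [ipaddress.ip_address(a) for a in resolver(host)]
    except (OSError, ValueError):
        return None  # resolution failed or returned something unparseable
    # Reject when ANY record is non-public, not only the first one returned.
    if not addresses or not all(a.is_global for a in addresses):
        return None
    return str(addresses[0])
```

The caller would then connect to the returned IP instead of the hostname, so a second DNS lookup never happens between validation and connection.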
* Harden /asset/ endpoint against stored XSS
The endpoint served any file in the cloned marketplace repo with
stdlib-detected Content-Type, so a curator who landed evil.html (or a
renamed evil.png carrying HTML bytes) in the working tree got a
same-origin XSS — the response shares cookie scope with /admin and
/api/me/*.
The asset endpoint is image-only by contract (cover photos referenced
from agnes-metadata.json + inner skill / agent cards), so applying the
same allowlist + magic-bytes pattern that /doc/ already uses closes
the gap without breaking any legitimate use case. Three layered
checks: extension in IMAGE_EXTENSIONS (.png/.jpg/.jpeg/.webp; SVG
excluded — <script> inside SVG executes), validate_image_file magic
bytes (defeats rename-extension attack), Content-Type pinned from the
validated extension (never stdlib mimetypes).
Defense-in-depth: X-Content-Type-Options: nosniff stops browser MIME
sniffing; Content-Security-Policy: default-src 'none' blocks script /
iframe execution even if a future regression let HTML through.
Tests cover the .html extension reject, the renamed-HTML-as-PNG magic-
bytes reject, the .svg reject, and the happy-path PNG with security
headers attached. The pre-existing path-traversal test seeds a real
PNG instead of ok.txt now that the endpoint is image-only.
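The three layered checks might look like the sketch below. The table and the function name `pinned_content_type` are assumptions; the real code uses `IMAGE_EXTENSIONS` and `validate_image_file`, whose exact signatures are not shown in this log:

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension -> (pinned Content-Type, magic-bytes check). SVG is deliberately absent.
_IMAGE_TYPES = {
    ".png": ("image/png", lambda b: b.startswith(b"\x89PNG\r\n\x1a\n")),
    ".jpg": ("image/jpeg", lambda b: b.startswith(b"\xff\xd8\xff")),
    ".jpeg": ("image/jpeg", lambda b: b.startswith(b"\xff\xd8\xff")),
    ".webp": ("image/webp", lambda b: b[:4] == b"RIFF" and b[8:12] == b"WEBP"),
}

# Defense-in-depth headers attached to every successful response.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "Content-Security-Policy": "default-src 'none'",
}


def pinned_content_type(filename: str, head: bytes) -> Optional[str]:
    """Return the Content-Type to serve, or None when the file must be rejected."""
    entry = _IMAGE_TYPES.get(PurePosixPath(filename).suffix.lower())
    if entry is None:
        return None  # not on the extension allowlist (.html, .svg, ...)
    content_type, magic_ok = entry
    # The type comes from the validated extension, never from content sniffing.
    return content_type if magic_ok(head) else None
```

A renamed `evil.png` carrying HTML bytes passes the extension check but fails the magic-bytes check, so it is never served.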
* Enforce mandatory curator on marketplace PATCH
The POST handler enforced curator_name + curator_email at create time,
but PATCH treated empty / missing curator inputs as 'no change'. Legacy
rows that pre-date v32 (curator_name=NULL) could be edited indefinitely
without ever filling the curator gap, and OWNER_TODO_PLACEHOLDER lingered
on every /marketplace card.
Reject the PATCH with 400 when the post-merge row would persist with
empty curator. The check fires after the existing field-merge logic, so
once-filled rows that don't touch curator still pass through (their
existing values fall through from the DB row). DB column stays nullable
so untouched legacy rows continue to coexist — the gate fires only the
moment an admin opens the edit modal.
Existing PATCH semantics preserved: empty-string input still means 'leave
existing value alone', and once-filled curator can't be cleared (those
test cases pass unchanged). New test seeds a legacy row directly via the
repository, then exercises url-only PATCH (rejected), partial-fill PATCH
(rejected), and full-fill PATCH (succeeds); a follow-up no-curator PATCH
on the now-formed row also passes.
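A sketch of the post-merge gate, assuming a plain-dict row and a hypothetical `CuratorRequired` exception that the handler would map to a 400. The real handler's field handling and error type are not shown in this log:

```python
class CuratorRequired(ValueError):
    """Raised when a PATCH would leave the row without a curator (HTTP 400)."""


def merge_patch(row: dict, patch: dict) -> dict:
    """Apply a PATCH where empty or missing inputs mean 'keep existing value'."""
    merged = dict(row)
    for key, value in patch.items():
        if value not in (None, ""):
            merged[key] = value
    # The gate runs on the merged row, so once-filled rows pass untouched.
    for field in ("curator_name", "curator_email"):
        if not merged.get(field):
            raise CuratorRequired(f"{field} is required")
    return merged
```

Checking after the merge is what lets a url-only PATCH succeed on a row that already has a curator while still rejecting it on a legacy row.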
* Drop unused curated-marketplace helpers (PR #234 review)
* build_db_payload — imported by src/marketplace.py but never called.
The strict-drop semantics it would have implemented were re-written
inline in _refresh_plugin_cache (see the comment block there). The
standalone helper still carried the old fall-back-to-original-external-
URL-on-mirror-failure behaviour, which contradicts the documented
drop-when-can't-deliver contract — a future contributor who re-wired
it would have introduced a silent regression. Delete with the helper
+ the import + the comment that referenced it.
* _resolve_marketplace_name — one-line shim with no remaining call
sites. Callers use _resolve_marketplace_meta which returns name +
curator together, avoiding the double DB hit the shim exists to
hide.
* '# noqa: F401 Optional kept for forward-compat' was wrong — Optional
IS used in src/marketplace.py (line 70 and line 238). Drop the noqa
comment so a future ruff run doesn't try to remove a real import.
Removing build_db_payload also drops the only remaining use of Optional
in src/marketplace_metadata.py, so the import comes out there too.
* Cap agnes-metadata.json size + catch RecursionError on parse
The reader is invoked once per marketplace per sync and the file is
curator-controlled. Two failure modes were unguarded:
* Multi-GB JSON: path.read_text() pulled the whole file into memory
before json.loads even ran. A curator with commit access to an
upstream repo could OOM the sync worker.
* Deeply-nested JSON under any size cap: CPython's recursive object /
array parser raises RecursionError at ~1000 levels of depth.
RecursionError is a RuntimeError, not ValueError, so the existing
catch let it propagate up and abort the entire sync — every other
marketplace in the same pass got skipped.
Add AGNES_METADATA_MAX_BYTES = 1 MiB (a real metadata file with covers,
docs, categories for ~50 plugins fits in <100 KB so the cap is
generous) and gate the size check on path.stat().st_size before the
body read. Broaden the parse except to (ValueError, RecursionError)
with a unified log line. Both failure modes degrade to the same
empty-dict fall-back the malformed-JSON path already used, so one bad
upstream never aborts the rest of the sync.
Tests cover the size cap firing before json.loads (whitespace-padded
valid JSON exceeding the cap) and the recursion path (5000 nested
arrays, past CPython's default recursion limit but well under the
size cap).
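The guarded read can be sketched as below. `read_metadata` and its `loads` parameter are hypothetical (the injectable parser only exists to make the recursion path easy to test), and catching `OSError` for a missing file is an added assumption beyond what the commit describes:

```python
import json
import logging
from pathlib import Path
from typing import Any, Callable

log = logging.getLogger(__name__)
MAX_BYTES = 1024 * 1024  # 1 MiB, matching the cap described above


def read_metadata(
    path: Path, max_bytes: int = MAX_BYTES, loads: Callable[[str], Any] = json.loads
) -> dict:
    """Return the parsed metadata dict, or {} on any curator-side failure."""
    try:
        # Check the size via stat() BEFORE reading the body into memory.
        if path.stat().st_size > max_bytes:
            log.warning("metadata file %s exceeds %d bytes; ignoring", path, max_bytes)
            return {}
        parsed = loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError, RecursionError) as exc:
        # json.JSONDecodeError is a ValueError; RecursionError is a RuntimeError.
        log.warning("metadata file %s unreadable: %s", path, exc)
        return {}
    return parsed if isinstance(parsed, dict) else {}
```

Every failure path returns the same empty dict, so one bad upstream file degrades that marketplace's enrichment without aborting the rest of the sync.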
* Persist asset-mirror manifest per body write, before unlink
sync_assets wrote each body atomically (tmp + rename) but persisted
the manifest only at the end of the batch. A kill -9 mid-Phase 2 left
on-disk files the manifest never referenced. Once a curator dropped
that URL from agnes-metadata.json, Phase 3's cleanup had no record of
the file and the orphan stayed forever — there's no GC pass walking
the cache dir today, so disk would slowly bloat.
Phase 2 (body-write iteration): after the in-memory manifest mutation,
persist BEFORE unlinking the previous body. The crash window narrows
from 'all of Phase 2' to 'between persist and unlink' (microseconds).
A persist failure mid-batch keeps the previous body on disk — the on-
disk manifest still references it, and a stale-but-existing file beats
a 404. Cost: one extra tmp+rename per body write; manifest is a few KB
so the overhead is negligible vs. the HTTP fetches.
Phase 3 (curator-removed URLs): same discipline. Collect the to-delete
relpaths, persist the manifest with the entries already gone, THEN
unlink. A crash mid-cleanup leaves at most a microsecond window where
files exist despite the manifest no longer naming them. The next sync
reads the (correct) manifest and the orphan stays orphaned, but the
served state is consistent.
Tests cover per-body persist call count, the post-update on-disk
manifest content, and Phase 3 ordering verified by reading the on-disk
manifest from inside Path.unlink.
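The Phase 2 ordering can be sketched as follows. The function names, the flat `url -> relpath` manifest shape, and the `manifest.json` filename are all assumptions made for illustration; only the write, mutate, persist, unlink order comes from the text above:

```python
import json
import os
from pathlib import Path


def persist_manifest(manifest_path: Path, manifest: dict) -> None:
    """Write the manifest atomically: temp file in the same directory, then rename."""
    tmp = manifest_path.with_name(manifest_path.name + ".tmp")
    tmp.write_text(json.dumps(manifest), encoding="utf-8")
    os.replace(tmp, manifest_path)


def replace_body(cache_dir: Path, manifest: dict, url: str, new_rel: str, body: bytes) -> None:
    """Store a new body for `url`, persisting the manifest before deleting the old file."""
    old_rel = manifest.get(url)
    tmp = cache_dir / (new_rel + ".tmp")
    tmp.write_bytes(body)
    os.replace(tmp, cache_dir / new_rel)  # 1. body lands atomically
    manifest[url] = new_rel  # 2. in-memory mutation
    persist_manifest(cache_dir / "manifest.json", manifest)  # 3. persist
    if old_rel and old_rel != new_rel:
        (cache_dir / old_rel).unlink(missing_ok=True)  # 4. only then unlink
```

If step 3 raises, step 4 never runs, so the on-disk manifest still points at a file that exists.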
* Consolidate marketplace video embeds + format-guide CSS
The YouTube nocookie / Vimeo / <video> / link-fallback detection logic
was duplicated verbatim in marketplace_plugin_detail.html and
marketplace_item_detail.html (~40 JS lines each, with subtly-different
inline styles). Both templates now {% include %} a single
_marketplace_video_embed.html partial inside their IIFE so the regex,
the nocookie attribute set, and the unknown-host link fallback live in
ONE place — future tweaks (new host, new attribute, fixed sandbox flag)
no longer need to be applied twice in lockstep.
The .video-wrap selectors (one inline <style> rule in plugin_detail,
one inline style='...' attribute in item_detail) are replaced by the
existing .video-embed 16:9 wrapper in style-custom.css, with new
.video-embed video / .video-embed a child rules added so the wrapper
handles all four embed shapes uniformly without per-template
positioning.
The 60-line inline <style> block in marketplace_format_guide.html
moves verbatim to style-custom.css under a new 'Marketplace format
guide page' section, scoped to .format-guide so other pages aren't
affected.
No user-visible behaviour change: the rendered HTML for valid
YouTube / Vimeo / mp4 / external links is byte-identical to before,
and the format-guide page renders the same.
* Maintainability cleanup batch (PR #234 review)
#10: drop _path_under from app/api/marketplace.py — it was a byte-
equivalent clone of _safe_join (same Path.resolve(strict=True) +
relative_to() containment check). The three v32 endpoint handlers
(/asset, /doc, /mirrored) now share the existing helper.
#14: rename src/marketplace_assets.py → src/marketplace_asset_validation.py
so the file's purpose is obvious from the name and the previous
overlap with src/marketplace_asset_mirror.py is gone. Six call-site
imports updated in lockstep; CHANGELOG references under [Unreleased]
updated to track the new path.
#11: consolidate the URL builders that resolve
/api/marketplace/curated/<slug>/<plugin>/{asset,doc,mirrored}/...
paths. _internal_asset_url / _internal_doc_url / _mirrored_asset_url
lived in src/marketplace.py, while a copy named _mirrored_url lived
in app/api/marketplace.py with a 'must stay aligned' comment. New
module src/marketplace_urls.py is the single source of truth — both
call sites import from it and a future URL-format tweak only needs
to change one file. The _ROUTE_PREFIX constant collapses the per-
function f-string repetition. The route-handler endpoints themselves
still own the path string literals (keeping the builders identical
to the route declarations remains a checklist item, not a runtime
guarantee).
* Re-key asset-mirror manifest by (plugin, url) + dedup HTTP fetches
The manifest used to be keyed by URL alone, so two plugins in the
same marketplace referencing the same external image (a shared CDN
icon, a common cover) collided on entry.plugin_name — last writer
won. The DB row for the losing plugin then stored a served URL
pointing under the winning plugin's tree, and require_resource_access
denied legitimate access on one side and let the other plugin's user
reach the wrong asset.
In-memory: Dict[Tuple[str, str], MirrorEntry] keyed (plugin_name, url).
On disk: format flips from {url: entry} dict to [entry, ...] list of
self-describing entries (each carries plugin_name + url + the
previous fields). JSON keys can't be tuples; encoding 'plugin::url'
would just shift the parsing burden.
Phase 1 of sync_assets deduplicates fetches by URL — three plugins
sharing one URL share one HTTP request. The conditional-GET prior is
picked from any owning plugin's prior entry; if their etags diverge
(rare) we miss one 304 and pay for a full re-download instead.
Phase 2 still creates a per-(plugin, url) manifest entry pointing
under the plugin's own subdir, and Phase 3 cleanup is keyed the same
way so dropping a URL from one plugin's metadata doesn't disturb
another plugin still referencing it.
Body files stay per plugin (RBAC-clean isolation: deleting plugin A's
cache can't strand plugin B). The fetch dedup saves bandwidth, not disk.
Consumer code re-keyed: src.marketplace._refresh_plugin_cache rebuilt
served_url_for / mirror_status as composite-keyed maps;
app.api.marketplace._resolve_external_via_mirror /
_curated_inner_cover / _curated_inner_enrichment look up by
(plugin_name, url).
Tests cover per-plugin manifest entries with shared URL, the single
HTTP fetch for N plugins, and Phase 3 drop-one-keep-other. All
existing tests migrated to composite key access; v2 list format
assertions verify on-disk shape.
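A minimal sketch of the composite-key manifest and the per-URL fetch grouping, assuming a simplified MirrorEntry. The `relpath` and `etag` fields and the helper names are hypothetical; only the (plugin_name, url) key and the list-of-entries on-disk shape come from the description above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class MirrorEntry:
    plugin_name: str
    url: str
    relpath: str      # hypothetical: body location under the plugin's own subdir
    etag: str = ""    # hypothetical: validator used for conditional GET


Manifest = Dict[Tuple[str, str], MirrorEntry]


def dump_manifest(manifest: Manifest) -> str:
    # On disk: a list of self-describing entries, since JSON keys can't be tuples.
    return json.dumps([asdict(entry) for entry in manifest.values()])


def load_manifest(text: str) -> Manifest:
    # In memory: re-key by (plugin_name, url).
    return {(e["plugin_name"], e["url"]): MirrorEntry(**e) for e in json.loads(text)}


def owners_by_url(wanted: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group owning plugins per URL so each distinct URL is fetched once."""
    owners: Dict[str, List[str]] = {}
    for plugin_name, url in wanted:
        owners.setdefault(url, []).append(plugin_name)
    return owners
```

One fetch per key of `owners_by_url(...)` then fans out into one manifest entry per owning plugin.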
* Migrate asset mirror from urllib.request to httpx
The asset mirror was the only HTTP call site in Agnes still using
urllib.request; every other module (CLI, Jira / OpenMetadata / OpenAI
connectors, scheduler, Telegram bot) already used httpx. The asset
mirror was added in this PR's base commit, so this is the only chance
to bring it into convention before someone copies it as 'the pattern
for HTTP fetches in Agnes'.
Three concrete benefits beyond consistency:
* SSRF defence collapses from five urllib classes
(_PinnedHTTPConnection, _PinnedHTTPSConnection, _PinnedHTTPHandler,
_PinnedHTTPSHandler, _SafeRedirectHandler) into one
_SSRFGuardTransport. httpx invokes handle_request() on every redirect
hop, so re-validation is free — we don't need a custom redirect
handler at all.
* DNS-rebinding defence: the transport rewrites request.url.host to the
SSRF-validated IP before delegating to super().handle_request().
httpcore connects to whatever URL.host says, so this pins the
connection without subclassing HTTPSConnection. The original hostname
goes into the Host header + the sni_hostname extension so TLS / vhost
routing still bind to the curator-supplied hostname.
* Error handling: one httpx.HTTPError catch-all for transport errors,
plus specific httpx.TimeoutException / httpx.TooManyRedirects branches
for clearer diagnostics. Matches the _translate_transport_error shape
in cli/client.py.
The shared httpx.Client is built lazily on first use (same pattern as
cli/client.py:_get_shared_client) with follow_redirects=True,
max_redirects=5, timeout=HTTP_TIMEOUT_SEC, and our custom transport.
Externally observable behaviour is unchanged: same FetchOutcome
statuses, same manifest format, same conditional GET semantics, same
body-size cap.
Tests migrated from urllib-shaped fakes to httpx-shaped (status_code,
iter_bytes, context manager). Of the five urllib-specific tests, four
are replaced with httpx equivalents (three transport unit tests + one
DNS-rebinding integration test that verifies host rewrite via
monkey-patched super().handle_request) and one is deleted without
replacement (unwrap-URLError-wrapping-an-_UnsafeRedirectError, which is
urllib-specific and not applicable to httpx).
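The resolve, validate, and rewrite step behind the DNS-rebinding defence can be sketched with the standard library alone. This is not the `_SSRFGuardTransport` code: `pin_target`, `UnsafeTargetError`, and the injectable `resolver` are hypothetical names, and the "reject anything that is not a global address" policy is an assumption about what the SSRF validation checks.

```python
import ipaddress
import socket
from typing import Callable, List, Tuple
from urllib.parse import urlsplit, urlunsplit


class UnsafeTargetError(Exception):
    """Raised when a URL must not be fetched."""


def _default_resolver(host: str) -> List[str]:
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def pin_target(url: str,
               resolver: Callable[[str], List[str]] = _default_resolver) -> Tuple[str, str]:
    """Return (url_with_ip_host, original_hostname) once every resolved address is validated."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme not in ("http", "https") or not host:
        raise UnsafeTargetError(f"unsupported URL: {url!r}")
    addresses = resolver(host)
    if not addresses:
        raise UnsafeTargetError(f"no addresses for {host!r}")
    for addr in addresses:
        if not ipaddress.ip_address(addr).is_global:
            raise UnsafeTargetError(f"{host!r} resolves to non-public address {addr}")
    ip = ipaddress.ip_address(addresses[0])
    ip_host = f"[{ip}]" if ip.version == 6 else str(ip)
    # Rebuild the netloc from the validated IP (any userinfo is dropped).
    netloc = ip_host if parts.port is None else f"{ip_host}:{parts.port}"
    pinned = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return pinned, host
```

A transport would connect to the pinned URL and carry the returned hostname in the Host header and the sni_hostname extension, as described above. Running the same check on every redirect hop is what makes a separate redirect handler unnecessary.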
* Surface curated agnes-metadata enrichment on My Stack tab
GET /api/marketplace/items?tab=my built each curated row from the
on-disk marketplace.json by way of resolve_allowed_plugins, which
doesn't carry the agnes-metadata enrichment columns
(cover_photo_url, video_url, category override, doc_links). The
handler then hard-coded cover_photo_url=None on the synthetic row.
Result: once a user clicked '+ Add to my stack' on a curated card,
the same plugin in tab=my rendered with the gradient placeholder
instead of its cover photo — confusing parity break vs. the curated
tab where the same row goes through MarketplacePluginsRepository
and gets the enriched columns.
Pre-load the enriched marketplace_plugins rows for every marketplace
the user is subscribed to, then look each granted+subscribed plugin
up by (marketplace_id, plugin_name). Fall back to the on-disk
synthetic shape only when the DB row is missing — happens during
the rare race where RBAC is granted before the first sync cycle
ingests the plugin. RBAC gating (granted set from
resolve_allowed_plugins) is unchanged so this fix can't widen
visibility; it just upgrades the data shape behind cards the user
was already going to see.
Per-marketplace list_for_marketplace beats N gets: a typical user is
subscribed to <5 marketplaces, so this is at most a handful of
queries vs. one per subscribed plugin.
Regression test seeds a plugin with cover_photo_url + category
override, subscribes the user, hits /api/marketplace/items?tab=my,
and asserts photo_url + category come through. The misleading
'fall through to gradient until the user re-visits the curated tab'
comment is gone.
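The pre-load and fallback described above might look roughly like this. `build_my_stack_rows` and its callable parameters are hypothetical stand-ins for the handler, the repository's list_for_marketplace, and the on-disk synthetic row builder; the row shape is simplified to a dict.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]


def build_my_stack_rows(
    granted_and_subscribed: Iterable[Tuple[str, str]],
    list_for_marketplace: Callable[[str], List[Row]],
    synthetic_row: Callable[[str, str], Row],
) -> List[Row]:
    """Upgrade each (marketplace_id, plugin_name) to its enriched DB row when one exists."""
    keys = list(granted_and_subscribed)
    enriched: Dict[Tuple[str, str], Row] = {}
    # One query per marketplace, not one per plugin.
    for marketplace_id in dict.fromkeys(m for m, _ in keys):
        for row in list_for_marketplace(marketplace_id):
            enriched[(marketplace_id, str(row["plugin_name"]))] = row
    # Fall back to the synthetic shape only when the DB row is missing.
    return [enriched.get(key) or synthetic_row(*key) for key in keys]
```

The input set is still whatever RBAC and subscriptions produced, so the lookup can only change the shape of rows, not which rows appear.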
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
* feat(store): flea-market upload guardrails + soft delete + JOIN-based admin queue
Adds an end-to-end guardrails pipeline for store uploads (manifest +
static-security + LLM review), persists blocked bundles for forensics,
introduces soft-delete (Archive) semantics, consolidates the legacy
/store/{id} surface into /marketplace/flea/{id}, and reworks the admin
queue so lifecycle filters read live entity visibility via LEFT JOIN
rather than a denormalized submission column.
Schema v29 → v35:
* v29 store_submissions table + store_entities.visibility_status
* v30 file_size, bundle_sha256, bundle_purged_at on submissions
* v31 reshape store_submissions (drop legacy unique on entity_id)
* v32 store_entities.archived_at/by + 'archived' visibility value
* v33 drop store_submissions.retry_count (unused)
* v34 ensure idx_store_submissions_entity exists post column-drop
* v35 broaden visibility_status enum + JOIN architecture cutover
Pipeline (src/store_guardrails/):
* Inline checks: manifest_check, static_scan, quality_check
* LLM review configurable haiku|sonnet|opus (default haiku)
* BackgroundTasks-driven async path with structured-output JSON
* Per-submitter daily quota (default 50)
* 30-day TTL purge job (POST /api/admin/run-blocked-purge)
* Bundle SHA256 + size persisted; sha256 survives purge for forensics
Visibility model:
* pending | approved | hidden | archived
* _enforce_visibility returns 404 (no leak) for non-owner non-admin
* Owner sees own non-approved entries via include_owner_id widening
* Install refused with 409 entity_not_approved when not approved
Soft-delete (DELETE /api/store/entities/{id}):
* Default = soft (visibility_status='archived'); existing installs
keep getting served the bundle so users don't lose the plugin
* ?hard=true admin-only: drops bundle + cascades user_store_installs
* Hard-delete preserves entity_id on submission as tombstone so
audit_log linkage survives for the activity timeline
Admin queue lifecycle (the JOIN refactor):
* Verdict (store_submissions.status) is immutable forensic record
* Lifecycle (store_entities.visibility_status) is live state
* /admin/store/submissions Archived chip translates to
`e.visibility_status='archived'` via LEFT JOIN — any path that
flips visibility surfaces in the queue immediately
* Detail page renders Status (verdict) and Entity lifecycle side by
side so admins see "approved at review, now archived" at a glance
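To illustrate the verdict/lifecycle split, here is a small sketch using sqlite3 in place of the project's DuckDB. The `id` columns and the sample rows are assumptions; only store_submissions.status, store_submissions.entity_id, and store_entities.visibility_status come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store_entities (id TEXT PRIMARY KEY, visibility_status TEXT NOT NULL);
CREATE TABLE store_submissions (id TEXT PRIMARY KEY, entity_id TEXT, status TEXT NOT NULL);
INSERT INTO store_entities VALUES ('e1', 'approved'), ('e2', 'archived');
INSERT INTO store_submissions VALUES
    ('s1', 'e1', 'approved'),
    ('s2', 'e2', 'approved'),       -- approved at review, archived later
    ('s3', 'e-gone', 'blocked_llm'); -- tombstone: entity row no longer exists
""")

QUEUE_SQL = """
SELECT s.id, s.status, e.visibility_status
FROM store_submissions AS s
LEFT JOIN store_entities AS e ON e.id = s.entity_id
{where}
ORDER BY s.id
"""

# Unfiltered queue: every submission appears, even without a live entity.
all_rows = conn.execute(QUEUE_SQL.format(where="")).fetchall()

# "Archived" chip: filter on the live lifecycle column, not on the verdict.
archived_rows = conn.execute(
    QUEUE_SQL.format(where="WHERE e.visibility_status = 'archived'")
).fetchall()
```

Because the filter reads the entity table at query time, any code path that flips visibility_status shows up in the queue without touching the submission row.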
URL consolidation:
* /store/{id} deleted (no redirect, stale bookmarks 404)
* /marketplace/flea/{id} is the canonical detail surface
* Three in-tree callers (upload-success, my-stack card, store
listing card) updated to point at the new URL
* Quarantine banner extracted to _quarantine_banner.html partial,
self-guarded, included from both flea detail templates
* Banner JS auto-refreshes when the verdict lands by polling
/api/marketplace/flea/{id}/detail (visibility_status +
submission_status — the latter is needed because blocked_llm
keeps the entity at visibility_status='pending')
Audit log resource format:
* runner.py emits prefixed `store_submission:{id}` (post-fix)
* Detail-page timeline query handles three patterns: prefixed
submission, helper-emitted `store_entity:{sub_id}`, and bare-id
legacy rows — all surface in the activity timeline
UX fixes:
* Owner sees Under review / Quarantined / Hidden banner with status
* Install button gray-disabled (not blue) when non-approved
* Owner cannot delete quarantined entries (403); admin can
* Admin queue: filter chips, sortable columns, paging, page-size
* Auto-refresh queue every 5s while pending rows are visible
* Store upload page file picker no longer opens twice (label →
input default action collided with explicit JS handler)
Tests: 168 passed across the guardrails suites (admin submissions,
store API, inline / LLM / purge guardrails, store repositories,
marketplace filter, schema version). New regression coverage
includes: archive surfaces via JOIN even when API path is bypassed;
deleted submission renders activity timeline (tombstone); flea
detail surfaces submission_status only for owner/admin; detail page
renders Entity lifecycle row; audit log resource format covers both
helper and runner paths.
* fix(store-guardrails): PR #233 follow-up — prompt injection, atomic PUT, BG race, schema, reaper, sort whitelist
Addresses 11 of the 23 findings from the PR #233 review (spec at
docs/superpowers/specs/2026-05-09-pr233-guardrails-fixes-spec.md):
merge-gate items #1-#4 and #6 (AST part deferred) plus high-value
mediums #7, #9-#12, #23. The #5 quota race, architectural items
(#8 enum split, #14 factory), #13 import bundle docs, and pure
maintainability (#15-#22) are deferred to follow-ups.
Security:
* #1 prompt injection — SYSTEM_PROMPT now passed via the SDK's
dedicated system= parameter; bundle wrapped in <bundle>...</bundle>
sentinels declared data-only by the system prompt; literal
sentinel strings in user content are escaped so an adversarial
README can't forge a close tag.
* #6 static scan honesty — module docstring + admin copy + docs
declare static scan as signal not gate; .md/.txt/.rst/.html/.json/
.yaml/.yml/.toml skipped to avoid false positives on prose.
AST mode for Python deferred (separate flag, FP comparison work).
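The sentinel handling in #1 can be sketched as below. The escape scheme (square-bracket substitution) and the function names are assumptions for illustration; the actual escaping used by the pipeline may differ. Only literal, exact-case sentinel strings are handled here.

```python
OPEN_TAG = "<bundle>"
CLOSE_TAG = "</bundle>"


def escape_sentinels(untrusted: str) -> str:
    """Neutralise literal sentinel strings so bundle content can't forge a tag."""
    return untrusted.replace(CLOSE_TAG, "[/bundle]").replace(OPEN_TAG, "[bundle]")


def wrap_bundle(untrusted: str) -> str:
    """Wrap untrusted bundle text in exactly one pair of sentinels."""
    return f"{OPEN_TAG}\n{escape_sentinels(untrusted)}\n{CLOSE_TAG}"
```

The wrapped text goes in the user message, while the instructions that declare `<bundle>` content data-only travel separately in the system prompt.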
Correctness:
* #2 PUT atomicity — bundles bake into plugin.staging-<rand>/
alongside live, atomic-rename on success; failed checks leave
live tree byte-for-byte intact.
* #3 BG-task race — set_visibility_if_pending guards verdict flips
to the (pending, hidden) review window; admin archives during
review survive; skipped flips audit-logged.
* #4 v35 NOT NULL/DEFAULT — schema v35→v36 re-applies them on
store_entities.visibility_status. CHECK constraint enforced
application-side (DuckDB ADD CHECK on existing column unsupported).
* #7 stuck-review reaper — reap_stuck_llm_reviews flips pending_llm
rows older than guardrails.stuck_review_grace_seconds (default
1800) to review_error. Scheduler runs every 15 min via new
/api/admin/run-reap-stuck-reviews. Set knob to 0 to disable.
* #9 quota counter — count_blocked_for_submitter_since now counts
blocked_inline + blocked_llm + review_error so a submitter
triggering only LLM-blocked verdicts is bounded.
* #10 missing risk_level — surfaces as review_error with
error='missing_risk_level' instead of silently defaulting to
'medium' (which looked like a model-decided block).
* #11 archived_at clear — set_visibility nulls archived_at +
archived_by when transitioning out of 'archived' so a future
read doesn't show stale archive forensics on an approved row.
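The #3 guard is essentially a compare-and-set on the lifecycle column. A minimal sketch with sqlite3 standing in for DuckDB follows; the signature and the `id` column are assumptions, and the audit logging of skipped flips is left to the caller.

```python
import sqlite3

REVIEW_WINDOW = ("pending", "hidden")


def set_visibility_if_pending(conn: sqlite3.Connection, entity_id: str, new_status: str) -> bool:
    """Apply the verdict only while the entity is still in the review window.

    Returns False when the flip was skipped (e.g. an admin archived the entity
    while the background review was running).
    """
    cur = conn.execute(
        "UPDATE store_entities SET visibility_status = ? "
        "WHERE id = ? AND visibility_status IN (?, ?)",
        (new_status, entity_id, *REVIEW_WINDOW),
    )
    conn.commit()
    return cur.rowcount == 1
```

Putting the state check in the WHERE clause makes the check and the write a single statement, so a concurrent archive cannot slip in between them.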
Maintainability:
* #12 FSM doc comment — accurate insert/transition/lifecycle
description in src/db.py near store_submissions schema.
* #23 sort-key whitelist — admin queue rejects unknown sort keys
with 400 invalid_sort_key; substring-replace footgun removed.
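A sort-key whitelist like the one in #23 can be as small as a dict lookup. The keys, column expressions, and exception name below are hypothetical; the handler would translate the exception into the 400 invalid_sort_key response.

```python
class InvalidSortKey(ValueError):
    """Raised for sort keys outside the whitelist."""


# Hypothetical mapping from public sort keys to fixed SQL column expressions.
_SORT_COLUMNS = {
    "created_at": "s.created_at",
    "status": "s.status",
    "lifecycle": "e.visibility_status",
}


def order_by_clause(sort_key: str, descending: bool = False) -> str:
    """Build ORDER BY from whitelisted columns only; user input never reaches the SQL text."""
    try:
        column = _SORT_COLUMNS[sort_key]
    except KeyError:
        raise InvalidSortKey("invalid_sort_key") from None
    return f"ORDER BY {column} {'DESC' if descending else 'ASC'}"
```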
Deferred (separate PRs):
* #5 quota race — proper fix requires asyncio.Lock spanning the
full pipeline; threading.Lock blocks event loop, DuckDB MVCC
doesn't help. API-level slowapi bounds worst case for now.
* #6 part 3 (AST static scan), #8 (enum split), #13 (import
bundle docs), #14 (factory consolidation), #15-#22 (maint).
Tests:
* New: tests/test_store_guardrails_prompt_injection.py (corpus +
trust-boundary invariants), tests/test_store_put_atomic.py,
tests/test_store_guardrails_reaper.py.
* Extended: test_store_guardrails_llm.py (system param, missing
risk_level, BG race), test_admin_store_submissions.py (quota
counter widening, sort whitelist 400), test_store_repositories.py
(un-archive metadata clear), test_db_schema_version.py (v36).
* Full suite: 3738 passed; 17 pre-existing baseline failures
unchanged (db migration tests, cli binary rename, catalog export,
user mgmt v5 backfill — confirmed by stash + rerun on clean tree).
* Add /marketplace browse page + Model B opt-in stack composition
New /marketplace browse surface unifies the curated marketplaces
(admin-managed git mirrors) and the community Flea Market behind
three tabs — Curated / Flea / My Stack — with per-tab category
filter, search across both sources with scope checkboxes, and
numeric pagination, all driven by URL query state. Plugin detail
at /marketplace/curated/<slug>/<plugin> and /marketplace/flea/<id>;
nested skill / agent detail at /marketplace/curated/<slug>/<plugin>/
{skill,agent}/<name> and the flea-side single-page detail.
Model B opt-in: an RBAC grant on a curated plugin is now only
*eligibility*. The user must click "Add to my stack" for it to
enter their served Claude Code marketplace. Composition flips
from (rbac ∖ opt_outs) ∪ store_installs to
(rbac ∩ subscriptions) ∪ store_installs. The legacy
user_plugin_optouts table is renamed user_curated_subscriptions
(schema v27) — same table shape, inverted semantic, repository
methods become subscribe / unsubscribe / is_subscribed.
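The composition flip is easiest to see with plain sets. The function names and sample plugin names are illustrative only; the two formulas are the ones stated above.

```python
from typing import Set


def served_legacy(rbac: Set[str], opt_outs: Set[str], store_installs: Set[str]) -> Set[str]:
    # Old model: everything granted is served unless the user opted out.
    return (rbac - opt_outs) | store_installs


def served_model_b(rbac: Set[str], subscriptions: Set[str], store_installs: Set[str]) -> Set[str]:
    # Model B: a grant is only eligibility; the user must also subscribe.
    return (rbac & subscriptions) | store_installs


rbac = {"reports", "sql-helper", "jira-sync"}
installs = {"flea-linter"}

legacy = served_legacy(rbac, opt_outs=set(), store_installs=installs)
fresh = served_model_b(rbac, subscriptions=set(), store_installs=installs)
opted_in = served_model_b(rbac, subscriptions={"reports", "not-granted"}, store_installs=installs)
```

With no rows in either table, the legacy model serves every granted plugin while Model B serves none of them, and a subscription to a plugin the user has no grant for has no effect.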
UX vocabulary: Install → Add to my stack, Installed → In your
stack, card "Installed" badge → "In stack" (amber pill), tab
"My Subscriptions" → "My Stack". Bridges the two-step model
(server-side bookmark vs. on-laptop install) the previous label
hid. Click triggers an inline post-add hint panel under the
description with the agnes refresh-marketplace recipe + Copy
chip, dismissible per-browser via localStorage.
Per-tab info blocks above the filter row:
- Curated: trust signal — "Each plugin here has a named curator
accountable for it." (blue accent + See-all-curators link)
- Flea: open-shelf signal — "Anyone in the company can upload
here." (purple accent + Tips-for-sharing link)
- My Stack: personal-shelf orientation — "Your AI stack —
everything you've added." (slate accent, no link)
Tabs carry per-tab Heroicons (shield-check / building-storefront
/ rectangle-stack) tinted to match each tab's accent; the icon flips
white when its tab is active, for contrast.
Hero illustration anchored to the right of the blue hero panel
(absolute, 47% wide, behind the search row content). Hidden
under 900px viewport.
Action-row CTAs realigned to publication intent: curated
"How to add new content" → "Submit a plugin" (links to the
guide page); flea button removed since +Upload sits next to it.
Empty-state CTAs match. /marketplace/guide/{curated,flea}
routes now host publication-flow guide pages with placeholder
ledes — full copy to be authored separately.
Categories: Heroicons-based icons mapped per category in
src/category_icons.py (zero new dependencies; SVG path strings
inlined). Marketplace cards, filter pills, and detail pages
read from the same source.
API endpoints under /api/marketplace:
- GET /items per-tab listing (curated / flea / my)
- GET /categories per-tab non-zero counts
- GET /curated/{slug}/{plugin} plugin detail
- POST/DELETE /curated/{slug}/{plugin}/install subscribe toggle
- GET /curated/{slug}/{plugin}/{skill,agent}/{name} inner item
The tab=my branch reads directly from
user_curated_subscriptions ∪ user_store_installs (not
resolve_user_marketplace, which bundles flea skills/agents into
a single store-bundle synthetic entry useful for serving the
Claude Code marketplace ZIP/git but wrong for browsing where
each item should appear as its own card).
Detail pages: plugin detail surfaces inner skills/agents as
clickable nested cards; commands/hooks/MCPs render as plain
name lists. Skill/agent detail mirrors the plugin layout with
kind-tinted accents (skill = green, agent = purple), Description
+ Details sidebar, Files + Docs sections, and the "How to call
it" copy-able invocation chip showing /<plugin>:<inner-name>
exactly as Claude Code namespaces it post-install. Curated
nested has no install button — links back to the parent plugin.
Navbar: standalone "My AI Stack" relabelled "My Stack" and
points at /marketplace?tab=my; "Store" link removed (Store
flow is reachable via the Flea Market tab's +Upload button).
The standalone /my-ai-stack and /store routes still work for
old bookmarks.
Tests cover the new browse / categories / install / RBAC paths
under tests/test_marketplace_api.py; existing marketplace and
store tests updated for Model B (explicit subscribe in fixtures).
Schema bumped v26 → v27 with idempotent migration that wipes
existing user_plugin_optouts rows on flip and adds
marketplace_plugins.created_at with registered_at backfill.
* Fix v28 migration + post-rebase test fallout
v28 ALTER TABLE marketplace_plugins ADD COLUMN created_at conflicted with
_SYSTEM_SCHEMA's earlier CREATE that already includes the column on fresh
installs (test fixtures starting at any pre-v28 version trip on it).
Switch to ADD COLUMN IF NOT EXISTS — same idiom as the upstream v27
Keboola sync-strategy migration on the same ladder.
Two test patches needed after the rebase bumped SCHEMA_VERSION 27 → 28:
- test_keboola_v27_migration.py: test_schema_version_constant_is_27 was
pinning ==27. Loosened to >=27 (the test's purpose is to verify the
v27 Keboola migration, not to pin the current SCHEMA_VERSION).
- test_setup_page_unified.py: was monkeypatching resolve_allowed_plugins
but compute_default_agent_prompt now reads from resolve_user_marketplace
(Model B-aware). Stub the right function so the test exercises the
v28 served-set path.
* Harden curated skill/agent inner endpoints against path traversal
`_read_inner`, the `skill_dir` walk in `curated_skill_detail`, and the
`agent_path.stat` in `curated_agent_detail` joined URL path-params onto
`plugin_root` without verifying the resolved candidate stayed inside it.
Starlette's `[^/]+` on `{skill_name}` / `{agent_name}` blocks the direct
URL exploit (encoded `/` 404s before the handler), but a curator-planted
symlink inside a curated marketplace's git mirror could still dereference
outside the plugin tree on read.
Adds `_safe_join(plugin_root, *parts)` doing
`Path.resolve(strict=True)` + `relative_to(plugin_root.resolve())`, used
by all three call sites so the boundary is enforced once and consistently.
Tests cover the helper directly (normal path resolves, escaping `..`
returns None, escaping symlink returns None, missing file returns None)
plus an end-to-end check that the symlink case actually 404s on the
HTTP endpoint. Symlink tests skip on Windows where symlink creation
needs elevated permissions; they run on Linux CI.
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>