Commit graph

5 commits

Author SHA1 Message Date
ZdenekSrotyr
1b0329e8c5
UI design system unification — one stylesheet, canonical primitives, nav fix (#284)
* docs(plan): design-system unification plan (post-review revisions)

Plan covers consolidating two CSS files into one, introducing
canonical primitives (.btn family, .search-input, .filter-bar,
.page-header, .data-table, .empty-state, .toast, .stat-card,
.tab-strip), unifying the top-nav Admin trigger with sibling
links, and migrating 41 templates that today carry inline
<style> blocks.

Post-review revisions: nav fix moved to first commit (user
complaint lands first); sticky-header and dark-mode skeleton
tasks dropped (defer to follow-up PRs); contract test class
detection tokenizes class="..." attributes properly; baseline
screenshot loop added to Task 0; vendor-token grep widened.

* fix(nav): unify Admin trigger with sibling nav links

The top-nav Admin entry is a <button class="app-nav-link
app-nav-menu-trigger">, siblings are <a class="app-nav-link">.
.app-nav-menu-trigger used to override .app-nav-link with
"color: inherit; font: inherit", resetting font-size from 13px
back to body default and color from --text-secondary to body
color. Active state diverged too: .is-active on links used
--primary blue, [aria-expanded=true] on the button used
--border-light grey.

Fix: expand .app-nav-link so it covers <button>-element resets
(font-family: inherit, border: 0, background: transparent,
cursor: pointer, display: inline-flex for chevron alignment).
Add [aria-expanded="true"] as another active-state selector
so the dropdown's open state highlights identically to .is-active
on links. Delete the now-redundant .app-nav-menu-trigger rules
that stripped button chrome.

Extract the inline <script> from _app_header.html into a new
app/web/static/app.js (loaded by base.html only — base_login.html
has no nav). Sets up window.appUI.wireDropdown for both the user
menu and the Admin dropdown via DOMContentLoaded.

* style(css): consolidate style.css into style-custom.css + add cache-bust

One stylesheet for the whole web UI:
- style.css (1086 lines, legacy Google-inspired tokens + components)
  absorbed into style-custom.css under a labeled block, placed after
  the modern :root + body so style-custom's component rules continue
  to override the legacy ones (preserves the original cascade order
  that came from loading style.css first).
- style.css deleted; <link> dropped from base.html + base_login.html.
- static_url() now appends ?v=<mtime> to /static/<path>. Cheap
  per-request os.stat — auto-invalidates browser + proxy caches on
  redeploy without operator intervention. Mtime survives across
  uvicorn restarts as long as the file content is unchanged.

Legacy classes (.btn, .card, .login-*, .badge, .code-block, .flash,
.form-group, .username-box, .btn-copy, .auth-tabs, .divider, etc.)
still render — they live in style-custom.css now. Login pages,
error page, password setup, and the dashboard's Claude Code Setup
card all kept working in browser smoke.

* test(design): contract test for design-system invariants

7 structural invariants enforced from this commit onwards:
- style.css must stay deleted
- no template links style.css via static_url
- exactly one bare :root block in style-custom.css
- canonical primitives declared (.btn, .btn-primary, .search-input,
  .filter-bar, .page-header, .data-table, .empty-state, .toast, …)
- no deprecated class names in templates (.users-table, .gp-table,
  .marketplaces-table, .audit-table, .users-search, .marketplaces-search,
  .modal-btn, .btn-primary-v2, …)
- app.js loaded by base.html, NOT by base_login.html
- 3 helper-level unit tests for the class-attribute tokenizer
  (multi-line attrs, Jinja-conditional fragments, false-positive prose)

Two of the assertions intentionally start FAILING after this commit
(missing primitives + legacy class refs in 7 admin templates) and
will turn green as Tasks 4–7 add primitives and Tasks 8–15 migrate
the templates.

* feat(css): canonical button family + legacy token aliases

Adds at top of :root: legacy token aliases (--bg, --card-bg, --text,
--text-light, --secondary, --radius) pointing at modern equivalents.
Absorbed style.css rules referenced these names; without aliases
they fell back to 'unset'. Aliases live until Task 16 alongside
their absorbed rules.

Appends canonical .btn variants at end of file (last cascade):
  .btn-primary + .btn-primary-v2 + .modal-btn.primary (alias group)
  .btn-secondary + .btn-secondary-v2 + .modal-btn:not(.primary):not(.danger)
  .btn-ghost + .btn-ghost-v2
  .btn-danger + .modal-btn.danger
  .btn-lg
  .btn:disabled + .btn:focus-visible (focus ring via --focus-ring)

Existing absorbed .btn, .btn-primary, .btn-secondary, .btn-sm rules
remain — the canonical block adds the missing variants + selector-list
aliases so .modal-btn and v2 markup keep rendering until migration
tasks swap them out.

Contract test: .btn-danger now declared (one less missing primitive).
Browser smoke: /admin/tokens hero + filter pills + empty state render
correctly with the absorbed style.css rules now backed by real tokens.

* feat(css): form-control primitives — .search-input + .filter-bar + .filter-pill + .form-input

Canonical filter bar shape: 36px-height inputs (matches button height
for vertical rhythm), 28px pills with .is-active state, consistent
focus ring via --focus-ring token.

Selector-list aliases for legacy per-page classes:
- .users-search / .marketplaces-search / .kb-search → .search-input
- .filters-card → .filter-bar
- .pill[aria-pressed="true"] also matches the .filter-pill active state

.form-input added as a sibling of .search-input for forms — same
baseline height + radius + focus treatment, with textarea.form-input
auto-sizing to min 96px and using the mono font (matches CSV/SQL
pasted-snippet patterns on /admin/agent-prompt + /admin/workspace-prompt).

Contract test: .search-input + .filter-bar + .filter-pill now declared.

* feat(css): .page-header primitive + variants + .tab-strip

Canonical page-header pattern with title (22px) + optional subtitle +
optional eyebrow + right-aligned actions slot. Two modifiers:
- .page-header--hero: gradient background (primary→primary-dark),
  28px white title, semi-transparent subtitle/eyebrow. For
  /marketplace, /store, /profile-style pages that already use this
  layout via per-page inline <style>. Migration tasks delete the
  duplicated rules.
- .page-header--compact: 18px title for dense admin index pages.

.tab-strip + .tab-strip__item — the secondary tab row pattern used by
/marketplace?tab=flea and similar. .is-active / [aria-selected=true]
both flip the active treatment (primary color + bottom border).

Contract test: .page-header / __title / __subtitle / __actions all
now declared (4 fewer missing primitives).

* feat(css+js): .data-table + .empty-state + .toast + .stat-card primitives

Last primitive batch. All 8 canonical-primitives invariants in
test_design_system_contract.py now green; only the template-migration
test fails (expected — Tasks 8–15).

.data-table (+ --compact modifier): selector-list aliases for legacy
per-page table classes (.users-table, .gp-table, .marketplaces-table,
.audit-table) so existing markup keeps rendering until migration.
Compact modifier shrinks padding + font for dense lists (audit log).

.empty-state with __icon / __title / __description / __actions —
replaces the ad-hoc 'no results' rendering scattered across pages
(corporate_memory, admin_users, admin_marketplaces, etc.).

.toast / .toast-container — paired with window.appToast({kind, msg,
timeout}) appended to app.js. Bottom-right stacked, click-to-dismiss,
auto-dismiss after 4s by default. Kind 'success' / 'warning' / 'error'
/ 'info' shows a 3px colored left border.

.stat-card (+ --accent variant) + .stat-row grid — for the dashboard
metric tile row.

* style(templates): migrate 8 templates off deprecated class names

Mechanical class-attribute rewrite via tokenizer (preserves Jinja
conditionals + multi-line attrs):

  modal-btn primary    -> btn btn-primary
  modal-btn danger     -> btn btn-danger
  modal-btn            -> btn btn-secondary
  users-table          -> data-table
  gp-table             -> data-table
  marketplaces-table   -> data-table
  audit-table          -> data-table
  users-search         -> search-input
  marketplaces-search  -> search-input

8 templates touched: admin_groups, admin_marketplaces, admin_tokens,
admin_users, admin_welcome, admin_workspace_prompt, my_tokens,
corporate_memory_admin. 43 lines updated total.

Inline <style> blocks in these templates still define rules for the
old class names — those rules no longer match anything and become
dead code, removed in Task 16's alias cleanup along with the
selector-list aliases in style-custom.css.

Contract test (tests/test_design_system_contract.py) now fully green:
9/9 invariants enforced from this commit onward.

* feat(css): extend .data-table selector list to 13 more bespoke -table classes

Visual unification of remaining tables across the codebase without
per-template edits. The .data-table baseline rules (uppercase header
tracking, 12px padding, hover state, border-radius) now apply to:

  .ad-table / .ea-table / .md-table / .members-table /
  .obs-table / .overview-stats-table / .registry-table /
  .sample-table / .sched-table / .sess-table / .sub-table /
  .subs-table / .ud-table

These class names live in 12 templates (activity_center, admin_access,
admin_group_detail, admin_scheduler_runs, admin_sessions,
admin_store_submissions, admin_tables, admin_usage, admin_user_detail,
catalog, me_debug, profile_sessions) that have their own per-page
<style> blocks. Per-page rules with higher specificity still win for
their custom needs (column widths, etc.) — this commit only sets a
shared baseline so every table renders with the same chrome.

Contract test stays green: 9/9 invariants enforced.

* style(css): remove now-unused legacy class aliases

Phase A renamed 8 templates off these names; no markup references
them any more, so the selector-list memberships are dead weight.
Removed from style-custom.css:

  .btn-primary-v2 / .btn-secondary-v2 / .btn-ghost-v2
  .modal-btn / .modal-btn.primary / .modal-btn.danger /
  .modal-btn:not(.primary):not(.danger)
  .users-search / .marketplaces-search / .kb-search
  .users-table / .gp-table / .marketplaces-table / .audit-table
  .filters-card

37 lines smaller. Contract test catches any reintroduction.

KEPT aliases (still in untouched template markup):
- .pill (marketplace_plugin_detail.html, marketplace.html — these
  pages weren't part of Phase A's deprecated-class sweep; their
  own .pill CSS rules still apply)
- All .data-table family extensions (.ad-table, .ea-table, .md-table,
  .members-table, .obs-table, .overview-stats-table, .registry-table,
  .sample-table, .sched-table, .sess-table, .sub-table, .subs-table,
  .ud-table) — these still render data tables in 12 templates;
  selector-list aliasing keeps them visually unified with .data-table
  baseline.
- Legacy token aliases (--bg / --text / --text-light / --secondary /
  --card-bg / --radius) — still resolve absorbed style.css rules.

Templates' inline <style> blocks still contain dead rules for the
renamed classes (.users-search, .modal-btn, etc.); harmless but
bloat. Optional follow-up: a separate sweep can drop those.

* docs(changelog): design-system unification under [Unreleased]

* feat(css): unify page-shell width — .container baseline 1280px + modifiers

Inventory found 30+ unique max-width values across templates (280px
login → 1600px admin/tables). The legacy .container default was 800px,
which made every admin page set its own wider inline override —
30+ ad-hoc widths drifted as a result.

Canonical: .container max-width = var(--width-app) (1280px). Pages
that need a different shape opt in via modifiers:

  .container--narrow → var(--width-narrow)  (800px) — long-form text,
                                                     setup wizards
  .container--wide   → var(--width-wide)    (1400px) — admin lists,
                                                     marketplace grids
  .container--full   → max-width: none — hero / landing

Pages that already set a NARROWER inline max-width (setup, login flows
inside .login-card, etc.) still render at their narrower size — the
inline override beats the new canonical 1280px. The visible change
hits the ~20 admin pages currently rendering at 800px via the legacy
default, which jump to 1280px and pick up consistent breathing room.

Spacing also normalized: padding 24px 20px → var(--space-6) var(--space-5).

* fix(home+catalog): gut dashboard sections + remove confusing toggle + fix table count

Dashboard /home cleanup:
- Remove 'Your Data' card — Data Packages is already a top-nav entry,
  so duplicating data sources on the landing page just adds noise.
- Remove 'Account' card — group memberships + scripts + last sync
  belong on /profile, not on the welcome screen.
- Remove entire right-column (Corporate Memory + Activity Center
  widgets) — both surfaces have dedicated admin pages reachable from
  the Admin dropdown.
- Keep stats row (Tables/Columns/Rows/Data Size/Unstructured),
  env-setup-CTA, and Notifications card.

/catalog cleanup:
- Strip the 'Always included' badge + the locked toggle-switch from
  Core Business Data and Business Metrics cards. The toggle was
  always 'checked disabled' — it visually looked like a switch but
  could not be toggled, which was confusing. The 'Always included'
  copy itself was redundant once the toggle was gone. Agnes Internal
  already rendered without these, so the three cards are now visually
  consistent.

Catalog data_stats fix:
- 'total_tables' was len(sync_state) — counted only tables that had
  ever synced, so a 30-row table_registry with 0 ever synced rendered
  as '0 tables'. Switched to len(tables) — the registered
  business-data table list — so the count reflects what's actually
  available, not what's been touched.

* fix(home): real stat numbers + drop unstructured tile + cleanup dead CSS

Dashboard stats were hardcoded zeros (columns: 0, size_display:
'0 MB', unstructured_display: '0 MB') and the table counter pulled
from sync_state (synced) instead of table_registry (registered).
On a fresh deployment with 30 registered tables and 0 ever synced,
the page rendered '0 / 0 / 0 / 0 MB / 0 MB' — useless.

Now:
- Tables: COUNT(*) FROM table_registry WHERE source_type != 'internal'.
  Matches the /catalog Core Business Data counter.
- Columns: SUM(sync_state.columns). Zero only when nothing's synced yet.
- Rows: unchanged (SUM(sync_state.rows), already correct).
- Data Size: SUM(sync_state.file_size_bytes), human-formatted via
  inline _fmt_bytes helper (KB/MB/GB).
- Unstructured: tile dropped — was always '0 MB' and had no source.
- last_updated: now derived from sync_state max(last_sync), wasn't set
  before so the 'Synced …' tag never rendered.

Dashboard.html cleanup: ~725 lines of orphan inline <style> removed —
.section-title, .data-source*, .toggle-switch*, .catalog-cta*,
.memory-card / .memory-stat / .memory-description / .memory-footer
/ .btn-memory, .activity-card / .activity-stat / .activity-text
/ .btn-activity, .account-grid / .account-row / .account-scripts
/ .badge-role / .badge-group / .cron-line, .badge-included /
.badge-beta / .badge-demo. All matched markup deleted in the
previous commit; the CSS was dead code until now.

* ui(catalog): rename page heading 'Data Catalog' → 'Data Packages'

The top-nav entry says 'Data Packages' but the page itself said
'Data Catalog' — confusing two-name product. Aligns the heading and
<title> with the nav label. Subtitle trimmed too: 'manage your
subscriptions' was a vestige of the toggle UI that just got removed,
replaced with a one-liner describing what the page is for.

Two other 'Data Catalog' strings stay: they live inside the table-
profiler overlay JS and refer to an EXTERNAL catalog system (e.g.
OpenMetadata / Atlan) that an operator may link to per table — that
is a generic term for any external data-catalog product, not our
page name.

* fix(nav): dropdown clicks always work + mutual-exclusion close

Two bugs in the wireDropdown helper:

1. Clicking trigger B while trigger A's menu was open left both open.
   e.stopPropagation() in trigger.click prevented the document-click
   handler from firing, so trigger A's open menu had no way to learn
   that something else was clicked. Net effect: state diverged across
   the two dropdowns the more you clicked.

2. The target-vs-trigger equality check (e.target !== trigger) was
   strict. Clicking the chevron <svg> inside the button reports the
   svg or its <path> child as e.target — not the button — so removing
   stopPropagation alone would trip the close branch in the same
   click that just opened the panel.

Fix both at once: drop e.stopPropagation() AND switch the doc-handler
guard to trigger.contains(e.target). Now any click outside both the
trigger subtree and the panel subtree closes; any click on another
trigger closes via the OTHER dropdown's doc handler; clicks inside
the trigger (button OR svg child) are fully ignored by the doc
handler and only the trigger's own toggle handler fires.

* feat(ui): canonical blue-gradient hero on every admin page

The UI had a per-page hero pattern on ~10 onboarding/marketing pages
(admin_tokens / profile / install / setup_advanced / marketplace /
my_tokens / store_upload / home_*), each with its own ad-hoc CSS
(.tokens-hero, .profile-hero, .install-hero, .upload-hero, …). The
admin section's index + detail pages had plain H1/H2 with their own
.users-title / .gp-title / .obs-title / .cfg-title / … inline styling.
Net effect: half the app felt like a product, half felt like a
spreadsheet.

Now:
- .page-header--hero CSS upgraded to match the look analysts already
  liked from admin_tokens: 28px/32px/24px padding, 14px radius, soft
  primary-tinted box-shadow (0 4px 16px rgba(0,115,209,0.2)), 28px
  semibold title, optional uppercase eyebrow + 13.5px subtitle.
  Narrow-viewport breakpoint included.
- New _page_hero.html partial wraps the boilerplate. Usage:
    {% set page_hero_eyebrow  = "Users & Access" %}
    {% set page_hero_title    = "Users" %}
    {% set page_hero_subtitle = "…" %}
    {% include "_page_hero.html" %}
- 15 admin templates migrated to it: admin_users / admin_groups /
  admin_marketplaces / admin_access / admin_sessions /
  admin_session_detail / admin_store_submissions /
  admin_scheduler_runs / admin_usage / admin_user_detail /
  admin_welcome / admin_workspace_prompt / admin_server_config /
  activity_center / admin/news_editor. Each gets a grouped eyebrow
  (Users & Access / Data / Agent Experience / Activity Center /
  Server) matching the Admin dropdown sections so the page identity
  is unambiguous at a glance.

Legacy *-title H2/H1 + adjacent subtitle paragraphs deleted; their
per-page CSS rules are dead now (harmless, retire in a follow-up
sweep alongside other inline-style cleanup the reviewers flagged).

admin_tables.html intentionally NOT migrated — it's a standalone
HTML page that doesn't extend base.html; a separate refactor.

Test: test_admin_users_page_renders_for_admin assertion updated
from .users-title to .page-header__title + .page-header--hero (the
canonical pair). All other web/template tests stay green.

* refactor(ui): dedup _humanbytes, drop 267 lines of dead inline CSS

(1) _humanbytes consolidation:
- Add TB branch + optional precision param (default 2 preserves existing
  Store detail callers; dashboard uses precision=1 for headline tiles).
- Delete inline _fmt_bytes from dashboard handler — was a copy of
  _humanbytes with different rounding. One canonical helper now.

(2) Dead inline-CSS sweep across 17 migrated templates:
- Conservative regex: a CSS rule is deleted only when its primary class
  matches one of the known-dead names AND that name is NOT referenced
  from any class= attribute in the same file's markup.
- Per-file 'in-use' guard saved several false positives that the deny
  list would have nuked (e.g. .users-toolbar, .gp-search, .obs-subtitle,
  .marketplaces-toolbar are still in use; only .users-table, .users-search,
  .users-title, .modal-btn, etc. that have NO markup left went away).
- Removed: -267 lines across admin_users (-42), admin_marketplaces (-45),
  admin_groups (-31), my_tokens (-38), admin_tokens (-29), admin_access
  (-9), admin_user_detail (-6), admin_welcome (-8), admin_workspace_prompt
  (-8), admin_server_config (-2), admin_sessions (-1), admin_session_detail
  (-1), admin_usage (-1), admin_store_submissions (-3), admin_scheduler_runs
  (-3), activity_center (-4), corporate_memory_admin (-36).

Contract test stays green (9/9); all web/template/render/user_management
tests pass.

* feat(ui): canonical hero on /catalog (Data Packages)

Same .page-header--hero treatment as the admin pages — Data eyebrow,
Data Packages title, Browse-the-data-sources subtitle. Removes the
ad-hoc .page-title block (h1 / p / wrapper-div) and its CSS rules
(now dead, 3 rule blocks deleted).

* fix(nav): load app.js from _app_header.html — works on standalone pages

The previous nav-fix commit moved the inline dropdown script from
_app_header.html into app/web/static/app.js + added <script src=…>
to base.html. That broke EVERY page that includes _app_header.html
WITHOUT extending base.html (catalog, corporate_memory*,
admin_tables, install). They got the nav markup but no JS → both
Admin and AD dropdowns dead on those pages.

Fix: emit the <script src=app.js defer> directly inside the
_app_header.html partial. Any page that includes the header now
gets the script automatically — base.html-extenders AND standalone
HTML pages alike. base.html's duplicate <script> line removed.

Also fixes the wide-hero on /catalog: .page-header--hero now sets
its own max-width: var(--width-app) (1280px) so standalone pages
without a .container parent don't render the gradient edge-to-edge.
catalog's .source-cards bumped from 900px → 1280px to match the
hero, otherwise the page reads two-tier (wide blue band, narrow
content) which the user flagged.

Verified locally via agent-browser: Admin + AD dropdowns now click
through on /catalog, /admin/tables, /corporate-memory.

* docs(plan): standalone pages → base.html framework migration plan

Plan + Plan-agent review (8 must-fix items applied) for converting
the 5 templates that ship their own <html><head><body> scaffold
(catalog, install, corporate_memory, corporate_memory_admin,
admin_tables) to extend base.html. Root cause of yesterday's
'dropdown dead on /catalog' regression: shared infrastructure in
base.html doesn't propagate to standalones.

* feat(base): body_attrs block + migrate install.html to extend base

base.html: new {% block body_attrs %}{% endblock %} slot so pages
that need <body> attributes (admin_tables has data-source-type)
can carry them through extends.

install.html: convert from standalone <html><head><body> scaffold
to {% extends "base.html" %} with title / body_attrs / head_extra
/ layout / scripts blocks. Drops:
- <!DOCTYPE>, <html>, </html>, <head>, </head>
- <meta charset>, <meta viewport>
- Duplicate <link rel="stylesheet" href="...style-custom.css">
  (base.html already provides one)
- <body> opening + closing tags
- Leading _app_header.html include + _version_badge.html include
  (base.html handles both)

Preserves per-page CSS (in head_extra), per-page JS (in scripts),
the Inter font preconnect (kept inline; not hoisted to base in
this PR — separate decision).

Pilots the migration recipe before the 4 larger pages.

* refactor(memory): extend base.html

Same recipe as install.html. corporate_memory.html now inherits
<html>/<head>/<body> + nav + app.js script tag from base.html.
Page-specific CSS and JS preserved in head_extra + scripts blocks.

* refactor(memory-admin): extend base.html

Same recipe as install/corporate_memory. Curation page now in the
shared rendering pipeline.

* refactor(catalog): extend base.html

catalog.html had the most complexity: 7 head-level assets (chart.js,
Prism, prism-sql, metric_modal.css link + 2 preconnects + Inter
stylesheet), 5 body-level <script> blocks including a <script type=
"module"> for the metric modal, 2 duplicate style-custom.css links
in <head>. The migration script preserved all of them — head-level
externals hoisted to {% block head_extra %} in source order, body
scripts relocated to {% block scripts %} in source order (so chart.js
loads before the IIFE that builds Chart instances), duplicate
style-custom.css links dropped (base.html provides one).

* refactor(admin-tables): extend base.html + carry data-source-type

The biggest of the 5 standalones at 3563 lines. <body data-source-
type="{{ data_source_type }}"> attribute carried through via the
new {% block body_attrs %} slot (admin_tables JS reads
document.body.dataset.sourceType to switch between keboola and
bigquery rendering paths).

* release: 0.54.10 — UI design system unification + homepage status frame + initial workspace override + store guardrails

Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com>

* refactor(web): migrate remaining templates to canonical design primitives

- admin_group_detail: .data-table, .btn family, appToast(), remove duplicate table/button/toast CSS
- admin_store_submission_detail: .data-table, .btn family, appToast(), remove duplicate btn/toast CSS
- profile_sessions: .data-table, _page_hero.html, remove duplicate table/title CSS
- me_debug: .data-table, .btn family, remove duplicate table/button CSS
- marketplace: .btn-primary/.btn-secondary, remove duplicate button CSS
- store_edit: remove duplicate .btn-primary/.btn-link CSS, canonical button classes
- store_upload: remove duplicate .btn-primary/.btn-secondary/.btn-link CSS

Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-14 13:28:03 +02:00
ZdenekSrotyr
91caefaca9
security(auth): per-IP rate limit + last-admin guard (#165)
* security(auth): per-IP rate limit on auth endpoints + generalize last-admin guard

Closes #45 and #151.

#45 — every auth endpoint was unthrottled (login, magic-link, token,
bootstrap), leaving us open to password brute-force and SMTP
email-bombing. Wires slowapi (new dep) into the middleware chain with
per-route limits: 10/min on login + token, 5/min on send-link, 3/min on
bootstrap. Returns 429 with Retry-After: 60 once exceeded. Per-IP key
respects the leftmost X-Forwarded-For hop (Caddy in front of the app
strips client-supplied XFF). Operator escape hatch:
AGNES_AUTH_RATELIMIT_ENABLED=0. Test suite disables the limiter via
autouse conftest fixture so existing auth tests that hammer endpoints
in tight loops are unaffected.

#151 — DELETE /api/admin/users/{id}/memberships/{group_id} and the
mirror DELETE /api/admin/groups/{group_id}/members/{user_id} only
guarded against self-removal as last admin. Generalizes to refuse
removing anyone from the seeded Admin group when they are the only
remaining active admin (mirrors the existing
count_admins(active_only=True) <= 1 check on delete_user / update_user).
Recovery from zero admins requires direct DB access, so this closes
a path where a scheduler/bootstrap actor that bypasses normal admin
checks could otherwise empty the group.

* security(auth): throttle remaining email-bombing + token-confirm endpoints

Address code-review gap on PR #165 — the first commit covered /send-link
but missed two endpoints with the IDENTICAL email-bombing surface:

- POST /auth/password/reset       — sends reset mail, anti-enum response
- POST /auth/password/setup/request — sends setup mail, anti-enum response

Both now share the 5/min limit with /send-link.

Also add 10/min to the token-confirm surfaces — high-entropy tokens but
partial leaks via logs / referer have surfaced before, and unbounded
guess rate would let an attacker exhaust the keyspace adjacent to a
leaked prefix:

- POST /auth/email/verify
- GET  /auth/email/verify         — closes the click-through bypass
- POST /auth/password/reset/confirm
- POST /auth/password/setup/confirm

Doc fix: rate_limit.py module docstring + CHANGELOG entry no longer
claim "disable without a redeploy" (misleading). The Limiter constructor
freezes `enabled` from env at import time, matching every other Agnes
env knob — operators set the flag and bounce the container.

Tests: 4 new cases in test_auth_rate_limit.py covering
/reset, /setup/request, /reset/confirm, GET /verify. Full suite:
2583 passed, 32 skipped, 0 failed.

* security(auth): throttle JSON /auth/password/setup — closes form-throttle bypass

Second code-review pass on PR #165 caught a fifth gap: POST /auth/password/setup
(JSON variant, kept for backward compat) consumes the same setup_token as
the web form /setup/confirm but was unthrottled — an attacker brute-forcing
the token just switches from the form path to the JSON path and resumes
at unbounded RPS. Apply the same 10/min limit and signature shape used
on /setup/confirm.

Also extend CHANGELOG note about the JSON-variant bypass for future
operators reading the security entry.

Test: 1 new case (test_password_setup_json_rate_limited_after_10_requests),
9 rate-limit tests + 28 password-flow tests + 41 auth-provider tests pass,
no regressions.

* chore(release): cut 0.30.1 — auth security hardening (rate limit + last-admin guard)
2026-05-02 21:08:33 +02:00
minasarustamyan
d4ac84dd46
feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150)
* feat(rbac): drop dataset_permissions + access_requests + users.role + is_public; v19 migration

BREAKING. Sjednocení datové RBAC vrstvy do per-group resource_grants modelu.
Před PR byla legacy data RBAC vrstva (dataset_permissions + is_public bypass)
de-facto neaktivní — is_public neměl API/UI/CLI surface, default true znamenal
že can_access_table vždycky bypassl. Dnes každý non-admin přístup vyžaduje
explicitní resource_grants(group, "table", id) řádek.

Schema v18 → v19 (src/db.py:_v18_to_v19_finalize):
- DROP TABLE dataset_permissions, access_requests
- DROP COLUMN users.role (NULL artifact since v13)
- DROP COLUMN table_registry.is_public
- Drops přes table-rebuild idiom (rename → create new → INSERT … SELECT
  → drop old) kvůli DuckDB ALTER DROP COLUMN limitacím na tabulkách
  s historic FK constraints. INSERT picks intersection sloupců, takže
  test fixtures s minimal pre-v19 schemou migrate cleanly.

Runtime:
- src/rbac.py:can_access_table → deleguje na app.auth.access.can_access
- DatasetPermissionRepository, AccessRequestRepository smazány
- AGNES_ENABLE_TABLE_GRANTS env-gate v app/resource_types.py odstraněn
  (TABLE je unconditionally enabled)

API drop:
- app/api/permissions.py, app/api/access_requests.py celé soubory
- /admin/permissions web route + admin_permissions.html
- "Request Access" modal v catalog.html + locked-row UI
- ~10 if user.get("role") != "admin" checků nahrazeno (admin shortcut
  je uvnitř can_access_table)
- /api/settings: drop permissions field z GET; PUT /api/settings/dataset
  gate přepnut na can_access(user_id, "table", dataset, conn)

Auth:
- app/auth/jwt.py:create_access_token: drop role parametr (claim zmizí
  z nově vydávaných JWT; staré tokeny zůstávají valid, claim ignored)
- app/api/users.py: drop role z CreateUserRequest / UpdateUserRequest
  (admin promotion = explicit add to Admin group via memberships API)
- src/repositories/users.py: drop role z create() / update()

CLI:
- da admin set-role smazán → hard-fail s replacement command
- da admin add-user --role flag pryč
- da auth import-token --role flag pryč
- da auth whoami: drop "Role:" výpis
- cli/config.py:save_token: role parametr now optional, no longer written
  (back-compat se starými token.json soubory zachována — pole se ignoruje)

Tests:
- DELETE: test_permissions.py, test_permissions_api.py, test_access_requests_api.py
- REWRITE: test_access_control.py (resource_grants flow), test_rbac.py
  (can_access_table over resource_grants), test_journey_rbac.py
  (drop access-request flow), test_resource_types.py (drop env-gate
  tests, drop is_public from helpers), test_v2_*.py (drop role-based
  user dicts in favor of id-based + Admin group membership),
  test_settings_api.py (no permissions field, can_access gate)
- TRIVIAL: ~30 souborů — drop role="admin" arg z UserRepository.create
  a 3rd positional role z create_access_token
- NEW: test_v18_to_v19 migration test (test_db.py),
  test_can_access_table_no_implicit_public (test_rbac.py),
  test_admin_set_role_returns_hardfail (test_cli_admin.py)
- OpenAPI snapshot regenerated

Docs:
- CHANGELOG: BREAKING entry pod [Unreleased]
- CLAUDE.md: schema v18 → v19
- docs/architecture.md: schema table + RBAC sekce přepsána
- docs/auth-google-oauth.md: admin promotion přes da admin break-glass
- cli/skills/security.md: kompletně přepsáno na group-based model
- docs/TODO-rbac-data-enforcement.md: smazáno (TODO splněn)

Test results: 2363 passed, 19 failed. Zbývající failures jsou pre-existing
Windows-specific issues (fcntl, charset) nesouvisející s tímto PR —
ověřeno git stash pop.

Plan: ~/.claude/plans/floofy-coalescing-parnas.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): cut 0.27.0

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-04-30 22:02:16 +02:00
ZdenekSrotyr
e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three workstreams.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE listing-only — admins can record table grants,
  runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00
ZdenekSrotyr
d2c76cb221
User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28)
* fix: redirect unauthenticated HTML routes to /login (#10)

* docs(plan): user mgmt + PAT + CLI distribution implementation plan (#9 #10 #11 #12)

* build(docker): produce wheel artifact for /cli/download (#9)

* feat(db): schema v5 — users.active + deactivated_at/by (#11)

* feat(api): /cli/download wheel + /cli/install.sh with baked server URL (#9)

* feat(users): repository supports active flag + count_admins (#11)

* feat(ui): /install page with per-deployment install instructions (#9)

* feat(api): user PATCH/reset-password/set-password/activate/deactivate (#11)

* fix(cli): da login prompts for password and sends it in body (#9)

* test(api): safeguard tests for self-deactivate and last admin (#11)

* feat(auth): reject requests from deactivated users (#11)

* fixup(#10): propagate next through /login buttons + lock down sanitizer tests

* feat(cli): da admin set-role/activate/deactivate/reset-password/set-password (#11)

* feat(ui): /admin/users management page (#11)

* feat(db): schema v6 — personal_access_tokens (#12)

* feat(users): access_tokens repository (#12)

* feat(auth): JWT carries typ (session|pat) and explicit jti (#12)

* feat(auth): reject revoked/expired PATs; update last_used_at (#12)

* feat(api): /auth/tokens CRUD + admin revoke; session-only guard (#12)

* feat(cli): da auth token create/list/revoke (#12)

* feat(ui): /profile page with PAT create/list/revoke (#12)

* docs: PAT usage and session/PAT TTL clarification (#12)

* feat(auth): PAT first-use-from-new-IP audit + last_used_ip (schema v7) (#12)

Closes remaining acceptance gap from issue #12: audit_log entry on first use
of a PAT from an IP that differs from the recorded last_used_ip.

- schema v7: personal_access_tokens.last_used_ip column
- AccessTokenRepository.mark_used now stores the client IP
- get_current_user extracts client IP (X-Forwarded-For first hop, fallback
  to request.client.host) and emits a token.first_use_new_ip audit when the
  IP changes on a subsequent use (not the very first use)
- tests: new-ip audit, same-ip no-op, first-ever-use no-op, schema v7 column

* fix: address Devin review findings on PR #28

- app/main.py: exclude /auth/* from HTML redirect handler so JSON
  endpoints under /auth/ (PAT CRUD used by `da auth token` CLI) keep
  their 401 JSON contract (Devin #1, bug)
- app/api/tokens.py: reject expires_in_days <= 0 explicitly; use
  `is not None` so 0 no longer silently creates a non-expiring token
  (Devin #2)
- app/api/users.py: validate role against Role enum in create_user
  to match update_user and prevent 500 on role-protected requests
  later (Devin #3)
- app/web/templates/admin_users.html: escape user-supplied strings
  before innerHTML; move onclick handlers to addEventListener via
  data attributes so emails with quotes / HTML no longer break the UI
  or enable stored XSS (Devin #4)
- app/auth/router.py, app/auth/providers/{password,google}.py:
  reject deactivated users at login instead of issuing a JWT that
  would then fail on the next request — removes the confusing
  redirect loop (Devin #5)
- CLAUDE.md: document schema v7 instead of stale v4 (Devin #6)
- tests/test_web_ui.py: regression test for the /auth/* JSON 401

* feat(web): add /profile and /admin/users links to dashboard nav

* feat(web): point setup banner at /install page

* chore(web): drop unused setup_instructions context

* fix: address Devin review round 2 on PR #28

- app/api/tokens.py: when expires_in_days is None (the "never" option),
  use a ~100-year JWT expiry so the token doesn't silently die in 24h
  via the session-default fallback in create_access_token. The real
  expiry enforcement stays in verify_token's DB-level check (Devin 🔴)
- app/web/templates/profile.html: escape t.name and other user-supplied
  strings via esc() helper before innerHTML, same pattern as
  admin_users.html. Move revoke onclick to data-attribute +
  addEventListener (Devin 🟡)
- app/api/cli_artifacts.py: use `mktemp -d` with X's at end of template
  for GNU/BSD portability, place wheel inside the temp dir and
  clean up with rm -rf (Devin 🚩)

* feat(web): redesign /install page; make curl one-liner primary, collapse manual

Rebuild the public /install page using the dashboard visual language
(shared header, card layout, gradient hero, design tokens from
style-custom.css). The page is now anchored on the one-liner install
path: curl -fsSL <server>/cli/install.sh | bash is rendered as the
primary, prominent step 1, while the old manual wheel-download flow
is tucked behind a closed-by-default <details> block for users in
restricted/offline environments.

Information architecture:
  hero (server URL + version)
  -> step 1: quick install (one-liner, big Copy button)
  -> step 2: create PAT on /profile + export DA_TOKEN / da auth whoami
  -> step 3: Claude Code / MCP via ~/.config/da/token.json
  -> collapsed "Manual install" details for download-wheel flow
  -> footer link to docs/HEADLESS_USAGE.md

Every shell snippet has a vanilla-JS "Copy" button that confirms
visually ("Copied!" for 1.5s) and falls back to textarea+execCommand
on non-secure contexts. No new dependencies, no bundler.

The route now also pulls an optional user so the header shows the
same nav (Dashboard / Profile / Logout) as dashboard.html when a
session exists, while staying fully public when signed out.

* fix(cli): use real wheel filename in install.sh (broken pip/uv install)

The installer wrote the downloaded wheel as agnes_cli.whl, which lacks a
PEP-427 version component — both pip and uv tool install reject it and
abort the one-liner.

Use curl -OJ so Content-Disposition determines the on-disk filename, then
resolve it via glob. Install an EXIT trap to remove the tmpdir even when
install fails.

* fix(web): correct manual install wheel glob and add PEP 668 / PATH hints

- Wheel glob is agnes_the_ai_analyst-*.whl (not agnes-*.whl) — the old
  pattern never matched the real artefact name from the build.
- Add — or — separator between uv tool install and pip install.
- Warn that pip install --user is blocked on macOS Homebrew / modern
  Debian (PEP 668) and recommend uv tool install as the default path.
- Both flows now show the ~/.local/bin PATH hint so a fresh shell can
  find the da binary after install.

* fix(web): consistent session.user reference in install header

The avatar-letter fallback inside {% if session.user %} was reading
user.name / user.email directly, but the route dependency can pass
user=None — those references resolved to an empty FlexDict and produced
an empty avatar circle. Read everything through session.user to match
the guard and the dashboard pattern.

* fix(web): point headless usage link at GitHub source

/docs/HEADLESS_USAGE.md 404s — no static route serves repo docs. Point
the footer link at the rendered markdown on GitHub instead of adding a
dedicated docs serving route just for one file.

* feat(web): /install hero size, anon sign-in banner, step 2 copy polish

- Bump hero h1 from 26px to 30px to match dashboard primary scale.
- Anonymous visitors see a small sign-in banner above Step 2 (creating
  a token requires auth; without the banner the flow appears stuck).
- Add an 'After generating your token' section label inside Step 2 so
  the /profile CTA button no longer looks wedged mid-sentence between
  adjacent paragraphs.

* chore(web): /install a11y + version pill polish

- aria-live='polite' on copy buttons so screen readers announce the
  'Copied!' state change.
- Replace redundant INSTANCE_NAME eyebrow (already in the header logo)
  with 'Getting started'.
- Hide the version pill when AGNES_VERSION is unset/'dev' — avoids the
  misleading 'vdev' label in local/unbuilt runs.
- Manual summary focus-visible outline-offset +2px (was -2px which
  clipped inside the card), and mark the chevron as decorative.

* fix(web): use session.user in dashboard avatar fallback

Inside {% if session.user %} guard, the avatar fallback referenced
(user.name or user.email). If user is None the block crashes when
the profile picture is absent. Align with the guard variable.

* fix: address Devin review round 3 on PR #28

- app/api/users.py: stop auto-sending email from reset_password. The
  magic-link sender would deliver a "Login Link" that — when clicked —
  consumes the reset_token via verify_magic_link and logs the user in
  WITHOUT prompting for a new password. Admins now share the raw
  reset_token from the API response manually, or use set-password
  directly. email_sent is always False. Documented inline. (Devin 🟡)
- app/api/cli_artifacts.py: harden /cli/install.sh generation against
  shell injection via Host header or AGNES_VERSION. base_url is
  validated against a strict scheme+host+port regex; version against
  an alnum + dot/dash/underscore allowlist. Both values are also
  piped through shlex.quote() as defense in depth. (Devin 🟡)

The shared users.reset_token column between magic-link and password-
reset flows (Devin 🚩) remains an architectural gap; splitting into
separate columns needs schema v8 and is tracked for a follow-up PR.

* docs, chore(grpn): manual-deploy helpers + hackathon deploy learnings

Adds scripts/grpn/ — Makefile + agnes-auto-upgrade.sh + README for
operating Agnes on GRPN's existing foundryai-development VM when the
full Terraform flow is blocked by org policies:

- iam.disableServiceAccountKeyCreation (org constraint) forbids SA
  JSON keys, so GCP_SA_KEY-based CI is unavailable
- No projectIamAdmin delegation → bootstrap-gcp.sh can't grant roles
- Secret Manager IAM bindings require setIamPolicy which editor lacks

Helper targets: deploy, deploy-tag, recreate, restart, stop, start,
status, version, logs, ps, env, ssh, tunnel, open, bootstrap-admin,
set-data-source, install-cron, uninstall-cron.

docs/superpowers/plans/2026-04-22-grpn-deploy-learnings.md — running
log of all org-policy constraints hit during the hackathon deploy,
with workarounds and derived follow-ups (WIF support, external_ip
variable, customer onboarding IAM checklist).

Not a replacement for the TF flow — stopgap until WIF lands.

* fix(web): make header logos clickable links to home

* feat(web): one-click "Setup a new Claude Code" button

Adds a single-button flow on the dashboard and /install page that
generates a fresh personal access token via POST /auth/tokens and
copies a complete, paste-ready setup script (server URL, token,
install/verify commands) to the clipboard. Falls back to a modal
textarea when the clipboard is blocked; redirects to /login on 401;
surfaces backend errors inline.

- dashboard.html: replaces the top "Set up your local environment"
  anchor with a real button wired to setupNewClaude(). Removes the
  duplicate bottom setup banner to keep a single entry point.
- install.html: for signed-in users, Step 1 leads with the one-click
  button and demotes the curl one-liner into a collapsible "Or run
  manually" aside. Anonymous visitors still see the curl flow plus a
  sign-in hint.
- No new deps. Vanilla JS. Token lives in memory/clipboard only —
  never rendered into persistent DOM.

* feat(cli): add "da auth import-token" for non-interactive PAT login

Writes a provided JWT into ~/.config/da/token.json using the canonical
{access_token, email, role} shape expected by save_token(). Decodes the
token locally to pull email/role claims, verifies it against the server
via GET /api/catalog/tables, and refuses to overwrite an existing token
file if the server returns 401. --email / --role overrides exist for
tokens missing those claims; --skip-verify bypasses the server round-trip
for offline / CI scenarios.

* test(cli): cover da auth import-token success + 401 + claim-fallback paths

Three new tests in TestAuthImportToken:
- valid JWT + 200 -> canonical token.json written
- 401 from /api/catalog/tables -> exit 1, existing token file untouched
- JWT without email/role claims -> refused without overrides, accepted
  with --email / --role flags

* feat(web): update one-click Claude setup instructions — explicit uv install, import-token, skills question

Replaces the fragile `cat > token.json <<EOF` clipboard payload with an
explicit, auditable sequence:

  1. `curl -fsSL /cli/download` + `uv tool install --force` (no opaque
     `curl | bash`).
  2. `da auth import-token --token ...` instead of hand-written JSON.
  3. Explicit PATH persistence for zsh/bash.
  4. A required question to the user about whether to copy the bundled
     skills into ~/.claude/skills/agnes/ or pull them on-demand via
     `da skills show`.
  5. A final confirmation step with whoami + version output.

Factored both pages to include a shared partial
(app/web/templates/_claude_setup_instructions.jinja) so dashboard.html
and install.html can never drift apart again. {server_url} and {token}
stay as runtime placeholders substituted by renderSetupInstructions().

* feat(ui): modernize /admin/users + unify header nav across pages

- New shared partial app/web/templates/_app_header.html — single source
  of truth for the top navigation. Used by base.html and dashboard.html
  (which doesn't extend base.html). Active page highlighted via
  request.url.path. Admin "Users" link gated by session.user.role.
- style-custom.css: add .app-header / .app-nav-link / .app-btn-logout /
  .app-avatar styles (mirrors dashboard's previous inline copy under
  app-* prefix). Mobile-friendly fallback at <720px.
- base.html: include the new partial so every page extending base
  (admin_users, profile, login_email, error, …) gets the same chrome
  the dashboard has.
- dashboard.html: replace its inline <header class="header"> markup
  with the shared partial. Inline .header CSS left in place as
  harmless dead code (separate cleanup PR).
- admin_users.html: rewritten with avatars, role pills (color-coded
  per role), toggle switch for active, search/filter input, toast
  notifications, modal dialogs replacing alert/confirm/prompt,
  one-click copy for the reset token, empty / loading states.
  All XSS-safe via the existing esc() helper + data-attribute
  event delegation.
- tests/test_web_ui.py: smoke test that /admin/users renders the new
  shared header chrome and the modernized markup.

* feat(api): serve CLI wheel at /cli/agnes.whl for direct uv install

uv tool install inspects the URL path suffix to recognise a wheel, so
/cli/download (which has no .whl suffix) cannot be installed directly.
Expose a stable /cli/agnes.whl alias over the same wheel lookup so users
can run: uv tool install --force https://<server>/cli/agnes.whl

* test(cli): cover da auth import-token --server persisting to config.yaml

The server persistence was already implemented in the import-token command
(save_config({server}) call) but not covered by tests. Add an explicit test
so the one-step setup contract — single import-token call writes both token
and server — cannot regress.

* feat(web): simpler Claude setup — single uv install URL, single import-token call

User feedback: the prior clipboard payload repeated the server URL and
token across multiple steps (curl + tmpfile + install + rm + separate
seed-config + import-token). Collapse to:

 1. uv tool install --force {server_url}/cli/agnes.whl  (single URL, direct)
 2. da auth import-token --token ... --server ...        (one call, persists both)
 3. da auth whoami
 4. skills (ask user first)
 5. confirm

uv accepts HTTPS URLs that end in .whl and installs them directly, so
the tmpfile dance is unnecessary. import-token --server already persists
the server to config.yaml, so no separate printf > config.yaml step.

* fix(tests): update admin users heading assertion after template rename

The admin_users.html template now uses <h2 class="users-title">Users</h2>
instead of <h2>User management</h2>. Update the assertion to match.

* feat(ui): unify header across remaining 7 standalone pages

These 7 pages render their own full <html> and don't extend base.html,
so the previous unification commit only covered base + dashboard. Each
had its own ad-hoc <header> markup with inconsistent classes
(.top-header / .header / .page-header), inconsistent nav-link sets,
and inconsistent avatar/email styling.

Replace each inline <header>...</header> block with the shared
{% include '_app_header.html' %} so /activity-center, /admin/permissions,
/admin/tables, /catalog, /corporate-memory, /corporate-memory/admin,
and /install all show the same chrome (Dashboard / Install CLI /
Profile / Users / email + avatar / Logout) with the active page
highlighted via request.url.path.

Old inline header CSS (.header, .top-header, .page-header, .nav-link,
etc.) is left in place as harmless dead code; it can be cleaned up in
a follow-up sweep.

* feat(web): add readable preview of Claude setup payload on dashboard + /install

Move the line-by-line setup instructions into app/web/setup_instructions.py
as the single source of truth, then render them in two modes from the
existing _claude_setup_instructions.jinja partial:

- preview_mode=True  → visible, read-only <pre><code> block with the real
  server URL and a clearly-styled placeholder token (never a real one).
- preview_mode=False → the JS SETUP_INSTRUCTIONS_TEMPLATE used by the
  one-click flow (unchanged behaviour).

Both /dashboard (env-setup-cta card) and /install (Step 1 card) now show
the preview directly under the 'Setup a new Claude Code' button so users
can see exactly what will land in their clipboard before they click.

* feat(web): update setup instructions — `da diagnose` step, explicit section titles

Rework the Claude Code setup payload to:

- Give every numbered step an unambiguous verb header ("1) Install the CLI",
  "2) Log in", "3) Verify the login", "4) Run diagnostics", "5) Skills (ask
  the user first)", "6) Confirm").
- Add step 4 `da diagnose` as the post-login health check. The CLI already
  ships this command (cli/commands/diagnose.py); it prints "Overall:
  healthy" and a list of green checks that map cleanly to next actions.
- Ask the skills copy-vs-on-demand question verbatim so Claude Code always
  prompts the user the same way.
- Replace the terse "Confirm" line with a 4-bullet summary (version,
  whoami, skills choice, diagnose status) so the return message is
  structured and comparable across setups.

* chore(web): remove stale MCP card from /install (no MCP server today)

The 'Use with Claude Code / MCP' card (Step 3 on /install) referenced an
MCP integration Agnes does not ship. Remove the whole card. The one-click
'Setup a new Claude Code' flow in Step 1 already covers the long-lived
client use case and is less confusing than dangling persistence tips for
a non-existent integration.

* feat(api): include user_email + last_used_ip + user_id in admin tokens list response

Adds AdminTokenItem response model (superset of TokenListItem) and
AccessTokenRepository.list_all_with_user() joining personal_access_tokens
with users to denormalize user_email. Needed for /admin/tokens UI where
admins triage tokens across all users.

* feat(web): /admin/tokens page — list, filter, search, revoke across all users

Adds a new admin-only page with client-side filtering (status, user email,
last-used window), column sorting, counts bar (active/revoked/expired),
and an inline revoke action. Mirrors the /admin/users visual language.

* feat(web): add Tokens nav link for admins + deep-link from admin/users row

Admin-only nav entry to /admin/tokens, and a per-row Tokens button on
/admin/users that prefills the token page's user filter via ?user=<email>.

* test(admin): cover /admin/tokens rendering, filter state, non-admin denial, revoke

Verifies admin can render the page (title + JS hooks present), a non-admin
is blocked, unauthenticated users are redirected, the admin list response
includes user_email / user_id / last_used_ip, and admin can revoke another
user's token.

* feat(web): modern redesign of /admin/tokens — hero, stat strip, refined table, responsive cards, a11y

* feat(web): ditch the table — /admin/tokens as a card stack, modern GitHub-style list

Replaces the table-based layout with a stack of self-contained token cards
inside a <ul role=list>. Each card is a flex row: avatar + name/meta on the
left, last-used block in the middle, status pill + outlined 'Revoke' button
on the right. Status and sort controls are pill-shaped toggle chips; user
email search has an inline search icon. No <table>/<tr>/<th>/<td> anywhere.
Responsive below 720px (card stacks vertically) and 480px (stat chips 2x2).
Preserves filter IDs (flt-status, flt-user, flt-last-used) and data-revoke
for existing tests.

* feat(web): add /tokens (role-aware) — single page for both user PAT CRUD and admin overview

- Rename admin_tokens.html -> tokens.html with a new is_admin context flag.
- New route GET /tokens: renders the same card-stack UI for everyone.
  * Admins: loads /auth/admin/tokens, shows owner column + stat strip, keeps
    the owner-email search box and sort-by-owner chip.
  * Non-admins: loads /auth/tokens (own tokens only), hides owner column +
    stat chips, adds a 'New token' CTA in the hero that opens a modal
    (name + expires_in_days) calling POST /auth/tokens. The raw token is
    revealed once in a dismissable banner and cleared from the DOM on Hide.
- GET /admin/tokens now 302-redirects to /tokens, preserving query string
  (so the /admin/users deep-link ?user=foo still works).

* feat(web): /tokens full-bleed layout to match dashboard width

The hero, toolbar, and card list used to sit inside base.html's .container
(max-width 800px). Break out with negative horizontal margins so the page
spans the viewport like /dashboard does, capped at 1440px for readability
on very wide screens with a 24px gutter on each side.

- No change to base.html itself. The override is scoped to .tokens-page.
- body { overflow-x: hidden; } guards against rare horizontal scrollbars.
- < 808px viewport: reset to natural flow (mobile already narrower).
- ≥ 1488px viewport: cap to 1440px and re-center.

* chore(web): remove /profile template + nav link (redirect /profile -> /tokens)

The old /profile PAT CRUD page is now redundant — the modern /tokens page
covers both user and admin flows. Delete the template; the router's
/profile handler already 302-redirects to /tokens.

Nav cleanup:
- Remove the 'Profile' link.
- Show a single 'Tokens' link to every signed-in user (previously only
  admins saw it).
- Active-state matches /tokens, /admin/tokens, and /profile so the
  highlight survives the redirect chain.

/install CTA now points at /tokens instead of /profile.

* test: cover /tokens for admin + non-admin flows, /profile redirect, nav update

tests/test_admin_tokens_ui.py
- Point admin rendering test at /tokens directly and tighten assertions
  (admin-only stat strip + owner search, non-admin CTA absent).
- Add test_non_admin_can_render_tokens_page: personal body, New-token CTA,
  create-modal, reveal banner; stat strip + owner search absent.
- Add test_admin_tokens_redirects_to_tokens: 302 to /tokens, query string
  (?user=...) preserved for the /admin/users deep-link.
- Add test_profile_redirects_to_tokens: 302 to /tokens.
- Add test_non_admin_can_create_pat_via_tokens_page_api: exercises the
  POST /auth/tokens call that the non-admin create-modal submits.

tests/test_pat.py
- test_profile_page_renders -> test_profile_page_redirects_to_tokens:
  assert the 302 + that /tokens lands on the unified non-admin body.

tests/test_web_ui.py
- admin_users nav assertion: 'Tokens' link present, 'Profile' link absent.
- Add test_nav_shows_tokens_link_for_non_admin: non-admins see the same
  'Tokens' link (previously only admins did).
- Add test_profile_redirects_to_tokens back-compat check.

* feat(web): collapse 'What Claude Code will receive' by default

The preview block on /dashboard and /install now uses <details>/<summary>
so it is hidden by default. Click the chevron/title to expand and review
the clipboard payload. Markup stays in the DOM so existing tests that
assert on content continue to pass.

* fix(web): /tokens width — override .container to 1280px like dashboard

The negative-margin full-bleed trick was fragile and pushed content past
the right edge on deployed viewports. Replace with a simple max-width
override of base.html's .container on this page only, matching
/dashboard's 1280px center-column layout.

* feat(web): split role-aware /tokens into my_tokens.html + admin_tokens.html

* feat(web): router — separate handlers for /tokens (own) and /admin/tokens (all)

* feat(web): nav — show Tokens for all, add All tokens for admins

* test: cover split token pages (own vs all) + admin access gating

* feat(web): move 'My tokens' into a user dropdown menu

Replaces the separate Tokens/email/Logout nav trio with a rounded
avatar trigger that opens a dropdown containing the user's email,
role, a 'My tokens' link, and Logout. Admin-only 'All tokens' stays
as a top-level nav item since it's an admin function, not a personal
one. Click-outside and Escape close the panel; chevron rotates on
open.

* fix(api): allow PATs to list/get/revoke their own tokens (CLI flow)

The documented 'da auth token list/revoke' CLI flow in
docs/HEADLESS_USAGE.md uses a PAT, but the previous dependency
(require_session_token) returned 403. Only create_token must be
session-only to prevent PAT-spawning-PAT chains; listing and
revoking your own tokens is safe with a PAT.

* fix(api): cap expires_in_days at 3650 to avoid datetime overflow (500 to 400)

Values above ~11 million days overflowed datetime.max in
datetime.now(utc) + timedelta(days=...) and surfaced as an
unhandled OverflowError → 500. Cap at 10 years with a clear
400 instead; the no-expiry code path is unaffected.

* fix(api): relax _SAFE_URL_RE to allow path prefixes, underscores, and IPv6

The previous regex rejected legitimate reverse-proxy base_url values
(https://host/agnes/), underscores in Docker Compose hostnames, and
IPv6 literals (http://[::1]:8000). Widen the charset and allow an
optional trailing path. shlex.quote continues to provide
defense-in-depth against any metacharacter that slips through.

* fix(web): /login/email and Google OAuth propagate next_path

Previously, /login/email silently dropped the ?next=<path> query
param so the hidden form field rendered empty and login always
landed on /dashboard. Google's button was hard-coded to
/auth/google/login, ignoring next entirely.

- /login page now appends ?next to the Google button URL
- /login/email reads + sanitizes next, passes as template context
- google_login stashes sanitized next_path in session['login_next']
- google_callback pops + re-sanitizes and redirects there

Sanitization factored into app/auth/_common.safe_next_path.

* fix(auth): differentiate argon2 VerifyMismatchError from internal errors in web login

The previous except (VerifyMismatchError, Exception) collapsed both
cases into the generic 'invalid credentials' redirect, silently
hiding corrupted-hash / library errors from ops. Split the two:
bad password still gets ?error=invalid; anything else logs via
logger.exception and redirects with ?err=auth_internal so ops have
a visible signal and users don't retry forever against a broken
password_hash column.

* docs: correct CLAUDE.md table name (personal_access_tokens)

v7 note referenced 'access_tokens.last_used_ip' but the real table
is personal_access_tokens (as mentioned two tokens earlier in the
same bullet). Same-file consistency fix.

* chore(web): clarify admin user-reset UI — encourage Set password over the unused reset_token

POST /api/users/{id}/reset-password stores and returns a token
but no endpoint consumes it — the magic-link sender would log the
user in without prompting for a new password, defeating the reset.
- Drop the 'Reset' row action from admin_users so admins aren't
  pointed at a dead end.
- Rewrite the reveal-modal copy to tell admins to use Set password
  and explicitly note that the magic-link flow isn't available
  for reset tokens in this build.
The API endpoint stays for API-level future use.

* test: cover PAT CLI flow, expires_in_days overflow, proxy base_url, next propagation

- tests/test_pat.py: PAT can list own tokens (200, was 403);
  PAT can revoke own tokens (204); create_token returns 400 for
  expires_in_days > 3650 (was 500 via datetime overflow).
- tests/test_cli_artifacts.py: _SAFE_URL_RE accepts reverse-proxy
  path prefixes, underscores, and IPv6 literals; end-to-end check
  of cli_install_script with a stubbed base_url that includes
  a path prefix (Agnes behind /agnes/).
- tests/test_web_ui.py: /login propagates ?next to the Google
  button URL; /login/email renders next in the hidden form field
  and strips hostile values; unit coverage of safe_next_path.

* fix(security): use \Z instead of $ in URL/version allowlists (trailing-\n bypass)

Python regex `$` also matches just before a trailing newline, so a Host
header or AGNES_VERSION value like "good.example.com\n$(rm -rf /)"
would slip past the allowlist. `\Z` anchors to strict end-of-string.

shlex.quote downstream remains as defense-in-depth, but the allowlist
is now the tight gate it claims to be.

* fix(auth): PAT with null expiry omits JWT exp claim (DB is the source of truth)

Previously a PAT created with `expires_in_days=null` (user-requested
"never expires") set the DB `expires_at` to NULL (correct) but still
baked a ~100y `exp` claim into the JWT. That is misleading: the PAT
silently did expire eventually, despite the UI and API promising
"no expiry".

`create_access_token` now accepts `omit_exp=True` to skip the `exp`
claim entirely. `app/api/tokens.py` passes that when `expires_in_days
is None`. The authoritative expiry check lives in
`app/auth/dependencies.py`, which reads `expires_at` from the DB row —
unchanged. PyJWT accepts claim-less JWTs indefinitely.

* test: cover trailing-newline regex bypass + no-exp JWT for unbounded PAT

- test_safe_url_re_rejects_trailing_newline_bypass: asserts both
  `_SAFE_URL_RE` and `_SAFE_VERSION_RE` reject values with a trailing
  `\n` (previously accepted because Python `$` matches before `\n`).
- test_pat_null_expiry_jwt_has_no_exp_claim: POST /auth/tokens with
  `expires_in_days=null`, decode the returned JWT, assert `exp` is
  absent while `typ=pat`, `sub`, and `jti` are still present.
- test_pat_with_null_expiry_is_accepted_by_verify_token: verify_token
  round-trips a claim-less JWT without ExpiredSignatureError.
- test_pat_null_expiry_end_to_end_allows_authenticated_request: use
  the null-expiry PAT against /auth/tokens and confirm it authenticates.

* docs(auth): document X-Forwarded-For trust model in _client_ip

Deployment runs behind Caddy which strips incoming X-Forwarded-For
and sets its own, so the leftmost hop is trustworthy. Clarify that
the stored last_used_ip is audit-only and never used for access
control — if the app is ever exposed directly, this value becomes
client-settable.

* docs: /profile → /tokens in install.sh next-steps, CLI error, HEADLESS_USAGE, security skill

After splitting PAT management to /tokens (with /profile as a back-compat
302), stale references remained in user-facing text. Update them to the
canonical /tokens URL so shell scripts, CLI error hints, docs, and the
bundled security skill are all consistent.
2026-04-22 14:24:28 +02:00