agnes-the-ai-analyst

Author	SHA1	Message	Date
minasarustamyan	302cf58ccd	feat(marketplace): telemetry v46 + flea inner parity + listing polish (#329 ) * feat(telemetry): marketplace item rollup refactor (schema v46) Replace the v42 attribution layer with prefix-split + live lookup against marketplace_plugins / store_entities. The v42 design had a latent bug — AttributionLookup keyed on bare skill names while Claude Code writes `<plugin>:<local>` in JSONL, so lookups never matched and usage_plugin_daily stayed empty in every deployment. Schema (v46 migration): - Drop usage_attribution_skills / _agents / _commands (mapping tables, derivable from marketplace_plugins + plugin tree). - Drop usage_plugin_daily (always empty in production due to the bug above). - Create usage_marketplace_item_daily — per-day fact (count, distinct_users, error_count), composite PK on (day, source, type, parent_plugin, name). - Create usage_marketplace_item_window — sliding-window snapshot with true cross-window distinct user counts; period_label='last_7d' refreshes every tick, 'last_30d' refreshes hourly (tracked via session_processor_state). - Mark usage_tool_daily as candidate for removal (no product-UI consumer). Attribution flow: - MarketplaceItemLookup replaces AttributionLookup. Preloads marketplace_plugins.name + store_entities.name into memory once per UsageProcessor tick, then per-event splits identifier on ':', matches prefix, writes resolved source / parent_plugin into usage_events. agnes-store-bundle prefix routes to flea entities. Slash commands with `plugin:` prefix count as type='skill' in rollup. API: - BREAKING: MarketplaceItem.unique_users_30d renamed to distinct_users_30d (now a true distinct count from the window snapshot, not sum-of-daily). - InnerDetailResponse gains a telemetry field — invocations_30d + distinct_users_30d surfaced on curated inner skill / agent detail pages. - Card chip hidden pending UX finalisation; data stays in the response. 
Backfill: scripts/backfill_marketplace_rollup.py — one-shot rebuild over historic usage_events after deploy, idempotent. USAGE_PROCESSOR_VERSION bumped 4 → 5 so the reprocess loop re-attributes existing events to the new source/ref_id semantics on the next tick. Tests rewritten: test_session_processor_usage, test_usage_rollups, test_marketplace_telemetry, test_api_admin_usage_reprocess, test_db_schema_version, test_home_stats, test_schema_v42_migration. New: test_backfill_marketplace_rollup. * fix(marketplace): refresh Most Popular on search + category changes `loadMostPopular()` early-exits when `state.q` or `state.category` is set, but the search + category handlers only called `loadItems()` — so once the section was visible, typing a query or filtering by category didn't re-run the hide check and the cards stayed on screen out of scope. Tab + sort handlers already chained the call. Add the call to runSearch + category pill click handlers (All + per-category) so the visibility contract holds for every state mutation that can flip the early-exit condition. * feat(marketplace): All-plugins section + 7-day Most Popular Listing layout: - Always-visible "All plugins" / "All items" / "Your stack" section header (label swaps per tab) wrapped in `#mp-all-section` so its margin-collapse mirrors the sibling `#mp-popular-section` and the spacing from the filter row stays consistent in both layouts. - Sort dropdown moved from the filter row into the All-* header, pinned right via `margin-left: auto`. Anchored to its section so the relationship between sort + grid is obvious. - `.mp-section-header` gets `min-height: 32px` + `align-items: center` so the bare-text Most Popular row matches the dropdown-bearing All-* row. - `.mp-section-header` margin tightened 24px → 20px on top. Most Popular: - Capacity reduced 8 → 4 cards. - Now reflects a 7-day window (was 30-day). 
Backend surfaces `invocations_7d` + `distinct_users_7d` on `MarketplaceItem` alongside the existing 30d fields; the loader pulls a wider page (server still sorts by 30d) and re-sorts + filters client-side on `invocations_7d > 0` so the strip stays "hot right now". - Section label updated to "Last 7 days". - Section now renders on both `curated` and `flea` tabs (was curated-only). Hidden on `my` and whenever search / category filter is active. Refresh hooks wired into search + category click handlers so visibility flips immediately on state change. Backend (`_load_invocation_stats`): - Single SELECT pulls both `last_30d` and `last_7d` rows from `usage_marketplace_item_window`; the result dict carries invocations + distinct_users for both windows. - Trend (recent_7 vs prior_7) kept on the daily fact table so it stays independent of the window snapshot's freshness. * feat(marketplace): Most adopted sort + hide Trending when no trend data Add a fourth sort option to the All-items dropdown — "Most adopted (30d)", keyed on `MarketplaceItem.distinct_users_30d` (true 30d distinct user count from `usage_marketplace_item_window`). Protects the listing from power-user skew that `most_used` is susceptible to: one user × 100 invokes can't beat 10 different users × 1 invoke under adoption sort. Hide Trending option when the response has no trend data. User reported `sort=trending` returning an empty grid because every plugin's `trend_pct` was None (prior-week threshold of >= 3 invocations didn't clear anywhere). Empty grids on a user-selected sort are worse UX than just not offering the sort — surface what works, hide what doesn't. Backend (`app/api/marketplace.py`): - `_apply_sort` gains a `most_adopted` branch (DESC distinct_users_30d, ties by name ASC). - `sort` Literal extended. - `ItemListResponse.available_sorts` lists the sort keys the UI should expose for this response. 
recent/most_used/most_adopted always; trending only when at least one item in the tab's stats carries a non-null trend_pct. - `_available_sorts(stats_dicts)` helper centralises the rule — curated and flea branches pass one stats dict, my-tab passes both (option is available when either source has trend data). Frontend (`app/web/templates/marketplace.html`): - New `<option value="most_adopted">Most adopted (30d)</option>` between Most used and Trending. - URL state allowlist extended so `?sort=most_adopted` round-trips. - `applyAvailableSorts(available)` runs after each list fetch: hides options not in the response's available_sorts; if the user is on a now-unavailable sort, resets to 'recent' and re-fetches. Search-mode fan-out unions availability across the curated + flea responses so a hit on either side keeps the option visible. * feat(marketplace): funnel chip on cards + deterministic Most Popular sort Card chip — funnel telemetry between description and footer: [stack-icon] N installed · [user-icon] N active · [bolt-icon] N calls · ↑/↓ N% - stack_count (new MarketplaceItem field): for curated it's COUNT() on user_plugin_optouts (post-v28 row PRESENCE = subscribed; system plugins are fanned out to every user via fanout_system_for_user so the count includes them naturally). For flea it reuses the existing store_entities.install_count (bumped on install/uninstall). - distinct_users_30d (existing) — active users in the 30d window. - invocations_30d (existing) — call volume. - trend_pct (existing) — week-over-week, both directions: green ↑ / red ↓, magnitude only (sign in the arrow). Hidden when null. Backend additions in app/api/marketplace.py: - MarketplaceItem.stack_count field. - _load_curated_stack_counts() — one SELECT per render, GROUP BY (marketplace_id, plugin_name). Wired into the curated + my-tab branches; flea reads install_count off the entity row directly. 
Frontend (app/web/templates/marketplace.html): - Heroicons solid 24×24 inlined (one helper per icon, all fill="currentColor" so per-segment colour tokens apply): rectangle- stack (mirrors the My Stack tab icon), user, bolt, arrow-trending- up/down. - Per-segment colour: installed=amber #F59F0A (My Stack accent), active=green #0e9b6a, calls=orange #f97316. Text stays neutral so the chip still reads as metadata, the leading glyph carries the visual cue. Trend pill keeps the full-segment green/red colour. - Zero state: chip hidden when stack_count == 0 AND invocations_30d == 0 — brand-new cards aren't visually penalised by a "0·0·0" row. - Tooltips on every segment via title="…" so hover explains the number's meaning to anyone uncertain about the icon. Most Popular section — deterministic ordering: Previously sorted by invocations_7d DESC with no tie-breakers, so several cards with identical 7d call counts would swap places on refresh (JS stable sort fell back on backend order, and the backend's own tie-breaker for `most_used` was just name ASC — six `grpn` plugins from six test marketplaces collapse to the same name and became indeterminate via list_with_filters' created_at order). New cascading hierarchy (chosen primary now matches what "most popular" really means — wide adoption, not power-user volume): 1. distinct_users_7d DESC ← adoption / social proof 2. invocations_7d DESC ← volume at equal adoption 3. distinct_users_30d DESC ← broader adoption fallback 4. invocations_30d DESC ← broader volume fallback 5. name ASC ← deterministic textual order 6. marketplace_slug ASC ← splits duplicate plugin names across marketplaces Six levels guarantee any two items end at a different sort key, so the strip is stable across refreshes. fix(marketplace): unify Most Popular on 30d + right-align installed chip Most Popular section was sorting on the 7d window while its cards rendered 30d numbers — header label promised one thing, cards showed another. 
Unified everything on 30d so a card means the same data everywhere on the page. - Dropped the "Last 7 days" meta from the Most Popular header. - Sort cascade now starts on distinct_users_30d, then invocations_30d, with 7d adoption/volume as recency-aware fallbacks before the name + marketplace_slug deterministic tail. Six levels guarantee identical sort keys never produce indeterminate order across refreshes. - Filter switched from invocations_7d > 0 to invocations_30d > 0 to match the new horizon. - Most Popular now only renders on page 1 of the listing. Past initial discovery, a top-of-list popularity strip on page 2+ would shadow the results the user paged into. Pager click handler refreshes the section so navigating back to page 1 re-mounts it. Chip layout — split engagement vs adoption visually: [user] N active · [bolt] N calls · [↑/↓] N% [stack] N installed └────────── LEFT (time-bounded engagement) ────┘ └── RIGHT (all-time) ──┘ - Installed (stack_count) is all-time, decremented on uninstall. Alone it says little ("12 people installed it") without the engagement context next to it ("…but did anyone actually use it?"). Visually separating the two groups makes that distinction obvious — left group answers "is it used", right answers "does anyone have it". - Implemented via flex with margin-left:auto on .seg-installed so installed drifts to the trailing edge. - Installed tooltip now reads "Currently installed by N users" — the count is a real-time net (uninstall drops it), and saying "currently" makes that explicit. Helps when a card shows 0: signals "nobody has this in their stack right now", not "data missing". * feat(plugin-detail): telemetry chip in hero, derived rows in sidebar Surface the same telemetry funnel the listing card carries on the curated plugin detail page, so clicking through from /marketplace keeps a single mental model — figures match, semantics match. 
The detail sidebar drops the two raw numbers that used to live there (Invocations 30d / Users 30d — duplicated by the chip now) and replaces them with two derived signals only the daily series can provide: Active days + Last used. Backend (app/api/marketplace.py): - PluginDetailResponse.stack_count — curated reads via _load_curated_stack_counts(), flea reuses install_count. Frontend treats both sources uniformly. - _build_telemetry() always returns a dict (never None). Frontend decides chip visibility from stack_count + invocations_30d the same way the listing card does. daily_series is always 30 entries (zero-padded) so "Active days" and "Last used" derivations on the sidebar are trivial array filters. Frontend (app/web/templates/marketplace_plugin_detail.html): - New .hero-telemetry slot at the bottom of the hero meta column, between the pills row and the action buttons. Renders the four funnel segments — active · calls · trend · installed — joined by ` · `. No left/right split: the hero has space, so a single coherent metadata strip reads cleaner than the card's split layout. - Heroicons solid inlined (user / bolt / arrow-trending-up,-down / rectangle-stack) recoloured against the dark hero — icons in lighter tokens (mint #6ee7b7, peach #fdba74, cream #fde68a), trend pill keeps the saturated green/red because direction-coding earns its own colour. - Tooltip on installed reads "Currently installed by N users" — the count is a real-time net (drops on uninstall), and "currently" makes that explicit when a card shows 0. - fmtNum helper added so 1.2k / 14M renderings match the card's format exactly. - Sidebar swap: Invocations + Users rows removed, replaced by Active days → "N of 30" Last used → fmtRelative of the latest non-zero day Both derived from telemetry.daily_series — engagement consistency + recency, neither of which the hero chip exposes on its own. 
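The two sidebar derivations described above ("Active days" and "Last used", both computed as filters over a zero-padded 30-entry `daily_series`) could look roughly like the Python sketch below. The entry shape (`day` as an ISO date string, `count` as an integer) and every function name here are assumptions for illustration only; the real derivation lives in the frontend template and is not shown in this log.

```python
from datetime import date, timedelta
from typing import Optional


def zero_padded_series(counts_by_day: dict, today: date, days: int = 30) -> list:
    """Build a fixed-length series ending today; days without activity get count 0.

    Hypothetical helper: the entry shape {"day": ..., "count": ...} is assumed.
    """
    start = today - timedelta(days=days - 1)
    series = []
    for offset in range(days):
        current = start + timedelta(days=offset)
        series.append({"day": current.isoformat(), "count": counts_by_day.get(current, 0)})
    return series


def active_days(daily_series: list) -> int:
    """Number of days in the window with at least one invocation ("N of 30")."""
    return sum(1 for entry in daily_series if entry["count"] > 0)


def last_used(daily_series: list) -> Optional[date]:
    """Latest day with a non-zero count, or None when the window is empty."""
    used = [date.fromisoformat(entry["day"]) for entry in daily_series if entry["count"] > 0]
    return max(used) if used else None
```

Because the series always has a fixed length, neither derivation needs to handle missing days; an all-zero window yields `0` active days and no last-used date, which matches the rule of hiding both rows for brand-new items.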
* feat(item-detail): telemetry chip in hero for curated skill/agent Bring the funnel chip the plugin detail page got in 4cf38d40 to the curated inner skill/agent detail page — clicking through from the listing card now keeps the same metadata strip from grid to plugin page to inner item page. Backend (app/api/marketplace.py): - _load_inner_item_stats() rewritten: * always returns a dict (never None) so the frontend can decide chip visibility client-side, same contract as _build_telemetry * adds trend_pct, computed the same way as plugin level (recent_7 vs prior_7 from usage_marketplace_item_daily, ≥3 prior-week threshold) * adds daily_series (30 entries, zero-padded) so the sidebar can derive Active days + Last used - InnerDetailResponse.parent_stack_count — new field. Skills/agents don't have a per-item subscription model, so the hero shows the parent plugin's stack count under a "Plugin:" prefix. The funnel: "12 installed plugin → 2 actually use this skill". - curated_skill_detail + curated_agent_detail handlers load _load_curated_stack_counts() once and pass the parent's value. Frontend (app/web/templates/marketplace_item_detail.html): - New .item-detail .hero .hero-telemetry slot beneath the badges row. CSS mirrors plugin-detail's colour tokens (mint/peach/cream Heroicons solid + saturated trend pill) so the two surfaces read as one visual family. - Installed segment uses a "Plugin:" label rendered with reduced opacity to signal the metric describes the parent, not the item itself. Tooltip: "Parent plugin (<plugin_name>) currently installed by N users". - Sidebar Invocations + Users rows removed (chip carries them). Active days + Last used derived from telemetry.daily_series replace them; only rendered when activeDays > 0 so a brand-new skill doesn't show "0 of 30" / "Last used —". - "Type" row dropped from the sidebar — duplicates the hero badge. - fmtNum helper added (matches listing card + plugin detail). 
Plugin detail (app/web/templates/marketplace_plugin_detail.html): - Hero "Curator: …" line removed. The Details sidebar already carries that info; duplicating it under the h1 was visual noise. - Sidebar "Owner" row renamed to "Curator" — for curated plugins it's a person who curates inclusion in this Agnes instance, not the upstream code owner. "Owner" was a hold-over label. * feat(item-detail): unify hero with plugin detail — pills + breadcrumb + cleaner sidebar - Inner skill/agent hero now uses the same `.pills` / `.pill.cat / .curated / .flea / .muted` class names + CSS as the plugin detail page; the only item-only addition is `.pill.type` (Skill / Agent uppercase, plugin detail has no kind axis). - Hero `Updated` moved out of the meta-row into a muted pill (mirrors the plugin detail hero), removed from the Details sidebar to avoid duplication. - Details sidebar slimmed: dropped Marketplace, Path, Updated rows; Parent plugin now shows the curator-friendly display name (`parent_display_name \|\| manifest_name \|\| slug`) instead of the slug. - Breadcrumb extended to full path: Marketplace > <marketplace_name> > <plugin display name> > <self>, mirroring the plugin detail breadcrumb. - Backend: new `InnerDetailResponse.parent_display_name` field, populated via `_curated_plugin_enrichment` from marketplace-metadata.json — same source plugin detail hero already uses. * feat(marketplace): flea inner skill/agent detail + breadcrumb polish - Flea inner skill/agent detail page parity with curated: * GET /api/marketplace/flea/{id}/skill/{name} + /agent/{name} returning InnerDetailResponse (mirror of curated_skill_detail). * /marketplace/flea/{id}/skill\|agent/{name} web routes that render marketplace_item_detail.html with source='flea' + innerName context. * Frontend apiURL grows a third branch for flea-inner; breadcrumb grows to 4 segments (Marketplace > Flea Market > <plugin display name> > <self>) when innerName is set. 
* Telemetry attribution: MarketplaceItemLookup resolves <flea_plugin>:<inner> prefixes to (source='flea', parent_plugin=<plugin name>) so nested invocations land in the same rollups curated nested skills use. USAGE_PROCESSOR_VERSION bumped 5 -> 6 so the reprocess loop re-attributes historic events. - Breadcrumb 2nd segment is now a generic clickable "Curated Marketplace" / "Flea Market" link to /marketplace?tab=... instead of the opaque per-instance marketplace_name. Applied on both plugin detail and inner item detail. - Inner item hero telemetry chip works for both sources: installedCount branches on parent_stack_count (curated) vs install_count (flea), installed segment drops the "Plugin:" prefix for flea standalone / inner items. - Updated row dropped from Details sidebar on item detail — the hero pill already carries the value, sidebar row was duplicate. * feat(item-detail): block stack-install on flea inner items (mirror curated) Inner skills/agents nested inside a flea plugin can no longer be added to a user's stack on their own — adoption only happens at the plugin level, same rule curated nested items have followed since launch. - Hero action: when innerName is set (curated nested OR flea nested), render "Open parent plugin →" link + helper text instead of the install/remove buttons. Flea standalone entities (no innerName) keep the normal install UX. - Meta-row: same branch now serves curated + flea inner — "part of <parent plugin display name> · by <author>" with the parent link pointing at the right detail page per source. No API gate change needed: POST /api/store/entities/{id}/install only accepts existing entity ids (plugin-level), inner items have no entity id of their own so the endpoint cannot target them directly. * feat(marketplace): telemetry chip on inner cards + fix flea hero chip visibility Inner skill/agent cards on the plugin detail page now carry the same four-segment funnel chip the marketplace listing cards show (N active . N calls . 
trend . N installed), for both curated nested skills and flea nested skills. Plus two fixes that were keeping the hero chip hidden on flea plugin / flea inner detail pages. - Backend `_load_inner_items_stats_by_parent(conn, source, parent_plugin)` bulk loader: one query per plugin against usage_marketplace_item_window + one against _daily, returning {(name, type): stats}. Avoids N+1 per-card lookups. - `InnerItemSummary` gains invocations_30d / distinct_users_30d / trend_pct / parent_stack_count fields. `curated_detail` and `flea_detail` (in the entity.type=='plugin' branch) enrich the skills / agents lists after the existing cover-photo enrichment loop. - `marketplace_plugin_detail.html`: new `.plugin-detail .inner-card .inv-chip` CSS lifted from marketplace.html with the listing-card rules, new buildInnerCardChip() helper, buildCardSection appends the chip to each card body. Same gate as the listing card (hidden on parent_stack==0 && calls==0). - fix(flea): flea_detail forgot to populate PluginDetailResponse.stack_count from entity.install_count (listing card does this on line 851; detail endpoint didn't). Hero chip gate `stackCount===0 && calls===0` then always hid the chip even when the entity had installs. Now mirrors listing card semantics: stack_count == install_count for flea. - fix(flea inner): renderInnerHeroTelemetry was reading `d.install_count` for any non-curated source. InnerDetailResponse has no install_count field — it has parent_stack_count (populated server-side from the parent flea plugin's install_count). Gate + label now read parent_stack_count for both curated nested AND flea nested scenarios; install_count remains the flea standalone path. fix(marketplace): Owner label on flea + parent-centric sidebar for flea inner - Plugin detail Details sidebar — authorship row label now tracks the source: curated bundles get `Curator` (existing behaviour), flea bundles get `Owner`. 
The `owner_todo` reminder placeholder stays on the curated branch only; flea falls through silently. - Inner item detail Details sidebar — flea-inner (skill/agent nested inside a flea plugin) now shares the curated nested layout: Parent plugin / Bundle size / Active days / Last used / Owner. Drops the flea-standalone shape's `Category`, `Version`, `Installs`, `Released` rows that didn't apply to a nested item. Active days + Last used were already wired (telemetryRows) — they just weren't on the flea-inner branch. * fix(tests): bump SCHEMA_VERSION assertions 47 -> 48 post-rebase The marketplace telemetry migration was renamed _v46_to_v47 -> _v47_to_v48 during the rebase onto main (collision with #326 FTS BM25 migration that took the v47 slot). Two test files still asserted the pre-rebase value: - tests/test_home_stats.py::test_schema_version_constant_is_46 (CI red) - tests/test_schema_v46_migration.py::test_schema_version_is_46 Renames the helper fn name + bumps the assertion. The other two test files (test_db_schema_version.py, test_schema_v42_migration.py) were already updated in the rebase resolution. * fix(telemetry): _build_telemetry returns None when invocations_30d == 0 The follow-up commit that introduced the always-return-dict shape broke the test contract from the original v46 PR (commit b603e998): tests/test_marketplace_telemetry.py::TestDetailTelemetry:: test_detail_endpoint_telemetry_absent_when_no_data AssertionError: assert {'daily_series': [...], ...} is None Both `PluginDetailResponse.telemetry` and `InnerDetailResponse.telemetry` are declared `Optional[Dict] = None`, the frontend renders are None-safe (`d.telemetry \|\| {}` guard + `if (!d.telemetry \|\| ...)` on daily_series), so dropping the dict on zero activity is the cleaner default. 
* release: 0.54.21 — marketplace telemetry refactor (schema v48) + flea inner detail parity + listing UX polish --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com> Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>	2026-05-15 20:58:03 +02:00
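The prefix-split attribution that commit 302cf58ccd describes (split the identifier on `:`, match the prefix against preloaded plugin and entity names, and route the `agnes-store-bundle` prefix to flea entities) can be sketched in Python as follows. This is a minimal illustration, not the project's `MarketplaceItemLookup`: the class name, the `Resolved` record, the `'curated'` source label, and the choice to leave bare names unattributed are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Prefix named in the commit message; it routes to flea entities.
FLEA_BUNDLE_PREFIX = "agnes-store-bundle"


@dataclass(frozen=True)
class Resolved:
    source: str                   # 'curated' (assumed label) or 'flea'
    parent_plugin: Optional[str]  # None for standalone flea entities
    name: str                     # local skill / agent / command name


class MarketplaceItemLookupSketch:
    """Hypothetical in-memory lookup, preloaded once per processing tick."""

    def __init__(self, curated_plugins: set, flea_plugins: set):
        self._curated = set(curated_plugins)
        self._flea = set(flea_plugins)

    def resolve(self, identifier: str) -> Optional[Resolved]:
        prefix, sep, local = identifier.partition(":")
        if not sep or not local:
            # Assumption: identifiers without a '<plugin>:' prefix stay unattributed.
            return None
        if prefix == FLEA_BUNDLE_PREFIX:
            return Resolved("flea", None, local)
        if prefix in self._curated:
            return Resolved("curated", prefix, local)
        if prefix in self._flea:
            # Nested flea item: attribute to the parent flea plugin.
            return Resolved("flea", prefix, local)
        return None
```

Keying on the prefix rather than the bare skill name addresses the mismatch the commit calls out: the JSONL carries `<plugin>:<local>`, so a lookup keyed on `<local>` alone never matches.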
ZdenekSrotyr	a48524509a	docs: consolidate and de-clutter the documentation tree (#306) CLAUDE.md rewritten (708 -> ~320 lines): four overlapping release sections collapsed to one, stale v1->v35 schema history dropped (it lives in CHANGELOG), marketplace endpoint internals and verbose process sections moved out or tightened. New focused docs: - docs/RELEASING.md - release process, deploy workflows, CI quirks (RELEASE_TEMPLATE.md folded in as an appendix) - docs/marketplace.md - marketplace ingestion + re-serving internals - docs/README.md - documentation index by audience, linked from README.md and CLAUDE.md Archived under docs/archive/: docs/superpowers/ (52 historical planning artifacts), HACKATHON.md, pd-ps-comments.md, security-audit-2026-04.md, future/NOTIFICATIONS.md. Removed the docs/auto-install.md stub. Fixed dangling links in connectors/jira/README.md and dev_docs/README.md, repointed code/doc references to archived paths.	2026-05-14 18:54:22 +00:00
ZdenekSrotyr	3e19caa975	fix(security): RBAC filter uses stable user_id instead of mutable email local-part (#293) (#299) * fix(security): RBAC filter for agnes_sessions matches both email local-part and user_id The upload API (POST /api/upload/sessions) stores session files under user_sessions/{user_id}/ (UUID), while the session collector uses the OS username (email local-part). The session pipeline writes the directory name verbatim into usage_session_summary.username, so the column can contain either value depending on the ingestion path. The RBAC filter in build_filter_clause previously only matched the email local-part, missing sessions uploaded via the API. The fix adds an OR condition so non-admin users see rows where username matches either their email local-part or their user_id. Closes #293 Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix(security): RBAC filter uses stable user_id instead of mutable email local-part Closes #293 Previous fix used OR condition matching both email local-part and user_id in the username column. This was fragile: email changes would break filtering. This commit introduces a dedicated user_id column populated by the session pipeline via resolve_user_id(), and switches the RBAC filter to use it exclusively. 
Changes: - Schema v45: add user_id column to usage_session_summary and usage_events - UsageProcessor: accept and store user_id in both tables - runner.py: resolve_user_id() maps directory name to users.id UUID (exact match for UUID dirs, email LIKE for local-part dirs) - INTERNAL_TABLES: agnes_sessions/agnes_telemetry filter on user_id column - build_filter_clause: simplified to WHERE user_id = '<uuid>' (no OR) - me.py/admin_user_sessions.py: query by user_id OR username for backward compatibility during transition - USAGE_PROCESSOR_VERSION bumped 2→3 to trigger reprocessing/backfill - Tests updated: 27 pass including new email-change resilience test Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix(tests): bump schema version assertions 44→45 Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix(docs): correct resolve_user_id docstring, add TypeError comment Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix(security): address review — backward-compat OR, LIKE escaping, narrower TypeError Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix(security): address code review — eliminate TypeError hack, add resolve_user_id tests Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix(db): create user_id indexes in _v44_to_v45, not _SYSTEM_SCHEMA _SYSTEM_SCHEMA runs before the migration ladder. On an upgrade from v42/v43/v44, usage_events / usage_session_summary already exist without the user_id column (CREATE TABLE IF NOT EXISTS is a no-op), so the CREATE INDEX ... (user_id) lines in _SYSTEM_SCHEMA failed to bind and aborted _ensure_schema — the app would not start post-upgrade. Move the index creation to _v44_to_v45, which ADDs the column first. Same pattern as the v41 audit_log indices. * fix(usage): bump USAGE_PROCESSOR_VERSION 3→4 for user_id backfill #303 shipped USAGE_PROCESSOR_VERSION=3 (release 0.54.12) for its <command-name> slash extraction. 
This PR's 2→3 bump collided with it on rebase, so the reprocess loop would not re-trigger to backfill the new user_id column on deployments already running v3. Bump to 4. * release: 0.54.13 — RBAC filter uses stable user_id (#293) --------- Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-14 14:12:54 +00:00
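The directory-name resolution this commit describes (exact match for UUID-named directories, an escaped email `LIKE` for local-part directories) might be sketched in Python as below. This is not the project's `resolve_user_id`: the function names, the `users(id, email)` table shape, the rule of returning `None` on ambiguous matches, and the use of the standard-library `sqlite3` module (the project itself uses DuckDB) are all assumptions chosen to keep the example self-contained.

```python
import sqlite3
import uuid
from typing import Optional


def _escape_like(value: str) -> str:
    """Escape LIKE wildcards so '_' and '%' in a local-part match literally."""
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def _is_uuid(value: str) -> bool:
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def resolve_user_id_sketch(conn: sqlite3.Connection, dirname: str) -> Optional[str]:
    """Map a session directory name to a users.id value, or None if unresolved."""
    if _is_uuid(dirname):
        # Upload-API path: the directory is already named after the user id.
        row = conn.execute("SELECT id FROM users WHERE id = ?", (dirname,)).fetchone()
        return row[0] if row else None
    # Collector path: the directory is the email local-part.
    pattern = _escape_like(dirname) + "@%"
    rows = conn.execute(
        "SELECT id FROM users WHERE email LIKE ? ESCAPE '\\'", (pattern,)
    ).fetchall()
    # Assumption: refuse to guess when zero or several users match.
    return rows[0][0] if len(rows) == 1 else None
```

Without the escaping step, a local-part such as `john_doe` would also match `johnXdoe@...`, because `_` is a single-character wildcard in `LIKE`; that is the kind of false match the "LIKE escaping" review fix guards against.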
minasarustamyan	f53e98d5a3	fix(usage): extract <command-name> slash invocations (release 0.54.12) (#303) UsageProcessor missed every user-typed slash invocation because the SLASH_RE regex (^\s*/<name>) expected a raw "/foo" prefix that Claude Code never writes. Real session jsonls wrap slash commands in a <command-name>/foo</command-name> XML tag inside user message content. Result on production: usage_events.command_name and usage_session_summary.slash_commands stayed NULL/0 for /clear, /exit, plugin commands like /plugin:name; verified on 17 dev-VM jsonls holding 25 <command-name> tags / 0 extracted rows. Replaces SLASH_RE with COMMAND_NAME_RE that searches for the tag anywhere in the user text (the tag sits after a <command-message> sibling). USAGE_PROCESSOR_VERSION bumps 2 → 3; operators wanting to rewrite historical rows under the new logic call POST /api/admin/usage/reprocess (agnes admin telemetry reprocess). Fixtures slash_command.jsonl, mixed.jsonl, skill_curated.jsonl rewritten from the unrealistic "/foo args" string format to the real <command-name> tag wrapper; existing assertions stay green against the new format, which is the regression baseline going forward. Adds TestCommandNameTagExtraction (4 unit tests on iter_events) covering string content, list-of-text-blocks content, mid-text tag position, and plain-prose "/x" non-match. Implicit Skill tool_use extraction (LLM-decided invocations) unchanged. Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>	2026-05-14 13:33:57 +00:00
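The tag-based extraction this commit describes (search for `<command-name>/foo</command-name>` anywhere in the user text, for both string content and list-of-text-blocks content) can be illustrated with the Python sketch below. The pattern and function name are hypothetical stand-ins; the actual `COMMAND_NAME_RE` is not quoted in this log, and the text-block shape (`{"type": "text", "text": ...}`) is an assumption.

```python
import re
from typing import Union

# Hypothetical pattern: captures the command name after the leading slash.
COMMAND_NAME_RE_SKETCH = re.compile(r"<command-name>\s*/([^<\s]+)\s*</command-name>")


def extract_command_names(content: Union[str, list]) -> list:
    """Return slash-command names found in a user message's content.

    Accepts either a plain string or a list of content blocks; only blocks
    shaped like {"type": "text", "text": "..."} are inspected (assumed shape).
    """
    if isinstance(content, str):
        text = content
    else:
        text = "\n".join(
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        )
    # search-anywhere semantics: the tag may follow a <command-message> sibling
    return COMMAND_NAME_RE_SKETCH.findall(text)
```

Because the pattern requires the surrounding tag, a plain-prose mention such as "see /x for details" yields no match, which mirrors the non-match case the commit's unit tests cover.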
Vojtech	37ad39c8a3	feat(home): status frame on /home (operator-gated, onboarded-only) (#297 ) * feat(home): status frame on /home — last sync, sessions, prompts, tokens, projects Adds the homepage status frame: a 5-card row above the install-hero / offboard-strip on /home showing the calling user's Last sync (their last `agnes pull`), Sessions, Prompts, Tokens used, and Projects worked on, with a 24h/7d pill toggle. Backed by `GET /api/me/home-stats?window=` (one DuckDB CTE joining `users` + `usage_session_summary` + `usage_events`) and SSR'd from the same `compute_home_stats` helper on initial paint so there's no spinner. The window toggle is the only JS-driven path. Side surfaces: - `GET /api/sync/manifest` now stamps `users.last_pull_at` so `agnes pull` (and the Claude Code SessionStart hook that wraps it) imprints the analyst's last sync time for the new card. - `usage_session_summary` gains four BIGINT token counters (input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens) summed from JSONL `message.usage.` per assistant turn. - `USAGE_PROCESSOR_VERSION` bumps 1 → 2 so the session-pipeline reprocess loop invalidates stale summaries and backfills tokens on the next tick. Schema migration v43 → v44 is idempotent ALTERs (last_pull_at + 4 token columns) — fresh installs receive them from `_SYSTEM_SCHEMA`, upgrade path runs `_v43_to_v44`. Defaults (NULL / 0) backfill existing rows cleanly. 9 new tests in tests/test_home_stats.py cover the migration, endpoint shapes (24h/7d/unknown/empty/missing-user), and the manifest-side last_pull_at bump. docs(CHANGELOG): homepage status frame entries under [Unreleased] The post-rebase release-cut now belongs to whichever PR lands next after main rolled to 0.54.9. This PR logs its bullets under [Unreleased] (Added: homepage status frame, per-user pull tracking, token counters; Changed: schema v43 → v44 migration) so they ride out with the next release-cut. 
* fix(tests): bump test_schema_v42_migration asserts to v44 CI failed because tests/test_schema_v42_migration.py hardcoded `assert SCHEMA_VERSION == 43` and `assert v == 43` after init. v44 (homepage stats frame backing columns) was introduced in the preceding feat commit; this aligns the existing v42-era migration tests with the new schema version. * feat(home): gate status frame on operator flag + user.onboarded Two gates on the homepage status frame: 1. Operator master switch — `get_home_status_frame_visibility()` in app/instance_config.py mirrors the existing `get_home_automode_visibility()` shape: env var `AGNES_HOME_SHOW_STATUS_FRAME` > yaml `instance.home.show_status_frame` > default `True`. Cautious-rollout instances can disable the frame without forking; the yaml example documents both knobs. 2. Onboarded gate — the template only renders the frame when the caller's `users.onboarded` is true. First-day users see a clean install-hero before all-zero stat cards; the frame appears automatically on the next render after `agnes init` POSTs `/api/me/onboarded`. Router skips the `compute_home_stats` DB read entirely when either gate is closed; `home_stats` arrives at the template as None in that branch and the `{% if %}` shortcuts the include. Why both gates: PostHog feature flags evaluated and rejected — this codebase uses PostHog for analytics capture only, not feature gating; adding a per-user feature_enabled() call on the /home critical path would couple the homepage render to a remote eval and still require an admin master switch. The onboarded gate is a UX coherence rule layered on top of the operator switch, not an A/B test signal. 3 new tests in test_home_stats.py cover the env-var resolution (falsey values + default-true). The yaml example gets a `home:` block documenting both `show_automode` (pre-existing flag, was undocumented in the example) and `show_status_frame`.	2026-05-14 09:28:47 +00:00
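The two gates described in this commit (operator switch resolved as env var `AGNES_HOME_SHOW_STATUS_FRAME` > yaml `instance.home.show_status_frame` > default `True`, combined with the `users.onboarded` check) could be sketched in Python as follows. This is not the code in `app/instance_config.py`: the function names, the exact set of falsey strings, and passing the parsed yaml as a plain dict are assumptions for illustration.

```python
import os
from typing import Any, Mapping, Optional

# Assumption: which env-var strings count as "off"; the real set is not shown in the log.
_FALSEY = {"0", "false", "no", "off"}


def status_frame_visible(
    config: Mapping[str, Any], env: Optional[Mapping[str, str]] = None
) -> bool:
    """Resolve the operator switch: env var first, then yaml, then default True."""
    env = os.environ if env is None else env
    raw = env.get("AGNES_HOME_SHOW_STATUS_FRAME")
    if raw is not None:
        return raw.strip().lower() not in _FALSEY
    home = (config.get("instance") or {}).get("home") or {}
    value = home.get("show_status_frame")
    if value is not None:
        return bool(value)
    return True


def should_render_status_frame(
    config: Mapping[str, Any], user_onboarded: bool, env: Optional[Mapping[str, str]] = None
) -> bool:
    """Both gates must be open; callers can skip the stats query when this is False."""
    return status_frame_visible(config, env) and bool(user_onboarded)
```

Evaluating the gates before touching the database matches the behaviour the commit describes, where the router skips the `compute_home_stats` read whenever either gate is closed.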
ZdenekSrotyr	b4d3c576af	Activity Center: audit log + telemetry + sessions + agnes_* tables (#278 ) * docs(spec): admin observability spec + Activity Center MVP plan Parent spec (480 lines) + executable plan (2295 lines, 14 TDD tasks). Covers Activity Center rebuild (/admin/activity), with /admin/sessions and /admin/feedback deferred to follow-up plans. Already incorporates reviewer-pass revisions across three angles (security, production resilience, code architecture): - _get_db import path corrected to app.auth.dependencies - Test fixtures aligned with seeded_app / admin_user / get_system_db - All new audit writes wrapped in try/except + logger.exception - Filename sanitization on session uploads - DuckDB DESC index behavior documented; upgrade window flagged - Migration idempotency + evolved-DB test cases - reveal_raw + shared-cache multi-worker explicitly deferred Targets schema v40 (audit_log gains params_before, client_ip, client_kind, correlation_id + 3 indices). * feat(db): schema v40 — audit_log gains params_before, client_ip, client_kind, correlation_id + 3 indices * chore(test): clean up Task 1 — drop unused import, rename stale test * feat(audit): AuditRepository.log() accepts params_before/client_ip/client_kind/correlation_id * test(audit): strengthen params_before assertion to round-trip JSON content * feat(audit): AuditRepository.query() rich filters + keyset cursor pagination * feat(sync): SyncStateRepository.list_recent() cross-table feed * feat(audit): POST /api/sync/trigger writes audit_log row * feat(audit): POST /api/scripts/run-due writes audit_log row * feat(audit): POST /api/upload/sessions writes audit_log row + sanitizes filename * feat(audit): GET /api/data/{table_id}/download writes audit_log row * feat(activity): /api/admin/activity timeline + /health + /sync endpoints * feat(ui): /admin/activity rebuilt — health pulse, timeline, sync grid; /activity-center → 308 redirect BREAKING: removed demo executive-pulse / maturity-roadmap content 
from activity_center.html. The page now reflects real audit_log + sync_history data. * feat(ui): admin nav + dashboard widget point at /admin/activity * feat(activity): recursive-audit suppression for AC read endpoints (60s window per actor+filter) * feat(activity): emit PostHog events when integration enabled (no-op default) * fix(audit): move v40 indices out of _SYSTEM_SCHEMA + update test_repositories to unpack query() tuple _SYSTEM_SCHEMA CREATE INDEX on audit_log(timestamp) failed when migration tests hand-roll a bare audit_log (id, action) without the timestamp column. Fix: remove indices from _SYSTEM_SCHEMA; add ADD COLUMN IF NOT EXISTS guards for timestamp and other pre-v40 columns in _v39_to_v40() so the upgrade path is safe on any hand-rolled schema; call _v39_to_v40 explicitly in the fresh-install (current==0) path to restore index creation there. Also unpack the (rows, next_cursor) tuple from AuditRepository.query() in the three TestAuditRepository tests that still treated it as a list. * docs: CHANGELOG entry for Activity Center MVP * chore: refresh stale module docstring in app/api/activity.py * feat(cli): agnes admin activity — terminal access to Activity Center (timeline + health + sync) * fix(db): _v39_to_v40 — add IF NOT EXISTS guard for 'action' column The v39→v40 ladder step adds defensive ADD COLUMN IF NOT EXISTS for every audit_log column so a hand-rolled bare audit_log (id only) is safe through the ladder. 'action' was missing from the guard list, causing CREATE INDEX idx_audit_action_time to fail on tests that stub audit_log with only an id column (tests/test_e2e_extract.py:: TestSchemaMigration::test_migration_preserves_and_extends). Local 6/6 schema tests + the previously-failing CI test pass. 
* docs(spec): platform telemetry epic — Boss directive + Activity Monitoring plan rebased onto v40 (stacked on zs/spec-activity-center) * feat(db): schema v41 — 7 usage_* tables for telemetry (events, summary, rollups, attribution) * chore(db): tighten v41 — usage_session_summary.session_id NOT NULL + upgrade test asserts all 7 tables * feat(usage): UsageAttributionRepository — replace/delete/lookup over usage_attribution_* tables * refactor(marketplace): extract list_inner_skills/agents/commands to src/marketplace_listing.py for reuse * feat(usage): explode plugin attribution on marketplace sync + store entity write; backfill script * refactor(marketplace): finish src/marketplace_listing.py extraction — drop duplicate _list_inner_* + _parse_frontmatter from app/api/marketplace.py * feat(usage): promote attribution helpers to src/usage_attribution_helpers.py; hook update_entity rename + bundle-swap; clarify best-effort semantics * feat(usage): UsageProcessor real extraction + rollup rebuild + 10 fixture-driven tests * fix(usage): include tool_id in event hash + executemany + rollup transaction (critical multi-tool-turn drop fix) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(marketplace): popularity stats — invocations_30d + trend + sort=most_used\|trending + Most Popular section * feat(admin): /admin/users/<id> Sessions section — list + single-file + bulk-zip downloads (audit-logged) * feat(usage): admin export endpoint + CLI — csv/json/parquet streaming, filters, audit-logged * feat(usage): agnes admin ask — LLM Text-to-SQL over usage_events with SELECT-only validator (audit-logged) * feat(usage): reprocess + prune endpoints + scheduler daily prune job + CLI * docs: PLATFORM_SETUP.md operator playbook + HOWTO/ cookbook (5 guides + index) Adds docs/PLATFORM_SETUP.md as a consolidated operator playbook covering bootstrap, TLS, marketplaces (curated + flea), scheduler env vars, telemetry extraction/export/ask/prune, privacy posture, and daily 
routine. Adds docs/HOWTO/ with 5 analyst cookbook guides: first query, snapshots for remote tables, private sessions, feedback + admin ask, and customizing skills. Existing setup docs (QUICKSTART, DEPLOYMENT, ONBOARDING, HEADLESS_USAGE) get a one-line cross-reference at the top pointing to PLATFORM_SETUP.md. * docs(changelog): platform telemetry epic — usage_* foundation + surfaces + admin access + docs Comprehensive [Unreleased] entry covering: usage_events/session_summary/ tool_daily/plugin_daily tables (v41), attribution lookup tables, backfill script, marketplace Most Popular + invocation chips + sort, admin Sessions section, export/ask/reprocess/prune endpoints + CLI mirrors, Activity Center (v40), PLATFORM_SETUP.md + HOWTO/ docs, and operations notes for v41 upgrade. * fix(security): block DuckDB read_/http_/glob functions in usage_ask validator + symlink escape guard in session zip + clarify mark-private semantics * fix(admin): parquet export tempfile cleanup on COPY failure + correct processed-first sort on /admin/users/<id>/sessions * feat(audit): close 8 production audit gaps — query (local/remote/hybrid), catalog/schema/sample, snapshot estimate/create, check-access * feat(ui): /admin/usage summary dashboard + per-user activity tab on /admin/users/<id> * fix(audit): cap error messages at 200 chars + audit user_activity reads + recursion guard on usage.summary * fix(audit): catalog.list audits on error path + clean up deferred json import * fix(ux): client_kind=cli for PAT auth + timeline empty state + email-instead-of-uuid + nav reorder + help text + loading indicators + ask doc * feat(observability): unify /admin/activity into single page with saved views - KPI cards (events, users, error rate, p95) clickable as quick-filters - Faceted filter dropdowns populated from audit_log in the current window - Sortable audit table, cursor pagination, per-row JSON side panel - Saved views (schema v43: user_observability_views) — per-user state - Top bar: window 
selector + 30s Live toggle + saved views dropdown - /admin/scheduler-runs → 308 redirect (source=scheduler filter) - New endpoints: /api/admin/observability/{facets,kpis,views} * test: update activity + scheduler-runs tests for unified page - test_admin_activity_page_renders asserts new structural anchors - test_admin_scheduler_runs_page_admin_only asserts 308 redirect * fix(observability): respect [hidden] on modal + side panel CSS `display: flex` on .obs-modal beat the [hidden] attribute's UA display:none, so the save-view modal rendered on page load and Cancel clicks couldn't dismiss it. Gate the modal's flex layout on :not([hidden]); add the same display:none guard prophylactically to .obs-panel and .obs-views-panel. * feat(observability): user enrichment in audit + interactive /admin/usage Activity: - /api/admin/activity now joins users for user_email + user_name per row - User column renders "name (id-prefix)" or "email (id-prefix)" instead of an opaque truncated UUID; falls back to id when the user record is missing Usage: - /admin/usage rewritten as the same filter/group-by/search pattern as /admin/activity. 
Faceted dropdowns (User / Tool / Source / Event type) populated from usage_events; debounced free-text search across tool_name / skill_name / subagent_type / command_name - New endpoints /api/admin/usage/{facets,kpis,query}; the query endpoint supports group_by in {day, username, tool_name, source, ref_id} with sort + offset pagination, plus an ungrouped raw-events mode - 4 KPI cards (events, distinct users, distinct tools, error rate) are clickable quick-filters; clicking a grouped row applies the bucket as a filter - Old static `?window=7d\|30d\|all` server preload removed; all state is client-side via since_minutes + group_by + filters in the URL * fix(observability): clearer labels, all-column sort, drop saved views UI - Rename page titles: "Activity" → "Server activity", "Usage" → "Tool usage" with a one-line subtitle on each explaining what the page covers and linking the other one. The two pages source different data (audit_log vs usage_events) and the previous labels conflated them. - Drop the saved-views dropdown + save modal from /admin/activity. The modal pop-open bug was the trigger; the value wasn't there yet. The /api/admin/observability/views CRUD + DuckDB table stay in place. - Rename "Live (30s)" to "Auto-refresh (30s)" with a tooltip clarifying that it's the re-fetch rate, not the time range. Time range now labeled "Time range" instead of "Window". - All audit-table columns are sortable (User, Source, Action, Resource, Result added); sort is page-local with a Jinja comment explaining the trade-off. Same for raw usage rows. - Fix duplicate sort-arrow bug — the literal "▼" in the Time th HTML was rendering alongside the CSS ::before arrow. Removed the literal; CSS is the single source of truth. * feat(observability): global Sessions browser + transcript viewer + CLI Web: - /admin/sessions — list every collected session JSONL across all users with time-range, user, model, errors-only and free-text filters. 
Default sort surfaces error-heavy sessions first. KPI cards (sessions, distinct users, sessions w/ errors, tool error rate) clickable as quick-filters. - /admin/sessions/<username>/<file> — transcript viewer rendering the JSONL chronologically: user prompts, assistant text, tool calls (with JSON input) and tool results (with flattened output). Errors get a red border + chip and a "Next error" navigation button at the top. - Admin dropdown gains a "Sessions" link. API: - GET /api/admin/sessions/{list,kpis,facets} — filtered cross-user reads off usage_session_summary - GET /api/admin/sessions/{username}/{file}/transcript — parses JSONL via the existing services.session_pipeline.lib, returns chronological events - GET /api/admin/sessions/{username}/{file}/download — JSONL stream, same path-safety guards as the per-user endpoint, audit-logged CLI: - `agnes admin sessions list [--user X] [--errors] [--since 7d]` — table output with `!` prefix on rows that hit a tool error - `agnes admin sessions show <username> <file>` — transcript dump, with `--errors` to print only the failed tool_result blocks - `agnes admin sessions download <username> <file> [-o path]` - `agnes admin sessions kpis` — top-level numbers * feat(internal): expose telemetry tables to agnes query with row-level RBAC Three new registered tables backed by system.duckdb, queryable through the same /api/query plumbing analysts use for Keboola / BigQuery / local sources: agnes_sessions → usage_session_summary (filter: username) agnes_usage → usage_events (filter: username) agnes_audit → audit_log (filter: user_id) RBAC is per-row, not per-table: admins see every user's rows; non-admins see only their own. The filter is built server-side from the auth user dict; non-admin filter values are regex-validated before SQL interpolation. 
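A minimal sketch of the per-row filter described above: admins get no restriction, non-admins get a clause on their own identity, and the value is regex-validated before it is interpolated into SQL. The helper name `build_row_filter`, the regex, and the user-dict keys (`username`, `id`) are assumptions for illustration; the actual logic lives in connectors/internal/access and may differ.

```python
import re

# Assumed validation pattern; anything outside this character set is rejected.
_SAFE_VALUE_RE = re.compile(r"[A-Za-z0-9._@-]+")

# Alias -> filter column, as listed in the commit message above.
FILTER_COLUMN = {
    "agnes_sessions": "username",
    "agnes_usage": "username",
    "agnes_audit": "user_id",
}


def build_row_filter(alias, user, is_admin):
    """Return a WHERE-clause fragment that scopes `alias` to the caller's rows."""
    if is_admin:
        return "TRUE"  # admins see every user's rows

    column = FILTER_COLUMN[alias]
    value = user.get("id") if column == "user_id" else user.get("username")

    # Validate before interpolation: quotes, spaces, and semicolons never pass.
    if not value or not _SAFE_VALUE_RE.fullmatch(value):
        raise ValueError(f"unsafe filter value for {alias}")

    return f"{column} = '{value}'"
```

`fullmatch` is used instead of `match` with a `$` anchor so that a trailing newline cannot sneak past the check.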
Implementation: - new connector connectors/internal/ with access (filter+exec) + registry (idempotent table_registry seed at startup) - /api/query detects internal table refs and short-circuits to a CTE wrapper that prepends "WITH agnes_x AS (SELECT * FROM <src> WHERE …), …" then "SELECT * FROM (<user_sql>) AS _q". DuckDB cursor on the shared system.duckdb handle — opening parallel handles / ATTACH on the same file is blocked process-wide. - mixing internal + BQ / registered local tables in one SELECT is rejected (v1 limitation) - src.rbac.can_access_table waves internal tables through for all authenticated users; row scoping is the actual security control - /api/v2/schema and /api/v2/sample gained internal branches; sample intentionally skips its cache because rows are RBAC-scoped per caller - audit row written as action='query.internal' with is_admin flag Tests: connectors/internal/access — RBAC, filter clause, schema, CTE wrapper coexistence with user-supplied aggregations, unsafe-username rejection. 16/16 passing. Motivating queries this enables: SELECT tool_name, COUNT(*) FROM agnes_usage WHERE is_error GROUP BY 1 ORDER BY 2 DESC -- analyst self-introspection: which tools fail for me? SELECT user_id, COUNT(*) FROM agnes_audit WHERE action = 'session.transcript_view' GROUP BY 1 -- admin: who's been looking at whose session transcripts? * feat(admin): group dropdown into 5 named sections + internal tables in /catalog Admin dropdown gains section headers so admins can land on the right page without re-reading the full menu: Activity Center Server activity / Tool usage / Sessions Users & Access Users / Groups / Resource access / Tokens Data Tables Agent Experience Curated Marketplaces / Flea Submissions / Agent Setup Prompt / Agent Workspace Prompt Server Server config "Agent Experience" frames the curated content + prompts as one cluster — it's all admin-controlled material that shapes what an analyst's AI agent encounters.
"Configuration" → "Server" since only one item lives there now. Renamed the section's first two items: "Activity" → "Server activity" (matches page H1) "Usage" → "Tool usage" Also fixes /catalog visibility of the internal tables (agnes_sessions / _usage / _audit) for non-admin users: ``app.auth.access.can_access`` short-circuits to True for resource_type='table' + an internal-table id. Without this, non-admins saw the tables in /api/v2/catalog (which uses the same RBAC bypass) but not on the /catalog HTML page (which calls can_access directly, requiring a resource_grants row internal tables don't have). CSS for `.app-nav-menu-section`: small caps, muted, non-clickable; first section trims top padding so the panel doesn't open with an awkward gap. * refactor(admin): move corporate memory into Admin > Agent Experience Memory link was the only admin-only entry in the primary nav (gated by session.user.is_admin). Moves it into the Admin dropdown under Agent Experience, alongside Curated Marketplaces / Flea Submissions / Prompts — all admin-curated content that shapes what an analyst's AI agent encounters. Renamed the nav label to "Shared Knowledge" to match what the page actually is (admin-curated organisational knowledge from session verification, surfaced to agents). URL stays at /corporate-memory; the route still gates on require_admin per the existing comment. Side effect: primary nav (Home / Marketplace / Data Packages) is now uniform for every authenticated user — no conditional admin-only entry. 
* ui: rename admin entries to Curated Knowledge / Init Prompt / Workspace Prompt - "Shared Knowledge" → "Curated Knowledge" (parallel with "Curated Marketplaces" in the same Agent Experience section; "curated" tells the admin what they do there — review + approve) - "Agent Setup Prompt" → "Init Prompt" (matches the `agnes init` flow it actually drives) - "Agent Workspace Prompt" → "Workspace Prompt" (the "Agent" prefix was redundant — every item in the section is agent-facing) Renames page titles + H1s on /admin/agent-prompt and /admin/workspace-prompt to match. * refactor: rename Usage → Telemetry across user-facing surfaces External surfaces all switch; internal Python module / file names and the physical DB tables (usage_events, usage_session_summary, usage_tool_daily, usage_plugin_daily) stay — renaming them would force a schema migration + a redo of the LLM Text-to-SQL prompt for no analyst-visible win. Changes: - Admin dropdown: "Tool usage" → "Telemetry" - Page H1 / <title>: same - URL: /admin/usage → /admin/telemetry; old URL 308-redirects - API prefix: /api/admin/usage/* → /api/admin/telemetry/* - CLI: primary command `agnes admin telemetry …`; `agnes admin usage` kept as a deprecated alias so existing operator scripts keep working - Internal data-source table id: agnes_usage → agnes_telemetry. The registry seed now evicts any stale internal-source row whose id no longer matches INTERNAL_TABLES, so the old `agnes_usage` row is removed from table_registry on next app boot - All tests + JS endpoint paths updated * test(rbac): include auto-appended internal tables in expectations get_accessible_tables now appends agnes_sessions / agnes_telemetry / agnes_audit to every authenticated user's accessible-tables list so the internal data source shows up in /catalog. The two existing rbac tests asserted hardcoded list shapes that pre-dated the change. 
Rewritten to assert "granted tables + the canonical internal-table set" instead of literal lists, so the test stays correct if the internal table roster changes again later. * ui: visual dividers between admin-dropdown sections Adds a 1px top border + 6px top margin to every section header except the first, so the five named groups (Activity Center, Users & Access, Data, Agent Experience, Server) read as visually separated clusters. The header itself stays small-caps + muted as before — the border is additive. * ui(memory): match obs-topbar visual on /corporate-memory The Curated Knowledge page (linked from the admin dropdown's Agent Experience section) opened straight into the stats bar — no title, no subtitle, no shared chrome with the other admin pages. Adds an obs-topbar-style header at the top of .container-memory: - H1 "Curated Knowledge" - subtitle explaining what the page is + how AI agents pull from it The `.ck-` class set duplicates the inline obs- styles from /admin/activity etc. for this one page; promoting the obs-* class set to style-custom.css for shared reuse is the obvious next step (4 pages already inline the same CSS), tracked as a follow-up. Page <title> also renamed from "Corporate Memory" → "Curated Knowledge". * ui(tables): list Agnes internal tables in /admin/tables + group in /catalog /admin/tables previously rendered three per-source-type listings (BQ / Keboola / Jira) and dropped any row whose source_type didn't match — so the agnes_sessions / agnes_telemetry / agnes_audit rows seeded into table_registry were invisible. Adds a fourth read-only section "Agnes internal tables" that filters source_type === 'internal' and renders the same registry-table layout the other sections use, with two changes: - no Register button (these rows are seeded on every app boot from connectors/internal/registry.py) - Edit + Delete actions hidden (any change would be reverted on the next start). Manage access stays so admins can still inspect. 
Mode badge picks up a new mode-internal CSS class (teal accent) so the display doesn't lie and call it "local". In /catalog, internal tables now group under an "agnes" accordion section (bucket="agnes" on seed) instead of falling into the catch-all "default". Single source of truth for which tables exist; admins find them where they expect. * ui(tables): Agnes internal as a 4th tab next to BQ/Keboola/Jira Previous iteration mounted the internal-table listing as a separate standalone card under the tab strip. Reshapes it to a proper tab-content section so admins switch between data sources via one consistent nav (BigQuery / Keboola / Jira / Agnes internal). - New tab button "Agnes internal" in the tab-nav. - The listing card becomes <section id="tab-content-internal" class="tab-content">; switchTab() already routes by id so no JS change beyond extending the hash allowlist for direct #internal links. - Tab content keeps the read-only treatment from the previous commit (no Register button, no Edit / Delete in renderRegistryListing). * ui: rename Curated Knowledge → Curated Memory Settles the naming back on "Curated Memory" — parallel structure with "Curated Marketplaces" in the same Agent Experience section, and zero rename ripple: URL (/corporate-memory), API (/api/memory/), CLI (agnes admin memory), and Python modules all stay on "memory" so the admin label finally lines up with the underlying surfaces. The "Curated" prefix still tells admins what they do on the page (review pending → approve / mandate / reject) and reads as a sibling of "Curated Marketplaces" right next to it in the dropdown. Touches: admin dropdown label, page <title>, page H1. DB tables stay on knowledge_ (already the canonical naming for the data shape). * ui: rename "Server activity" → "Audit log" "Audit log" is what the page actually is — server-side audit_log table rendered with KPI cards + filter bar + sortable table. 
The "Server activity" label confused the term with Claude Code session telemetry (Telemetry page) and didn't make the source/concept clear. Touches: - Admin dropdown nav label - /admin/activity page H1 + subtitle - /admin/telemetry subtitle cross-link - test_activity_api page-renders assertion URL (/admin/activity) and API (/api/admin/activity/) stay — the "activity" name has stuck at the route layer for a year; rerouting those would churn dashboards/bookmarks for zero analyst-visible win. ui(admin-nav): gray band on each section header for clearer separation Previous iteration used a 1px top border between section labels — the labels still blended into the items above/below at a glance. Switches to a light gray background band per section header, extended edge-to- edge inside the panel via negative horizontal margins. Bolder font-weight (700) reinforces the separation; bumping the font color isn't needed because the band itself does the work. First section's header tucks into the panel's top border-radius so the band reaches the corners without a gap. * ui(catalog): rename internal-table category to "Agnes Internal" `bucket` is what /catalog renders as the accordion category header verbatim — "agnes" lowercase didn't read as a real category name and got confused with a system identifier. Bumps to "Agnes Internal". Seed re-applies on every app boot so existing rows pick up the new bucket value via `ON CONFLICT (id) DO UPDATE`. * ui(catalog): split Agnes Internal into its own card on /catalog Previously the three internal tables landed inside the "Core Business Data" card under an "Agnes Internal" accordion alongside Keboola / BQ buckets — readers conflated system telemetry with business datasets, and the data_stats header counter ("3 tables · ~X rows total") only ever counted synced rows so internal tables looked invisible. Split the catalog page into two cards: - Core Business Data: only non-internal source_types (Keboola, BQ, Jira). 
Accordions group by bucket as before. Stats counter reflects this card's tables. - Agnes Internal: a dedicated card with its own visual treatment (teal accent matching the mode-internal badge in /admin/tables). Flat list (no accordion — only 3 rows, never grows here), each row carries the canonical `agnes query` snippet. Read-only — no profiler click, no In-stack toggle, no sync metadata. Route adds `internal_card` context object; template renders the new card only when it's non-None. * fix(rbac): hide internal tables from /admin/access + drop "my" framing Two related cleanups for the Agnes-internal tables: 1. /admin/access (resource grants) no longer lists them. The `can_access` check has a hardcoded internal-table bypass — security is row-level (per-request view filter), so a table-grain `resource_grants` row would do nothing. Surfacing them in the UI let admins set up grants that silently no-op. Filter at the `_table_blocks` projection so the UI tree never sees them. 2. Display names drop the analyst-perspective "my" framing: "Agnes — my sessions" → "Agnes sessions" "Agnes — my telemetry events" → "Agnes telemetry events" "Agnes — my audit log" → "Agnes audit log" The "my" only makes sense from the querying analyst's seat (`SELECT … FROM agnes_sessions` returns their rows); on /admin/* pages where admin sees / configures them across users, the pronoun was misleading. Description text now spells out the row-level RBAC contract explicitly. Display names update via TableRegistryRepository.register's ON CONFLICT UPDATE on next app boot; no manual cleanup needed. * ui: subtitle notes about agnes_* tables on each Activity Center page The recursive observability story — Agnes serves its own audit / telemetry / session data through the same `agnes query` plumbing analysts use for business data — wasn't surfaced anywhere on the admin pages that show that data. 
Three pages get a one-liner with the canonical `agnes query` snippet + the RBAC contract (analysts see their own rows, admin sees all): - /admin/activity (Audit log) → agnes_audit - /admin/telemetry (Tool usage) → agnes_telemetry - /admin/sessions → agnes_sessions Sets up the discovery moment for admins: they're reading the page, they see "you can query this from Claude Code", they remember it when an analyst asks "how do I find my own failed tool calls?". * ui(tables): explain "Show log" empty-state on /admin/tables Cache warmup log <pre> renders with a dark background and is only populated by the SSE stream during a Re-warm all run. Opening the page cold + clicking Show log just revealed a black bar with no context — admins couldn't tell what they were looking at. Adds an inline paragraph above the <pre> explaining what the log is, the row format, when it fills in, and where to find the historical audit trail (/admin/activity). The actual <pre> stays empty until SSE events arrive, but the surrounding copy carries the meaning. * ui(tables): auto-open cache-warmup log on Re-warm all click A Re-warm all run takes ~24s per remote BQ row. With the <details> collapsed by default, operators saw the button disable, watched a quiet ~24s pass, and assumed nothing had happened — the streaming log was hidden behind a closed disclosure. Two small JS tweaks: - cacheWarmupRun() opens the details on click, so streamed lines appear without an extra interaction - cacheWarmupOnStart() hides the inline hint paragraph the moment real log content lands, so the dark log block isn't competing with redundant context Hint paragraph also clarifies that only `query_mode='remote'` BQ rows are warmed — operators with only materialized/internal tables would see total=0 and the page would "do nothing" by spec. 
* ui: trim Agnes internal copy across surfaces Descriptions had grown to explain the extraction pipeline ("parsed out of session JSONLs"), the underlying table ("Backed by usage_session_summary"), the RBAC mechanic ("row-level RBAC at query time — analysts see their own; admin sees all"), and the SQL snippet. Every implementation detail meant another rewrite on the next iter. Strips to one stable line per surface: what the data is, plus "Also available locally for analysis". Mechanics live in code + docs; the page copy says what the user needs to know. Touched: - connectors/internal/access.py: INTERNAL_TABLES descriptions - activity_center.html / admin_usage.html / admin_sessions.html subtitles - catalog.html Agnes Internal card description + row strip - admin_tables.html "Agnes internal" tab hint * fix(internal): is_user_admin arity bugs + + saved-view payload cap Round-1 code review (PR #278) caught two blocking bugs and three nits. Blocking — both `is_user_admin(user)` (single dict arg) calls raised TypeError. is_user_admin signature is `(user_id, conn)`. Affected: - app/api/query.py:_run_internal_query — every POST /api/query that references agnes_sessions / agnes_telemetry / agnes_audit blew up with a 500. The headline analyst-facing feature of this PR was unusable through the API. - app/api/v2_sample.py — same shape; `GET /api/v2/sample/agnes_` returned 500. Both fixed to call `is_user_admin(user.get("id"), conn)`. Added two FastAPI-level tests in test_internal_data_source.py that go through the TestClient — the existing unit tests on `execute_internal_query` and `build_filter_clause` skipped the request-handler layer where the bugs lived, which is why this landed. Nits also closed: - connectors/internal/access.py: `+` allowed in _USERNAME_RE / _USER_ID_RE so RFC 5321 email local-parts (alice+test@x) resolve correctly without hitting InternalAccessError. 
- app/api/observability.py: saved-view payload capped at 64 KiB to prevent an admin from bloating system.duckdb with a malformed save. fix(security): close non-admin data-leak via underlying-table refs PR #278 R2 review surfaced a non-admin-exploitable bypass: SQL whose string literal contains 'agnes_sessions' routed into the privileged internal-query path, then queried the underlying physical table (usage_session_summary / usage_events / audit_log) directly, escaping the CTE wrapper's row filter. Two reinforcing defenses: 1. find_internal_refs() now strips single-quoted string literals before scanning for alias names — a literal alone no longer routes the request into the privileged code path. 2. execute_internal_query() rejects non-admin SQL that references the underlying physical tables (usage_*, audit_log). The CTE wrapper only scopes the agnes_* aliases; a direct FROM on the base table — or a shadowing inner WITH that still has to read the base table — bypasses RBAC. Block before execution with an actionable error pointing to the agnes_* alias. Admins are unaffected (god-mode short-circuit on the filter clause). 3. tests/test_internal_data_source.py — three new negative tests covering literal-only matches, direct-table refs, and CTE shadow attempts. Also tightens usage_ask.py's SELECT-only validator: pragma_table_info, pragma_storage_info, pragma_database_*, and duckdb_tables / columns / views / indexes / schemas are reflection functions that leak metadata the analyst question shouldn't reach. \bPRAGMA\b in _FORBIDDEN never matched the function-call form (`_` is a word character, so there is no word boundary between `A` and `_`).
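The word-boundary point above is easy to reproduce with Python's `re` module: `_` counts as a word character, so `\bPRAGMA\b` matches the statement form but not the `pragma_*()` function-call form. The `find_reflection_calls` helper and its pattern are a hypothetical sketch of one way to catch the function form, not the actual `_FORBIDDEN` rules in usage_ask.py, and the pattern is deliberately broader than the list named in the commit.

```python
import re

# The keyword-only rule: requires a word boundary on both sides of PRAGMA.
KEYWORD_ONLY = re.compile(r"\bPRAGMA\b", re.IGNORECASE)

# Hypothetical broader rule: any pragma_* or duckdb_* name used as a function call.
REFLECTION_CALL = re.compile(r"\b(?:pragma|duckdb)_\w+\s*\(", re.IGNORECASE)


def find_reflection_calls(sql):
    """Return the reflection-style function calls found in `sql` (lowercased)."""
    return [m.group(0).rstrip("( \t").lower() for m in REFLECTION_CALL.finditer(sql)]


statement_form = "PRAGMA table_info('users')"
function_form = "SELECT * FROM pragma_table_info('users')"

# The statement form is caught by the keyword rule...
assert KEYWORD_ONLY.search(statement_form) is not None
# ...but the function form is not: 'A' and '_' are both word characters,
# so there is no boundary after PRAGMA.
assert KEYWORD_ONLY.search(function_form) is None
# The function-call rule catches it.
assert find_reflection_calls(function_form) == ["pragma_table_info"]
```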
fix(security): dynamic denylist for non-admin internal queries R3 review (PR #278) caught a wider data-leak than R2: the underlying- physical-table guard listed only the 7 usage_* + audit_log tables, but system.duckdb has 30+ other sensitive tables — users (emails + ids), personal_access_tokens, resource_grants, user_groups, user_observability_views, store_*, marketplace_*, knowledge_*, etc. A non-admin SQL like SELECT * FROM agnes_sessions UNION ALL SELECT email, id, … FROM users LIMIT 1 would leak every user's row. Replaces the hardcoded denylist with a dynamic allowlist — non-admin SQL may reference ONLY the registered agnes_* aliases. Every other table in `information_schema.tables` (main schema) is rejected. Future migrations that add a new sensitive table are automatically covered without re-editing this module. Also strips SQL comments (`/* */` and `--`) before the identifier scan so a comment-wrapped table name (`/**/users/**/`) can't slip past the regex. Four new negative tests pin: `users`, `personal_access_tokens`, block-comment wrap, line-comment wrap. Plus: per-user view-count cap (100) on /api/admin/observability/views so an admin can't fill system.duckdb with thousands of saved views. release: 0.54.0 — Activity Center + Telemetry + Sessions + internal datasource Cuts the work shipped across this PR (Activity Center build, recursive internal data source) into a versioned release. Bumps pyproject.toml to 0.54.0; renames the top of CHANGELOG.md from [Unreleased] to [0.54.0] — 2026-05-12 with a header summary; opens a fresh [Unreleased] section for the next round. --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 22:41:19 +02:00
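The allowlist idea from the R2/R3 fixes above can be sketched as: strip string literals and comments in one left-to-right pass, collect identifiers, and reject any known table that is not a registered alias. This is an illustration under assumptions, not the project's implementation: `forbidden_refs` is a hypothetical name, and `known_tables` stands in for the table list the commit says comes from `information_schema.tables`.

```python
import re

# Registered aliases that non-admin SQL may reference (names from the commit log).
ALLOWED_ALIASES = {"agnes_sessions", "agnes_telemetry", "agnes_audit"}

# One alternation, scanned left to right, so a quote inside a comment or a
# "--" inside a string literal cannot desynchronise the stripping.
_NOISE = re.compile(
    r"'(?:[^']|'')*'"   # single-quoted literal ('' is an escaped quote)
    r"|--[^\n]*"        # line comment
    r"|/\*.*?\*/",      # block comment
    re.DOTALL,
)
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def forbidden_refs(sql, known_tables):
    """Return known tables referenced by `sql` that are not allowed aliases."""
    cleaned = _NOISE.sub(" ", sql)  # replace with a space so tokens never merge
    idents = {m.group(0).lower() for m in _IDENT.finditer(cleaned)}
    known = {t.lower() for t in known_tables}
    return sorted((idents & known) - ALLOWED_ALIASES)
```

The scan fails closed: a column that happens to share a name with a table is also rejected, which seems an acceptable trade-off for a guard like this. A query would be blocked before execution whenever the returned list is non-empty.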
ZdenekSrotyr	b6cdd68e8d	feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene Three behavioural improvements driven by the sub-agent end-to-end test findings, plus scheduler tweaks to prevent the post-deploy contention burst we measured. CATALOG (catalog-side bugs the test agents tripped on): - new entity_type field per remote row (BASE TABLE / VIEW / MATERIALIZED VIEW). For views, rows + size_bytes return null instead of the misleading 0 that __TABLES__ reports. - where_examples now validates against the table's actual schema (cached known_columns from refresh). The pre-fix behavior blindly advertised `country_code = 'CZ'` on tables with no country_code column — the sub-agent tests reliably hit this on unit_economics. - new known_columns + entity_type columns on bq_metadata_cache; populated by bq_metadata_refresh.refresh_one from the same fetch_bq_columns_full call (no extra BQ roundtrip) plus a cheap INFORMATION_SCHEMA.TABLES lookup for table_type. QUERY COST-GUARD: - remote_scan_too_large suggestion now names views explicitly: `Target(s) <ids> are VIEW or MATERIALIZED VIEW. BigQuery does not push LIMIT into the view body — SELECT * FROM <view> LIMIT 1 still runs the full underlying scan.` Programmatic consumers get a new view_targets field on the error detail. SCHEDULER HYGIENE (the post-deploy 1-minute window where concurrent parquet downloads dropped to ~1 MB/s): - SCHEDULER_STARTUP_GRACE_SECONDS (default 60) holds the first tick so the burst doesn't overlap cache_warmup writes. - SCHEDULER_BQ_METADATA_INITIAL_OFFSET_MAX_SECONDS (default 900) randomises bq-metadata-refresh's first-fire offset. 
TESTS: - test_bq_metadata_cache_repo: entity_type + known_columns round-trip - test_v2_catalog_remote_metadata: where_examples validation, views return null rows/size_bytes, cold rows have empty examples - test_api_query_guardrail: VIEW-aware suggestion text + view_targets - test_connectors_bigquery_metadata: entity_type lookup mock + new fields in TableMetadata expectations - test_scheduler_sidecar: grace + jitter env-var resolution	2026-05-12 10:37:35 +02:00
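The `where_examples` validation described in this commit can be sketched as a filter over candidate examples against the cached `known_columns`. The `(column, expression)` tuple shape and the function name are assumptions; the commit does not show how candidates are represented.

```python
def validated_where_examples(candidates, known_columns):
    """Keep only examples whose column exists in the table's cached schema.

    `candidates` is assumed to be an iterable of (column, expression) pairs.
    An empty `known_columns` (a cold, never-refreshed row) yields no examples.
    """
    known = {c.lower() for c in known_columns}
    return [expr for col, expr in candidates if col.lower() in known]


CANDIDATES = [
    ("country_code", "country_code = 'CZ'"),
    ("order_date", "order_date >= '2026-01-01'"),
]
```

For a table without a `country_code` column, only the `order_date` example survives, which is the failure mode the commit mentions for `unit_economics`.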
ZdenekSrotyr	b3841f5b6c	release: 0.50.0 — persistent BQ metadata cache + scheduled refresh; catalog never blocks on BigQuery Since 0.47.0 GET /api/v2/catalog enriched each remote BigQuery row by fetching INFORMATION_SCHEMA.TABLE_STORAGE + COLUMNS through the DuckDB BigQuery extension inside the request. On cold caches that fanned out to O(N) sequential BQ jobs-API roundtrips — easily 90 s+ on partitioned / view-backed tables — and reliably blew the CLI's 30 s httpx ReadTimeout. Reproduced with py-spy: three AnyIO worker threads stuck inside connectors/bigquery/metadata._fetch_via_legacy_tables. Refactor: enrichment is read exclusively from a new persistent bq_metadata_cache DuckDB table (schema v40), populated by a scheduler- driven refresh job at SCHEDULER_BQ_METADATA_REFRESH_INTERVAL (default 4 h). Cold catalog response on a fresh container is now tens of milliseconds with metadata_freshness=never_fetched for unwarmed rows. New surface: - POST /api/admin/run-bq-metadata-refresh (scheduler-driven, full) - POST /api/v2/metadata-cache/refresh?table=<id> (admin, single) - GET /api/v2/metadata-cache/status (auth, non-admin) - metadata_freshness field per catalog row Removed (internal API): v2_catalog._size_hint_for_row, _resolve_remote_metadata, _metadata_provider_for, _build_metadata_request, _materialized_size_hint, in-memory _metadata_cache. Response shape unchanged for external consumers. 991 tests passing; 2 pre-existing failures (test_db v3→v4 ladder, test_cli_binary_rename) unrelated to this change.	2026-05-11 20:37:17 +02:00
Vojtech	d6ad08f107	Flea-market upload guardrails + soft delete + JOIN-based admin queue (#233 ) * feat(store): flea-market upload guardrails + soft delete + JOIN-based admin queue Adds an end-to-end guardrails pipeline for store uploads (manifest + static-security + LLM review), persists blocked bundles for forensics, introduces soft-delete (Archive) semantics, consolidates the legacy /store/{id} surface into /marketplace/flea/{id}, and reworks the admin queue so lifecycle filters read live entity visibility via LEFT JOIN rather than a denormalized submission column. Schema v29 → v35: * v29 store_submissions table + store_entities.visibility_status * v30 file_size, bundle_sha256, bundle_purged_at on submissions * v31 reshape store_submissions (drop legacy unique on entity_id) * v32 store_entities.archived_at/by + 'archived' visibility value * v33 drop store_submissions.retry_count (unused) * v34 ensure idx_store_submissions_entity exists post column-drop * v35 broaden visibility_status enum + JOIN architecture cutover Pipeline (src/store_guardrails/): * Inline checks: manifest_check, static_scan, quality_check * LLM review configurable haiku\|sonnet\|opus (default haiku) * BackgroundTasks-driven async path with structured-output JSON * Per-submitter daily quota (default 50) * 30-day TTL purge job (POST /api/admin/run-blocked-purge) * Bundle SHA256 + size persisted; sha256 survives purge for forensics Visibility model: * pending \| approved \| hidden \| archived * _enforce_visibility returns 404 (no leak) for non-owner non-admin * Owner sees own non-approved entries via include_owner_id widening * Install refused with 409 entity_not_approved when not approved Soft-delete (DELETE /api/store/entities/{id}): * Default = soft (visibility_status='archived'); existing installs keep getting served the bundle so users don't lose the plugin * ?hard=true admin-only: drops bundle + cascades user_store_installs * Hard-delete preserves entity_id on submission as tombstone so 
audit_log linkage survives for the activity timeline Admin queue lifecycle (the JOIN refactor): * Verdict (store_submissions.status) is immutable forensic record * Lifecycle (store_entities.visibility_status) is live state * /admin/store/submissions Archived chip translates to `e.visibility_status='archived'` via LEFT JOIN — any path that flips visibility surfaces in the queue immediately * Detail page renders Status (verdict) and Entity lifecycle side by side so admins see "approved at review, now archived" at a glance URL consolidation: * /store/{id} deleted (no redirect, stale bookmarks 404) * /marketplace/flea/{id} is the canonical detail surface * Three in-tree callers (upload-success, my-stack card, store listing card) updated to point at the new URL * Quarantine banner extracted to _quarantine_banner.html partial, self-guarded, included from both flea detail templates * Banner JS auto-refreshes when the verdict lands by polling /api/marketplace/flea/{id}/detail (visibility_status + submission_status — the latter is needed because blocked_llm keeps the entity at visibility_status='pending') Audit log resource format: * runner.py emits prefixed `store_submission:{id}` (post-fix) * Detail-page timeline query handles three patterns: prefixed submission, helper-emitted `store_entity:{sub_id}`, and bare-id legacy rows — all surface in the activity timeline UX fixes: * Owner sees Under review / Quarantined / Hidden banner with status * Install button gray-disabled (not blue) when non-approved * Owner cannot delete quarantined entries (403); admin can * Admin queue: filter chips, sortable columns, paging, page-size * Auto-refresh queue every 5s while pending rows are visible * Store upload page file picker no longer opens twice (label → input default action collided with explicit JS handler) Tests: 168 passed across the guardrails suites (admin submissions, store API, inline / LLM / purge guardrails, store repositories, marketplace filter, schema version). 
New regression coverage includes: archive surfaces via JOIN even when API path is bypassed; deleted submission renders activity timeline (tombstone); flea detail surfaces submission_status only for owner/admin; detail page renders Entity lifecycle row; audit log resource format covers both helper and runner paths. * fix(store-guardrails): PR #233 follow-up — prompt injection, atomic PUT, BG race, schema, reaper, sort whitelist Addresses 9 of the 23 findings from the PR #233 review (spec at docs/superpowers/specs/2026-05-09-pr233-guardrails-fixes-spec.md). Merge-gate items #1-#6 plus high-value mediums #7, #9-#12, #23. Architectural items (#8 enum split, #14 factory) and pure maintainability (#15-#22) deferred to follow-ups. Security: * #1 prompt injection — SYSTEM_PROMPT now passed via the SDK's dedicated system= parameter; bundle wrapped in <bundle>...</bundle> sentinels declared data-only by the system prompt; literal sentinel strings in user content are escaped so an adversarial README can't forge a close tag. * #6 static scan honesty — module docstring + admin copy + docs declare static scan as signal not gate; .md/.txt/.rst/.html/.json/ .yaml/.yml/.toml skipped to avoid false positives on prose. AST mode for Python deferred (separate flag, FP comparison work). Correctness: * #2 PUT atomicity — bundles bake into plugin.staging-<rand>/ alongside live, atomic-rename on success; failed checks leave live tree byte-for-byte intact. * #3 BG-task race — set_visibility_if_pending guards verdict flips to the (pending, hidden) review window; admin archives during review survive; skipped flips audit-logged. * #4 v35 NOT NULL/DEFAULT — schema v35→v36 re-applies them on store_entities.visibility_status. CHECK constraint enforced application-side (DuckDB ADD CHECK on existing column unsupported). * #7 stuck-review reaper — reap_stuck_llm_reviews flips pending_llm rows older than guardrails.stuck_review_grace_seconds (default 1800) to review_error. 
Scheduler runs every 15 min via new /api/admin/run-reap-stuck-reviews. Set knob to 0 to disable. * #9 quota counter — count_blocked_for_submitter_since now counts blocked_inline + blocked_llm + review_error so a submitter triggering only LLM-blocked verdicts is bounded. * #10 missing risk_level — surfaces as review_error with error='missing_risk_level' instead of silently defaulting to 'medium' (which looked like a model-decided block). * #11 archived_at clear — set_visibility nulls archived_at + archived_by when transitioning out of 'archived' so a future read doesn't show stale archive forensics on an approved row. Maintainability: * #12 FSM doc comment — accurate insert/transition/lifecycle description in src/db.py near store_submissions schema. * #23 sort-key whitelist — admin queue rejects unknown sort keys with 400 invalid_sort_key; substring-replace footgun removed. Deferred (separate PRs): * #5 quota race — proper fix requires asyncio.Lock spanning the full pipeline; threading.Lock blocks event loop, DuckDB MVCC doesn't help. API-level slowapi bounds worst case for now. * #6 part 3 (AST static scan), #8 (enum split), #13 (import bundle docs), #14 (factory consolidation), #15-#22 (maint). Tests: * New: tests/test_store_guardrails_prompt_injection.py (corpus + trust-boundary invariants), tests/test_store_put_atomic.py, tests/test_store_guardrails_reaper.py. * Extended: test_store_guardrails_llm.py (system param, missing risk_level, BG race), test_admin_store_submissions.py (quota counter widening, sort whitelist 400), test_store_repositories.py (un-archive metadata clear), test_db_schema_version.py (v36). * Full suite: 3738 passed; 17 pre-existing baseline failures unchanged (db migration tests, cli binary rename, catalog export, user mgmt v5 backfill — confirmed by stash + rerun on clean tree).	2026-05-09 17:32:53 +04:00
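Finding #1 above (bundle wrapped in `<bundle>...</bundle>` sentinels, with literal sentinel strings in user content escaped) could look roughly like the following. The escaping scheme (HTML-style entities) and the function name are assumptions; the commit only states that the sentinels are escaped so an adversarial README cannot forge a close tag.

```python
import re

# Matches an opening or closing sentinel in any letter case.
_SENTINEL = re.compile(r"<(/?)bundle>", re.IGNORECASE)


def wrap_bundle(content: str) -> str:
    """Wrap untrusted bundle text so it contains no literal sentinel tags."""
    escaped = _SENTINEL.sub(r"&lt;\g<1>bundle&gt;", content)
    return f"<bundle>\n{escaped}\n</bundle>"
```

After wrapping, the only `<bundle>` and `</bundle>` strings in the message are the ones the caller added, so the system prompt's "everything between the sentinels is data" rule has an unambiguous boundary.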
minasarustamyan	e26236fdc1	Extract session-pipeline framework + UsageProcessor skeleton (#232 ) * Extract session pipeline framework, refactor verification, add UsageProcessor skeleton Pluggable framework under services/session_pipeline/ (contract + lib + per-processor runner) so multiple processors can read /data/user_sessions/<key>/.jsonl on their own cadence with full failure isolation. Verification flow becomes the first plugin; a no-op UsageProcessor reserves the second slot pending a separate brainstorm on extraction logic + storage shape. Schema v28→v29: rename session_extraction_state → session_processor_state with composite PK (processor_name, session_file). Existing rows copied over with processor_name='verification'; legacy table dropped. Migration is idempotent and no-ops the copy step on fresh installs that came up at the new schema. Endpoint: /api/admin/run-verification-detector replaced by parametrized /api/admin/run-session-processor?processor=<name>. Audit action format follows. Scheduler JOBS: verification-detector entry split into session-processor:verification + session-processor:usage. SCHEDULER_VERIFICATION_DETECTOR_INTERVAL retained for operator compatibility (drives both cadence and health-check grace window); SCHEDULER_USAGE_PROCESSOR_INTERVAL added. Address PR #232 review: scan dead branch + per-processor lock - `SessionProcessorStateRepository.scan_unprocessed_for` dead else: both branches surfaced every jsonl, the SELECT was unused, runner MD5-rehashed every stable session per tick. Replaced with an mtime precheck — stable sessions (mtime <= processed_at) are filtered at scan; modified files still surface for the runner's authoritative `file_hash` invalidation. Naive-local comparison matches the existing health-check idiom (DuckDB TIMESTAMP strips tz on storage). - Per-processor advisory lock around `_run_processor` in `/api/admin/run-session-processor`. 
Scheduler tick + manual admin POST could otherwise both run, both call create_evidence on overlapping detections, and accumulate duplicate verification_evidence rows (the dedup short-circuit only covers create+contradiction, not evidence per ADR Decision 3). Non-blocking acquire → 409 Conflict on concurrent invocation; release in finally so a runner exception doesn't wedge the processor. Tests: two new scan unit tests (mtime filter + post-mark mtime bump), 409 endpoint test, lock-released-on-exception test. Two existing tests updated for the new "filtered at scan" stat shape (previously asserted skipped == 1, now scanned == 0). * Address PR #232 review #2: parallel scheduler tick + last_run on terminal state Two pre-existing scaffold bugs in services/scheduler/__main__.py amplified by adding more session-pipeline jobs: 1. Serial for-loop over jobs with synchronous httpx.post(timeout=900) — a 10-minute verification run blocked every other job (data-refresh, health-check, usage, corporate-memory) for the whole window. The PR's stated isolation guarantee held inside the runner but broke at the scheduler dispatch layer. 2. last_run advanced only when _call_api returned True. Permanent-failure jobs hot-looped on every tick (30s) instead of cadence (15min). Fix: ThreadPoolExecutor.submit per due job + per-job in_flight set so a long-running job can't be re-launched on subsequent ticks. last_run advances unconditionally in finally; errors still surface via _call_api logging + audit_log on the receiving side. _run_job extracted to module-level for unit testing. New tests: - TestRunJobBookkeeping: advances on success / failure / unhandled raise - TestRunLoopParallelism: in_flight protection prevents duplicate launches across ticks for a single slow job --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>	2026-05-08 19:47:46 +02:00
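The scheduler fix described above (one `ThreadPoolExecutor.submit` per due job, an `in_flight` set, and `last_run` advanced unconditionally in `finally`) can be sketched as below. The class and method names are hypothetical, and the real `_run_job` presumably logs failures rather than swallowing them.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class JobRunner:
    """Illustrative dispatcher: parallel jobs, no duplicate launches."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self.in_flight = set()
        self.last_run = {}

    def _run_job(self, name, fn):
        try:
            fn()
        except Exception:
            pass  # a real scheduler would log here
        finally:
            # Advance last_run on success AND failure, so a permanently
            # failing job waits for its cadence instead of hot-looping.
            with self._lock:
                self.last_run[name] = time.monotonic()
                self.in_flight.discard(name)

    def submit_if_idle(self, name, fn):
        """Submit `fn` unless a previous run of `name` is still executing."""
        with self._lock:
            if name in self.in_flight:
                return None
            self.in_flight.add(name)
        return self._pool.submit(self._run_job, name, fn)
```

A slow job then blocks only its own slot: a second `submit_if_idle` for the same name returns `None` until the first run finishes, while other jobs keep being dispatched.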
Vojtech	107195730d	feat(observability): optional PostHog integration (#231 ) * feat(observability): optional PostHog integration (errors, LLM traces, replay, flags) Off by default. Activates when POSTHOG_API_KEY is set in env. Defaults to PostHog Cloud EU; override host for US Cloud or self-hosted. Coverage: - FastAPI 500 handler captures unhandled exceptions - src/orchestrator.py rebuild + rebuild_source failures - services/scheduler/ HTTP-job failures - cli/main.py uncaught CLI errors (Typer.Exit/SystemExit/KeyboardInterrupt skipped; flushes before re-raise so short-lived CLI invocations don't drop events) - connectors/llm/anthropic_provider.py + openai_compat.py emit $ai_generation events with provider, model, latency, token counts (prompt/completion bodies stay off unless POSTHOG_LLM_PAYLOADS=1 because LLM prompts here routinely include customer SQL/data) - Browser snippet injected into every text/html response by PosthogInjectionMiddleware — registered inside the GZip layer so it sees uncompressed HTML before compression. Many templates are standalone (their own DOCTYPE) and never extend base.html, so a per-template include would miss them. - Frontend: $pageview, $pageleave, JS error capture via window.error and unhandledrejection handlers, masked session replay (maskAllInputs: true plus CSS-selector mask for known data surfaces), feature flags (browser posthog.isFeatureEnabled + server-side feature_enabled with fallback for older SDKs). Identification mode operator-configurable: none / id / email / full. Default email ships user.id + email but never name. CLI entry point moves from cli.main:app to cli.main:main (Typer wrapper). 
Files: - src/observability/posthog_client.py — lazy singleton, no network when disabled, single-process flush on shutdown - src/observability/llm_tracing.py — trace_generation context manager - app/middleware/posthog_inject.py — HTML rewrite middleware - app/web/templates/_posthog.html — browser snippet template - docs/observability.md — operator guide - config/.env.template — documented POSTHOG_* knobs - tests/test_posthog_disabled.py + tests/test_posthog_client.py + tests/test_llm_tracing.py — 18 tests covering disabled state, identify-mode payloads, $ai_generation shape, error variant. CHANGELOG entry under [Unreleased] Added. * feat(observability): tag every PostHog event with environment + release Splits PostHog dashboards cleanly between localhost / dev / staging / production without manual tagging on every capture call. - POSTHOG_ENVIRONMENT explicit override; auto-resolves to "local" when LOCAL_DEV_MODE=1, else RELEASE_CHANNEL, else AGNES_DEPLOYMENT_ENV, else "unknown". - AGNES_VERSION → RELEASE_CHANNEL fallback feeds the `release` property for "is this error new in this release?" cohorting. - Backend gets both via the PostHog SDK's super_properties constructor arg (every captured event picks them up automatically). - Browser snippet calls posthog.register({environment, release}) inside the loaded callback so $pageview, $exception, autocapture, etc. all carry the same labels. - request.state.user now populated by auth dependencies so the snippet can actually call posthog.identify(user_id, {email}) for logged-in users (previously the user block always resolved to None because nothing wrote to request.state.user). 4 new tests cover env resolution: explicit > LOCAL_DEV_MODE > channel > unknown, plus super-properties forwarding into the SDK constructor. 
* feat(observability): inline user attrs on every PostHog event + debug throw route PostHog's UI shows person properties on the Person profile page, not inline on each event — so a reviewer triaging an exception couldn't tell which user hit the bug without clicking through. Fix it on both sides. - Backend capture_exception merges user_id / user_email / user_name into the event properties (gated by POSTHOG_IDENTIFY_PII: none/id/email/full). Backed by a new _user_props_for_event helper on PosthogClient. - Browser snippet registers user_id + user_email + user_name as super- properties via posthog.register({...}) so every $exception, $pageview, and custom event coming from posthog.captureException() carries them inline. Mirrors the backend so cross-referencing client/server events doesn't require a person-profile lookup. - /api/debug/throw — debug-only endpoint gated by DEBUG=1 (404 in prod). Runs Depends(get_current_user) first so request.state.user is set when the unhandled-exception handler captures the event. Lets operators exercise the full observability path end-to-end without hand-rolling a TestClient script. Configurable via ?kind=ValueError&msg=... 7 new tests cover: backend user-attr merge across identify modes, anonymous request fall-through, browser snippet super-prop emission for logged-in / anonymous / id-only / full-name cases. * fix(observability): address minasarustamyan PR #231 review Two bugs caught in review. 1. PosthogInjectionMiddleware dropped Response.background on every return path. BaseHTTPMiddleware materialises the body and asks subclasses to return a fresh Response — three paths in dispatch() omitted background=, silently cancelling any BackgroundTask / BackgroundTasks the route attached (audit logging, async webhooks, email sends) with no log line. Fix: route every return through a _passthrough() helper that forwards background. Also adds a _MAX_BUFFER_BYTES (4 MB) cap so a streamed-HTML response can't balloon RSS during buffering. 
Bigger bodies short-circuit through with a warning rather than being injected. Regression tests in tests/test_posthog_inject_middleware.py exercise four return paths (snippet present, render-fail, double-injection guard, non-HTML passthrough) plus the streaming-guard short-circuit. 2. $ai_input / $ai_output_choices were emitted without truncation, so POSTHOG_LLM_PAYLOADS=1 silently dropped events past PostHog's ~32 KB per-event ingest limit — exactly the calls (large prompts with schemas / sample rows / SQL) an operator would want to inspect. Fix: clip both at POSTHOG_LLM_PAYLOAD_MAX_CHARS (default 30000) with an explicit "…[truncated N chars]" marker so readers don't mistake truncated captures for complete ones. Metadata (provider, model, tokens, latency, error) flows regardless. Three new tests cover default-cap clipping, env-override, and pass-through under the cap. 37 PostHog tests pass.	2026-05-08 17:57:10 +04:00
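The payload clipping from fix 2 above can be sketched in a few lines. The function name is hypothetical, and reading N as "number of characters dropped" is an assumption; the commit gives only the marker format and the default cap of 30000.

```python
def clip_payload(text: str, max_chars: int = 30000) -> str:
    """Clip `text` to `max_chars`, appending an explicit truncation marker."""
    if len(text) <= max_chars:
        return text
    dropped = len(text) - max_chars
    return f"{text[:max_chars]}…[truncated {dropped} chars]"
```

Payloads under the cap pass through unchanged, so the marker only appears on captures that were actually shortened.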
ZdenekSrotyr	cc1886c97c	release: 0.47.4 — Docker collector skip + FIFO session-pipeline check (#229 ) ## Summary Two minimum-viable fixes after today's 0.44.0 → 0.47.3 release train and the production 30-user launch. Devil's advocate review of a 3-PR / 7-item plan cut scope to these 2 — the rest is deferred to a separate "operate-first, instrument-second" backlog item. ### B2 — Docker session_collector log skip `services/session_collector` was logging `Collection complete: 0 users, 0 files copied` + `WARNING: Group 'data-ops' not found, using default group` every 10 minutes in the Docker layout (where `/home//user/sessions/` doesn't exist). New env var `AGNES_SKIP_LEGACY_COLLECTOR=1` set by default in `docker-compose.yml` short-circuits the collector pass. The bare-VM deployment path (where /home/ IS populated by Claude Code) leaves the env var unset and continues to scan normally — including the data-ops warning, which is load-bearing for catching missing-group mis-deploys. ### O2 — FIFO check in `_check_session_pipeline` The existing check compares `MAX(processed_at)` to newest jsonl mtime — catches "detector hasn't run lately" but blind to "old file was skipped while newer ones were processed". New code finds the oldest FS jsonl that's NOT in `session_extraction_state.session_file` and flags if its mtime is older than `SESSION_PIPELINE_STUCK_FILE_GRACE_SECONDS` (default 4× the existing grace = 2h). Severity intentionally starts at `info` so we can collect prod data on false-positive rate before tightening to `warning`. The aggregator already treats `info` as non-promoting (see the severity vocabulary docstring at the top of `app/api/health.py`), so the headline `status` stays at `healthy` even when this fires — the operator sees the entry in the per-check breakdown but no spurious `degraded` overall. ## Test plan - [x] `pytest tests/test_session_collector.py` — 17 tests pass (existing 9 + new 8 covering env-set/unset, truthy variants, falsy non-skip). 
- [x] `pytest tests/test_health_session_pipeline.py` — 8 tests pass (existing 4 + new 4 FIFO tests covering stuck-file, under-threshold, all-processed, env-override).	2026-05-08 09:38:21 +02:00
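The FIFO check from O2 (find the oldest on-disk jsonl that has no state row and flag it once it is older than the grace period) can be sketched as a pure function. The inputs are simplified assumptions: a path-to-mtime mapping stands in for the filesystem scan and a set of paths stands in for the `session_file` column.

```python
from typing import Dict, Optional, Set


def oldest_stuck_file(
    fs_files: Dict[str, float],
    processed: Set[str],
    now: float,
    grace_seconds: float,
) -> Optional[str]:
    """Return the oldest unprocessed file if it exceeds the grace window."""
    pending = {p: m for p, m in fs_files.items() if p not in processed}
    if not pending:
        return None  # everything on disk has a state row
    path = min(pending, key=pending.get)  # smallest mtime = oldest file
    return path if now - pending[path] > grace_seconds else None
```

Unlike a `MAX(processed_at)` comparison, this still fires when newer files keep being processed while one old file is skipped on every tick.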
ZdenekSrotyr	c97fd504c5	release: 0.45.0 — easy-wins bundle (#84 #164 #177 #178 #203 #204 ) Operator-and-analyst quality bundle: a security fix for the optional Telegram bot, two CLI gaps closed, and three rounds of UX polish on `agnes diagnose` and `agnes pull` so non-TTY consumers (CI runners, Claude Code SessionStart hooks, sub-agent watchdogs) get readable, actionable signal. - Pairing-code RNG: random.choices -> secrets.choice (CSPRNG). - Telegram script runner: refuse out-of-shape usernames before sudo -u. CLAUDE.md.bak.<ISO-timestamp> before regenerating. - agnes admin unregister-table <id> -> DELETE /api/admin/registry/{id} - agnes admin update-table <id> --field=value ... -> PUT /api/admin/registry/{id} response but never promotes the headline. BQ billing-equals-data check downgraded warning -> info. default (5 s / 1 MiB vs 30 s / 10%) so sub-agent watchdogs don't kill the pull as a hung process. New env knobs: AGNES_PULL_PROGRESS_INTERVAL_{SECONDS,BYTES}. --include-schema (or ?include=schema) to opt back in. Tests: 120 passed across the touched modules, including new tests for each fix. Pre-existing failures on main (DB migration v1->v9, binary rename) are unrelated and not introduced here.	2026-05-07 11:43:16 +02:00
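The pairing-code change in this release (`random.choices` -> `secrets.choice`) can be illustrated with a minimal sketch. The code length, the digit alphabet and the function name are assumptions; the commit only names the RNG swap.

```python
import secrets
import string


def pairing_code(length: int = 6, alphabet: str = string.digits) -> str:
    """Build a pairing code from the OS CSPRNG instead of the `random` module."""
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

The `random` module's default generator is a Mersenne Twister, whose output is predictable from enough observed values, which is why the standard library documentation steers security-sensitive tokens toward `secrets`.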
ZdenekSrotyr	9f9aabd72b	fix(corporate-memory): CLI catches fail-fast ValueError, exits 1 with clean message (Devin Review on #179 ) The PR's #176 fail-fast change made collect_all() raise ValueError when neither an ai: block nor ANTHROPIC_API_KEY/LLM_API_KEY was available. verification_detector's CLI was updated to handle it; corporate_memory's CLI was missed and crashed with an unhandled traceback. services/corporate_memory/collector.py:main() now wraps the collect_all call in try/except ValueError, prints a one-line actionable message to stderr, and returns rc=1. Regression test: test_llm_connector.py::TestCorporateMemoryCollector::test_main_returns_1_on_no_ai_config_instead_of_traceback.	2026-05-05 06:45:10 +02:00
ZdenekSrotyr	e68c2d3f0f	fix(session-collector): argv-free run() helper, drop SystemExit footgun (Devin Review on #179 ) run_session_collector called collector.main() which did argparse.parse_args() on uvicorn's sys.argv (['app.main:app', '--host', ...]) → sys.exit(2) → SystemExit(2), which inherits from BaseException, escapes FastAPI handlers, and propagates through the thread pool. Every scheduler tick that fired the endpoint either 500-ed or risked killing the uvicorn worker. services/session_collector/collector.py now exposes run(dry_run, verbose) that returns (rc, stats); main() is a thin CLI shim that parses argv and delegates. The admin endpoint calls run() directly and audit-logs the per-run stats (users_processed, files_copied, files_skipped) instead of just the rc. Three regression tests in TestRunHelper. Closes Devin Review finding on app/api/admin.py:2819 (#179).	2026-05-05 06:31:55 +02:00
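The split described in this commit (an argv-free `run()` that returns `(rc, stats)`, with `main()` as a thin CLI shim) might look like the sketch below. The flag names and the empty collection body are assumptions; only the `run(dry_run, verbose)` signature and the three stat keys come from the commit message.

```python
import argparse


def run(dry_run: bool = False, verbose: bool = False):
    """Do the work without touching sys.argv; safe to call from a web handler."""
    stats = {"users_processed": 0, "files_copied": 0, "files_skipped": 0}
    # ... the real collection pass would update `stats` here ...
    return 0, stats


def main(argv=None) -> int:
    """CLI shim: parse arguments, delegate to run(), return the exit code."""
    parser = argparse.ArgumentParser(description="session collector (sketch)")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args(argv)  # argv=None falls back to sys.argv[1:]
    rc, _stats = run(dry_run=args.dry_run, verbose=args.verbose)
    return rc
```

Because only `main()` calls `parse_args`, an endpoint that calls `run()` directly can no longer hit argparse's `sys.exit(2)` on uvicorn's own command-line arguments.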
ZdenekSrotyr	fa3a76a528	fix(scheduler): single env var drives cadence + grace (#179 review) Devin NOTABLE: SCHEDULER_VERIFICATION_DETECTOR_INTERVAL was already read by app/api/health.py to compute the staleness grace window, but the actual scheduler cadence was hardcoded to 'every 15m'. The env var name implied it controlled the cadence — it didn't. An operator throttling the detector via the env was silently ignored by the scheduler while the health grace silently widened. Wired the env var into both ends. Same pattern applied to the other two LLM-pipeline jobs: - SCHEDULER_SESSION_COLLECTOR_INTERVAL (default 600s = 10m) - SCHEDULER_VERIFICATION_DETECTOR_INTERVAL (default 900s = 15m) - SCHEDULER_CORPORATE_MEMORY_INTERVAL (default 1020s = 17m) Defaults preserve the existing 10m / 15m / 17m coprime offset so the three jobs don't fire on the same tick. build_jobs() now reads all three through _read_positive_int (matching the existing pattern for data-refresh / health-check / script-runner) and feeds them to _seconds_to_schedule. The smallest-interval check includes the new variables so an operator can't accidentally set a tick larger than any LLM cadence. New tests in tests/test_scheduler.py: - TestLLMPipelineCadenceEnvVars: env override changes the schedule string at scheduler-init time, with parametrized invalid-value rejection. - TestVerificationDetectorGraceFollowsCadence: pinning the single-source-of-truth contract — same env var moves both the scheduler cadence and the health-check grace.	2026-05-05 05:59:18 +02:00
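Reading the three cadence variables through one positive-integer parser, as this commit describes for `_read_positive_int`, can be sketched as follows. Raising `ValueError` on bad input is an assumption (the commit says only that invalid values are rejected), and the `env` parameter exists purely to keep the example testable.

```python
import os


def read_positive_int(name: str, default: int, env=None) -> int:
    """Return env[name] as a positive int, or `default` when unset or blank."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


# Defaults taken from the commit message (seconds).
DEFAULTS = {
    "SCHEDULER_SESSION_COLLECTOR_INTERVAL": 600,
    "SCHEDULER_VERIFICATION_DETECTOR_INTERVAL": 900,
    "SCHEDULER_CORPORATE_MEMORY_INTERVAL": 1020,
}
```

If the scheduler and the health check both call the same helper with the same variable name, a single override moves the cadence and the grace window together.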
ZdenekSrotyr	9f33e24bf9	fix(config): overlay-aware LLM consumers + env-ref resolution (#179 review) Devin BUG: /api/admin/configure seeds an ai: block to the writable overlay at DATA_DIR/state/instance.yaml, but the three LLM consumers imported from config.loader.load_instance_config — which reads the static config dir only. Even if they had read the overlay, the loader ran yaml.safe_load directly without passing through _resolve_env_refs, so '${ANTHROPIC_API_KEY}' would have stayed a literal placeholder. The pipeline appeared to work because the factory falls back to the env var directly, but the overlay path itself was dead code. Two fixes, both required: 1. Switched the three LLM consumers to app.instance_config.load_instance_config: - services/corporate_memory/collector.py:collect_all - services/verification_detector/__main__.py:main - app/api/admin.py:run_verification_detector 2. app/instance_config.py runs the loaded overlay through config.loader._resolve_env_refs before the deep-merge, so '${ANTHROPIC_API_KEY}' resolves at config-load time. New regression suite tests/test_instance_config_overlay.py pins: - env-ref resolution against the overlay (resolved when env set, empty when env missing — never the literal placeholder) - deep-merge still preserves static-only sections - the three consumers reach app.instance_config (inspected via inspect.getsource so a future refactor that reverts the import fails the test) - end-to-end: a seeded overlay + ANTHROPIC_API_KEY env reaches the factory with a resolved api_key	2026-05-05 05:57:22 +02:00
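The env-ref resolution this commit relies on (`${ANTHROPIC_API_KEY}` resolved at config-load time, empty when the variable is missing, never the literal placeholder) can be sketched as a recursive walk over the loaded YAML. This is an illustration under assumptions, not the project's `_resolve_env_refs`; the regex and the `env` parameter are hypothetical.

```python
import os
import re

_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_refs(value, env=None):
    """Replace ${VAR} in every string of a nested dict/list structure."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # A missing variable resolves to "", never to the literal placeholder.
        return _REF.sub(lambda m: env.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: resolve_env_refs(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(v, env) for v in value]
    return value
```

Running the overlay through such a step before the deep-merge means the secret itself never has to be written into the YAML file.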
ZdenekSrotyr	98a8aba3be	fix(tests): align test_llm_connector with new factory + fail-fast (#179 review) The PR rewrote collect_all() to call the new create_extractor_from_env_or_config() helper, but the existing tests still mocked the old direct create_extractor() symbol and the old silent-skip-on-missing-config behavior. Five tests in TestCorporateMemoryCollector and one in TestCollectorExtractorIntegration were red on the PR branch. Changes: - Tests now mock connectors.llm.create_extractor_from_env_or_config (the symbol the collector imports lazily). - Renamed test_collect_all_no_ai_config_skips -> test_collect_all_no_ai_config_or_env_raises and test_collector_handles_invalid_config -> test_collector_raises_on_invalid_config. Both assert pytest.raises(ValueError) — the explicit fail-fast semantics defect 5 of #176 was supposed to enforce. - collect_all() no longer swallows the factory's ValueError into stats["errors"]; it propagates so the scheduler / admin endpoint surface the actionable misconfiguration message instead of pretending the run was a no-op. - /api/admin/run-corporate-memory translates the propagated ValueError into a 500 with the factory's message, matching /api/admin/run-verification-detector.	2026-05-05 05:55:01 +02:00
ZdenekSrotyr	45de71e8ab	fix(scheduler): wire LLM pipeline into scheduler-v2 (#176 ) The session-collector, verification-detector, and corporate-memory services now run on the same scheduler-v2 model that already drives data-refresh, health-check, script-runner, and marketplaces: - New admin endpoints in app/api/admin.py: POST /api/admin/run-session-collector POST /api/admin/run-verification-detector POST /api/admin/run-corporate-memory All admin-gated, sync-def (FastAPI thread pool), with one audit row per invocation. Same single-writer-of-system.duckdb pattern as the existing /api/marketplaces/sync-all job. - services/scheduler/__main__.py JOBS gains three entries with offset cadences (10m / 15m / 17m, all coprime modulo the 30s tick) so the three LLM-backed jobs don't fire on the same tick and stack their API + DB load. - The verification-detector endpoint surfaces the LLM factory's fail-fast ValueError as HTTP 500 with the actionable message, preserving the no-silent-skip contract from the previous commit. Tests: - tests/test_admin_run_endpoints.py covers admin gating + scheduler registration + endpoint contract. - tests/test_scheduler_sidecar.py existing tests continue to pass.	2026-05-04 23:57:43 +02:00
ZdenekSrotyr	bbb04ac041	fix(setup): seed default ai: block + env-var fallback (#176 ) POST /api/admin/configure now writes a default ai: block into the instance.yaml overlay when the request leaves it untouched and either ANTHROPIC_API_KEY or LLM_API_KEY is set in the environment. The block references the env var via ${VAR} syntax — secrets never land in YAML. connectors.llm.factory grows create_extractor_from_env_or_config which falls back to ANTHROPIC_API_KEY / LLM_API_KEY when ai_config is empty and raises a clear ValueError when neither is available. Both services/corporate_memory and services/verification_detector switch to the new helper, replacing the old 'silently skip when ai: missing' path that was the silent-failure root cause. Tests: - tests/test_setup_ai_block.py — overlay seeding contract. - tests/test_llm_provider_env_fallback.py — fallback + fail-fast.	2026-05-04 23:55:19 +02:00
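The fallback order described in bbb04ac041 (config first, then ANTHROPIC_API_KEY, then LLM_API_KEY, otherwise a ValueError) can be sketched as below. This is not the code of create_extractor_from_env_or_config: the function name `resolve_api_key`, the `api_key` config field, and the error text are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the commit message, in lookup order.
ENV_KEYS = ("ANTHROPIC_API_KEY", "LLM_API_KEY")


def resolve_api_key(
    ai_config: Optional[Mapping[str, str]],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return a key from config, else from the environment, else fail fast."""
    # "api_key" is a hypothetical field name for the ai: block.
    if ai_config and ai_config.get("api_key"):
        return ai_config["api_key"]
    for name in ENV_KEYS:
        value = env.get(name)
        if value:
            return value
    # Fail loudly instead of silently skipping the run.
    raise ValueError(
        "No AI provider configured: add an ai: block or set "
        "ANTHROPIC_API_KEY / LLM_API_KEY"
    )
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.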
ZdenekSrotyr	8233c3e3f9	chore(docs): replace stale `da` verbs and vendor-specific install paths Sweep operator runbooks (docs/QUICKSTART, docs/HEADLESS_USAGE, docs/architecture, docs/sample-data, docs/agent-workspace-prompt, docs/metrics/metrics.yml, dev_docs/server, dev_docs/disaster-recovery), the corporate-memory service README, the jira connector README + backfill scripts, the deploy skill, and test docstrings. Replaces `da sync` → `agnes pull`, `da analyst setup` → `agnes init`, `da metrics ...` → `agnes catalog --metrics` / `agnes admin metrics ...`, `da fetch` → `agnes snapshot create`, plus the matching docker-compose admin invocations. Vendor-specific `/opt/data-analyst/` install paths in jira backfill / consistency scripts and operator docs are replaced with the placeholder `<install-dir>` and a new `AGNES_ENV_FILE` env-var override that lets a deployment inject its actual install path without a code change. Aligns with the OSS vendor-agnostic policy in CLAUDE.md. CHANGELOG `### Internal` entry summarizes the audit and reaffirms the intentional stale-marker tuples (`_LEGACY_STRINGS`, `_OUR_COMMAND_MARKERS`) that must keep referencing `da sync` / `da fetch` / etc. for hook upgrade and override-detection logic.	2026-05-04 21:22:19 +02:00
Vojtech	38f6b639d2	feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) Cuts release 0.20.0. ## Highlights - X-Request-ID header on every response + sanitized to [A-Za-z0-9_-] (CRLF log-forging mitigation) - Error pages (HTML + JSON 500) surface request_id for support tickets - Dev debug toolbar gated by DEBUG=1: fastapi-debug-toolbar with custom DuckDBPanel - Centralized app.logging_config.setup_logging() replaces 23 scattered basicConfig calls - Telegram bot drops bot.log file; stdout only (BREAKING) ## Devin findings addressed - BUG_0001: .env.template no longer claims FastAPI debug=True - BUG_0002: subprocess extractor logs INFO to stderr again - ANALYSIS_0003: _wants_html no longer matches Accept: */* (curl gets JSON as before) - BUG on b1c6ee9: HTML 500 page no longer leaks str(exc) in production - BUG on b13d2fe: 2 CLAUDE.md compliance flags (transform.py + ws_gateway) accepted as scope-limited logging refactor; follow-up to update CLAUDE.md if needed See CHANGELOG [0.20.0] for full notes.	2026-04-29 22:54:21 +02:00
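The X-Request-ID sanitization in 38f6b639d2 restricts the header to [A-Za-z0-9_-] so that CR/LF cannot be smuggled into log lines. A minimal sketch of that idea follows. The function name, the 64-character cap, and the UUID fallback for empty input are assumptions, not details taken from the commit.

```python
import re
import uuid
from typing import Optional

# Anything outside the allow-list from the commit message is dropped.
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_request_id(raw: Optional[str], max_len: int = 64) -> str:
    """Strip unsafe characters; fall back to a fresh ID when nothing is left."""
    cleaned = _UNSAFE.sub("", raw or "")[:max_len]  # max_len is an assumed cap
    return cleaned or uuid.uuid4().hex  # assumed fallback behaviour
```

A client-supplied value such as `"abc\r\nX-Evil: 1"` collapses to a single harmless token, so it can no longer start a forged log line.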
ZdenekSrotyr	b7a1795834	feat(scheduler): re-wire sync_schedule + script.schedule; tune via env; OpenMetadata TLS (#135 ) Bundles 4 issues: - #79 — table_registry.sync_schedule honored at runtime (API-side filter + Pydantic validators) - #78 — script_registry.schedule honored via new POST /api/scripts/run-due (atomic claim, BackgroundTask exec, deploy-time safety validation) - #77 — sidecar JOBS env-driven (SCHEDULER_DATA_REFRESH_INTERVAL/HEALTH_CHECK_INTERVAL/SCRIPT_RUN_INTERVAL/TICK_SECONDS) - #89 — OpenMetadataClient verify=True default (BREAKING for self-signed) Cuts release 0.19.0. See CHANGELOG for full notes incl. Known Limitations.	2026-04-29 22:06:30 +02:00
ZdenekSrotyr	82c5d71d63	feat(memory): #62 — duplicate hints + tree-view + bulk-edit (#126 ) Issue #62. Tree view with cross-axis filtering, duplicate-candidate hints (Jaccard score on entity overlap), bulk-edit endpoints (PATCH /api/memory/admin/{id} + POST /api/memory/admin/bulk-update), schema v17 (knowledge_item_relations), full CLI parity (da admin memory tree/edit/bulk-edit/duplicates list/resolve).	2026-04-29 13:55:15 +02:00
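The duplicate-candidate hints in 82c5d71d63 use a Jaccard score on entity overlap. The standard Jaccard index over two entity sets looks like the sketch below. How the project extracts entities, and what threshold it applies, is not stated in the commit, so the example sets are invented.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical entity sets for two knowledge items.
item_a = {"duckdb", "scheduler", "marketplace"}
item_b = {"duckdb", "scheduler", "telemetry"}
score = jaccard(item_a, item_b)  # 2 shared entities out of 4 distinct ones
```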
ZdenekSrotyr	995e4cd366	fix(scheduler): HTTP marketplaces job + SCHEDULER_API_TOKEN shared secret (#127 ) * fix(scheduler): HTTP marketplaces job + SCHEDULER_API_TOKEN shared secret Two scheduler-reliability bugs surfaced after the v0.12.1 USER-agnes flip: 1. The marketplaces job called src.marketplace.sync_marketplaces() in-process from the scheduler container, racing the app's long-lived system.duckdb handle. DuckDB rejects cross-process writers — every cron tick 500-ed on "Could not set lock on file ... PID 0". 2. The data-refresh + new marketplaces jobs both 401-ed on the API because SCHEDULER_API_TOKEN was never propagated by the Terraform startup script. The scheduler had no credential to authenticate with. Fix: - New POST /api/marketplaces/sync-all (admin-only) drives the nightly refresh through the app process so it inherits the existing DB connection. - Scheduler swaps fn->http for marketplaces; all jobs are now plain HTTP and the scheduler is reduced to a cron clock. - New app/auth/scheduler_token.py adds a shared-secret auth path. The startup script generates a 256-bit secret on first boot, persists it across reboots, and writes it to /opt/agnes/.env. Both containers source the same .env. The app validates incoming Bearer tokens against the env var (constant-time, length-floored) and resolves matches to a synthetic scheduler@system.local user that's a member of the Admin system group. Audit-log entries from the scheduler are attributed to this user. - app/main.py seeds the synthetic user at startup so the first cron tick has a valid actor; lazy seed in get_scheduler_user covers token rotation before the next app restart. Tests: 5 new in tests/test_auth_scheduler_token.py covering empty/short secret rejection, exact-match comparison, idempotent user seeding, and lazy provisioning. 142 marketplace + scheduler tests + 96 auth tests remain green. 
Existing VMs with .env from before this change need a one-time re-provisioning (re-run startup-script or rotate via openssl rand); documented in CHANGELOG. * fix(audit): use '_all' sentinel for bulk marketplace sync — Devin review #127 Avoids the literal string 'marketplace:None' in the audit_log resource column when the bulk sync endpoint writes its summary row. * fix(scheduler): unblock event loop + per-job timeouts — Devin review #127 Two findings from Devin re-review on commit 5fbad15: 1. BUG: trigger_sync_all was async def, so FastAPI ran it on the asyncio event loop. sync_marketplaces() does blocking I/O (subprocess git clones up to GIT_TIMEOUT_SEC=300 each, threading.Lock, DuckDB writes) and would freeze every concurrent request for the duration of a bulk sync. Switched to plain def so FastAPI auto-routes to the thread pool. 2. ANALYSIS: scheduler used a fixed 120s httpx timeout for every POST. Bulk marketplace sync iterates the registry under a single lock with up to 300s per repo — easily exceeds 120s on 2-3 slow repos. The scheduler then sees a timeout, doesn't update last_run, and re-fires on the next 30s tick, queueing redundant work. Per-job timeout override added to the JOBS tuple; marketplaces gets 900s (15 min), data-refresh keeps 120s, health-check 30s. * fix(auth): require_session_token rejects scheduler shared secret — Devin review #127 require_session_token gates /auth/tokens (PAT minting). Pre-fix it only rejected JWTs with typ=pat — but the scheduler shared secret is an opaque string, so verify_token() returns None, payload becomes {}, and the PAT-claim check silently passed. A caller bearing SCHEDULER_API_TOKEN could mint persistent PATs that survive a secret rotation. Added explicit is_scheduler_token() check before the PAT-claim check; new regression test in tests/test_auth_scheduler_token.py. 
Devin's other note (pre-existing async def trigger_sync at marketplaces.py:392 also calls blocking sync_one) — Devin flagged it as out-of-scope for this PR and I agree; tracking separately. * release(0.17.0): cut + clean up CHANGELOG duplicates Cuts 0.17.0 (minor: scheduler shared-secret auth + sync-all endpoint plus the deploy-shape fixes that landed since the last release tag). Bumps pyproject from 0.15.0 — also corrects the missed bump from PR #120 (v0.16.0 was tagged on GitHub and shipped as :stable, but pyproject stayed at 0.15.0, so /api/version, /cli/latest, and `da --version` had been under-reporting the running release). Removes the long-form duplicate entries for 0.13.0 / 0.14.0 / 0.15.0 above [0.16.0] — the canonical short summaries (with GitHub-release links) already exist below 0.16.0, the long forms were leftover state from before those versions were cut and have been silently shadowed ever since.	2026-04-29 11:44:00 +02:00
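The "constant-time, length-floored" Bearer check described in 995e4cd366 can be sketched with the standard library. This is not the code of app/auth/scheduler_token.py: the name `matches_scheduler_secret` and the 32-character floor are assumptions for illustration.

```python
import hmac
from typing import Optional

MIN_SECRET_LENGTH = 32  # assumed floor; the commit does not state the value


def matches_scheduler_secret(presented: str, configured: Optional[str]) -> bool:
    """Compare a Bearer token against the shared secret in constant time."""
    # Reject an unset or too-short secret outright, so an empty env var
    # can never authenticate an empty token.
    if not configured or len(configured) < MIN_SECRET_LENGTH:
        return False
    # hmac.compare_digest avoids leaking the mismatch position via timing.
    return hmac.compare_digest(presented.encode(), configured.encode())
```

Checking the floor before comparing also covers the "empty/short secret rejection" cases the commit's tests mention.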
PavelDo	e1108b6112	feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72 ) Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).	2026-04-29 07:16:22 +02:00
ZdenekSrotyr	e9d7af3cce	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening This squashes 13 commits from ma/staging plus a small docstring translation into a single coherent unit. Three workstreams. == RBAC v13 redesign == - Drops core.viewer/analyst/km_admin/admin hierarchy and the internal_roles / group_mappings / user_role_grants / plugin_access tables. - Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry. - Two authorization primitives in app.auth.access: require_admin — Admin-group god-mode require_resource_access(rt, "{path}") — entity-scoped grants Single DB lookup per request; no session cache; no implies BFS. - /admin/access UI (single page) replaces /admin/role-mapping + /admin/plugin-access. CLI `da admin group/grant ` replaces `da admin role/mapping/grant-role/revoke-role/effective-roles`. - ResourceType.TABLE listing-only — admins can record table grants, runtime enforcement still flows through legacy dataset_permissions (migration plan in docs/TODO-rbac-data-enforcement.md). == Claude Code marketplace == - Aggregated /marketplace.zip + /marketplace.git/ (PAT-gated, RBAC-filtered, content-addressed cache via dulwich). - Admin god-mode dropped on the marketplace surface — admins curate their own view via grants like everyone else. - Bare-repo cache materializes per RBAC-filtered ETag; stale entries not pruned in this iteration (disclaimed in git_backend.py docstring). == #81 #83 #44 security/ops hardening == - #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias). - #81 Group B — Keboola extractor 3-state exit codes: 0 success / 1 total fail / 2 PARTIAL fail Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary alerting must teach it the new partial signal. - #81 Group C — schema v10 view_ownership; rejects silent overwrite of a prior connector's view name on collision. 
- #81 Group D — extractor-side identifier validation. - #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset + path-traversal fix. - #44 — entire /api/scripts/* surface is admin-only (planted-script + sandbox-bypass risk closed). == Web UI polish + deploy fix == - /admin/access: live grant-count badges (no stale snapshot revert), shared-header CSS link added to /catalog and /admin/{tables,permissions}, per-resource-type colored stripes. - docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't silently shadow sub-mounts and write state to the wrong disk. == OSS vendor-neutralization (waves 1+2) == - scripts/grpn/ → scripts/ops/. Customer-specific identifiers (project IDs, internal hostnames, dev/prod VM IPs, brand names) replaced with placeholders across code, docs, Terraform, Caddyfile, OAuth probe, and planning docs. Downstream infra repos that copied scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must update the path. == Translation == - src/repositories/user_groups.py::ensure_system docstring translated from Czech to English for codebase consistency. Co-authored-by: Mina Rustamyan <mina@keboola.com>	2026-04-28 14:25:04 +02:00
Petr Simecek	83ced81966	feat(auth): unified role management: UI + REST API + CLI + schema v9 (v0.11.4) (#73) * feat(auth): v9 schema, unified role management foundation (WIP) Tasks 1-5, 10 of the role-management-complete plan. Foundation only, follow-up commits add REST API, CLI, UI, and tests. Schema v9: - user_role_grants table: direct user → internal_role mapping (complementary to group_mappings). Drives PAT/headless auth and persists across sessions. Source field tracks 'direct' vs auto-seed. - internal_roles.implies (JSON): transitive role hierarchy. core.admin implies core.km_admin → core.analyst → core.viewer. Resolver does BFS expand at lookup time. - internal_roles.is_core (BOOL): distinguishes seeded core.* hierarchy from module-registered roles. UI renders them differently. - v8→v9 migration: ADD COLUMN, CREATE TABLE, _seed_core_roles + _backfill_users_role_to_grants, then NULL legacy users.role values. DuckDB FK constraint blocks DROP COLUMN, so the column stays as a deprecated artifact (UserRepository ignores it); the physical drop is deferred. Resolver: - Regex extended to allow dotted namespace (core.admin, context_engineering.admin), max 64 chars total. - expand_implies(role_keys, conn): BFS over implies JSON column. - resolve_internal_roles signature gains optional user_id parameter; unions group-mapping resolution with user_role_grants direct grants before implies expansion. require_internal_role: - Two-path resolution: session cache (OAuth) → DB grants (PAT/headless fallback). PAT clients now legitimately satisfy gates without the OAuth round-trip, fixing the v8 limitation where every PAT-callable admin endpoint needed require_role(Role.ADMIN) instead of require_internal_role(...). Backward-compat: - require_role(Role.X) and require_admin become thin wrappers over require_internal_role(f"core.{role}"). Implies hierarchy preserves the legacy "at least this level" semantics automatically, so no per-level comparison code is needed. 
- src/rbac.py helpers (is_admin, has_role, get_user_role, set_user_role, can_access_table, get_accessible_tables) all read from the resolver via _get_internal_role_keys. - UserRepository.create() and update() now mirror role changes into user_role_grants via _grant_core_role helper. Preserves API while making the new table the source of truth. - UserRepository.delete() pre-deletes user_role_grants rows (FK cascade: DuckDB doesn't auto-cascade). - count_admins() reads user_role_grants ⨝ internal_roles instead of the now-NULL users.role column. First consumer: - app/api/admin.py module-level docstring documents the v9 pattern for future module authors. Existing require_role(Role.ADMIN) callsites flow through the wrapper; no behavior change for OAuth callers, and PAT callers gain access via direct grants. Tests: full suite green (1396 passed, 6 skipped). Existing tests exercise the new pathway transparently because UserRepository.create auto-grants. New test_pat_caller_with_direct_grant_passes pins the PAT-aware contract. Schema: v9 (was v8). pyproject.toml + CHANGELOG bump deferred to the final PR-prep commit. * feat(auth): role management complete: REST API + CLI + UI + docs (v0.11.4) Unifies the legacy users.role enum with the v8 internal-roles foundation under one model with an implies hierarchy, ships an admin UI + REST API + CLI for managing group mappings and direct user grants, and makes require_internal_role PAT-aware so that admin endpoints work uniformly across OAuth and headless callers. REST API (app/api/role_management.py, +496 LOC): - 8 endpoints under /api/admin: internal-roles list, group-mappings CRUD, users/{id}/role-grants CRUD, users/{id}/effective-roles debug. - All gated by require_internal_role("core.admin"). Audit-log entry on every mutation (role_mapping.created/deleted, role_grant.created/deleted). - Last-admin protection: refuse to delete the final core.admin grant (mirrors users.py:count_admins protection). 
- New UserRoleGrantsRepository in src/repositories/user_role_grants.py. CLI (cli/commands/admin.py extension, +258 LOC): - da admin role list / show <key> - da admin mapping list / create <group-id> <role-key> / delete <id> - da admin grant-role <email> <role-key> - da admin revoke-role <email> <role-key> - da admin effective-roles <email> - Everything goes through typer + PAT auth, with a --json flag and response-shape tolerance. UI (admin_role_mapping.html + admin_user_detail.html + nav + user list): - New page /admin/role-mapping: internal_roles read-only table + group_mappings table with create/delete forms. - New page /admin/users/{id}: core role single-select + capabilities multi-checkbox + effective-roles debug (direct + group + expanded). - Existing user list gets a "Detail" link to the new page. - Nav link to /admin/role-mapping. Tests: +85 new tests across 4 new files: - test_schema_v9_migration.py (8): fresh install + v8→v9 backfill + legacy column NULL semantics + unknown-role fallback + invariants. - test_api_role_management.py (33): all 8 endpoints, happy + error paths, audit-log assertions, last-admin protection. - test_cli_admin_role.py (25 + 1 conditional): typer subcommands, text + json output, PAT integration smoke. - test_admin_role_mapping_ui.py (9) + test_admin_user_capabilities_ui.py (10): page rendering, auth gating, form contracts, JS hooks. Full suite: 1482 passed, 6 skipped (was 1396 → +86, no regressions). Docs: - docs/internal-roles.md complete rewrite: removed "no UI yet", added hierarchy diagram, dual-path resolution, dotted-namespace convention, admin workflow via UI/CLI/REST, refresh semantics for group mappings vs direct grants, migration notes. - CLAUDE.md schema v8 → v9. - CHANGELOG.md [0.11.4] with BREAKING marker for users.role NULL semantics + complete Added/Changed/Removed/Internal sections. - pyproject.toml: 0.11.3 → 0.11.4. 
Sequencing: after this PR merges, Pabu rebases pabu/local-dev (PR #72) onto main; its schema migrations shift from v9/v10/v11 to v10/v11/v12. Implementation breakdown: - Sequential (me): foundation tasks: schema v9, resolver, PAT-aware require_internal_role, backward-compat wrappers, rbac refactor, UserRepository auto-grant. - Parallel sub-agents (3 worktrees, ~10 min): REST API, CLI, UI. - Sequential (me): integration, docs/CHANGELOG/version, schema tests, full-suite verification. * fix(auth): address Devin review on PR #73: three regressions Three concrete bugs caught in Devin's PR review, all fixed in this commit. 1. users.role hydration on read (the big one): v8→v9 migration NULLs users.role for every existing user, but a long tail of read sites still inspect user["role"] directly: - app/web/templates/_app_header.html:15: admin nav gate - app/web/templates/_app_header.html:36-37: role badge in dropdown - app/web/router.py:319-321: UserInfo.is_admin/is_analyst/is_privileged - app/web/router.py:489: corporate memory is_km_admin - app/api/catalog.py:54: admin "see all tables" bypass - app/api/sync.py:215: admin "see all sync states" bypass Without a fix, every existing admin loses the entire admin nav (and API admin bypasses) immediately after upgrade, which is a serious regression. Fix: new helper _hydrate_legacy_role() in app/auth/dependencies.py maps the highest-level core.* grant back into user["role"] as the legacy enum string. Called from get_current_user() on both auth paths (LOCAL_DEV_MODE + JWT/PAT). Idempotent: skips when role is already populated. Net effect: every pre-v9 callsite keeps working transparently for both OAuth and PAT callers, with one extra DB round-trip per authenticated request (same cost as the existing PAT-aware require_internal_role fallback). 
3 regression tests in tests/test_schema_v9_migration.py: - test_hydration_recovers_role_from_user_role_grants - test_hydration_returns_highest_grant (multi-grant → highest wins) - test_hydration_falls_back_to_viewer_when_no_grants (safe fallback) 2. CLI effective-roles TypeError: API returns direct/group as List[Dict] (RoleGrantResponse-shaped), but the CLI did ', '.join(direct) which raises TypeError on dicts. Tests masked it because mocks used bare string lists. Replaced raw .join() with a _names() helper that extracts role_key from each item, falling back to str() for legacy mock shapes. 3. UI template field-name mismatch: admin_user_detail.html JS reads data.groups but the API serializes the field as group (singular, per EffectiveRolesResponse pydantic). Currently benign because the API always returns group:[], but the field would silently disappear once the group-derived view is wired up. Added data.group as the primary lookup, kept the legacy aliases for shape-drift tolerance. Full suite: 1485 passed (was 1482, +3 hydration tests), 6 skipped, no regressions. * fix(auth): Devin review #2 + UX self-service + RBAC docs rename Three threads landed in one commit because they share the same auth/role surface and CHANGELOG entry. Devin review #73 second round (2 actionable findings): - _hydrate_legacy_role no longer short-circuits on truthy users.role. The role-management endpoints (POST/DELETE /api/admin/users/{id}/ role-grants + the changeCoreRole UI flow) only mutate user_role_grants — they don't update the legacy column. The early return trusted that stale value, so a user downgraded via the new REST/UI kept role="admin" in their dict on subsequent requests, which fooled _is_admin_user_dict (src/rbac.py) and the catalog/sync admin-bypass short-circuits into retaining elevated table access even though require_internal_role correctly denied the API gates. Always re-resolves now, making user_role_grants the single source of truth on every authenticated request. 
Cost: one DB round-trip per request — same as the existing PAT-aware fallback. Pinned by test_hydration_ignores_stale_legacy_role_after_grant_revoke. - Dev-bypass (app/auth/dependencies.py) and OAuth callback (app/auth/providers/google.py) now pass user_id to resolve_internal_roles so direct grants land in session["internal_roles"] alongside group-mapped roles. Pre-fix, every admin-gated request fell through to the per-request DB fallback inside require_internal_role and the dev-bypass log line read "resolved 0 internal role(s)" for an obviously-admin user. test_session_internal_roles_populated updated to assert union. User-visible UX (also addresses local-test feedback): - HTTP 500 on /admin/users post-v8→v9 migration — UserResponse.role is required str, but legacy users.role was NULL-ed by the migration. _to_response in app/api/users.py now routes every dict through _hydrate_legacy_role; same fix lifts the silent no-op of last-admin protection in update_user/delete_user (the role-equality short-circuits would skip the count_admins guard for migrated admins). Three regression tests under TestAPIUsersPostMigration. - /profile is now a real self-service detail page for every signed-in user (not just admins). Three new server-side sections: Effective roles (resolver output as chip cloud), Direct grants (rows in user_role_grants with source label), Roles via groups (which Cloud Identity / dev group grants which role for the current user). Non-admins finally see why a feature is or isn't accessible. Admins additionally see a deep-link to /admin/users/{id} for editing their own grants. - /admin/role-mapping group-id picker. New "Known groups" panel above the create form: clickable chips for the calling admin's own session.google_groups (tagged "your group") merged with external_group_ids already used in existing mappings (tagged "already mapped"). Click a chip → fills the form. 
Empty-state copy points operators at LOCAL_DEV_GROUPS / Google sign-in instead of leaving them to guess Cloud Identity opaque IDs from memory. Operational fixes: - Scheduler log-noise: every cron tick produced a POST /auth/token 401 because the auto-fetch fallback called the endpoint with just an email (no password) and silently fell through. Removed the broken path entirely. Operators set SCHEDULER_API_TOKEN (long-lived PAT) in production; in LOCAL_DEV_MODE the dev-bypass auto-authenticates the un-tokenized request, so jobs continue to work. Docs: - docs/internal-roles.md → docs/RBAC.md (git mv preserves history). Standard industry term, more discoverable for engineers grepping for RBAC in a new repo. Restructured: Quickstart-by-role (operator / end-user / module author), step-by-step Module-author workflow with code examples (register key, gate endpoint, declare implies, write contract test), naming pitfalls, refresh semantics. CLAUDE.md gets a new "Extensibility → RBAC" section pointing contributors at the doc before they add gated endpoints. Cross-refs in app/api/admin.py + tests/test_role_resolver.py updated. Tests: 293 in the auth/role/scheduler/UI test set passed, 0 regressions. * fix(auth): Devin review #3 — login flows + RBAC docs Two new findings on commit 7d1c048, both real and addressed. Finding 1 (BUG, HTTP 500): every auth login flow loaded users via UserRepository.get_by_email and passed user["role"] straight to create_access_token, Pydantic response models, and _set_login_cookie without going through _hydrate_legacy_role. Post-v9 the legacy column is NULL for migrated users, and TokenResponse.role is a required str — so POST /auth/token raised ValidationError → HTTP 500 for any v8-admin trying to log in via password. Same root cause produced non-crashing but semantically wrong JWTs (role: null) from Google OAuth, password web flows, and email magic-link verification. 
Fix: hydrate inline in every login flow before reading user["role"]: - app/auth/router.py: POST /auth/token (the crash site) - app/auth/providers/google.py: OAuth callback (was just stale JWT) - app/auth/providers/password.py: 5 flows: JSON login, web login, JSON setup, web reset confirm, web setup confirm - app/auth/providers/email.py: centralized in _consume_token, covers both /verify endpoints New regression class TestAuthLoginFlowsPostMigration pins both the no-crash and the correct-role contracts for all four legacy levels (viewer/analyst/km_admin/admin) on POST /auth/token. Finding 2 (DOCS): docs/RBAC.md showed register_internal_role() being called with implies=[...], but the function signature is (key, *, display_name, description, owner_module). A module author copying the example would TypeError at import time. The implies field on internal_roles IS honored at runtime by expand_implies, but the registry-side write path (register_internal_role + InternalRoleSpec + sync_registered_roles_to_db) doesn't exist yet; implies is currently seeded only for the core.* hierarchy via _seed_core_roles in src/db.py. Rewrote the Implies hierarchy and Module-author workflow sections to document what's actually supported in 0.11.4 and what a future change would need to add. The "for cross-module hierarchies, register each level + grant both" pattern works today. Tests: 322 in the auth/role/scheduler/UI/password test set passed, 0 regressions. * fix(db): _seed_core_roles actually runs on every connect (Devin review #4) Devin flagged that the docstring on `_seed_core_roles` promised per-connect execution as a safety net for accidental DELETEs and in-code seed changes, but the only call sites lived inside `if current < SCHEMA_VERSION:`, so once a DB was on v9 the function never ran again, and the docstring lied. 
Picked option (b) from the review (actually call it on every startup) over option (a) (fix the docstring) because the safety net is genuinely useful: - recovery from accidental admin DELETE on internal_roles, - in-code _CORE_ROLES_SEED tweaks (display_name/description/implies) ship without a manual SQL deploy, - fresh installs and migrations stop needing their own seed call sites. Tail call gated by `get_schema_version(conn) <= SCHEMA_VERSION` so the future-version-is-noop rollback contract still holds — a v9 binary won't touch a DB that's been upgraded past v9. Test coverage: new TestSeedCoreRolesSafetyNet class (3 tests) pins the three contracts — deleted row re-seeds, mutated display_name re-syncs from in-code seed, applied_at on schema_version doesn't churn on already-current DBs. Existing TestMigrationSafety::test_future_version_is_noop still passes (verified against the gating logic).	2026-04-27 02:23:01 +02:00
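The implies hierarchy from 83ced81966 (core.admin → core.km_admin → core.analyst → core.viewer, expanded by BFS at lookup time) can be illustrated with an in-memory mapping. The real expand_implies(role_keys, conn) reads the implies JSON column through a DB connection. The dict-based `expand_implied_roles` below is a hypothetical stand-in, not that function.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Core hierarchy as described in the commit message.
CORE_IMPLIES: Dict[str, List[str]] = {
    "core.admin": ["core.km_admin"],
    "core.km_admin": ["core.analyst"],
    "core.analyst": ["core.viewer"],
    "core.viewer": [],
}


def expand_implied_roles(
    role_keys: Iterable[str], implies: Dict[str, List[str]]
) -> Set[str]:
    """Breadth-first expansion of granted roles into every implied role."""
    seen = set(role_keys)
    queue = deque(seen)
    while queue:
        key = queue.popleft()
        for implied in implies.get(key, []):
            if implied not in seen:  # the seen set also guards against cycles
                seen.add(implied)
                queue.append(implied)
    return seen
```

With this shape, an "at least this level" check reduces to set membership: a user granted core.km_admin passes a core.viewer gate because the expansion contains it.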
ZdenekSrotyr	98af8e2df3	fix: make bot.py FileHandler resilient to missing log directory	2026-04-13 13:28:59 +02:00
ZdenekSrotyr	fa30298589	fix: use DATA_DIR env var instead of hardcoded /data paths - services/telegram_bot/config.py: NOTIFICATIONS_DIR now uses DATA_DIR fallback - src/profiler.py: DATA_DIR now uses main DATA_DIR env var instead of PROFILER_DATA_DIR - services/telegram_bot/dispatch.py: WS_GATEWAY_SOCKET_PATH now uses WS_GATEWAY_SOCKET env var	2026-04-09 16:39:44 +02:00
ZdenekSrotyr	4bad893cb8	feat: Docker services (ws-gateway, corporate-memory, session-collector) + scheduler auto-auth	2026-04-08 07:04:26 +02:00
ZdenekSrotyr	b0eaef88cc	refactor: delete old server infra — 4,200 lines removed Remove all legacy deployment infrastructure replaced by Docker + Kamal: - server/ directory (deploy.sh, setup.sh, webapp-setup.sh, sudoers, nginx config, systemd units, bin scripts) - scripts/sync_data.sh (replaced by da sync + API) - All services/*/systemd/ files (replaced by docker-compose) - tests/test_deploy_guard.py and tests/test_sync_data.py 688 tests passing.	2026-03-31 08:06:41 +02:00
ZdenekSrotyr	3701130a11	feat: add Docker, CLI tool, scheduler, and agent skills - Dockerfile (uv-based) + docker-compose.yml (3 services) - CLI tool 'da' with commands: auth, sync, query, status, admin, diagnose, skills - Scheduler sidecar service (replaces systemd timers) - pyproject.toml for uv distribution - Built-in skills (setup, troubleshoot) for AI agents - 17 CLI tests, 75 total tests passing	2026-03-27 15:30:03 +01:00
Petr	74ecf66f80	Increase knowledge item content limit from 500 to 1000 chars	2026-03-24 00:12:15 +01:00
Petr	1318b74ff1	Add Corporate Memory governance — Phase 1 (data model + admin API) Add admin curation layer between AI extraction and knowledge distribution. Admins (km_admin flag in instance.yaml) can approve, reject, mandate, and revoke knowledge items. Mandatory items distribute to all targeted users automatically. Three governance modes (configurable per instance): - mandatory_only: admin controls everything, no user voting - admin_curated: admin controls, users vote as feedback signal - hybrid: mandatory from admin + optional from user voting Three approval workflows: - review_queue: nothing published without admin approval - auto_publish: items go live immediately, admin intervenes retroactively - threshold: confidence-based auto-publish (Phase 5) Includes: - 9 admin action functions (approve/reject/mandate/revoke/edit/batch/...) - 11 new admin API endpoints under /api/corporate-memory/admin/ - Immutable audit log (audit.jsonl) - Audience targeting via groups - Automatic migration of existing items to "approved" status - km_admin_required auth decorator - 69 tests covering all governance logic - Backward compatible: no config = legacy wiki behavior	2026-03-23 19:15:33 +01:00
Petr	95358448e6	Add modular LLM connector for Corporate Memory Replace hardwired Anthropic API calls with a pluggable provider system. Each deployment configures its AI provider in instance.yaml — switching between Anthropic, LiteLLM, OpenRouter, or any OpenAI-compatible proxy is a config change, not a code change. New connectors/llm/ module: - StructuredExtractor Protocol with extract_json() interface - AnthropicExtractor: direct Anthropic SDK with retry + backoff - OpenAICompatExtractor: any OpenAI-compatible proxy with three-layer structured output fallback (json_schema -> json_object -> prompt) - Configurable structured_output policy (strict/json/auto) - Custom exception hierarchy (auth/rate_limit/timeout/format/refusal) - Zero secrets in logs: no API keys, prompts, or responses logged Reviewed by: Google Gemini, Claude Sonnet, OpenAI GPT-5.4. Security audit passed with all critical findings resolved.	2026-03-23 12:08:33 +01:00
Petr	2181d490e9	Fix systemd NAMESPACE failures caused by missing ReadWritePaths dirs data-refresh.service: use /tmp instead of /tmp/data_analyst_staging in ReadWritePaths — the subdirectory may not exist at service start, causing mount namespace setup to fail before any Exec* directive runs. deploy.sh: fix typo services/corporate-memory -> services/corporate_memory so the mkdir conditional actually matches the repo directory name. deploy.sh: add ReadWritePaths validation loop that auto-creates any missing directories listed in installed .service files before daemon-reload. This acts as a safety net against future NAMESPACE failures from new services.	2026-03-15 11:40:11 +01:00
Petr	80c5b902e0	Add scheduled data sync and catalog refresh with systemd timers - New sync_schedule and profile_after_sync fields in TableConfig (formats: "every 15m", "every 1h", "daily 05:00") - New src/scheduler.py with schedule evaluation logic (is_table_due) - New --scheduled mode in data_sync.py: only syncs tables that are due, respects profile_after_sync flag, auto-restarts webapp after profiling - Systemd timer+service for data-refresh (every 15 min) - Systemd timer+service for catalog-refresh (every 15 min) - deploy.sh enables new timers automatically - Complete table config reference in data_description.md.example - 58 new scheduler tests	2026-03-15 02:16:31 +01:00
Petr	f2d3d156e3	Move standalone services from server/ to services/ Extract 4 self-contained services into services/ module: - server/telegram_bot/ -> services/telegram_bot/ - server/ws_gateway/ -> services/ws_gateway/ - server/corporate_memory/ -> services/corporate_memory/ - server/session_collector.py -> services/session_collector/ Each service now has its own systemd/ directory with .service and .timer files. deploy.sh updated to auto-discover service units from services//systemd/. server/ now contains only deployment infrastructure (deploy.sh, setup scripts, bin/ management tools, sudoers, nginx config). All imports updated: webapp/app.py, server/bin/ scripts, systemd ExecStart paths.	2026-03-09 12:54:30 +01:00

39 commits
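
Note on commit 80c5b902e0: the message lists three sync_schedule formats ("every 15m", "every 1h", "daily 05:00") and a schedule check called is_table_due in src/scheduler.py, but the implementation is not shown in this log. A minimal sketch of how such a check could work follows. The function name `is_due`, its signature, and the rule that a never-synced table is always due are assumptions made for illustration, not the repository's code.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Patterns for the three formats quoted in the commit message.
_EVERY = re.compile(r"^every (\d+)([mh])$")
_DAILY = re.compile(r"^daily (\d{2}):(\d{2})$")


def is_due(schedule: str, last_sync: Optional[datetime], now: datetime) -> bool:
    """Hypothetical helper: True if a table with this schedule should sync at `now`."""
    if last_sync is None:
        return True  # assumption: a table that never synced is always due

    every = _EVERY.match(schedule)
    if every:
        amount, unit = int(every.group(1)), every.group(2)
        interval = timedelta(minutes=amount) if unit == "m" else timedelta(hours=amount)
        return now - last_sync >= interval

    daily = _DAILY.match(schedule)
    if daily:
        hour, minute = int(daily.group(1)), int(daily.group(2))
        # Most recent scheduled slot at or before `now`.
        slot = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if slot > now:
            slot -= timedelta(days=1)
        # Due only if the last sync happened before that slot.
        return last_sync < slot

    raise ValueError(f"unsupported schedule: {schedule!r}")
```

With a timer firing every 15 minutes, as the commit describes, a check of this shape would let each tick skip tables whose interval has not elapsed and run "daily" tables once, on the first tick after their slot.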