agnes-the-ai-analyst

Author	SHA1	Message	Date
Monika Feigler	9f778251c3	fix(query): wrap backtick-only BQ paths in bigquery_query() to avoid local DuckDB parser error SQL using only a full backtick path (`<proj>.<dataset>.<table>`) as the table reference had neither bare name_lookups nor direct bq.ds.tbl matches, so _rewrite_user_sql_for_bigquery_query's Skip 1 returned the original SQL unchanged. DuckDB then rejected the backtick syntax locally with "syntax error at or near `"" before the query ever reached BigQuery. Detect _BACKTICK_FULL_PATH matches in the rewriter and include them in the Skip 1 guard so the SQL gets wrapped in bigquery_query(). No identifier rewrite is needed — backtick paths are already BQ-native and _rewrite_bq_table_refs_to_native preserves them verbatim via its backtick-split pass. Closes #363	2026-05-20 14:03:29 +02:00
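The guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the pattern `BACKTICK_FULL_PATH` and the helper `wrap_backtick_only_sql` are hypothetical stand-ins for `_BACKTICK_FULL_PATH` and `_rewrite_user_sql_for_bigquery_query`, whose real bodies do not appear in this log, and the two-argument `bigquery_query(project, sql)` form is assumed from the DuckDB BigQuery extension.

```python
import re

# Hypothetical stand-in for the repo's _BACKTICK_FULL_PATH pattern:
# matches `project.dataset.table` written inside one pair of backticks.
BACKTICK_FULL_PATH = re.compile(r"`([^`.]+)\.([^`.]+)\.([^`.]+)`")


def wrap_backtick_only_sql(sql: str, project: str) -> str:
    """Wrap SQL in bigquery_query() when it references a full backtick path.

    SQL without such a path is returned unchanged, which mirrors the
    "Skip 1" early return described in the commit message.
    """
    if not BACKTICK_FULL_PATH.search(sql):
        return sql
    # Backtick paths are already BigQuery-native, so the inner SQL is kept
    # verbatim; only single quotes are doubled for the outer string literal.
    escaped = sql.replace("'", "''")
    return f"SELECT * FROM bigquery_query('{project}', '{escaped}')"
```

With this shape, DuckDB only ever parses the outer `SELECT`, so the backtick syntax never reaches its local parser.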
ZdenekSrotyr	62336bfd32	fix(rbac): stack-gated analyst access + first-demo polish (#333 follow-up) (#356) * fix(rbac): stack-gate analyst table access via data_packages exclusively Previously analysts could see a table in ``agnes catalog`` / ``/api/sync/manifest`` either by: 1. being in a group with ``resource_grants(group, 'table', id)``, or 2. being in a group with ``resource_grants(group, 'data_package', …)`` for a package containing the table. Path 1 leaked: admins who minted a per-table grant without ever wrapping the table in a data_package still shipped the table to analysts, directly contradicting the unified-stack mental model ("the stack is the unit of access"). User report (translated from Czech): "even though the admin didn't put it in a data package, users still got it by default; that shouldn't happen". New policy: analyst visibility is strictly stack-gated. A table is visible iff at least one data_package containing it is in the analyst's stack (required ∪ subscribed). Admin god-mode and the three internal data-source tables (agnes_sessions / _telemetry / _audit with row-level RBAC) keep their existing carve-outs. Touched surfaces: * ``src/rbac.can_access_table`` + ``get_accessible_tables``: routed through ``StackResolver.stack(user, DATA_PACKAGE)`` + ``data_package_tables`` join instead of ``resource_grants(table)``. * ``app/api/sync._build_direct_tables_section``: always returns ``[]`` (key kept for older CLI destructuring); per-table grants no longer manifest. * Standardised 403 detail across ``/api/data/``, ``/api/query``, ``/api/v2/sample``, ``/api/v2/scan``, ``/api/v2/schema``: ``Table 'X' is not in your stack. Ask an admin to add it to a Data Package you have access to (Required or in your stack), then run `agnes pull` to refresh.`` Single source of truth lives in ``src.rbac.table_not_in_stack_message`` so the wording stays consistent across CLI surfaces. 
UX side: ``/catalog/t/<id>`` (table detail page) dropped the four editorial sections (Sample questions, What's inside, Things to know, Pairs well with) per user feedback — the page's job is now "what is this table, where do I find it" (hero + parent packages). Tests: ``tests/conftest.grant_table_via_package`` / ``revoke_table_via_package`` — shared helpers that wrap a table in an auto-named data_package + grant the package required to a custom group. Replaces the legacy per-test ``_grant_table_to_analyst`` table-grant pattern. * All 17 previously-failing legacy tests (test_access_control, test_journey_rbac, test_audit_gap_, test_rbac, …) migrated to use the new helper; logic stays the same. ``tests/fixtures/analyst_bootstrap._grant_table_access`` updated to wrap via data_package so the ``test_pat`` fixture's "two table grants" semantics still ship parquets through ``agnes init``. * New ``tests/test_table_not_in_stack_message.py`` locks in the standardised 403 detail across the data + check-access endpoints. 5204 tests passing (added 1). * fix(catalog): first-demo UX feedback — required-first grouping + longer card description Two minor polish items from the 2026-05-19 stakeholder demo: 1. Required packages cluster at the top of the Browse grid instead of being interleaved by ``created_at``. Sort key ``(requirement != 'required', name)`` runs before the adapter call in both /catalog (data_packages) and /corporate-memory (memory_domains) so the required block is visible without scrolling. Regression test pins the order via ``data-id="…"`` position in rendered HTML. 2. ``.stack-card__desc`` line clamp bumped 2 → 4 lines. Two-line clamp trailed almost every admin-authored description off in "…" before the second clause, forcing a click-through to read it. The detail page (/catalog/p/<slug>) keeps the unclamped body for longer content. 
* release: 0.55.3 — stack-gated analyst RBAC (BREAKING) + first-demo UX polish + #345 A/B/C/D + #347 UI consistency	2026-05-19 17:01:14 +02:00
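The stack-gated visibility rule and the required-first sort key from this commit can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `DataPackage` dataclass, the `visible_tables` and `browse_order` helpers, and the `'optional'` requirement value are hypothetical, and the admin god-mode and internal-table carve-outs are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPackage:
    # Hypothetical shape; the real data_packages schema is not shown in the log.
    id: str
    name: str
    requirement: str            # 'required', or any other value such as 'optional'
    table_ids: frozenset


def visible_tables(packages, subscribed_ids):
    """Return the table ids an analyst can see.

    A table is visible iff at least one package containing it is in the
    analyst's stack, where stack = required packages ∪ subscribed packages.
    """
    visible = set()
    for pkg in packages:
        in_stack = pkg.requirement == "required" or pkg.id in subscribed_ids
        if in_stack:
            visible |= pkg.table_ids
    return visible


def browse_order(packages):
    """Sort with the key from the commit: (requirement != 'required', name).

    False sorts before True, so required packages cluster at the top and
    each group is then ordered by name.
    """
    return sorted(packages, key=lambda p: (p.requirement != "required", p.name))
```

A table that is only referenced by a per-table grant never enters `visible_tables`, which is the behaviour change the commit marks as BREAKING.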
ZdenekSrotyr	b4d3c576af	Activity Center: audit log + telemetry + sessions + agnes_* tables (#278 ) * docs(spec): admin observability spec + Activity Center MVP plan Parent spec (480 lines) + executable plan (2295 lines, 14 TDD tasks). Covers Activity Center rebuild (/admin/activity), with /admin/sessions and /admin/feedback deferred to follow-up plans. Already incorporates reviewer-pass revisions across three angles (security, production resilience, code architecture): - _get_db import path corrected to app.auth.dependencies - Test fixtures aligned with seeded_app / admin_user / get_system_db - All new audit writes wrapped in try/except + logger.exception - Filename sanitization on session uploads - DuckDB DESC index behavior documented; upgrade window flagged - Migration idempotency + evolved-DB test cases - reveal_raw + shared-cache multi-worker explicitly deferred Targets schema v40 (audit_log gains params_before, client_ip, client_kind, correlation_id + 3 indices). * feat(db): schema v40 — audit_log gains params_before, client_ip, client_kind, correlation_id + 3 indices * chore(test): clean up Task 1 — drop unused import, rename stale test * feat(audit): AuditRepository.log() accepts params_before/client_ip/client_kind/correlation_id * test(audit): strengthen params_before assertion to round-trip JSON content * feat(audit): AuditRepository.query() rich filters + keyset cursor pagination * feat(sync): SyncStateRepository.list_recent() cross-table feed * feat(audit): POST /api/sync/trigger writes audit_log row * feat(audit): POST /api/scripts/run-due writes audit_log row * feat(audit): POST /api/upload/sessions writes audit_log row + sanitizes filename * feat(audit): GET /api/data/{table_id}/download writes audit_log row * feat(activity): /api/admin/activity timeline + /health + /sync endpoints * feat(ui): /admin/activity rebuilt — health pulse, timeline, sync grid; /activity-center → 308 redirect BREAKING: removed demo executive-pulse / maturity-roadmap content 
from activity_center.html. The page now reflects real audit_log + sync_history data. * feat(ui): admin nav + dashboard widget point at /admin/activity * feat(activity): recursive-audit suppression for AC read endpoints (60s window per actor+filter) * feat(activity): emit PostHog events when integration enabled (no-op default) * fix(audit): move v40 indices out of _SYSTEM_SCHEMA + update test_repositories to unpack query() tuple _SYSTEM_SCHEMA CREATE INDEX on audit_log(timestamp) failed when migration tests hand-roll a bare audit_log (id, action) without the timestamp column. Fix: remove indices from _SYSTEM_SCHEMA; add ADD COLUMN IF NOT EXISTS guards for timestamp and other pre-v40 columns in _v39_to_v40() so the upgrade path is safe on any hand-rolled schema; call _v39_to_v40 explicitly in the fresh-install (current==0) path to restore index creation there. Also unpack the (rows, next_cursor) tuple from AuditRepository.query() in the three TestAuditRepository tests that still treated it as a list. * docs: CHANGELOG entry for Activity Center MVP * chore: refresh stale module docstring in app/api/activity.py * feat(cli): agnes admin activity — terminal access to Activity Center (timeline + health + sync) * fix(db): _v39_to_v40 — add IF NOT EXISTS guard for 'action' column The v39→v40 ladder step adds defensive ADD COLUMN IF NOT EXISTS for every audit_log column so a hand-rolled bare audit_log (id only) is safe through the ladder. 'action' was missing from the guard list, causing CREATE INDEX idx_audit_action_time to fail on tests that stub audit_log with only an id column (tests/test_e2e_extract.py:: TestSchemaMigration::test_migration_preserves_and_extends). Local 6/6 schema tests + the previously-failing CI test pass. 
* docs(spec): platform telemetry epic — Boss directive + Activity Monitoring plan rebased onto v40 (stacked on zs/spec-activity-center) * feat(db): schema v41 — 7 usage_* tables for telemetry (events, summary, rollups, attribution) * chore(db): tighten v41 — usage_session_summary.session_id NOT NULL + upgrade test asserts all 7 tables * feat(usage): UsageAttributionRepository — replace/delete/lookup over usage_attribution_* tables * refactor(marketplace): extract list_inner_skills/agents/commands to src/marketplace_listing.py for reuse * feat(usage): explode plugin attribution on marketplace sync + store entity write; backfill script * refactor(marketplace): finish src/marketplace_listing.py extraction — drop duplicate _list_inner_* + _parse_frontmatter from app/api/marketplace.py * feat(usage): promote attribution helpers to src/usage_attribution_helpers.py; hook update_entity rename + bundle-swap; clarify best-effort semantics * feat(usage): UsageProcessor real extraction + rollup rebuild + 10 fixture-driven tests * fix(usage): include tool_id in event hash + executemany + rollup transaction (critical multi-tool-turn drop fix) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(marketplace): popularity stats — invocations_30d + trend + sort=most_used\|trending + Most Popular section * feat(admin): /admin/users/<id> Sessions section — list + single-file + bulk-zip downloads (audit-logged) * feat(usage): admin export endpoint + CLI — csv/json/parquet streaming, filters, audit-logged * feat(usage): agnes admin ask — LLM Text-to-SQL over usage_events with SELECT-only validator (audit-logged) * feat(usage): reprocess + prune endpoints + scheduler daily prune job + CLI * docs: PLATFORM_SETUP.md operator playbook + HOWTO/ cookbook (5 guides + index) Adds docs/PLATFORM_SETUP.md as a consolidated operator playbook covering bootstrap, TLS, marketplaces (curated + flea), scheduler env vars, telemetry extraction/export/ask/prune, privacy posture, and daily 
routine. Adds docs/HOWTO/ with 5 analyst cookbook guides: first query, snapshots for remote tables, private sessions, feedback + admin ask, and customizing skills. Existing setup docs (QUICKSTART, DEPLOYMENT, ONBOARDING, HEADLESS_USAGE) get a one-line cross-reference at the top pointing to PLATFORM_SETUP.md. * docs(changelog): platform telemetry epic — usage_* foundation + surfaces + admin access + docs Comprehensive [Unreleased] entry covering: usage_events/session_summary/ tool_daily/plugin_daily tables (v41), attribution lookup tables, backfill script, marketplace Most Popular + invocation chips + sort, admin Sessions section, export/ask/reprocess/prune endpoints + CLI mirrors, Activity Center (v40), PLATFORM_SETUP.md + HOWTO/ docs, and operations notes for v41 upgrade. * fix(security): block DuckDB read_/http_/glob functions in usage_ask validator + symlink escape guard in session zip + clarify mark-private semantics * fix(admin): parquet export tempfile cleanup on COPY failure + correct processed-first sort on /admin/users/<id>/sessions * feat(audit): close 8 production audit gaps — query (local/remote/hybrid), catalog/schema/sample, snapshot estimate/create, check-access * feat(ui): /admin/usage summary dashboard + per-user activity tab on /admin/users/<id> * fix(audit): cap error messages at 200 chars + audit user_activity reads + recursion guard on usage.summary * fix(audit): catalog.list audits on error path + clean up deferred json import * fix(ux): client_kind=cli for PAT auth + timeline empty state + email-instead-of-uuid + nav reorder + help text + loading indicators + ask doc * feat(observability): unify /admin/activity into single page with saved views - KPI cards (events, users, error rate, p95) clickable as quick-filters - Faceted filter dropdowns populated from audit_log in the current window - Sortable audit table, cursor pagination, per-row JSON side panel - Saved views (schema v43: user_observability_views) — per-user state - Top bar: window 
selector + 30s Live toggle + saved views dropdown - /admin/scheduler-runs → 308 redirect (source=scheduler filter) - New endpoints: /api/admin/observability/{facets,kpis,views} * test: update activity + scheduler-runs tests for unified page - test_admin_activity_page_renders asserts new structural anchors - test_admin_scheduler_runs_page_admin_only asserts 308 redirect * fix(observability): respect [hidden] on modal + side panel CSS `display: flex` on .obs-modal beat the [hidden] attribute's UA display:none, so the save-view modal rendered on page load and Cancel clicks couldn't dismiss it. Gate the modal's flex layout on :not([hidden]); add the same display:none guard prophylactically to .obs-panel and .obs-views-panel. * feat(observability): user enrichment in audit + interactive /admin/usage Activity: - /api/admin/activity now joins users for user_email + user_name per row - User column renders "name (id-prefix)" or "email (id-prefix)" instead of an opaque truncated UUID; falls back to id when the user record is missing Usage: - /admin/usage rewritten as the same filter/group-by/search pattern as /admin/activity. 
Faceted dropdowns (User / Tool / Source / Event type) populated from usage_events; debounced free-text search across tool_name / skill_name / subagent_type / command_name - New endpoints /api/admin/usage/{facets,kpis,query}; the query endpoint supports group_by in {day, username, tool_name, source, ref_id} with sort + offset pagination, plus an ungrouped raw-events mode - 4 KPI cards (events, distinct users, distinct tools, error rate) are clickable quick-filters; clicking a grouped row applies the bucket as a filter - Old static `?window=7d\|30d\|all` server preload removed; all state is client-side via since_minutes + group_by + filters in the URL * fix(observability): clearer labels, all-column sort, drop saved views UI - Rename page titles: "Activity" → "Server activity", "Usage" → "Tool usage" with a one-line subtitle on each explaining what the page covers and linking the other one. The two pages source different data (audit_log vs usage_events) and the previous labels conflated them. - Drop the saved-views dropdown + save modal from /admin/activity. The modal pop-open bug was the trigger; the value wasn't there yet. The /api/admin/observability/views CRUD + DuckDB table stay in place. - Rename "Live (30s)" to "Auto-refresh (30s)" with a tooltip clarifying that it's the re-fetch rate, not the time range. Time range now labeled "Time range" instead of "Window". - All audit-table columns are sortable (User, Source, Action, Resource, Result added); sort is page-local with a Jinja comment explaining the trade-off. Same for raw usage rows. - Fix duplicate sort-arrow bug — the literal "▼" in the Time th HTML was rendering alongside the CSS ::before arrow. Removed the literal; CSS is the single source of truth. * feat(observability): global Sessions browser + transcript viewer + CLI Web: - /admin/sessions — list every collected session JSONL across all users with time-range, user, model, errors-only and free-text filters. 
Default sort surfaces error-heavy sessions first. KPI cards (sessions, distinct users, sessions w/ errors, tool error rate) clickable as quick-filters. - /admin/sessions/<username>/<file> — transcript viewer rendering the JSONL chronologically: user prompts, assistant text, tool calls (with JSON input) and tool results (with flattened output). Errors get a red border + chip and a "Next error" navigation button at the top. - Admin dropdown gains a "Sessions" link. API: - GET /api/admin/sessions/{list,kpis,facets} — filtered cross-user reads off usage_session_summary - GET /api/admin/sessions/{username}/{file}/transcript — parses JSONL via the existing services.session_pipeline.lib, returns chronological events - GET /api/admin/sessions/{username}/{file}/download — JSONL stream, same path-safety guards as the per-user endpoint, audit-logged CLI: - `agnes admin sessions list [--user X] [--errors] [--since 7d]` — table output with `!` prefix on rows that hit a tool error - `agnes admin sessions show <username> <file>` — transcript dump, with `--errors` to print only the failed tool_result blocks - `agnes admin sessions download <username> <file> [-o path]` - `agnes admin sessions kpis` — top-level numbers * feat(internal): expose telemetry tables to agnes query with row-level RBAC Three new registered tables backed by system.duckdb, queryable through the same /api/query plumbing analysts use for Keboola / BigQuery / local sources: agnes_sessions → usage_session_summary (filter: username) agnes_usage → usage_events (filter: username) agnes_audit → audit_log (filter: user_id) RBAC is per-row, not per-table: admins see every user's rows; non-admins see only their own. The filter is built server-side from the auth user dict; non-admin filter values are regex-validated before SQL interpolation. 
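The row-level filter described above might look roughly like the sketch below. It is not the connector's actual code: `SAFE_VALUE`, `sketch_filter_clause`, and the `username` key on the user dict are assumptions, and only the alias-to-table mapping is taken from the commit message.

```python
import re

# Assumed validation pattern; the log only states that non-admin filter
# values are regex-validated before being interpolated into SQL.
SAFE_VALUE = re.compile(r"[A-Za-z0-9_.@-]+")

# alias -> (underlying table, filter column), as listed in the commit message
INTERNAL_TABLES = {
    "agnes_sessions": ("usage_session_summary", "username"),
    "agnes_usage": ("usage_events", "username"),
    "agnes_audit": ("audit_log", "user_id"),
}


def sketch_filter_clause(alias: str, user: dict, is_admin: bool) -> str:
    """Build the WHERE clause that scopes an internal table to one caller."""
    _, column = INTERNAL_TABLES[alias]
    if is_admin:
        # Admins see every user's rows, so no filter is applied.
        return ""
    # The filter value always comes from the server-side auth user dict,
    # never from the SQL the caller submitted.
    key = "username" if column == "username" else "id"
    value = str(user.get(key, ""))
    if not SAFE_VALUE.fullmatch(value):
        raise ValueError(f"unsafe filter value for {column}")
    return f"WHERE {column} = '{value}'"
```

Rejecting unexpected characters outright, rather than trying to escape them, keeps the interpolation step simple to reason about.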
Implementation: - new connector connectors/internal/ with access (filter+exec) + registry (idempotent table_registry seed at startup) - /api/query detects internal table refs and short-circuits to a CTE wrapper that prepends "WITH agnes_x AS (SELECT * FROM <src> WHERE …), …" then "SELECT * FROM (<user_sql>) AS _q". DuckDB cursor on the shared system.duckdb handle — opening parallel handles / ATTACH on the same file is blocked process-wide. - mixing internal + BQ / registered local tables in one SELECT is rejected (v1 limitation) - src.rbac.can_access_table waves internal tables through for all authenticated users; row scoping is the actual security control - /api/v2/schema and /api/v2/sample gained internal branches; sample intentionally skips its cache because rows are RBAC-scoped per caller - audit row written as action='query.internal' with is_admin flag Tests: connectors/internal/access — RBAC, filter clause, schema, CTE wrapper coexistence with user-supplied aggregations, unsafe-username rejection. 16/16 passing. Motivating queries this enables: SELECT tool_name, COUNT() FROM agnes_usage WHERE is_error GROUP BY 1 ORDER BY 2 DESC -- analyst self-introspection: which tools fail for me? SELECT user_id, COUNT() FROM agnes_audit WHERE action = 'session.transcript_view' GROUP BY 1 -- admin: who's been looking at whose session transcripts? * feat(admin): group dropdown into 5 named sections + internal tables in /catalog Admin dropdown gains section headers so admins can land on the right page without re-reading the full menu: Activity Center Server activity / Tool usage / Sessions Users & Access Users / Groups / Resource access / Tokens Data Tables Agent Experience Curated Marketplaces / Flea Submissions / Agent Setup Prompt / Agent Workspace Prompt Server Server config "Agent Experience" frames the curated content + prompts as one cluster — it's all admin-controlled material that shapes what an analyst's AI agent encounters. 
"Configuration" → "Server" since only one item lives there now. Renamed the section's first two items: "Activity" → "Server activity" (matches page H1) "Usage" → "Tool usage" Also fixes /catalog visibility of the internal tables (agnes_sessions / _usage / _audit) for non-admin users: ``app.auth.access.can_access`` short-circuits to True for resource_type='table' + an internal-table id. Without this, non-admins saw the tables in /api/v2/catalog (which uses the same RBAC bypass) but not on the /catalog HTML page (which calls can_access directly, requiring a resource_grants row internal tables don't have). CSS for `.app-nav-menu-section`: small caps, muted, non-clickable; first section trims top padding so the panel doesn't open with an awkward gap. * refactor(admin): move corporate memory into Admin > Agent Experience Memory link was the only admin-only entry in the primary nav (gated by session.user.is_admin). Moves it into the Admin dropdown under Agent Experience, alongside Curated Marketplaces / Flea Submissions / Prompts — all admin-curated content that shapes what an analyst's AI agent encounters. Renamed the nav label to "Shared Knowledge" to match what the page actually is (admin-curated organisational knowledge from session verification, surfaced to agents). URL stays at /corporate-memory; the route still gates on require_admin per the existing comment. Side effect: primary nav (Home / Marketplace / Data Packages) is now uniform for every authenticated user — no conditional admin-only entry. 
* ui: rename admin entries to Curated Knowledge / Init Prompt / Workspace Prompt - "Shared Knowledge" → "Curated Knowledge" (parallel with "Curated Marketplaces" in the same Agent Experience section; "curated" tells the admin what they do there — review + approve) - "Agent Setup Prompt" → "Init Prompt" (matches the `agnes init` flow it actually drives) - "Agent Workspace Prompt" → "Workspace Prompt" (the "Agent" prefix was redundant — every item in the section is agent-facing) Renames page titles + H1s on /admin/agent-prompt and /admin/workspace-prompt to match. * refactor: rename Usage → Telemetry across user-facing surfaces External surfaces all switch; internal Python module / file names and the physical DB tables (usage_events, usage_session_summary, usage_tool_daily, usage_plugin_daily) stay — renaming them would force a schema migration + a redo of the LLM Text-to-SQL prompt for no analyst-visible win. Changes: - Admin dropdown: "Tool usage" → "Telemetry" - Page H1 / <title>: same - URL: /admin/usage → /admin/telemetry; old URL 308-redirects - API prefix: /api/admin/usage/* → /api/admin/telemetry/* - CLI: primary command `agnes admin telemetry …`; `agnes admin usage` kept as a deprecated alias so existing operator scripts keep working - Internal data-source table id: agnes_usage → agnes_telemetry. The registry seed now evicts any stale internal-source row whose id no longer matches INTERNAL_TABLES, so the old `agnes_usage` row is removed from table_registry on next app boot - All tests + JS endpoint paths updated * test(rbac): include auto-appended internal tables in expectations get_accessible_tables now appends agnes_sessions / agnes_telemetry / agnes_audit to every authenticated user's accessible-tables list so the internal data source shows up in /catalog. The two existing rbac tests asserted hardcoded list shapes that pre-dated the change. 
Rewritten to assert "granted tables + the canonical internal-table set" instead of literal lists, so the test stays correct if the internal table roster changes again later. * ui: visual dividers between admin-dropdown sections Adds a 1px top border + 6px top margin to every section header except the first, so the five named groups (Activity Center, Users & Access, Data, Agent Experience, Server) read as visually separated clusters. The header itself stays small-caps + muted as before — the border is additive. * ui(memory): match obs-topbar visual on /corporate-memory The Curated Knowledge page (linked from the admin dropdown's Agent Experience section) opened straight into the stats bar — no title, no subtitle, no shared chrome with the other admin pages. Adds an obs-topbar-style header at the top of .container-memory: - H1 "Curated Knowledge" - subtitle explaining what the page is + how AI agents pull from it The `.ck-` class set duplicates the inline obs- styles from /admin/activity etc. for this one page; promoting the obs-* class set to style-custom.css for shared reuse is the obvious next step (4 pages already inline the same CSS), tracked as a follow-up. Page <title> also renamed from "Corporate Memory" → "Curated Knowledge". * ui(tables): list Agnes internal tables in /admin/tables + group in /catalog /admin/tables previously rendered three per-source-type listings (BQ / Keboola / Jira) and dropped any row whose source_type didn't match — so the agnes_sessions / agnes_telemetry / agnes_audit rows seeded into table_registry were invisible. Adds a fourth read-only section "Agnes internal tables" that filters source_type === 'internal' and renders the same registry-table layout the other sections use, with two changes: - no Register button (these rows are seeded on every app boot from connectors/internal/registry.py) - Edit + Delete actions hidden (any change would be reverted on the next start). Manage access stays so admins can still inspect. 
Mode badge picks up a new mode-internal CSS class (teal accent) so the display doesn't lie and call it "local". In /catalog, internal tables now group under an "agnes" accordion section (bucket="agnes" on seed) instead of falling into the catch-all "default". Single source of truth for which tables exist; admins find them where they expect. * ui(tables): Agnes internal as a 4th tab next to BQ/Keboola/Jira Previous iteration mounted the internal-table listing as a separate standalone card under the tab strip. Reshapes it to a proper tab-content section so admins switch between data sources via one consistent nav (BigQuery / Keboola / Jira / Agnes internal). - New tab button "Agnes internal" in the tab-nav. - The listing card becomes <section id="tab-content-internal" class="tab-content">; switchTab() already routes by id so no JS change beyond extending the hash allowlist for direct #internal links. - Tab content keeps the read-only treatment from the previous commit (no Register button, no Edit / Delete in renderRegistryListing). * ui: rename Curated Knowledge → Curated Memory Settles the naming back on "Curated Memory" — parallel structure with "Curated Marketplaces" in the same Agent Experience section, and zero rename ripple: URL (/corporate-memory), API (/api/memory/), CLI (agnes admin memory), and Python modules all stay on "memory" so the admin label finally lines up with the underlying surfaces. The "Curated" prefix still tells admins what they do on the page (review pending → approve / mandate / reject) and reads as a sibling of "Curated Marketplaces" right next to it in the dropdown. Touches: admin dropdown label, page <title>, page H1. DB tables stay on knowledge_ (already the canonical naming for the data shape). * ui: rename "Server activity" → "Audit log" "Audit log" is what the page actually is — server-side audit_log table rendered with KPI cards + filter bar + sortable table. 
The "Server activity" label confused the term with Claude Code session telemetry (Telemetry page) and didn't make the source/concept clear. Touches: - Admin dropdown nav label - /admin/activity page H1 + subtitle - /admin/telemetry subtitle cross-link - test_activity_api page-renders assertion URL (/admin/activity) and API (/api/admin/activity/) stay: the "activity" name has stuck at the route layer for a year; rerouting those would churn dashboards/bookmarks for zero analyst-visible win. ui(admin-nav): gray band on each section header for clearer separation Previous iteration used a 1px top border between section labels, but the labels still blended into the items above/below at a glance. Switches to a light gray background band per section header, extended edge-to-edge inside the panel via negative horizontal margins. Bolder font-weight (700) reinforces the separation; bumping the font color isn't needed because the band itself does the work. First section's header tucks into the panel's top border-radius so the band reaches the corners without a gap. * ui(catalog): rename internal-table category to "Agnes Internal" `bucket` is what /catalog renders as the accordion category header verbatim; "agnes" lowercase didn't read as a real category name and got confused with a system identifier. Bumps to "Agnes Internal". Seed re-applies on every app boot so existing rows pick up the new bucket value via `ON CONFLICT (id) DO UPDATE`. * ui(catalog): split Agnes Internal into its own card on /catalog Previously the three internal tables landed inside the "Core Business Data" card under an "Agnes Internal" accordion alongside Keboola / BQ buckets; readers conflated system telemetry with business datasets, and the data_stats header counter ("3 tables · ~X rows total") only ever counted synced rows so internal tables looked invisible. Split the catalog page into two cards: - Core Business Data: only non-internal source_types (Keboola, BQ, Jira). 
Accordions group by bucket as before. Stats counter reflects this card's tables. - Agnes Internal: a dedicated card with its own visual treatment (teal accent matching the mode-internal badge in /admin/tables). Flat list (no accordion — only 3 rows, never grows here), each row carries the canonical `agnes query` snippet. Read-only — no profiler click, no In-stack toggle, no sync metadata. Route adds `internal_card` context object; template renders the new card only when it's non-None. * fix(rbac): hide internal tables from /admin/access + drop "my" framing Two related cleanups for the Agnes-internal tables: 1. /admin/access (resource grants) no longer lists them. The `can_access` check has a hardcoded internal-table bypass — security is row-level (per-request view filter), so a table-grain `resource_grants` row would do nothing. Surfacing them in the UI let admins set up grants that silently no-op. Filter at the `_table_blocks` projection so the UI tree never sees them. 2. Display names drop the analyst-perspective "my" framing: "Agnes — my sessions" → "Agnes sessions" "Agnes — my telemetry events" → "Agnes telemetry events" "Agnes — my audit log" → "Agnes audit log" The "my" only makes sense from the querying analyst's seat (`SELECT … FROM agnes_sessions` returns their rows); on /admin/* pages where admin sees / configures them across users, the pronoun was misleading. Description text now spells out the row-level RBAC contract explicitly. Display names update via TableRegistryRepository.register's ON CONFLICT UPDATE on next app boot; no manual cleanup needed. * ui: subtitle notes about agnes_* tables on each Activity Center page The recursive observability story — Agnes serves its own audit / telemetry / session data through the same `agnes query` plumbing analysts use for business data — wasn't surfaced anywhere on the admin pages that show that data. 
Three pages get a one-liner with the canonical `agnes query` snippet + the RBAC contract (analysts see their own rows, admin sees all): - /admin/activity (Audit log) → agnes_audit - /admin/telemetry (Tool usage) → agnes_telemetry - /admin/sessions → agnes_sessions Sets up the discovery moment for admins: they're reading the page, they see "you can query this from Claude Code", they remember it when an analyst asks "how do I find my own failed tool calls?". * ui(tables): explain "Show log" empty-state on /admin/tables Cache warmup log <pre> renders with a dark background and is only populated by the SSE stream during a Re-warm all run. Opening the page cold + clicking Show log just revealed a black bar with no context — admins couldn't tell what they were looking at. Adds an inline paragraph above the <pre> explaining what the log is, the row format, when it fills in, and where to find the historical audit trail (/admin/activity). The actual <pre> stays empty until SSE events arrive, but the surrounding copy carries the meaning. * ui(tables): auto-open cache-warmup log on Re-warm all click A Re-warm all run takes ~24s per remote BQ row. With the <details> collapsed by default, operators saw the button disable, watched a quiet ~24s pass, and assumed nothing had happened — the streaming log was hidden behind a closed disclosure. Two small JS tweaks: - cacheWarmupRun() opens the details on click, so streamed lines appear without an extra interaction - cacheWarmupOnStart() hides the inline hint paragraph the moment real log content lands, so the dark log block isn't competing with redundant context Hint paragraph also clarifies that only `query_mode='remote'` BQ rows are warmed — operators with only materialized/internal tables would see total=0 and the page would "do nothing" by spec. 
* ui: trim Agnes internal copy across surfaces Descriptions had grown to explain the extraction pipeline ("parsed out of session JSONLs"), the underlying table ("Backed by usage_session_summary"), the RBAC mechanic ("row-level RBAC at query time — analysts see their own; admin sees all"), and the SQL snippet. Every implementation detail meant another rewrite on the next iter. Strips to one stable line per surface: what the data is, plus "Also available locally for analysis". Mechanics live in code + docs; the page copy says what the user needs to know. Touched: - connectors/internal/access.py: INTERNAL_TABLES descriptions - activity_center.html / admin_usage.html / admin_sessions.html subtitles - catalog.html Agnes Internal card description + row strip - admin_tables.html "Agnes internal" tab hint * fix(internal): is_user_admin arity bugs + + saved-view payload cap Round-1 code review (PR #278) caught two blocking bugs and three nits. Blocking — both `is_user_admin(user)` (single dict arg) calls raised TypeError. is_user_admin signature is `(user_id, conn)`. Affected: - app/api/query.py:_run_internal_query — every POST /api/query that references agnes_sessions / agnes_telemetry / agnes_audit blew up with a 500. The headline analyst-facing feature of this PR was unusable through the API. - app/api/v2_sample.py — same shape; `GET /api/v2/sample/agnes_` returned 500. Both fixed to call `is_user_admin(user.get("id"), conn)`. Added two FastAPI-level tests in test_internal_data_source.py that go through the TestClient — the existing unit tests on `execute_internal_query` and `build_filter_clause` skipped the request-handler layer where the bugs lived, which is why this landed. Nits also closed: - connectors/internal/access.py: `+` allowed in _USERNAME_RE / _USER_ID_RE so RFC 5321 email local-parts (alice+test@x) resolve correctly without hitting InternalAccessError. 
- app/api/observability.py: saved-view payload capped at 64 KiB to prevent an admin from bloating system.duckdb with a malformed save. fix(security): close non-admin data-leak via underlying-table refs PR #278 R2 review surfaced a non-admin-exploitable bypass: SQL whose string literal contains 'agnes_sessions' routed into the privileged internal-query path, then queried the underlying physical table (usage_session_summary / usage_events / audit_log) directly, escaping the CTE wrapper's row filter. Two reinforcing defenses: 1. find_internal_refs() now strips single-quoted string literals before scanning for alias names — a literal alone no longer routes the request into the privileged code path. 2. execute_internal_query() rejects non-admin SQL that references the underlying physical tables (usage_, audit_log). The CTE wrapper only scopes the agnes_ aliases; a direct FROM on the base table — or a shadowing inner WITH that still has to read the base table — bypasses RBAC. Block before execution with an actionable error pointing to the agnes_* alias. Admins are unaffected (god-mode short-circuit on the filter clause). 3. tests/test_internal_data_source.py — three new negative tests covering literal-only matches, direct-table refs, and CTE shadow attempts. Also tightens usage_ask.py's SELECT-only validator: pragma_table_info, pragma_storage_info, pragma_database_, and duckdb_tables / columns / views / indexes / schemas are reflection functions that leak metadata the analyst question shouldn't reach. \bPRAGMA\b in _FORBIDDEN never matched the function-call form (word-boundary between `A` and `_`). 
fix(security): dynamic denylist for non-admin internal queries R3 review (PR #278) caught a wider data-leak than R2: the underlying-physical-table guard listed only the 7 usage_* + audit_log tables, but system.duckdb has 30+ other sensitive tables — users (emails + ids), personal_access_tokens, resource_grants, user_groups, user_observability_views, store_*, marketplace_*, knowledge_*, etc. A non-admin SQL like SELECT * FROM agnes_sessions UNION ALL SELECT email, id, … FROM users LIMIT 1 would leak every user's row. Replaces the hardcoded denylist with a dynamic allowlist — non-admin SQL may reference ONLY the registered agnes_* aliases. Every other table in `information_schema.tables` (main schema) is rejected. Future migrations that add a new sensitive table are automatically covered without re-editing this module. Also strips SQL comments (`/* */` and `--`) before the identifier scan so a comment-wrapped table name (`/**/users/**/`) can't slip past the regex. Four new negative tests pin: `users`, `personal_access_tokens`, block-comment wrap, line-comment wrap. Plus: per-user view-count cap (100) on /api/admin/observability/views so an admin can't fill system.duckdb with thousands of saved views. release: 0.54.0 — Activity Center + Telemetry + Sessions + internal datasource Cuts the work shipped across this PR (Activity Center build, recursive internal data source) into a versioned release. Bumps pyproject.toml to 0.54.0; renames the top of CHANGELOG.md from [Unreleased] to [0.54.0] — 2026-05-12 with a header summary; opens a fresh [Unreleased] section for the next round. --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 22:41:19 +02:00
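The allowlist idea in the "dynamic denylist" commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `disallowed_table_refs` and its parameters are hypothetical names, the catalog set stands in for whatever `information_schema.tables` returns, and the token scan is deliberately conservative (a column that shares a table's name would also be flagged).

```python
import re

# Hypothetical patterns; the real module's regexes are not shown in this log.
_BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
_LINE_COMMENT = re.compile(r"--[^\n]*")
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def disallowed_table_refs(sql, catalog_tables, allowed_aliases):
    """Return catalog tables named in `sql` that are not allowed aliases.

    Comments and single-quoted literals are replaced with a space first, so
    `/**/users/**/` still exposes `users`, while a literal 'users' does not.
    """
    cleaned = _BLOCK_COMMENT.sub(" ", sql)
    cleaned = _LINE_COMMENT.sub(" ", cleaned)
    cleaned = _STRING_LITERAL.sub(" ", cleaned)
    tokens = {tok.lower() for tok in _IDENTIFIER.findall(cleaned)}
    allowed = {alias.lower() for alias in allowed_aliases}
    return sorted(
        name
        for name in {table.lower() for table in catalog_tables}
        if name in tokens and name not in allowed
    )
```

A non-empty return value would be the signal to reject the request before execution; admins would skip the check entirely.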
ZdenekSrotyr	1ade1300c6	fix(bq-hint): drop literal backslash escapes from syntax-error hint string (#275 ) PR #274 (just merged) introduced `\`AS \\\`rows\\\`\`` in the syntax-error branch of _hint_for_bq_bad_request. Python doesn't recognize \\\` as an escape sequence, so the literal backslashes survived into the JSON `hint:` field. Analyst reading the CLI error saw: Backtick the alias (`AS \`rows\``) or rename it ... with visible backslashes — exactly the misleading shape this dispatcher exists to clean up. Self-review caught it; this PR replaces the problematic substring with plain prose ("rename the alias to a non-reserved word (AS row_count) or backtick-quote it BQ-style (AS `rows` with literal backticks around the identifier)") that needs no escape gymnastics. New regression test test_no_hint_branch_leaks_literal_backslashes pins every dispatch branch against `\\\`` and `\\\\` substrings — pytest now catches this class on the next regression instead of waiting for an analyst to spot it. CHANGELOG bullet rephrased to match (the same broken backslashes leaked into the [Unreleased] entry). Verified: 4162 tests pass; 26 in test_api_query_guardrail.py green; demo print of the syntax-error branch shows clean output.	2026-05-12 18:57:46 +00:00
ZdenekSrotyr	5458ccc41b	hygiene: BQ error hint dispatch + catalog ENTITY column (#274 ) Two analyst-UX papercuts surfaced by the v0.53.4 onboarding smoke test. 1) /api/query remote_estimate_failed hint now branches on the BigQuery error class instead of always claiming a column doesn't exist. The previous hardcoded "Most often this means a column referenced … doesn't exist" misled analysts whenever BigQuery actually rejected on syntax — concretely, `SELECT COUNT(*) AS rows FROM …` fails with `Syntax error: Unexpected keyword ROWS at [1:20]` (`rows` is a BQ reserved word) and the hint pointed at non-existent columns. New _hint_for_bq_bad_request() helper dispatches: - "Syntax error" / "Unexpected keyword" → reserved-keyword alias hint with `AS row_count` workaround - "Unrecognized name" / "not found inside" → `agnes schema <id>` - "Table not found" → `agnes catalog` - fallback → enumerate all three 4 unit tests in TestHintForBqBadRequest pin each branch. Existing guardrail tests (test_fallback_fails_fast_on_pure_duckdb_syntax, test_remote_estimate_failed_surfaces_first_error_when_attempts_differ) continue to pass — both hint substrings they assert on still appear in the relevant branches. 2) `agnes catalog` replaces the FLAVOR column with ENTITY. FLAVOR rendered t['sql_flavor'] which duplicated SOURCE for any catalog dominated by one source type — analysts saw `SOURCE=bigquery FLAVOR=bigquery` on every row. ENTITY instead surfaces the upstream BigQuery entity_type (BASE TABLE / VIEW / MATERIALIZED_VIEW) for remote rows; non-remote rows render `-`. The distinction matters operationally: views don't support predicate pushdown, so `agnes query --remote` against a view trips the cost guardrail where the same query against a BASE TABLE pushes down cleanly. The entity_type field has been in the v2 catalog response since 0.51.0; this PR just stops hiding it behind a column header that conveyed no information. 
JSON output (`agnes catalog --json`) is unchanged — only the human- readable column changed. No DB migration; no API change. Verified: 4161 tests pass locally; 25 in test_api_query_guardrail.py green; the 4 new TestHintForBqBadRequest cases pin each branch.	2026-05-12 18:32:29 +00:00
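The hint dispatch described in #274 can be sketched as a small substring-based classifier. The function name and the returned labels below are hypothetical; only the matched substrings and the four branches come from the commit message.

```python
def hint_kind_for_bq_error(message: str) -> str:
    """Classify a BigQuery bad-request message into one of four hint branches.

    The return values are illustrative labels, not the real hint strings.
    """
    if any(marker in message for marker in ("Syntax error", "Unexpected keyword")):
        return "reserved_keyword_alias"  # e.g. suggest `AS row_count`
    if any(marker in message for marker in ("Unrecognized name", "not found inside")):
        return "run_agnes_schema"  # point at `agnes schema <id>`
    if "Table not found" in message:
        return "run_agnes_catalog"  # point at `agnes catalog`
    return "enumerate_all"  # fallback: list all three suggestions
```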
ZdenekSrotyr	b6cdd68e8d	feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene Three behavioural improvements driven by the sub-agent end-to-end test findings, plus scheduler tweaks to prevent the post-deploy contention burst we measured. CATALOG (catalog-side bugs the test agents tripped on): - new entity_type field per remote row (BASE TABLE / VIEW / MATERIALIZED VIEW). For views, rows + size_bytes return null instead of the misleading 0 that __TABLES__ reports. - where_examples now validates against the table's actual schema (cached known_columns from refresh). The pre-fix behavior blindly advertised `country_code = 'CZ'` on tables with no country_code column — the sub-agent tests reliably hit this on unit_economics. - new known_columns + entity_type columns on bq_metadata_cache; populated by bq_metadata_refresh.refresh_one from the same fetch_bq_columns_full call (no extra BQ roundtrip) plus a cheap INFORMATION_SCHEMA.TABLES lookup for table_type. QUERY COST-GUARD: - remote_scan_too_large suggestion now names views explicitly: `Target(s) <ids> are VIEW or MATERIALIZED VIEW. BigQuery does not push LIMIT into the view body — SELECT * FROM <view> LIMIT 1 still runs the full underlying scan.` Programmatic consumers get a new view_targets field on the error detail. SCHEDULER HYGIENE (the post-deploy 1-minute window where concurrent parquet downloads dropped to ~1 MB/s): - SCHEDULER_STARTUP_GRACE_SECONDS (default 60) holds the first tick so the burst doesn't overlap cache_warmup writes. - SCHEDULER_BQ_METADATA_INITIAL_OFFSET_MAX_SECONDS (default 900) randomises bq-metadata-refresh's first-fire offset. 
TESTS: - test_bq_metadata_cache_repo: entity_type + known_columns round-trip - test_v2_catalog_remote_metadata: where_examples validation, views return null rows/size_bytes, cold rows have empty examples - test_api_query_guardrail: VIEW-aware suggestion text + view_targets - test_connectors_bigquery_metadata: entity_type lookup mock + new fields in TableMetadata expectations - test_scheduler_sidecar: grace + jitter env-var resolution	2026-05-12 10:37:35 +02:00
ZdenekSrotyr	917f9aaef0	release: 0.47.2 — restore #218 + #219 fixes silently reverted by #217 (#225 ) ## Summary Smoke-testing the just-shipped 0.47.1 against production exposed two regressions: 1. `agnes query --remote "SELECT FROM unit_economics WHERE bad_col=1"` returned `Table "unit_economics" must be qualified` (the OLD error) instead of `Unrecognized name: bad_col` (the #218 fix's intended behavior). 2. `agnes query "DESCRIBE unit_economics"` showed only DuckDB's misleading `Did you mean order_economics?` with no Agnes hint paragraph (the #219 fix is missing). Root cause: PR #217's squash merge (`506a378c`) carried stale snapshots of `app/api/query.py` and `cli/commands/query.py` from before #218 and #219 merged. The rebase-and-merge auto-merged those files cleanly (no conflict markers) but the result silently reverted both fixes. Restore the two changes verbatim. Tests for both fixes already on main and continue to pass against the restored code. ## Test plan - [x] `pytest tests/test_api_query_guardrail.py tests/test_cli_query.py` — clean - [x] Manual repro against prod after deploy: both flows now surface the intended diagnostic.	2026-05-07 19:57:18 +02:00
ZdenekSrotyr	506a378c3a	release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217 ) ## Summary Brings the Keboola connector to feature parity with the legacy internal data-analyst's per-table sync strategies. Closes the four documented gaps from the spec branch (`zs/keboola-connector-specs`): - Typed parquet in the legacy SDK extraction path — column types from Keboola Storage metadata (provider cascade `user > ai-metadata-enrichment > keboola.snowflake-transformation`) survive the CSV → parquet roundtrip; invalid date strings (`'0000-00-00'`) and invalid numeric strings (`'Non-Manager'`) become NULL while keeping the column's typed schema. Pre-fix everything was VARCHAR. - Incremental sync via Storage API `changedSince` — opt-in per table; pulls only delta rows, merges into the existing parquet by `primary_key` (drop_duplicates with keep='last'). Cuts daily extraction from O(full table) to O(delta). - Partitioned sync — flat per-partition layout `data/<table>/<key>.parquet` (e.g. `2026_05.parquet`), per-affected-partition merge for daily updates, chunked initial load with 1-day overlap and 2-empty-chunk stop heuristic. - `where_filters` — server-side row filter with date placeholders (`{{today}}`, `{{last_3_months}}`, `{{start_of_3_months_ago}}`, etc.) resolved at sync time. Force the SDK path; reject `incremental + where_filters` combination at API layer (changedSince already filters temporally). ## Architecture - Schema migration v25 → v26: 7 new columns on `table_registry`. Existing `sync_strategy` column reused (pre-v26 it was inert catalog metadata; post-v26 the extractor dispatches off it). - Per-table dispatcher in `extractor.run()` routes to one of `_extract_via_extension` (full_refresh + extension), `_extract_via_legacy` (full_refresh + filters or extension fallback), `extract_incremental`, or `extract_partitioned`. 
- API conflict policy: `incremental + where_filters` → 422; `partitioned + query_mode='remote'` → 422; `partitioned ⇒ partition_by required`. - Admin UI: third "Direct extract (Storage API)" radio in the Keboola Register / Edit modals, alongside existing "Whole table (extension)" and "Custom SQL". When selected, exposes a v26 sync-strategy panel with conditional fields per strategy. ## Test plan - [x] Unit + module — 134 v26 tests covering migration, repo, parquet_io, where_filters, incremental (compute_changed_since + merge_parquet + extract_incremental E2E), partitioned (key derivation + merge_partition + chunked windows + extract_partitioned E2E), extractor dispatcher, admin API validators, PUT field clearing, registry-shape → dispatcher bridge - [x] HTML form structure — all v26 inputs + visibility classes + JS payload fields verified in rendered template - [x] Real Keboola roundtrip — registered a small test table as `sync_strategy='incremental'` against a test Storage project, triggered two syncs: - Sync 1: `changedSince=None` → full pull → 9 rows typed parquet - Sync 2: `changedSince=last_sync - 1d window` → 9 delta rows merged with 9 existing → 9 after dedup on primary_key (PK merge confirmed) - [x] Browser UX — agent-browser session against a local uvicorn: login → admin/tables → register modal → switch radios → verify field visibility per strategy → submit → edit existing row → switch to Direct/Incremental → save → confirm DB persistence - [x] Regression — no regressions in the broader 3252-test suite (3 pre-v26 tests updated for the deprecation-marker removal + schema-version bump; 2 pre-existing environment-sensitive test failures unrelated to this change) ## Bugs caught + fixed during E2E The browser + real-Keboola roundtrip exposed four bugs the unit tests missed: 1. 
JS visibility race — two competing `forEach` loops set `display=''` then `display='none'` on form elements sharing `kb-strategy-incremental kb-strategy-partitioned` classes (window_days + max_history_days are reused across strategies). Fix: single-pass selector with class-based visibility resolver. 2. PUT cannot clear field — pre-v26 `updates = {k: v ... if v is not None}` collapsed "omitted from body" and "sent as null" into the same case, so admin couldn't switch a partitioned row back to full_refresh and have stale `partition_by` clear. Fix: `model_dump(exclude_unset=True)`. 3. Subprocess DB lock conflict — `_read_last_sync` reopened `system.duckdb` while the parent server held the write lock (subprocess contract at `app/api/sync.py:_run_sync` line 260). Fix: parent injects `__last_sync__` into table_config before subprocess spawn. 4. Wrong KBC table_id — `extract_incremental` / `extract_partitioned` built the Storage API table_id from the registry row's slugified `id` (`circle_inc`) instead of `bucket.source_table` (`in.c-finance.circle`), producing 404s. Fix: prefer `bucket+source_table`; fall back to `id` only when bucket empty. ## Operator notes - Existing tables stay on `full_refresh` after migration; admins opt individual tables in via `agnes admin register-table --sync-strategy ...`, the Keboola Edit modal, or `POST/PUT /api/admin/registry`. - `merge_parquet` and `merge_partition` use `pd.concat + drop_duplicates`, loading both existing and delta into pandas RAM. For tables in the multi-million-row range this may OOM — switch to `partitioned` strategy for those (per-partition merge keeps memory bounded). Documented in `### Internal` of the changelog entry. - Date placeholders are resolved at sync time, not register time — a typo'd `{{lasst_week}}` is accepted at register and surfaces only when the next sync runs. By design (rolling windows need late-binding). 
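The late-binding behaviour noted for date placeholders can be illustrated with a minimal resolver. This is a sketch under stated assumptions: `resolve_placeholders` is a hypothetical name, only `{{today}}` is modelled, and the ISO date format is assumed rather than taken from the connector.

```python
import re
from datetime import date

_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve_placeholders(value: str, today: date) -> str:
    """Substitute date placeholders at sync time.

    Unknown names raise here, at resolution, which is why a typo accepted at
    register time only surfaces when the next sync runs.
    """
    resolvers = {"today": lambda: today.isoformat()}  # assumed output format

    def _substitute(match):
        name = match.group(1)
        if name not in resolvers:
            raise ValueError("unknown date placeholder: " + match.group(0))
        return resolvers[name]()

    return _PLACEHOLDER.sub(_substitute, value)
```

Calling `resolve_placeholders("{{lasst_week}}", date.today())` raises `ValueError`, mirroring the sync-time failure described in the operator notes.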
## Spec source The four corresponding plans on the `zs/keboola-connector-specs` branch under `docs/superpowers/plans/2026-05-07-0[1-4]-*.md` capture the design rationale and link back to internal repo references for each subsystem.	2026-05-07 19:01:27 +02:00
ZdenekSrotyr	378ee40459	release: 0.46.1 — surface real BQ error from remote_estimate_failed retry (#218 ) ## Summary When `agnes query --remote` references a column that doesn't exist on the FROM table, users were seeing `Table "<id>" must be qualified with a dataset` instead of the actually-useful `Unrecognized name: <column>` from BigQuery. Surface the first-attempt diagnostic now; keep the second-attempt context as `underlying_original`. Reproduced against production: ``` $ agnes query --remote "SELECT COUNT(*) FROM unit_economics WHERE authorize_date = DATE '2025-05-06'" Error: remote_estimate_failed (HTTP 400) message: Could not estimate scan size for this query. underlying: 400 ... Table "unit_economics" must be qualified with a dataset. ``` (`unit_economics` has `authorize_timestamp`, not `authorize_date`.) ## Test plan - [x] New `test_remote_estimate_failed_surfaces_first_error_when_attempts_differ` asserts the first-attempt message wins, second-attempt is preserved as `underlying_original`, hint points to `agnes schema`. - [x] Existing `test_guardrail_returns_400_remote_estimate_failed_on_double_parse_error` still passes (both attempts mocked to identical error). - [x] `pytest tests/test_api_query_guardrail.py` clean.	2026-05-07 16:54:45 +02:00
ZdenekSrotyr	f4bc04958d	fix: Devin Review #1 — apply backtick mask to wrapping rewriter `_rewrite_user_sql_for_bigquery_query` does its own bare-name detection (mirroring the non-RBAC parts of `_bq_guardrail_inputs`). The backtick masking from #201 was applied to `_bq_guardrail_inputs` and the forbidden-table loop, but missed this third site — so a registered local-mode table name appearing as the table segment of a user-supplied full backtick path (e.g. ``\`prj.ds.orders\`` matching registered local ``orders``) tripped the cross-source guard and forced every backtick-path query into the 50-100× slower ATTACH-catalog fallback. Mask once at the top of the function, route both the BQ-name detection (line ~830) and the cross-source check (line ~867) through the masked copy. New regression test `test_local_name_inside_backtick_path_does_not_trip_cross_source` proves the wrapper now wraps when it should.	2026-05-06 21:06:21 +02:00
ZdenekSrotyr	824e3cb636	feat(query): registry-gate full backtick BigQuery paths (#201 ) Adds Pass 3 to `_bq_guardrail_inputs` that scans user SQL for full backtick paths `<project>.<dataset>.<table>` and gates them identically to the `bq."<dataset>"."<table>"` pass: - Project must match the configured BigQuery data project (`get_bq_access().projects.data`). Mismatch → HTTP 403 `bq_path_cross_project`. - Path must point at a registered row. Unregistered → HTTP 403 `bq_path_not_registered`. - Non-admin caller must hold a grant on the registered row's id. Missing grant → HTTP 403 `bq_path_access_denied`. Pre-fix, full backtick paths bypassed Agnes RBAC entirely — only the service account scope limited reach. Post-fix the boundary matches what `agnes catalog`-driven flows already enforce. Admin still bypasses the per-id grant check but cannot bypass registration or project match. Pass 3 also seeds `dry_run_set` for resolved registered paths so the cost-cap dry-run runs against the same physical table the user named — composing cleanly with the Layer 2 fail-fast fallback.	2026-05-06 18:02:53 +02:00
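The three checks that Pass 3 applies to a full backtick path can be sketched as one decision function. The function name, the registry shape (a dict keyed by `(dataset, table)`), and the return convention are assumptions for illustration; only the three error kinds and their order come from the commit message.

```python
def gate_backtick_path(path, data_project, registry, is_admin, granted_ids):
    """Return an error kind for a `project.dataset.table` path, or None if allowed.

    Assumes `path` has exactly three dot-separated segments and that
    `registry` maps (dataset, table) to the registered row id.
    """
    project, dataset, table = path.split(".", 2)
    if project != data_project:
        return "bq_path_cross_project"
    row_id = registry.get((dataset, table))
    if row_id is None:
        return "bq_path_not_registered"
    # Admins skip only the grant check, never registration or project match.
    if not is_admin and row_id not in granted_ids:
        return "bq_path_access_denied"
    return None
```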
ZdenekSrotyr	c32be3fe96	fix(query): cap-guard fallback retries original SQL, fails fast (#201 ) When BQ rejects the rewritten dry-run SQL with `bq_bad_request`, the cap-guard now retries with the user's ORIGINAL SQL instead of building a synthetic `SELECT * FROM <table>` per registered table. The synthetic path threw away user filters / projections / partition predicates and routinely ballooned the estimate to "full table size", falsely tripping `remote_scan_too_large` on legitimate narrow queries (typical issue #201 trace: rewriter corrupts a backtick path → BQ parse error → synthetic over-estimate → 400). Behaviour: - Rewritten SQL succeeds: same as before (issue #171 single-dry-run). - Rewritten SQL parse-errors, original SQL succeeds: use original estimate. Common case for users submitting BQ-native input. - Both fail with `bq_bad_request`: HTTP 400 `remote_estimate_failed` with a hint pointing at `agnes catalog` / BQ-native syntax. No silent over-estimate. - Non-parse BQ error (forbidden, upstream): still 502 as before. This is a behaviour change for clients matching error kinds — failure to estimate scan size now surfaces as `remote_estimate_failed` instead of being masked behind `remote_scan_too_large` from the synthetic path. Replaces the existing `test_guardrail_falls_back_to_per_table_estimate_on_bq_parse_error` (which pinned the old contract) with `test_fallback_tries_original_sql_first` and `test_fallback_fails_fast_on_pure_duckdb_syntax`.	2026-05-06 18:02:53 +02:00
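The fallback order listed in that commit (rewritten SQL first, then the original, then fail fast) can be sketched like this. The exception classes and `estimate_scan_bytes` are hypothetical stand-ins, and `dry_run` is an injected callable rather than a real BigQuery client call.

```python
class BqBadRequest(Exception):
    """Stand-in for a BigQuery parse/validation rejection (bq_bad_request)."""


class BqUpstreamError(Exception):
    """Stand-in for non-parse failures (forbidden, upstream)."""


class RemoteEstimateFailed(Exception):
    """Would be surfaced to the client as HTTP 400 remote_estimate_failed."""


def estimate_scan_bytes(rewritten_sql, original_sql, dry_run):
    """Try the rewritten SQL, then the user's original SQL, then give up."""
    try:
        return dry_run(rewritten_sql)
    except BqBadRequest:
        pass  # the rewrite may have produced SQL that BigQuery cannot parse
    try:
        return dry_run(original_sql)
    except BqBadRequest as exc:
        # No synthetic SELECT * estimate: fail fast instead of over-estimating.
        raise RemoteEstimateFailed("remote_estimate_failed") from exc
```

`BqUpstreamError` is intentionally not caught, so non-parse failures propagate to whatever layer maps them to a 502.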
ZdenekSrotyr	720a2180c0	fix(query): rewriter respects backtick segments (#201 ) `agnes query --remote` corrupted user SQL when the request contained a full BigQuery backtick path (`<project>.<dataset>.<table>`) whose table segment matched a registered bare-name alias. The bare-name rewriter used `\b` word-boundary matching against the lower-cased SQL; both `.` and `` ` `` are non-word characters, so the regex fired INSIDE the user's backtick path and produced malformed nested-backtick SQL that BigQuery rejected at parse time. Fix: - Add `_mask_backticks(sql)` helper: replace each `…` segment with spaces of equal length, preserving offsets so word-boundary searches find positions only outside backticks. - `_bq_guardrail_inputs` (bare-name pass + forbidden-table pass) searches against the masked SQL. - `_rewrite_bq_table_refs_to_native` Pass 1 splits the SQL on `(\`[^\`]*\`)` and rewrites only the outside-backtick chunks. Pass 2 (`bq."ds"."tbl"` → backtick form) is unchanged — its prefix can't appear inside backticks. Adds three regressions covering the rewrite + guardrail paths.	2026-05-06 18:02:53 +02:00
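The masking helper described above is small enough to sketch directly. This version is an illustration written from the commit message, not a copy of the project's `_mask_backticks`.

```python
import re

_BACKTICK_SEGMENT = re.compile(r"`[^`]*`")


def mask_backticks(sql: str) -> str:
    """Replace each backtick-quoted segment with spaces of equal length.

    Offsets are preserved, so a match position found in the masked copy can
    be applied to the original string unchanged.
    """
    return _BACKTICK_SEGMENT.sub(lambda m: " " * len(m.group(0)), sql)
```

Searching the masked copy with a word-boundary pattern such as `\borders\b` then finds only the bare reference, not the `orders` segment inside a backtick path.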
ZdenekSrotyr	81d065b1ea	fix: Devin Review #1 — bigquery_query() first arg uses billing project, not data In cross-project BQ setups (where billing != data), the SA typically has serviceusage.services.use on the billing project but not on the data project. The rewriter passed bq.projects.data as the first arg to bigquery_query(), which BQ uses as the execution + billing project → 403 USER_PROJECT_DENIED. Match the convention used everywhere else in the codebase (app/api/v2_scan.py, app/api/v2_sample.py, app/api/v2_schema.py, connectors/bigquery/extractor.py): backtick paths inside the inner SQL use the data project (resolves the actual table location), the bigquery_query() first arg uses the billing project (decides who pays + which project the job runs under). For single-project deploys the two are identical so the fix is a no-op there. Test pins the cross-project case: data-prj for backticks, billing-prj for the bigquery_query() first arg.	2026-05-06 14:07:38 +02:00
ZdenekSrotyr	aee585fac6	fix: devil's advocate R2 — narrow shared-client try, PID tmp suffix, Syntax error anchor R2 adversarial review surfaced 3 issues, all addressed: #1 cli/client.py:572-577 outer try/except wrapped both _get_shared_client() AND the actual download. A 401/403/404/5xx from the server triggered a full second download attempt with a fresh client — wasted bandwidth on hard failures, no fail-fast on revoked PAT. Narrowed the try to only the shared-client construction; the download itself is no longer retried under the fallback except. #2 concurrent agnes pull invocations (e.g. SessionStart hook + manual run) collided on bare <target>.tmp / <target>.partN paths — one process's in-progress write got yanked by the other's cleanup, manifest hash check then failed spuriously. Per-process suffix (<target>.{pid}.tmp, <target>.{pid}.partN) makes intermediate files disjoint; the final os.replace to the bare target is atomic so last-writer-wins. #3 _looks_like_bq_rewrite_parse_error patterns 'Syntax error' could false-positive on a query like WHERE log_msg = 'Syntax error in foo' that fails for an unrelated reason (quota, network) and has the literal substring echoed in the error text. Anchored to 'Syntax error: ' (with trailing colon) — BQ always emits the colon in this error format, user SQL string literals normally don't.	2026-05-06 13:57:29 +02:00
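The per-process temporary suffix from item #2 can be sketched as below. `atomic_write` is a hypothetical helper; the real CLI streams downloads in chunks, which this example leaves out.

```python
import os
from pathlib import Path


def atomic_write(target: Path, data: bytes) -> None:
    """Write via a PID-suffixed temp file, then atomically replace the target.

    Concurrent processes use disjoint temp names, so one process's cleanup
    cannot remove another's in-progress file; the last os.replace wins.
    """
    tmp = target.with_name(f"{target.name}.{os.getpid()}.tmp")
    try:
        tmp.write_bytes(data)
        os.replace(tmp, target)
    finally:
        # Only reached with the temp file present if the write or replace failed.
        if tmp.exists():
            tmp.unlink()
```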
ZdenekSrotyr	e5645fd280	fix: devil's advocate R1 — chunked probe, parse-error heuristic narrow, pool settings refresh, content-length sanity, multi-project skip R1 adversarial review surfaced 5 issues, all addressed: #1 chunked download silently disabled in non-Caddy deployments (HEAD on GET-only FastAPI route returns 405). _probe_range_support now falls back to GET with Range: bytes=0-0 when HEAD fails — works against both Caddy file_server (HEAD-friendly) and dev FastAPI direct (GET-only). #2 parse-error fallback heuristic too broad — matched on Unrecognized name / Function not found / No matching signature / Invalid cast, which BQ surfaces for ordinary user-column typos. That triggered slow ATTACH-catalog retry on every typo (2× latency tax). Narrowed to just 'Syntax error' / 'syntax error' which are the genuine DuckDB-vs-BQ dialect mismatch markers. #3 apply_bq_session_settings was only run on fresh-built pool entries, not on reuse. An operator's /admin/server-config change to bq_query _timeout_ms wouldn't propagate to long-lived pooled sessions until restart. Fixed: re-apply on every pool acquire (idempotent + fail-soft). #4 content-length sanity bound — a misconfigured proxy returning a wildly inflated Content-Length would cause overlapping chunked Range requests against the actual file → corrupt assembled output (caught by manifest hash check, but only after wasted bandwidth). Cap at 100 GiB; above that, drop to single-stream. #5 rewriter assumed every BQ row resolves under the single bq.projects.data project. Bucket containing '.' suggests a project- qualified bucket (multi-project deployment); rewriter would silently target the wrong project. Conservative skip with regression test.	2026-05-06 13:50:46 +02:00
ZdenekSrotyr	8e56d45c68	fix(query): code-review fixes — outer LIMIT wrap, dollar-quoting, parse-error fallback Address code-reviewer findings on the bigquery_query() rewrite path: 1. Outer LIMIT wrap — bigquery_query() materialises BQ result into DuckDB before fetchmany sees it (vs ATTACH-catalog Storage Read API streaming). A user 'SELECT *' against a billion-row remote table would buffer the entire result before request.limit applied. Wrap rewritten SQL in an outer 'LIMIT N+1' so the cap pushes into the BQ job itself. 2. Dollar-quoted inner SQL — naive replace("'", "''") doubling missed DuckDB backslash-escape sequences (\\, \\n, \\t, …). A predicate like 'WHERE name = ''O\\'Brien''' was unsafe under the doubling path. DuckDB $bqq_inner$ … $bqq_inner$ form takes the inner SQL verbatim with no escapes whatsoever. Falls back to legacy doubling if user SQL improbably contains the literal tag. 3. Parse-error fallback — when the rewritten path fails with a BQ-side parse / validation error (DuckDB-only syntax like ::INT cast that survives identifier rewrite but BQ refuses), retry the user's original SQL via the legacy ATTACH-catalog path so the request still succeeds. Mirrors the existing dry-run fallback contract. 4. CHANGELOG — delete duplicate CLI bullets that landed under already-released [0.38.1] (file corruption from merge — entries are correctly under [0.39.0]).	2026-05-06 13:29:45 +02:00
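Items 1 and 2 of that commit (the LIMIT wrap and the dollar-quoted inner SQL) can be combined in one string-building sketch. The exact shape of the generated SQL is an assumption: here the `LIMIT N+1` is placed inside the inner SQL so that it reaches the BigQuery job, and `wrap_for_bigquery_query` is a hypothetical name. Only the `$bqq_inner$` tag and the fallback to quote doubling come from the commit message.

```python
_TAG = "$bqq_inner$"


def wrap_for_bigquery_query(billing_project: str, rewritten_sql: str, limit: int) -> str:
    """Build a DuckDB statement that runs `rewritten_sql` through bigquery_query().

    Assumes `billing_project` contains no quote characters.
    """
    inner = f"SELECT * FROM ({rewritten_sql}) LIMIT {limit + 1}"
    if _TAG in inner:
        # Legacy fallback when the user SQL improbably contains the tag itself.
        quoted = "'" + inner.replace("'", "''") + "'"
    else:
        # Dollar-quoting passes the inner SQL through verbatim, no escaping.
        quoted = f"{_TAG}{inner}{_TAG}"
    return f"SELECT * FROM bigquery_query('{billing_project}', {quoted})"
```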
ZdenekSrotyr	b2c1ff143c	fix(query): rewrite BQ-backed user SQL via bigquery_query() to enable predicate pushdown User SQL hitting query_mode='remote' BigQuery rows was 50-100x slower than the equivalent direct bigquery_query() call because DuckDB's master view (CREATE VIEW … AS SELECT * FROM bigquery.<ds>.<tbl>) does not push WHERE/SELECT/LIMIT into BQ in ATTACH-catalog mode. The BQ extension opens a Storage Read API session over the entire upstream table; on >100M-row sources this was 70-150s and frequently failed with 'Response too large to return'. Extract the existing dry-run rewriter's core (table-name → BQ-native backtick path) into a shared helper. Add an execution-path rewriter that wraps the whole user SQL in bigquery_query('<project>', '<inner>') so the BQ planner sees the full query and engages partition pruning + projection pushdown server-side. Conservative fall-through: cross-source JOINs (BQ ↔ Keboola/Jira local), queries already containing bigquery_query(, and unconfigured BQ project all skip the rewrite and run the original SQL via ATTACH-catalog so behavior degrades gracefully.	2026-05-06 13:02:34 +02:00
ZdenekSrotyr	e5fb913cec	perf: Tier 1 event-loop unblocking — async def → def on BQ-bound handlers Five hottest BQ-touching endpoints were `async def` but invoked synchronous DuckDB / BQ-extension calls inside the body. Under uvicorn's single event loop that meant a single heavy `agnes query --remote` (waiting up to ~200 s for BQ's jobs.query) froze EVERY other request — /api/health, dashboard, auth, even another query — for the full BQ wait. Operators saw "VM idle, app frozen" during PR #188's testing. Convert to plain `def` so FastAPI auto-offloads the body to the anyio thread pool. Event loop stays free for non-BQ requests. - app/api/query.py:execute_query - app/api/v2_scan.py:scan_estimate_endpoint, scan_endpoint - app/api/v2_sample.py:sample - app/api/v2_schema.py:schema Audit: 0 `await` statements in any converted handler (verified file-by- file), so the rename is safe. Tests in tests/test_v2_*.py called the handlers via `asyncio.run(...)` which now fails on a non-coroutine return; swapped for direct calls (asyncio.run( -> ( ) — keeps paren balance). Plus AGNES_THREADPOOL_SIZE env var (default 200, was anyio's stock 40) in app/main.py:lifespan. Set via anyio.to_thread.current_default_thread_limiter().total_tokens. 200 is comfortable headroom for <50 concurrent analysts; bump for more. 480/480 impacted tests pass (the 2 remaining errors are a pre-existing fixture setup issue in test_reader_smoke_matrix.py unrelated to this change).	2026-05-05 17:44:08 +02:00
ZdenekSrotyr	5915f92eaa	fix(query-guardrail): single-pass alternation regex (Devin Review on query.py:464) The iterative bare-name rewriter (one re.sub per name, longest-first) was vulnerable to cross-contamination when the GCP project ID contained a registered table name as a hyphen-delimited word. Concrete repro: project = 'my-ue-project' registered = ['orders', 'ue'] user SQL = 'SELECT * FROM orders JOIN ue ON ...' iter 1 (orders): produces 'FROM `my-ue-project.fin.orders` JOIN ue ...' iter 2 (ue): '\bue\b' matches 'ue' INSIDE 'my-ue-project' (hyphen creates word boundary on both sides) — corrupts the iter-1 path Fallback at query.py:576 caught the resulting BQ parse error and fell back to per-table SELECT * estimate, so impact was over-estimation, not fail-open — but the #171 partition-pruning fix silently degraded to pre-fix behavior whenever a project name shared a hyphen-segment with a registered table. Fix: single re.sub call with an alternation regex sorted longest-first. Single-pass means each source position is processed exactly once, so freshly-inserted backticked text from one match isn't re-scanned by later names in the alternation. Regression test test_rewrite_helper_does_not_corrupt_when_project_id_contains_registered_name covers the exact Devin repro.	2026-05-04 22:51:33 +02:00
ZdenekSrotyr	500db8cd3c	fix(query-guardrail): dry-run user SQL not synthetic SELECT * (#171) Closes #171. The /api/query cost guardrail used to dry-run a synthetic `SELECT * FROM <table>` for each registered remote-BQ row referenced by the user SQL, which made BigQuery estimate a full table scan, with column projection, predicate pushdown, and partition pruning all disabled. Narrow queries on big partitioned/clustered tables (the documented happy path for `agnes query --remote`) hit ~30,000× over-estimates and got rejected with 400 `remote_scan_too_large` even when BQ's own dry-run reported single-digit MB. Pavel's report on #171 traced the root cause and proposed the fix: rewrite the user SQL to BQ-native syntax and dry-run it as a single job, exactly the way `bq query --dry_run` works. Implementation: - New helper _rewrite_user_sql_for_bq_dry_run rewrites bare registered names (word-boundary, case-insensitive, longest-first to avoid prefix collisions) + bq."<ds>"."<tbl>" forms to backticked `<project>.<ds>.<tbl>` paths. - _bq_quota_and_cap_guard runs ONE dry-run on the rewritten SQL. Cap check uses the real estimate. - Fallback path: if BQ rejects with bq_bad_request (e.g. DuckDB-only syntax like ::INT casts), the guard falls back to the pre-fix per-table SELECT * approach so non-portable queries still get a (loose) cap estimate instead of fail-opening. Non-parse BQ errors (forbidden, upstream) still propagate as 502. - _bq_guardrail_inputs now also returns name_lookups so the rewriter has the (registered_name, bucket, source_table) mapping it needs. - Per-table breakdown is unavailable from a composite dry-run; total bytes are pinned to dry_run_set[0] for the post-flight record_bytes(sum(...)) call to keep returning the right total. 
Tests (7 new, 3 existing still pass): - dry-run receives rewritten user SQL with WHERE clause intact (the load-bearing assertion for #171) - single dry-run per request even with multiple registered tables (JOIN, UNION) referenced - fallback to per-table SELECT * on bq_bad_request - non-parse BQ errors (forbidden) still 502 - rewriter unit tests: bare + bq.path in same SQL, longest-name-wins on prefix collision, case-insensitive bare-name match	2026-05-04 21:08:21 +02:00
ZdenekSrotyr	3d58768143	fix: address Devin Review findings — incomplete renames + estimate guard 13 Devin findings across 10 files: 🔴 Critical: - app/api/v2_catalog.py:42 — `_fetch_hint` returns `da fetch` in /api/v2/catalog responses (user-visible in every catalog list) - cli/skills/agnes-data-querying.md — 11 stale `da fetch`/`da sync` refs in the bundled skill markdown - config/claude_md_template.txt:38 — referenced `agnes pull --docs-only` flag that does NOT exist in agnes pull (removed; spec only ships --quiet/--json/ --dry-run) 🟡 Important: - app/api/admin.py:252 — `da fetch` in bq_max_scan_bytes hint - cli/commands/auth.py:119 — `da sync` in import-token docstring (--help text) - cli/commands/tokens.py:48 — "Export it so `da` can use it" prose - ARCHITECTURE.md — 4 stale rows in CLI commands table - README.md — stale paragraphs for analysts (da sync, da analyst setup) 🚩 Substantive observations addressed: - app/api/query.py:249,302,489 — server-side error/help strings still said `da sync`/`da fetch` (returned in API responses to clients) - cli/commands/snapshot.py:235-241 — DuckDB existence guard incorrectly blocked `--estimate` (server-side dry-run that never opens local DB). Added test ensuring estimate path skips the guard. Skipped (intentionally historical): - app/api/admin.py:2377,2429,2437 — historical comments describing past manifest-vs-sync_state bug; past tense, accurate to keep as `da sync`.	2026-05-04 20:05:06 +02:00
ZdenekSrotyr	1563b05f2e	refactor(cli): hard-cutover env vars + config dir to AGNES_* Task 0.5 of clean-analyst-bootstrap. Greenfield rewrite — no fallback, no aliases. Existing dev environments lose their cached PAT and must re-authenticate. Env var renames (hard cutover): - DA_CONFIG_DIR -> AGNES_CONFIG_DIR - DA_SERVER -> AGNES_SERVER - DA_SERVER_URL -> AGNES_SERVER_URL (test-only stale ref, not in spec) - DA_NO_UPDATE_CHECK -> AGNES_NO_UPDATE_CHECK - DA_LOCAL_DIR -> AGNES_LOCAL_DIR - DA_TOKEN -> AGNES_TOKEN - DA_STREAM_RETRIES -> AGNES_STREAM_RETRIES Config dir rename: ~/.config/da/ -> ~/.config/agnes/ (across code, comments, docstrings, error messages, install templates, dev scripts). Stale `da X` references in CLI source (and adjacent app/, tests/): swept docstrings, comments, help text, and error messages where the verb survives the rewrite (init, pull, push, catalog, status, diagnose, auth, admin, skills, query, schema, describe, explore, disk-info, snapshot, login, logout, whoami, server, setup) and replaced `da X` with `agnes X`. Intentionally kept `da sync`, `da fetch`, `da analyst`, `da metrics` — those verbs are removed in later tasks; the legacy strings will be detected by `_LEGACY_STRINGS` (added in Task 2). Test fixes: - TestCLIVersion now asserts output starts with `agnes ` (was `da `). Test results: 2675 passed, 25 skipped (full pytest run, excluding 9 pre-existing test_db.py / test_user_management.py / test_e2e_extract.py / test_cli_binary_rename.py failures unrelated to this rename).	2026-05-04 16:35:44 +02:00
ZdenekSrotyr	4bd1919f77	fix(query): #168 review iter 5 — forbidden-table check uses registry IDs Devin Review iter #5 flagged a pre-existing class of name/id mismatch in app/api/query.py:131-136 — the SAME root cause as the bq.* RBAC issue I fixed in iter #3 (line 332/362). Devin called it out as "NOT introduced by this PR" / "might merit follow-up", but it's exactly the same security-boundary pattern this PR is hardening, so fixing here keeps the RBAC story consistent across the handler. The `forbidden = all_views - set(allowed)` comparison mixed types: - `all_views` carries DuckDB master view names (= registry display `name` from the orchestrator's CREATE VIEW) - `set(allowed)` carries registry IDs (resource_grants.resource_id) When `id != name` (e.g. id="bq.finance.ue", name="ue"), authorized users got spurious 403s — the view name landed in `forbidden` even though the caller had a valid grant on the registry id. Build a name->id map from the registry, then the forbidden check compares apples to apples: allowed_view_names = {r["name"] for r in registry_rows if r.get("name") and r.get("id") in allowed_ids} forbidden = all_views - allowed_view_names 107 affected tests pass; 487 pass in wider RBAC/query/access/admin domain — no regressions.	2026-05-04 14:18:43 +02:00
ZdenekSrotyr	28aba4c1f9	fix(query): #168 review iter 3 — RBAC name-vs-id, placeholder dead code Devin Review iter #3 found 3 new real bugs after iter #2's fixes landed. 🔴 RBAC check at app/api/query.py:362 used `row["name"]` against `accessible_set`, but `accessible_set` is keyed by registry IDs (`get_accessible_tables` returns `resource_grants.resource_id` — table IDs, not display names). Confirmed by `_table_blocks` projection at `app/resource_types.py:157-158`. When `id != name` (e.g. `id="bq.finance.ue", name="ue"`), non-admin users with valid grants got 403 `bq_path_access_denied`. Switch to `row["id"]`. 🚩 Bare-name pass at app/api/query.py:332 had the same name-vs-id mismatch (different impact): legitimate accessible rows were skipped from `dry_run_set`, so the cost guardrail under-counted scan bytes for non-admin users. Could let an over-cap query through and under-bill quota. Switch to `row_id` comparison. 🟡 `placeholder_from` for billing_project was dead code. `_BQ_OPTIONAL_FIELD_DEFAULTS["billing_project"] = ""` seeded an empty string into every GET payload via `_ensure_bq_optional_fields`. JS `isUnset = (value === undefined)` evaluated False, so the `(defaults to <project>)` placeholder NEVER rendered. Drop the seed — field stays in `known_fields` (UI sees it) but routes through the unset rendering path on GET, where placeholder_from fires. Tests: test_get_surfaces_bq_fields_even_when_unset assertion flipped from "billing_project IS present" to "billing_project NOT auto-seeded" to lock in the new shape. 67 affected tests pass.	2026-05-04 13:51:36 +02:00
ZdenekSrotyr	5eaa449fcc	fix(query): #168 review iter 2 — quota user_id parity + concurrent-slot 429 Devin Review iter #2 found 2 new issues (after iter #1's 5 fixes landed). Both real, both addressed. 🔴 Quota user_id key mismatch defeated shared daily budget. /api/query computed `user.get("id") or user.get("email")` while /api/v2/scan uses `user.get("email") or "anon"` (app/api/v2_scan.py:327). Same user → two different keys in the singleton QuotaTracker. BQ bytes consumed via /api/query were tracked under UUID; via /api/v2/scan under email; the `check_daily_budget` pre-flight on either endpoint never saw the other's recorded bytes — per-user cap was effectively doubled. Match v2/scan's email-first ordering. 🟡 QuotaExceededError(KIND_CONCURRENT) → 400 instead of 429. `quota.acquire(user_id)` raises this from __enter__ when the per-user concurrent-scan slot is at cap. The exception propagated through the @contextlib.contextmanager generator, the caller's `with guard:` block, and was caught by execute_query's generic `except Exception` handler → mapped to 400 with a flattened "Query error: concurrent_scans: N/M" string, dropping the typed retry_after_seconds field. Wrap the `with quota.acquire(...)` in a try/except QuotaExceededError that maps to 429 with the same typed-detail shape used for the daily-budget rejection — consistent with /api/v2/scan:392-402. Tests: test_api_query_quota.py user_id strings updated to "admin@test.com" (the seeded_app admin's email) to match the new email-first ordering. 40 affected tests pass.	2026-05-04 13:38:31 +02:00
ZdenekSrotyr	1263b80726	fix(query): #168 review — concurrent-slot wraps execute, doc/JS fixes Devin Review on PR #168 found 5 issues — all real, all addressed. 🚩 ANALYSIS_001 (architectural): concurrent-slot guard didn't protect actual BQ query execution. Earlier `_enforce_remote_bq_quota_and_cap` ran dry-run + cap check inside `with quota.acquire(user_id):`, then returned — releasing the slot BEFORE `analytics.execute(...)` ran. Spec §4.3.3 explicitly designs the slot to wrap execute so the per-user concurrent cap limits BQ scans, not just dry-runs. Refactor to a context manager `_bq_quota_and_cap_guard`. Caller's `with` block now holds the slot through dry-run, cap check, the actual `analytics.execute(...)` (which is what triggers the BQ scan when DuckDB resolves the master view), AND the post-flight record_bytes. Slot released only when caller's `with` body exits. 🟡 BUG_001: placeholder JS walked `original` (full GET payload root) instead of `original.sections`. `placeholder_from: ["data_source", "bigquery", "project"]` is a section-relative path, so billing_project placeholder NEVER rendered. Fix: walk `original.sections` (with fallback to `original` for safety). 🟡 BUG_002 + BUG_003: admin_tables.html register and edit modals' operator help text referenced `max_bytes_per_remote_query` (the old name from the spec) but the actual config key is `bq_max_scan_bytes` after the fix-up commit `6423888d` moved it. Replace both occurrences. 🟡 BUG_004: CHANGELOG entry said `api.query.bq_max_scan_bytes` (the old path) but the read at app/api/query.py:53 is `get_value("data_source", "bigquery", "bq_max_scan_bytes", ...)`. An operator who set it under `api.query` in their yaml would have no effect. Correct path in CHANGELOG. All 95 #160-affected tests pass after the changes.	2026-05-04 13:28:03 +02:00
ZdenekSrotyr	6423888d02	fix(query): #160 move bq_max_scan_bytes to data_source.bigquery (UI editable) E2E test on dev VM revealed: spec said "configurable via /admin/server-config" for the cost guardrail cap, but the underlying read path was `api.query.bq_max_scan_bytes` and `api` is NOT in `_EDITABLE_SECTIONS`. POST to /admin/server-config rejected `{"sections":{"api":...}}` as "unknown section(s): api" — the cap was only adjustable via direct YAML edit. Move to `data_source.bigquery.bq_max_scan_bytes`: - `_default_remote_query_cap_bytes()` reads from the new path. - Add to `_OPTIONAL_FIELDS["data_source"]["bigquery"]["fields"]` with the same shape as `max_bytes_per_materialize` (kind=int, default 5 GiB, hint). - Add to `_BQ_OPTIONAL_FIELD_DEFAULTS` so it surfaces in the GET payload even when YAML omits it. Convention now mirrors `max_bytes_per_materialize` — both BQ cost guardrails live under `data_source.bigquery`, both editable in the UI.	2026-05-04 12:46:38 +02:00
ZdenekSrotyr	77cdb65f76	sec(query): #160 BQ_PATH catches quoted "bq" catalog token (Phase 3 review) Phase 3 review identified an RBAC + cost-cap bypass: `SELECT * FROM "bq"."ds"."tbl"` (catalog token quoted as a DuckDB identifier) was NOT matched by the BQ_PATH regex, so direct quoted-form references skipped both the registry check and the cost-cap dry-run. DuckDB resolves `"bq"` to the same ATTACHed BQ catalog, so the bypass is real. Widen the catalog-token alternation: `(?:"bq"|bq)` matches both forms. Negative lookbehind `(?<![\w.])` still rejects look-alike prefixes (`other_bq`, `my_bq`); the new "my_bq".ds.tbl negative test locks that in alongside `other_bq.ds.tbl`. Tests: - 2 new positive cases in tests/test_query_bq_regex.py for the quoted form (`"bq"."finance"."ue"` and uppercase `"BQ"."ds"."tbl"`). - 1 new negative case rejecting `"my_bq".ds.tbl` so the quoted-form widening doesn't open a different evasion. - 1 new RBAC test in tests/test_api_query_rbac_bq_path.py: admin hitting an unregistered quoted path returns the same bq_path_not_registered 403 as the unquoted form. All 33 Phase 3 tests pass after the fix.	2026-05-04 10:31:35 +02:00
ZdenekSrotyr	896c43c7a2	feat(query): #160 cost guardrail + bq.* RBAC + quota integration on /api/query The headline implementation for issue #160. POST /api/query now gates direct `bq."<dataset>"."<source_table>"` references behind the registry and bounds the BQ scan cost behind a configurable cap. Wired through the same singleton QuotaTracker as /api/v2/scan so daily-byte budgets are shared across both BQ-touching paths. Changes in app/api/query.py: - Add module-level `BQ_PATH` regex matching the 16 syntax variants verified empirically (fully-quoted, unquoted, mixed quoting, case-insensitive, inside CTE bodies, multi-path, …). - Add `bigquery_query` to the SQL keyword blocklist. Closes the pre-existing function-call backdoor where a user could run an arbitrary BQ jobs API call against any reachable dataset, bypassing the registry and RBAC. Wrap views internal to the BQ extractor still use bigquery_query() — but those run via DuckDB view resolution at query time, not via user-submitted SQL, so the blocklist doesn't break them. - Add `_bq_guardrail_inputs` helper: walks user SQL twice — once for bare-name matches against accessible registered remote-BQ names (contributes to dry_run_set), once for direct `bq.X.Y` matches (gated against `find_by_bq_path` lookups, returns 403 with structured detail on miss or grant violation). - Add `_enforce_remote_bq_quota_and_cap` helper: pre-flight `check_daily_budget` (over-cap → 429), then `with quota.acquire(...)` wraps a per-path BQ dry-run, sums bytes, raises 400 `remote_scan_too_large` when total > cap. - Cap default 5 GiB; configurable via `api.query.bq_max_scan_bytes` in /admin/server-config (next phase wires the UI). - Post-flight `record_bytes` against the user's daily counter. - Module-level imports of `_bq_dry_run_bytes`, `_build_quota_tracker`, `get_bq_access` so tests can monkeypatch via `app.api.query.<name>`. 
Tests: - All 23 RED tests from the previous commit now pass (regex matrix, blocklist with detail-string assertion, RBAC unregistered/admin-bypass, guardrail dry-run-called/over-cap-rejected, quota pre-flight 429). - mock_dry_run fixture stubs both `_bq_dry_run_bytes` and `get_bq_access` so guardrail tests don't require a live BQ project. - Quota test uses `admin1` (the seeded_app fixture's actual user id, not `admin`). Smoke: 887 passed across query/bq/admin/extractor/registry/quota domains. No regressions.	2026-05-04 10:31:35 +02:00
ZdenekSrotyr	dc03837a7b	feat(query-api): better error message when --remote query references a materialized-but-not-rebuilt id E2E sub-agent finding: `da query --remote "SELECT * FROM <id>"` against a materialized table that hasn't yet been rebuilt in the server's analytics.duckdb returns a confusing DuckDB "Table does not exist" message even though the table is in the registry. Materialized rows produce parquets at `${DATA_DIR}/extracts/<source>/data/<id>.parquet`, but the orchestrator's master-view creation is `_meta`-driven — fresh instances or pre-tick states have the registry row without a corresponding view, so analysts hit the bare "does not exist" with no path forward. Improve the error rendering in `app/api/query.py:execute_query`. When DuckDB raises a "table does not exist" error, scan the registry for any `query_mode='materialized'` row whose id or name appears in the failed SQL. On a hit, return a 400 whose detail names the table, explains the materialize state, and offers two concrete next steps: 1. Run `da sync` (or wait for the scheduler tick / hit POST /api/sync/trigger) to materialize the parquet, OR 2. Query the source directly via the catalog alias when the registry row carries bucket+source_table (e.g. `bq."dataset"."table"` for BigQuery, `kbc."bucket"."table"` for Keboola). Detection is bounded — the registry round-trip only fires when DuckDB's error mentions a missing table, so happy-path queries pay no cost. Non-materialized unknowns fall through to DuckDB's raw error. 2 new tests: materialized id surfaces the hint with the bucket+source_table payload; unknown table falls back to the generic error path with no false positive on the new hint.	2026-05-01 23:09:52 +02:00
ZdenekSrotyr	2e1dfb7553	feat(v2): claude-driven fetch primitives + 0.14.0 (#102) Replaces the BigQuery wrap-view pattern with a discovery + scoped-fetch toolkit driven by the analyst's Claude session. Adds /api/v2/{catalog,schema,sample,scan,scan/estimate}, da catalog/schema/describe/fetch/snapshot/disk-info CLI commands, sqlglot-backed WHERE validator, process-local quota tracker, agent rails skill (cli/skills/agnes-data-querying.md). BREAKING: BQ wrap views off by default; set data_source.bigquery.legacy_wrap_views=true for one cycle. Backward-compat field_validator on primary_key. Catalog cache now matches documented 300s TTL with RBAC fresh per request. Cuts release v0.14.0.	2026-04-29 01:07:19 +02:00
ZdenekSrotyr	55515266ea	fix: block DuckDB metadata functions and relative paths in query endpoint Add information_schema, duckdb_* introspection functions, pragma_* functions, and relative path traversal patterns to the SQL blocklist so users cannot enumerate schema metadata regardless of RBAC. Add six corresponding tests.	2026-04-09 16:29:11 +02:00
ZdenekSrotyr	1b3acce7e9	fix: replace substring table access check with word-boundary regex Replace substring matching with word-boundary regex in query endpoint's table access validation. Prevents false positives where short table names like 'id' would block any query containing the word. Uses re.escape() to safely handle special characters in table names. - Import re module at top - Use regex pattern with word boundaries (\b) for matching - Add tests to verify no false positives and proper blocking	2026-04-09 07:00:48 +02:00
ZdenekSrotyr	23ae6a602c	security: harden query endpoint SQL blocklist and disable external access Expand blocked keywords to cover parquet_scan, read_csv_auto, query_table, iceberg_scan, delta_scan, call, URL schemes (http/https/s3/gcs), and additional file-scan functions. Set enable_external_access=false on the non-read-only analytics connection path. Add three new tests covering parquet_scan, read_csv_auto, and query_table blocking.	2026-04-09 06:54:58 +02:00
ZdenekSrotyr	05a1b452e9	security: harden query (read-only DB), uploads (path sanitization), scripts (AST validation)	2026-04-08 12:09:19 +02:00
ZdenekSrotyr	1074d5ec49	feat: implement data access control — table-level permissions Schema v3: add is_public column to table_registry (default true). src/rbac.py: can_access_table() checks admin bypass, public flag, explicit permissions, wildcard bucket permissions. API enforcement: - manifest: filters tables by user access - download: 403 if no access - catalog: filters table list - query: validates referenced tables against allowed list New admin permissions API (/api/admin/permissions) for grant/revoke. 28 access control tests + 733 total tests passing.	2026-03-31 12:33:31 +02:00
ZdenekSrotyr	c5527ec153	fix: harden script sandbox and SQL query security Fixes found by E2E QA agent: - Script sandbox: block os, sys, socket, eval, exec, open, __import__, getattr, pathlib and 20+ other dangerous patterns - SQL query: block COPY, ATTACH, read_csv, semicolons, non-SELECT - Added 24 security tests covering all attack vectors	2026-03-27 16:11:05 +01:00
ZdenekSrotyr	a3918d3833	feat: add FastAPI server with auth, RBAC, and all API endpoints - JWT auth with role-based access control (viewer/analyst/admin/km_admin) - Endpoints: health, sync manifest, data download, query, users CRUD, corporate memory, session/artifact upload - 18 API tests covering auth, RBAC, all endpoints	2026-03-27 15:19:18 +01:00

39 commits
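
The cross-contamination bug described in commit 5915f92eaa (project ID `my-ue-project`, registered tables `orders` and `ue`) can be reproduced in a few lines. The sketch below is illustrative only and is not the code in app/api/query.py: the function names `rewrite_iterative` and `rewrite_single_pass`, the `paths` mapping shape, and the `fin` dataset are assumptions made for the example. It contrasts one `re.sub` per name with a single `re.sub` over a longest-first alternation.

```python
import re

# Hypothetical mapping: registered bare name -> (dataset, source_table).
Paths = dict[str, tuple[str, str]]


def rewrite_iterative(sql: str, project: str, paths: Paths) -> str:
    """One re.sub per name: later passes re-scan text inserted by earlier ones."""
    for name in sorted(paths, key=len, reverse=True):
        dataset, table = paths[name]
        sql = re.sub(
            rf"\b{re.escape(name)}\b",
            f"`{project}.{dataset}.{table}`",
            sql,
            flags=re.IGNORECASE,
        )
    return sql


def rewrite_single_pass(sql: str, project: str, paths: Paths) -> str:
    """One re.sub over an alternation: each source position is visited once."""
    if not paths:
        return sql
    lookup = {name.lower(): target for name, target in paths.items()}
    # Longest-first so a longer name wins when one name is a prefix of another.
    names = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )

    def to_bq_path(match: re.Match) -> str:
        dataset, table = lookup[match.group(0).lower()]
        return f"`{project}.{dataset}.{table}`"

    return pattern.sub(to_bq_path, sql)


paths = {"orders": ("fin", "orders"), "ue": ("fin", "ue")}
sql = "SELECT * FROM orders JOIN ue USING (id)"
broken = rewrite_iterative(sql, "my-ue-project", paths)
fixed = rewrite_single_pass(sql, "my-ue-project", paths)
```

With these inputs, `broken` has the `ue` rewrite spliced into the already inserted `orders` path, because the hyphens in `my-ue-project` create word boundaries around `ue`, while `fixed` leaves each inserted path untouched.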