* System plugin tier with mark/unmark fanout (schema v39)
Adds a mandatory plugin tier so admins can pin a small set of curated
plugins into every user's stack from day one. Marking a plugin via the
new toggle on /admin/marketplaces materializes resource_grants for every
group and user_plugin_optouts subscriptions for every user, so the
existing resolver pulls the plugin into every served set without a new
filter layer. Hooks on user-create (Google OAuth, magic-link, admin
POST, scheduler) and group-create propagate the same materialization to
new principals. UI locks: /admin/access disables the checkbox with a
SYSTEM pill; /marketplace cards swap the "In stack" green pill for an
amber "Required" badge with shield icon; the plugin detail install
button reads "Required by your org"; /my-ai-stack toggle is disabled.
Bypass paths return 409 (DELETE /api/admin/grants for system grants,
PUT /api/my-stack/curated/.../{enabled:false}, DELETE
/api/marketplace/curated/.../install). Unmark only flips the flag —
materialized rows persist so admins curate cleanup at their leisure
through the now-unlocked /admin/access checkboxes.
* Marketplace UX polish + drop legacy /store and /my-ai-stack pages
Two-part cleanup post-v39:
(1) Page deletion. /store and /my-ai-stack were already replaced by
/marketplace?tab=flea and /marketplace?tab=my respectively, but the
standalone routes lingered. Hard delete in dev mode — no redirects,
stale bookmarks 404. The /store/new upload wizard, the flea
detail/edit pages, the admin queue, and all /api/store/* +
/api/my-stack endpoints (CLI consumers) stay. Internal hardcoded
hrefs in the upload wizard's Cancel button and the advanced-setup
page repointed to the marketplace tabs.
(2) Detail-page install button rework. The single button that morphed
between "+ Add to my stack" and "✓ In your stack" did not
communicate uninstall affordance. The installed state now renders an
inline white status label *before* a separate red-bordered
"✕ Remove from stack" button on the same row, both at identical
height to avoid layout shift. System plugins keep their locked amber
"✓ Required by your org" pill (no Remove button — API refuses 409).
The post-action hint panel now fires on remove too with the title
flipped to "✓ Removed from your stack" — Claude Code needs the same
/update-agnes-plugins refresh either way.
Also: /admin/marketplaces Details modal "Mark as system" toggle
redesigned. The button was near-invisible (matched neutral row
metadata). It's now a balanced amber-toned chip with shield icon
and a structured confirm modal replacing the native confirm() dialog
that summarizes fanout consequences before commit.
* Move stack-hint inside hero with glass-on-gradient styling
The post-action hint card ("✓ Added to your stack" with the
/update-agnes-plugins recipe) used to live below the hero in
panel-what (gray card on white page body). Clicking add/remove
inserted/removed it between the hero and content, shifting the
panels below — a noticeable scroll jump.
The hint is now anchored inside the hero's top-right corner alongside
the install/remove buttons, both as flex children of an absolutely
positioned .actions container. The card uses a translucent
white-on-glass treatment that adopts the hero's kind color (blue for
plugin, green for skill, purple for agent) without per-kind branching.
Hero is always tall enough (160px photo) to contain the action+hint
stack without overflow, so toggling the hint visibility doesn't grow
the hero or shift body content.
The hero-head grid reserves a third 300px column for the absolute
actions overlay so meta gets the proper 1fr free space instead of
being squeezed by a padding-right hack. Responsive breakpoint at
1100px reflows the actions stack below hero-head when the viewport
isn't wide enough to keep meta + actions side-by-side comfortably.
* Add optional -DataPath bind mount to run-local-dev.ps1
When the operator wants to inspect DuckDB files (system.duckdb, extracts,
marketplaces, store/, …) directly from Windows Explorer, the named volume
inside the Docker Desktop WSL VM isn't reachable. The new -DataPath param
generates a transient compose override that rebinds /data on app, scheduler,
extract (and Caddy's /srv:ro mirror) to a Windows host folder.
Fully additive — when -DataPath is omitted everything behaves exactly as
before: no override file is generated, $composeFiles array is unchanged,
finally cleanup is a no-op. Existing positional invocations
(.\run-local-dev.ps1 up | down | logs) keep binding to $Action because
$DataPath is a named-only parameter with no Position attribute.
The override is written via [System.IO.File]::WriteAllText so the YAML is
BOM-less across PS 5.1 / 7+ — Compose rejects BOM-prefixed YAML on Windows.
The override file is unique per PID and removed in the script's finally
block so concurrent invocations and crashes don't leak files.
* factor mark_system fanout into UserCuratedSubscriptionsRepository
The endpoint imported UserCuratedSubscriptionsRepository, ignored it
(noqa: F841), then duplicated the user-side fanout SQL inline. Adds
fanout_system_for_plugin() symmetric to the existing
fanout_system_for_user() and routes mark_plugin_system through it —
removes the dead import + 14 lines of inline SQL, returns the same
`affected_users` delta count, no behavior change.
* drop customer-specific path from .ps1 example
Per CLAUDE.md vendor-agnostic OSS rule: replaced
C:\\Business\\Groupon\\Agnes\\agnes-data with the generic
C:\\Users\\<you>\\agnes-data placeholder so the docstring
example reads cleanly on any reviewer's box.
* release: 0.48.0 + parallelize Release-workflow pytest
Cuts the release shipped via #228 #230 #231 #232 #233 #234 #236 #237 #238
#239 #240 plus this PR (#241). Major changes:
- System plugin tier (schema v39) — admins mark a plugin mandatory; fans
out RBAC grants + subscriptions to every existing user/group plus
hooks for new principals
- BREAKING: removed standalone /store + /my-ai-stack page routes
(replaced by /marketplace?tab=flea + /marketplace?tab=my)
- Setup-prompt + bootstrap recovery fixes (#240)
- DuckDB CHECKPOINT-on-shutdown + 60s compose grace (#235)
- Marketplace + flea-market UX polish, agnes-metadata.json enrichment
Bonus: switch release.yml test step to `-n auto` (matches ci.yml).
Single-threaded was 15-20 min and frequently the bottleneck on PR
mergeability — now ~6 min.
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
362 KiB
Changelog
All notable changes to Agnes AI Data Analyst.
Format: Keep a Changelog. Versions follow Semantic Versioning, pre-1.0 — public surface (CLI flags, REST endpoints, instance.yaml schema, extract.duckdb contract) may shift between minor versions; breaking changes called out under Changed or Removed with the BREAKING marker.
CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) are produced for every CI build; semver tags (v0.X.Y) are cut at release boundaries and reference the same commit as a stable-* tag from the same day.
[Unreleased]
[0.48.0] — 2026-05-10
Fixed
-
agnes refresh-marketplace --bootstrapnow recovers when the local marketplace clone exists but Claude Code's registry has lost theagnesentry (fresh Claude Code install on the same machine, manualclaude plugin marketplace remove agnes, or an earlier interrupted bootstrap). The previous behaviour skipped_bootstrap_clonewhenever~/.agnes/marketplace/.gitexisted and fell straight through toclaude plugin marketplace update agnes, which failed withMarketplace 'agnes' not found. Available marketplaces: claude-plugins-officialand cascaded into per-plugin install errors. The bootstrap path now parsesclaude plugin marketplace list, callsclaude plugin marketplace add ~/.agnes/marketplacewhenagnesisn't registered, and only then proceeds with fetch + reset + reconcile. Idempotent: a second bootstrap run withagnesalready registered is a no-op.In the same path,
claude plugin marketplace addfailures are now fatal instead ofwarn:-and-continue. The previous warn-and-continue was the root cause of the cascade above — the operator never saw the real error fromadd, only the downstream "Marketplace not found" symptoms.Source: 2026-05-10 init report from a clean-machine bootstrap against a private-CA Agnes deployment.
Added
-
Setup prompt always registers the
agnesClaude Code marketplace, even when the operator has zero plugin grants. Registering the per-user marketplace clone pre-wires the SessionStart hook so future admin grants land automatically on the next Claude Code session without re-running setup. The marketplace block's copy adapts: empty plugin list shows "no plugins granted yet", populated list shows "install plugins". Steps 4 (preflight) + 5 (marketplace) are now always emitted; Confirm shifts from step 6 to step 9 across the full layout. -
Setup prompt registers the Atlassian Remote MCP server unattended via
claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse(Fix C in the 2026-05-10 init-report response). Hosted Remote MCP, so Claude Code handles OAuth automatically the first time the operator asks it to read a Jira ticket or Confluence page — no PAT/keychain dance. Idempotent across re-runs (|| trueswallows the "server already exists" exit). Asana and Google Workspace stay on the /home connector cards because their PAT/CLI flows don't fit an unattended bootstrap. -
Setup prompt's Confirm step nudges the user toward connector cards on /home for Asana / Google Workspace / Atlassian PAT flows that the bash script can't automate. Surfaces the cards so analysts don't finish bootstrap thinking they're fully wired.
-
System plugin tier (schema v39). Admins can now mark a curated marketplace plugin as a system plugin via a new toggle in the Details modal on
/admin/marketplaces. Marking materializes aresource_grantsrow for every existing user_group and auser_plugin_optouts(subscription) row for every existing user, so the plugin lands in every user's stack from day one. Hooks on user-create (Google OAuth, email magic-link, admin-create, scheduler token) and group-create (admin POST + Google Workspace sync ensure) fan out the same materialization to new principals. The resolver itself is unchanged — system semantics emerge from the materialized rows. UI locks the corresponding controls:/admin/accesscheckbox is checked + disabled with a SYSTEM pill;/marketplacebrowse cards show a "Required" badge and the detail-page install button reads "✓ Required by your org";/my-ai-stacktoggle is disabled with a System pill. Backend guards return 409 on the bypass paths (DELETE /api/admin/grantsfor system grants,PUT /api/my-stack/curated/.../{enabled:false},DELETE /api/marketplace/curated/.../install). Unmark flips the flag only — materialized rows persist so admins curate cleanup at their leisure via the now-unlocked/admin/accesscheckboxes. Endpoints:POST/DELETE /api/marketplaces/{id}/plugins/{name}/system. -
/update-agnes-pluginsslash command — installed automatically byagnes initinto<workspace>/.claude/commands/. Runsagnes refresh-marketplace(the chatty default mode) so the user sees install/update progress streamed into the Claude Code transcript and can react to errors interactively, instead of having a full reconcile happen silently behind a SessionStart hook. -
agnes refresh-marketplace --check— lightweight detector mode for the SessionStart hook. Runsgit fetchonly, compares localHEADwith remoteFETCH_HEAD, and emits a Claude Code hook JSON message pointing the user at/update-agnes-pluginswhen there are remote changes. Silent when up to date. Nogit reset, noclaude plugin marketplace update, no plugin install/update side effects. -
Flea-market entity edit feature with version history (schema v38). Owner + admin can now edit a store entity from a real Edit page at
/marketplace/flea/{id}/edit(replaces the prior "coming soon" placeholder). Editable fields: display name, description, category, video URL, cover photo, and an optional new bundle. Type is locked (400type_lockedon change attempt). Display-name change renames the on-disk slug for both the liveplugin/dir and the version dir, mirroring the rename-on-archive flow.Each bundle update creates a new version: bytes bake into
${DATA_DIR}/store/<id>/versions/v<N+1>/plugin/, run the standard guardrails pipeline. Deferred promotion: the liveplugin/dir andentity.version_nostay at the prior approved version through the LLM review window, so existing installers keep receiving the previously approved bundle while the new version is being validated. Promotion (live swap + version_no/version/file_size bump) happens only on LLM approval; if the new version is blocked, installers continue serving the prior approved version indefinitely. The entity row carriesversion_no(current served index) andversion_historyJSON (append-only per-version metadata: hash, sha256, size, submission_id, created_at, created_by). Existing entities backfill to v1 with a single-entry history seeded from the row's currentversionhash.Block-while-pending: an in-flight LLM review blocks any further edit with 409
prior_version_pending. Owner waits ~5-30s; the detail page Edit button renders disabled in the same window.Rollback: new endpoint
POST /api/store/entities/{id}/versions/{n}/restore(owner + admin) copies a prior version's bundle forward as v<max+1> and re-runs guardrails. Forward-only history — the original row keeps its verdict; the new copy gets a fresh one. Detail page renders a Versions card with restore buttons for owner/admin only.Admin queue gains a
v#column (with "current" badge) and a separate Hash column. Submission detail page surfaces Version + Bundle hash rows. Activity timeline splits into per-submission + entity-wide cards so admins can tell version-scoped events apart from entity-wide ones; entity-wide rows rendervNchips when the audit row's params reference a version.
Changed
-
CLAUDE.md template renames the marketplace section to "Agnes Marketplace — plugins available to you" and clarifies that Claude Code addresses every plugin as
<plugin>@agnesregardless of upstream marketplace slug — the per-user aggregated marketplace name is alwaysagnes. Resolves the naming-drift confusion flagged in the 2026-05-10 init report (CLAUDE.md previously rendered upstream marketplace registry names like<Org> Marketplace/<org>-marketplacewithout explaining the typed name is alwaysagnes). Upstream marketplace names still render as nested bullets so admins see what's been folded in. -
SessionStart marketplace hook is now read-only. The hook installed by
agnes initwas previouslyagnes refresh-marketplace --quiet, which performed a full fetch+reset+install cycle on every session start (slow, invisible to the user, not interactively recoverable). It now runsagnes refresh-marketplace --check— detect-only — and surfaces a hint to run/update-agnes-pluginswhen updates are available. Existing workspaces auto-upgrade on nextagnes init(the substring markeragnes refresh-marketplacematches both the old and new entry shapes, so the idempotent-replace path correctly rewrites them). -
Marketplace "Added to your stack" hint points at
/update-agnes-plugins. The post-install green panel on plugin and skill/agent detail pages used to suggestagnes refresh-marketplacein a shell prompt and reference the SessionStart auto-install. With the hook now being detect-only, that text was outdated. The hint is condensed to a single instruction — open a new Claude Code session and run/update-agnes-plugins— with the slash command in a copy chip. Affectsmarketplace_plugin_detail.htmlandmarketplace_item_detail.html. -
Plugin / skill / agent detail page install button split into two elements when in stack. The single button that morphed between
+ Add to my stackand✓ In your stackdid not communicate the uninstall affordance — clicking the green "In your stack" button silently removed the plugin with no visible signal that the click meant "remove". The installed state now renders an inline white status label✓ In your stackbefore a separate red-bordered✕ Remove from stackbutton on the same row. Both buttons share the install button's exact height to avoid layout shift on toggle. System plugins still render the locked amber pill✓ Required by your orgwith no Remove button (API refuses uninstall with 409). The post-action hint panel now also fires on remove with the title flipped to✓ Removed from your stack— Claude Code needs the same/update-agnes-pluginsrefresh either way. -
/admin/marketplacesDetails modal "Mark as system" toggle redesigned. The toggle button was previously near-invisible — same border + neutral-gray text as surrounding row metadata. It now renders as a balanced amber-toned chip with a shield icon: outlined white when the plugin is off-system (calls attention without shouting), tinted amber-50 when on-system (reads as "currently active, click to revert"). The nativeconfirm()dialog is replaced with a structured modal that summarizes the fanout consequences (RBAC grants for every group, subscriptions for every user, locked in user-facing UI, new principals inherit it).
Removed
-
BREAKING:
/storeand/my-ai-stackpage routes deleted. Both surfaces are fully replaced by/marketplace?tab=fleaand/marketplace?tab=myrespectively, which already render the same data via the unified marketplace tabs. Hard delete with no redirects — stale bookmarks 404. The upload wizard at/store/new, the flea detail/edit at/marketplace/flea/{id}[/edit], the admin queue at/admin/store/submissions, and all/api/store/*+/api/my-stackendpoints stay untouched. Theagnes my-stackCLI subcommand andagnes storeare unaffected. Internal hard-coded hrefs (advanced setup page, store upload-wizard Cancel button, admin marketplaces modal copy, navbar active-state guard) repointed to the new tab URLs. -
BREAKING:
agnes refresh-marketplace --quietflag. Replaced by--check(detect-only) and the new/update-agnes-pluginsslash command (interactive update). Existing SessionStart hooks calling--quietwill silent-noop after the CLI upgrade — the hook's2>/dev/null || trueswallows the unknown-flag error — until the user re-runsagnes init, which rewrites the hook to use--checkand installs the slash command. Dashboard/setupflow re-runsagnes initautomatically on next paste. -
BREAKING: legacy
git config --global http.<host>.sslVerify=falsedowngrade in the install setup prompt. The marketplace step (step 5) used to emit this line onAGNES_DEBUG_AUTH=1instances when noca_pemwas readable fromAGNES_TLS_FULLCHAIN_PATH(default/data/state/certs/fullchain.pem). It tripped Claude Code auto-mode classifiers ("do not disable TLS verification" rule) and silently masked operator misconfigurations — a debug-auth instance without a fullchain on disk would fall through to a TLS-disabled clone instead of surfacing the missing cert. With this change there is exactly one trust-bootstrap path: the cross-platform step 0 trust block (gated on_read_agnes_ca_pemreturning a PEM). Operators serving a self-signed or private-CA cert MUST place the fullchain at the configured path so step 0 picks it up; publicly-trusted certs need no trust block at all. Theself_signed_tlsparameter onapp.web.setup_instructions.resolve_linesandrender_setup_instructionsis also dropped (was only consumed by the deleted block).
Fixed
v34→v35migration is now idempotent under partial-rebuild recovery. The original list-form_V34_TO_V35_MIGRATIONSran four ALTER statements in sequence:ADD _vis_v35→UPDATE _vis_v35 = visibility_status→DROP visibility_status→RENAME _vis_v35 TO visibility_status. If the RENAME failed for any reason after the DROP succeeded (DuckDB lock contention at startup, scheduler-vs-app race openingsystem.duckdb, container kill mid-migration, …), the DB was stranded with_vis_v35populated andvisibility_statusmissing — andschema_versionnever bumped because the UPDATE at the bottom of the migration ladder only runs when every step succeeds. Subsequent restarts then hitDROP visibility_statusagain with noIF EXISTSguard and looped on the same error; the only recovery was hand-editing the DB. The migration is rewritten as a Python function_v34_to_v35_migratethat inspects the table's columns up front and dispatches into one of three paths: clean v34 (run the full rebuild), partial v35 with_vis_v35only (finish the RENAME alone), or both columns present (drop the temp). The audit columns (archived_at,archived_by) ship first behindIF NOT EXISTSso they're safe in all states. Operators stranded by the original bug recover automatically on next startup. Tests cover the three direct paths plus an end-to-end scenario where_ensure_schemawalks aschema_version=32DB with the half-applied state up through to v36.
Security
-
Prompt-injection hardening for store guardrails LLM review (#1).
SYSTEM_PROMPTis now passed via the Anthropic SDK's dedicatedsystem=parameter instead of being concatenated into the user message. Bundle file contents are wrapped in<bundle>...</bundle>sentinels that the system prompt declares data-only; literal sentinel strings appearing in user content are escaped (<_bundle_>) so an adversarial README can't forge a closing tag and inject instructions. The system prompt explicitly tells the reviewer to flag injection attempts inside<bundle>rather than follow them. Seetests/test_store_guardrails_prompt_injection.pyfor the corpus. -
Static security scan documented as signal, not gate (#6 partial). Module docstring + admin-queue copy +
docs/STORE_GUARDRAILS.mdcall out that substring matches are suggestive only — the LLM verdict carries the safety determination. Documentation files (.md,.txt,.rst,.html,.json,.yaml,.yml,.toml) now skip static scan to avoid false positives on prose that legitimately discusseseval/exec. AST-mode for Python source is tracked as a follow-up.
Added
-
Stuck-review reaper (schema v35 + new endpoint).
POST /api/admin/run-reap-stuck-reviewsflips submissions stuck atstatus='pending_llm'past the configured grace (guardrails.stuck_review_grace_seconds, default 1800s) toreview_error. Scheduler invokes every 15 min. Without this a worker crash between status flip and verdict write left rows pending forever. Set the knob to 0 to disable. -
PUT /api/store/entities/{id} atomic rename (#2). Bundle updates now bake into a sibling
plugin.staging-<rand>/dir, run inline checks against the staging copy, then atomic- rename onto the live path on success. Failed checks leave the live tree byte-for-byte intact. Pre-fix the bake wrote into the live path BEFORE checks ran; concurrent GETs could see partial / unverified content. -
Schema v35 → v36 re-applies
NOT NULL+DEFAULT 'pending'onstore_entities.visibility_status(lost in the v34→v35 column rebuild). Value-list invariant remains application-side enforced via the repo whitelist (DuckDBADD CHECKon existing columns is not supported).
Changed
-
BG-task verdict-vs-archive race fixed (#3).
StoreEntitiesRepository.set_visibility_if_pendingflips visibility only when the row is still in the review window (pending/hidden). When an admin archives an entity while the LLM review is in flight, the BG verdict no longer clobbers the archive — admin's decision wins. Skipped flips emit astore.submission.bg_verdict_skippedaudit row so admins can see why an "approved" verdict didn't publish. -
Quota counter widened to all reject states (#9).
count_blocked_for_submitter_sincenow countsblocked_inline,blocked_llm, ANDreview_erroragainst the per-submitter daily cap. Pre-fix a bot triggering only LLM-blocked verdicts was unbounded. -
Un-archive clears archive metadata (#11).
set_visibilitynullsarchived_at+archived_bywhen transitioning OUT of'archived'so a future read doesn't show stale archive forensics on an approved row. -
Missing
risk_levelsurfaces asreview_error(#10). An LLM response that omits or emptiesrisk_levelno longer defaults tomedium(which looked like a model decision and silently blocked); it persists asreview_errorwitherror='missing_risk_level'so the admin gets a real Retry button. -
Sort-key whitelist for admin queue (#23).
/api/admin/store/submissions?sort=…rejects unknown keys with HTTP 400invalid_sort_key. Pre-fix a substring-replace chain could drop column references silently when one column name was a substring of another. -
FSM doc comment in
_SYSTEM_SCHEMAcorrected (#12). Explicit insert/transition/lifecycle sections describe the actual status machine instead of the misleadingpending → pending_llm → ...chain.pending_inlineclarified as reserved-but-unused. -
Soft delete (Archive) for store entities (schema v35).
DELETE /api/store/entities/{id}is now soft by default — flipsvisibility_status='archived'+ stampsarchived_at/archived_by. Bundle stays on disk, existinguser_store_installscontinue serving the bundle throughmarketplace.zip/.gitso already-installed users don't lose the plugin. Browse listings hide archived entries from everyone (including the owner — admins triage). New installs refused. My AI Stack still shows installed-but-archived entries with a subtle "Archived by owner" badge.Hard delete moves to
DELETE /api/store/entities/{id}?hard=true— admin-only. Drops the bundle bytes + cascades to removeuser_store_installs(existing users lose the plugin on next sync). Use only for legal / privacy removals where the bytes have to go.Detail-page UX: owner of an approved entity sees an Archive button. Admin sees both Archive and a separate red Hard delete (admin) button with an install-count warning in the confirm dialog. Quarantined (pending / blocked) entities lock both buttons for the owner — admin still sees both.
Visibility-leak gates (similar audit):
/api/store/owners+/api/marketplace/categories?tab=fleanow filter tovisibility_status='approved'for non-admin callers (admin sees all). Without this, owner identity + per-category counts of quarantined or archived entries leaked through the public dropdown / filter chips.
Changed
-
Rename-on-archive frees the name for re-upload. Archiving an entity now appends
__archived__<epoch>tostore_entities.namein the same UPDATE that flipsvisibility_status='archived'. The on-disk skill / agent / plugin subdir is renamed in lockstep (skills/<old_suffix>/→skills/<new_suffix>/) and SKILL.md / agent.md / plugin.json frontmatternameis rewritten so consumers' Claude Code resolves the new slug after their next sync. The(owner_user_id, name)UNIQUE slot AND the global<name>-by-<owner_username>invocation slot free up, so the same owner can re-upload under the original name without picking a new one. Admin un-archive (set_visibility from 'archived' to 'approved') strips the suffix; if the original slot is taken by a re-upload, the un-archived row gets<name>-restored-N. Display layer (admin queue, my-stack, marketplace cards / detail) strips the suffix so users see the original label with an "Archived" badge instead of the marker. Trade-off: existing installers see the plugin renamed on next pull and need to re-add (one-tap recovery via the My AI Stack card; same data, new slug).audit_log.params['original_name']preserves forensic traceability. -
Admin submissions queue: Archived chip filters live entity visibility via LEFT JOIN, not denormalized submission status. Verdict (
store_submissions.status) is immutable forensic record; lifecycle (store_entities.visibility_status) is the live source of truth. Any code path that flips visibility now surfaces in the queue immediately — no denormalization to drift. Deleted chip still filtersentity_id IS NULL AND status='deleted'(entity row is gone after hard delete; explicit marker required). The submission detail page renders Status (verdict) and Entity lifecycle side by side. Closes the bug where archiving an entity outside the soft-delete API didn't surface under?status=archived. -
Consolidated
/store/{id}into/marketplace/flea/{id}. The legacy detail surface is gone; the unified marketplace detail page is the canonical home for every flea entity. Three in-tree callers (upload-success redirect, My AI Stack card href, /store browse card href) now point straight at the new URL — no redirect hop. Stale external/store/{id}bookmarks 404. The marketplace detail templates (marketplace_plugin_detail.html+marketplace_item_detail.html) gained the quarantine banner (extracted into a shared_quarantine_banner.htmlpartial), an owner-actions strip (Edit "coming soon" + Delete with locked variants), and the install-button gating (gray inert when non-approved). The marketplace listing now surfaces a small "Under review" / "Quarantined" corner badge on the submitter's own non-approved cards (only visible to them; everyone else still sees only approved entries).
Added
- Visibility gate on
/marketplace/flea/{id}+/api/marketplace/flea/{id}/detail. Non-owner non-admin gets 404 (not 403, no leak) on any non-approved entity — closes the bypass where guessing an entity_id pulled the bundle metadata through the marketplace JSON feed even though the entity was excluded from the public listing. StoreEntitiesRepository.list(include_owner_id=…). When set, the WHERE expands to(visibility_status IN (...) OR owner_user_id = :uid)so the caller's own non-approved entries surface alongside everyone's approved ones. Used by/api/store/entitiesand/api/marketplace/items?tab=flea.
Removed
/store/{id}route +store_detail.htmltemplate. Replaced by the consolidated marketplace detail surface above.
Removed
store_submissions.retry_countcolumn (schema v34). Counter mixed two unrelated things (LLM error count + admin rescan count), was asymmetric (Retry LLM didn't bump but Rescan did), and is fully redundant with the audit_log activity timeline now rendered on the detail page — every rescan / retry / review_error is a row there with timestamp + actor. Removed from schema, repo signatures, admin endpoints, and the detail-page metadata.
Internal
- Migrate
src/marketplace_asset_mirror.pyfromurllib.requesttohttpx(PR #234 review #16). The asset mirror was the only HTTP call site in Agnes still usingurllib.request; every other module (CLI, Jira / OpenMetadata / OpenAI connectors, scheduler, Telegram bot) already usedhttpx. Following the existing convention has three concrete benefits here: (a) the SSRF defence collapses from five urllib classes (_PinnedHTTPConnection,_PinnedHTTPSConnection,_PinnedHTTPHandler,_PinnedHTTPSHandler,_SafeRedirectHandler) into a single_SSRFGuardTransportbecause httpx invokeshandle_request()on every redirect hop, so re-validation is automatic; (b) the per-leg URL host is rewritten to the SSRF-validated IP and the original hostname is preserved in theHostheader +sni_hostnameextension, defeating DNS rebinding without subclassingHTTPConnection/HTTPSConnection; (c) error handling collapses fromURLError+HTTPError+ manual unwrap into onehttpx.HTTPErrorcatch + specific subclasses for timeout / too-many-redirects, matching the_translate_transport_errorshape fromcli/client.py. The sharedhttpx.Clientis built lazily at module load (same pattern ascli/client.py:_get_shared_client) withfollow_redirects=True,max_redirects=5, and our custom transport. Externally observable behaviour is unchanged: sameFetchOutcomestatuses (ok / not_modified / failed / rejected), same manifest format, same conditional GET semantics. Tests migrated fromurllib-shaped fakes tohttpx-shaped (status_code,iter_bytes, context manager); five urllib-specific tests replaced with httpx equivalents (transport unit tests + DNS-rebinding integration test). - Maintainability cleanup batch (PR #234 review #10, #14, #11). #10: dropped
_path_underfromapp/api/marketplace.py— it was a byte-equivalent clone of_safe_join(samePath.resolve(strict=True) + relative_to()containment check), so the three callers in the v32 asset / doc / mirrored endpoints now share the existing helper. #14: renamedsrc/marketplace_assets.py→src/marketplace_asset_validation.pyso the file's purpose (image / doc magic-byte validators + Content-Type allowlist + agnes-metadata parsers) is obvious from the name and the previous overlap withsrc/marketplace_asset_mirror.pyis gone; six call-site imports updated in lockstep. #11: consolidated the three URL builders that resolve/api/marketplace/curated/<slug>/<plugin>/{asset,doc,mirrored}/...paths —_internal_asset_url/_internal_doc_url/_mirrored_asset_urllived insrc/marketplace.py, while a copy named_mirrored_urllived inapp/api/marketplace.pywith a "must stay aligned" comment. The new modulesrc/marketplace_urls.pyis the single source of truth; both call sites import from it. The route-handler endpoints themselves still own the path string literals — keeping the builders identical to the route declarations remains a checklist item. - Consolidate marketplace detail-page video embeds + format-guide CSS (PR #234 review #12, #13). The YouTube nocookie / Vimeo /
<video>/ link-fallback detection logic was duplicated verbatim acrossmarketplace_plugin_detail.htmlandmarketplace_item_detail.html(~40 JS lines each, with subtly-different inline styles); the function now lives in a single_marketplace_video_embed.htmlpartial that both templates{% include %}inside their IIFE. The.video-wrapselectors (one inline<style>rule inmarketplace_plugin_detail.html, one inlinestyle="..."attribute inmarketplace_item_detail.html) are replaced by the existing.video-embed16:9 wrapper fromstyle-custom.css, with new.video-embed video/.video-embed achild rules added so the wrapper handles all four embed shapes uniformly. The 60-line inline<style>block inmarketplace_format_guide.htmlmoves verbatim tostyle-custom.cssunder a new "Marketplace format guide page" section, scoped to.format-guideso other pages aren't affected. No user-visible behaviour change — the rendered HTML for valid YouTube / Vimeo / mp4 / external links is byte-identical to before; the format-guide page renders the same. - Drop unused curated-marketplace helpers flagged in PR #234 review:
src.marketplace_metadata.build_db_payload(imported but never called — strict-drop semantics were re-implemented inline insrc.marketplace._refresh_plugin_cacheand the standalone helper would have silently regressed back to "fall through to original external URL on mirror failure" if a future contributor re-wired it),app.api.marketplace._resolve_marketplace_name(one-line shim with no remaining call sites; callers use_resolve_marketplace_metawhich returns name + curator together). Also removes the misleading# noqa: F401 Optional kept for forward-compatonsrc/marketplace.py—OptionalIS used (twice in the file).
Fixed
- My Stack tab now surfaces curated cover photos / category overrides. Once a user clicked "+ Add to my stack" on a curated card, the same plugin in
?tab=myrendered with the gradient placeholder instead of its cover photo — the My Stack handler built rows from the on-diskmarketplace.json(which doesn't carry theagnes-metadata.jsonenrichment columns) and hard-codedcover_photo_url=None. The handler now looks up the enrichedmarketplace_pluginsrow for each(marketplace_id, plugin)in the user's RBAC ∩ subscriptions intersection, falling back to the synthetic on-disk shape only when the DB row is missing (rare race — granted before the first sync ingested the plugin). RBAC gating is unchanged. Regression test exercises the full flow: seed plugin row withcover_photo_url, subscribe user, hit/api/marketplace/items?tab=my, assertphoto_urlcarries the served URL. - Asset mirror manifest re-keyed by
(plugin_name, url)+ per-URL fetch dedup (PR #234 review #4 + #8). The manifest used to be keyed by URL alone, so two plugins in the same marketplace referencing the same external image (a shared CDN icon, a common cover) collided onentry.plugin_name— last writer won. The DB row for the losing plugin then stored a served URL pointing under the winning plugin's tree, andrequire_resource_access(MARKETPLACE_PLUGIN, ...)denied legitimate access on one side and let the other plugin's user reach the wrong asset. Manifest is now keyed by(plugin_name, url)in memory; on disk the format flips from a{url: entry}dict to a[entry, …]list of self-describing entries (each carries plugin_name + url + the previous fields). Phase 1 ofsync_assetsdeduplicates fetches by URL — three plugins sharing one URL share one HTTP request, but Phase 2 still creates a per-(plugin, url)manifest entry pointing under the plugin's own subdir. Body files are still stored per plugin (RBAC-clean isolation: deleting plugin A's cache can't strand plugin B). Consumer code (src.marketplace._refresh_plugin_cache+app.api.marketplace._resolve_external_via_mirror/_curated_inner_cover/_curated_inner_enrichment) re-keyedserved_url_for/mirror_status/ manifest lookups to the composite key. Tests cover the per-plugin manifest entries with shared URL, the single HTTP fetch for N plugins, and Phase 3 drop-one-keep-other. - Asset mirror persists manifest per body write, before unlinking old files (PR #234 review #7). Phase 2 of
sync_assetspreviously wrote each body atomically (tmp + rename) but persisted the manifest only at end-of-batch. Akill -9mid-batch (OOM, deploy, power loss) left on-disk files the manifest never referenced — and once a curator dropped that URL fromagnes-metadata.json, Phase 3's cleanup logic had no record of the file and the orphan stayed forever (no GC pass walks the cache dir today). The new ordering writes the body, mutates the in-memory manifest, persists the manifest, then unlinks the previous body. The crash window narrows from "all of Phase 2" to "between persist and unlink" (microseconds). Cost: one extra tmp+rename per body write; manifest is a few KB so the overhead is negligible vs. the HTTP fetches. A persist failure mid-batch keeps the old body on disk (the on-disk manifest still references it — a stale file beats a 404). Phase 3 (curator-removed URLs) follows the same discipline: collect to-delete relpaths, persist the manifest with the entries already gone, then unlink. Tests cover per-body persist (spy on_write_manifestcall count), the post-update on-disk manifest content, and the Phase 3 persist-before-unlink ordering (spy onPath.unlinkreads the on-disk manifest from inside the call). - Hard 1 MB cap + broadened exception catch on
agnes-metadata.jsonreader (PR #234 review #9). The reader is invoked once per marketplace per sync and the file is curator-controlled. Without a size cap, a curator could commit a multi-GB JSON and OOM the sync worker onpath.read_text(). Without catchingRecursionError, a deeply-nested document ({"a":{"a":{"a":...}}}) fitting under any size cap would still propagate past theValueErrorcatch and abort the sync for every marketplace in the same pass. Now:path.stat().st_sizeis checked againstAGNES_METADATA_MAX_BYTES(1 MB — generous; a real-world file with covers / docs / categories for ~50 plugins fits in <100 KB) before the body is read, and the JSON parseexceptis widened to(ValueError, RecursionError). Either failure mode degrades to an empty metadata dict (the same fall-back the malformed-JSON path uses) so one bad upstream never blocks the rest of the sync. - Curator now mandatory on
PATCH /api/marketplaces/{id}too (PR #234 review). The POST handler enforcedcurator_name+curator_emailat create time, but the PATCH handler treated empty / missing curator inputs as "no change" — so legacy rows that pre-date v32 (curator_name=NULL) could be edited indefinitely (URL, description, name) without ever filling the curator gap, and theOWNER_TODO_PLACEHOLDERlingered on every/marketplacecard. The PATCH path now rejects with400 curator_name is required/curator_email is requiredwhen the post-merge row would persist with empty curator. The DB column itself stays nullable so untouched legacy rows continue to coexist; the gate fires only the moment an admin opens the edit modal. Existing PATCH semantics (empty-string input = "leave existing value alone", once-filled curator can't be cleared) are preserved. - Stored-XSS hardening on the curated
/asset/{path}endpoint (PR #234 review). The endpoint previously served any file in the cloned marketplace repo with stdlib-detectedContent-Type, so a curator who could land anevil.html(or a renamedevil.pngcarrying HTML bytes) in.agnes/got a same-origin XSS payload — the response shares cookie scope with/adminand/api/me/*. The endpoint is now image-only with three layered checks: extension must be inIMAGE_EXTENSIONS(.png/.jpg/.jpeg/.webp; SVG intentionally excluded —<script>inside SVG executes), body must passvalidate_image_filemagic-bytes (defeats the rename-extension attack), and the responseContent-Typeis pinned from the validated extension (never stdlib mimetypes). Defense-in-depth headersX-Content-Type-Options: nosniffplus a strictContent-Security-Policy: default-src 'none'; img-src 'self'; style-src 'unsafe-inline'are now applied to every/asset/response. The/doc/(already extension-gated) and/mirrored/(mirror-validated body) siblings were untouched. Regression tests cover the HTML extension, the renamed-HTML-as-PNG bypass, the SVG extension, and the happy-path PNG with the security headers attached. - SSRF hardening of the curated-marketplace asset mirror (PR #234 review). The pre-flight
_is_safe_urlcheck validated only the initial URL, buturllib.request.urlopenthen followed redirects and re-resolved the hostname for the actual connection — both bypassable. An attacker-controlled origin could 302 tohttp://169.254.169.254/...and exfil cloud metadata; an attacker-controlled DNS server could return a public IP for the validation lookup and127.0.0.1for the connection lookup (DNS rebinding). The mirror now uses a single sharedOpenerDirectorwith three custom handlers:_SafeRedirectHandlerre-runs the SSRF allowlist on every redirectLocation(max 5 hops, down from urllib's default of 10), and_PinnedHTTPHandler/_PinnedHTTPSHandlerconnect directly to the IP that passed validation rather than re-resolving the hostname. TLS SNI + cert verification still bind to the original hostname so a curator-supplied URL whose cert chain matches the hostname keeps working._resolve_safereturns the validated IP (the existing_is_safe_url2-tuple wrapper stays for backwards compatibility) and also rejects round-robin DNS that mixes a public + private record. Regression tests cover redirect blocking, redirect error unwrapping insideURLError, the pinned-IP connection target, and the end-to-end DNS-rebinding scenario.
Added
-
Curated marketplace enrichment via
.claude-plugin/agnes-metadata.json. Upstream marketplace repos can ship a sibling file next tomarketplace.jsondeclaring per-plugin (and per-skill / per-agent) cover photo, demo video URL, doc links, and category override — seedocs/curated-marketplace-format.mdfor the schema anddocs/examples/agnes-metadata.jsonfor a worked example. Asset references are hybrid: acover_photovalue beginning withhttps://is treated as external (mirrored to${DATA_DIR}/marketplace-cache/<slug>/at sync time so linkrot doesn't break the UI); other values are repo-relative paths served straight from the cloned working tree. The.claude-plugin/agnes-metadata.jsonfile plus anything under a.agnes/directory is stripped from the synthetic Claude Code marketplace Agnes serves to user instances (/marketplace.zip,/marketplace.git/*) — the upstream repo stays a fully valid Claude Code marketplace for directplugin marketplace addconsumers, and Agnes-only metadata never reaches Claude Code. New shared validation modulesrc/marketplace_asset_validation.pyenforces document allowlist (PDF, Markdown, plain text) and image allowlist (PNG, JPEG, WEBP) on both the curated mirror flow and the Flea upload flow. Strict drop semantics: any cover or doc Agnes can't serve as a real file (missing internal path, mirror fail, allowlist reject, magic-bytes mismatch) is dropped from the served metadata entirely — the UI renders identically to the no-entry case (gradient placeholder for missing covers, no row in the doc list) so curators never ship a broken link to every analyst until they notice. Curated card render swaps a 404 cover for the gradient placeholder via<img onerror>so a stale DB row pointing at a deleted file still looks clean. Doc clicks force-download viaContent-Disposition: attachment. YouTube embeds use theyoutube-nocookie.comprivacy-enhanced domain with the canonicalallow="..."permissions list so corporate / private-CA setups don't render a blank frame. Inner-card cover photos on the plugin detail page (skills + agents) populate from the sameagnes-metadata.jsonsub-trees. New publicly-readable format guide at/marketplace/format-guide(linked from/admin/marketplacesnext to the+ Add Marketplacebutton) renders the curator-focused markdown source viamarkdown-it-py. -
Mandatory curator on registered marketplaces. Admin must supply
curator_nameandcurator_emailwhen registering a marketplace through/admin/marketplaces; both are editable later through the same admin UI. The values surface on/marketplacecards and plugin detail pages in place of the historicalowner_todoplaceholder (which still appears for legacy rows that pre-date the migration until an admin patches them). Validation lives at the API layer (POST /api/marketplacesreturns 400curator_name is required/curator_email is required) — the DB columns themselves are nullable so existing rows survive migration without forcing a refill before the next request. -
External-asset mirror cache. New module
src/marketplace_asset_mirror.pydrives the per-sync HTTP fetch with conditional GET (If-None-Match/If-Modified-Since), 60 s timeout, 10 MB body cap, max 4 concurrent fetches, and SSRF guards (onlyhttp(s)://, blocks loopback / private / link-local / metadata IPs). HTTPUser-Agentfollows the Wikipedia / Wikimedia Commons policy format (Agnes-Marketplace-Mirror/1.0 (+<repo-url>; agnes-mirror)) so strict CDNs that reject generic UA strings still serve. On fetch failure the previous good copy is preserved (b1 fallback) and the manifest entry records the error — admin sees a "mirror failed" indicator without users seeing 404s. Per-marketplace manifest at${DATA_DIR}/marketplace-cache/<slug>/manifest.json. Cache dir is removed alongside the cloned working tree ondelete_marketplace_dir. Inner-level (skill / agent) external URLs are also mirrored — the request-time skill / agent detail enrichment looks them up in the same per-plugin manifest and applies the same drop-on-failure rule as the plugin level. -
New
/api/marketplace/curated/{mp}/{plugin}/asset/{path},/doc/{path}, and/mirrored/{key}endpoints serving internal repo files, internal doc files (allowlist-gated), and mirrored external assets respectively. All three are gated byrequire_resource_access(MARKETPLACE_PLUGIN, "{mp}/{plugin}")and validate paths viaPath.resolve(strict=True) + is_relative_to()so..segments and symlinks pointing outside the marketplace tree return 404. -
Session pipeline framework under
services/session_pipeline/— pluggable processors for the centralized/data/user_sessions/<key>/*.jsonltree. Each processor implements aSessionProcessorProtocol (name,cadence_minutes,process_session(...)) and runs through its own per-processor scheduler tick + scan loop. No cross-processor coupling: a slow or failing processor cannot block any other. Pure-utility lib (parse_jsonl,compute_file_hash) is shared; orchestration is per-processor inrunner.run_processor(). Adding a new processor is one file inservices/session_processors/<name>.py, one entry in the registry list, one entry in the schedulerJOBSlist. Seeservices/session_pipeline/contract.pyfor the protocol andservices/session_processors/__init__.pyfor the registry pattern. -
services/session_processors/usage.py—UsageProcessorskeleton (no-op,cadence_minutes=10). Reserves the registry slot + scheduler entry so the framework end-to-end exercises two processors. Extraction logic (skill / agent invocation events) and storage shape (DuckDB table vs. append-only parquet event log) are deferred to a separate brainstorm. -
POST /api/admin/run-session-processor?processor=<name>— parametrized admin endpoint that drives one session-pipeline processor end-to-end. Admin-gated; same audit pattern as the other/api/admin/run-*endpoints (one row per call with actionrun_session_processor:<name>); 400 whenprocessoris unknown. -
SessionProcessorStateRepositoryinsrc/repositories/session_processor_state.py— backs the new state table. -
Flea-market upload guardrails (schema v32). Every
POST/PUTto/api/store/entitiesnow passes through a four-stage check pipeline before the entity becomes visible in the public flea browse. Inline checks (manifest shape, static security scan for shell-eval / hardcoded API keys / reverse shells / pickle deserialization, quality + Jinja-template recommendation) run synchronously and return a structured422body listing every failed rule on rejection. An async LLM security review then runs onBackgroundTasks; onsafe/lowrisk with nohigh|criticalfindings the entity flips tovisibility_status='approved', otherwise it stays hidden until an admin overrides the verdict. Every submission attempt — pass, fail, or in-flight — is captured in a newstore_submissionstable that powers/admin/store/submissionswith override / retry / rescan / download / delete actions, all audit-logged. The reviewer model is configurable viainstance.yaml→guardrails.review_model: haiku|sonnet|opus(defaulthaiku); when noANTHROPIC_API_KEYis configured the LLM step auto-disables and uploads auto-approve so first-boot UX stays sane. A non-blocking quality hint encourages uploaders to add{{var}}placeholders so first-use customization works. Schema v32: addsstore_entities.visibility_status(existing rows backfilled to'approved'so live uploads survive the upgrade) and createsstore_submissions.UserStoreInstallsRepository.list_for_usernow filters non-approved entities so a user-installed entity that gets blocked by review stops being served to Claude Code viamarketplace.zip/marketplace.gituntil override. Seedocs/STORE_GUARDRAILS.md. -
Blocked-bundle persistence + 30-day TTL purge (schema v33). Inline-blocked uploads no longer roll back the bundle at upload time — the ZIP stays on disk under a
visibility_status='hidden'entity row so admins can Rescan, Override + publish, or Download bundle for forensic inspection from/admin/store/submissions/{id}. Three new columns onstore_submissions:file_size— bytes on disk; sortable in the admin list (click the new Size column header).bundle_sha256— content-addressed hash; survives the TTL purge so admins can correlate "this submitter / IP tried the same payload N times" or match against a known-bad list.bundle_purged_at— TTL stamp, surfaces as "Bundle purged on YYYY-MM-DD" on the detail page once the bytes are gone. Two operator knobs underguardrails:ininstance.yaml:blocked_bundle_ttl_days(default 30; set to 0 to retain forever) andblocked_quota_per_day(default 50; per-submitter cap on rejected uploads in trailing 24h, returns 429quota_exceededonce exceeded). New scheduler jobstore-blocked-purgeruns daily at 04:00 UTC againstPOST /api/admin/run-blocked-purge. Override no longer 409s on inline-blocked submissions — flow is uniform with blocked_llm. Detail page also shows an Activity timeline pulled fromaudit_logso admins can confirm a verdict is fresh after Rescan / Retry. Seedocs/STORE_GUARDRAILS.md.
-
PostHog snippet middleware preserves
Response.backgroundon every return path so anyBackgroundTask/BackgroundTasksattached to an HTML route still fires once the integration is enabled (PR #231 review by minasarustamyan).BaseHTTPMiddlewarematerialises the body and asks subclasses to return a freshResponse; the previous implementation droppedbackgroundon three paths, silently cancelling deferred audit logging / async webhooks / email sends with no log line. Also adds a_MAX_BUFFER_BYTES(4 MB) cap so a streamed-HTML response can't balloon RSS — bigger bodies short-circuit through with a warning instead of being buffered. Regression tests intests/test_posthog_inject_middleware.pyexercise the four return paths plus the streaming guard. -
POSTHOG_LLM_PAYLOAD_MAX_CHARS(default 30000) clips$ai_input/$ai_output_choicesbefore they hit PostHog so oversized prompts don't get silently dropped at ingest. PostHog's per-event ceiling is ~32 KB and the SDK does not chunk; Agnes prompts routinely include sample rows / table schemas / analyst SQL that exceed it, and unbounded payloads landed exactly the calls operators wanted to inspect on the floor (PR #231 review by minasarustamyan). Truncated payloads carry an explicit…[truncated N chars]marker so a reader doesn't mistake them for a complete capture; metadata (provider, model, tokens, latency, error) flows regardless. Override the cap via the env var. -
PostHog event-level user attributes so a reviewer reading an event in PostHog sees who the user was inline, without clicking through to the person profile. Backend
capture_exceptionmergesuser_id/user_email/user_name(perPOSTHOG_IDENTIFY_PII) into the event properties; browser snippet registers the same keys as super-properties viaposthog.register({...})so every client-side event includingposthog.captureException()carries them. -
/api/debug/throwdebug-only endpoint for verifying observability wiring end-to-end. Gated byDEBUG=1(404 in production), runs afterDepends(get_current_user)sorequest.state.useris populated, then raises a configurable exception (?kind=ValueError&msg=…). Use to confirm PostHog receives the exception with full user context attached, not justrequest_id. -
PostHog
environment+releasesuper-properties on every event. Resolved at startup asPOSTHOG_ENVIRONMENT(explicit) →localwhenLOCAL_DEV_MODE=1→RELEASE_CHANNEL→AGNES_DEPLOYMENT_ENV→unknown. Backend events get them via the SDK'ssuper_properties; browser events get them viaposthog.register({...})in the loaded callback. Filtering PostHog dashboards byenvironment = productioncleanly hides traffic from developer laptops, CI, and staging deployments.releasefalls back fromAGNES_VERSIONtoRELEASE_CHANNEL. -
request.state.userpopulated by auth dependencies so response-phase middleware (PostHog snippet injector, 500 handler) can identify the actor without re-running the auth dependency. Adds an_stash_userhelper inapp/auth/dependencies.pycalled from every successful resolution path (LOCAL_DEV_MODE seeded user, scheduler shared-secret, PAT/JWT). The browserposthog.identify(user_id, {email})call now actually fires for logged-in users. -
Optional PostHog observability integration. Off by default; activates only when
POSTHOG_API_KEYis set in the environment. Covers backend exception capture (FastAPI 500s +src/orchestrator.pyrebuild failures +services/scheduler/HTTP-job failures +cli/main.pyuncaught CLI errors), LLM call tracing ($ai_generationevents with provider, model, latency, and token counts; prompt / completion bodies stay off unlessPOSTHOG_LLM_PAYLOADS=1because LLM prompts in this product routinely include customer data), frontend errors +$pageview/$pageleave, masked session replay (maskAllInputs: trueplus a CSS-selector mask for known data surfaces), and feature flags (server-sideis_feature_enabled+ browserposthog.isFeatureEnabled). Defaults to PostHog Cloud EU (https://eu.i.posthog.com) — override withPOSTHOG_HOSTfor US Cloud or a self-hosted endpoint. Identification mode is operator-configurable (none/id/email/full); defaultemailshipsuser.id+ email but never name. The browser snippet is injected by an HTML-rewrite middleware (app/middleware/posthog_inject.py) so it reaches everytext/htmlpage including standalone templates that don't extendbase.html— registered inside the GZip layer so it sees uncompressed HTML before compression. CLI entry point moved fromcli.main:apptocli.main:main(Typer wrapper that captures uncaught exceptions, flushes, and re-raises). New filesrc/observability/posthog_client.py(lazy singleton, no network when disabled),src/observability/llm_tracing.py($ai_generationcontext manager),app/web/templates/_posthog.html(browser snippet template). Seedocs/observability.mdfor the operator guide andconfig/.env.templatefor the env-var reference. -
New
/marketplacebrowse page combining curated marketplaces with the community Flea Market in a single discovery + install surface. Three tabs (Curated / Flea / My Stack), per-tab category filter with inline SVG icons (Heroicons MIT, no new dependency, insrc/category_icons.py), Flea-only type filter, search across both sources with Curated/Flea scope checkboxes, numeric pagination — all with URL state via query string. Detail pages live at/marketplace/flea/<id>and/marketplace/curated/<slug>/<plugin>. Curated detail returns 403 without the RBAC grant. Plugin detail surfaces inner skills/agents as clickable nested cards (/marketplace/curated/<slug>/<plugin>/{skill,agent}/<name>); commands/hooks/MCPs render as plain name lists. Guide pages at/marketplace/guide/{curated,flea}host the publication-flow placeholder for full copy to be authored separately. -
New REST router under
/api/marketplace(inapp/api/marketplace.py):GET /itemsper-tab listing,GET /categoriesper-tab counts,GET /curated/{slug}/{plugin}detail,POST/DELETE /curated/{slug}/{plugin}/installsubscribe/unsubscribe,GET /curated/{slug}/{plugin}/{skill,agent}/{name}for inner items. -
marketplace_plugins.created_atcolumn for "newest first" sorting on/marketplace.MarketplacePluginsRepository.replace_for_marketplaceswitched from delete-and-insert to upsert socreated_atsurvives across syncs. -
Redesigned
/marketplace/curated/<slug>/<plugin>/{skill,agent}/<name>and the flea-side/marketplace/flea/<entity_id>skill / agent detail pages (app/web/templates/marketplace_item_detail.html). Width matches the plugin detail page (1280 px), light surface hero with kind-tinted accents (skill = green, agent = purple — matching the marketplace cards), Description + Details sidebar, Docs section (flea only — curated docs deferred until per-skill / per-agent metadata YAML lands) and Files section walking the bundle on disk. Curated nested has no install button — instead a "Open parent plugin →" link with helper text noting the install happens at the parent plugin level. -
Curated skill / agent detail pages now render the "How to call it" copy-able invocation chip (previously flea-only). Curated items show
/<plugin-manifest-name>:<inner-name>— the exact namespace Claude Code applies after install. Themanifest_nameis read from the parent plugin's own.claude-plugin/plugin.jsonvia the now-publicsrc.marketplace_filter.resolve_manifest_name, matching the synthmarketplace.jsonAgnes serves so the chip and the post-install slash command stay in sync. Surfaced via a newmanifest_namefield onInnerDetailResponse(app/api/marketplace.py). -
Per-tab info blocks above the filter row on
/marketplace: curated trust signal ("Each plugin here has a named curator accountable for it.", blue accent), flea open-shelf signal ("Anyone in the company can upload here.", purple accent + Tips-for-sharing link), My Stack personal-shelf orientation ("Your AI stack — everything you've added.", slate accent, no link). -
Hero illustration anchored to the right of the blue hero panel (absolute, 47% wide, behind the search row content); hidden under 900px viewport. New asset at
app/web/static/marketplace-cover.png. -
Per-tab Heroicons next to each tab title (shield-check for Curated / building-storefront for Flea / rectangle-stack for My Stack), tinted to match each tab's accent; flips white when the tab is active.
Changed
- Flea /
/store/newupload allowlist tightened. Document uploads (docs[]) restricted to PDF (.pdf), Markdown (.md,.markdown), and plain text (.txt); anything else returns HTTP 415unsupported_doc_type. Photo uploads keep their existing extension allowlist (.jpg/.jpeg/.png/.webp) but now also pass through a body-level magic-bytes check (PNG signature, JPEG\xff\xd8\xff, WEBPRIFF…WEBP, PDF%PDF) so a renamedpayload.pngcarrying SVG XML or arbitrary bytes can't smuggle through. SVG photos remain rejected (XSS via inline<script>). The wizard's file inputs now carry matchingacceptattributes plus a JS sanity-check that surfaces an inline message before submit. Same allowlist (insrc/marketplace_asset_validation.py) is enforced on the curated mirror side so the two surfaces stay aligned. - BREAKING: Schema bump v30 → v31 renames
session_extraction_state→session_processor_statewith composite PK(processor_name, session_file)so multiple processors can track their own processed-set independently. Existing rows are copied across withprocessor_name='verification'and the old table is dropped. TheKnowledgeRepository.is_session_processed/mark_session_processedhelpers are removed — sessions bookkeeping now lives inSessionProcessorStateRepository. The session-state-awareis_processedcheck now comparesfile_hashso a session jsonl that grows (live append from an active Claude Code session) gets reprocessed on the next tick — previously the file_hash was stored but never read back. - BREAKING:
POST /api/admin/run-verification-detectoris dropped in favor ofPOST /api/admin/run-session-processor?processor=verification. Audit action renamesrun_verification_detector→run_session_processor:verification. The schedulerJOBSlist reflects the new endpoint; no operator action required if the only caller is the in-tree scheduler. The legacydry_runflag (no real callers outside the dropped CLI shim) is gone. services/scheduler/__main__.pyJOBS —verification-detectorentry replaced by two new entries:session-processor:verificationandsession-processor:usage. New env varSCHEDULER_USAGE_PROCESSOR_INTERVAL(default 600s);SCHEDULER_VERIFICATION_DETECTOR_INTERVALis retained (still drives the verification cadence AND the health-check grace window inapp/api/health.py) for operator compatibility with existing docker-compose env files.services/verification_detector/detector.pyis reduced to LLM-side helpers (_generate_id,_format_turns,extract_verifications); the orchestration loop moves intoVerificationProcessorinservices/session_processors/verification.py. The CLI (python -m services.verification_detector) still works — it now constructs the processor and runs the sharedrun_processorrunner.app/api/health.py_check_session_pipelinenow queriessession_processor_state WHERE processor_name='verification'instead ofsession_extraction_state(same heuristic, scoped explicitly to the verification processor).app/web/router.py/profile/sessionsjoin target updated tosession_processor_state(verification rows).SCHEDULER_AUDIT_ACTIONSupdated to include the new per-processor audit actions.- Marketplace UI rebrand:
+ Install→+ Add to my stack,✓ Installed→✓ In your stack, card "Installed" badge → "In stack" (amber pill),My Subscriptionstab →My Stack. Bridges the conceptual gap between "saved on the server" (what the click does) and "installed on my laptop" (what users assumed). Same vocabulary now consistent across/marketplace,/store/<id>detail, navbar link, and the post-add hint panel. - Plugin and skill/agent detail pages now show an inline post-add hint panel after a successful "Add to my stack" click: green-bordered block under the description with a 2-step recipe ("open new Claude Code session" or run
agnes refresh-marketplace+/reload-plugins), Copy button on the command, "Don't show again" dismiss persisted inlocalStorage. Removes the dead-end where users clicked Install, saw "Installed", opened Claude Code, and found nothing. - Action-row CTA on
/marketplace: curated tab[How to add new content]→[Submit a plugin], flea tab[How to add new content]removed (the+ Uploadbutton next to it already covers self-service publishing — second CTA was redundant). Empty-state CTAs aligned: curated empty state links toSubmit a plugin →, flea empty state shows only+ Upload. Guide page titles updated toSubmit a plugin to Curated Marketplace/Upload to Flea Market. - Skill/agent detail page (curated nested) helper text changed from "To install, install the plugin." to "Add the bundle to your stack to use it." for terminology consistency.
- BREAKING: Curated marketplace plugins no longer auto-appear in a user's served marketplace on RBAC grant (Model B opt-in). Users must explicitly Install each curated plugin from
/marketplace.resolve_user_marketplacecomposition changes from(rbac ∖ opt_outs) ∪ store_installsto(rbac ∩ subscriptions) ∪ store_installs. Existing users will see an empty served marketplace until they re-install previously-granted curated plugins; no auto-migration of prior preferences is performed. - The
user_plugin_optoutsDB table is reused for Model B subscriptions — table and column names are kept (no DDL rename) to avoid migration churn on running operator instances. The v28 migration wipes existing rows since the semantic inverts (presence used to mean "excluded", now means "subscribed"). The Python repository is renamedUserPluginOptoutsRepository→UserCuratedSubscriptionsRepository(insrc/repositories/user_curated_subscriptions.py) with method names flipped tosubscribe / unsubscribe / is_subscribed / subscribed_set / list_for_user / delete_for_plugin / delete_for_marketplace. /api/marketplace/items?tab=myand/categories?tab=myread directly fromuser_curated_subscriptions ∪ user_store_installs(notresolve_user_marketplace, which bundles flea skills/agents into a singlestore-bundlesynthetic entry useful for serving the Claude Code marketplace ZIP/git but wrong for browsing where each item should appear as its own card)./my-ai-stackcurated toggle is now a subscribe/unsubscribe action against the renamed repository (UX — toggle on/off — unchanged; default state is now off)./admin/marketplacesDELETE cleanup now also dropsuser_plugin_optoutsrows for that marketplace so a re-registered slug doesn't inherit stale subscribe state.- Navbar: standalone "My AI Stack" relabelled "My Stack" and points at
/marketplace?tab=my; "Store" link removed (Store flow is reachable via the Flea Market tab's+Uploadbutton). The standalone/my-ai-stackand/storeroutes still work for old bookmarks. GET /api/marketplace/curated/<slug>/<plugin>/{skill,agent}/<name>(InnerDetailResponse) now also returnsmarketplace_name,category,parent_author_name,parent_updated_at,bundle_size, andfiles(recursive listing with sizes) so the redesigned detail page can render the hero badges, sidebar, and Files section without a second roundtrip.GET /api/marketplace/flea/<entity_id>/detailandGET /api/marketplace/curated/<slug>/<plugin>(PluginDetailResponse) now also returnfiles,docs,install_count, andowner_display(friendly name resolved viausers.name → email → owner_username, mirroring/store/<id>).
Security
GET /api/marketplace/curated/<slug>/<plugin>/{skill,agent}/<name>now containment-checks the resolved file path againstplugin_rootvia a new_safe_joinhelper (resolve(strict=True)+relative_to). The direct URL exploit was already blocked by Starlette's[^/]+path-param regex, but a curator-planted symlink inside a curated marketplace's git mirror could previously dereference outside the plugin tree on read. Now centralized so_read_inner, the skillfileswalk, and the agentstatcall all share the same boundary.
Fixed (PR #232 review)
services/scheduler/__main__.pytick loop is now parallel + advanceslast_runon terminal state. Pre-fix it was a synchronousfor-loop + httpx.post(timeout=900)— a 10-minute verification run blocked every other job (data-refresh,health-check,usage,corporate-memory) for the entire window. The PR's stated isolation guarantee ("slow / failing processor cannot block any other") only held insideservices/session_pipeline/runner.py; the scheduler dispatch layer broke it. Pre-fixlast_runalso only advanced on success, so a permanently failing job was retried every 30s tick instead of on its 15-min cadence (30× the configured request rate + LLM tokens). Replaced withThreadPoolExecutor.submitper due job + per-job in-flight set so a long-running job can't be re-launched on subsequent ticks._run_jobextracted to a module-level helper so the bookkeeping is unit-testable.SessionProcessorStateRepository.scan_unprocessed_forhad a deadif/elsewhere both branches surfaced every jsonl, making theSELECT session_file FROM session_processor_stateround-trip pointless and forcing the runner to MD5-rehash every stable session on every scheduler tick. Replaced with an mtime precheck: stable sessions (mtime <= processed_at) are filtered at scan and the runner never reads or hashes them. Files modified since the last run still surface for the runner's authoritativefile_hashinvalidation.POST /api/admin/run-session-processornow takes a per-processor advisory lock (threading.Lockkeyed by name) before invoking the runner. Two trigger paths exist for the same processor (scheduler tick + manual admin POST); without serialization, overlapping runs would re-process the same/data/user_sessions/*set, double-call the LLM, and pile up duplicateverification_evidencerows (the dedup short-circuit only catches the create+contradiction branches, notcreate_evidence, per ADR Decision 3). Concurrent invocation returns HTTP 409 Conflict so the operator sees what happened instead of stacking behind a long-running tick. Lock releases unconditionally infinally:so a runner exception can't wedge the processor permanently.
Internal
- Schema bump v31 → v32 — adds
curator_name,curator_emailtomarketplace_registryandcover_photo_url,video_url,doc_links(JSON) tomarketplace_plugins. Migration is pureALTER TABLE … ADD COLUMN IF NOT EXISTSand idempotent against fresh installs that come up via test fixtures at a pre-v32 version. Fresh-install schema in_SYSTEM_SCHEMAcarries the new columns. - New shared validation module
src/marketplace_asset_validation.pyexporting allowlist constants, body-level validators (validate_doc_file,validate_image_file), HTTP-response validators (accept_doc_response,accept_image_response), and theparse_doc_link/parse_cover_photo_refhelpers used by the agnes-metadata.json parser. Single source of truth for "what types we accept" across curated sync and Flea upload flows. - New
src/marketplace_metadata.py(lenient parse + per-plugin / per-skill resolution) andsrc/marketplace_asset_mirror.py(HTTP fetch with conditional GET, manifest persistence, SSRF guards). The mirror is invoked fromsrc/marketplace.py::_refresh_plugin_cacheafterread_plugins; failures never abort the sync. New helperapp.utils.get_marketplace_cache_dir()for the on-disk cache root. src/marketplace_filter.is_agnes_only_path(public) + matching strip inapp/marketplace_server/packager.py::_collect_members,app/marketplace_server/git_backend.py::file_set_for_user, andcompute_etaginmarketplace_filter. ETag stays stable across additions/removals of Agnes-only files so user-side caches don't bust on enrichment-only changes.- New tests
tests/test_marketplace_metadata.py,tests/test_marketplace_asset_mirror.py,tests/test_marketplace_synth_strip.py,tests/test_marketplace_v32_endpoints.py. Existingtests/test_marketplace.pyextended with curator validation + round-trip;tests/test_db_schema_version.pyupdated for v32 + new column presence. services/session_processors/verification.py:build_verification_processorfactory mirrors the lazy LLM-extractor construction previously inlined inapp/api/admin.run_verification_detectorandservices/verification_detector/__main__. Single source of truth for processor instantiation.- Schema bumped v27 → v28 (
DELETE FROM user_plugin_optoutsfor the semantic flip +marketplace_plugins.created_atwithregistered_atbackfill). - New tests
tests/test_marketplace_api.py(browse, categories, install/uninstall, RBAC 403,_safe_joincontainment). Existingtests/test_marketplace_filter_store.py,tests/test_marketplace_server_zip.py,tests/test_marketplace_server_git.py,tests/test_store_api.py,tests/test_store_repositories.pyupdated for Model B (explicit subscribe in fixtures).
Added (home + news work)
- State-aware
/homelanding page — alternative to/dashboardfor not-onboarded users. Inline 3-step install (Claude Code via OS-tabbed installer,agnes pullbootstrap, optional auto-accept mode), one-click "Setup a new Claude Code" CTA that mints a 90-day PAT and copies a ready-to-paste setup script to the clipboard, and connector-card prompts for Asana / Google Workspace / Atlassian. Onboarded users see a hero + green-check completion badge; install steps + connectors stay visible below for adding another machine or connecting more services. Manual reload picks up the flip afteragnes initPOSTs/api/me/onboarded. - News section on
/home+/newspermalink +/admin/newseditor — admin-edited rich content (intro at the bottom of/home, full body on/news). Single versioned entity in the newnews_templatetable (schema v30). Every save creates / updates a draft; admin must publish a draft before it goes live; older versions stay browsable; concurrent edits surface as 409 conflicts (expected_versionquery param + CLI--versionflag) instead of silently overwriting. Drafts and superseded published versions older than 30 days are pruned on save; the currently-displayed published version is never pruned. POST /api/me/onboarded— flipsusers.onboardedfor the calling user (idempotent, audit-logged withsource ∈ {agnes_init, self_acknowledged, self_unmark}). Optionalonboardedbody field toggles the flag back to FALSE for the "Mark me as offboarded" button on the post-onboarding /home view./setup-advancedpage — second-hour reference covering VS Code layout, recommended plugins, multi-model second opinions, custom skills/rules/hooks, plus a YOLO-mode warning section.agnes admin newsCLI —show,draft,edit,publish,unpublish,versions,export. Talks to/api/admin/news/*endpoints (PAT-authed) so it coexists with a running uvicorn. Optimistic-lock guard via--version N(publish) and--expect-version N/--force(edit).agnes onboarded {on,off,status}CLI — self-scoped flag toggle, equivalent to the in-page button on/home. POSTs/api/me/onboardedwith{onboarded: bool, source: 'self_acknowledged' | 'self_unmark' | …}; the--sourceflag overrides the default source string for audit_log distinction (CLI vs web button vsagnes initautomation).- Schema v29 (instance_templates singleton consolidation +
users.onboarded) → v30 (news_templateversioned). Legacywelcome_template+claude_md_templaterows migrate into the consolidatedinstance_templatestable; the legacy tables are dropped post-migration. Repository APIs preserved. - Configurable home route —
AGNES_HOME_ROUTEenv (Terraform-friendly) >instance.home_routeYAML > default/dashboard. Allowlist-validated. Auth callbacks (Google OAuth, magic-link, password form, LOCAL_DEV_MODE) honor the resolved route —safe_next_path(default=None)resolves toget_home_route(). - Configurable Google Workspace CLI OAuth client —
AGNES_GWS_*env >instance.gws.*YAML > unset. When set, /home's GWS connector prompt skipsgws auth setupand writesclient_secret.jsondirectly with the operator's pre-provisioned OAuth app. GWS scope set widened to includechat.spaces+chat.messages. - Connector setup prompts (Asana / GWS / Atlassian) precheck whether the tool is already installed/connected before re-running setup.
.news-hero/.callout-{info,warn,success,danger}/.video-embed/.news-section/.news-grid-{2,3}/.news-ctaauthor CSS vocabulary — single shared block instyle-custom.css("News content vocabulary (shared)") used by /home perex, /news body, and the /admin/news preview. Documented indocs/operator/news-content-guide.md. Iframe host allowlist (YouTube / Vimeo / Loom) enforced bynh3-backed sanitizer insrc/sanitize_news.py.nh3>=0.2dependency for the news sanitizer; closes the bypass shapes flagged on the legacy regex sanitizer insrc/welcome_template.py(the legacy path is left alone in this PR).scripts/dev/run-local.sh— local uvicorn launcher. Pulls Google OAuth client id/secret from GCP Secret Manager (AGNES_OAUTH_GCP_PROJECT-driven, no vendor defaults), pointsAGNES_CLI_DIST_DIRat./distso the wheel endpoint resolves, and--devflipsLOCAL_DEV_MODE=1+AGNES_HOME_ROUTE=/homefor one-command iteration.
Changed (home + news work)
dashboard.htmlnow extendsbase.htmlvia the new{% block layout %}opt-out (full-width pages skip the 800px.container). One shell, one place to fix chrome bugs.style-custom.css:rootextended with--space-{7,9,10,12},--radius-2xl,--shadow-{card,elevated},--text-{muted,disabled},--focus-ring,--transition-*,--width-{narrow,app,wide}so inline page styles can migrate incrementally.LOCAL_DEV_MODE=1now also enables the FastAPI debug toolbar (was gated onDEBUG=1separately; every local-dev session wants both).
Internal (home + news work)
- Schema bumped v28 → v29 → v30. New tests: news repository (14), sanitizer (20), API (8), web (5), CLI (14) — 61 total — plus updated home/auth/template tests for the shared-shell architecture. CLAUDE.md "Run tests before every push" section codifies
pytest tests/ -n auto -qas non-negotiable before each push.
Fixed (system DB shutdown)
close_system_db()now CHECKPOINTs before closing the system DB connection, so the WAL flushes intosystem.duckdband the file is left in a clean state acrossdocker compose up -drecreate windows. Previously, a SIGKILL after the default 10sstop_grace_periodcould leave a populated.walthat the next process must replay on open; if the next image carried a different DuckDB version, replay could trip an internal assertion (Failure while replaying WAL ... GetDefaultDatabase with no default database set) and 500 every authed request until the WAL file was manually removed. CHECKPOINT is best-effort with operator-visible logging —WARNINGon failure,DEBUGon success.
Changed (compose grace)
docker-compose.ymlstop_grace_period: 60son theappandschedulerservices (was Docker's 10s default). Gives uvicorn time to drain in-flight requests + run the new shutdown CHECKPOINT before SIGKILL. Healthydocker compose downis unaffected (services still stop as soon as their lifespan exits).
[0.47.4] — 2026-05-08
Fixed
services/session_collectorno longer logs "Collection complete: 0 users, 0 files copied" + "Group 'data-ops' not found" every 10 minutes in the Docker layout where/home/*/user/sessions/doesn't exist. New env varAGNES_SKIP_LEGACY_COLLECTOR=1(set by default indocker-compose.yml) short-circuits the collector pass. The bare-VM deployment path (where /home/* IS populated by Claude Code) leaves this unset and continues to scan + log normally — including the data-ops warning, which is load-bearing for catching missing-group mis-deploys.agnes diagnosesession_pipelinecheck gains a FIFO-aware lookup: in addition to the existing MAX(processed_at) comparison (catches "detector hasn't run lately"), it now flags the case where an OLD jsonl never got processed even though newer ones did (= verification-detector skipped a file). Threshold defaults to 4× the verification-detector grace (= 2h with default 30min grace) and is configurable viaSESSION_PIPELINE_STUCK_FILE_GRACE_SECONDS. Severity intentionally starts atinfo— operators can tighten towarningonce they have prod data on false-positive rate.
[0.47.3] — 2026-05-07
Fixed
agnes self-upgrade(without--force) previously read the local 24hupdate_check.jsoncache to decide whether an upgrade was needed — meaning that for up to 24 hours after a server-side version bump, the explicitagnes self-upgradecommand exited silently as a no-op even though a newer wheel was available. Cache is now always invalidated for the explicit command (the cache still gates the implicit warning loop in the root callback to avoid hammering/cli/lateston everyagnes <anything>invocation). Surfaced when a server bump 0.47.1 → 0.47.2 didn't trigger client-side upgrade.
[0.47.2] — 2026-05-07
Fixed
- Restore #218 (real BQ error surfacing in
remote_estimate_failed) and #219 (friendlier missing-table hint inagnes query) — both fixes were silently reverted by the squash merge of #217 because that branch carried stale snapshots ofapp/api/query.pyandcli/commands/query.pyfrom before #218 and #219 merged. Verified end-to-end against production:agnes query --remote "SELECT FROM unit_economics WHERE bad_col=1"now returns the BQ "Unrecognized name" diagnostic;agnes query "DESCRIBE unit_economics"now appends the remote-table hint.
[0.47.1] — 2026-05-07
Keboola connector v27 — incremental, partitioned, where_filters, typed parquet.
Added
query_mode='local'for Keboola is back — admins can opt specific tables out of the v26 materialized default and into a per-table sync-strategy dispatcher (full_refresh / incremental / partitioned). The radio sits in the/admin/tablesEdit modal; metadata stored in seven newtable_registrycolumns (see schema v27 below).- Three Keboola sync strategies:
full_refresh(default): full-table export-async, replaces the on-disk parquet atomically. Same shape as the v26 materialized default.incremental: delta export byincremental_column(timestamp), merge into existing parquet keyed by primary_key. New_convert_columnpath coerces string-typed deltas to the existing parquet's typed columns; PK conversion failure now raises hard (was silent mixed-type column → broken dedup).partitioned: per-partition export bypartition_by(date/timestamp column),partition_granularity(DAY / MONTH / YEAR), withinitial_load_chunk_daysfor backfill. Each partition lives in its own parquet underdata/<table>/partition_<value>.parquet.
where_filtersper table — JSON list of column-value predicates injected as Storage API export filters; lets admins narrow a wide source table at the connector edge.- Typed parquet writes — Keboola Storage API exports are CSVs with all string columns; the new pipeline reads the table schema (column types) via
get_table_infoand coerces each column to its target dtype before writing parquet. Types preserved acrossagnes pullinstead of every analyst seeing strings.
Changed
- Schema v26 → v27. Auto-migration adds the seven new columns to
table_registry:incremental_window_days,max_history_days,incremental_column,where_filters,partition_by,partition_granularity,initial_load_chunk_days. NULL on existing rows; meaningful only when paired with the matching strategy. Pre-existingsync_strategycolumn (default'full_refresh') is now load-bearing — pre-v27 it was inert catalog metadata; post-v27 the Keboola extractor dispatches off it. PUT /api/admin/registry/{id}changed from{k: v for k, v in request.model_dump().items() if v is not None}torequest.model_dump(exclude_unset=True). Semantic shift: previously, sending explicitnullin the request body was silently ignored (field kept its existing value); now explicitnullpropagates as a real null update. Intentional — the v27 Edit modal needs to clearincremental_columnetc. when an admin switches strategy fromincrementalback tofull_refresh. Inline comment + regression test pin the new behavior.
Fixed (Devin Review)
- Schema docs in CLAUDE.md updated from v25 to v27, with v25→v26 and v26→v27 migration entries describing what each version adds.
update_tableexclude_unset semantic shift documented inline;test_api_put_clears_v26_fields_on_strategy_switchpins the explicit-null-propagates behavior.incremental.py:_convert_columnfailure on primary_key column now raises hard (was silent mixed-type column → broken dedup downstream). Test added.
[0.47.0] — 2026-05-07
Catalog metadata enrichment + cache discipline + automatic warmup. Closes #155 + #156.
Added
/api/v2/catalogreturns four new optional fields per row —rows,size_bytes,partition_by,clustered_by— populated by per-source-type metadata providers (connectors/bigquery/metadata.py,connectors/keboola/metadata.py). Forquery_mode='remote'BigQuery rows,size_bytesisactive_logical_bytes + long_term_logical_bytes(a full scan reads both); region resolved fromdata_source.bigquery.location→bq_client.get_dataset(...)→ fall back to legacy__TABLES__. Existing CLI consumers reading onlyrough_size_hintare unaffected.- Automatic cache warmup at startup. FastAPI startup event schedules
a background task that walks BQ remote rows and pre-populates
_metadata_cache+_schema_cachewith bounded concurrency (default 4, tunable viaAGNES_WARMUP_CONCURRENCY). Doesn't block readiness; per-row failures logged + skipped. Opt-out viaAGNES_SKIP_CACHE_WARMUP=1. - Three new admin endpoints under
/api/admin/cache-warmup/*:GET /status— JSON snapshot of the latest run.POST /run— manual trigger, idempotent under concurrent invocation.GET /stream— Server-Sent Events withstart/row/completeevents for live UI updates.
/admin/tablescache freshness panel. Toolbar above the per-source-type listings with progress bar + "Re-warm all" button + collapsible terminal-style log fed by SSE (polling fallback at 3 s). Per-row badge in the existingcol-statuscolumn updates live (fresh / warming / pending / error).docs/admin/query-modes.md— source-agnostic admin reference for registering tables aslocal/remote/materialized. Decision tree, per-source-type IAM + setup, three worked examples. Linked from the?icon next to thequery_modefield in the admin UI edit modal and from the third post-register hint inagnes admin register-table.agnes admin register-tablepost-register hint forquery_mode=remote: points atagnes query --remote "SELECT COUNT(*)..."as the IAM smoke check so a missingdataViewer/jobUsersurfaces at registration time, not 30 minutes later.
Changed
/api/v2/schema/{id}cache miss now does 1 BQ job instead of 2.connectors/bigquery/access.py:fetch_bq_columns_fullcollapses what used to be_fetch_bq_schema+_fetch_bq_table_optionsinto a singleINFORMATION_SCHEMA.COLUMNSquery (same view, same predicate, just a combined SELECT list). The metadata provider's partition/cluster path shares the same helper — zero SQL duplication across the two consumers.- All four catalog/schema/sample/metadata caches are flushed on registry
change.
app/api/v2_catalog.py:invalidate_for_tableis wired intoPOST /api/admin/register-table,PUT /api/admin/registry/{id}, andDELETE /api/admin/registry/{id}. After a registry write, a single-row re-warm task is scheduled in the background so the admin's verification request hits warm caches within ~1 s instead of waiting for the next analyst miss. Pre-fix none of the caches were invalidated — admin registers a table,agnes catalogdoesn't show the new row for up to 5 min; admin updates a row's bucket,agnes schemareturns the OLD column list for up to 1 hour. v2_schema.build_schemasplit into RBAC-aware outer + RBAC-naive inner (build_schema_uncached). Live endpoint behavior unchanged; warmup uses the inner entry point to populate_schema_cachewithout a user context.
Internal
- New shared dataclass module
app/api/_metadata_models.pywithMetadataRequest(frozen) +TableMetadatafor source-agnostic provider input/output. - New
connectors/keboola/storage_api.py:KeboolaStorageClient.get_table_infothin wrapper — keeps_getprivate to the module. - New env vars (operator-facing tuning, no required setup change):
AGNES_SKIP_CACHE_WARMUP— opt-out of startup warmup.AGNES_WARMUP_CONCURRENCY— default 4, max parallel BQ INFORMATION_SCHEMA jobs during a warmup pass.
- New runtime dependency:
sse-starlette>=2.0(Server-Sent Events responses for the cache-warmup stream). - Tests added:
test_metadata_models,test_v2_schema_columns_consolidation,test_v2_catalog_dispatcher,test_connectors_bigquery_metadata,test_connectors_keboola_metadata,test_v2_catalog_remote_metadata,test_v2_catalog_invalidation,test_cache_warmup,test_main_startup_warmup,test_admin_tables_warmup_ui.
[0.46.5] — 2026-05-07
Fixed
agnes describe <table> -n 5previously failed withMissing argument 'TABLE_ID'because the command was registered as aTyper.Typersubcommand group; the combination of positionaltable_id+ short option-n INTEGERmis-parses in that pattern. Switched to a flat@app.command("describe")registration. All forms (-nbefore/after positional,--rows=N, default n=5) now parse correctly. Surfaced from a real analyst session following the CLAUDE.md "agent rails" discovery workflow./api/v2/sample/<id>(called byagnes describe) returned HTTP 500 withValueError: Out of range float values are not JSON compliant: nanwhen the result rows contained NaN values from the underlying DuckDB / BigQuery scan. The endpoint now sanitizes NaN/±inf to JSONnullbefore serialization. Same surfaced from a real analyst session.
[0.46.4] — 2026-05-07
Fixed
- SessionEnd
agnes pushhook previously synchronous-ran in the foreground; Claude Code's-p(headless) mode terminates SessionEnd hook subprocesses after ~1 second regardless of work in progress, so the upload was killed mid-stream and most session JSONLs never reached the server. Now wrapped inbash -c "( nohup agnes push ... & ) ; true"so the upload child detaches from the hook subprocess and survives Claude's aggressive shutdown. Existing workspaces pick up the detached form on their nextagnes initinvocation via the existing migration path. Verified end-to-end against production:claude -pexited in 5s, the detached child completed the upload, and the session JSONL landed on the server within 30s.
[0.46.3] — 2026-05-07
Added
agnes initnow installs a third SessionStart hook entry (agnes push --quiet) so orphan session JSONLs left behind byclaude -pheadless invocations (where Claude Code does NOT fire SessionEnd) or abnormal exits get uploaded on the next interactive session start. Symmetric self-healing alongside the existingagnes pullSessionStart entry. Existing workspaces pick up the third entry on their nextagnes initinvocation via the existing migration path incli/lib/hooks.py:_OUR_COMMAND_MARKERS.
Fixed
agnes diagnosesession_pipelinewarning previously read "uploads are not being processed", which led users to suspect theiragnes pushuploads were failing. The warning now reads "verification-detector backlog" and includeslast_processedso operators see at a glance that uploads are fine and only the LLM extraction step is behind.
[0.46.2] — 2026-05-07
Fixed
agnes queryagainst aquery_mode='remote'table previously surfaced DuckDB's misleading "did you mean " suggestion. Now appends a friendlier hint pointing users toagnes catalog,agnes schema <id>, andagnes query --remote. Reproduces from a real analyst session whereDESCRIBE unit_economics(a remote table) sent the user down a 30-second wrong path.
[0.46.1] — 2026-05-07
Fixed
remote_estimate_failednow surfaces the rewritten-SQL diagnostic (the actual BQ "Unrecognized name" / "Syntax error" message) instead of the unhelpful "Table must be qualified" from the user-original-SQL retry. Addsunderlying_originalfor the second-attempt context. Hint now points users toagnes schema <id>first — the typical cause is a typo'd column name.
[0.46.0] — 2026-05-07
Catalog metadata enrichment + cache discipline + automatic warmup. Closes #155 + #156.
Added
/api/v2/catalogreturns four new optional fields per row —rows,size_bytes,partition_by,clustered_by— populated by per-source-type metadata providers (connectors/bigquery/metadata.py,connectors/keboola/metadata.py). Forquery_mode='remote'BigQuery rows,size_bytesisactive_logical_bytes + long_term_logical_bytes(a full scan reads both); region resolved fromdata_source.bigquery.location→bq_client.get_dataset(...)→ fall back to legacy__TABLES__. Existing CLI consumers reading onlyrough_size_hintare unaffected.- Automatic cache warmup at startup. FastAPI startup event schedules
a background task that walks BQ remote rows and pre-populates
_metadata_cache+_schema_cachewith bounded concurrency (default 4, tunable viaAGNES_WARMUP_CONCURRENCY). Doesn't block readiness; per-row failures logged + skipped. Opt-out viaAGNES_SKIP_CACHE_WARMUP=1. - Three new admin endpoints under
/api/admin/cache-warmup/*:GET /status— JSON snapshot of the latest run.POST /run— manual trigger, idempotent under concurrent invocation.GET /stream— Server-Sent Events withstart/row/completeevents for live UI updates.
/admin/tablescache freshness panel. Toolbar above the per-source-type listings with progress bar + "Re-warm all" button + collapsible terminal-style log fed by SSE (polling fallback at 3 s). Per-row badge in the existingcol-statuscolumn updates live (fresh / warming / pending / error).docs/admin/query-modes.md— source-agnostic admin reference for registering tables aslocal/remote/materialized. Decision tree, per-source-type IAM + setup, three worked examples. Linked from the?icon next to thequery_modefield in the admin UI edit modal and from the third post-register hint inagnes admin register-table.agnes admin register-tablepost-register hint forquery_mode=remote: points atagnes query --remote "SELECT COUNT(*)..."as the IAM smoke check so a missingdataViewer/jobUsersurfaces at registration time, not 30 minutes later.
Changed
/api/v2/schema/{id}cache miss now does 1 BQ job instead of 2.connectors/bigquery/access.py:fetch_bq_columns_fullcollapses what used to be_fetch_bq_schema+_fetch_bq_table_optionsinto a singleINFORMATION_SCHEMA.COLUMNSquery (same view, same predicate, just a combined SELECT list). The metadata provider's partition/cluster path shares the same helper — zero SQL duplication across the two consumers.- All four catalog/schema/sample/metadata caches are flushed on registry
change.
app/api/v2_catalog.py:invalidate_for_tableis wired intoPOST /api/admin/register-table,PUT /api/admin/registry/{id}, andDELETE /api/admin/registry/{id}. After a registry write, a single-row re-warm task is scheduled in the background so the admin's verification request hits warm caches within ~1 s instead of waiting for the next analyst miss. Pre-fix none of the caches were invalidated — admin registers a table,agnes catalogdoesn't show the new row for up to 5 min; admin updates a row's bucket,agnes schemareturns the OLD column list for up to 1 hour. v2_schema.build_schemasplit into RBAC-aware outer + RBAC-naive inner (build_schema_uncached). Live endpoint behavior unchanged; warmup uses the inner entry point to populate_schema_cachewithout a user context.
Internal
- New shared dataclass module
app/api/_metadata_models.pywithMetadataRequest(frozen) +TableMetadatafor source-agnostic provider input/output. - New
connectors/keboola/storage_api.py:KeboolaStorageClient.get_table_infothin wrapper — keeps_getprivate to the module. - New env vars (operator-facing tuning, no required setup change):
AGNES_SKIP_CACHE_WARMUP— opt-out of startup warmup.AGNES_WARMUP_CONCURRENCY— default 4, max parallel BQ INFORMATION_SCHEMA jobs during a warmup pass.
- New runtime dependency:
sse-starlette>=2.0(Server-Sent Events responses for the cache-warmup stream). - Tests added:
test_metadata_models,test_v2_schema_columns_consolidation,test_v2_catalog_dispatcher,test_connectors_bigquery_metadata,test_connectors_keboola_metadata,test_v2_catalog_remote_metadata,test_v2_catalog_invalidation,test_cache_warmup,test_main_startup_warmup,test_admin_tables_warmup_ui.
[0.45.0] — 2026-05-07
Operator-and-analyst quality bundle: a security fix for the optional
Telegram bot, two CLI gaps closed, and three rounds of UX polish on
agnes diagnose and agnes pull so non-TTY consumers (CI runners,
Claude Code SessionStart hooks, sub-agent watchdogs) get readable,
actionable signal. Closes #84, #164, #177, #178, #203, #204.
Security
- Telegram bot pairing-code RNG hardened (#84). The pairing code
used to link a Telegram chat to an Agnes user is now generated via
secrets.choice(CSPRNG) rather thanrandom.choices. Pre-fix an attacker who scraped one issued code could recover therandommodule's PRNG state and predict subsequent codes issued in the same process — the fix neutralizes that class of attack (services/telegram_bot/storage.py:_generate_code). - Telegram script runner refuses out-of-shape usernames (#84). The
optional notification runner shells out via
sudo -u <username>. A username controlled by an attacker — e.g. via tampering withtelegram_users.json— could otherwise carry sudo flags (-u,--shell=…) or shell metacharacters. The runner now validates the value against a POSIX-conservative regex (^[a-z_][a-z0-9._-]{0,31}$) and returnsNonebefore invokingsubprocess.runif it doesn't match (services/telegram_bot/runner.py:_USERNAME_RE).
Added
agnes admin unregister-table <id>— CLI wrapper forDELETE /api/admin/registry/{id}(#177). Confirms before destructive action; pass--yesto skip the prompt in scripts. The server-side endpoint already does the parquet/sync_statecleanup; the CLI is a thin client.agnes admin update-table <id>— CLI wrapper forPUT /api/admin/registry/{id}(#177). Only the supplied flags go in the body (--name,--bucket,--source-table,--query-mode,--query,--description,--sync-schedule,--source-type); the rest stay unchanged on the server.--queryaccepts@path/to.sqlfor files. Calling with no flags errors (No fields supplied) instead of silently no-opping.agnes diagnose --include-schema(#204). The defaultagnes diagnoseno longer surfaces the DB schema-version check — analysts hitting the CLI rarely care about the integer, and it dominated the agent-facing output. Pass--include-schema(or query/api/health/detailed?include=schemadirectly) when verifying a migration.infoseverity tier in/api/health/detailed(#178). Sits betweenokandwarning: surfaces a non-trivial observation worth reading without promoting the headline status todegraded. See the module docstring atapp/api/health.pyfor the full severity ladder. The BQ billing-equals-data check is the first consumer (waswarning→ nowinfo).AGNES_PULL_PROGRESS_INTERVAL_SECONDSandAGNES_PULL_PROGRESS_INTERVAL_BYTESenv knobs for the textual progress emitter (#203). Defaults are tighter than pre-fix (5 s / 1 MiB vs the previous 30 s / 10%-of-total) so non-TTY consumers see continuous output and don't trip dead-process watchdogs on multi-GB parquets. Override either independently.
Changed
agnes pullnon-TTY progress is more chatty by default (#203). Previous cadence (30 s / 10%) produced one line every several minutes on multi-GB parquets, long enough for Claude Code sub-agent watchdogs to kill the pull as a hung process. New defaults: emit when any of (10% boundary, 5 s elapsed, 1 MiB bytes since last emit). The 10% boundary is unchanged so small files still get the original visual rhythm./api/health/detailedno longer includesdb_schemaby default (#204). Pass?include=schemato opt back in. The aggregator treats the schema check as "not asserted" when absent, so unrelated services can still drive the headline. Operators using the legacy entry should add the parameter to their probe configuration.- BQ billing-project equals data-project surfaces as
info, notwarning(#178). Many valid single-project dev instances run with billing == data; the message is informational. Thedetailhintstrings are unchanged so the operator still gets the USER_PROJECT_DENIED context if they're hitting it. Pre-fix, the message alone promoted the overall headline todegradedeven on intentionally collapsed setups.
agnes init --forcenow snapshots the priorCLAUDE.mdtoCLAUDE.md.bak.<ISO-timestamp>before regenerating it (#164). Each re-run produces a fresh backup; the prior backup is not clobbered. A FS error on the backup path is logged but does not abort the init (the existing-workspace gate still requires--force).
Internal
- New
cli.client.api_puthelper to mirrorapi_get/api_post/api_delete/api_patchfor the newupdate-tablecommand. - Tests added:
tests/test_telegram_bot_runner.py,tests/test_health_schema_gate.py, plus extensions totest_telegram_storage,test_pull_progress,test_diagnose_billing,test_cli_admin,test_cli_init. infra/modules/customer-instance(taginfra-v1.8.0):startup-script.sh.tplno longer overwrites operator-editedAGNES_TAG/AGNES_TEMP_DIRin/opt/agnes/.envon every boot. Reads the existing values when present and lets them win over the template-computed$IMAGE_TAG. Pre-fix, an in-place TF action that stopped/started the VM (e.g.machine_typechange) would re-run the startup script and clobber any manually-pinned image tag — operators had to re-edit the file post-restart. Fresh provisions still get the TF-driven values; the.envfile's existence is the disambiguator. To force a TF-driven reset,rm /opt/agnes/.envand reboot. Folded in from #214, which landed on main between 0.44.1 and this cut.
[0.44.1] — 2026-05-07
[0.44.1] — 2026-05-07
Fixed
/admin/users/{id}— "Add to group" dropdown explains itself when empty instead of leaving the admin staring at a silent— Pick a group —placeholder. Three cases now surface a hint below the picker: (a) user is already in every group, (b) every remaining group is Google-Workspace-managed and Agnes can't grant manually (POST would 409 — link to/admin/groupsto create a custom group), (c) no groups exist at all. Pre-fix on deployments whereAdmin+Everyoneare mapped viaAGNES_GROUP_{ADMIN,EVERYONE}_EMAILand no custom groups exist, the picker was empty with zero indication that the operator needed to create a custom group first./admin/users/{id}— "Add to group" dropdown'sloadAll()race fixed: pre-fixloadGroups()andloadMemberships()ran in parallel andrefreshGroupDropdown()(called fromloadGroups) read themembershipsglobal, which could still be[]if memberships hadn't returned yet — letting the dropdown show groups the user was already in.loadMemberships()now re-runs the dropdown refresh once it has its data, so the final render reflects both data sets regardless of which fetch completes first.
[0.44.0] — 2026-05-07
Added
agnes refresh-marketplace— single CLI command that owns the per-user filtered Claude Code marketplace lifecycle.--bootstrapdoes the first-time setup: clones the per-user marketplace bare repo to~/.agnes/marketplace, strips the PAT from the cloned origin URL so it doesn't sit in plaintext at rest, registers the local path with Claude Code, and installs every plugin in the served manifest at--scope project. Without--bootstrapit does an incremental refresh: fetch + reset to the remote, then version-aware reconcile (install missing plugins, update on version diff, skip on match). Plugins removed from the manifest are deliberately NOT auto-uninstalled — a transient empty manifest from the server would otherwise wipe the user's stack.agnes initnow installs a SessionStart hook that runsagnes refresh-marketplace --quieton every Claude Code session, alongside the existing chainedagnes self-upgrade; agnes pullentry. The marketplace refresh runs as a separate hook entry (not chained) so a failure (e.g. fresh workspace with no clone yet) doesn't suppress the data pull. The refresh command is wrapped inbash -c "..."because Claude Code on Windows runs hook commands directly without a shell, which would otherwise leave the2>/dev/null || truesyntax uninterpreted.- When
agnes refresh-marketplacedetects an actual change, it emits Claude Code hook JSON on stdout —systemMessage(transient toast) andadditionalContext(model-side system reminder) — both pointing at/reload-pluginsso the running session loads new plugins without a restart.
Changed
- Install-prompt step 5 (in the dashboard-served setup payload) collapses
from a 15-line inline shell sequence —
rm -rf+git clone+ per-pluginclaude plugin installcalls — to a singleagnes refresh-marketplace --bootstrapinvocation. The old inline form tripped Claude Code's agentrm -rfpermission gate on first run. scripts/dev/agnes-client-reset.sh: now cleans~/.claude/plugins/{marketplaces,cache}/agnes, drops the uv build cache, and documents workspace-scoped residue that can't be enumerated from a user-level reset.
Internal
infra/modules/customer-instance(taginfra-v1.7.0):google_compute_instance.vmnow setsallow_stopping_for_update = true. Without it, changingmachine_type(or any other field GCP will only mutate on a stopped VM) caused Terraform to fall back to a destroy + recreate, churning VM-local state for what should be an in-place resize. Consumers do not need to update — the field is provider-side only — but bumping the module ref toinfra-v1.7.0enables in-place machine-type bumps.
[0.43.0] — 2026-05-06
Added
- CLI auto-upgrade:
agnes self-upgradereinstalls the CLI from the server's currently-shipped wheel viauv tool install --force, falling back topip install --force-reinstall --no-depsviasys.executablewhen uv is not on PATH. After install, the new binary is smoke-tested at the install-resolved path (uv tool dir --binfor uv,<sys.executable parent>/agnesfor pip) — never via PATH lookup, to avoid stale-shadow false positives. Smoke failure triggers automatic rollback to the previously verified-good wheel (recorded in~/.config/agnes/last_known_good.json); rollback's exit code is captured and surfaced on stderr if it also fails. First-ever upgrade or unrecoverable rollback prints the canonical bootstrap recovery:curl -fsSL <your-agnes-server>/cli/install.sh | bash. The new command is wired into the SessionStart hook installed byagnes initas a chained shell entry (agnes self-upgrade … || true; agnes pull … || true) so an upgrade failure does not block the pull. - Server:
/api/*responses now carryX-Agnes-Latest-VersionandX-Agnes-Min-Versionheaders. CLIs older thanX-Agnes-Min-Versionexit with code 2 and a remediation message instead of failing on a wire-protocol mismatch. Day-one floor is0.0.0(no enforcement) — bumpMIN_COMPAT_CLI_VERSIONinapp/version.pyin the same PR that ships a deliberate wire break. - CLI:
cli/update_check.py:check()accepts a keyword-onlybypass_disabled=Trueso explicitagnes self-upgradeinvocations probe/cli/latesteven whenAGNES_NO_UPDATE_CHECK=1is set (which silences the implicit warning loop only).
[0.42.0] — 2026-05-06
Fixed
agnes query --remote: full backtick BigQuery paths in user SQL are no longer corrupted by the registered-name rewriter. Previously a query likeSELECT … FROM `<project>.<dataset>.<table>` WHERE …whose table name happened to be registered as a bare-name alias would have the alias re-substituted inside the backtick path, producing malformed SQL that BigQuery rejected with a parse error. The cap-guard then fell back to a filter-lessSELECT *size estimate (often orders of magnitude larger than the real scan), blocking the query asremote_scan_too_large. Issue #201.
Changed
agnes query --remote: cap-guard fallback no longer estimates from a syntheticSELECT *when the rewritten SQL fails dry-run. It first retries the user's original SQL (handles BQ-native input cleanly), and only when that also fails returns a structuredremote_estimate_failedHTTP 400 with a hint instead of silently over-estimating.- BREAKING (clients matching error kinds): failure to estimate
remote-query scan size now returns
kind="remote_estimate_failed"instead of being masked asremote_scan_too_largecaused by over-estimation. Operators that grep for the old kind in dashboards should update.
Security
agnes query --remote: full backtick BigQuery paths are now registry-gated identically tobq."<dataset>"."<table>"syntax. Previously, full backtick paths bypassed Agnes RBAC entirely — only the configured service account scope limited what users could query. Newbq_path_cross_project(when the project ≠ configured data project) andbq_path_not_registered(when path is unknown) error kinds. Issue #201.
[0.41.0] — 2026-05-06
Fixed
-
Orchestrator filesystem fallback for materialized parquets that couldn't register in
extract.duckdb's_meta(src/orchestrator.py:_attach_and_create_views). The 0.40.0 fix inmaterialize_queryopensextract.duckdbfrom a fresh DuckDB handle to write the_metarow + inner view; in production the same uvicorn process already holdsextract.duckdbATTACHed read-only as the source-name alias under the orchestrator's analytics connection, and DuckDB's single-process file-handle uniqueness rejects the second open withBinder Error: Unique file handle conflict: Cannot attach "extract" — already attached by database "<source>". The 0.40.0 helper logs WARNING and falls through; parquet stays canonical, but the master view never appears via the meta path.This release adds a second pass at the end of
_attach_and_create_views: scan<extract_dir>/data/*.parquetand create a master view viaread_parquet('<path>')for any parquet whose<id>is not already in the per-sourcetableslist (i.e. the meta path didn't pick it up). Decoupled frommaterialize_query's open-handle race; robust against any registration drift between materialize and rebuild. Honors the sameview_ownership/ cross- connector collision rules as the meta path (first-come-first-served viaview_repo.claim). Tests cover: fallback fires when meta row is missing; fallback skips when meta path already created the view (no shadow); invalid identifier in parquet stem is skipped without crash; source withoutdata/subdir doesn't crash the scan.
[0.40.0] — 2026-05-06
Fixed
- Materialized BigQuery parquets now register themselves in
extract.duckdbso the master view actually appears (connectors/bigquery/extractor.py:materialize_query). Pre-fix the function wrote the<id>.parquetto disk and returned the row count, but never wrote a_metarow or an inner view in the connector'sextract.duckdb. The orchestrator'srebuild()scans_metato decide which master views to create, so materialized tables remained invisible:agnes query "SELECT … FROM <id>"returned HTTP 400 "registered as query_mode='materialized' but is not yet materialized in this instance's analytics views" even though the parquet was sitting there. Symptom appeared after every container recreate (image upgrade) and after every_create_meta_tablecycle in the extractor subprocess (whichDROP TABLE IF EXISTS _meta+CREATE TABLEcleanly each pass — wiping any prior materialized rows). Fix: after the atomicos.replace(tmp_path, parquet_path), openextract.duckdband `DELETE FROM _meta WHERE table_name = ? + INSERT- CREATE OR REPLACE VIEW AS SELECT * FROM read_parquet('')
inside a single transaction. Idempotent, fail-soft (parquet remains canonical, the next sync pass recovers any registration drift). Whenextract.duckdb` doesn't exist yet (fresh BQ-only deployment), the fix logs and continues — the next extractor pass creates the file and the master view appears on the rebuild after that.
- CREATE OR REPLACE VIEW AS SELECT * FROM read_parquet('')
[0.39.0] — 2026-05-06
Performance
/api/query(andagnes query --remote) now rewrites user SQL referencingquery_mode='remote'BigQuery rows into a singlebigquery_query()call before execute (app/api/query.py). Pre-fix the master view (CREATE VIEW <name> AS SELECT * FROM bigquery.<bucket>.<source_table>) did not push WHERE / SELECT / LIMIT into BQ — the DuckDB BQ extension opened a Storage Read API session over the entire upstream table, scanning the full partitioned dataset before the local DuckDB filter ran. On 100M+ row remote-mode tables this was 50-100× slower than the equivalent directbigquery_query()call (70-150 s vs 1.5 s) and frequently failed withResponse too large to return. The rewriter (shared core with the existing dry-run helper) wraps the user's whole SQL inbigquery_query('<project>', '<inner-sql>')so the BQ planner receives the full query and applies partition pruning + projection pushdown server-side. Conservative fall-through: cross-source JOINs (BQ ↔ Keboola/Jira local), queries already containingbigquery_query(, and unconfigured BQ project all keep the original ATTACH-catalog path so behavior degrades gracefully.- DuckDB BigQuery-extension session pool
(
connectors/bigquery/access.py).BqAccess.duckdb_session()now acquires pre-warmed connections from a bounded process-local pool instead of runningINSTALL bigquery; LOAD bigquery; CREATE SECRET; ATTACH …on every request. Each acquire saves the ~0.5 s extension-load + secret-creation cost when the pool has a warm entry; auth SECRET is refreshed on acquire so a long-lived pooled entry doesn't keep a stale GCE metadata token past its TTL. Pool size is configurable viadata_source.bigquery.session_pool_size(default 4; sentinel0disables pooling). Affects every BQ-touching path —/api/query,/api/v2/scan,/api/v2/sample,/api/v2/schema, materialize, and the orchestrator's remote-attach. agnes pullchunked download for large parquets: when the server advertisesaccept-ranges: bytesand a parquet exceedsAGNES_PULL_CHUNK_THRESHOLD_BYTES(default 50 MB), the CLI now splits the file into N parallel HTTP Range requests (AGNES_PULL_CHUNK_PARALLELISM, default 4, capped 1..16) and assembles the parts into the destination atomically. Targets the per-flow-shaped network (corp VPN with per-TCP-connection rate-limiting) where a single stream is throttled but N parallel streams over the same connection scale roughly linearly. Falls back to single-stream when the server responds 200 instead of 206 to a Range probe, when noaccept-ranges: bytesis advertised, or when content is below the threshold — no behavior change in the small-file / non-cooperating- server cases.- Persistent HTTP/2 client across
agnes pull:stream_downloadnow routes through a process-wide pooledhttpx.Clientso N parquet downloads share a single TLS handshake; HTTP/2 multiplexing (when the optionalh2package is installed) lets all chunk Range requests share one TCP connection. Gracefully falls back to HTTP/1.1 pooling whenh2is missing — no crash, just slightly less benefit.
Fixed
- BigQuery
responseTooLargeno longer surfaces as a generic 400 / 502 with the raw upstream message (connectors/bigquery/access.py). Thetranslate_bq_errorhelper now classifies "Response too large to return" errors via a dedicatedbq_response_too_largekind (HTTP 400) with an actionable hint pointing at the WHERE / aggregation / materialized-table remediations. Pre-fix this failure mode fell through to the genericbq_bad_requestmapping, which implied the user's SQL had a syntax error — wrong root cause. Affects every BQ-touching path (/api/query,/api/v2/scan,/api/v2/sample,/api/v2/schema, materialize) since they all sharetranslate_bq_error.
Added
- New optional dependency
h2>=4.1.0(HTTP/2 transport for httpx). Pure performance —agnes pullworks on HTTP/1.1 if the install skips it. - Textual progress fallback for non-TTY
agnes pull: when stderr is not a terminal (Claude Code SessionStart hook, CI runner, Docker log capture, …),agnes pull --no-quietnow emits a plain-text progress line per file at most every 10% or 30 s, plus a final completion line. Replaces the previous Rich-bar-on-pipe behavior that either suppressed output entirely or leaked ANSI escape sequences. TTY path unchanged (Rich progress bar with bytes / speed / ETA, aggregated per-file across chunked-download chunks).
[0.38.3] — 2026-05-06
Changed
- Admin / Tables: registry table now shows Source (bucket/table), Schedule, Folder, Registered by/at, and a sync-error warning icon per row. The page widens to ~1600px to accommodate.
Fixed
- Admin / Tables: long table descriptions no longer push the row's Edit / Manage access / Delete buttons off-screen. The Description column is now clamped to 2 lines with the full text available on hover and in the Edit modal.
- Admin / Tables: descriptions stored with shell-quoting backslash-escapes (
Don\'t,\n) now render correctly. The same normalization also runs at register/update time so newly-saved descriptions are never corrupted. - Admin / Tables:
scripts/fix_description_escapes.pycleans up already-corrupted descriptions intable_registry(run with--dry-runfirst, then--apply).
[0.38.2] — 2026-05-06
Fixed
bq_query_timeout_mswas not applied on every BigQuery ATTACH branch (src/db.py:_reattach_remote_extensions,src/orchestrator.py:_attach_remote_extensions). Pre-fix only the metadata-token branch (the BqAccess contract,token_env='') calledapply_bq_session_settings. BigQuery sources registered with an explicittoken_env, or with no auth env, ATTACH'd without ever applying the timeout — falling back to the extension's 90 s default. Default-config operators on those branches now consistently get the configured 600 s (or whateverdata_source.bigquery.query_timeout_msis set to).apply_bq_session_settingsswallowed everyExceptionsilently (connectors/bigquery/access.py). Two realistic failure modes — the BigQuery extension not yet loaded on the connection, or an installed extension version that doesn't recognise the setting — left the 90 s default in place with no log line explaining why. Each failure path now logsWARNINGwith the actionable cause; on success the applied value is verified via acurrent_setting('bq_query_timeout_ms')readback (catches the silent-ignore mode some extension versions exhibit) and a mismatch logsWARNINGtoo.
[0.38.1] — 2026-05-06
Internal
CLAUDE.md—Claude Code marketplace endpointsection now documents the two-step fallback (systemgit clone+ localclaude plugin marketplace add) for users registering manually against a private-CA Agnes instance. Bun-compiledclaudeignores the OS trust store and CA env vars on the marketplace HTTPS path, so direct/plugin marketplace addover HTTPS can fail with TLS errors on macOS / Windows even when system tools work fine. The dashboard-served setup payload (app/web/setup_instructions.py) already branches between the two automatically based on platform; the doc snippet now matches that behavior for manual flows.
[0.38.0] — 2026-05-06
Added
/storepage — community marketplace where every authenticated user can upload skills, agents, and plugins as ZIPs. Listing has type / category / search filters; detail page shows metadata, file list, photo, video link, and an[Install]button. Same owner can't have two entities with the samename(any type). Plugin/skill/agent name is suffixed-by-<owner-username>(sanitized email-local-part) at upload time to avoid collisions in Claude Code's flat namespace./my-ai-stackpage — every user's per-user composition view: the admin-granted plugins (with an opt-out toggle each, default enabled) plus the entities they've installed from the Store. Toggling a curated plugin off writes auser_plugin_optoutsrow; admin removing the underlying grant drops everyone's opt-out (re-grant restarts at enabled).- Composed served marketplace: the
/marketplace.zipand/marketplace.git/endpoints now serve(admin_granted ∖ opt_outs) ∪ store_installs— driven by the newsrc/marketplace_filter.py:resolve_user_marketplace. Same content-addressed ETag / git-commit-SHA contract as before; any change on either layer propagates to Claude Code on the next refresh. - Store skill+agent bundle: skill/agent installs are merged into a single
synthetic
agnes-store-bundleplugin in the served marketplace (one plugin with N skills/agents inside), whiletype='plugin'Store entities stay standalone. Cuts plugin-entry count in Claude Code from O(installs) down to O(1) for the skill+agent path. Bundle'sversionfield hashes its combined contents so install/uninstall flips it for auto-update detection. - REST:
POST/PUT/DELETE/GET /api/store/entities[/{id}],POST/DELETE /api/store/entities/{id}/install,GET /api/store/entities/{id}/photo,GET /api/store/entities/{id}/docs/{filename},POST /api/store/entities/preview(wizard step-1 validation),GET /api/store/categories,GET /api/store/owners,GET /api/my-stack,PUT /api/my-stack/curated/{marketplace_id}/{plugin_name}. - CLI:
agnes store {list,show,install,uninstall,upload,update,delete,pull,info}andagnes my-stack {show,toggle}— full analyst-side coverage of the new Store + composition REST surface. Multipart upload helper added tocli/v2_client.py(api_post_multipart/api_put_multipart/api_get_stream) so future multipart and binary-download endpoints don't have to roll their own httpx wiring. - CLI:
agnes admin store {pull,push,info}— operator-flavored bulk Store ops.pullandinfoshare the openGET /api/store/bundle.zip//entitiesendpoints;pushwraps the admin-gatedPOST /api/store/import-bundle.pushaccepts either a *.zip file or a directory containingmanifest.json+entities/(CLI zips a directory client-side, so a backup git repo's working tree round-trips straight back into Agnes via a single command). - CLI:
agnes store mine— analyst-facing self-bundle. Same endpoint asadmin store pull, scoped via?owner=me(server resolves the magic value to the caller's user_id) so authors can archive their own uploads without admin role. - REST:
GET /api/store/bundle.zip— deterministic ZIP of all (filtered) Store entities for whole-Store backup. Layout:manifest.jsonat the top with per-entity metadata +owner_emailfor cross-instance restore, thenentities/<entity_id>/{plugin,assets}/. Auth: any authenticated user (Store is community-open, the same set is already visible viaGET /api/store/entities). Filters mirror the listing endpoint (type / category / owner / search). - REST:
POST /api/store/import-bundle— admin-only restore of a bundle ZIP. Modes:merge(default — upsert byentity_id, replace when version differs),replace(overwrite all matching),skip(only insert new). Owner resolution byowner_emailagainstusers.email; missing emails get a stub disabled user (active=False, no password, idimported-<sha256[:12]>) so the historical owner stays attached and an admin can later activate or reassign in/admin/users. Audit-logged with the full counts.
Changed
/admin/marketplacesadmin nav entry moved from the top-level header into the Admin dropdown and renamed to Curated Marketplaces to disambiguate from the new community Store.app/api/access.pyDELETE /api/admin/grants/{grant_id}now drops every user'suser_plugin_optoutsrow matching the deleted plugin and flushes the marketplace ETag cache. Audit log entry forresource_grant.deletedcarriesoptouts_droppedso operators can correlate.app/marketplace_server/{packager,git_backend}.pyconsumeresolve_user_marketplaceinstead ofresolve_allowed_plugins. The/marketplace/infopayload now splits itspluginsarray bysource, exposingplugins(admin) andstore_plugins(community).
Fixed
- Stored XSS via
video_url(app/api/store.py) —video_urlaccepted onPOST/PUT /api/store/entitiesis now scheme-validated tohttp(s)://only. Previously ajavascript:URI flowed through the form field intostore_detail.html's<a href>and would execute in any viewer's session on click. 400invalid_video_urlon bad input. - ZIP decompression bomb (
app/api/store.py:_safe_zip_extract) — the uncompressed-side total of an upload is now capped at 200 MB (MAX_ZIP_UNCOMPRESSED); the compressed-side cap (50 MB) alone did not bound the on-disk footprint. 413zip_too_large_uncompressedon oversize. - Admin authz parity for Store mutations (
app/api/store.py,app/web/router.py,app/web/templates/store_detail.html) —PUT /api/store/entities/{id}now permits owner OR admin (matchesDELETE); the store-detail page passesis_adminto the template and gates the Edit/Delete buttons onis_owner OR is_admin. Pre-fix, an admin could delete via the API but saw no Edit/Delete affordance in the UI, and could not update non-owned entities at all. - Scratch directory leak on ZIP validation failure (
app/api/store.py, Devin Review) —create_entityandupdate_entitycreated thescratchtemp dir inside onetry/finallyblock but cleaned it up in a separate one. When_safe_zip_extractraisedHTTPException(zip-slip, uncompressed-too-large) orBadZipFilewas caught and re-raised, the exception exited the first scope and the cleanupfinallywas never reached. Each failed upload leaked a temp dir. Fixed by collapsing scratch creation + cleanup into a single outertry/finallycovering both extraction and the metadata/bake work. - Cross-owner suffix collision (
app/api/store.py:create_entity) —sanitize_usernameis many-to-one (alice.smithandalice_smithboth →alice-smith). Two such users uploading entities with the same displaynameproduced identical<name>-by-<username>suffixes, silently colliding in the served bundle's on-disk paths and the manifest catalog (Claude Code dedupes byplugin.json'sname). We now refuse the second upload with 409conflict_global_suffix.
Internal
- Schema v24 → v25: adds
store_entities,user_store_installs,user_plugin_optouts. Auto-migration via_V24_TO_V25_MIGRATIONSladder branch insrc/db.py(existing self-heal path also creates the tables on same-version starts). - New helpers in
src/store_naming.py:sanitize_username,suffixed_name,compute_entity_version(sha256 of sorted(relpath, content)tuples, 16-char hex prefix). Predefined category taxonomy insrc/store_categories.py. - New repositories:
src/repositories/{store_entities,user_store_installs, user_plugin_optouts}.py(mirror existingmarketplace_pluginsstyle — dict returns, parameterized SQL, no ORM). app/utils.py:get_store_dir()—${DATA_DIR}/store/.humanbytesJinja2 filter on Store detail page (binary KB/MB/GB).- New CLI command modules:
cli/commands/store.py,cli/commands/my_stack.py. Registered as Typer subappsagnes storeandagnes my-stackincli/main.py. Tests attests/test_cli_store.py. tests/test_store_api.py:TestStoreSecurityFixes— regression suite for F1 (video_url), F2 (zip-bomb), F4 (admin authz parity), F5 (cross-owner suffix collision).
[0.37.0] — 2026-05-06
Operator-side disk-layout release. Closes the 2026-05-05 shadow-mount class identified in v0.36.0's deploy notes via two independent fixes that operators can adopt separately: (#194 folds in @cvrysanek's #191 + #192). The image-side change is invisible — STATE_DIR defaults to the legacy nested path, so existing deployments see no behavior change unless they opt into the new flat layout. Folds in three rounds of Devin Review (3 BUGs + 1 ANALYSIS class, ANALYSIS deferred per the operator-side limitation it describes).
Added
STATE_DIRenv var +docker-compose.flat-mount.ymloverlay — operators can now place the writable state disk in parallel to the data disk (sdbat/data,sdcat/data-state) instead of nested (sdcat/data/stateinside/data). The flat layout removes three structural fragilities of the legacy nested layout: bind-mount propagation gotchas (the 2026-05-05 shadow-mount class), two-writer collisions on a shared prefix (host'stls-rotate.timeras root + container app as uid 999 on the same path), and mount-order coupling on disk resize.STATE_DIRdefaults to${DATA_DIR}/stateso existing deployers see no behavior change; opt-in to flat layout via the new overlay +STATE_DIR=/data-stateper the runbook indocs/state-dir.md. Read bysrc/db.py:_get_state_dir(),app/secrets.py:_state_dir(),app/main.py(.env_overlay),app/instance_config.py(instance.yamloverlay reader),app/api/admin.py(writers for both/api/admin/configureand/api/admin/server-configagainst the same overlay),app/api/marketplaces.py(marketplace PAT persistence into.env_overlay),scripts/ops/agnes-auto-upgrade.sh(mount-sanity + cert detection),scripts/ops/agnes-tls-rotate.sh(CERT_DIR=$STATE_DIR/certs). All read/write sites resolve via the same helper so underSTATE_DIR=/data-statethe irreplaceable tier (system.duckdb, secrets,instance.yaml,.env_overlay, certs) lands on sdc consistently — partial migration would silently lose secrets on container restart.
Changed
docker-compose.host-mount.ymlswitched from "named volume + driver_opts" to direct service-level bind mounts (volumes: !overrideper service). Docker named volumes have an immutability footgun: once a volume is created, its driver options are fixed for the life of the volume, and editing this file does NOT propagate the new options to existing volumes. This bit a deployer in production: the volume was created before the overlay hadbind,rbind, kept the oldbind(non-recursive) propagation, and containers wrote to a shadowed subdirectory of the parent disk instead of the nested child mount. DuckDB went FATAL on a root-owned WAL during a routine container recreate; sign-in broke. Direct service binds re-evaluate options every container start and default to recursive in modern Docker (20.10+) — no immutable state to migrate, no shadow-mount class. Operators on this overlay: nextdocker compose up -dstarts containers with direct binds; the oldagnes_datanamed volume is no longer referenced and can be removed withdocker volume rm agnes_data(operator's choice — orphaned but harmless if left). Bothhost-mount.ymlandflat-mount.ymlvolumes: !overrideblocks forcaddynow restate every mount the base service depends on (notablydata:/srv:rofor the v0.36.0 file_server bypass andcaddy_config:/configfor ACME state) — a Devin-caught regression where!overridesilently dropped these mounts under the new layout, defeating the parquet-download perf bypass.
[0.36.0] — 2026-05-05
Combined performance + analyst-clarity bundle. Folds three previously-staged work streams into one PR (#188): the long-running agnes query --remote timeout (#181), the Caddy parquet-download bypass (#182), and Pavel's #185 Phase 1 trace findings (silent 44-min first-init, opaque CLI tracebacks, no analyst-Claude size signal). Also performs the Tier 1 event-loop unblocking — the five hottest BQ-touching endpoints were async def over synchronous DuckDB / BQ-extension calls, so a single heavy agnes query --remote froze every other request for the duration of the BQ wait. The image-side fixes ship in this release; for existing VMs, the new auto-upgrade.sh self-fetches the matching Caddyfile + compose overlays from main on its next 5-minute tick, so deployment requires no operator action beyond letting the cron run.
Added
data_source.bigquery.query_timeout_msconfig knob (default 600 000 ms = 10 min). The DuckDB BigQuery extension's built-in default of 90 s was too tight for analyst-scale queries against view-backed BQ datasets —agnes query --remotewould HTTP 400 withBinder Error: Query execution exceeded the timeout. Job ID: …whenever the underlying BQ job took longer than 90 s, even though the BQ job itself was healthy. The new knob is applied viaSET bq_query_timeout_msafter everyLOAD bigqueryon every BQ-touching DuckDB session — the orchestrator's_remote_attachATTACH path (src/orchestrator.py), the analytics-DB read-only reattach path (src/db.py:_reattach_remote_extensions— the primaryagnes query --remoterequest path), theBqAccesssession factory (connectors/bigquery/access.py), and the standalone extractor (connectors/bigquery/extractor.py). Sentinel0(or non-numeric / unparseable values) leaves the extension default in place so operators on legacy extension versions that don't recognise the setting aren't broken. Configurable via/admin/server-configUI. Note: BigQuery'sjobs.queryRPC caps the wait at ~200 s per call regardless of this setting; the extension polls on top so the effective ceiling is the value here but each poll is ~200 s. DuckDB emits an informational warning when the value is set above the BQ RPC cap — operators can safely ignore it.- Per-user parallel parquet downloads in
agnes pull— the download loop incli/lib/pull.pynow uses aThreadPoolExecutorwith concurrency capped by the newAGNES_PULL_PARALLELISMenv var (default 4, set 1 to restore pre-PR serial behavior). On a registry of N tables the wall-clock time drops fromΣ stream_download_seconds(table_i)to roughlymax × ceil(N/4). Works hand-in-hand with the Caddyfile_serverchange below: without it parallel client-side downloads would still queue on the single uvicorn worker; with it each request is its own caddy goroutine + sendfile, so 4-way parallelism actually delivers throughput. Per-table error semantics preserved — a failure on one table no longer aborts the rest of the batch. agnes init/agnes pull --skip-materialize— opts the first sync out of materialized-mode tables (server-side scheduled-query parquets, often multi-GB). Pavel's #185 Phase 1: a single 6.3 GBorder_economicsparquet kept first init silent for 44 minutes. Materialized rows stay discoverable viaagnes catalog; rerun without the flag once the analyst actually needs them locally.agnes pullprogress bar — Rich-driven aggregate transfer display rendered to stderr when not--quietand not--json. Per-file label + bytes / total / rate / ETA, aggregated across the parallelThreadPoolExecutorworkers introduced earlier in this PR. Replaces the prior 0-stdout silence on first init.- CLI clean-error wrapper (
cli/main.py:_run_with_clean_errors, new entry point inpyproject.toml) —httpx.ReadTimeout/ConnectError/RemoteProtocolErroretc. used to dump a five-frame Python traceback to the analyst's terminal when aagnes query --remoteagainst a slow BQ view timed out client-side. Now: one-lineError: …message + actionable hint (e.g. "narrow the WHERE on the partition column fromagnes catalog --json, or runagnes snapshot create --estimate"), exit code 1. Full traceback is appended to~/.config/agnes/last-error.logso an operator can recover it for support without spamming the analyst's terminal. Implemented asAgnesTransportErrorraised from theapi_get/api_post/api_delete/api_patch/stream_downloadhelpers incli/client.py; the top-level Typer wrapper renders it. UnhandledExceptions are caught at the same boundary, logged, and printed as "internal CLI error (see logfile)" so a Python traceback never leaks to the analyst. scripts/ops/agnes-auto-upgrade.shnow re-fetches Caddyfile + every compose overlay fromkeboola/agnes-the-ai-analyst@mainon every tick, hashes them, and triggers adocker compose up -drecreation when the hash changes — same path as an image-digest change. Pre-fix the script only watcheddocker imagesdigests, so a Caddyfile or compose change in main never reached running VMs (only fresh boots ranstartup.sh's file fetch). Without this, the new file_server downloads-path below would land in the image but stay inert against an old Caddyfile. The script also self-updates from the same path so the very fix that watches config files isn't itself stuck on running VMs. Fail-soft on curl errors — keeps the existing file rather than blanking it.- Caddy
file_serverfor parquet downloads —GET /api/data/{table_id}/downloadis now intercepted at the Caddy layer (TLS profile only) and served directly via sendfile/zero-copy from the data volume mounted read-only at/srvinside the caddy container. Caddy authorises every request via a new lightweight RBAC probeGET /api/data/{table_id}/check-access(returns 204 when the caller has read access on the table, 403 otherwise) using theforward_authdirective — the bulk byte transfer never touches uvicorn workers. Resolves a real production failure mode where a single multi-GB analyst pull held the app's only uvicorn worker for the duration of the stream and starved the UI //api/health/ every other API endpoint, eventually flipping the container tounhealthy. Path discovery uses Caddy'stry_filesover the knownextract.duckdbv2 source subdirs (bigquery/data/<id>.parquet,keboola/data/<id>.parquet,jira/data/<id>.parquet); a parquet not at any of those paths transparently falls through to the existing app handler so legacysrc_data/parquetlayouts and future connectors keep working with no Caddyfile change. Non-Caddy deployments (devdocker compose upwithout--profile tls) continue to use the app handler unchanged. - Workspace prompt: decision tree, common-mistakes callout, failure-mode dictionary in
config/claude_md_template.txt(the templateagnes initwrites to<workspace>/CLAUDE.md). Surfaces every catalog-row field analyst Claude should read before deciding which command to use (query_mode,sql_flavor,where_examples,fetch_via,rough_size_hint); explicitly binds--estimatetoagnes snapshot createONLY (was the most-failed first-try misuse — fails withNo such option: --estimateonagnes query); calls out theagnes fetch→agnes snapshot createrename so stale-doc analysts don't run a non-command; documents the BQ permission model (server SA, not personal Google identity) and a 6-row failure-mode table mapping each common error wording to its cause + the right next step. rough_size_hintpopulated forlocal+materializedcatalog rows inGET /api/v2/catalog(was hardcodednullwith a "Task 8" TODO). Reads the parquet file size at${DATA_DIR}/extracts/<source_type>/data/<table_id>.parquetand buckets intosmall(≤100 MiB),medium(≤1 GiB),large(≤10 GiB),very_large(>10 GiB).remoterows staynullfor now (size requires a BQ INFORMATION_SCHEMA call; tracked separately). Lets analyst Claude pickagnes snapshot createoveragnes query --remoteby inspectingagnes catalog --jsonrather than discovering size empirically via a failed--remoteround-trip.
Changed
- Tier 1 event-loop unblocking — the five hottest BQ-touching endpoints (
POST /api/query,POST /api/v2/scan,POST /api/v2/scan/estimate,GET /api/v2/sample/{id},GET /api/v2/schema/{id}) were declaredasync defbut invoked synchronous DuckDB / BQ-extension calls inside the body. Under uvicorn's single event loop that meant a single heavyagnes query --remote(waiting up to ~200 s for BQ'sjobs.queryto return) froze every other request —/api/health, the dashboard, auth, even another query — for the full duration of the BQ wait. Operators saw "VM idle, app frozen" symptoms during this work. Converted all five to plaindefso FastAPI auto-offloads the blocking body to the anyio thread pool; the event loop stays free for non-BQ requests. Verified via 0-await audit (noawaitstatements in the converted handlers, so the rename is safe). Tests:tests/test_v2_*.pywere rewritten to call the handlers directly instead ofasyncio.run(...)(which now fails on a non-coroutine return). Pairs with the thread-pool capacity bump below. AGNES_THREADPOOL_SIZEenv var (default 200, was anyio's stock 40) controls the FastAPI / Starlette thread pool capacity used by every plain-defroute handler. Set inapp/main.py:lifespanviaanyio.to_thread.current_default_thread_limiter().total_tokens. 200 leaves comfortable headroom over the BQ extension's connection budget while keeping the per-process thread cost bounded — for the workload of <50 concurrent analysts this is well over what's needed; bump for higher concurrency.- CLI update-banner now says
agnesinstead ofda(cli/update_check.py:format_outdated_notice). The string[update] da X is out of datehad survived theda→agnesCLI rename and was the most-visible stale identifier in the analyst-facing surface — every CLI command printed it on stderr when a newer wheel was available.
Fixed
- CLI ReadTimeout message reports the actual httpx timeout (was hardcoded to
QUERY_TIMEOUT_S= 300s). On a 30s-default call (agnes catalog,agnes auth, …) the analyst saw "didn't respond within the read timeout (300s)" while the call had actually given up after 30s — confusing and unactionable. The translator now takes the real timeout from the calling helper and renders it; the long-running-BQ advisory only appears for calls where the timeout was set ≥ 60s. Devin Review on PR #188. - Keboola sync now falls back to the legacy Storage-API client when the DuckDB Keboola extension's per-table scan fails, not just when the initial
ATTACHfails. Two changes:kbcstorage>=0.9.0is promoted from optional to core dependency. The legacy fallback path inconnectors/keboola/extractor.py:_extract_via_legacyhas been there since the extension landed, but until now the barefrom kbcstorage.client import Clientwould crash any default install withModuleNotFoundError.connectors/keboola/extractor.py:runnow wraps_extract_via_extensionin a per-table try/except — on any per-table scan failure it retries via the legacy client. Previously, whenATTACHsucceeded but the table-levelCOPY (SELECT * FROM kbc."<bucket>"."<table>")failed, the table was just marked failed with no retry. Together these unblock deployments where the extension's bucket-schema scans returnSchema '..."in.c-..."' does not exist or not authorized(keboola/duckdb-extension#17) while the upstream extension fix is in flight.
[0.35.1] — 2026-05-05
Fixed
agnes query --remoteno longer dies after 30s on long-running BigQuery SELECTs. The CLI HTTP client now defaults to a 300s timeout for/api/queryand exposesAGNES_QUERY_TIMEOUT(seconds, float) for operators who need to extend it further. Other CLI calls keep the 30s default. (cli/client.py,cli/commands/query.py)
[0.35.0] — 2026-05-05
Five-defect fix for the silently-broken session pipeline on default Compose deploys (#176). Sessions uploaded by agnes push landed on /data/user_sessions/<user>/*.jsonl, but on a stock docker compose up deploy nothing ever processed them — /corporate-memory stayed empty even when sessions and CLAUDE.local.md were uploaded. The root cause was a stack of compounding defects: LLM SDKs were dev-only deps so the scheduler container boot-looped on ModuleNotFoundError, the side-car services were profile-gated and ran as tight restart: unless-stopped boot loops anyway, the verification_detector had no scheduler entry at all, the first-time setup never seeded an ai: block, and the /corporate-memory page silently filtered out the pending review queue. This release wires the LLM pipeline into the existing scheduler-v2 model (one HTTP-driven cron tick per service) and adds a health-check that warns when uploaded jsonls aren't being processed.
Changed
- BREAKING
docker-compose.ymlanddocker-compose.prod.ymlno longer ship thecorporate-memoryandsession-collectorservices. The scheduler container drives both jobs through admin HTTP endpoints (see Added below) on offset cadences (10 min / 17 min). Operators previously runningCOMPOSE_PROFILES=fullor maintaining custom Compose overrides need to drop those service stanzas — leaving them in produces a double-driver footgun (the standalone container loop races the scheduler-v2 cron tick on/data/user_sessionsandknowledge_itemswrites). The Python entry points (services/{corporate_memory, session_collector, verification_detector}/__main__.py) remain — they're still callable from the CLI for one-shot manual runs and from the new admin endpoints.
Added
- New admin endpoints in
app/api/admin.pythat wrap the LLM pipeline jobs so the scheduler can drive them over HTTP (matching the existing/api/marketplaces/sync-allpattern):POST /api/admin/run-session-collector— copies Claude Code session jsonls from user homes to/data/user_sessions/<user>/.POST /api/admin/run-verification-detector— extracts verified knowledge from session transcripts via the LLM, writes pending items toknowledge_items.POST /api/admin/run-corporate-memory— refreshes the catalog from teamCLAUDE.local.mdfiles. All three are admin-gated, sync-def (FastAPI thread pool), and emit one audit row per invocation.
- Three new entries in
services/scheduler/__main__.py:JOBSwith deliberately offset cadences (10 m / 15 m / 17 m, all coprime modulo the 30 s tick) so the LLM-backed jobs don't fire on the same tick and stack their API + DB load:session-collector— every 10 min →POST /api/admin/run-session-collector.verification-detector— every 15 min →POST /api/admin/run-verification-detector.corporate-memory— every 17 min →POST /api/admin/run-corporate-memory.
connectors.llm.factory.create_extractor_from_env_or_config(ai_config)— falls back toANTHROPIC_API_KEY/LLM_API_KEYenv vars when theai:block is empty, raises a clearValueErrorwhen neither is available.services/corporate_memoryandservices/verification_detectorswitch to the new helper so a missingai:section is no longer a silent skip.POST /api/admin/configurenow seeds a defaultai:block into the writableinstance.yamloverlay when the overlay has noai:yet ANDANTHROPIC_API_KEY(orLLM_API_KEY) is present in the environment. The block stores the env-var reference (${ANTHROPIC_API_KEY}), never the raw secret. Existing operator config is preserved verbatim./corporate-memorypage renders an admin-only banner (N pending items awaiting review — review them at /corporate-memory/admin) when the pending review queue is non-empty. Non-admins see no change — the route zeroes the count server-side before the template renders. Closes the silent-failure UX gap that hid the review queue from operators withapproval_mode='review_queue'(the default).GET /api/health/detailednow returns asession_pipelineservice entry that warns when uploaded session jsonls aren't being processed. Heuristic:max(mtime of /data/user_sessions/**/*.jsonl) <= max(processed_at in session_extraction_state) + grace_seconds, wheregrace_seconds = 2 ×the verification-detector cadence (default 30 min, configurable viaSCHEDULER_VERIFICATION_DETECTOR_INTERVAL). Surfaces asstatus='warning'(nevererror) with an actionabledetailpointing at the verification-detector job. A warning bubbles up to the existingoverall='degraded'aggregation soagnes diagnose systemflags it.
Fixed
- Defect 4 — LLM provider SDKs in dev-only deps caused scheduler container boot loops.
anthropic>=0.30.0andopenai>=1.30.0are now in[project].dependencies, not[project.optional-dependencies].dev. The Dockerfile'suv pip install --system --no-cache .picks them up automatically, no Dockerfile change required.tests/test_packaging.pylocks the contract. - Defect 5 — first-time setup never wrote an
ai:block. Two paths to a working LLM pipeline now actually work end-to-end (#179 review): (a) a defaultai:block seeded byPOST /api/admin/configureinto the writable overlay at${DATA_DIR}/state/instance.yamlwhen env keys are present (Added above), or (b) env-var fallback at service start time. The seeded overlay path was dead code on the initial 0.35.0 cut — the three LLM consumers (services/corporate_memory/collector.py,services/verification_detector/__main__.py,app/api/admin.py:run_verification_detector) importedload_instance_configfromconfig.loader(which only reads the static config dir), and even if they had read the overlay,app/instance_config.pyranyaml.safe_loadon it without resolving${ENV_VAR}references so the seeded${ANTHROPIC_API_KEY}placeholder would have stayed literal. Both fixes shipped: consumers switched to the overlay-awareapp.instance_config.load_instance_config, and the overlay is now passed throughconfig.loader._resolve_env_refsbefore deep-merge with the static base.collect_allno longer swallows the factory'sValueErrorintostats["errors"]— fail-fast propagates so the scheduler / admin endpoint surface the actionable misconfiguration message. - #179 review — scheduler ignored its own LLM cadence env vars.
app/api/health.pyalready readSCHEDULER_VERIFICATION_DETECTOR_INTERVALto compute the staleness grace window, but the scheduler cadence was hardcoded toevery 15m, so an operator throttling the detector via the env was silently ignored on the schedule side while the health grace silently widened. All three LLM-pipeline cadences are now env-driven through the same_read_positive_intpattern asdata-refresh/health-check/script-runner:SCHEDULER_SESSION_COLLECTOR_INTERVAL(default 600s = 10m),SCHEDULER_VERIFICATION_DETECTOR_INTERVAL(default 900s = 15m), andSCHEDULER_CORPORATE_MEMORY_INTERVAL(default 1020s = 17m). Defaults preserve the 10/15/17m coprime offset so the three jobs don't fire on the same tick. The verification-detector env var remains the single source of truth for the health-check grace (still2 ×the cadence). - Defect 3 —
verification_detectorhad no scheduler entry. Now inJOBSwith a 15 min cadence, hitting the new/api/admin/run-verification-detectorendpoint. - Defect 2 — side-car services gated by
profiles: [full]were silently skipped on default deploys. Both stanzas dropped (Changed above); the scheduler-v2 cron is the sole driver. - Defect 1 —
/corporate-memoryfilteredstatus IN ('approved','mandatory')with no hint that pending items existed. Admin banner added (Added above). - #179 review —
/api/admin/run-session-collectorwould SystemExit the worker. The endpoint calledcollector.main(), whoseargparse.parse_args()parsed uvicorn'ssys.argv(['app.main:app', '--host', …]) and calledsys.exit(2)on the unrecognised flags.SystemExitinherits fromBaseException, escapes FastAPI's exception machinery, and propagates through the thread pool — every scheduler tick that fired the endpoint either 500-ed or risked killing the uvicorn worker. Fix:services/session_collector/collector.pynow exposes an argv-freerun(dry_run, verbose) -> (rc, stats)helper;main()is a thin CLI shim around it and the admin endpoint callsrun()directly. Audit log now carries the per-run stats (users_processed,files_copied,files_skipped) instead of just the rc. Regression tests intests/test_session_collector.py::TestRunHelper. - #179 review —
python -m services.corporate_memorycrashed on missing LLM config instead of exiting cleanly. The PR's fail-fast change madecollect_all()raiseValueErrorwhen neither anai:block norANTHROPIC_API_KEY/LLM_API_KEYwas available. Theverification_detectorCLI was updated to catch it; the corporate-memory CLI was missed. Now also wrapped — operators get a one-lineCorporate Memory cannot run: <factory message>on stderr and rc=1 instead of a raw traceback. Regression test intests/test_llm_connector.py::TestCorporateMemoryCollector::test_main_returns_1_on_no_ai_config_instead_of_traceback. - E2E test — Anthropic API rejected every extraction request. The structured-output API now requires
additionalProperties: falseon every{"type": "object"}node in the json_schema; without it the API returns 400invalid_request_error("output_config.format.schema: For 'object' type, 'additionalProperties' must be explicitly set to false"). Surfaced on a real BQ-backed deploy: every uploaded session jsonl failed verification-extraction in a tight retry loop. Fix:connectors/llm/anthropic_provider.pynow wraps the caller-supplied schema through a recursive_strict_json_schema()walker that adds the field where missing (preserving any explicit operator override), then passes the strict variant to the API. Six unit tests intests/test_llm_connector.py::TestStrictJsonSchemapin the recursion across nested objects, array items, and the no-mutation invariant. - #179 review —
/api/admin/run-verification-detectorskipped audit on unhandled exceptions. Ifdetector.run()threw anything other than the already-translatedValueError(DuckDB lock, network blip, unexpected SDK error), the audit_log row was never written — the operator's only signal wasdocker logs agnes-scheduler-1. The endpoint now wrapsdetector.runin try/except, records the exception inaudit_params["unhandled_error"], then re-raises as 500 after audit. The/admin/scheduler-runspage surfaces the failure row with the error type and message. - #179 review —
SCHEDULER_AUDIT_ACTIONSlisted action strings that don't actually appear inaudit_log. The list atapp/web/router.py:952had"marketplaces_sync_all"(wrong — actual is"marketplace.sync_all") plus"data_refresh"and"scripts_run_due"(whichapp/api/sync.pyandapp/api/scripts.pydon't write). Corrected to the four actually-logged strings, with a comment pointing at the missing audit calls in sync/scripts as a follow-up. - #179 review —
/api/admin/run-corporate-memoryskipped audit on unhandled exceptions (same gap asrun_verification_detectorfrom the previous round). Mirrored the same try/except +unhandled_erroraudit pattern, so a DuckDB lock or unexpected SDK error fromcollect_all()now produces an audit row with the error type+message before re-raising as 500. Regression test intests/test_admin_run_endpoints.py::TestRunCorporateMemory::test_unhandled_exception_still_audits. - #179 review —
/api/admin/run-session-collectorskipped audit on unhandled exceptions (third occurrence of the same pattern, completes the trilogy of LLM-pipeline endpoints). Mirrored the same try/except +unhandled_erroraudit pattern from the other two endpoints, so aPermissionErrorwalking/home, anOSErroron/data/user_sessionsmkdir, or any other unhandled exception fromcollector.run()now produces an audit row before re-raising as 500. Regression test intests/test_admin_run_endpoints.py::TestRunSessionCollector::test_unhandled_exception_still_audits. - #179 review —
/profile/sessions500-ed on transientstat()failure. The previous implementation usedsorted(glob, key=lambda p: p.stat().st_mtime); if any single jsonl file's stat call raised (race with delete, EACCES from a remount, etc.), the whole sort raised and the page returned 500 instead of just dropping that one row. Reworked the gather: stat each path under try/except into a(path, stat)list, then sort the already-statted entries. Bad files are silently dropped from the listing. Regression test intests/test_web_ui.py::TestAdminRoleGuards::test_profile_sessions_page_tolerates_stat_failures.
Added
/admin/scheduler-runs— read-only admin page showing the last 200 audit-log entries from scheduler-driven actions (run_session_collector,run_verification_detector,run_corporate_memory,marketplace.sync_all). NewAuditRepository.query_actions(actions, limit)query helper, new admin nav entry under the Admin dropdown.data-refresh(POST /api/sync/trigger) andscript-runner(POST /api/scripts/run-due) are scheduler jobs but don't write toaudit_logtoday, so they can't appear here yet. Failed scheduler ticks (HTTP 401, network errors) don't reach the audit_log either — those still live only indocker logs agnes-scheduler-1; the page calls that out with a hint to setSCHEDULER_API_TOKENif no rows show up./profile/sessions— self-service user page in the user menu, showing all session jsonls the caller uploaded viaagnes pushjoined againstsession_extraction_state. Each row shows uploaded_at, file size, status badge (pending/processed/extracted), processed_at,items_extracted, and a per-row Download button. The page docstring explicitly calls out thatitems_extracted = 0means the verification detector ran successfully but the LLM found no claims worth tracking — that's the documented "no items" outcome, not a broken pipeline. Closes the gap surfaced during the e2e test of #176 where a user could see their sessions on disk and process them through the LLM but had no UI to inspect what happened.GET /profile/sessions/<filename>— owner-only download of a single jsonl. Auth viaget_current_user; path safety locks the served file under${DATA_DIR}/user_sessions/<caller.id>/and rejects path-traversal / nested-component / non-.jsonl/ dotfile filenames with 404 (never 403, so existence of files belonging to other users is not leaked).Content-Disposition: attachmentreturns the file as a download.
Internal
tests/test_packaging.py— guards againstanthropic/openaislipping back into dev extras.tests/test_setup_ai_block.py— overlay seeding contract forPOST /api/admin/configure.tests/test_llm_provider_env_fallback.py— env fallback + fail-fast forcreate_extractor_from_env_or_config.tests/test_admin_run_endpoints.py— admin gating + scheduler registration + endpoint contract for the three new run-* endpoints.tests/test_docker_compose.py— pins the compose contract: the two side-car services must not reappear under either Compose file.tests/test_corporate_memory_page.py— pending-banner contract (admin sees, non-admin doesn't).tests/test_health_session_pipeline.py— session-pipeline staleness check across cold-start + ok + warning + never-processed cases.tests/test_instance_config_overlay.py— pins overlay env-ref resolution + the three LLM consumers reading fromapp.instance_config(#179 review).tests/test_scheduler.py—TestLLMPipelineCadenceEnvVars+TestVerificationDetectorGraceFollowsCadencepin the new env-var-driven cadences and the single-source-of-truth contract between scheduler and health-check grace (#179 review).docs/architecture.md— Services table updated to reflect the scheduler-v2 cadence map.
[0.34.0] — 2026-05-04
End-to-end clean-analyst-bootstrap rewrite. The web /setup page now produces a single unified paste prompt that, dropped into Claude Code in an empty folder, fully bootstraps a workspace — installs the CLI, authenticates, fetches CLAUDE.md, installs SessionStart/End hooks, runs the first data refresh, and writes a human-readable workspace docs file (AGNES_WORKSPACE.md). The admin-vs-analyst layout split (introduced as ?role= mid-cycle) was collapsed before merge: every caller sees the same flow, with the marketplace + plugins block emitted iff the caller has plugin grants. 26 implementation tasks across 6 phases plus a 10-task unification follow-up.
Changed
- BREAKING CLI binary renamed from
datoagnes. No backward-compat alias is shipped. Update shell aliases, hook commands in any pre-existing.claude/settings.json, scripts, and cron jobs. Reinstall viauv tool install <wheel>; the wheel now ships anagnesentry point. - BREAKING Environment variables and config dir renamed:
DA_CONFIG_DIR/DA_SERVER/DA_NO_UPDATE_CHECK/DA_LOCAL_DIR/DA_TOKEN/DA_STREAM_RETRIES→AGNES_*;~/.config/da/→~/.config/agnes/. Hard cutover, no fallback. Existing analysts re-authenticate viaagnes auth import-token. - BREAKING Analyst bootstrap rewritten end-to-end.
da analyst setupis removed; replaced byagnes init(non-interactive, requires--server-urland--token).da syncis split intoagnes pull(refresh) andagnes push(upload).da fetchis folded intoagnes snapshot create.da metrics list/showis folded intoagnes catalog --metrics;da metrics import/export/validatemove toagnes admin metrics {import,export,validate}. Theda analystnamespace is removed; the workspace status command is nowagnes status. The previousda status(server-health overview) becomesagnes diagnose system. - BREAKING Workspace layout simplified. Removed:
data/parquet/,data/duckdb/,data/metadata/,user/artifacts/. Canonical paths:server/parquet/(synced parquets),user/duckdb/analytics.duckdb(DuckDB views),user/snapshots/(ad-hoc snapshots),user/sessions/(recorded sessions). Lazy-mkdir contract — no empty pre-allocated directories. - BREAKING
/setupis now a single unified flow regardless of caller's role. The?role=query parameter (introduced earlier in this Unreleased cycle but never released) is removed before merge — no migration needed. The admin tile is gone. PAT scope is uniform: every install-page mint usesscope=generalwithexpires_in_days=90, calling the existingPOST /auth/tokensendpoint. Thebootstrap-analyst1 h-clamped scope is no longer used from/setup(still defined in code for future reuse, see open issue for redesign). The marketplace + plugins block is emitted iff the caller has plugin grants inresource_grants.agnes initis now part of every setup flow (admin and analyst alike) — it's the workspace-rails delivery mechanism./installcontinues to 302 to/setup. CLAUDE.mdserver-side template + repo-rootCLAUDE.mdupdated to reference the new CLI verbs and workspace paths. The admin UI for theclaude_md_templateDB override (/admin/workspace-prompt) renders a yellow banner when the saved override contains legacy strings (data/parquet/,da sync,da fetch,da analyst setup,da metrics list/show); admins re-author and save to clear it. Migration is manual.
Added
agnes init <opts>— non-interactive workspace bootstrap orchestrator. 8 steps: detect existing workspace, verify PAT (GET /api/catalog/tables), save config + token globally, fetchCLAUDE.mdfrom/api/welcome, install SessionStart/End hooks viacli/lib/hooks.py:install_claude_hooks, writeCLAUDE.local.mdstub (preserved on--force), run firstagnes pull, writeAGNES_WORKSPACE.md. Errors render viacli/error_render.py:render_error()with typed kinds (auth_failed,server_unreachable,partial_state,manifest_unauthorized).agnes pull/agnes push— split from the oldda sync/da sync --upload-only.--quiet/--json/--dry-runflags. SessionStart hook runsagnes pull --quiet; SessionEnd hook runsagnes push --quiet.agnes snapshot create <table>— folded fromda fetch. Addsif not local_db.exists()guard soagnes snapshot createno longer silently materializes an empty DuckDB file when run before anyagnes pull.agnes catalog --metrics(replacesda metrics list) andagnes catalog --metrics --show <id>(replacesda metrics show).agnes admin metrics {import,export,validate}— write paths relocated from the deletedda metricsnamespace.agnes diagnose system— server-side health check (was the oldda status).AGNES_WORKSPACE.md— human-readable workspace docs file generated byagnes initin the workspace root. Documents global install, workspace layout, hooks, cheat sheet, uninstall recipe.- PAT request body now accepts
scope: str = "general"andttl_seconds: int | None = Nonefields. PATs minted withscope="bootstrap-analyst"are TTL-clamped to ≤ 1 h server-side. Existingexpires_in_daysfield continues to work;ttl_secondswins when both are set.ttl_secondsupper bound is 315_360_000 (matchesexpires_in_days <= 3650cap). JWT carries thescopeclaim via newextra_claimsparameter oncreate_access_token; reserved keys (sub/email/typ/iat/jti/exp) cannot be overridden viaextra_claims. Audit log includes the scope. cli/lib/shared-library tree withcli/lib/pull.py:run_pull(data-refresh primitive callable from both the Typer wrapper andagnes init) andcli/lib/hooks.py:install_claude_hooks(workspace-scoped, idempotent Claude Code hook installer)._scan_legacy_stringshelper +legacy_strings_detectedfield onGET /api/admin/workspace-prompt-template— server scans saved CLAUDE.md overrides for stale CLI verbs / paths; the admin UI banner consumes the field./setuppre-flight check (step 4, gated on the marketplace block being present) now verifiesclaude --versionin addition togit --version. Both binaries are needed byclaude plugin marketplace addand the git-clone fallback — checking them together surfaces a clear "install X" message instead of a confusing downstream error. Install hints:npm i -g @anthropic-ai/claude-codefor Linux/WSL plus a doc URL (https://docs.claude.com/claude-code) for macOS / Windows native installers.
Fixed
agnes pull(formerlyda sync) no longer creates.claude/rules/when the corporate-memory bundle is empty.agnes pullno longer createsserver/parquet/when the manifest is empty (mkdir is lazy — only on first per-table write).agnes snapshot create(formerlyda fetch) no longer materializes an emptyuser/duckdb/analytics.duckdbwhen run before anyagnes pull. Friendly hint redirects toagnes pull.- Workspace
agnes statusreads from the canonicalserver/parquet/anduser/duckdb/analytics.duckdbpaths (was reading legacydata/parquet/,data/metadata/last_sync.json). agnes initandagnes pullerrors now use thecli/error_render.pytyped-error renderer (added in 0.32.0), so analyst-facing error UX matches the structured shapeagnes query --remotealready produces.- Schema v24 migration retry path is no longer dead (Devin Review on
db.py:1757, escalated from advisory to critical on rescan). Pre-fix: when_v23_to_v24_finalizehad materialized BQ rows to migrate butdata_source.bigquery.projectwas not configured, it logged a warning per row and returned normally. The schema_version then bumped to 24 unconditionally, theif current < 24:gate in_ensure_schemaskipped the function on every subsequent startup, and the affected rows kept their DuckDB-flavorbq."ds"."tbl"source_query forever — which the new_wrap_admin_sql_for_jobs_apiwrapping path rejects as unparseable BQ SQL with no automatic recovery. The "set the project and restart to retry" log hint pointed at a code path that no longer ran. Fix: the migration now raisesRuntimeErrorBEFORE the schema_version bump when it has rows to migrate but no project_id, blocking startup with a clear actionable error pointing atdata_source.bigquery.project. Operator configures the project, restarts, and the migration completes (schema_version is still at 23, so theif current < 24:gate fires). Side effect: a BQ-using deployment that hasn't set the project blocks startup until they do — that's the right call for a config error that would otherwise silently break materialized tables. Two regression tests intest_schema_v24_source_query_rewrite.py:test_v24_raises_when_project_not_configured_and_rows_need_migration(raise + version-stays-at-23) andtest_v24_skips_clean_when_no_rows_match_even_without_project(no-rows-no-block invariant). agnes admin register-tableUX: three real-world feedback items addressed.--query-mode materializednow requires--bucket(client-side validation; exits with a clear error before hitting the server). The previous help docstring claimed--bucketwas ignored for materialized rows, but the value is actually load-bearing —agnes schema <name>builds the BQ identifier asbq.<bucket>.<source_table>, so an empty bucket registered the row but broke subsequent schema/describe with HTTP 400 "unsafe BQ identifier in registry". Docstring rewritten to reflect reality.- Post-success hints: after a successful registration the CLI now points operators at the two follow-ups they routinely miss: (a)
agnes setup first-syncto materialize the parquet (registration alone doesn't trigger a build;agnes pullreports "Updated 0 tables" until the scheduler tick), and (b)agnes admin grant create <group> table <name>to make the row visible inagnes catalogfor non-admin users (catalog is RBAC-filtered). - Test coverage:
tests/test_cli_admin_materialized.py::test_register_materialized_without_bucket_fails_with_clear_errorandtest_register_table_emits_first_sync_and_grant_hints.
agnes query --remoteSQL rewriter no longer corrupts output when the GCP project ID contains a registered table name as a hyphen-delimited word (Devin Review onquery.py:464). The previous iterative rewrite (onere.sub(\b<name>\b, ...)per registered name) was vulnerable to cross-contamination: e.g. projectmy-ue-project+ registeredorders+ registeredue→ iter 1 rewritesordersto\my-ue-project.fin.orders`, iter 2's\bue\bthen matches theueINSIDEmy-ue-projectand corrupts the iter-1 path. Fix: replaced the iteration with a SINGLEre.subwhose alternation regex (sorted longest-first) handles every name in one pass, so freshly-inserted backticked text isn't re-scanned. The fallback atquery.py:576(per-table SELECT * on BQ parse error) caught the corrupted output asbq_bad_requestso impact was over-estimation rather than fail-open, but the partition-pruning benefit of #171 is now preserved for projects whose IDs share a hyphen-segment with a registered table name. Regression test intests/test_api_query_guardrail.py::test_rewrite_helper_does_not_corrupt_when_project_id_contains_registered_name`.- BigQuery materialize TTL reclaim is no longer dead code (Devin Review on
extractor.py:166)._try_acquire_file_lockused to callopen(lock_path, mode="w")BEFORE checking the lock-file mtime, which truncated the file and refreshed mtime to now on every invocation. The subsequenttime.time() - lock_path.stat().st_mtimealways saw age ~0, soage > TTLnever fired, andmaterialize.lock_ttl_secondswas a silently no-op config knob. Fix: stat the lock path BEFORE anyopen()to read the real pre-probe mtime; if older than TTL, unlink (forcing a fresh inode for the nextopen + flock); only then probe. Two regression tests added:test_stale_held_lock_is_reclaimed_despite_live_holderexercises the full reclaim path with a still-living fcntl holder,test_failed_probe_does_not_self_refresh_lock_mtimepins that a failed acquisition doesn't pathologically loop. Residual cross-process risk (a genuinely overrunning materialize past TTL races a fresh attempt) is documented in the helper docstring; in-processthreading.Lockkeyed ontable_idblocks the single-process race. agnes init --token Xnow correctly uses the explicit token in the verify call, even when~/.config/agnes/token.jsonalready holds a stale token from a prior install. Pre-fixcli.config.get_token()read the on-disk file first and only fell back to env vars, so step 2 (PAT-verify) ran with the stale token and failed with a confusing 401 — even though the--tokenarg was valid (Devin Review oninit.py:99). Fix: aContextVar-based override incli.configshort-circuitsget_token()before the file read;_override_server_env(used by bothagnes initandagnes pull'srun_pull) sets it for the duration of the call. Async-safe (each task sees its own override) and leak-proof (resets on context exit).agnes statussessions counter now reads the same source asagnes push—~/.claude/projects/<encoded-cwd>/(Claude Code's actual write path) with the legacy<workspace>/user/sessions/as a fallback, viacli.lib.claude_sessions.list_session_files(). Pre-fix the counter only checked the legacy dir and always reported 0 in workspaces bootstrapped withagnes init(since Claude Code never writes there).- BigQuery materialize lock-reclaim docstring at
connectors/bigquery/extractor.py:_try_acquire_file_lockcorrected: a still-running holder'sfcntl.flockdoes NOT block the post-unlink reacquisition (new file = new inode = independent lock). The in-processthreading.Lockkeyed ontable_idis the actual concurrency guard; cross-process protection (two schedulers on one workspace) relies on operators not running multiple concurrent schedulers AND on the TTL being well above the longest plausible COPY (24 h default). Documenting the residual risk so it isn't masked by a misleading "we're safe" comment (Devin Review on extractor.py:111). agnes pullnow re-downloads parquets when the local file is missing, even if the recorded hash matches the server. Pre-fix the download set was computed fromsync_state.jsonhash equality alone — if the parquet had been deleted (manualrm, disk cleanup, a different workspace sharing the same global~/.config/agnes/sync_state.jsonwriting one workspace's parquets while another reads sync_state and assumes "I already have these"), the hash-equal check would short-circuit the download and the next DuckDB view rebuild would fail on a missing file. Now the existence check on<workspace>/server/parquet/<tid>.parquetruns alongside the hash compare; missing file → forced re-download regardless of hash.agnes query --remoteno longer over-rejects narrow queries on partitioned/clustered BigQuery tables. Closes #171. Pre-fix the/api/querycost guardrail dry-ran a syntheticSELECT * FROM <table>per registered remote-BQ row referenced by the user SQL, which forced BQ to estimate "full table scan" — column projection, predicate pushdown, and partition pruning were all ignored, producing scan-byte estimates up to ~30,000× larger than the actual query would scan. Narrow queries on big partitioned tables (the documented happy-path use case) were rejected with 400remote_scan_too_largeeven when BQ's own dry-run reported single-digit MB. Now the guardrail rewrites the user SQL from DuckDB-flavor (bare registered names +bq."<ds>"."<tbl>") to BQ-native (`<project>.<ds>.<tbl>`) and runs ONE dry-run on the EXACT user SQL — partition pruning, column projection, and predicate pushdown all engage. Cap check uses the real estimate. Fallback: if BQ rejects the rewritten SQL withbq_bad_request(DuckDB-only syntax that doesn't translate, e.g.::INTcasts), the guardrail falls back to the pre-fix per-table SELECT * estimate so a non-portable query still gets bounded; non-parse errors (forbidden / upstream) propagate as 502. Helpers exported as_rewrite_user_sql_for_bq_dry_run(test seam).- Windows:
agnesCLI no longer crashes on cs-CZ / non-UTF-8 consoles. Two failure modes addressed (originally reported in #172 against the pre-renamedaCLI; ported and broadened here): (1)agnes pulland any other Rich-progress-bar codepath crashed withUnicodeEncodeErrorbecause cp1250 / cp1252 cannot encode Rich's Braille spinner glyphs —cli/main.pynow reconfiguressys.stdout/sys.stderrto UTF-8 witherrors="replace"at import time whensys.platform == "win32". (2)agnes skills listandagnes skills showcrashed withUnicodeDecodeErrorreading skill markdown that contains em-dashes / accents — everyPath.read_text()/Path.write_text()/open()call site incli/(including ones not touched by #172, since several files were renamed in the bootstrap rewrite) now passesencoding="utf-8"explicitly. Defensive: also covers JSON / YAML config files that were ASCII-only in practice but were one non-ASCII value away from the same failure mode. agnes snapshot create … --estimatein a pre-init directory no longer leaks an httpxConnectErrortraceback to stderr. The estimate-guard fix (3d587681) let--estimatereachapi_post_json, but the existingexcept V2ClientErrorclause didn't catch transport-layer errors when no server was configured (defaulted tohttp://localhost:8000). Now also catcheshttpx.HTTPErrorand renders the friendly hintRun \agnes init …` first`.agnes pushnow reads Claude Code session jsonls from~/.claude/projects/<encoded-cwd>/(where Claude Code actually writes them), instead of<workspace>/user/sessions/(which the SessionEnd hook never populated — the previous code uploaded an empty list every time). Encoding logic incli/lib/claude_sessions.pyprobes both Claude Code variants — older/→-and newer all-non-alphanumeric→-— and unions the result, so users who have upgraded Claude Code mid-project see sessions from both encoded dirs. Falls back to<workspace>/user/sessions/for back-compat.
Removed
da analyst setup,da analyst status,da sync,da fetch,da metrics. See Changed for replacements.da metricsnamespace as a top-level group (subcommands moved toagnes catalog --metricsfor read-only views andagnes admin metrics …for write operations).- Legacy workspace directories
data/parquet/,data/duckdb/,data/metadata/,user/artifacts/. Existing analyst workspaces should be reinitialized withagnes init --server-url ... --token ... --force(a fresh empty folder is recommended). _resolve_analyst_lines,_analyst_init_lines,_analyst_finale_lineshelpers inapp/web/setup_instructions.py— the analyst-vs-admin layout split is gone.roleparameter oncompute_default_agent_prompt,resolve_lines, andrender_setup_instructions.?role=query parameter on/setup. Admin tile (<nav class="role-tiles">) andROLEJS const + role-aware PAT-mint ternary ininstall.html.
Internal
cli/lib/__init__.py(empty) makescli/lib/a proper package picked up by Hatchling for wheel inclusion..gitignoreallowlistscli/lib/from the genericlib/rule.tests/fixtures/analyst_bootstrap.py— reusable test fixtures (fastapi_test_server,web_session,test_pat,test_pat_no_grants,zero_grants_workspace,NONEXISTENT_TABLE) for clean-install verification.tests/test_reader_smoke_matrix.py— load-bearing parametrized test: every reader CLI command runs on a freshly-bootstrapped zero-grants workspace without a Python traceback.tests/test_clean_install_integration.py— end-to-end happy-path tests (minimal grants, zero grants, force preserves CLAUDE.local.md, readers in pre-init dir).docs/RELEASE_CHECKLIST.md— manual clean-install protocol mandated for any PR touching the bootstrap path.- Audited and replaced stale
daverbs left over from prior merges in admin UI text, audit-log messages, code comments, operator runbooks, analyst-facing skill docs, and test docstrings (welcome template renderer/API tests now assert exact emitted markers —agnes initfor analyst flow,agnes authfor admin flow — with explicit absence checks on legacy verbs). Vendor-specific/opt/data-analyst/install paths in jira backfill/consistency scripts and operator docs replaced with<install-dir>/and anAGNES_ENV_FILEenv-var override. Intentional stale-marker tuples (_LEGACY_STRINGSinapp/api/claude_md.py,_OUR_COMMAND_MARKERSincli/lib/hooks.py) and tests that seed legacy hook content (tests/test_lib_hooks.py,tests/test_legacy_strings_scan.py) are preserved by design.
[0.33.0] — 2026-05-04
Closes #162. Headline fix: query_mode='materialized' BigQuery rows now
materialize correctly for views and materialized views, with per-table
concurrency control preventing parquet corruption on overlapping scheduler
ticks. Plus a source_query server-generation convenience, a
materialize.lock_ttl_seconds config knob, and a schema v24 migration that
converts existing DuckDB-flavor source_query values to BQ-native SQL.
Fixed
- BigQuery materialize now works for views and materialized views. Pre-fix,
materialize_queryran admin'ssource_queryasCOPY (sql) TO parquetthrough the DuckDB BigQuery extension session, which routed through the BQ Storage Read API forbq."<ds>"."<tbl>"references. Storage Read API rejects non-base entities (Binder Error: Error while creating read session: ... non-table entities cannot be read with the storage API). Fixed by always wrapping admin SQL intobigquery_query('<billing-project>', '<inner-sql>')so COPY uses the BQ jobs API uniformly for tables, views, and materialized views. materialize_queryno longer corrupts its parquet under concurrent invocations for the sametable_id. Pre-fix, two overlapping_run_materialized_passcalls (e.g. a long-running COPY + the next scheduler tick) both hit the unconditionalif tmp_path.exists(): tmp_path.unlink()at function entry and started parallel COPYs against the same path, interleaving bytes and producing a parquet file with no valid footer. Now each call acquires a per-table_idthreading.Lockplus an advisoryfcntl.flockon<id>.parquet.lock; the second caller raisesMaterializeInFlightErrorand the scheduler treats it asskipped, in_flight— never as an error.- Cost guardrail dry-run now engages for materialized rows. Pre-fix, the
BigQuery Python client returned 400 (
Table-valued function not found: bigquery_query) on the wrapped SQL and the dry-run silently fail-opened. The dry-run now operates on the inner BQ-native SQL (admin'ssource_querydirectly), which the client parses cleanly.
Changed
- BREAKING
query_mode='materialized'rows MUST registersource_queryas BigQuery-native SQL (backticks for dashed identifiers, native joins/CTEs). DuckDB-flavor (bq."<ds>"."<tbl>") is no longer accepted on register/PUT. The schema v24 migration converts existing rows automatically; operators with custom-writtensource_queryshould review the migrated form on first deploy. The validator's prior backtick-rejection rule is now scoped toquery_mode IN ('remote', 'local')only. _run_materialized_passsummaryskippedfield changes fromlist[str]tolist[dict]with shape{"table": str, "reason": Literal["due_check", "in_flight"]}. Downstream consumers that asserted the old string form must update.
Added
POST /api/admin/register-tableforquery_mode='materialized'rows withbucket+source_tablebut nosource_querynow server-generatesSELECT * FROM `<project>.<bucket>.<source_table>`from the configured BigQuery project. The same fallback fires onPUT /api/admin/registry/{id}when flipping to materialized. Operators only need to knowbigquery_query()semantics for non-trivial queries.- New top-level
materializeconfig section ininstance.yaml. Single field —materialize.lock_ttl_seconds(default86400, 24 h) — controls how long a stale<id>.parquet.lockfile lives before a sibling materialize attempt reclaims it. Editable via/admin/server-configAPI and UI.
Internal
- Schema v24 migration: rewrites
table_registry.source_queryfor materialized BigQuery rows from DuckDB-flavor (bq."<ds>"."<tbl>") to BQ-native (`<project>.<ds>.<tbl>`) using the configured BQ project. Idempotent on already-converted rows; logs a warning and skips when the project isn't configured (operator can configure + restart for retry). Wrapped inBEGIN TRANSACTION/COMMITto match the project's transactional-finalizer pattern. connectors/bigquery/extractor.pyexportsMaterializeInFlightErrorand the_get_table_lock/_get_lock_ttl_seconds/_wrap_admin_sql_for_jobs_api/_escape_sql_string_literalhelpers as test seams. Underscore-prefixed; not part of the public API.tests/conftest.pyliftsbq_instanceandstub_bq_extractorfixtures fromtests/test_api_admin_materialized.pyso subsequent test modules in this PR can resolve them via pytest's auto-discovery.app/api/sync.py:is_table_duehoisted to module-level import (was deferred inside_run_materialized_pass) so monkeypatchingapp.api.sync.is_table_dueactually intercepts the call — the deferred form made test patches a no-op.
[0.32.0] — 2026-05-04
Closes #160. Headline fix: da query --remote now resolves
query_mode='remote' BigQuery rows whose underlying entity is a VIEW
or MATERIALIZED_VIEW. Plus four reinforcing fixes that surfaced during
the work — server-side cost guardrail, registry-gating of direct bq.*
paths, function-call backdoor closed, structured CLI error rendering —
and one operator-side admin convenience (BQ test-connection endpoint +
billing_project placeholder UI). 14 issues caught + fixed across 6
iterations of Devin Review.
Added
/admin/server-configBQ test connection: admin-onlyPOST /api/admin/bigquery/test-connectionruns a 10s-timeoutSELECT 1against BigQuery via the process-cachedBqAccess(@functools.cacheonget_bq_access) and returns typed structured feedback (200 ok/400 not_configured/502 cross_project_forbidden/504 timeout). Tests the config active in the running process — after adata_source.bigquerysave the response shape includesrestart_required: True; click "Test connection" AFTER restart to validate the freshly-saved values. The /admin/server-config UI gets a "Test BigQuery connection" button next to the data_source Save button; on failure the inline result uses the same structured shape as the CLI renderer so operators see the same hint format admins do.data_source.bigquery.bq_max_scan_bytesserver-config knob (default 5 GiB): caps the BigQuery scan thatda query --remotewill issue againstquery_mode='remote'BQ rows. Exceeded queries are rejected with a structured400 remote_scan_too_largedetail naming the bytes, tables, and ada fetchsuggestion. Quota usage is recorded against the same daily byte cap as/api/v2/scan.data_source.bigquery.billing_projectplaceholder UI: the admin form now shows(defaults to <project>)greyed under an empty billing_project input, surfacing the access.py:339-340 fallback rule directly in the UI.
Fixed
da query --remoteagainstquery_mode='remote'BigQuery rows whose underlying entity is aVIEWorMATERIALIZED_VIEWnow resolves correctly (issue #160). The BQ extractor creates a master view via the catalog path (bq."<dataset>"."<source_table>") forBASE TABLE(Storage Read API; predicate pushdown) and viabigquery_query()forVIEW/MATERIALIZED_VIEW(jobs API). Other BQ entity types (EXTERNAL,SNAPSHOT,CLONE) are logged + skipped at extraction with no_metarow, so the orchestrator doesn't strand a registered name with a non-existent inner view.- Direct
bq."<dataset>"."<source_table>"references in/api/queryare now registry-gated: unregistered paths return 403bq_path_not_registered; registered paths are subject to the same per-name grant check as registered names. Closes a pre-existing RBAC bypass where direct catalog-path syntax skipped the master-view forbidden-table check entirely. Quoted catalog tokens ("bq"."ds"."tbl") are caught by the same regex. bigquery_query()direct calls in user SQL are now blocked by the/api/querykeyword blocklist. Closes a pre-existing function-call bypass that ran arbitrary BQ jobs API calls against any reachable dataset, ignoring the registry. Wrap views internal to the BQ extractor still usebigquery_query()inside theirCREATE VIEWbody — those run via DuckDB's view resolution at query time, never via user-submitted SQL, so the blocklist doesn't break them.- CLI commands (
da query --remote,da query --register-bq,da fetch,da schema, etc.) pretty-print structured BigQuery errors —cross_project_forbidden,bq_forbidden,auth_failed,not_configured,remote_scan_too_large,bq_path_not_registered, etc. — instead of dumping the truncated JSON body. The hint that explains how to fixUSER_PROJECT_DENIED(setdata_source.bigquery.billing_projectin /admin/server-config) is now actually visible to the operator. /api/query/hybridnow returns dictdetailfor typed errors (was flattening tof"BQ '{alias}': {error_type}: {message}"), so the new CLI renderer surfaces the structured shape consistently across both endpoints.
Changed
- BREAKING (config-only):
data_source.bigquery.legacy_wrap_viewsremoved. The flag was opt-in for the wrap-view behavior that is now the default. Keys still present in operator overlays are silently ignored — no action required. Operators who previously setlegacy_wrap_views: false(the prior default) get the new behavior for VIEW / MATERIALIZED_VIEW rows: a master view is created (via the BQ jobs API), andda query --remoteworks against the registered name. The cost concern that motivated the prior default is now addressed by the server-side guardrail (see Added). - Quota tracker relocated:
_build_quota_trackerand_quota_singletonmoved fromapp/api/v2_scan.pytoapp/api/v2_quota.py(their natural home).v2_scan.pyre-exports the function for backwards compat; existing test sites that callv2_scan._build_quota_tracker()keep working.
[0.31.0] — 2026-05-04
Added
- Agent Workspace Prompt — admin-editable Jinja2 markdown template for the analyst's
CLAUDE.md, surfaced in their workspace byda analyst setup. Default = rich briefing with RBAC-filtered tables/metrics/marketplaces context. Edit at/admin/workspace-prompt. Endpoints:GET /api/welcome(analyst-facing, auth required),GET/PUT/DELETE /api/admin/workspace-prompt-template,POST /api/admin/workspace-prompt-template/preview. CLI:da analyst setupwritesCLAUDE.mdby default; new--no-claude-mdflag opts out. Seedocs/agent-workspace-prompt.md. - Agent Setup Prompt — customizable bash setup script shown on
/setupand copied by the dashboard clipboard CTA. Default = the livesetup_instructions.resolve_lines()output (TLS trust bootstrap, CLI install, login, marketplace, skills). Admin override at/admin/agent-prompt— full replacement of the default, not a banner added on top. Override flows to both the/setuppage display and the dashboard clipboard payload. Jinja2 is available for{{ instance.name }}etc.;{server_url}and{token}are JS-substituted at clipboard-copy time and survive Jinja2 rendering unchanged. REST API:GET /api/admin/welcome-templatereturns{content, default, updated_at, updated_by}(contentisnullwhen no override is set;defaultis always the live computed script);PUTto set an override;DELETEto clear;POST /api/admin/welcome-template/previewfor live preview without persisting. Available Jinja2 placeholders:instance.{name,subtitle},server.{url,hostname},user(may benullfor anonymous visitors),now,today. Override content is HTML-sanitized post-render (script/iframe/event-handler strip). Seedocs/agent-setup-prompt.md. - DuckDB schema v21:
welcome_templatesingleton table backing the Agent Setup Prompt override. Auto-migration v20→v21 on first start. - DuckDB schema v22:
setup_bannertable reserved (no consumers; retained for forward compatibility with already-migrated instances). - DuckDB schema v23:
claude_md_templatesingleton table backing the Agent Workspace Prompt override. Auto-migration v22→v23.
Changed
da analyst setupwritesCLAUDE.mdto the analyst workspace from the server-rendered template (fetched viaGET /api/welcome). Use--no-claude-mdto opt out. Analysts who ran setup while CLAUDE.md generation was temporarily absent will have their file written on the nextda analyst setuprun./installpage renamed to/setup("Setup local agent" nav label) with 302 redirect from/install.- Dashboard "What Claude Code will receive" inline preview replaced with a link to
/setupfor the canonical view.
Fixed
da analyst setupsummary now accurately reflects whetherCLAUDE.mdwas written, skipped (--no-claude-md), or skipped due to a server error — previously it always claimed "written from server template" even when the fetch failed (404, 401/403, network), contradicting its own stderr warning.
[0.30.1] — 2026-05-02
Security
- auth: per-IP rate limiting now applied across every credential-bearing
auth endpoint. Defaults:
- 10/minute —
POST /auth/token,POST /auth/password/login,POST /auth/password/login/web(login brute-force throttle). - 10/minute —
POST/GET /auth/email/verify,POST /auth/password/reset/confirm,POST /auth/password/setup/confirm,POST /auth/password/setup(JSON variant — without it, the form/setup/confirmthrottle is bypassable by switching to the JSON path) (token brute-force throttle: the 32-byte URL-safe tokens are high entropy but partial leaks via logs / proxy referer have surfaced before, and there's no reason to allow unbounded guessing). - 5/minute —
POST /auth/email/send-link,POST /auth/password/reset,POST /auth/password/setup/request(email-bombing throttle: same shape on all three — attacker rotates random recipient addresses from a single IP to burn SMTP/SendGrid quota and spam real users; anti-enumeration responses mask which addresses landed). - 3/minute —
POST /auth/bootstrap(one-shot in normal use). Returns429withRetry-After: 60once exceeded. Per-IP key uses the leftmostX-Forwarded-Forhop — same trust model asapp.auth.dependencies._client_ip(Caddy strips client-supplied XFF in front of the app). SetAGNES_AUTH_RATELIMIT_ENABLED=0in env and bounce the container to disable (no image rebuild required; the value is read at process start, matching every other Agnes env knob). New dependency:slowapi>=0.1.9. Closes #45.
- 10/minute —
- admin API:
DELETE /api/admin/users/{id}/memberships/{group_id}andDELETE /api/admin/groups/{group_id}/members/{user_id}now refuse to remove anyone from the seededAdmingroup when they are the only remaining active admin — previously the guard only fired on self-removal, leaving a path where an admin could demote the only other admin and then rely on the partial guard to (correctly) block self-removal, but a scheduler / bootstrap path that bypasses normal admin checks could still reduce active admins to zero. Recovery from zero admins requires direct DB access, so the guard generalizes to mirror the existingcount_admins(active_only=True) <= 1check onDELETE /api/admin/users/{id}andPATCH /api/admin/users/{id}(active=false). Closes #151.
Fixed
- admin API:
POST /api/admin/register-tableandPUT /api/admin/registry/{id}now rejectsource_querycontaining BigQuery-native backtick identifiers (e.g.`prj.ds.t`) with HTTP 422 and a message pointing operators at the DuckDB-flavor equivalent (bq."dataset"."table"). Backtick SQL would silently no-op at the next materialize tick — the BQ extension's COPY runs through DuckDB's parser, which doesn't recognize backticks, so the query either parse-errored or matched zero rows and no parquet ever landed at/data/extracts/<source>/data/<id>.parquet. Fix catches the bad SQL at registration time so the row never lands in the registry. - admin API:
DELETE /api/admin/registry/{id}now removes the canonical materialized parquet (${DATA_DIR}/extracts/<source_type>/data/<name>.parquetplus any stale.parquet.tmp) AND clears the matchingsync_state/sync_historyrows. Pre-fix the registry row was dropped but the parquet- sync_state row stayed, so
GET /api/sync/manifestkept advertising the dropped table toda syncand analysts kept downloading it. Defensive failure handling — file-removal errors are logged but don't fail the DELETE.
- sync_state row stayed, so
Added
- admin API:
GET /api/admin/registryenriches each table row withlast_sync_error(string or null) sourced fromsync_state.error. The scheduler's_run_materialized_passnow writes per-row failures viaSyncStateRepository.set_errorso cap-exceeded / auth-failure / bad-SQL errors surface to the admin UI andda admin statusinstead of vanishing into scheduler stderr. A row that recovers on the next tick clears the error automatically (the success path ofupdate_syncresetsstatus='ok'/error=NULLon the upsert). - admin API:
POST /api/admin/register-tablenow refuses requests whosesource_typeisn't actually configured on the instance — pre-fix, an admin could registersource_type='keboola'on a BQ-only instance and the row would land in the registry but never sync (no Keboola URL/token to ATTACH against). Returns 422 with a message naming the configured primary source and pointing at/admin/server-configfor enabling a secondary source.jira/localare exempt — they don't sit underdata_source.*. Omitted source_type still tolerated for legacy CLI callers. Stays permissive when primary is'local'(bootstrap workflow — instance not yet pointed at a real source). - query API:
POST /api/querynow returns a materialize-aware error when the failed SQL references a table id that's registered withquery_mode='materialized'but doesn't yet exist as a master view in this instance'sanalytics.duckdb(e.g. fresh instance, no scheduler tick yet). The hint names the table, points atda sync/POST /api/sync/trigger, and — when the registry row carries a bucket+source_table — surfaces the equivalent direct-source query (bq."dataset"."table"orkbc."bucket"."table") so the operator has a concrete next step. Falls back to DuckDB's raw error for non-materialized unknowns.
Internal
- tests: refresh
docker-e2ehealth asserts to match the current/api/healthshape (auth-free, returnsstatus+db_schemaonly).versionmoved to/api/versionin 0.10-era refactor; richerservices.duckdb_statelives in/api/health/detailed(auth-gated). Tests had drifted and broke nightly e2e on main.
[0.30.0] — 2026-05-01
Added
- admin UI: each row in
/admin/tableslistings now has a per-row Manage access icon button (between Edit and Delete) that deep-links to/admin/access#table:<table_id>. The grant editor reads the hash on load and pre-fills the resource filter so the operator lands on the picked table once they select a group — shortcut for the common "I just registered table X, who should see it?" workflow without manual navigation through the resource tree. - docs:
config/instance.yaml.exampledocuments every field newly exposed by/admin/server-config—data_source.bigquery.billing_project(with the USER_PROJECT_DENIED hint),data_source.bigquery.legacy_wrap_views,data_source.bigquery.max_bytes_per_materialize,ai.base_url,openmetadata.*,desktop.*, and the fullcorporate_memory.*block. Each cross-references the admin UI so operators discover the editor exists. - diagnostics:
/api/health/detailed(and thereforeda diagnose) now surfaces abq_configservice entry on BigQuery instances. Reportsstatus="warning"whendata_source.bigquery.billing_projectresolves equal todata_source.bigquery.project— the configuration where a service account withroles/bigquery.dataVieweron the data project but noserviceusage.services.use403s every BQ call with USER_PROJECT_DENIED. The warning includes a hint pointing at theinstance.yamlfield and the/admin/server-configUI. - admin UI:
/admin/server-configexposes the full corporate_memory governance schema in the editor —distribution_mode,approval_mode,review_period_months,notify_on_new_items, thesources/extraction/confidence/contradiction_detection/entity_resolutionnested objects, plus thedomain_owners/domainslists. The whole section is optional (omitted = legacy democratic-wiki mode); admins can opt in via the UI without hand-editing YAML. Schema mirrorsconfig/instance.yaml.examplelines 224-317.confidence.modifiers(map<string, map<string, float>>) currently renders as a JSON-textarea fallback with the schema explained inline — full structured editor is a TODO. - admin UI: server-config renderer learned three new shapes —
kind="array"with a scalaritem_kindrenders as a vertical stack of typed inputs with +/- row controls;kind="map"with scalarvalue_kindrenders as key:value rows with +/- controls;value_kind="array"inside a map renders the value column as a comma-separated list (pragmatic compromise over a full nested-array UI inside each map row). Leaf inputs now carrydata-path(JSON-encoded segment array) so map keys with embedded dots — e.g.confidence.base["user_verification.correction"]— survive round-trip without being mistaken for nested-path separators. - admin UI:
/admin/server-configrenders registry-declared nested fields (kind="object"with explicitfields) as a fully-editable structured form — every leaf is its own input with a dotted-pathdata-key, and the collector rebuilds a nested patch on save. Replaces the previous read-only preview that forced operators to edit a parent JSON textarea. YAML-only keys outside the registry survive via an "Other (YAML-only) keys" expander per nested layer. Recursion handles arbitrary depth, ready for the upcoming corporate_memory + admins registry entries. - admin UI:
/admin/server-confignow ships a known-fields registry (_KNOWN_FIELDSinapp/api/admin.py, exposed on the GET response asknown_fields). The renderer shows registry-declared knobs as dashed placeholders alongside populated values, with a one-line hint per field, so operators discover optional config (e.g.data_source.bigquery.billing_project) directly in the UI instead of having to read docs or hit a runtime error first. Subagents 2-4 will populate the bodies; the smoke fixture coversbigquery.billing_project. - admin UI:
/admin/server-confignow exposes three previously YAML-only BigQuery knobs in the editor —data_source.bigquery.billing_project,legacy_wrap_views, andmax_bytes_per_materialize. The GET response always includes them underdata_source.bigquery(with documented defaults when YAML omits them) so the JSON-textarea UI shows them as editable keys. The section help text describes each. Operators no longer need to SSH to the VM, edit YAML, restart to flip these. - admin UI:
/admin/tablesis now a per-connector tab interface (BigQuery / Keboola / Jira). Each tab has its own Register modal + listing scoped to its source_type. Active tab persists inwindow.location.hashso refresh keeps the operator in place. - Keboola materialized SQL:
query_mode='materialized'now works forsource_type='keboola'— admin registers a SELECT againstkbc."bucket"."table"and the scheduler writes the result to/data/extracts/keboola/data/<id>.parquet. Same flow as BigQuery materialized; sameda syncdistribution; same RBAC. Cost guardrail (BQ-style dry-run) intentionally omitted — Keboola extension has no dry-run analog and Storage API cost is download-byte-shaped, not scan-byte-shaped. A future PR can add a configurable byte cap if operators ask for it. - Keboola Sync Schedule: per-table cron input added to the Keboola
tab Register and Edit modals. The scheduler has always honored
per-table
sync_schedulefor every source viais_table_due(), but the Keboola UI had no surface for it — operators had to use the/api/admin/registry/{id}PUT endpoint orda adminCLI. Now they can typeevery 6h/daily 03:00directly. - BigQuery
query_mode='materialized'— admin registers a SQL query viada admin register-table --query-mode materialized --query @file.sql --sync-schedule "every 6h"; the sync trigger pass runs it through the DuckDB BigQuery extension via theBqAccessfacade on each tick that's due (per-tablesync_schedulehonored viais_table_due()) and writes the result to/data/extracts/bigquery/data/<name>.parquet. The manifest endpoint exposes the row toda sync, which distributes the parquet to analysts; analysts query it through their local DuckDB view. The server-side orchestrator does not create a master view for materialized tables — they are intentionally local-only for analyst distribution, mirroring the v2 fetch primitives' "queryable viada fetchnot via remote" contract. Per-user RBAC filtering is unchanged: a materialized table is just another row intable_registrywithresource_grantscontrolling which groups see it. - Schema v20 adds
source_query TEXTcolumn totable_registryto back the materialized mode. NULL for existing rows. Thematerialize_query()function in the BigQuery extractor performs the COPY atomically (<id>.parquet.tmp→os.replace) so a failed query never leaves a half-written parquet. - BigQuery cost guardrail for
query_mode='materialized'tables: before each COPY the scheduler runs a BQ dry-run (reusingapp.api.v2_scan._bq_dry_run_bytesso cost-estimate logic lives in exactly one place) and raisesMaterializeBudgetError(skips the row) when the estimate exceedsdata_source.bigquery.max_bytes_per_materialize. Default 10 GiB; explicit0disables (YAMLnullfalls through to the default — documented inconfig/instance.yaml.example). Fail-open when the dry-run itself errors (library missing, DuckDB three-part syntax the native BQ client can't parse, transient API failure) — logs a warning instead of blocking the COPY. - Admin API:
POST /api/admin/register-tableandPUT /api/admin/registry/{id}acceptsource_queryfield. Validator enforces thatquery_mode='materialized'requiressource_queryandquery_mode in ('local', 'remote')forbids it. PUT also rejectssource_queryset withoutquery_modein the same request body and clears the stalesource_querywhen switching the merged record away from materialized mode. - CLI:
da admin register-table --query <SQL>accepts inline SQL or@path/to.sqlshorthand for reading from disk. Reuses the existing--sync-scheduleflag for the cron string. da sync --quietflag suppresses Rich progress + multi-line summary, intended for use from Claude Code SessionStart/SessionEnd hooks and cron jobs. Errors still surface on stderr; the no-op case is silent. The terse summary line in--quietmode (sync: N tables, M errors) lands on stderr so stdout stays clean for hook callers.da analyst setupnow installsSessionStart(pull) andSessionEnd(upload) hooks into<workspace>/.claude/settings.json, idempotently, preserving any existing user-owned hooks. Workspace-level (not user-home) so the hooks fire only when Claude Code is opened in the analyst workspace, not in unrelated sessions on the same machine. Hooks assumedais onPATH. If the CLI is not installed system-wide (e.g. viapipxorpip install -e .), the hooks no-op silently — expected graceful degradation, never blocks a session.docs/setup/claude_settings.jsonships the same two hooks so operators bootstrapping a fresh Claude Code workspace get auto-sync out of the box.
Changed
- admin UI: Keboola Register and Edit modals adopt the same
two-question radio model as BigQuery — What to sync? (Whole table
/ Custom SQL). Whole-table mode synthesizes a
SELECT *and writes it through the materialized path; Custom mode lets the admin filter / aggregate / project. The legacyquery_mode='local'extractor path remains supported for back-compat but is no longer the default for new Keboola registrations — Whole mode is functionally equivalent and follows the unified materialized pipeline. - admin UI:
Sync Strategydropdown removed from the Keboola form (Register and Edit). Two independent agent reviews (2026-05-01) found the field's hint claimed it controlled extraction but no extractor reads it; onlyprofiler.is_partitioned()consumes it for parquet- layout detection. Field stays in the DB and Pydantic model for back-compat (markedField(deprecated=True)); just hidden from the primary form. - admin UI:
Primary Keyinput moved under<details>Advancedin both Keboola Register and Edit modals, with a clarifying hint that it's catalog metadata only — Agnes always does full-overwrite sync; no upsert / dedup. Auto-fill from Keboola discovery still works. - admin UI: Registry listing column "Strategy" replaced with "Mode"
(showing
query_modeinstead of decorativesync_strategy). The.col-strategy/.strategy-badgeCSS rules removed. - BigQuery
init_extractno longer creates remote views for rows withquery_mode='materialized'; those live as parquets and surface via the orchestrator's standard local-parquet discovery. Skipped rows do not appear in_metaso cross-source view-name collisions remain impossible.
Deprecated
RegisterTableRequest.sync_strategy— catalog/profiler metadata only; no extractor reads it. MarkedField(deprecated=True). External API consumers see the signal in OpenAPI; back-compat preserved.RegisterTableRequest.profile_after_sync— runtime never read this flag (Agent 1 finding 2026-05-01); profiler runs unconditionally on every synced table. MarkedField(deprecated=True)and made inert (the BQ register endpoint no longer force-sets it toFalse). Back-compat preserved — external clients sending the field get no error, no warning, no effect.
Fixed
- admin API:
update_tablePUT preservessync_strategyandprimary_keywhen the Edit modal omits them from the payload (this invariant always held viarequest.model_dump()+if v is not None, but Phase I now has an explicit regression-guard test). docs/setup/claude_settings.jsonno longer references the deletedserver/scripts/collect_session.py— the deadSessionEndhook had silently failed in every Claude Code session since the v1→v2 server purge. Replaced withda sync --upload-only --quiet.
Internal
- README mode-first source table; new "Local sync & auto-update" section
covering
da sync, hooks, and admin RBAC for auto-sync membership. CLAUDE.mdschema chain extended through v20 with thesource_querydescription; four source modes documented in Connector Pattern (added Materialized SQL); new "Local sync & Claude Code hooks" subsection under Development.cli/skills/connectors.md— "BigQuery: pick a mode" decision table with cost / guardrail / registration example.docs/architecture.md— new "BigQuery — Materialized SQL" subsection describing the COPY pipeline, BqAccess integration, and cost guardrail.- BQ cost guardrail dry-run is performed via the native
google-cloud-bigqueryclient (throughBqAccess.client()), which does not parse DuckDB three-part identifiers (bq."ds"."t"). Queries written in DuckDB syntax fall through fail-open and log a warning instead of engaging the cap. Operators who need the cap to be enforceable must register the materialized SQL using native BQ identifiers (\project.ds.t``). - Hardenings landed during devil's-advocate review of PR #145:
materialize_querycomputes the parquet MD5 inline (after COPY, beforeos.replace) instead of re-reading the file in_run_materialized_pass— saves a full sequential read on the request thread for multi-GB parquets.- 0-row materializations log a
WARNINGso an empty result set can't masquerade as "the SQL is fine, today there's nothing". - The ATTACH-tolerated
except duckdb.Error: passis narrowed to the "alias already attached" case; real errors (cross-project permission, malformed project_id) propagate so the per-row aggregator records them correctly instead of surfacing a confusing downstream "bq is not attached".
Known limitations
Operators should be aware of these production-only behaviours; tests cannot exercise them and they will be revisited in follow-up PRs:
- GCE metadata token expiry mid-COPY (catastrophic for very long
scans). The DuckDB BQ extension caches the token in a session
SECRET created at session-open. A
materialize_querycall that takes longer than the token's remaining lifetime (~1h) will see silent 401s downstream and may produce a truncated parquet. No current mitigation; if your materialized SQL scans more than ~30 GiB on a single COPY, run it via the BQ console / Storage Read API offline andda fetchthe result instead until token refresh is wired into the BQ extension's session. - DuckDB
bigquerycommunity extension is unpinned —INSTALL bigquery FROM community; LOAD bigquery;picks up the latest published version on every cold start. A breaking change upstream surfaces as a production failure with no test signal. - Schema drift after a SQL edit silently breaks analyst queries.
Editing
source_queryto drop a column writes a new parquet with the new shape; analysts' queries that referenced the dropped column 500 on the next sync without warning. No diff or version field surfaces this. Workaround: announce changes in the team channel before editing materialized SQL. materialize_queryis not concurrency-locked. Two concurrent/api/sync/triggercalls for the same materialized row race on<id>.parquet.tmp.init_extracthas_INIT_EXTRACT_LOCKfor the remote-attach path, but the materialized path does not yet. In practice: the cron scheduler is single-threaded and manual triggers are rare, so the race window is small.
[0.29.0] — 2026-05-01
Fixed
scripts/ops/agnes-tls-rotate.shself-signed fallback cert now setsbasicConstraints=critical,CA:FALSEon the leaf. OpenSSL's default[v3_ca]config marksCA:TRUEonreq -x509, which causes strict TLS stacks (rustls /webpki, used byuv,cargo, and future versions ofpip) to reject the cert withinvalid peer certificate: CaUsedAsEndEntityper RFC 5280 §4.2.1.9. Browsers, curl, and OpenSSL-based clients tolerated the violation, hiding the bug until auvuser hit it. Affects every VM running on the self-signed fallback while the corp PKI hasn't published the real chain yet — the fix lands on the nextagnes-tls-rotate.timertick (orsystemctl start agnes-tls-rotate.servicefor an immediate refresh). Existing CSR / real-cert paths unaffected; only the bring-up fallback regenerates.
[0.28.0] — 2026-05-01
Fixed
- Analyst CLAUDE.md template now documents BigQuery remote-query capability.
config/claude_md_template.txt(used byda analyst setup) had zero mention ofquery_mode: "remote",da fetch,da query --remote, or--register-bq— the AI analyst running in a freshly-bootstrapped workspace had no idea remote tables existed. Added a## Remote Queries (BigQuery)section covering: discovery viada catalog(now called out as canonical, withdata/metadata/schema.jsonflagged as local-only); the three query patterns (da fetchpreferred,da query --remotefor one-shots,da query --register-bqfor hybrid joins); permission boundary (BQ access via the agnes server's GCE service account, not personal creds — escalate permission errors to admin); cost awareness (every query bills the SA's project for bytes scanned,--select/--where/--estimatediscipline);da fetchestimate-first rules; BigQuery SQL flavor reminder; snapshot freshness ritual (da snapshot drop+ re-fetch when source data updates); concrete hybrid-query example with--register-bqjoining local + ad-hoc BQ; the unknown-table case (ad-hoc--register-bqor ask admin to register); and a cross-reference toda skills show agnes-data-queryingfor deeper guidance. Also clarifies that personal customizations belong in.claude/CLAUDE.local.md, not CLAUDE.md (which is regenerated byda analyst setup --forceand would lose edits). Closes #153.
Removed
- Legacy
docs/setup/claude_md_template.txtdeleted. 359-line stale template that documented the deprecated SSH-heredoc remote-query protocol (ssh data-analyst 'bash ~/server/scripts/remote_query.sh --stdin' < query.json). The active template lives atconfig/claude_md_template.txt; the docs/ copy was confusing references and at risk of being pulled into a workspace by a future refactor. No code references the deleted file (verified).
[0.27.0] — 2026-04-30
Removed
- BREAKING Table access fully migrated to per-group
resource_grants(ResourceType.TABLE). Existingdataset_permissionsrows are dropped on upgrade — admins must re-grant via/admin/access. Wildcard bucket grants (bucket.*) no longer supported and not replaced: every table needs an explicit grant (or admin override). Per-table bulk action in/admin/accesscovers a whole bucket at once. - BREAKING
table_registry.is_publiccolumn dropped. The bypass shortcut had no API/UI/CLI surface to set it (only direct DB UPDATE worked) so the legacy data-RBAC layer was de-facto inactive — every table was implicitly public. Post-upgrade non-admin users see zero tables until admin grants explicit access. Migrate by minting the relevantresource_grants(group, "table", id)rows in/admin/accessbefore deploy or immediately after. - BREAKING Self-service
access_requestsflow removed (table, repository,/api/access-requests/*endpoints, "Request Access" catalog modal). Users contact admin out-of-band; admin grants via/admin/access. - BREAKING Legacy
users.rolecolumn dropped (NULL artifact since v13). API contracts:CreateUserRequest.role,UpdateUserRequest.roleremoved;UserResponse.rolebecomes a derived"admin"|"user"label. CLI:da admin set-roleremoved (hard-fail with a replacement command),--roleflag removed fromda admin add-userandda auth import-token. JWTroleclaim removed from new tokens (existing tokens keep the claim, ignored on read). - BREAKING
/api/admin/permissionsendpoints removed (POST/DELETE/GET). Replaced by/api/admin/grants. Half-shipped/admin/permissionsadmin UI page removed (template, route). AGNES_ENABLE_TABLE_GRANTSenv-gate removed fromapp/resource_types.py—ResourceType.TABLEis now unconditionally enabled (the gate existed only because runtime enforcement still flowed through legacydataset_permissions).tests/test_permissions.py,tests/test_permissions_api.py,tests/test_access_requests_api.pydeleted (covered functionality removed).
Added
- Schema v19: drops
dataset_permissions,access_requeststables andusers.role,table_registry.is_publiccolumns. Implementation insrc/db.py:_v18_to_v19_finalizeuses the table-rebuild idiom (rename → create new → INSERT … SELECT → drop old) to work around DuckDB'sALTER TABLE DROP COLUMNlimitations on tables that have ever held FK constraints. The INSERT picks the intersection of the legacy and v19 column sets so test fixtures with hand-crafted minimal pre-v19 schemas migrate cleanly.
[0.26.0] — 2026-04-30
Changed
- BREAKING All host-side artifacts (compose files,
Caddyfile, host bash scripts) now ship in the docker image, not curled frommainat boot. The Dockerfile bakes them at/opt/agnes-host/and the customer-instance startup template extracts the whole directory viadocker create+docker cpfrom the sameimage_tagthe operator already pinned. Removes 5curls againstraw.githubusercontent.comfrom the customer template (docker-compose.yml,docker-compose.prod.yml,docker-compose.host-mount.yml,docker-compose.tls.yml,Caddyfile) plus theagnes-auto-upgrade.shcurl shipped in 0.25.0. The image also now shipsagnes-tls-rotate.sh+tls-fetch.shat/opt/agnes-host/so consumer-side deploy templates can adopt the same pattern. Replaces the curl-from-main pattern that decoupled host-side artifacts from the pinned image (split-brain — image atstable-2026.04.516, host artifacts floating on whatevermainwas when the VM last booted) and gave no rollback knob other than reverting upstream PRs globally. With everything baked in, host artifacts and app code are released together from one commit;image_tagcontrols all; rollback is one tag bump; egress simplifies to "private registry" only (no public-internet dependency on every boot). Drift prevention is preserved by construction — image and host artifacts CANNOT drift because they ship together. Operator action:image_tagMUST point to a tag from this release or later; older tags lack/opt/agnes-host/and the startupdocker cpwill fail-loud at first boot. Existing VMs are unaffected because the module setslifecycle { ignore_changes = [metadata_startup_script] }— only newly-created VMs run the new script. compose_refvariable on the customer-instance terraform module is deprecated — no longer used (compose files come fromimage_tagnow). Variable retained for one release cycle to avoid breaking existingterraform plans; will be removed in a future major bump. Pinimage_taginstead.
[0.25.0] — 2026-04-30
Fixed
scripts/ops/agnes-auto-upgrade.sh: fail-fast guard before anydocker composeaction — when the VM has a config disk attached (/dev/disk/by-id/google-config-diskexists),/data/stateMUST be backed by it. Three retry attempts with backoff, then exit non-zero. Prevents the silent regression where docker host-mount propagation unmounts the config disk and the app writes user state (DuckDB, marketplaces, session secret) onto/data(sdb) — wiped on the next container recreate. Re-appliesmount --make-rprivate /data /data/stateon every run to defend against propagation regressions.infra/modules/customer-instance/startup-script.sh.tpl: replaced the inline heredoc copy of the auto-upgrade script with acurlfromraw.githubusercontent.com/keboola/agnes-the-ai-analyst/main/scripts/ops/agnes-auto-upgrade.sh— single source of truth eliminates drift (the inline copy had fallen behind on TLS overlay detection, array-form compose files, and the new config-disk guard). VMs re-fetch on every boot, so script-only fixes propagate without an infra recreate. Also:docker-compose.tls.ymlis now fetched unconditionally (not only whentls_mode=caddy), because the canonical auto-upgrade script detects TLS at runtime via cert files on disk — certs can appear after boot viaagnes-tls-rotate.shor manual provisioning, and the cron job would otherwise fail every 5 min until the file was placed. Same reasoning extends toCaddyfile: fetched unconditionally now, plusagnes-auto-upgrade.shskips the tls overlay whenCaddyfileis missing/empty (defensive — without it the caddy service crash-loops while the overlay closes:8000, net effect "app unreachable").
[0.24.0] — 2026-04-30
Changed
-
Effective-access readout no longer short-circuits for admin users on
/admin/users/{id}and/profile. BothGET /api/admin/users/{id}/effective-accessandGET /api/me/effective-accesspreviously returnedis_admin=true, items=[]when the target was in the Admin group, and the UI rendered a flat "Full access via Admin" gold pill — which hid the underlying grant graph. Now both endpoints always run the JOIN, return the explicit per-resource breakdown, and surfaceis_adminonly as informational metadata on the response. The UI drops the special pill on both surfaces and renders the same per-resource table everyone else sees. Authorization at runtime still gives Admin god-mode regardless of this list (seeapp.auth.access.is_user_admin); this is purely an audit/debug surface for admins to see which Admin-group grants exist via which sibling groups. -
/profilegroup memberships use the same color-coded chip vocabulary as the rest of the admin surface. Each membership renders as a colored.group-chip(Admin yellow, Everyone gray, google_sync green, custom purple) with the same name-shortening rule (grp_acme_legal@workspace.example.com→Legal, full email on hover viatitle). The Status row in the Account card was removed — same admin signal already appears as the Admin chip in Group memberships, so the pill was redundant. Server-side: the/profileroute now projectsoriginanddisplay_nameper membership (computed via the shared_derive_originhelper + theAGNES_GOOGLE_GROUP_PREFIXstrip), so the Jinja template stays env-lookup-free. -
/admin/users/{id}polish: headerAdminpill removed, "Add to group" dropdown filters out google-managed groups, whole user-cell on the list page is one anchor. Header pill was redundant — the Group memberships section already shows the Admin chip with the canonical yellow color. The dropdown now skipsis_google_managedrows (bothcreated_by='system:google-sync'and the env-mapped Admin/Everyone) so admins don't see options the API would 409 on anyway. On/admin/usersthe avatar + name + email block became a single<a class="user-cell">linking to/admin/users/{id}so the entire info area lights up on hover, not just one line; the dedicatedDetailaction button stays for explicit affordance. -
/admin/users/{id}Group memberships table renders chips with the same color + name-shortening rules as the user list. The Group cell is now a<a class="group-chip">colored byis-admin(yellow) /is-everyone(gray) /is-google_sync(green) /is-custom(purple) and links through to/admin/groups/{group_id}. Google-sync chip text shortens viaderiveDisplayName(e.g.grp_acme_legal@workspace.example.com→Legal); raw email lives on the chip'stitleattribute. Powered by a neworiginfield onUserMembershipResponse(GET /api/admin/users/{id}/memberships), computed via the same_derive_originhelper the rest of the surface uses. -
/admin/usersmembership chips are color-coded by group origin and shorten Workspace-email names to a friendly form, so a row tells the same story as/admin/groupsat a glance. Colors: Admin → yellow, Everyone → gray, other google-synced groups → green, admin-created custom groups → purple. Name match (Admin/Everyone) takes precedence over origin so an env-mapped Admin/Everyone row (whose API origin isgoogle_sync) keeps its canonical color. The chip text for google_sync groups runs through the samederiveDisplayNamehelper used on/admin/groups:grp_acme_legal@workspace.example.comrenders asLegal(prefix stripped viaAGNES_GOOGLE_GROUP_PREFIX, capitalized), and the raw Workspace email goes into the chip'stitleattribute for hover reveal. Custom / Admin / Everyone chip text stays raw —deriveDisplayNamewould over-capitalize names likedata-team. To support this,GroupBriefonGET /api/usersnow carries the sameoriginfield as/api/admin/groups, computed via the shared_derive_originhelper. Replaces the v12-era 2-color layout (yellow Admin, gray for any other system row, blue for everything else, full email always shown) which gave no signal about whether a chip came from Workspace or a manual admin grant and overflowed the cell on long Workspace emails. -
/admin/accesssidebar + right-pane title now use the same group display rules as/admin/groups. Each sidebar row renders a multi-color origin pill (google_sync/system/custom) instead of the legacy yellow inlinesystemtag, and a monospace subtitle below the name showing the Workspace email when the row is wired to one (mapped_emailfor env-mapped Admin/Everyone, the rawnamefor user-created google-sync groups). The right-pane card head adopts the same treatment when a group is selected. To support this,GET /api/admin/access-overviewnow includesorigin,mapped_email,is_google_managed, andcreated_byper group — single source of truth shared with theGET /api/admin/groupsendpoint via the same helpers (_derive_origin,_mapped_email,_is_google_managed). -
GET /api/admin/groupsandGET /api/admin/access-overviewrename theoriginvalue"admin"→"custom". The label is named after the row's origin (admin-created via UI/CLI), not the creator's role, so the pill doesn't visually clash with the seededAdminsystem group's name. CSS class.origin-admin→.origin-custom; same purple swatch. No external consumers (CLI never reads the field). Pydantic default and JS fallbacks updated in lock-step. The previous workaround — a frontendoriginLabel()helper that mappedadmin → Customat render time — is gone now that the API value already reads correctly. -
/admin/groupsswitches the seeded Admin / Everyone rows to agoogle_syncchip and shows the Workspace email as a subtitle when env-mapped. Previously the mapped Admin row showedAdminas the big title withAdminrepeated as the subtitle (thederiveDisplayNamestrip-and-capitalize chain produced no useful output for a literal canonical name) and a yellowsystemchip — which buried the fact that membership is actually owned by Workspace. Now: whenAGNES_GROUP_ADMIN_EMAIL/AGNES_GROUP_EVERYONE_EMAILis configured,GET /api/admin/groupsreportsorigin='google_sync'for the matching seeded row (the system badge is suppressed; Workspace is the authoritative source of membership) and the newmapped_emailfield carries the configured Workspace email. The list view shows the canonical name as the big title with the Workspace email as a monospace subtitle (Admin / admins@workspace.test) and a greengoogle_syncchip. The/admin/groups/{id}detail header mirrors the same — name as<h1>,mapped_emailas thegd-title-emailsubtitle. Unmapped Admin / Everyone rows stayorigin='system'with no subtitle. Regular google_sync rows (whosenameis already the Workspace email) keep the existingderiveDisplayNamerewrite behavior withmapped_email=null. -
SSO-managed accounts are read-only for password / delete operations, both in UI and at the API layer. Detection is in
app.api.users._is_sso_user: a user counts as SSO-managed if they belong to any group whosecreated_by = 'system:google-sync', OR they belong to the seededAdminsystem group whileAGNES_GROUP_ADMIN_EMAILis set, OR the seededEveryonesystem group whileAGNES_GROUP_EVERYONE_EMAILis set. Users with no groups, or only admin-created custom groups, are unaffected. The flag surfaces asis_sso_user: boolon every/api/usersand/api/users/{id}response. UI: the/admin/usersrow actions and the/admin/users/{id}Account section suppress the Reset / Set pwd / Delete buttons for those rows. Server:POST /api/users/{id}/reset-password,POST /api/users/{id}/set-password, andDELETE /api/users/{id}now return 409 withdetail: "User is managed by an external SSO provider; …"for SSO targets — so a curl-savvy admin who bypasses the UI guard still cannot reset / set / wipe a Google Workspace account locally. Deactivate stays available so admins can gate access locally even when the upstream account is managed elsewhere. Name is provider-neutral so a future provider (Cloudflare Access, Okta, …) plugs into the same flag without churning the API.
Fixed
-
scripts/ops/agnes-tls-rotate.shnow chowns/data/state/certs/to UID 999 (theagnesuser inside the app image) on every run. Previously the script onlymkdir -p'd andchmod 700'd the directory, leaving ownership to whoever happened to create it first — root when systemd fired the timer before docker-compose-up, or UID 999 when the container's volume init touched it first. Race-dependent. When root won, the resultingdrwx------ root:rootdirectory was unreadable by the UID-999 container,_read_agnes_ca_pem()returnedNone, and the/installsetup prompt silently dropped the cross-platform TLS trust block (Step 0 from #137) — operators on those VMs ended up with no client-side cert bootstrap and a brokenclaude plugin marketplace addagainst the self-signed host. The chown is unconditional + idempotent (|| truefor hosts where the numeric GID can't be set), so re-running the timer self-heals existing VMs without manualchownon the operator's part. Files inside the directory keep their existing modes —fullchain.pemis0644(world-readable, so root- or 999-owned both work for the agnes container) andprivkey.pemis0600(only Caddy reads it, and Caddy's container runs as root). -
_is_sso_userno longer treatssystem_seed/adminmemberships in env-mapped Admin/Everyone as SSO (Devin BUG_0002 on PR #142). Without checkinguser_group_members.source, the v13 migration's blanket Everyone backfill (source='system_seed') flipped every existing local user tois_sso_user=Truethe moment an operator setAGNES_GROUP_EVERYONE_EMAIL— locking the admin out of password reset / set / delete on accounts the IdP doesn't actually own (the admin couldn't even un-flag them via "remove from Everyone" because_guard_google_managedblocks manual removal once env-mapped). The system-group branches (Admin / Everyone) now additionally requiresource='google_sync'. The created_by branch (system:google-syncgroups) is unchanged because those groups only exist because of Google sync — every membership in them is IdP-owned regardless ofsource. The v18 migration in this PR also retroactively cleans up the offendingsystem_seedrows in env-mapped Admin/Everyone groups; the source-check fix is the runtime guard that keeps future writes safe. -
POST /api/admin/users/{id}/membershipsnow returns the correctoriginfor the new membership (Devin review round 1 on PR #142). The handler constructedUserMembershipResponsewithout settingorigin, so the model default"custom"was returned regardless of the target group — while the matching GET endpoint computesoriginvia the shared_derive_originhelper. Adding a user to a system group (Admin / Everyone) over POST now reportsorigin="system"(or"google_sync"when env-mapped), matching GET. The UI re-fetches after add so visible impact was zero, but any non-UI API consumer got the wrong value. -
Schema migration v18: drop stranded non-google memberships in google-managed groups (Devin review round 1 on PR #142, partial response). v13's
_v12_to_v13_finalizeunconditionally backfilled every existing user into Everyone withsource='system_seed'under the original "Everyone = all users" semantics. The platform design has since shifted: whenAGNES_GROUP_EVERYONE_EMAIL/AGNES_GROUP_ADMIN_EMAILis configured, those system rows mirror a Workspace group exclusively, and only Google sync should write into them. The leftoversystem_seedrows (a) misrepresent the membership model and (b) cause_is_sso_userto flag local users as SSO-managed, blocking password-reset / set / delete via_reject_if_sso. v18 deletes: (1) non-google memberships in auto-createdcreated_by='system:google-sync'groups (unconditional — those groups only exist because Workspace materialized them), (2)system_seedrows in Everyone only whenAGNES_GROUP_EVERYONE_EMAILis set, (3)system_seedrows in Admin only whenAGNES_GROUP_ADMIN_EMAILis set ANDadded_by NOT IN ('app.main:seed_admin', 'auth.bootstrap')so the bootstrap admin always survives. Env-conditional branches mean a non-Google deployment keeps its local Admin / Everyone semantics intact (system_seed rows there are legitimate, not cruft). Runtime safeguards against future writes from the legacyusers.roleapparatus are tracked in #144.
Removed
GET /api/admin/group-suggestionsendpoint and the "Suggested from your Google account" picker on the/admin/groupscreate modal. The picker fetched the calling admin's Workspace groups (via Cloud Identity), filtered out ones already registered asuser_groupsrows, and offered them as one-click name pre-fills. Replaced by the OAuth callback's automaticgoogle_syncgroup materialization (every Workspace group the user belongs to that matchesAGNES_GOOGLE_GROUP_PREFIXis auto-created on login) — the manual picker became redundant. Cloud Identity calls in the request path are gone with it.
[0.23.0] — 2026-04-30
Added
- Single-item Edit button on every memory item card in
/corporate-memory/admin. Surfaces the per-itemPATCH /api/memory/admin/{id}endpoint added in #126 — until now it was only reachable via the CLI (da admin memory edit <id>) or by selecting one item in the bulk batch bar. The modal pre-fills from the item's current title / content / category / domain (dropdown matchingVALID_DOMAINS+(unset)) / audience / tags (comma-separated). Authorisation: samerequire_admingate as the rest of the memory admin surface. aisection editable in/admin/server-config. Theai:block ininstance.yaml(provider / api_key / model / base_url / structured_output for the corporate-memory extractor) was missing from_EDITABLE_SECTIONSandSECTION_META, so admins had no UI path to view or set the LLM token without editinginstance.yamldirectly.api_keyis auto-masked via the existing_SECRET_KEY_PATTERNS(substring matches "api_key"), so the input renders as a password field and audit-log diffs redact the value.MEMORY_DOMAINRBAC resource type for corporate-memory items. Admins use/admin/accessto grantuser_groupsaccess to specific domains (one offinance/engineering/product/data/operations/infrastructure). Members of granted groups see allknowledge_itemsin that domain regardless of the existingaudiencestring filter. The two filters compose with OR semantics, so the existingaudience='group:X'convention keeps working unchanged for ad-hoc per-item targeting; pre-grant deployments behave identically (when no MEMORY_DOMAIN grants exist, the OR clause collapses to a no-op). Wired inKnowledgeRepository.list_items/search/count_items/count_by_tag/count_by_audienceand in the inline SQL ofGET /api/memory/statsvia a newgranted_domainsparameter resolved fromresource_grantsby_caller_granted_memory_domains. Note: a MEMORY_DOMAIN grant is a parallel visibility path that pierces theaudiencefield — an item withaudience='group:admins-only'anddomain='finance'becomes visible to anyone with aMEMORY_DOMAIN/financegrant. Operators who relied onaudienceas a hard access boundary should be aware (Devin ANALYSIS_0003 on PR #141).
Fixed
- Edit modal NULL→empty-string preservation in
/corporate-memory/admin.submitEditItemwas sendingaudience=""for items whose stored audience was NULL, which silently broke visibility (the audience filter checksaudience IS NULL OR audience = 'all', neither of which matches empty string). Now empty form values foraudience/category/domain/contentare sent as JSONnullso the backend stores NULL. (Devin BUG_0001 on PR #141 5f649a4 review.)
[0.22.0] — 2026-04-30
Fixed
/api/v2/sample/{table_id},/api/v2/schema/{table_id},/api/v2/scan/estimate, and/api/v2/scannow return structured 502/400 instead of bare 500 when BigQuery raisesForbidden/BadRequest. Issue #134. Previously,_fetch_bq_sample,_fetch_bq_schema,_bq_dry_run_bytes, and_run_bq_scanhad notry/except, so a cross-project SA withoutserviceusage.services.useon the data project surfaced as an empty HTTP 500 — operators got no diagnostic. All four call sites now translategoogle.api_core.exceptions.Forbiddento HTTP 502 witherror: "cross_project_forbidden"(when the message mentionsserviceusage) plus adetails.hintpointing atdata_source.bigquery.billing_projectininstance.yaml, orerror: "bq_forbidden"for non-serviceusage ACL denials.BadRequesttranslates to HTTP 400 (bq_bad_request) on/scan/estimateand/scansince their SQL is user-derived (built fromreq.select/where/order_by), and to HTTP 502 (bq_upstream_error) on/sampleand/schemawhere SQL is server-constructed (server-builtSELECT * … LIMIT nandINFORMATION_SCHEMA.COLUMNSqueries respectively). The strict_fetch_bq_schemapath is wrapped; the best-effort_fetch_bq_table_optionspath retains its existingtry/except → return {}so/schemastill returns 200 with empty partition info if BQ metadata is unreachable./api/v2/sampleadditionally falls back todata_source.bigquery.billing_project(withdata_source.bigquery.projectas the default) —/scan/estimateand/scanalready had this fallback.
Changed
- BREAKING for deployments using
BIGQUERY_PROJECTenv var alongsidedata_source.bigquery.projectininstance.yaml. Issue #134. The env var now sets BOTH billing and data project (used as both the FROM-clause project AND the billing/quota target), overridingdata_source.bigquery.projectfor FROM-clause construction inv2_scan/v2_sample/v2_schema. PreviouslyBIGQUERY_PROJECTonly affectedRemoteQueryEnginebilling and was ignored by the v2 endpoints (which readinstance.yamldirectly). Migrate by clearingBIGQUERY_PROJECTand settingdata_source.bigquery.billing_project+data_source.bigquery.projectininstance.yamlinstead — the env var remains as a legacy override only.
Internal
- New shared module
connectors/bigquery/access.py—BqAccessfacade. Issue #134. Unifies BigQuery project resolution (BIGQUERY_PROJECTenv →instance.yaml billing_project→instance.yaml project),bigquery.Clientconstruction, DuckDB-extension session setup (INSTALL/LOAD/SECRETfromget_metadata_token()), and Google-API error translation (translate_bq_error()mappingForbidden/BadRequest/GoogleAPICallErrorto typedBqAccessErrorwithkind→ HTTP-status mapping). Replaces four near-identical inline blocks acrossv2_scan,v2_sample,v2_schema, andRemoteQueryEngine. FastAPI endpoints inject viaDepends(get_bq_access)(process-cached;instance_config.reset_cache()invalidates this cache too, so admin server-config saves hot-reload BQ project IDs without container restart);RemoteQueryEngineinjects viabq_access=BqAccess(...)constructor kwarg with lazy resolution (DuckDB-only paths never trigger BQ config lookup). When BigQuery isn't configured,get_bq_access()returns a sentinelBqAccesswhoseclient()/duckdb_session()raiseBqAccessError(not_configured)only when actually called — non-BQ instances (Keboola-only, CSV-only) get cleanDepends()resolution and 200s on local-source v2 requests. Two known-duplicate sites (connectors/bigquery/extractor.py,scripts/duckdb_manager.register_bq_table) explicitly out of scope; tracked as follow-up. - Internal API change in
RemoteQueryEngine:__init__no longer accepts_bq_client_factory(test-only injection point, prefix_). Tests migrate toRemoteQueryEngine(..., bq_access=BqAccess(projects, client_factory=...)). TheBqAccessErrorraised internally byBqAccessis translated to the existingRemoteQueryError(error_type="bq_error")shape in_get_bq_client, preserving the public contract — CLI (cli/commands/query.py) and/api/query/hybridcallers see no change. Removed the stale docstring atsrc/remote_query.pyreferencingscripts.duckdb_manager._create_bq_clientas the default factory (it never was). - Side-effect behavior change for unusual cross-project setups in
/api/v2/sample. Issue #134. The FROM-clause project for/sampleis nowdata_source.bigquery.project(the data project) rather than the conflatedbilling_projectvalue — the Phase 1 fix passedbilling_project(when set) as both the billing target AND the FROM-clause project. Deployments wherebilling_project ≠ projectAND the queried table physically lives inbilling_project(an unusual setup contradicting the documented config semantics) must move the table to the data project or unsetbilling_project. No effect on the standard cross-project setup (table in data project, jobs billed to billing project). scripts/smoke-test.sh: assertion 8 now hits/api/admin/registry(the current admin tables endpoint). The old/api/admin/tablesURL was renamed long ago and the smoke test was returning 404 on every run — it only surfaced as a deploy failure when the full release pipeline first triggered the rollback path on the post-#137 deploy (run 25151878647). Same stale URL was also fixed inCLAUDE.md,README.md, anddev_docs/server.md— the routes now correctly point atPOST /api/admin/register-table(create) andPUT /api/admin/registry/{id}(update)..github/workflows/release.ymlsmoke-test job: addedLog in to GHCRstep. The auto-rollback'sdocker push :stablewas hittingunauthenticated: User cannot be authenticated with the token providedbecause the smoke-test job had no GHCR login of its own. Result: a failed deploy left:stablepointing at the broken image. The rollback step also got an explicitGH_TOKENenv, and the workflow's top-levelpermissionsblock gainedissues: write, so itsgh issue createcall actually creates the alert issue (was silently swallowed by the|| echofallback because of both the missing env var AND the missing scope).
[0.21.0] — 2026-04-30
Internal
scripts/dev/agnes-client-reset.sh— destructive cleanup of an Agnes client install on a developer workstation, mirror image ofapp/web/setup_instructions.pyso an onboarding-from-scratch test is reproducible. Removes thedaCLI (uv tool uninstall),~/.config/da/~/.agnes/~/.claude/skills/agnes, the Claude Codeagnesmarketplace + its plugins, the Agnes CA from the OS trust store (Windowscertutil -delstore, macOSsecurity delete-certificate -Z, Linuxupdate-ca-certificates/update-ca-trust), theAGNES_CA_PEM_TRUSTblock from the user's shell rc (with.agnes-reset.bakbackup), and/tmp/agnes*.whlmatches. Cross-platform (Git Bash on Windows / macOS / Linux);--yesskips the confirm prompt,--dry-runprints actions without executing.
Changed
-
Trust block heredoc trimmed to 8 lines so reset script's
skip = 8matches exactly (Devin review round 3 on PR #137). The_tls_trust_blockheredoc was emitting 9 lines into the user's shell rc (a leading empty line + theAGNES_CA_PEM_TRUSTmarker + 7 export/comment lines), butscripts/dev/agnes-client-reset.shawk strips exactly 8 lines starting at the marker — leaving the leading empty line behind. On repeated install/reset cycles, those stray empty lines accumulated in~/.zshrc/~/.bashrc. Removed the leading empty string from the heredoc body in_tls_trust_blockso the heredoc now writes exactly 8 lines, matching the awk count. Added two regression tests that pin the invariant — one asserts the heredoc body length, the other parsesskip = Nout of the reset script via regex and cross-checks it against the heredoc body line count, so future drift on either side fails loudly. -
Marketplace block re-detects
$PLATFORMso Linux actually gets the direct-HTTPS attempt (Devin review round 2 on PR #137).$PLATFORMis set in step 0(a) but the prompt itself warns that env vars don't persist across separate Bash invocations (step 0(e) IMPORTANT note). The marketplace step'scase "$PLATFORM" inran in a later Bash call where$PLATFORM="", falling through to the*)catch-all which hard-codesMARKETPLACE_VIA=clone— defeating the Linux-only direct-HTTPS attempt that node-basedclaudewould have honored viaNODE_EXTRA_CA_CERTS. Marketplace block now re-detects$PLATFORMvia the sameuname -sswitch from step 0(a) before its case statement, making the block self-contained. Same fix not applied to step 0(c)'s$PLATFORMuse because step 0 is meant to run as a single Bash block (a→b→c→d→e in sequence) where the variable is still in scope. -
Setup prompt no longer references steps that may not have been emitted (Devin review on PR #137). Three places hard-coded references to optional steps regardless of whether those steps were actually rendered. (1) Confirm step's summary bullets unconditionally listed "Which CA bundle source got picked in step 0(d)" and "Whether the marketplace add went via direct HTTPS or via the git-clone fallback" — both phantom in the default no-CA, no-plugins flow, and an LLM following the prompt would either ask the user about non-existent steps or hallucinate.
_FINALE_LINES_TEMPLATEconstant replaced with_finale_lines(has_ca, has_marketplace)that conditionally appends each bullet. (2) Preamble's "The fallback chain inside step 0(d) is documented and OK to use" pointed at a non-existent step whenca_pemwas unset._preamble_lines(has_ca)now drops that line in the no-trust-block path; the "don't disable TLS verification" guidance stays unconditional (valid generic advice). (3) Trust block step 0(c) said "without this, step 7's marketplace add fails" — stale after the layout reordering moved marketplace to step 5 (and made it optional). Reworded to describe the consequence without naming a step number. -
Marketplace step now uses the git-clone fallback on macOS too — not only Windows — and strips the PAT from the cloned repo's
.git/configafter clone. First fix:claudeon macOS arm64 ships as a Mach-O binary with a__BUNsegment (single-filebun build --compile); reverse engineering itsstringstable shows it recognizesNODE_EXTRA_CA_CERTS/SSL_CERT_FILE/REQUESTS_CA_BUNDLE/CURL_CA_BUNDLE(including a "NODE_EXTRA_CA_CERTS detected" log line) but in practice none of them — nor the macOS login keychain — is honored for the marketplace HTTPS request, leaving the direct-add path failing withunable to verify the first certificateeven after step 0(c) registered the cert. So the marketplacecasenow matcheslinux)for the direct-then-fallback path and*)(= Windows + macOS, both Bun-compiled) for straight-to-clone. Second fix:git clone https://x:<PAT>@host/...writes the URL verbatim into~/.agnes/marketplace/.git/config, so the PAT then sits in plaintext at a path that gets read by cloud-sync agents (iCloud, OneDrive) and antivirus scanners on default home setups; after clone we now rungit remote set-url origin "https://<host>/marketplace.git/"to drop the token, plus a best-effortchmod 700/chmod 600(wrapped in|| trueso it's a no-op on Windows NTFS via MSYS / Git Bash). Marketplace registration uses the local FS path, not the remote URL, so removing the token doesn't break anything — refreshes go via re-running setup with a fresh PAT from the dashboard. Third fix: each shell-out (git clone,claude plugin marketplace add,claude plugin install <name>@agnes) is now wrapped in|| { echo "ERROR..." >&2; exit 1; }so a failure halts the prompt loudly instead of falling through to a confusing downstream error (e.g. failed clone →marketplace 'agnes' not foundfrom the nextplugin install). Fourth fix: the diagnose step now calls out thatdb_schema: unknownis also normal for non-admin roles (e.g.analyst) on populated instances, not just on fresh installs — analyst lacks grants on the system schema, so the field staysunknownforever and was previously misread as a yellow check. -
Setup prompt step ordering reshuffled so all installation work runs before the human-loop skills question. Old order interleaved the human question (skills, step 5) between install (step 1) and marketplace/plugins (step 7), which led the assistant to either block on the user mid-install or "do the rest in parallel" while waiting. New order: install → login → verify → git check → marketplace + plugins → diagnose → skills (last interactive step before Confirm) → Confirm. With marketplace plugins to install, that's steps 1-2-3-4-5-6-7-8; without plugins, 4-5 (git/marketplace) collapse out and diagnose/skills/confirm renumber to 4-5-6. The skills step now explicitly tells the assistant to wait for the user's answer before moving to Confirm — the old "you can continue in parallel" hint is gone because there's no longer anything to do in parallel.
da diagnoserunning late doubles as a final smoke test after plugins are in place. -
Setup prompt's TLS trust block rewritten to be cross-platform and to dodge three TLS pitfalls observed across real workstation setups. The previous block exported
SSL_CERT_FILE/NODE_EXTRA_CA_CERTS/GIT_SSL_CAINFOall pointing at the single Agnes CA; this caused (1) every Python tool in the same shell to lose its system trust store (PyPI immediately broke withUnknownIssuerbecauseSSL_CERT_FILEis a replace, not an append), (2)uv tool install <https-url>against the Agnes wheel endpoint to fail with rustls'CaUsedAsEndEntitybecause the Agnes leaf cert is its own CA —--native-tlsdoesn't help (the rejection happens during chain validation, not trust lookup), and (3)claude plugin marketplace addto fail on Windows becauseclaude.exeignores both the OS trust store andNODE_EXTRA_CA_CERTSfor marketplace HTTPS. The new step 0 (a) detects platform viauname+$SHELLand picks the correct shell rc file (zsh→.zshrc, bash on macOS→.bash_profile, else→.bashrc), (b) writes the cert PEM via single-quoted heredoc, (c) registers the cert in the OS trust store (Windowscertutil -user -addstore Root, macOSsecurity add-trusted-cert, Linuxupdate-ca-certificates/update-ca-trust) — no admin rights needed, idempotent on re-run — so native binaries that bypass our env vars still trust the host, (d) builds a combined CA bundle at~/.agnes/ca-bundle.pem(system roots + Agnes CA) using a fallback chain for the system roots source (systempython3 -c 'import certifi'→ distro/curl bundle paths →uv run --with certifias last resort), (e) persistsSSL_CERT_FILE/REQUESTS_CA_BUNDLE/GIT_SSL_CAINFOpointing at the combined bundle while keepingNODE_EXTRA_CA_CERTSon justca.pem(Node's append semantics). When the trust block is emitted, step 1 also switches to a curl-then-local-install pattern (curl --cacertto download the wheel,uv tool install --native-tls --force <local-file>to install) so rustls never sees the Agnes host. Step 7 (marketplace) goes platform-aware: Windows skips the direct HTTPS attempt and uses a systemgit clonefallback (system git honorsGIT_SSL_CAINFO), macOS/Linux try direct first. Step 4 (diagnose) calls out thatdb_schema: unknownanddata: 0 tablesare normal on fresh installs. Step 5 (skills) makes clear the assistant can continue with steps 6-7 while waiting for the user's skills answer. Step 12 (marketplace) calls out the harmlessgit: 'credential-manager-core' is not a git commandwarning so the operator doesn't chase it. The legacygit config sslVerify=falsedowngrade path stays as a fallback for instances without afullchain.pemon disk (so existingAGNES_DEBUG_AUTHsetups keep working).
Added
- "Set up a new Claude Code" prompt now bootstraps the marketplace and plugins. The clipboard payload generated by the dashboard CTA appends a git pre-flight check (
git --version, withbrew install git/winget install --id Git.Gitinstall commands for macOS / Windows) followed by a marketplace-registration step that runsclaude plugin marketplace add "https://x:<PAT>@<host>/marketplace.git/"and oneclaude plugin install <plugin>@agnes --scope projectper RBAC-allowed plugin (resolved viamarketplace_filter.resolve_allowed_plugins). When the user has no plugin grants, the original 6-step layout is preserved. WhenAGNES_DEBUG_AUTHis enabled on the server (dev/self-signed-cert instances), a host-scopedgit config --global http."<server>/".sslVerify falseline is also included so the marketplace clone works against the self-signed endpoint. Plugins load on the nextclaudestart. - Setup prompt inlines the server's TLS cert as a step-0 trust block on instances with a private CA / self-signed chain.
app.web.router._read_agnes_ca_pemreads/data/state/certs/fullchain.pem(path overridable viaAGNES_TLS_FULLCHAIN_PATH; the file is bind-mounted into the app container bydocker-compose.host-mount.ymlfrom the same locationagnes-tls-rotate.shwrites). Self-signed leaves and CA-signed leaves whose issuer isn't in the server-sidecertifitrust store are inlined into the prompt; publicly-trusted chains (Let's Encrypt etc.) are skipped so users don't unnecessarily narrow their default Python TLS trust. The inlined block writes the PEM to~/.agnes/ca.pemvia single-quoted heredoc (so$/backtick chars in the cert never shell-expand) and exportsSSL_CERT_FILE,NODE_EXTRA_CA_CERTS,GIT_SSL_CAINFOfor the current shell + persists them to~/.bashrc/~/.zshrc(idempotent via a marker grep guard) sodakeeps trusting the host across new terminal sessions. When the trust block is emitted, the legacygit config sslVerify=falsedowngrade is suppressed — full TLS validation re-enabled, just against the inlined cert. Cross-platform (macOS bash/zsh + Windows Git Bash) — same env vars, same heredoc syntax. Replaces thegit config sslVerify=false-only path that brokeclaude plugin marketplace add(Node has its own HTTPS client and ignoresgit config) anduv tool install(rustls, no insecure flag) on self-signed instances.
[0.20.0] — 2026-04-29
Added
- Dev debug toolbar gated by
DEBUG=1. Mountsfastapi-debug-toolbarwith panels for headers, routes, settings, versions, timer, logging, and a custom DuckDB panel that captures everycon.execute()fromsrc/db.py(tagged bysystem/analytics/analytics_ro). Seedocs/development.md. X-Request-IDrequest header / response header on every FastAPI response, plus arequest_idfield in JSON logs for cross-process correlation.- Request-ID surfaced end-to-end on error responses:
Reference: <rid>block on the renderederror.htmlpage (withuser-select: allfor one-click copy) and a"request_id": "<rid>"field in the JSON 5xx body. The same id appears in thex-request-idresponse header, so a support ticket can be traced from a single value the user sees on the page. - Dev log lines now carry the request id via
_RequestIdFilter—RichHandlerformat is[<rid>] [<logger>] <msg>(or[-]outside of a request scope). JSON formatter already includedrequest_id; this closes the gap forDEBUG=1development. - Centralized
app.logging_config.setup_logging()— replaces 23 scatteredlogging.basicConfig(...)calls. Usesrich.logging.RichHandlerin dev (DEBUG=1) and JSON to stderr in prod.
Changed
- All service entrypoints (
services/scheduler/__main__.py,ws_gateway,telegram_bot,corporate_memory,session_collector,verification_detector) and CLI scripts underscripts/andconnectors/jira/scripts/now callsetup_logging(__name__)instead of inlinebasicConfig. Library modules no longer configure root logger at import time. - BREAKING Telegram bot no longer writes to
/data/notifications/bot.log. All bot logs go to stdout, captured by Docker. Usedocker compose logs -f notify-botto read them. Operators tail-ing the file must update their runbooks; seedev_docs/telegram_bot.mdfor the new procedure (includingjournalctlfallback for non-Docker hosts). - Toolbar middleware is mounted INSIDE the GZip middleware (innermost on response) so the toolbar can decode HTML before compression. RequestIdMiddleware remains outermost; production behavior (DEBUG unset) is byte-identical to before.
Fixed
- Removed rogue module-level
logging.basicConfigfromapp/api/sync.pythat was reconfiguring root logger every time the api module was imported. RequestIdMiddlewarerewritten as a pure ASGI middleware (wasBaseHTTPMiddleware). Removes the earlyrequest_id_var.resetinfinallythat fired BEFORE BackgroundTasks ran, causing them to lose the id. Also side-steps the knownBaseHTTPMiddlewareContextVar-cross-task issue.- Incoming
X-Request-IDheaders are now sanitized (alnum +-/_, truncated to 64 chars; falls back to a fresh uuid if nothing legal remains). Closes a CRLF log-forging vector when log handlers don't escape newlines. _wants_htmlno longer treatsAccept: */*(curl default) or empty Accept as "wants HTML". Operators who curl non-API paths get JSON{"detail": "..."}as before — only real browsers (withAccept: text/html,...) get HTML error pages. (Devin ANALYSIS_0003 on PR #136 review.)- Subprocess extractor in
app/api/sync.pyre-installslogging.basicConfigso INFO-level extraction progress fromconnectors.keboola.extractor.run()reaches stderr again (was silently dropped by Python'slastResorthandler after the import-timebasicConfigcleanup). (Devin BUG_0002.) .env.templatecomment forDEBUG=1no longer claims to enableFastAPI debug=True— that flag is intentionally NOT toggled (Starlette'sServerErrorMiddlewarewould intercept unhandled exceptions before the custom error handler runs). (Devin BUG_0001.)- Security: HTML error page (500) no longer leaks
str(exc)in production. The JSON branch already guarded that string behinddebug_onbut the HTML branch did not — browser users could see raw exception messages containing DB paths, SQL fragments, internal hostnames, or credentials embedded in connection strings. The HTML branch now mirrors the JSON branch'sdebug_oncheck; production users see only"Internal server error"plus the request id. (Devin BUG on b1c6ee9 review.)
Internal
pyproject.toml: addedfastapi-debug-toolbar>=0.6.3to dev optional deps.services/telegram_bot/config.py: removed unusedBOT_LOG_FILEconstant.tests/conftest.py: removed stale comment about bot.py FileHandler.
[0.19.0] — 2026-04-29
Added
table_registry.sync_scheduleis now honored at runtime.POST /api/sync/trigger(called by the scheduler sidecar every 15 min by default) drops local tables whose schedule says they are not due. Tables without a schedule continue to sync on every tick (opt-in feature). ManualPOST /api/sync/trigger {"tables":[...]}bypasses the schedule filter — operator override always wins. (#79)script_registry.scheduleis now honored at runtime via the new endpointPOST /api/scripts/run-due(admin-only). The scheduler sidecar fires this every 60 s by default. Each due script is claimed atomically (last_status='running'), executed in a BackgroundTask, and the outcome written tolast_run/last_status. Scripts already inrunningstate are skipped — no concurrent runs of the same script. (#78)- Four new env vars on the scheduler sidecar:
SCHEDULER_DATA_REFRESH_INTERVAL,SCHEDULER_HEALTH_CHECK_INTERVAL,SCHEDULER_SCRIPT_RUN_INTERVAL,SCHEDULER_TICK_SECONDS. All accept positive integers (seconds); tick must be ≤ smallest job interval. Documented indocs/DEPLOYMENT.md→ Scheduler tuning. (#77) RegisterTableRequest.sync_schedule,UpdateTableRequest.sync_schedule, andDeployScriptRequest.schedulenow reject malformed strings with a Pydantic 422 (e.g."hourly","daily 25:00"). The accepted forms are unchanged:every Nm,every Nh,daily HH:MM[,HH:MM,...]. Note: cron expressions ("0 8 * * MON"etc.) were never honoured by the runtime evaluator — they used to round-trip through the API as a silent no-op, and now they get a loud 422 at register/deploy time. Operators using cron strings must convert to one of the supported forms. (#78, #79)- New
verify_sslknob in theopenmetadata:section ofinstance.yaml(defaulttrue). Operators on internal CAs / self-signed catalogs must set it explicitly. (#89)
Changed
- BREAKING
POST /api/scripts/deploynow validates the source against the safety blocklist BEFORE persisting (previously safety checks ran only at execution time). Scripts containing blocked imports / patterns return 400 from/deployinstead of being stored and failing every scheduler tick. Closes the claim-fail-retry loop where the new/api/scripts/run-dueendpoint would re-claim and re-fail an unrunnable deployed script every minute. (#78) - BREAKING
OpenMetadataClientnow defaults toverify=Truefor TLS. The previous version hardcodedverify=Falseand suppressed urllib3's "Unverified HTTPS request" warning at import time (which leaked to every other httpx client in the process). Existing deployments on self-signed certificates without an explicit opt-out will start failing TLS verification — setverify_ssl: falsein theopenmetadata:block ofinstance.yaml, or supply a CA bundle path, before upgrading. Both production call sites (connectors/openmetadata/enricher.py,src/catalog_export.py) read the newverify_sslconfig knob and pass it through. (#89) - BREAKING
GET /marketplace/info(admin-only debug endpoint)namefield now returns the plugin's authoritative name from itsplugin.json(e.g.plug-x) instead of the slug-prefixed form (<slug>-<plug-x>). The slug-prefixed form moved to a newprefixed_namefield next to it;original_nameis unchanged. Side-effect of the/pluginUI fix below — the synth marketplace.json'snamefield had to switch over for Claude Code's catalog lookup to work, and/marketplace/infomirrors that surface for consistency. Any downstream tooling that parsed thenamefield expecting the slug-prefixed format must now readprefixed_name. (#133)
Fixed
/pluginUI in Claude Code rendered "Plugin not found in marketplace" in the Components panel for every plugin Agnes served, even though agents/skills/commands loaded correctly under the plugin's own namespace. Root cause: the synthetic.claude-plugin/marketplace.jsonlisted each plugin under a slug-prefixedname(<slug>-<plugin>) while the plugin's authoritative.claude-plugin/plugin.jsonkept the original name. Claude Code resolves the loaded plugin back to its catalog entry byplugin.jsonname, so the lookup missed every entry. The synth manifest now reads the plugin's authoritative name from<plugin_dir>/.claude-plugin/plugin.json(falling back to the upstream marketplace.json'snamewhen the plugin manifest is absent or unreadable). The directory layout underplugins/<slug>-<plugin>/...keeps the prefix so two upstream marketplaces that ship a same-named plugin still get distinct on-disk paths in the ZIP / git tree — their catalog entries will then collide under the samename, which is the correct surface (admin RBAC decides which upstream wins, same as if a user added both upstream marketplaces directly to Claude Code)./marketplace/infonow exposesprefixed_namealongsidenameso operators can still disambiguate cross-marketplace shadowing. (#133)
Internal
src/scheduler.pynow exportsis_valid_schedule(s)andfilter_due_tables(configs, sync_state_repo)for reuse across the sync filter, the script runner, and Pydantic validators.ScriptRepositorygainsclaim_for_run(script_id)andrecord_run_result(script_id, status)— the atomic primitives for the scheduled-script execution path.claim_for_runusesUPDATE … WHERE last_status IS DISTINCT FROM 'running' RETURNING idfor race-free claim.services/scheduler/__main__.pyJOBS list refactored to abuild_jobs()factory that reads + validates env at startup.
Known limitations
- Stuck
last_status='running': a scheduled script whose BackgroundTask crashes mid-run (process killed, OOM, gateway timeout) stays claimed forever. Recovery:UPDATE script_registry SET last_status = NULL WHERE id = ?from a DuckDB shell. Auto-recovery via max-runtime detection is intentionally out of scope for v0.19.0; revisit if it bites in practice. - Schedule quantization rounds up:
SCHEDULER_*_INTERVALaccepts seconds but the underlying schedule grammar is minute-grained. Non-multiples of 60 round UP to the next minute (90 s →every 2m, neverevery 1m) so a job never fires more often than the operator configured. Sub-minute values clamp toevery 1m. Documented indocs/DEPLOYMENT.md→ Scheduler tuning.
[0.18.0] — 2026-04-29
Added
- Corporate-memory tree view + cross-axis filtering on
/corporate-memoryand/corporate-memory/admin. Operators choose a grouping axis (domain / category / tag / audience) and combine it with chip filters (status, source_type, audience, has-duplicate-hint, search). Tree uses native<details>; localStorage persists open/closed state per axis; no new dependencies. Issue #62. - Corporate-memory duplicate-candidate hints — admin sees a "Duplicate Candidates" tab with likely-duplicate item pairs detected by entity overlap (Jaccard score, ≥2 shared entities, same domain). Resolution actions:
duplicate/different/dismissed. Auto-merge intentionally not included. Issue #62. - Bulk-edit endpoints:
PATCH /api/memory/admin/{id}(now accepts category/domain/tags/audience/title/content, was title+content only);POST /api/memory/admin/bulk-updatefor multi-id mutations with per-item audit rows; new "Move to category / domain / audience" + "Add/Remove tag" actions in the admin batch bar. Issue #62. GET /api/memory/statsnow includesby_tag(DuckDBjson_eachover tags) andby_audienceaggregations to power chip-filter pickers. Issue #62.GET /api/memory/tree?axis=...&...— server-side grouping endpoint that returns{groups: [{key, label, count, items: [...]}]}, RBAC-filtered, with chip filters (status_filter,source_type,audience,q,has_duplicate). Issue #62.da admin memory {tree,edit,bulk-edit,stats,duplicates list,duplicates resolve}— full CLI parity for the new admin endpoints. Issue #62.- Schema v17: new
knowledge_item_relationstable for duplicate-candidate hints. PK(item_a_id, item_b_id, relation_type)with canonical(min, max)pair ordering at the repository layer; auto-migration v16→v17 idempotent. Issue #62. - BigQuery table registration via admin UI + CLI (issue #108 — Milestone 1). Operators on a BigQuery instance can now register a BQ table or view as a remote DuckDB master view from
/admin/tablesorda admin register-table, without hand-editingtable_registryor running the extractor by hand. The register modal branches ondata_source.typeserver-side: BQ instances see Dataset / Source Table / View Name / Description / Folder / Sync Schedule; Keboola instances keep the discovery-driven flow. Submit runs/api/admin/register-table/precheckfirst (round-tripsbigquery.Client.get_tableto confirm the table exists and the SA can see it; surfaces row count + size + column count in the modal), then commits. The server validates BQ-specific shape (dataset / source_table / DuckDB-safe identifiers / GCP project_id grammar), forcesquery_mode=remote+profile_after_sync=false, and synchronously rebuildsextract.duckdb+ master views with a 5s wall-clock budget — on overrun, the rebuild continues in aBackgroundTaskand the API returns 202 with{"status": "accepted", "view_name": ...}instead of 200. View-name collisions (distinct from id collisions) return 409 to stop two callers from registering the same DuckDB view via different display names.sync_scheduleis accepted and stored but not yet evaluated by the scheduler — see issue #79; addressed in Milestone 3 of #108. Seedocs/DATA_SOURCES.md. POST /api/admin/register-table/precheck— validation-only sibling of register-table. Returns{"ok": true, "table": {rows, size_bytes, columns, …}}for BQ rows after round-trippingget_table; surfaces NotFound → 404, Forbidden → 403, anything else → 400 with the GCP error verbatim. Also runs Pydantic validation for non-BQ source types so the CLI / UI gets a single endpoint shape.--dry-runflag onda admin register-table— calls/precheckand pretty-prints rows / bytes / columns; exits 0 onok, 1 on validation or source-side error.- Audit-log entries on every
register_table/update_table/unregister_tablemutation — closes the asymmetry where instance-config saves audited but registry mutations didn't (Decision 4 in #108). Secret-named fields in the request payload are masked as***;descriptionis logged raw. - Google Workspace group prefix filter + system-group mapping. Three new env vars wire the OAuth callback's group sync to a configurable Workspace prefix and route the admin/everyone Workspace groups into the seeded system rows.
AGNES_GOOGLE_GROUP_PREFIX— when set (e.g.grp_acme_), only Workspace groups whose email local part starts with the prefix are mirrored intouser_group_members. Empty = legacy behavior (mirror every fetched group).AGNES_GROUP_ADMIN_EMAIL— Workspace group email that maps onto the seededAdminsystem row instead of creating a freshuser_groupsentry. Members of that Workspace group land inAdmindirectly.AGNES_GROUP_EVERYONE_EMAIL— same mechanism forEveryone.
- Login gate. When
AGNES_GOOGLE_GROUP_PREFIXis set and the user's Workspace fetch returned a non-empty list with zero prefix matches, the callback redirects to/login?error=not_in_allowed_groupwith a friendly inline banner. Empty fetch results (transient Cloud Identity failures) preserve the cached membership and let the login proceed — fail-soft only the soft-fail path; an explicit no-match still blocks. New error codegroup_check_unavailableis wired through the login banner for future use. - Admin UI subtitle for synced groups. The
/admin/groupstable and the/admin/groups/{id}detail page render a derived display name (prefix stripped,@domainremoved, capitalized) above a small monospace subtitle showing the full Workspace email. Edit / Delete affordances are hidden on Google-managed rows, and a "managed by Google Workspace — read-only here" banner appears on the detail page.
Changed
- BREAKING Auto-
Everyonemembership for new users was removed.UserRepository.createno longer writes auser_group_membersrow, andapp.auth.access._user_group_idsno longer adds a virtualEveryoneid to the result. Every membership now traces to a real source row (admin,google_sync, or an explicitsystem_seed). If you relied on the implicit-Everyone behavior for plugin visibility, grant the plugin to a real group (e.g. aneveryone@example.comWorkspace group mapped viaAGNES_GROUP_EVERYONE_EMAIL). - Admin UI / API are read-only on Google-managed groups.
created_by='system:google-sync'rows, plus the seededAdmin/Everyonerows when the matching email-mapping env var is set, return409with body{"detail": {"code": "google_managed_readonly", ...}}fromPATCH /api/admin/groups/{id},DELETE /api/admin/groups/{id},POST /api/admin/groups/{id}/members,DELETE /api/admin/groups/{id}/members/{user_id},POST /api/admin/users/{id}/memberships,DELETE /api/admin/users/{id}/memberships/{group_id}. Edit through admin.google.com, then sign in again to refresh. - Audit action names for corporate-memory operations renamed from
km_<action>tocorporate_memory.<action>to match the 0.15.0 CHANGELOG documentation. The audit-tab filter accepts both prefixes for back-compat with rows already in the audit log (no historical-row rewrite). Issue #62. onDomainChange()UX bug fixed on/corporate-memory: domain and category filters now compose instead of resetting each other when either changes. Issue #62.POST /api/memory/admin/editcontinues to accept title/content as before; the newPATCH /api/memory/admin/{id}is the recommended path for everything else (including title/content). The legacy endpoint is kept one release for back-compat.
Internal
- New env vars surfaced into
ConfigProxyso templates can derive the friendly display name client-side. - New
is_google_managed: boolfield onGroupResponse(the API surface for the admin UI's group list/detail). - New
UserGroupMembersRepository.has_any_google_sync_membershiphelper (currently diagnostic; kept for a future tightening of the gate). - New tests in
tests/test_google_group_prefix_sync.py;tests/test_repositories.py::TestUserRepositoryEveryoneAutoMemberrenamed toTestUserRepositoryNoAutoMembershipwith inverted assertion; twotests/test_marketplace_filter.pytests adapted to the no-implicit-Everyone semantics. Seedocs/auth-groups.mdfor the full reference.
Fixed
PATCH /api/memory/admin/{id}now switches frommodel_dump(exclude_none=True)toexclude_unset=True, so an explicitnullin the request body clears the field (e.g.{"audience": null}resets a previously-set audience to NULL). Pre-fix nulls were silently dropped, leaving no path to clearaudienceand only the empty-string short-circuit fordomain. The endpoint now distinguishes "field absent from body" (untouched) from "field explicitly set to null" (cleared). BothPATCH /api/memory/admin/{id}andPOST /api/memory/admin/bulk-updatenow reject an explicitnullfortitle(NOT NULL in the schema) at the boundary with HTTP 400 instead of bubbling up as a 500 (PATCH) or per-item Constraint Error (bulk). Issue #62 / PR #126 review.- Empty-string
domainis now consistently allowed acrossPOST /api/memory,PATCH /api/memory/admin/{id}, andPOST /api/memory/admin/bulk-update— previously create allowed it (short-circuit on falsy) but PATCH/bulk-update rejected it with 400, which made it impossible to clear a domain through the admin endpoints. Issue #62 / PR #126 review. - Bulk-edit modal
(unset)option for the domain field is now actually submittable. Pre-fix the JS rejected empty values with "Please enter a value" before the request fired, so the operator-visible "(unset)" option couldn't ever clear a domain even though the backend supports it. Issue #62 / PR #126 review. POST /api/memory/admin/bulk-updatenow enforces an API-layer allowlist of mutable fields (category,domain,tags,tags_add,tags_remove,audience,title,content). Pre-fix the endpoint forwarded any key in the repo's_UPDATABLE_FIELDSset, which includedstatus,sensitivity,is_personal, andconfidence— an admin couldPOST {"updates": {"status": "mandatory"}}and silently bypass/admin/mandate's dedicated audit trail (the bulk audit row also stampedupdated_fields: []for those mutations, leaving no trace of what changed). Disallowed keys now return HTTP 400 with the offending list; the repo layer is unchanged so the per-itemrepo.updatepath keeps its broader access. Issue #62 / PR #126 review.- Tree endpoint
audience=allchip now includes NULL-audience items, matching the SQL audience filter (audience IS NULL OR audience = 'all'),count_by_audience(COALESCE→'all'), and_bucket_key(NULL → "all"). Pre-fix the in-memory chip filter compareditem.audience != 'all'and dropped NULL-audience items from the bucket they were supposed to land in. Issue #62 / PR #126 review. GET /api/memory/admin/audithonorspage— the SQL hadLIMITonly and silently returned the first page for every page parameter. Both branches (action-filtered and unfiltered) now applyOFFSET (page - 1) * per_page. Issue #62 / PR #126 review.POST /api/memoryvalidatesdomainagainst the same allowlistPATCH /admin/{id}andPOST /admin/bulk-updateuse, so an item can't be created with a domain it can't later be patched to. Empty / missing domain is still accepted. Issue #62 / PR #126 review.da admin memory edit --add-tag/--remove-tagcould silently drop existing tags when the target item lived past page 1 of/api/memory. The CLI did GET-then-PATCH for tag mutations, looked the item up in the first 50 rows of the unfiltered list, and overwrote the tag set with[just_added]when it didn't find the row. Tag mutations now route throughPOST /api/memory/admin/bulk-update(single-id array, server-side merge with the existing tags). Issue #62.da admin memory duplicates listcouldn't list both resolution states — the CLI always sentresolved=true|falseand the API defaulted toresolved=falsewhen omitted, so neither path returned the full set.GET /api/memory/admin/duplicate-candidatesnow treats omittedresolvedas "no filter"; the CLI omits the flag by default and only sets it when the user passes--resolvedor--unresolved. The web UI continues to passresolved=falseexplicitly so the actionable backlog stays the default surface. Issue #62.PUT /api/admin/registry/{id}now preserves the originalregistered_attimestamp instead of resetting it tonow()on every edit.TableRegistryRepository.registeracceptsregistered_atas an optional kwarg;update_tablere-passes the existing value from the row it just read. Closes #130.
[0.17.0] — 2026-04-29
Added
- Shared-secret auth path for the in-cluster scheduler service (
SCHEDULER_API_TOKEN). Both theappandschedulercontainers source the same/opt/agnes/.envvia Docker Composeenv_file:, so a 256-bit secret generated once at VM provisioning serves both sides symmetrically. The app validates incomingAuthorization: Bearer <secret>against the env var (constant-time compare; minimum length 32 chars; rejected when env is empty) and resolves matches to a syntheticscheduler@system.localuser that is a member of theAdminsystem group — every existing RBAC gate (require_admin,require_resource_access) works unchanged. Audit-log entries from the scheduler are attributed to this user. Rotation: edit.env,docker compose restart app scheduler. Seeapp/auth/scheduler_token.pyfor the threat model. POST /api/marketplaces/sync-all— admin-only endpoint that runssrc.marketplace.sync_marketplaces()inside the app process. Wired up so the scheduler container can drive the nightly refresh over HTTP without openingsystem.duckdbdirectly.
Fixed
- Scheduler
marketplacesjob 500-ed every cron tick withIO Error: Could not set lock on file system.duckdbafter v0.12.1. The previous implementation calledsrc.marketplace.sync_marketplaces()in-process from the scheduler container, but DuckDB permits only one writer per file across processes — the scheduler raced the app's long-lived handle. Switched the job toPOST /api/marketplaces/sync-all, making the app the sole writer; the scheduler is now a pure cron clock. - Scheduler
data-refreshjob 401-ed every 15 minutes withMissing or invalid Authorization headerbecauseSCHEDULER_API_TOKENwas never propagated byinfra/modules/customer-instance/startup-script.sh.tpl. The startup script now generates a 64-hex-char secret on first boot viaopenssl rand -hex 32, persists it across reboots by reading back from an existing.env(rotation requires explicit operator action — both containers must restart together), and writes it into/opt/agnes/.envalongside the other secrets.app/main.pyseeds the matching synthetic user at startup so the very first cron tick has a valid actor to attribute audit-log entries to. Existing VMs need a one-timesudo /opt/agnes/agnes-rotate-scheduler-token.sh(or simply re-run the startup script viaterraform apply -replace='module.agnes.google_compute_instance.vm["<vm-name>"]'); see migration note in this changelog or rerun the startup script manually. - Non-root container couldn't write to host-bind-mounted
/dataafter the v0.12.1 USER-agnes flip.infra/modules/customer-instance/startup-script.sh.tplnowchown -R 999:999 /dataafter creating the persistent-disk subdirs (state,analytics,extracts). Without this, a freshly-attached PD is root-owned by default andUSER agnes(uid 999) cannot open/data/state/system.duckdbfor write — every authed request 500s withIOException: Cannot open file ... Permission deniedwhile/api/health(which doesn't open the system DB) keeps returning 200, masking the failure from health-only monitoring. Regression first observed onagnes-developmenton 2026-04-29 after the auto-upgrade picked up:stablefrom the 0.12.1 release. Existing VMs with PD-backed/dataneed a one-time host-sidesudo chown -R 999:999 /var/lib/docker/volumes/agnes_data/_data && sudo docker restart agnes-app-1 agnes-scheduler-1to recover — Terraformmetadata_startup_scriptonly runs on boot, so an apply alone does not retro-fix running VMs. Dockerfilepins theagnesuser touid:gid 999:999explicitly (useradd --uid 999). Previously the uid was whatever Debian'suseradd --systemassigned next — happened to be 999 today, but a future base-image change picking 998 or 1000 would silently desync from the startup-script'schown 999:999, reintroducing the same incident. Pinning makes the contract grep-able from both sides.scripts/smoke-test.shno longer silently SKIPs every authed check whenbootstrapreturns 403 (users exist) andSMOKE_TOKENis not set — it now FAILs loudly. Also adds an unauthenticated DB-touching probe (POST /auth/email/request) before bootstrap, since/api/healthdeliberately doesn't opensystem.duckdb(kept cheap for LB probes) and so cannot detect filesystem/permission issues. The new probe catches a class of regression that bypasses health-only monitoring even on instances where bootstrap is closed.- Corporate memory pages (
/corporate-memory,/corporate-memory/admin) now render the shared app header at full viewport width, matching the dashboard. Previously the_app_header.htmlinclude sat inside.container-memory(max-width: 1000px) and was cropped on wide viewports. release.ymlnow publishes a:dev-<slug>+:dev-<prefix>-latestimage when a fresh branch is pushed offmainwith no extra commits. Pre-fix,paths-ignoreon thepushevent diffed the new ref against the default branch — a same-SHA branch had zero diff, every file matched paths-ignore, and the workflow was skipped, so a developer creating a personal branch off main to deploy main's exact state to their dev VM (which pins to:dev-<user>-latest) had to either commit something or trigger the workflow manually. Thebuild-and-pushjob'sifwas also tightened tomain || workflow_dispatchonly, which prevented branch-push images regardless. Both fixed: addedcreate:trigger (filtered to branch refs at the job level so tag creates don't double-build withkeboola-deploy.yml), and broadenedbuild-and-push.ifto also publish on non-main branch pushes / branch creates.- Web header admin nav (All tokens, Marketplaces, Admin → Users / Groups / Resource access / Server config) is now visible to admin users again. Pre-fix,
_app_header.htmlgated the admin block onsession.user.role == 'admin', but the v13 RBAC migration nulledusers.roleand moved admin authority ontouser_group_members(Admin system group) — so the gate evaluated to false for everyone, including actual admins.get_current_usernow injectsuser["is_admin"](computed viaapp.auth.access.is_user_admin, the same call all server-side admin gates use), and the header readssession.user.is_admin. The role badge in the user-menu dropdown now reads "Admin" or hides —users.roleis no longer surfaced in the UI. admin_tablesregister modal payload now matches theRegisterTableRequestAPI contract — drops the phantomidandversionfields the modal used to send (the API silently dropped them), and renamesdataset→bucketso the source-bucket actually persists. Pre-fix the operator's bucket / dataset edit looked saved but never made it past the wire. Edit + delete handlers in the same template were dropping the same fields and are also corrected.- Discovery JS in
admin_tablesnow handles the actual{tables: [...]}flat shape returned byGET /api/admin/discover-tables. Pre-fix the JS expected{buckets: [...]}(a shape the API never emitted) and silently rendered an empty discovery panel after the first call. - #108 review fixes for BigQuery register-table. (a) The post-register materialize worker (BackgroundTask + 5s-timeout daemon thread) no longer captures the request-scoped DuckDB connection — it opens a fresh
get_system_db()handle per run, so the request'sfinally: conn.close()no longer races the worker. (b)connectors/bigquery/extractor.init_extractis now serialized by a module-level_INIT_EXTRACT_LOCKso the timeout-fallback BackgroundTask cannot collide with the still-running daemon thread on theextract.duckdbswap. (c)PUT /api/admin/registry/{id}now runs the same BQ-shape validation as register when the merged record is a BigQuery row (or the patch flips it to BigQuery), returning 400/422 instead of silently persisting an unsafebucket/source_table/ project_id and breaking at the next rebuild. (d)POST /api/admin/register-tableno longer carries a misleadingstatus_code=201on the route decorator — the Keboola branch explicitly returns 201, the BigQuery branch returns 200 (sync) or 202 (timeout fallback), and OpenAPI now documents all three. - #108 round-4 review fix for BigQuery register-table.
_validate_bigquery_register_payloadnow applies the same raw-value rule tobucketandsource_tableas round-3 added forname. Pre-fix the helper validatedbucket.strip()/source_table.strip()butregister_tablepersisted the un-stripped value, so abucket=" my_dataset"slipped through validation, got stored verbatim, and 500'd at the next rebuild when the BQ extractor spliced it intoATTACH … AS bq_<bucket>and view DDL. The validator now rejects anybucket/source_tablewith leading/trailing whitespace and surfaces the offending raw value in the 400 detail. Applies identically toPOST /api/admin/register-tableandPOST /api/admin/register-table/precheck. - #108 round-3 review fixes for BigQuery register-table. (a)
_validate_bigquery_register_payloadnow validates the raw view name (the value persisted totable_registry.nameand read back by the BQ extractor), not a normalizedstrip().lower().replace(" ", "_")form. Pre-fix a name like"my table"passed validation (normalized"my_table"was safe), got stored verbatim, and then 500'd at the post-insert rebuild — defeating fast-fail-at-register. The validator now rejects any name with leading/trailing whitespace OR that fails the strict^[a-zA-Z_][a-zA-Z0-9_]{0,63}$check, and surfaces the offending raw value verbatim in the 400 response so the operator can retype with a corrected name. Server does NOT silently rewrite the input. Applies identically toPOST /api/admin/register-tableandPOST /api/admin/register-table/precheck. (b)_run_bigquery_materialize_with_timeoutnow distinguishes worker-raised-within-budget (→{"status": "errors"}→ HTTP 500 with the exception in the body) from worker-still-running-at-timeout (→{"status": "timeout"}→ HTTP 202 + BackgroundTask retry). Pre-fix both outcomes mapped to "timeout" / 202, hiding the real failure for the budget window before the BG retry surfaced the same exception in the logs. (c)register_table_precheckis now a plaindef(wasasync def) — the BQ branch makes synchronousbigquery.Client(...)/client.get_table(...)calls that would otherwise block the asyncio event loop on an async handler. Mirrors the same conversion already done forregister_table. - #108 round-2 review fixes for BigQuery register-table. (a)
POST /api/admin/register-tableis now a plaindef(wasasync def) — the synchronous-materialize path waits onthreading.Event.wait(), which blocks the asyncio event loop on an async handler and stalls every other request for up to the 5s budget. FastAPI runs sync handlers in a threadpool so the wait is harmless there. (b)connectors/bigquery/extractor.rebuild_from_registrynow resolvesdata_source.bigquery.projectviaapp.instance_config.get_value(deep-merge of static + writable overlay) instead ofconfig.loader.load_instance_config(static only). Operators who set the project throughPOST /api/admin/configuregot a silent rebuild failure pre-fix — validation passed (validation already used the overlay-aware read) but the rebuild reported "project missing" and the master view never appeared. (c)register-tablenow propagatesrebuild_from_registryerrors as HTTP 500 with{"status": "rebuild_failed", "errors": [...]}when the synchronous rebuild ran but reported an error (auth failure, missing project, unsafe identifier slipping the validator). Pre-fix those errors were silently logged and the API returned 200 ok. The BackgroundTask path now logs rebuild errors at ERROR level (was WARNING). (d) The admin tables UI's BigQuery register modal now splits precheck and register into two operator-driven clicks — Step 1 fires precheck and surfaces row count / size / column count in the modal AND swaps the primary button to "Register"; Step 2 fires the actual register call only when the operator clicks. Pre-fix the precheck and register fired in a single chained promise, so the operator never got to review the summary before the row was committed. (e) The Keboola register-modal payload now derivessource_tablefrom the discovered table's storage identifier (t.idminus the bucket prefix, e.g.companyforin.c-sfdc.company) via a new hiddenregSourceTablefield. Pre-fix the JS sentregTableName(the human-friendly display name) assource_table; manual-entry callers fall back to the display name. (f)da admin discover-and-registeraccepts HTTP 200 / 201 / 202 as success (was 201 only); pre-fix every successful BigQuery row counted as an error because BQ register returns 200 (sync OK) or 202 (background) but never 201.
Internal
_sanitize_for_auditnow masks against an explicit_SECRET_FIELDSallowlist instead of substring-scanning + maintaining aprimary_keywhitelist exception. New tests assertnot_actually_a_token/primary_key_hash/passwordlessflow through cleartext while known-secret fields (keboola_token,client_secret,smtp_password,bot_token) get masked. Operationally identical for the current registry payloads (no secret-bearing fields), but removes a class of false-positive / false-negative as the request body grows.release.ymladds ane2e-bind-mountjob that boots the freshly built image against a host-bind-mounted/datadirectory (instead of the named volume the existingsmoke-testjob uses). Docker initializes a fresh named volume by copying from the image's/data— which the Dockerfile chowns toagnes:agnesbefore flipping USER — so the named-volume path always works. The bind-mount path mirrors what GCE VMs run viadocker-compose.host-mount.yml, and includes a negative assertion (write must fail on root-owned/databefore the operator chown) plus a positive assertion (smoke passes after the chown). Locks in the contract that broke a recent release: removingchown 999:999fromstartup-script.sh.tplor changing the Dockerfile uid pin breaks CI.- Extracted
bigquery.extractor.rebuild_from_registry()from the__main__block ofconnectors/bigquery/extractor.pyso the API can call it post-register withoutrunpy-importing the module. The standalone CLI entrypoint (python -m connectors.bigquery.extractor) keeps working.
0.16.0 — 2026-04-29
Minor release. Comprehensive deploy safety audit — CI/CD pipeline hardening, 50+ new tests covering previously untested failure modes, DB schema health check, config versioning, and BigQuery ATTACH error resilience. Built on top of v0.15.0 / 2e1dfb7.
PR: #120 (ci/deploy-safety-audit).
Added
- ruff lint + mypy type check in
release.ymlandkeboola-deploy.ymlCI workflows (bothcontinue-on-error: trueinitially — 257 pre-existing ruff errors, mypy has pre-existing issues; neither blocks CI yet). - Automatic rollback on smoke test failure in
release.yml— tags the broken image as:deprecated-<short-sha>, reverts:stableto the previous good tag, opens a GitHub issue for investigation. - Smoke test in
keboola-deploy.yml— was completely missing; now runs the samesmoke-test.shasrelease.yml. - Expanded smoke-test.sh — added
/api/catalog,/api/admin/tables,/marketplace.zip,/api/metricsendpoint checks beyond the original/api/health. - Post-deploy smoke test (
scripts/ops/post-deploy-smoke-test.sh) — validates health, DB schema version, query endpoint, catalog, and marketplace on a prod VM after deploy. - DB schema version check in
/api/health— returnsdb_schema: "ok" | "mismatch" | "unreachable"; overall status becomes"unhealthy"on schema mismatch. Lets load balancers and monitoring detect half-migrated instances. - Config versioning —
config_version: 1ininstance.yaml, validated at startup by_validate_config_version()inconfig/loader.py. Prevents silent misconfiguration when the config schema evolves. .github/settings.yml— required status checks onmainbranch..github/dependabot.yml— weekly pip + github-actions dependency updates..github/CODEOWNERS— default@keboola/agnes-team, special owners for/infra/,/app/auth/,src/db.py..pre-commit-config.yaml— detect-private-key, check-yaml/json/merge-conflict, ruff, mypy.[tool.ruff]config inpyproject.toml—line-length = 120,target-version = "py313".
Test Coverage (~50 new tests)
- v13→v14 migration (
test_db.py): orphan cleanup, FK constraints, rollback on failure. - Email magic link TTL (
test_auth_providers.py): expired token, token reuse, wrong token. - PAT (
test_pat.py): malformed JWT, empty bearer,last_used_iptracking. - Marketplace ZIP (
test_marketplace_server_zip.py): ETag/304, PAT auth, content-addressed caching,invalidate_etag_cache()on mutation. - Marketplace Git (
test_marketplace_server_git.py): smart HTTP, Basic auth with PAT, RBAC filtering. - Jira webhooks (
test_jira_webhooks.py): HMAC validation, missing signature, malformed JSON (10 tests). - Hybrid Query BQ (
test_remote_query.py):register_bq, JOIN local+BQ, error handling (12 tests). - Keboola extractor (
test_keboola_extractor.py): crash, partial write, timeout, extension fallback (9 tests). - BigQuery extractor (
test_bigquery_extractor.py): corrupted DB, partial write, atomic swap, ATTACH timeout (6 tests). - Orchestrator (
test_orchestrator.py): corrupted extract.duckdb, empty_meta, mid-write, unsafe identifiers (5 tests).
Fixed
- BigQuery extractor ATTACH error handling —
init_extract()now catches exceptions onINSTALL/ATTACHand records them instats["errors"]instead of propagating up. A network timeout or auth failure no longer crashes the extractor; all configured tables are marked as skipped. - ETag cache invalidation on disk mutation —
invalidate_etag_cache()is the documented way to force re-hash after marketplace sync. Tests now call it after mutating on-disk content.
Internal
fetch-depth: 0+fetch-tags: trueinrelease.ymlfor rollback tag resolution.- Docs updated:
ARCHITECTURE.md,docs/DATA_SOURCES.md,docs/QUICKSTART.md,docs/RBAC.md,docs/auth-groups.md.
0.15.0 — 2026-04-29
Minor release. Adds corporate memory v1+v1.5 and /me/debug self-only auth diagnostic. See GitHub release for full notes.
0.14.0 — 2026-04-28
Minor release. Replaces BigQuery wrap-view pattern with Claude-driven fetch primitives. See GitHub release for full notes.
0.13.0 — 2026-04-28
Minor release. Admin server-config editor + Windows PowerShell wrapper. See GitHub release for full notes.
0.12.1 — 2026-04-28
Patch release. Hotfixes the pre-migration snapshot-integrity bug shipped in v0.12.0 and bundles the security/ops hardening from issue groups #82 (auth hardening), #85 (API validation), #87 (deploy posture), plus #46 (SSRF) and #90 (memory stats blocking).
Added
- Path-traversal validation on
/api/data/{table_id}/download—table_idis now checked against_SAFE_QUOTED_IDENTIFIERregex (allows dots and hyphens for Keboola-style IDs likein.c-crm.orders) before any filesystem or DB operation; unsafe values return 404 (no info leakage). See issue #85/C2. - SSRF protection on
POST /api/admin/configure—keboola_urlis validated against private/reserved networks (127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, localhost, IPv6 loopback/link-local/unique-local). Uses DNS resolution +ipaddressmodule for robust IPv6 handling (catches abbreviated forms likefe80::1,fc00::1). See issue #46. - Caddyfile security headers:
X-Frame-Options DENY,X-Content-Type-Options nosniff,Referrer-Policy strict-origin-when-cross-origin,-Server(strip). See issue #87/M22. - Container runs as non-root user
agnes—USERdirective added to Dockerfile withuseradd+chown. See issue #87/C13. - Docker resource limits:
mem_limit: 4g,mem_reservation: 1g,cpus: 2.0onapp;mem_limit: 2g,cpus: 1.0onscheduler. See issue #87/M21. - Startup warning when no user has
password_hash— alerts operators that/auth/bootstrapis reachable. See issue #82/C8. - Audit logging for failed web form login attempts (
/auth/password/login/web) — mirrors the existing/auth/tokenaudit trail. See issue #82/M9. /api/health/detailedendpoint (authenticated) — returns full diagnostics (version, schema, sync state, user count). Minimal/api/health(unauth) returns only{"status": "ok"}for load balancers. See issue #87/M17.- Health endpoint monitoring guide in
docs/DEPLOYMENT.md— documents both endpoints and how to wire external monitoring tools (Datadog, Prometheus, UptimeRobot) to/api/health/detailedwith a PAT.
Changed
- BREAKING
docker-compose.override.ymlrenamed todocker-compose.dev.yml. Docker Compose auto-mergesdocker-compose.override.ymlon every host with the repo, silently enabling dev mode (source mount +--reload) on production. The new name requires explicit-f docker-compose.dev.yml, eliminating the foot-gun. Update any scripts or workflows that relied on auto-merge.scripts/run-local-dev.shandMakefileupdated accordingly. See issue #87/M23. - BREAKING
/api/healthnow returns a minimal{"status": "ok"}payload (unauthenticated, for load balancers). Full diagnostics moved to/api/health/detailed(requires authentication). Scripts that parsed/api/healthfor version, sync state, or user count must switch to/api/health/detailedwith anAuthorizationheader. CLI commands (da setup test-connection,da setup verify,da diagnose,da status) updated to call/api/health/detailedfor service-level checks, with graceful fallback to the minimal endpoint when auth is not configured. See issue #87/M17. release.ymlCI workflow:build-and-pushjob now only runs onmainpushes or manualworkflow_dispatchtriggers. Non-main branch pushes run tests only. Addedpaths-ignorefordocs/**,*.md,LICENSE. See issue #87/M26.
Fixed
- Pre-migration snapshot integrity — the snapshot file written
before a v(N-k)→vN migration now captures the true on-disk state
before any DDL runs, instead of the post-self-heal state the
0.12.0 hoist (#106) introduced. With the unconditional
conn.execute(_SYSTEM_SCHEMA)at the top of_ensure_schema, the full set of modern-binary tables (view_ownership,marketplace_registry,user_groups,resource_grants, etc.) was materialized first, thenCHECKPOINTflushed them to disk, andshutil.copy2copied the already-modified DB as the "pre-migration" snapshot — so an operator inspecting the snapshot for rollback debugging saw the binary's full table set instead of the old schema. Functionally rollback still worked (extra empty tables are harmless and re-running migration is idempotent), but the snapshot was misleading. Fix: gate the self-heal call oncurrent >= SCHEMA_VERSION. The split-brain (current > SCHEMA_VERSION) and same-version safety-net (current == SCHEMA_VERSION) paths still self-heal as before; the migration path (current < SCHEMA_VERSION) takes its snapshot first and then runs_SYSTEM_SCHEMAfrom inside the existing migration block. reset_tokenno longer leaks in the JSON response body ofPOST /api/users/{id}/reset-password. Thereset_urlstill contains the token (as intended), but the raw secret is no longer exposed to DevTools, proxy logs, or CLI stdout. CLIadmin reset-passwordnow prints the URL instead of the bare token. See issue #82/C5./api/memory/statsno longer blocks the async event loop — replacedrepo.list_items(limit=10000)+ Python loop with a single SQLGROUP BYaggregation. See issue #90.- Magic-link token consumption is now atomic — compare-and-swap pattern
with a unique
CONSUMED:marker prevents two concurrent verifies from both succeeding. DuckDB concurrent-write conflicts are caught and converted to 401. See issue #82/M10. - Password reset confirm (
POST /auth/password/reset/confirm) now uses the same compare-and-swap pattern as the magic-link flow — closes the remaining asymmetry onusers.reset_tokenconsumption. Lower severity than the magic-link race because the reset flow ends with a new password (an attacker would need the reset token and to race the legitimate user) but the consistency closes a polish gap. New regressiontest_concurrent_reset_only_one_winsintests/test_password_flows.py::TestResetConfirm. - Upload endpoints (
/sessions,/artifacts) now stream to a temp file with cumulative size check instead of buffering the entire body in memory before the size cap — prevents OOM from oversized uploads. Temp file handle is properly closed beforeshutil.moveto avoid FD leaks. See issue #85/M4. /api/upload/local-mduses a SHA-256 hashed filename instead of rawuser_email— stable per user, no charset surprises from email addresses. See issue #85/M4./auth/bootstrap403 message no longer leaks user count. See issue #82/n1.
Internal
- New regression
test_split_brain_future_version_with_missing_tables_self_healsintests/test_db.py::TestMigrationSafety— synthesizes a v99 DB whose only table isschema_version, runs_ensure_schema, asserts that the v13-era core tables (users,user_groups,user_group_members,resource_grants) get materialized and thatschema_versionstays at 99 (self-heal without falsely advertising a downgrade). - New regression
test_pre_migration_snapshot_excludes_post_self_heal_tablespins the snapshot-integrity contract: a v2→vN migration's snapshot must not contain any post-v2 table from the modern binary. Sanity-checked against the pre-fix unconditional hoist — fails with 6 leaked tables. test_future_version_is_noopdocstring updated to reflect that the self-heal pass does run on a future-version DB, just doesn't touch the version row. The test still passes unchanged — its only assertion was the version-row contract, which holds.test_no_override_fileregression test assertsdocker-compose.override.ymldoes not exist post-rename. See issue #87/M23.
0.12.0 — 2026-04-28
Changed
/admin/accessresource tree now visually separates the three-level hierarchy (resource type → block/bucket → item). Each resource-type section gets a colored left stripe and a faint tinted banner; sections are separated by an 8px neutral gap. Stripe colors cycle 4-wide vianth-childso adding new resource types toapp/resource_types.pyworks without touching CSS. The first-position color is the project primary blue (#0073D1), avoiding the violet (#6366f1) reserved for granted items.
Added
ResourceType.TABLE— admins can grant table-level access peruser_groupvia the/admin/accesspage. Tables registered intable_registryare listed grouped bybucket, with the existing per-block "Grant all" / "Revoke all" bulk actions. Listing and grant storage only — runtime enforcement still flows through legacydataset_permissions; the migration plan lives indocs/TODO-rbac-data-enforcement.md.AGNES_ENABLE_TABLE_GRANTSenv var (default off) gates the half-builtResourceType.TABLEchip. While disabled the chip is hidden from/admin/accessandPOST /api/admin/grantsreturns 422 with the env-var name indetailon a TABLE grant attempt. Existing TABLE rows inresource_grantsstay listable and deletable — the flag controls UI exposure and new-grant acceptance only, never blocks cleanup.da admin break-glass <user>CLI — recovery path when the operator has locked themselves out of/admin/access. Adds the user to the Admin user_group withsource='system_seed'regardless of RBAC state. Bypasses authentication; relies on filesystem access to${DATA_DIR}/state/system.duckdbimplying host-level trust. Document this in deployment runbooks alongsideSEED_ADMIN_EMAIL.
Internal
scripts/seed_dummy_tables.py— populatestable_registrywith 12 dummy tables across 3 buckets (in.c-finance,in.c-marketing,in.c-product), each withis_public=False, for exercising the new/admin/accessTables section without a configured data source./marketplace.zipshort-circuits to304before any file IO or ZIP compression on a matchingIf-None-Match. Hot path on every Claude Code SessionStart hook. Backed by an in-processcachetools.TTLCacheover the resolved-plugins → ETag map (default 120s, env-tunable viaAGNES_MARKETPLACE_ETAG_TTL, set0to disable).invalidate_etag_cache()is called by marketplace sync after refresh so the next request re-hashes against new on-disk content instead of waiting for TTL expiry. New explicit dependency:cachetools>=5.3.0.
Fixed
/admin/accessgroup sidebar grant-count badges no longer revert to a stale value when switching between groups. The badge was readingstate.groups[i].grant_count, a snapshot populated once at/access-overviewload; toggling a grant only updated the DOM (viarefreshCounts), not that field, so the nextrenderGroupscall (triggered byselectGroup) would clobber the live count with the original snapshot.renderGroupsnow derives the count live fromstate.grants, the array thattoggleGrant/bulkSetkeep in sync. Server data was always correct — only the in-page badge drifted until refresh./catalog,/admin/tables, and/admin/permissionspages now render the shared top header correctly. The pages include_app_header.html(which uses.app-*CSS classes) but were not linkingstyle-custom.csswhere those classes are defined; onlydashboard.htmlandbase.htmldid. Without the stylesheet the nav links, dropdowns, and user menu rendered as unstyled inline text. Added the missing<link>to all three templates.PATCH /api/admin/groups/{id}on a system group now correctly accepts description-only updates while still rejecting renames. The endpoint guard previously short-circuited with409 "System groups are immutable"for any mutation, which contradicted the repository layer's narrowed contract (rename-only rejection) — a description-only payload like{"description": "..."}would hit the endpoint short-circuit and never reach the repo. The endpoint now 409s only whenpayload.namediffers from the existing name; a no-op rename (same name in payload) is dropped from the update before reaching the repo.- Google OAuth callback no longer wipes a user's
google_syncgroup memberships on a transient Workspace API failure.fetch_user_groupsis fail-soft and returns[]for both "no groups" and "API error" — the callback used to feed that empty list intoreplace_google_sync_groups, which deletes allsource='google_sync'rows for the user and then inserts zero. A login during a transient Cloud Identity hiccup would silently drop every Workspace-synced membership the user had built up. Admin-added memberships (source='admin') were already protected. The callback now skipsreplace_google_sync_groupswhen the fetch returns empty and logs "preserving existing memberships" instead. Trade-off: a user whose Workspace groups were genuinely cleared keeps stale memberships until the next non-empty sync — accepted untilfetch_user_groupslearns to distinguish empty-success from empty-failure. docker-compose.host-mount.ymlnow useso: bind,rbindinstead ofo: bindfor thedatavolume. With a plain bind, sub-mounts under/dataon the host (e.g. the dual-disk layout where sdc is mounted on/data/state) are silently shadowed inside the container by an empty subdirectory on the parent disk. The container then writessystem.duckdband other state to the wrong disk; the dedicated state disk receives no writes and accumulates only the snapshot left by the migration script. Recursive bind propagates existing sub-mounts at container start, so the container sees the same filesystem the host does. Operators on dual-disk VMs need to copy the live DB from/var/lib/docker/volumes/agnes_data/_data/state/(sdb's empty subdir) onto/data/state/(sdc) before redeploying with the fix, or the next start will surface the stale snapshot.
Changed
- BREAKING Marketplace endpoint (
/marketplace.zip,/marketplace.git/*) no longer god-modes for Admin members.src.marketplace_filter.resolve_allowed_pluginsnow filters every caller — admins included — throughresource_grants. Admins curate their own marketplace view by granting plugins to the Admin group (or any group they belong to). Existing installs where the only membership on Admin is the admin themselves will see an empty marketplace until grants are added in/admin/access. App-level authorization (require_admin,can_accessfor non-marketplace types) is unaffected — Admin is still god mode there. - BREAKING RBAC redesigned around two layers: app-level access via the
Adminuser-group (god mode short-circuit) and resource-level access via a generic(group, resource_type, resource_id)grant model. The four-valuecore.viewer/analyst/km_admin/adminhierarchy withimpliesBFS expansion is gone — every protected endpoint now uses eitherrequire_adminorrequire_resource_access(ResourceType.X, "{path}")from the newapp.auth.accessmodule. Authorization is decided per-request via a single DB lookup; no session cache, no dual-path resolver, no_hydrate_legacy_roleshim. Seedocs/RBAC.md. - BREAKING
internal_roles,group_mappings,user_role_grants, andplugin_accesstables removed. Replaced byuser_group_members(binds users to user_groups with asourceenum:admin/google_sync/system_seed) andresource_grants(group →(resource_type, resource_id)). Schema v13; the migration backfills from v12 atomically —users.groupsJSON is converted intouser_group_membersrows withsource='google_sync',core.admingrants become Admin-group memberships withsource='system_seed', andplugin_accessrows becomeresource_grantsof typemarketplace_plugin. Theusers.groupsJSON column is dropped; the deprecatedusers.rolecolumn is kept NULL as a legacy artifact. - BREAKING Schema v14 —
user_group_membersandresource_grantsnow declare DuckDB foreign-key constraints ongroup_id(referencinguser_groups.id). Cascade deletes can no longer leave orphaned member / grant rows pointing at a deleted group. Migration is RENAME → CREATE-with-FK → INSERT → DROP, wrapped inBEGIN TRANSACTIONso a partial failure rolls back without leaving the DB at a half-applied schema. Forks that touched these tables outside the documented repository APIs need to verify the FK direction matches their writes. - BREAKING Admin REST surface unified under
/api/admin/groups,/api/admin/groups/{id}/members,/api/admin/grants,/api/admin/resource-types.app.api.role_managementandapp.api.plugin_accessremoved. The web UI route/admin/role-mappingand/admin/plugin-accessare replaced by a single/admin/accesspage; the_app_header.htmllink is renamed to "Access". - BREAKING CLI subcommands
da admin role *,da admin mapping *,da admin grant-role,da admin revoke-role,da admin effective-rolesremoved. New subcommands:da admin group {list,create,delete,members,add-member,remove-member}andda admin grant {list,create,delete,resource-types}.da admin set-role <user> adminstill works as a thin wrapper that toggles Admin-group membership. - Module authors no longer call
register_internal_role(...). Resource types are anapp.resource_types.ResourceTypeStrEnumpaired with aResourceTypeSpecregistered inRESOURCE_TYPES; adding a new resource type means adding one enum member, onelist_blocks(conn)projection delegate, and one spec entry — all inapp/resource_types.py. The registry drives both/api/admin/resource-typesand/api/admin/access-overview, so there's no second wiring step. No DB migration, no startup hook. - Google OAuth callback writes Cloud Identity group memberships into
user_group_members(source='google_sync') instead ofusers.groupsJSON. Manual admin-added memberships (source='admin') survive subsequent logins.
Removed
app/auth/role_resolver.py,app/api/role_management.py,app/api/plugin_access.py.src/repositories/internal_roles.py,src/repositories/group_mappings.py,src/repositories/user_role_grants.py.app/web/templates/admin_role_mapping.html,app/web/templates/admin_plugin_access.html.Roleenum +has_role,is_admin,is_km_admin,is_analyst,_is_admin_user_dict,set_user_role,get_user_rolefromsrc/rbac.py. Dataset-access helpers (can_access_table,get_accessible_tables,has_dataset_access) preserved.- Test files:
test_role_resolver.py,test_api_role_management.py,test_admin_role_mapping_ui.py,test_cli_admin_role.py,test_schema_v9_migration.py,test_plugin_access_api.py.
Internal
src/db.pyschema bumped to v13. New helpers_seed_system_groups(idempotent Admin/Everyone seed, runs on every connect) and_v12_to_v13_finalize(one-shot backfill + DROP cascade) replace_seed_core_rolesand_backfill_users_role_to_grants.app.auth.accessis the new authorization vocabulary:_user_group_ids,is_user_admin,can_access,require_admin,require_resource_access. Lives in its own module to avoid the circular import that would happen if it sat inapp.auth.dependencies(the dependency factory needsget_current_userfrom there).- New
tests/helpers/auth.py::grant_admin(conn, user_id)— adds a user to the Admin system group sorequire_adminresolves to True. Updated test fixtures acrosstest_admin_tokens_ui,test_password_flows,test_pat,test_api,test_api_complete,test_api_scripts,test_web_uito call it afterUserRepository.create(role="admin"). The legacyusers.rolecolumn alone is no longer the admin marker. - Skipped at module level (rewrite required for v13):
test_admin_user_capabilities_ui(asserts the gone v9 capabilities UI),test_marketplace_server_zipandtest_marketplace_server_git(depend on the removedPluginAccessRepository). - Skipped individually as v13 behavior changes:
TestScriptRBACintest_security(scripts are now any-signed-in-user, not analyst+), profile-page tests intest_web_uithat assertedcore.analyst/Direct grants/Effective rolesmarkers from the dropped role hierarchy.
Added
/api/v2/{catalog,schema,sample,scan,scan/estimate}— discovery + scoped fetch primitives for remote-mode tables. Seedocs/superpowers/specs/2026-04-27-claude-fetch-primitives-design.md.da catalog,da schema,da describe,da fetch,da snapshot {list,refresh,drop,prune},da disk-info— CLI primitives backed by the v2 API.cli/skills/agnes-data-querying.md— Claude rails skill loaded for Agnes-flavored projects; covers discovery-first protocol,da fetchworkflow, BigQuery SQL flavor cheat-sheet, and snapshot hygiene.cli/skills/agnes-table-registration.md— admin-side companion skill: when to register single vs. bulk-discover, source-side verification before registration, idempotence rules, update/delete via REST (no CLI today), and confirmation flow.instance.yaml: api.scan.*knobs —max_limit,max_result_bytes,max_concurrent_per_user,max_daily_bytes_per_user,bq_cost_per_tb_usd,request_timeout_seconds. All optional; defaults applied if absent.instance.yaml: api.catalog_cache_ttl_seconds,api.schema_cache_ttl_seconds,api.sample_cache_ttl_seconds— TTL knobs for server-side discovery caches.instance.yaml: data_source.bigquery.legacy_wrap_views— opt-in toggle to restore the pre-v2 behavior of exposing BigQuery VIEW/MATERIALIZED_VIEW tables as DuckDB master views inanalytics.duckdb. Defaultfalse. Settruefor one release cycle when migrating existing scripts (see BREAKING note below).instance.yaml: data_source.bigquery.billing_project— optional GCP project to bill BQ jobs to / submit jobs from. Defaults todata_source.bigquery.projectfor backwards compatibility. Set when the SA hasbigquery.data.*on the data project but lacksserviceusage.services.usethere (cross-project read pattern); otherwise/api/v2/scan/estimateand BQ-mode/api/v2/scanfail with 403.- BigQuery extractor detects table type (BASE TABLE vs. VIEW / MATERIALIZED_VIEW) via
INFORMATION_SCHEMA.TABLESusing DuckDB'sbigquery_query()table function. Emits the appropriate DuckDB view:- BASE TABLE → direct
bq."dataset"."table"reference (queries hit BigQuery Storage Read API). - VIEW / MATERIALIZED_VIEW →
bigquery_query('project', 'SELECT * FROM \dataset.table`')` wrapper (queries hit BigQuery Jobs API, required for views).
- BASE TABLE → direct
- GCE metadata-server authentication for BigQuery. New
connectors/bigquery/auth.pymodule (get_metadata_token()function) fetches OAuth access tokens from the GCE metadata server on GCE instances. No service-account key file required. Both the extractor (at sync time) and the orchestrator / read-side (at ATTACH time) fetch fresh tokens on every rebuild / readonly-conn open. RaisesBQMetadataAuthErroron failure (network or malformed metadata-server response). - SQL identifier-validation helper in
src/sql_safe.py. New functionsis_safe_identifier()andvalidate_identifier()enforce safe character sets before f-stringing identifiers into SQL. BigQuery extractor and orchestrator_attach_remote_extensionsboth validatedataset,source_table, and view names before use, closing a SQL-injection surface if admin config is untrusted. /api/sync/manifestresponse now includesquery_modeandsource_typeper table, joined fromtable_registry. Clients can branch on table semantics (remote vs. local, source type) without a second API call.da sync --jsonoutput now includes askipped_remotelist with IDs ofquery_mode='remote'tables that were skipped during sync (they're not downloaded locally; only queried via/api/query).- Schema v10 introduces
view_ownershipto detect cross-connector view-name collisions in the master analytics DB (issue #81 Group C). When two connectors register the same_meta.table_name, the orchestrator now refuses to silently overwrite the prior owner's view — it logs aview_ownership collisionERROR identifying both sources and the colliding name, and the second source's view is NOT created. Previously this was last-write-wins, which depended on directory iteration order and could change deployment-to-deployment. Operators resolve a collision by renamingnameintable_registryon one side (registry-side aliasing —source_tablestays unchanged, only the view name changes). The orchestrator pre-scans every connector's_metaat the start of each rebuild and releases stale ownerships immediately (when ALL pre-scans succeed; if any fail, reconcile is skipped to avoid silently stealing a transient-IO source's name), so a renamed table frees its name in the SAME rebuild that introduces the rename — no two-step waits needed. New modulesrc/repositories/view_ownership.pyexposes the repository.
Changed
- BREAKING: BigQuery
VIEWandMATERIALIZED_VIEWtables (i.e.query_mode='remote'tables whose underlying BQ object is a view) are no longer wrapped as DuckDB master views inanalytics.duckdb.da query --remote "SELECT * FROM <bq_view>"no longer resolves the view name by default. Useda fetch <table_id> --where ... --as <snapshot_name>to materialize a local snapshot, orda query --remote "SELECT ... FROM bigquery_query('<project>', '<inner BQ SQL>')"for one-shot execution. To restore the previous behavior for a migration window, setinstance.yaml: data_source.bigquery.legacy_wrap_views: true. BQBASE TABLEentities are unaffected — their direct-ref master views remain. da syncskipsquery_mode='remote'tables. Previously they produced 404s on download attempts. Now the CLI prints a one-line stderr summary (Skipping N remote-mode tables: a, b, c (and M more)) and a separate summary line (Skipped (remote-mode): N) in the final output, distinct from existingSkipped (unchanged): Mcounts.
Fixed
/api/v2/scan500 on local-mode tables.arrow_table_to_ipc_bytes()only handledpa.Table; DuckDB's local query path returns apa.RecordBatchReader. Helper now accepts both. (Caught during dev-VM E2E.)/api/v2/schema/{table_id}500 on BigQuery tables._fetch_bq_schema()selecteddescriptionfromINFORMATION_SCHEMA.COLUMNS, which BigQuery doesn't expose there — column descriptions live inINFORMATION_SCHEMA.COLUMN_FIELD_PATHSfor nested fields. Removed the column from the SELECT; descriptions default to empty string until a real source is wired. (Caught during dev-VM E2E.)- BigQuery views failed at first query when FastAPI / CLI reopened
analytics.duckdb.SyncOrchestrator._attach_remote_extensionsfetches a fresh GCE-metadata access token and creates abigqueryDuckDB SECRET before ATTACH, but secrets are session-scoped and don't persist with the on-disk database. The mirror code insrc.db._reattach_remote_extensions(called fromget_analytics_db_readonly()) still ATTACHed BigQuery without auth, so the next query againstbq."dataset"."table"failed. Fixed by adding the same three-branch structure tosrc.db: BigQuery → fetch metadata token →CREATE OR REPLACE SECRET bq_secret_<alias> (TYPE bigquery, ACCESS_TOKEN '<token>')→ ATTACH; otherwise fall back to env-var-token / no-auth paths. Metadata-server failures log at ERROR and skip the source so other connectors still resolve. src/orchestrator.py::_attach_remote_extensionswas ineffective for BigQuery. It filtered_remote_attachlookups bytable_schema=<source_name>, but DuckDB lists an attached database withtable_catalog=<source_name>(nottable_schema), so the loop never executed and_remote_attachrows were silently ignored. Switched the filter totable_catalog, matching the corresponding query already insrc.db.- BigQuery extractor
python -m connectors.bigquery.extractorstandalone CLI now reads project ID fromdata_source.bigquery.projectmatchinginstance.yaml.example. Previously it looked for an undocumented top-levelbigquery.project_idkey and silently produced an empty string on miss, causing cryptic BigQuery API errors downstream. Now exits with code 2 + a clearlogger.errorwhen the key is missing.
Internal
- Test pattern: BigQuery extractor is exercised with a dual-path strategy (BASE TABLE + VIEW detection) via
_CapturingProxySQL-capture wrappers. DuckDB's C-implementedexecuteattribute is read-only and can't be monkey-patched directly; the proxy wraps the connection and captures outgoing SQL before forwarding to the real DuckDB conn. - Implementation plan:
docs/superpowers/plans/2026-04-27-bq-pipeline-views-and-metadata-auth.md— subagent-driven development for Tasks 1-7 of this PR.
Changed (issue #81 / #44 / #88 — security & OSS neutralization)
-
BREAKING (ops): Keboola extractor now exits with three distinct codes instead of two (issue #81 Group B / M14):
0= full success,1= full failure,2= partial failure (some tables succeeded, some failed). Previouslyexit(0)fired even when 9 of 10 tables failed, masking partial failures from the sync API and any operator alerting hooked to non-zero exit codes. The sync API (POST /api/sync/trigger) now logsPARTIAL FAILURE (exit 2)as a data-quality alert (distinct fromFAILED (exit 1)) and continues to the orchestrator rebuild step — successful tables from this run plus unchanged tables from previous runs stay queryable. Operators whose alerting treated any non-zero exit as a hard error must teach it that exit 2 is a partial-failure signal, not a deploy failure. -
BREAKING (security): The entire Script API is now admin-only (issue #44).
GET /api/scripts,POST /api/scripts/deploy,POST /api/scripts/run, andPOST /api/scripts/{id}/runall require the admin role; previously the list endpoint was open to any authenticated user and deploy/run were analyst-accessible. Two reasons: (1) the AST + string-blocklist sandbox in_execute_scriptis defense-in-depth and known to be bypassable through introspection chains (__class__.__base__.__subclasses__(),__globals__['__builtins__'],__mro__traversal — the dunder pattern list was tightened in this PR but the policy is "the role gate is the trust boundary, not the blocklist"); (2) gating only/runleft a planted-script attack open — an analyst could deploy a malicious script and wait for an admin to run it. Operators who need scripted workflows for non-admin users should run them on the user's behalf or expose the relevant data via the read-only/api/datasurface instead. Migration for cron / scheduler PATs: if a non-admin PAT is wired into a scheduler that hits/api/scripts/{id}/runor/api/scripts/run, the request now returns 403. Add the PAT user to the Admin group via/admin/accessorda admin group add-member Admin <pat-user-email>. PATs themselves do not need re-issuing — group membership is read at request time. -
BREAKING (ops): Generic ops scripts moved out of the customer-named
scripts/grpn/directory intoscripts/ops/as part of the OSS vendor-neutralization (issue #88):scripts/grpn/agnes-tls-rotate.sh→scripts/ops/agnes-tls-rotate.shscripts/grpn/agnes-auto-upgrade.sh→scripts/ops/agnes-auto-upgrade.sh
Downstream consumer infra repos that copy these scripts onto VMs (e.g. via their own
startup.sh) must update the source path. The OSS-shippedinfra/modules/customer-instance/Terraform module is unaffected — it embeds equivalent logic inline via heredoc and does not source-by-path fromscripts/. Script behaviour and env vars are unchanged. Cross-refs inREADME.md,CLAUDE.md,docs/DEPLOYMENT.md,Caddyfile, anddocker-compose.ymlwere updated. -
OSS neutralization (wave 2 — code, tests, planning docs). Customer identifiers replaced with placeholders across the codebase to ready the repo for public release (issue #88):
- Code docstrings:
connectors/openmetadata/{client,transformer,enricher}.py,src/catalog_export.py,scripts/duckdb_manager.py—prj-grp-…→my-bq-project/prj-example-1234,AIAgent.FoundryAI→AIAgent.MyAgent(in docstrings) /AIAgent.Example(in test fixtures),FoundryAIDataModel→AnalyticsDataModel. - Test fixtures in
tests/test_openmetadata_enricher.py,tests/test_duckdb_manager.py,tests/test_catalog_export.py,tests/test_openmetadata_transformer.py— same set of replacements, behaviour-preserving (157 tests still green). - Terraform module
infra/modules/customer-instance/variables.tf:customer_namedescription rewritten in English, examples switched fromkeboola, grpntoacme, example. - Workflow
.github/workflows/keboola-deploy.yml: comment "Groupon-side dev VMs" → generic "per-developer dev VMs". - Caddyfile: TLS-rotation cross-ref updated to
scripts/ops/…and Keboola-specific aside removed. - Auth docs
docs/auth-groups.mdand the OAuth probe inscripts/debug/probe_google_groups.py: GCP project namekids-ai-data-analysisreplaced with placeholderacme-internal-prod. - Planning docs under
docs/superpowers/plans/and…/specs/: the five hackathon-era documents (2026-04-21-deployment-log.md,…-multi-customer-deployment.md,…-issues-14-and-10.md,…-hackathon-dry-run.md, the spec) had34.77.94.14/34.77.102.61replaced with<dev-vm-ip>/<prod-vm-ip>,Groupon/GRPN/grpnwithAcme/another-customer, andprj-grp-…withprj-example-….
- Code docstrings:
Fixed
- BREAKING (security CRITICAL): Jira webhook handler is now
fail-closed (issue #83). Previously, if
JIRA_WEBHOOK_SECRETwas unset,_verify_signaturereturnedTrueand any unauthenticated POST to/webhooks/jiracould trigger the full ingest pipeline. The handler now returns 503 when the secret is missing (operator-misconfiguration signal, distinct from 401 wrong-signature). Operators relying on the no-secret = accept-everything mode (don't — it was never documented) must setJIRA_WEBHOOK_SECRETbefore this merges. - Security (CRITICAL): Jira issue keys arriving via webhooks are now
validated against the canonical
^[A-Z][A-Z0-9]{0,31}-[0-9]{1,12}\Zformat ([0-9]not\dto refuse non-ASCII Unicode digits,\Znot$to refuse trailing newlines that$would tolerate) before any filesystem operation (issue #83). Previously,issue_keyflowed unsanitized intoconnectors/jira/service.py(save_issue,download_attachment,_handle_deletion,process_webhook_event) andconnectors/jira/incremental_transform.py, enabling path traversal (../../etc/passwdstyle writes outside the Jira data dir). New moduleconnectors/jira/validation.pyprovidesis_valid_issue_key(regex whitelist; underscore deliberately excluded — Atlassian rejects underscores in real project keys) andsafe_join_under(Path.resolve()containment check). Both are enforced at every filesystem boundary, defense-in-depth. - Security (CRITICAL):
webhookEvent(the second attacker-controlled field in Jira webhook payloads) was used as a filename component in_log_webhook_eventwithout sanitization (issue #83 reviewer follow-up). A payload withwebhookEvent: "../../tmp/pwn"could write a JSON dump outsideWEBHOOK_LOG_DIR. The handler now strips everything that isn't[A-Za-z0-9_-](dot deliberately excluded to defeat..survival), clips length to 64 chars, and routes the final filename throughsafe_join_under. - Security (CRITICAL): hardened the connector → orchestrator trust
boundary on BOTH the rebuild path
(
src/orchestrator.py::_attach_remote_extensions) AND the read-only query path (src/db.py::_reattach_remote_extensions, called byget_analytics_db_readonly()on every request) — issue #81 Group A. Three fixes: (1) DuckDB extensions referenced by_remote_attachare matched against a hard allowlist (default:keboola, bigquery; override viaAGNES_REMOTE_ATTACH_EXTENSIONS). Install path splits built-in (LOAD only) from community (INSTALL FROM community; LOADon rebuild path; LOAD only on the read-only query path which must not touch the network). (2)token_envnames are matched against a hard allowlist (default:KBC_TOKEN,KBC_STORAGE_TOKEN,KEBOOLA_STORAGE_TOKEN,GOOGLE_APPLICATION_CREDENTIALS; override viaAGNES_REMOTE_ATTACH_TOKEN_ENVS). Names must additionally match^[A-Z][A-Z0-9_]{0,63}$. A malicious connector cannot ask the orchestrator to readJWT_SECRET_KEY/SESSION_SECRET/OPENAI_API_KEYand exfiltrate them viaATTACH ... TOKEN. (3) The URL passed toATTACHis now single-quote-escaped on both paths. Also fixed atable_schemavstable_catalogmismatch that silently no-op'd_attach_remote_extensionsfor every connector (the rebuild-path hardening would have been moot in production without this fix). New modulesrc/orchestrator_security.pycentralises the policy and exposeslog_effective_policy(), called from app startup so an operator's typo inAGNES_REMOTE_ATTACH_EXTENSIONS(which replaces the default, not extends it — a setting ofhttpfswould silently lock outkeboola, bigquery) is visible at boot rather than at the next failed attach. Seedocs/superpowers/plans/2026-04-27-issue-81-trust-boundary.md. - Security (MEDIUM): extractor-side identifier validation (issue
#81 Group D / M15). The Keboola and BigQuery extractors interpolate
table_name,bucket/dataset, andsource_tablefromtable_registrydirectly intoCREATE OR REPLACE VIEW,INSERT INTO _meta, andCOPY ... TOSQL. Anyone with write access totable_registry(admin, registry-write API) could inject SQL via these identifiers. New shared modulesrc/identifier_validation.pyexposes a strictvalidate_identifier(for our own view names —^[a-zA-Z_][a-zA-Z0-9_]{0,63}$, used fortable_nameso it matches the orchestrator's rebuild-time check and dashed names fail fast at extraction rather than being silently dropped at rebuild) and a relaxedvalidate_quoted_identifier(for upstream-typed names like Keboolain.c-foo/ BigQuerymy-dataset:[a-zA-Z0-9_][a-zA-Z0-9_.\-]*, refusing any character that could close a"..."identifier literal). The orchestrator's existing_validate_identifierwas lifted into the new module so both layers share a single source of truth; both extractors skip-and-continue on unsafe rows (logged + counted in failure stats; the rest of the registry still processes).
Removed
- Customer-specific manual-deploy helper
scripts/grpn/Makefileand its README, plus the corresponding hackathon deploy log underdocs/superpowers/plans/2026-04-22-grpn-deploy-learnings.md. These documented one operator's hand-rolled stopgap for an org-policy-blocked Terraform flow and do not belong in vendor-neutral OSS. scripts/switch-dev-vm.sh— hackathon-era helper hardcoded to a specific shared dev VM. Per-developer dev VMs are the supported pattern now; operators who need an equivalent should usegcloud compute ssh <vm> --command "sed -i …/.env && sudo /usr/local/bin/agnes-auto-upgrade.sh"with their own VM details.
Internal
- Sandbox blocklist now flags introspection-chain dunders explicitly:
__subclasses__,__globals__,__class__,__base__,__bases__,__mro__,__dict__,__code__,__builtins__.__init__and__getattribute__are intentionally not in the list — substring match would flag every legitimatedef __init__(self):. The chain breaks at the next link anyway. - New regression test
test_run_pwn_payload_blockedparametrized over the exact PoC from issue #44 plus two equivalent variants (lambda+__globals__,__mro__traversal). If the dunder list is silently weakened in a future refactor, the test fails. Newtest_*_requires_admintests parametrized over all three non-admin core roles (analyst, viewer, km_admin). tests/conftest.py::seeded_appextended withviewer_tokenandkm_admin_tokenso role-gating tests cover all four core roles.
Migrated
- Schema bumped from v9 to v10. Auto-migration applies on next start
(creates the
view_ownershiptable; data on disk is unaffected). The pre-migration snapshot machinery (added at v8→v9) covers v9→v10 too — if anything goes wrong during the migration, the snapshot at<DATA_DIR>/state/system.duckdb.pre-migratelets you roll back.
0.11.5 — 2026-04-27
Follow-up release for PR #73: addresses four rounds of Devin AI review on the role-management-complete branch. No new public-API surface; the user-visible payoff is that v8→v9-migrated installations now work end-to-end (login flows, user list, admin nav, privilege revocation), and make local-dev startup is finally quiet.
Fixed
- Privilege retention after grant revocation via the new REST API (Devin review #73).
_hydrate_legacy_rolepreviously short-circuited on a truthyuser.get("role"). The role-management endpoints (POST/DELETE /api/admin/users/{id}/role-grants, plus thechangeCoreRoleUI flow) only mutateuser_role_grants— they don't touch the legacyusers.rolecolumn. After a downgrade-via-API, the stale legacy value would keepuser["role"] = "admin"in memory;_is_admin_user_dictand the catalog/sync admin-bypass short-circuits then silently retained elevated table access even thoughrequire_internal_rolecorrectly denied the API gates. Fix: always re-resolve fromuser_role_grantsregardless of the legacy column, making the grants table the single source of truth on every authenticated request. Cost: one DB round-trip per request (same as the existing PAT-aware fallback). - Dev-bypass + OAuth callback dropped direct grants from the session cache (Devin review #73). Both call sites passed
external_groupsonly toresolve_internal_roles, never the user's id — souser_role_grantsrows were resolved on the per-request DB-fallback path insiderequire_internal_roleinstead of the cache. Functionally correct, but every admin-gated request paid a DB round-trip and the dev-bypass log line read "resolved 0 internal role(s)" for an obviously-admin user, which was confusing during debugging. Fix: passuser_idso the cache reflects the union at sign-in. GET /api/usersreturned HTTP 500 for any v8→v9-migrated installation. The migration NULL-s legacyusers.role(kept as a deprecated artifact because DuckDB FK blocks DROP COLUMN), butUserResponse.roleis a requiredstrPydantic field — every user listing failed validation./admin/usersshowed only "Failed to load users" and the new/admin/users/{id}Detail link was unreachable. Fix: route every user dict returned by the API through_hydrate_legacy_role(same shim already used byget_current_user), which derives the legacy enum value fromuser_role_grantsfor migrated users. Also fixes a quieter dual of the same bug —target["role"] == "admin"short-circuits inupdate_user/delete_userwould silently no-op on migrated admins, letting the operator demote/delete the last admin against the documented protection.- Scheduler log-noise: every cron tick produced a
POST /auth/token 401 Unauthorizedaccess-log line because the scheduler's auto-fetch fallback was always broken — it called/auth/tokenwith just an email, but the endpoint requires email + password. Fix: removed the auto-fetch path entirely. Operators setSCHEDULER_API_TOKEN(a long-lived PAT) in production; inLOCAL_DEV_MODEthe dev-bypass auto-authenticates the un-tokenized request, so jobs continue to work. - HTTP 500 on
POST /auth/tokenfor v8-migrated users (Devin review #73 round 3).TokenResponse.roleis a requiredstrPydantic field, but the v8→v9 migration NULL-s the legacyusers.rolecolumn for every existing user. The login endpoint passed the raw NULL through to Pydantic, raisingValidationError→ 500. Same root cause produced semantically wrong (but non-crashing) JWTs from Google OAuth, password, and email-magic-link flows — they wroterole: nullinto the issued token; downstream_hydrate_legacy_roleinget_current_userwould correct the per-request view, but the token payload itself stayed misleading. Fix: hydrate inline in each login flow before readinguser["role"]—app/auth/router.py(POST /auth/token),app/auth/providers/google.py(OAuth callback),app/auth/providers/password.py(5 flows: JSON login, web login, JSON setup, web reset, web setup), andapp/auth/providers/email.py(centralized in_consume_token, covers both magic-link/verifyendpoints). New regression classTestAuthLoginFlowsPostMigrationintests/test_schema_v9_migration.pypins both the no-crash and the correct-role contracts for all four legacy levels (viewer/analyst/km_admin/admin). docs/RBAC.mddocumented animplies=[…]keyword onregister_internal_role()that the function doesn't accept (Devin review #73 round 3). A module author copying the example would hitTypeError: got an unexpected keyword argument 'implies'at import time. Reality:impliesis currently seeded only for thecore.*hierarchy via_seed_core_rolesinsrc/db.py— the registry-side write path doesn't exist yet. Rewrote the Implies hierarchy and Module-author workflow sections to document what's actually supported in 0.11.4 and what a future change would need to add._seed_core_roleswas advertised as a per-connect safety net but only ran during fresh installs and the v8→v9 migration (Devin review #73 round 4). The docstring promised "called from_ensure_schemaon every connect" so an accidentalDELETE FROM internal_roles WHERE key = 'core.admin'(or a doc-tweak release that updated_CORE_ROLES_SEEDwithout bumping the schema version) would self-heal on the next process start. In reality both call sites lived insideif current < SCHEMA_VERSION:— once the DB was on v9, the seed function never ran again, leaving any deletion permanent and any in-codedisplay_name/description/implieschange requiring a manual SQL deploy. Fix: added an unconditional tail call to_seed_core_roles(conn)at the bottom of_ensure_schema, gated only bycurrent <= SCHEMA_VERSIONso the future-version-rollback contract still holds. New regression classTestSeedCoreRolesSafetyNetintests/test_schema_v9_migration.pypins all three contracts (deleted row re-seeds, mutateddisplay_namere-syncs from code,applied_atdoesn't churn on already-current DBs).make local-devstartup spammed anAuthlibDeprecationWarningfrom upstream's own_joserfc_helpers.pyevery timeapp/auth/providers/google.pytriggered thefrom authlib.integrations.starlette_client import OAuthimport chain. The warning is upstream-internal — authlib telling itself to migrate fromauthlib.josetojoserfcbefore its 2.0 cut — and isn't actionable on our side until either authlib ships the fix or we rewrite OAuth on top ofjoserfcdirectly. Filtered the specific warning class at the top ofapp/main.py(with a message-based fallback if the class moves in a future authlib release) so the warning no longer pollutes operator-facing stdout. OtherDeprecationWarnings remain visible.
Added
/profilenow self-services every user's role situation. Three new sections rendered server-side for all signed-in users (not just admins): Effective roles (the full resolver output as chip cloud — direct grants ∪ group-derived ∪ implies-expanded), Direct grants (rows inuser_role_grantswith source label:auto-seedfrom v8 backfill vs.directadmin grant), and Roles via groups (which Cloud Identity / dev group grants which role for the current user). Non-admins finally see why a particular feature is or isn't accessible without asking an admin to read the DB. Admins additionally see a deep-link to/admin/users/{id}for editing their own grants in place./admin/role-mappinggroup ID picker. A new "Known groups" panel above the create-mapping form surfaces clickable chips of group IDs known to the system: the calling admin's ownsession.google_groups(with human-readable names + a "your group" tag) merged with distinctexternal_group_ids already used in existing mappings (tagged "already mapped"). Click a chip → fills the form's external-group-id input and focuses the role select. Empty-state copy points the operator atLOCAL_DEV_GROUPS/ Google sign-in when the picker is empty, instead of leaving them to guess Cloud Identity opaque IDs from memory.
Changed
- Renamed
docs/internal-roles.md→docs/RBAC.md. Standard industry term, more discoverable for engineers grepping for "RBAC" in a new repo. Added Quickstart-by-role sections (operator / end-user / module author) and a step-by-step Module-author workflow with code examples for registering a key, gating endpoints, declaring implies hierarchies, and writing a contract test against the gate. Cross-references in code (app/api/admin.py,tests/test_role_resolver.py) updated.CLAUDE.mdnow points contributors at the new doc from the Extensibility → RBAC section. Historical CHANGELOG entries ([0.11.3]/[0.11.4]body) keep the originalinternal-roles.mdfilename — they describe what shipped at that version and aren't retro-edited.
0.11.4 — 2026-04-27
Role-management complete release. Sjednocuje legacy users.role enum (viewer/analyst/km_admin/admin) with the v8 internal-roles foundation under one model with implies hierarchy, ships admin UI + REST API + CLI for managing both group mappings and direct user grants, and wires require_internal_role for PAT-aware resolution so admin endpoints work uniformly across OAuth and headless callers.
Added
- Schema v9 — unified role model. New
user_role_grants(user_id, internal_role_id, granted_by, source)table for direct user→role assignments (complementary togroup_mappingswhich assigns via Cloud Identity group). Two new columns oninternal_roles:implies(JSON array of role keys this role transitively grants) andis_core(BOOL, distinguishes seeded core.* hierarchy from module-registered roles). Migration v8→v9 seeds fourcore.*rows (core.viewer/analyst/km_admin/admin) with the legacy hierarchy asimplies(core.admin → core.km_admin → core.analyst → core.viewer), backfills oneuser_role_grantsrow per existing user mirroring their pre-v9users.rolevalue (source='auto-seed'), and NULLs the legacy column. - PAT-aware
require_internal_role. Two-path resolution: session cache first (OAuth flow), DB-backeduser_role_grantsfallback (PAT/headless flow). Admin CLI scripts now hit gated endpoints uniformly without an OAuth round-trip. The PAT-specific 403 message from 0.11.3 is removed — PAT now legitimately resolves through direct grants. - Implies expansion at resolve time. New
expand_implies(role_keys, conn)helper inapp.auth.role_resolverdoes BFS over theimpliesgraph;resolve_internal_rolescalls it at the end so a singlecore.admingrant expands to the full four-level hierarchy automatically. - Dotted role-key namespace. Regex extended to allow
core.admin,context_engineering.admin,corporate_memory.curatorstyle keys (max 64 chars, lower-snake-case segments separated by dots). The owner_module column should match the prefix before the first dot. - REST API for role management. New router
app/api/role_management.pyunder/api/admin:GET/POST/DELETEongroup-mappings,users/{id}/role-grants, plusGET internal-rolesandGET users/{id}/effective-roles(debug). All gated byrequire_internal_role("core.admin")— works for both OAuth admins (cookie) and admin PATs. - Admin UI
/admin/role-mapping. Browse internal roles, manage Cloud Identity group → role mappings (table view + create/delete forms). User detail page extended with three sections: Core role (single-select forcore.*), Additional capabilities (multi-checkbox for module roles), Effective roles (debug view of direct + group-derived + expanded set). da adminCLI subcommands.role list,role show <key>,mapping list/create/delete,grant-role <email> <key>,revoke-role <email> <key>,effective-roles <email>. All run over PAT — use them in CI scripts to grant/revoke roles without going through the browser.
Changed
- BREAKING (semantics, not API).
users.rolecolumn NULL-ed during v8→v9 migration. Reads viaUserRepository.get_by_*still return the column but the value is always NULL after upgrade — code readinguser["role"]directly in business logic getsNone. The legacyRoleenum (Role.VIEWER/ANALYST/KM_ADMIN/ADMIN) and convenience helpers (is_admin,has_role, etc. insrc/rbac.py) continue to work — they now read fromuser_role_grantsvia the resolver. Sweepinguser.get("role") == "admin"checks were rewritten to the new helper. The column itself is preserved physically because DuckDB rejects DROP COLUMN while a FK references the table; physical drop is deferred to a future schema-rebuild migration. require_role(Role.X)andrequire_adminare now thin wrappers overrequire_internal_role(f"core.{role}"). Behavior identical for OAuth users (admin role from group_mappings); PAT users now succeed when they hold a directcore.admingrant.UserRepository.create()andupdate()mirror role changes intouser_role_grantsautomatically (_grant_core_rolehelper); existing setup code keeps working without changes.UserRepository.delete()pre-deletesuser_role_grantsrows (DuckDB FK doesn't auto-cascade).UserRepository.count_admins()readsuser_role_grants ⨝ internal_roles WHERE key='core.admin'— the legacyusers.role = 'admin'count would always return 0 after backfill.app/api/admin.pymodule-level docstring documents the v9 pattern for module authors who want to add their own capability gates.docs/internal-roles.mdrewritten to remove the v8 "no UI yet" caveat, document the implies hierarchy, the dual session/DB resolution pathway, and the dotted-namespace key convention.
Removed
require_internal_role's session-only enforcement (the v8 "This endpoint needs an interactive (OAuth) session — Bearer/PAT tokens do not carry session-resolved roles" error message). PAT clients with a matchinguser_role_grantsrow now pass the gate uniformly.
Internal
- New
UserRoleGrantsRepositoryinsrc/repositories/user_role_grants.pymirrors the style ofGroupMappingsRepository(list/get/create/delete + per-user / per-role indices). - INFO-level audit log on grant + mapping mutations (action strings:
role_mapping.created/deleted,role_grant.created/deleted, resourcemapping:<id>/grant:<id>). - "Last admin protection" on
DELETE /api/admin/users/{id}/role-grants/{grant_id}: refuses to delete the finalcore.admingrant in the system (mirrors existingcount_adminsprotection on user deletion / deactivation).
0.11.3 — 2026-04-26
Authorization-foundation release — adds the internal-roles layer between Cloud Identity groups and per-module capability checks. Schema v8 migration; no admin UI yet (follow-up).
Added
- Internal roles + group mapping (foundation). Schema v8 adds two tables:
internal_roles(app-defined capabilities likecontext_admin,agent_operator, registered by Agnes modules at import time) andgroup_mappings(many-to-many bindings of Cloud Identity group IDs to internal role keys, managed by admins). Newapp.auth.role_resolvermodule exposesregister_internal_role(...)for module authors,sync_registered_roles_to_db(...)(run once at startup, idempotent),resolve_internal_roles(external_groups, conn)(called at sign-in, writes resolved keys intosession["internal_roles"]), and arequire_internal_role("…")FastAPI dependency factory for permission checks. Resolution runs at sign-in (Google OAuth callback + dev-bypass — populates on first request and whenever external groups change, mirroring the OAuth callback's always-write semantics). No DB hit per request. Refresh requires re-login, same semantics assession.google_groups. No admin UI yet — mapping rows must be created via the repository directly until the management UI ships in a follow-up. PAT/headless clients carry no session and therefore cannot passrequire_internal_rolegates by design —require_internal_roledistinguishes "signed-in but missing role" from "no session at all" and surfaces a PAT-specific 403 detail in the second case so an API consumer hitting the wall sees what to fix. Seedocs/internal-roles.md→ PAT and headless requests.
Changed
docs/internal-roles.mddocumentsAdmin → Users → deactivate then reactivateas the supported "force re-resolve now" lever for users you can't get to log out (long-lived sessions, automated clients) — invalidates the existing session and forces a fresh sign-in on the next request.
Internal
- INFO-level audit log on every successful resolve (OAuth callback + dev-bypass) so a "wrong role" complaint is debuggable from the log alone — admin can correlate "user X claims they lost access" with the resolver output without replaying the request.
- Startup warning when
SESSION_SECRETis shorter than 32 chars, matching the existingJWT_SECRET_KEYgate. Both HMAC surfaces sign trust-laden state (session.internal_roles,session.google_groups, JWTs) — keeping the two gates consistent so a weak secret gets surfaced at boot, not after a quiet downgrade. _clear_registry_for_tests()now refuses to run unlessTESTING=1so a stray import path in production can't drop the registered capabilities.
0.11.2 — 2026-04-26
Dev-experience patch release — make LOCAL_DEV_MODE realistic enough to actually exercise group-aware code paths on localhost, and consolidate scattered dev-onboarding instructions into a single docs/local-development.md.
Added
LOCAL_DEV_GROUPSenv var mockssession.google_groupsfor the auto-logged-in dev user whenLOCAL_DEV_MODE=1. JSON array matching the production shape ([{"id":"…","name":"…"}]) so group-aware UI and access-control code paths can be exercised onlocalhostwithout a Google OAuth round-trip. Honored only underLOCAL_DEV_MODE=1. The startup banner reports the parsed group IDs (or warns loudly when the value is set but malformed), so a typo gets surfaced at boot rather than silently on the first authenticated request. Session injection mirrors the production OAuth callback's "always-write" semantics — including clearing stale groups when the operator unsetsLOCAL_DEV_GROUPSmid-session. Seedocs/auth-groups.md→ Local-dev mock.make local-devnow seeds two default mocked groups (Local Dev Engineers+Local Dev Adminsonexample.com) viascripts/run-local-dev.sh, so first-boot/profileis non-empty out of the box. Override withLOCAL_DEV_GROUPS='[…]' make local-dev; disable withLOCAL_DEV_GROUPS= make local-dev.docs/local-development.md— single onboarding doc for working on Agnes locally: TL;DR, whatLOCAL_DEV_MODEactually bypasses, group mocking, what isn't mocked, and the security-rails reminder that dev mode must never reach a production deploy.
Internal
- Fix nightly
docker-e2eCI failures: refresh two stale assertions that had drifted from the live API.tests/test_docker_full.py::test_app_returns_html_on_rootnow expects the auth-aware302 → /login(root has redirected since the auth middleware landed);tests/test_e2e_docker.py::TestDockerHealth::test_health_has_duckdbnow readsservices["duckdb_state"](current health-payload shape, already validated bytests/test_api.py). No application behavior change — these only ran in the scheduled nightly job, so the drift went unnoticed for several PRs.
0.11.1 — 2026-04-26
Patch release — hotfix the missed Caddy env passthrough that should have shipped with 0.11.0, plus codify changelog discipline so this kind of drift gets caught at PR review time next time.
Fixed
docker-compose.ymlcaddy service now passesCADDY_TLSthrough to the container (- CADDY_TLSbare-form passthrough). Without it theCaddyfile{$CADDY_TLS:default}substitution always falls back to cert-file mode regardless of what the operator wrote into.env, and Caddy crash-loops on Let's Encrypt / internal-CA deployments. Should have shipped with #52; first attempt was #55, accidentally closed before merging.
Internal
CLAUDE.md— non-negotiable changelog discipline: every PR touching user-visible behavior must updateCHANGELOG.mdunder## [Unreleased]in the same PR.
0.11.0 — 2026-04-26
First tagged semver release. The version = "2.x" strings that appeared in earlier pyproject.toml snapshots were arbitrary placeholders from the initial scaffold and never reflected actual API maturity — resetting to pre-1.0 to signal that things may still shift.
Added — Auth
- Google Workspace groups on
/profile. OAuth callback fetches the signed-in user's group memberships via Cloud Identity (searchTransitiveGroupswith thesecuritylabel — seedocs/auth-groups.mdfor the GCP setup checklist and thesecurity-vs-discussion_forumgotcha). Profile link added to the user dropdown. - Password reset + invite flows for web and admin (
/auth/password/reset,/admin/users/invite). - Personal access tokens (PAT) with separate
:typ=patJWT claim, per-token revoke, last-used IP tracking, "My tokens" + admin "All tokens" UI. - Email magic-link provider (itsdangerous-signed token).
- Optional
SEED_ADMIN_PASSWORDto pre-hash the seed admin (dev convenience).
Added — Deploy
keboola-deploy.ymlworkflow. Tag-triggered alternative torelease.ymlfor shared dev VMs that want explicit "deploy when I tag" semantics. Publishes immutable:keboola-deploy-<tag>+ floating:keboola-deploy-latestalias.- Caddy + Let's Encrypt + corporate-CA TLS.
Caddyfileparametrized via$CADDY_TLSenv var so a single file serves three regimes: cert-file (corp PKI), Let's Encrypt auto-issue, Caddy-internal-CA. URL-driven cert rotation with self-signed fallback (scripts/grpn/agnes-tls-rotate.sh).docker-compose.tls.ymloverlay closes host:8000when Caddy fronts. dev_instancesschema incustomer-instanceTerraform module gains optionaltls_mode+domain(mirrorsprod_instance).infra-v1.6.0tag.- Optional Google OAuth credentials from Secret Manager. Module reads
google-oauth-client-{id,secret}at boot if present; graceful fallback so non-Google deployments aren't affected. LOCAL_DEV_MODE+make local-dev-up/local-dev-downfor one-keystroke local stack with magic-link auth pre-wired.- Per-developer
dev-<prefix>-latestGHCR alias for branches matching<prefix>/<branch>— push-to-deploy on personal dev VMs. /setupweb wizard for first-time instance setup, plus headlessPOST /api/admin/configureandPOST /api/admin/discover-and-register.- Smoke-test job in CI (Docker-in-CI after every release) +
scripts/smoke-test.shfor post-deploy verification.
Added — CLI
- Wheel distribution + auto-update check on startup.
--versionflag,--dry-run+X/Nprogress onda sync, durable sync (atomic writes + manifest hash + retry on transient errors).- gzip on JSON/HTML responses (server-side).
Added — Data
- Remote query engine. Two-phase BigQuery + DuckDB engine for tables too large to sync locally (
--register-bqflag). - Business metrics. Standardized
metric_definitionstable in DuckDB with starter pack importer (da metrics import). /api/healthreturnsversion,channel,commit_sha,image_tag,schema_version.- Custom connector mount support (
connectors/custom/). - OpenAPI snapshot test for breaking-change detection.
Added — Docs / tooling
docs/auth-groups.md,docs/DEPLOYMENT.md,docs/HACKATHON.md,docs/ONBOARDING.mdrunbooks.scripts/debug/probe_google_groups.py— stdlib-only probe for diagnosing Cloud Identity API issues without a deploy cycle.- Schema migration safety tests (idempotency, data preservation, snapshot).
- Pre-migration snapshot of
system.duckdbbefore schema upgrades. - Auto-generated JWT and session secrets with file persistence (
/data/state/.jwt_secret). - Startup banner logging version, channel, and schema version.
Changed
- BREAKING (deployment) — Caddy compose profile renamed
production→tls. Existingdocker compose --profile production up -dinvocations need to switch. - BREAKING (deployment) — Default
Caddyfilemode is now cert-file (tls /certs/fullchain.pem /certs/privkey.pem); for the previous Let's Encrypt auto-issue behaviour setCADDY_TLS=tls <ops-email>in.env. Seedocs/auth-groups.mdandCaddyfileinline docs. - Schema migration v5→v6→v7: adds
users.active,personal_access_tokenstable,personal_access_tokens.last_used_ip. Auto-applied at boot. - Image-level
AGNES_VERSIONnow sourced frompyproject.tomlat build time (no more drift betweenda --versionand the package metadata). - Vendor-agnostic OSS rule codified in
CLAUDE.md— customer-specific names, hostnames, project IDs belong in consumer infra repos, not in this OSS distribution.
Fixed — Security
- Open-redirect guard for backslash in
safe_next_path. SessionMiddleware max_age=3600 + https_only(was browser-session forever, plain-HTTP-OK).- Timezone-aware datetimes in Keboola metadata cache.
- Atomic magic-link token consumption (closes double-use race under concurrent clicks).
- Bootstrap backdoor closed when passwordless seed admin exists.
- urllib3 1.26→2.6.3 (resolves 4 Dependabot security alerts).
- argon2-cffi adopted for password hashing.
- See docs/security-audit-2026-04.md for the full audit (renamed from
docs/padak-security.mdin #94).
Fixed — Other
uvicorn --proxy-headers --forwarded-allow-ips='*'so OAuth callbacks resolve to https when behind a TLS terminator.scripts/grpn/agnes-tls-rotate.shhardened:--max-redirs 0+--proto '=https'on cert fetch, post-fetch PEM validation (rejects HTML error pages from corp portals),ulimit -c 0to suppress coredumps that could leak the unencrypted privkey, POSIX-safe${arr[@]+"${arr[@]}"}array expansion.scripts/tls-fetch.sh— generic URL fetcher (sm://,gs://,https://,file://) with redirect refusal + PEM validation.kbcstoragemoved to optional dep — unblocks urllib3 security updates; primary Keboola path now uses the DuckDB Keboola extension.- Dependencies consolidated into
pyproject.toml(no morerequirements.txt).
Internal
- Test suite expanded to 1357+ tests (4 layers — unit, integration, web smoke, journey).