* feat(telemetry): marketplace item rollup refactor (schema v46)
Replace the v42 attribution layer with prefix-split + live lookup against
marketplace_plugins / store_entities. The v42 design had a latent bug —
AttributionLookup keyed on bare skill names while Claude Code writes
`<plugin>:<local>` in JSONL, so lookups never matched and
usage_plugin_daily stayed empty in every deployment.
Schema (v46 migration):
- Drop usage_attribution_skills / _agents / _commands (mapping tables,
derivable from marketplace_plugins + plugin tree).
- Drop usage_plugin_daily (always empty in production due to the bug above).
- Create usage_marketplace_item_daily — per-day fact (count, distinct_users,
error_count), composite PK on (day, source, type, parent_plugin, name).
- Create usage_marketplace_item_window — sliding-window snapshot with
true cross-window distinct user counts; period_label='last_7d' refreshes
every tick, 'last_30d' refreshes hourly (tracked via session_processor_state).
- Mark usage_tool_daily as candidate for removal (no product-UI consumer).
Attribution flow:
- MarketplaceItemLookup replaces AttributionLookup. Preloads
marketplace_plugins.name + store_entities.name into memory once per
UsageProcessor tick, then per-event splits identifier on ':',
matches prefix, writes resolved source / parent_plugin into
usage_events. agnes-store-bundle prefix routes to flea entities.
Slash commands with `plugin:` prefix count as type='skill' in rollup.
API:
- BREAKING: MarketplaceItem.unique_users_30d renamed to distinct_users_30d
(now a true distinct count from the window snapshot, not sum-of-daily).
- InnerDetailResponse gains a telemetry field — invocations_30d +
distinct_users_30d surfaced on curated inner skill / agent detail pages.
- Card chip hidden pending UX finalisation; data stays in the response.
Backfill: scripts/backfill_marketplace_rollup.py — one-shot rebuild over
historic usage_events after deploy, idempotent.
USAGE_PROCESSOR_VERSION bumped 4 → 5 so the reprocess loop re-attributes
existing events to the new source/ref_id semantics on the next tick.
Tests rewritten: test_session_processor_usage, test_usage_rollups,
test_marketplace_telemetry, test_api_admin_usage_reprocess,
test_db_schema_version, test_home_stats, test_schema_v42_migration.
New: test_backfill_marketplace_rollup.
* fix(marketplace): refresh Most Popular on search + category changes
`loadMostPopular()` early-exits when `state.q` or `state.category` is
set, but the search + category handlers only called `loadItems()` —
so once the section was visible, typing a query or filtering by
category didn't re-run the hide check and the cards stayed on screen
out of scope. Tab + sort handlers already chained the call.
Add the call to runSearch + category pill click handlers (All +
per-category) so the visibility contract holds for every state
mutation that can flip the early-exit condition.
* feat(marketplace): All-plugins section + 7-day Most Popular
Listing layout:
- Always-visible "All plugins" / "All items" / "Your stack" section
header (label swaps per tab) wrapped in `#mp-all-section` so its
margin-collapse mirrors the sibling `#mp-popular-section` and the
spacing from the filter row stays consistent in both layouts.
- Sort dropdown moved from the filter row into the All-* header,
pinned right via `margin-left: auto`. Anchored to its section so
the relationship between sort + grid is obvious.
- `.mp-section-header` gets `min-height: 32px` + `align-items: center`
so the bare-text Most Popular row matches the dropdown-bearing
All-* row.
- `.mp-section-header` margin tightened 24px → 20px on top.
Most Popular:
- Capacity reduced 8 → 4 cards.
- Now reflects a 7-day window (was 30-day). Backend surfaces
`invocations_7d` + `distinct_users_7d` on `MarketplaceItem`
alongside the existing 30d fields; the loader pulls a wider page
(server still sorts by 30d) and re-sorts + filters client-side
on `invocations_7d > 0` so the strip stays "hot right now".
- Section label updated to "Last 7 days".
- Section now renders on both `curated` and `flea` tabs (was
curated-only). Hidden on `my` and whenever search / category
filter is active. Refresh hooks wired into search + category
click handlers so visibility flips immediately on state change.
Backend (`_load_invocation_stats`):
- Single SELECT pulls both `last_30d` and `last_7d` rows from
`usage_marketplace_item_window`; the result dict carries
invocations + distinct_users for both windows.
- Trend (recent_7 vs prior_7) kept on the daily fact table so it
stays independent of the window snapshot's freshness.
* feat(marketplace): Most adopted sort + hide Trending when no trend data
Add a fourth sort option to the All-items dropdown — "Most adopted
(30d)", keyed on `MarketplaceItem.distinct_users_30d` (true 30d
distinct user count from `usage_marketplace_item_window`). Protects
the listing from power-user skew that `most_used` is susceptible to:
one user × 100 invokes can't beat 10 different users × 1 invoke
under adoption sort.
Hide Trending option when the response has no trend data. User
reported `sort=trending` returning an empty grid because every
plugin's `trend_pct` was None (prior-week threshold of >= 3
invocations didn't clear anywhere). Empty grids on a user-selected
sort are worse UX than just not offering the sort — surface what
works, hide what doesn't.
Backend (`app/api/marketplace.py`):
- `_apply_sort` gains a `most_adopted` branch (DESC distinct_users_30d,
ties by name ASC).
- `sort` Literal extended.
- `ItemListResponse.available_sorts` lists the sort keys the UI
should expose for this response. recent/most_used/most_adopted
always; trending only when at least one item in the tab's stats
carries a non-null trend_pct.
- `_available_sorts(stats_dicts)` helper centralises the rule —
curated and flea branches pass one stats dict, my-tab passes both
(option is available when either source has trend data).
Frontend (`app/web/templates/marketplace.html`):
- New `<option value="most_adopted">Most adopted (30d)</option>`
between Most used and Trending.
- URL state allowlist extended so `?sort=most_adopted` round-trips.
- `applyAvailableSorts(available)` runs after each list fetch:
hides options not in the response's available_sorts; if the user
is on a now-unavailable sort, resets to 'recent' and re-fetches.
Search-mode fan-out unions availability across the curated + flea
responses so a hit on either side keeps the option visible.
* feat(marketplace): funnel chip on cards + deterministic Most Popular sort
Card chip — funnel telemetry between description and footer:
[stack-icon] N installed · [user-icon] N active · [bolt-icon] N calls · ↑/↓ N%
- stack_count (new MarketplaceItem field): for curated it's COUNT(*)
on user_plugin_optouts (post-v28 row PRESENCE = subscribed; system
plugins are fanned out to every user via fanout_system_for_user so
the count includes them naturally). For flea it reuses the existing
store_entities.install_count (bumped on install/uninstall).
- distinct_users_30d (existing) — active users in the 30d window.
- invocations_30d (existing) — call volume.
- trend_pct (existing) — week-over-week, both directions: green ↑ /
red ↓, magnitude only (sign in the arrow). Hidden when null.
Backend additions in app/api/marketplace.py:
- MarketplaceItem.stack_count field.
- _load_curated_stack_counts() — one SELECT per render, GROUP BY
(marketplace_id, plugin_name). Wired into the curated + my-tab
branches; flea reads install_count off the entity row directly.
Frontend (app/web/templates/marketplace.html):
- Heroicons solid 24×24 inlined (one helper per icon, all
fill="currentColor" so per-segment colour tokens apply): rectangle-
stack (mirrors the My Stack tab icon), user, bolt, arrow-trending-
up/down.
- Per-segment colour: installed=amber #F59F0A (My Stack accent),
active=green #0e9b6a, calls=orange #f97316. Text stays neutral so
the chip still reads as metadata, the leading glyph carries the
visual cue. Trend pill keeps the full-segment green/red colour.
- Zero state: chip hidden when stack_count == 0 AND invocations_30d
== 0 — brand-new cards aren't visually penalised by a "0·0·0" row.
- Tooltips on every segment via title="…" so hover explains the
number's meaning to anyone uncertain about the icon.
Most Popular section — deterministic ordering:
Previously sorted by invocations_7d DESC with no tie-breakers, so
several cards with identical 7d call counts would swap places on
refresh (JS stable sort fell back on backend order, and the backend's
own tie-breaker for `most_used` was just name ASC — six `grpn`
plugins from six test marketplaces collapse to the same name and
became indeterminate via list_with_filters' created_at order).
New cascading hierarchy (chosen primary now matches what "most
popular" really means — wide adoption, not power-user volume):
1. distinct_users_7d DESC ← adoption / social proof
2. invocations_7d DESC ← volume at equal adoption
3. distinct_users_30d DESC ← broader adoption fallback
4. invocations_30d DESC ← broader volume fallback
5. name ASC ← deterministic textual order
6. marketplace_slug ASC ← splits duplicate plugin names across
marketplaces
Six levels guarantee any two items end at a different sort key, so
the strip is stable across refreshes.
* fix(marketplace): unify Most Popular on 30d + right-align installed chip
Most Popular section was sorting on the 7d window while its cards
rendered 30d numbers — header label promised one thing, cards showed
another. Unified everything on 30d so a card means the same data
everywhere on the page.
- Dropped the "Last 7 days" meta from the Most Popular header.
- Sort cascade now starts on distinct_users_30d, then invocations_30d,
with 7d adoption/volume as recency-aware fallbacks before the name +
marketplace_slug deterministic tail. Six levels guarantee identical
sort keys never produce indeterminate order across refreshes.
- Filter switched from invocations_7d > 0 to invocations_30d > 0 to
match the new horizon.
- Most Popular now only renders on page 1 of the listing. Past initial
discovery, a top-of-list popularity strip on page 2+ would shadow the
results the user paged into. Pager click handler refreshes the
section so navigating back to page 1 re-mounts it.
Chip layout — split engagement vs adoption visually:
[user] N active · [bolt] N calls · [↑/↓] N% [stack] N installed
└────────── LEFT (time-bounded engagement) ────┘ └── RIGHT (all-time) ──┘
- Installed (stack_count) is all-time, decremented on uninstall. Alone
it says little ("12 people installed it") without the engagement
context next to it ("…but did anyone actually use it?"). Visually
separating the two groups makes that distinction obvious — left
group answers "is it used", right answers "does anyone have it".
- Implemented via flex with margin-left:auto on .seg-installed so
installed drifts to the trailing edge.
- Installed tooltip now reads "Currently installed by N users" — the
count is a real-time net (uninstall drops it), and saying "currently"
makes that explicit. Helps when a card shows 0: signals "nobody has
this in their stack right now", not "data missing".
* feat(plugin-detail): telemetry chip in hero, derived rows in sidebar
Surface the same telemetry funnel the listing card carries on the
curated plugin detail page, so clicking through from /marketplace
keeps a single mental model — figures match, semantics match. The
detail sidebar drops the two raw numbers that used to live there
(Invocations 30d / Users 30d — duplicated by the chip now) and
replaces them with two *derived* signals only the daily series can
provide: Active days + Last used.
Backend (app/api/marketplace.py):
- PluginDetailResponse.stack_count — curated reads via
_load_curated_stack_counts(), flea reuses install_count. Frontend
treats both sources uniformly.
- _build_telemetry() always returns a dict (never None). Frontend
decides chip visibility from stack_count + invocations_30d the
same way the listing card does. daily_series is always 30 entries
(zero-padded) so "Active days" and "Last used" derivations on the
sidebar are trivial array filters.
Frontend (app/web/templates/marketplace_plugin_detail.html):
- New .hero-telemetry slot at the bottom of the hero meta column,
between the pills row and the action buttons. Renders the four
funnel segments — active · calls · trend · installed — joined by
` · `. No left/right split: the hero has space, so a single
coherent metadata strip reads cleaner than the card's split layout.
- Heroicons solid inlined (user / bolt / arrow-trending-up,-down /
rectangle-stack) recoloured against the dark hero — icons in
lighter tokens (mint #6ee7b7, peach #fdba74, cream #fde68a), trend
pill keeps the saturated green/red because direction-coding earns
its own colour.
- Tooltip on installed reads "Currently installed by N users" — the
count is a real-time net (drops on uninstall), and "currently"
makes that explicit when a card shows 0.
- fmtNum helper added so 1.2k / 14M renderings match the card's
format exactly.
- Sidebar swap: Invocations + Users rows removed, replaced by
Active days → "N of 30"
Last used → fmtRelative of the latest non-zero day
Both derived from telemetry.daily_series — engagement consistency
+ recency, neither of which the hero chip exposes on its own.
* feat(item-detail): telemetry chip in hero for curated skill/agent
Bring the funnel chip the plugin detail page got in 4cf38d40 to the
curated inner skill/agent detail page — clicking through from the
listing card now keeps the same metadata strip from grid to plugin
page to inner item page.
Backend (app/api/marketplace.py):
- _load_inner_item_stats() rewritten:
* always returns a dict (never None) so the frontend can decide
chip visibility client-side, same contract as _build_telemetry
* adds trend_pct, computed the same way as plugin level
(recent_7 vs prior_7 from usage_marketplace_item_daily, ≥3
prior-week threshold)
* adds daily_series (30 entries, zero-padded) so the sidebar can
derive Active days + Last used
- InnerDetailResponse.parent_stack_count — new field. Skills/agents
don't have a per-item subscription model, so the hero shows the
*parent plugin's* stack count under a "Plugin:" prefix. The
funnel: "12 installed plugin → 2 actually use this skill".
- curated_skill_detail + curated_agent_detail handlers load
_load_curated_stack_counts() once and pass the parent's value.
Frontend (app/web/templates/marketplace_item_detail.html):
- New .item-detail .hero .hero-telemetry slot beneath the badges
row. CSS mirrors plugin-detail's colour tokens (mint/peach/cream
Heroicons solid + saturated trend pill) so the two surfaces read
as one visual family.
- Installed segment uses a "Plugin:" label rendered with reduced
opacity to signal the metric describes the parent, not the item
itself. Tooltip: "Parent plugin (<plugin_name>) currently
installed by N users".
- Sidebar Invocations + Users rows removed (chip carries them).
Active days + Last used derived from telemetry.daily_series replace
them; only rendered when activeDays > 0 so a brand-new skill
doesn't show "0 of 30" / "Last used —".
- "Type" row dropped from the sidebar — duplicates the hero badge.
- fmtNum helper added (matches listing card + plugin detail).
Plugin detail (app/web/templates/marketplace_plugin_detail.html):
- Hero "Curator: …" line removed. The Details sidebar already
carries that info; duplicating it under the h1 was visual noise.
- Sidebar "Owner" row renamed to "Curator" — for curated plugins
it's a person who curates inclusion in this Agnes instance, not
the upstream code owner. "Owner" was a hold-over label.
* feat(item-detail): unify hero with plugin detail — pills + breadcrumb + cleaner sidebar
- Inner skill/agent hero now uses the same `.pills` / `.pill.cat / .curated /
.flea / .muted` class names + CSS as the plugin detail page; the only
item-only addition is `.pill.type` (Skill / Agent uppercase, plugin detail
has no kind axis).
- Hero `Updated` moved out of the meta-row into a muted pill (mirrors the
plugin detail hero), removed from the Details sidebar to avoid duplication.
- Details sidebar slimmed: dropped Marketplace, Path, Updated rows; Parent
plugin now shows the curator-friendly display name
(`parent_display_name || manifest_name || slug`) instead of the slug.
- Breadcrumb extended to full path: Marketplace > <marketplace_name> >
<plugin display name> > <self>, mirroring the plugin detail breadcrumb.
- Backend: new `InnerDetailResponse.parent_display_name` field, populated via
`_curated_plugin_enrichment` from marketplace-metadata.json — same source
plugin detail hero already uses.
* feat(marketplace): flea inner skill/agent detail + breadcrumb polish
- Flea inner skill/agent detail page parity with curated:
* GET /api/marketplace/flea/{id}/skill/{name} + /agent/{name}
returning InnerDetailResponse (mirror of curated_skill_detail).
* /marketplace/flea/{id}/skill|agent/{name} web routes that render
marketplace_item_detail.html with source='flea' + innerName context.
* Frontend apiURL grows a third branch for flea-inner; breadcrumb
grows to 4 segments (Marketplace > Flea Market > <plugin display
name> > <self>) when innerName is set.
* Telemetry attribution: MarketplaceItemLookup resolves
<flea_plugin>:<inner> prefixes to (source='flea',
parent_plugin=<plugin name>) so nested invocations land in the
same rollups curated nested skills use. USAGE_PROCESSOR_VERSION
bumped 5 -> 6 so the reprocess loop re-attributes historic events.
- Breadcrumb 2nd segment is now a generic clickable "Curated
Marketplace" / "Flea Market" link to /marketplace?tab=... instead
of the opaque per-instance marketplace_name. Applied on both plugin
detail and inner item detail.
- Inner item hero telemetry chip works for both sources: installedCount
branches on parent_stack_count (curated) vs install_count (flea),
installed segment drops the "Plugin:" prefix for flea standalone /
inner items.
- Updated row dropped from Details sidebar on item detail — the hero
pill already carries the value, sidebar row was duplicate.
* feat(item-detail): block stack-install on flea inner items (mirror curated)
Inner skills/agents nested inside a flea plugin can no longer be added
to a user's stack on their own — adoption only happens at the plugin
level, same rule curated nested items have followed since launch.
- Hero action: when innerName is set (curated nested OR flea nested),
render "Open parent plugin →" link + helper text instead of the
install/remove buttons. Flea standalone entities (no innerName) keep
the normal install UX.
- Meta-row: same branch now serves curated + flea inner — "part of
<parent plugin display name> · by <author>" with the parent link
pointing at the right detail page per source.
No API gate change needed: POST /api/store/entities/{id}/install only
accepts existing entity ids (plugin-level), inner items have no entity
id of their own so the endpoint cannot target them directly.
* feat(marketplace): telemetry chip on inner cards + fix flea hero chip visibility
Inner skill/agent cards on the plugin detail page now carry the same
four-segment funnel chip the marketplace listing cards show (N active
. N calls . trend . N installed), for both curated nested skills and
flea nested skills. Plus two fixes that were keeping the hero chip
hidden on flea plugin / flea inner detail pages.
- Backend `_load_inner_items_stats_by_parent(conn, source, parent_plugin)`
bulk loader: one query per plugin against usage_marketplace_item_window
+ one against _daily, returning {(name, type): stats}. Avoids N+1
per-card lookups.
- `InnerItemSummary` gains invocations_30d / distinct_users_30d /
trend_pct / parent_stack_count fields. `curated_detail` and
`flea_detail` (in the entity.type=='plugin' branch) enrich the
skills / agents lists after the existing cover-photo enrichment loop.
- `marketplace_plugin_detail.html`: new `.plugin-detail .inner-card
.inv-chip*` CSS lifted from marketplace.html with the listing-card
rules, new buildInnerCardChip() helper, buildCardSection appends
the chip to each card body. Same gate as the listing card (hidden
on parent_stack==0 && calls==0).
- fix(flea): flea_detail forgot to populate PluginDetailResponse.stack_count
from entity.install_count (listing card does this on line 851; detail
endpoint didn't). Hero chip gate `stackCount===0 && calls===0` then
always hid the chip even when the entity had installs. Now mirrors
listing card semantics: stack_count == install_count for flea.
- fix(flea inner): renderInnerHeroTelemetry was reading `d.install_count`
for any non-curated source. InnerDetailResponse has no install_count
field — it has parent_stack_count (populated server-side from the
parent flea plugin's install_count). Gate + label now read
parent_stack_count for both curated nested AND flea nested scenarios;
install_count remains the flea standalone path.
* fix(marketplace): Owner label on flea + parent-centric sidebar for flea inner
- Plugin detail Details sidebar — authorship row label now tracks the
source: curated bundles get `Curator` (existing behaviour), flea
bundles get `Owner`. The `owner_todo` reminder placeholder stays on
the curated branch only; flea falls through silently.
- Inner item detail Details sidebar — flea-inner (skill/agent nested
inside a flea plugin) now shares the curated nested layout: Parent
plugin / Bundle size / Active days / Last used / Owner. Drops the
flea-standalone shape's `Category`, `Version`, `Installs`, `Released`
rows that didn't apply to a nested item. Active days + Last used were
already wired (telemetryRows) — they just weren't on the flea-inner
branch.
* fix(tests): bump SCHEMA_VERSION assertions 47 -> 48 post-rebase
The marketplace telemetry migration was renamed _v46_to_v47 -> _v47_to_v48
during the rebase onto main (collision with #326 FTS BM25 migration that
took the v47 slot). Two test files still asserted the pre-rebase value:
- tests/test_home_stats.py::test_schema_version_constant_is_46 (CI red)
- tests/test_schema_v46_migration.py::test_schema_version_is_46
Renames the helper fn name + bumps the assertion. The other two test
files (test_db_schema_version.py, test_schema_v42_migration.py) were
already updated in the rebase resolution.
* fix(telemetry): _build_telemetry returns None when invocations_30d == 0
The follow-up commit that introduced the always-return-dict shape broke
the test contract from the original v46 PR (commit b603e998):
tests/test_marketplace_telemetry.py::TestDetailTelemetry::
test_detail_endpoint_telemetry_absent_when_no_data
AssertionError: assert {'daily_series': [...], ...} is None
Both `PluginDetailResponse.telemetry` and `InnerDetailResponse.telemetry`
are declared `Optional[Dict] = None`, the frontend renders are None-safe
(`d.telemetry || {}` guard + `if (!d.telemetry || ...)` on daily_series),
so dropping the dict on zero activity is the cleaner default.
* release: 0.54.21 — marketplace telemetry refactor (schema v48) + flea inner detail parity + listing UX polish
---------
Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
|
||
|---|---|---|
| .claude | ||
| .github | ||
| app | ||
| cli | ||
| config | ||
| connectors | ||
| dev_docs | ||
| docs | ||
| infra | ||
| scripts | ||
| services | ||
| src | ||
| tests | ||
| .dockerignore | ||
| .gitignore | ||
| .pre-commit-config.yaml | ||
| .test_durations | ||
| AGENTS.md | ||
| ARCHITECTURE.md | ||
| Caddyfile | ||
| CHANGELOG.md | ||
| CLAUDE.md | ||
| docker-compose.ci.yml | ||
| docker-compose.dev.yml | ||
| docker-compose.flat-mount.yml | ||
| docker-compose.host-mount.yml | ||
| docker-compose.local-dev.yml | ||
| docker-compose.prod.yml | ||
| docker-compose.test.yml | ||
| docker-compose.tls.yml | ||
| docker-compose.yml | ||
| Dockerfile | ||
| LICENSE | ||
| Makefile | ||
| pyproject.toml | ||
| pytest.ini | ||
| README.md | ||
| uv.lock | ||
Agnes — AI Data Analyst
Agnes is an open-source data distribution platform for AI analytical systems. It extracts data from configured sources into DuckDB, serves it via a FastAPI backend, and distributes Parquet files to analysts who query them locally using Claude Code and DuckDB.
Each data source produces a self-describing extract.duckdb file. The SyncOrchestrator attaches all extract databases into a master analytics.duckdb, making every table available through a unified view layer without copying data unnecessarily.
Architecture: extract.duckdb Contract
Every connector produces the same output structure:
/data/extracts/{source_name}/
├── extract.duckdb ← _meta table + views
└── data/ ← parquet files (local sources only)
The orchestrator scans /data/extracts/*/extract.duckdb, attaches each into analytics.duckdb, and creates master views.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Keboola │ │ BigQuery │ │ Jira │
│ extractor │ │ extractor │ │ webhooks │
│ (DuckDB ext) │ │ (remote BQ) │ │ (incremental)│
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
▼ ▼ ▼
extract.duckdb extract.duckdb extract.duckdb
+ data/*.parquet (views → BQ) + data/*.parquet
│ │ │
└─────────────────┼─────────────────┘
▼
SyncOrchestrator.rebuild()
ATTACH → master views in analytics.duckdb
│
┌──────────┼──────────┐
▼ ▼ ▼
FastAPI CLI
(serve) (agnes pull)
Supported Data Sources
| Mode | Distribution | Sources | Use when |
|---|---|---|---|
Batch pull (local) |
Parquet on disk, scheduled | Keboola | Source has a native bulk-export and the table fits on disk |
Materialized SQL (materialized) |
Parquet on disk, scheduled query | BigQuery, Keboola | Source table is too large to mirror as-is; you want a curated subset / aggregate on disk |
Remote attach (remote) |
View only, no download | BigQuery | Table is too large to materialize; latency cost of remote query is acceptable |
| Real-time push | Incremental parquet | Jira | Source is event-driven and you need sub-minute freshness |
The first three modes are what agnes pull distributes to analysts. The fourth is server-side only — analysts query Jira data through the same agnes pull-distributed parquets.
Admins manage per-source registrations through the /admin/tables UI (per-connector tabs for BigQuery / Keboola / Jira) or the agnes admin register-table CLI; per-row "Manage access" deep-links to /admin/access for granting tables to user groups via resource_grants(group, ResourceType.TABLE, table_id).
Analysts get a closed loop with Claude Code: agnes init writes <workspace>/.claude/settings.json with SessionStart (agnes pull --quiet) and SessionEnd (agnes push --quiet) hooks so every Claude Code session starts with fresh RBAC-filtered parquets and ends with the session log uploaded back.
Adding a new source means creating connectors/<name>/extractor.py that produces extract.duckdb with a _meta table (table_name, description, rows, size_bytes, extracted_at, query_mode). The orchestrator attaches it automatically.
Quick Start with Docker
# Clone the repository
git clone https://github.com/keboola/agnes-the-ai-analyst.git
cd agnes-the-ai-analyst
# Copy and edit configuration
cp config/instance.yaml.example config/instance.yaml
cp config/.env.template .env
# Edit both files for your environment
# Start the app and scheduler
docker compose up
# Start with all optional services (Telegram bot, etc.)
docker compose --profile full up
# Start with TLS (Caddy on :443 with corporate-CA certs from /data/state/certs)
docker compose -f docker-compose.yml -f docker-compose.prod.yml -f docker-compose.tls.yml \
--profile tls up -d
Once running, the FastAPI app is available at http://localhost:8000 (or https://$DOMAIN in TLS mode). See docs/DEPLOYMENT.md for cert provisioning + auto-rotation via scripts/ops/agnes-tls-rotate.sh. Trigger a manual sync:
curl -X POST http://localhost:8000/api/sync/trigger
Local sync & auto-update
Analysts run Claude Code against a local DuckDB built from RBAC-filtered parquets pulled from the server. agnes pull is the distribution path:
agnes pull # delta-pull: manifest → MD5 compare → download changed → rebuild views
agnes pull --quiet # same, no progress output (for hooks/cron)
agnes push # push session jsonl + CLAUDE.local.md back to the server
agnes init writes Claude Code lifecycle hooks into <workspace>/.claude/settings.json:
SessionStart→agnes pull --quiet— fresh data on every sessionSessionEnd→agnes push --quiet— uploads notes and session log
Hooks live at workspace level so they only fire in this analyst workspace, not in unrelated Claude Code sessions on the same machine.
Admin: which tables auto-sync to whom
The auto-sync set per analyst is the intersection of:
- Tables with
query_mode IN ('local', 'materialized')— these have parquets on disk and end up in the manifest - Tables granted to one of the analyst's groups via
resource_grants(group, ResourceType.TABLE, table_id)(seedocs/RBAC.md)
To enroll a new table for auto-sync, register it (or update its query_mode) and grant it to the relevant groups in /admin/access. New analysts get the same set on their next agnes pull.
For BigQuery, register a query_mode='materialized' table with a SQL body:
agnes admin register-table orders_90d \
--source-type bigquery \
--query-mode materialized \
--query @docs/queries/orders_90d.sql \
--schedule "every 6h"
The scheduler runs the query through the DuckDB BigQuery extension on each tick that's due, writes the result as a parquet, and the analyst picks it up on the next agnes pull. Cost guardrail: data_source.bigquery.max_bytes_per_materialize (default 10 GiB) — operations exceeding the BQ dry-run estimate are skipped.
Development Setup
# Create and activate virtual environment
python3 -m venv .venv && source .venv/bin/activate
# Install dependencies
uv pip install ".[dev]"
# Run FastAPI locally with hot reload
uvicorn app.main:app --reload
# Run the test suite
pytest tests/ -v
Project Structure
├── src/ # Core engine
│ ├── db.py # DuckDB schema (system.duckdb, analytics.duckdb)
│ ├── orchestrator.py # SyncOrchestrator — ATTACHes extract.duckdb files
│ ├── repositories/ # DuckDB-backed CRUD (sync_state, table_registry, users, etc.)
│ ├── profiler.py # Data profiling
│ └── catalog_export.py # OpenMetadata catalog export
├── app/ # FastAPI application
│ ├── main.py # App setup, router registration
│ ├── api/ # REST API (sync, data, catalog, admin, auth)
│ ├── auth/ # Auth providers (Google OAuth, email magic link, desktop JWT)
│ └── web/ # HTML dashboard routes
├── connectors/ # Data source connectors (extract.duckdb contract)
│ ├── keboola/ # Keboola: extractor.py (DuckDB extension) + client.py (fallback)
│ ├── bigquery/ # BigQuery: extractor.py (remote-only via DuckDB BQ extension)
│ └── jira/ # Jira: webhook + incremental parquet → extract.duckdb
├── cli/ # CLI tool (`agnes pull`, `agnes query`, `agnes admin`)
├── services/ # Standalone services (scheduler, telegram_bot, ws_gateway, etc.)
├── scripts/ # Utility + migration scripts
├── config/ # Configuration templates (instance.yaml.example)
├── docs/ # Documentation + metric YAML definitions
└── tests/ # Test suite (633 tests)
Configuration
| File | Purpose |
|---|---|
config/instance.yaml |
Instance-specific settings: branding, data source type, auth provider, Google domain |
.env |
Secrets and environment variables — never committed |
system.duckdb table_registry table |
Table definitions managed via POST /api/admin/register-table (or PUT /api/admin/registry/{id} to update) or the web UI |
Copy the example to get started:
cp config/instance.yaml.example config/instance.yaml
See config/instance.yaml.example for all available options.
Documentation
Full index: docs/README.md — every doc, organized by audience (analyst / operator / developer).
Key entry points:
- Quickstart — local development setup
- Onboarding Guide — end-to-end Terraform deployment into a GCP project (recommended for production)
- Deployment Guide — chooses between Terraform and Docker Compose; covers OSS self-host
- Configuration Reference —
instance.yaml, env vars, per-instance options - Architecture — orchestrator, extractors, DB layout
Contributing
- Fork the repository and create a feature branch.
- Run
pytest tests/ -vto verify all tests pass before opening a pull request. - Keep commits focused and messages concise.
- Open a pull request against
mainwith a clear description of the change.
For bugs and feature requests, open a GitHub issue.
License
This project is licensed under the MIT License.