agnes-the-ai-analyst/config/claude_md_template.txt
minasarustamyan efc607f3ee
feat(cli): agnes marketplace search/detail/add/remove + retire stale subcommands (#280)
* feat(cli): agnes marketplace search/detail/add/remove + retire stale subcommands

Unified CLI surface for the v28+ marketplace: search across Curated and
Flea Market (RBAC-filtered server-side), drill into a single item's
detail, add/remove from your stack. Replaces opt-out era commands that
no longer reflect how users compose their stack.

CLI changes:
- Added: agnes marketplace {search,detail,add,remove}
- Removed: agnes my-stack toggle (opt-out semantics, curated-only)
- Removed: agnes store {list,show,install,uninstall} (consumer-side ops
  moved under marketplace; store now covers only creator-side upload,
  update, delete, mine)

ID format unifies curated and flea: marketplace_id/plugin_name (slash)
routes to /api/marketplace/curated/..., bare UUID routes to
/api/store/entities/... (flea bundles skills/agents into a synthetic
plugin server-side, so the analyst sees a single add/remove surface).

Templates:
- claude_md_template.txt: rewritten marketplace section as operational
  guidance for Claude Code (discovery, stack management, behaviour
  notes). Dropped the static {% if marketplaces %} listing — the CLI is
  the source of truth for what's in the stack at any moment, so a
  snapshot rendered at init time would lie the moment the user runs
  agnes marketplace add/remove. Same discipline already applied to
  tables and metrics.
- agnes_workspace_template.txt: cheat sheet adds 5 marketplace
  one-liners; keeps the file's reference-doc tone (the original
  commit's intent: 'what is this thing, how does it work, how do I
  uninstall it').

Docs: HOWTO/05-customizing-skills.md rewritten around the new CLI flow;
the opt-out section is replaced by 'Removing items from your stack'.

Tests: new test_cli_marketplace.py covers all four subcommands incl.
RBAC/409 paths (system plugin guard, not-approved flea entity);
test_cli_store.py trimmed to the retained creator-side commands.

* release: 0.54.1 — agnes marketplace CLI redesign + retire stale subcommands

Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.0 →
0.54.1) bundling the BREAKING removals of `agnes my-stack toggle` and
`agnes store {list,show,install,uninstall}` plus the new unified
`agnes marketplace {search,detail,add,remove}` surface.

No DB migration; no operator-facing config change. Operators on
floating tags (`:stable`) auto-upgrade transparently. Analyst CLI
upgrade prompt fires on next `agnes pull`; users invoking the
retired commands get "No such command" with the new `agnes
marketplace` substitution called out in the BREAKING bullets.

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-05-13 05:20:56 +00:00

323 lines
18 KiB
Text
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

{# Default analyst-onboarding workspace prompt for "agnes init".
Rendered server-side by src/claude_md.py. Edit this file to change
the OSS default; admins override per-instance via /admin/workspace-prompt.
Available context (see docs/agent-workspace-prompt.md for the full reference):
instance.name, instance.subtitle
server.url, server.hostname
sync_interval — string from instance.yaml
data_source.type — keboola | bigquery | local
tables — list of {name, description, query_mode}
metrics.count, metrics.categories
marketplaces — list of {slug, name, plugins:[{name}]}
user.id, user.email, user.name, user.is_admin, user.groups
now, today — datetime / date string
#}
# {{ instance.name }} — AI Data Analyst
This workspace is connected to {{ server.url }}.
{% if instance.subtitle %}Operated by **{{ instance.subtitle }}**.{% endif %}
> Looking for human-readable workspace docs? Open `AGNES_WORKSPACE.md` in this directory — that file documents what `agnes init` installed, where files live, and how to uninstall.
## Rules
- Before computing any business metric: run `agnes catalog --metrics --show <category>/<name>`
- **For canonical table list with query modes: `agnes catalog`.** Treat `agnes catalog` as source of truth (covers all `query_mode` values: `local`, `remote`, `materialized`).
- Do not use DESCRIBE/SHOW COLUMNS — use `agnes schema <table>` instead
- Sync data regularly with `agnes pull`
- **Personal customizations go in `.claude/CLAUDE.local.md`, NOT here.** This file is regenerated by `agnes init --force`; edits here will be lost. CLAUDE.local.md is preserved across regeneration and uploaded on `agnes push`.
## Metrics Workflow
1. `agnes catalog --metrics` — list registered metrics + categories
2. `agnes catalog --metrics --show <category>/<name>` — read the canonical SQL + business rules
3. Adapt the canonical SQL; never invent metric calculations
## Data Sync
- `agnes pull` — download current data from server
- `agnes push` — upload sessions and local notes to server
- Data on the server refreshes every {{ sync_interval }}
## Private sessions
If a Claude Code session covers something you do **not** want uploaded
to the Agnes server (sensitive PII walk-through, exploratory work that
shouldn't shape org-wide context, etc.), mark it private:
- Run `/agnes-private` from inside the session — the deterministic
`!`-prefix bash adds the current `CLAUDE_CODE_SESSION_ID` to
`<workspace>/.claude/agnes-sessions-private.txt`. Subsequent
`agnes push` runs skip its transcript and audit-log the skip to
`<workspace>/.claude/agnes-sessions-private-skipped.txt`.
- The Claude Code statusbar shows `🔒 agnes-private` while you are
in a private session, so the marking is visible at a glance.
- Marking before `capture-session` fires (the SessionStart hook that
queues the transcript) means the session never enters the upload
queue at all; marking after it fires is also safe — the push-side
re-check on the private list is the authoritative filter.
## Discovering tables — never enumerate from memory
Tables, columns, sizes, descriptions, and `query_mode` change as admins
register / migrate / drop entries. Always re-discover from the live server,
never from this file or your training data:
```
agnes catalog --json # all tables: id, query_mode, sql_flavor,
# where_examples, fetch_via, rough_size_hint, description
agnes catalog --json | jq '.tables[] | select(.id=="<id>")' # single table — read its description in full BEFORE writing any SQL
agnes schema <table> # columns + types in the right SQL dialect
agnes describe <table> -n 5 # sample rows (local + materialized only)
```
The `description` field on each catalog row is the **authoritative
business-rules text** for that table — it carries grain, partition
column, join contracts, and column-level gotchas. Re-read it from the
live `agnes catalog` for every cross-table decision; do **not** copy
it into this workspace `CLAUDE.md` (it's a snapshot that goes stale,
and `agnes init` will overwrite local edits — put personal notes into
`.claude/CLAUDE.local.md` instead). The CLI is the source of truth.
`rough_size_hint` is server-populated for `local` and `materialized` tables
(`small` ≤100 MiB, `medium` ≤1 GiB, `large` ≤10 GiB, `very_large` >10 GiB) and
`null` for `remote` rows. When `null`, treat the table as potentially large
and use `agnes snapshot create --estimate` to size-check before fetching.
## Agnes Marketplace — discovering and managing skills, agents, plugins
The marketplace has two sections:
- **Curated** — vetted by your org; admins control RBAC visibility
- **Flea Market** — open community uploads anyone can publish (admin-approved)
When the user asks "is there a skill/agent for X?" or "what's available for Y?",
**always check the marketplace** before assuming nothing exists — new items are
published continuously and your training data does not cover them. Never
enumerate from memory; the CLI is the source of truth for what exists and
what's in your stack.
### Discovery
```
# Search across Curated + Flea Market (RBAC-filtered server-side)
agnes marketplace search -q "pdf"
agnes marketplace search --type skill # restrict by type
agnes marketplace search --source curated # restrict by source
agnes marketplace search --json # machine-readable for scripts
# Full details — use cases, contents, install command, examples
agnes marketplace detail <id>
agnes marketplace detail --json <id> # raw payload
```
ID format:
- Curated → `marketplace_id/plugin_name` (contains `/`)
- Flea Market → UUID (no `/`)
### Managing your stack
```
# What's currently in my stack? (authoritative — read live, do not cache)
agnes my-stack show
# Add or remove (works for both Curated and Flea Market by ID)
agnes marketplace add <id>
agnes marketplace remove <id>
```
After `add` / `remove`, the user must run `/update-agnes-plugins` inside
Claude Code to apply the change to the running session — no restart needed.
### Notes on behaviour
- System plugins (admin-pinned) cannot be removed — the API returns 409.
- Flea Market items must be admin-approved before they're installable.
- Skills and agents from the Flea Market are bundled into a synthetic plugin
when served to Claude Code; you invoke them the same way as plugin-shipped
skills.
- Plugins served to Claude Code are addressed as `<plugin>@agnes`
(e.g. `claude plugin install <plugin>@agnes`). The SessionStart hook runs
`agnes refresh-marketplace --check` to detect upstream changes.
## Remote Queries (BigQuery) — when data isn't on the laptop
Not every table is synced. Tables registered with `query_mode: "remote"` live in
BigQuery, accessed server-side via DuckDB's BQ extension — no parquet on disk.
Tables you don't see in `server/parquet/` may still be queryable.
### Discovery first — read `agnes catalog --json` BEFORE every cross-table decision
`agnes catalog --json` returns one row per table with these fields. Use them; don't guess:
| Field | What it tells you | How to use it |
|---|---|---|
| `query_mode` | `local` (parquet on laptop) / `remote` (BQ on demand) / `materialized` (synced parquet of a BQ result) | Picks the tool — see decision tree below |
| `source_type` | `keboola` / `bigquery` / `jira` | Determines SQL dialect |
| `sql_flavor` | `duckdb` for local sources, `bigquery` for `--remote` queries on BQ rows | What syntax `--where` expects |
| `where_examples` | 13 example WHERE predicates that are valid for this table's dialect | Copy as starting point for `--where` |
| `fetch_via` | Pre-formatted `agnes snapshot create …` template for this table | The canonical "how do I get a slice of this table" command |
| `rough_size_hint` | Coarse size hint (`small` / `medium` / `large` or null when unknown) | Bigger than `medium` → never `agnes query --remote` without a tight `--where`; use `agnes snapshot create` |
```
agnes catalog --json # full structured view (use this in scripts)
agnes catalog # human-readable summary
agnes schema <table> # columns + types (BIGQUERY/DUCKDB dialect printed in header)
agnes describe <table> -n 5 # sample rows (works on local & materialized only)
```
### Decision tree — pick the right tool BEFORE writing SQL
```
┌─ local → agnes query "SELECT ..."
agnes catalog → ─────┤
query_mode of <table> ├─ materialized → agnes query (parquet was synced by agnes pull)
│ (if missing locally, run `agnes pull` first)
└─ remote → choose by table size + query shape:
- one cheap probe (COUNT, schema-confirm, single agg ≤200s)
→ agnes query --remote "..."
- repeated questions on same slice / large scan
→ agnes snapshot create <table> --select ... --where ... --as <name>
then agnes query "SELECT ... FROM <name>"
- join with a local table
→ agnes snapshot create <remote-table> --select ... --where ... --as <name>,
then agnes query "SELECT ... FROM <local_table> JOIN <name> ..."
```
### Two patterns for `query_mode: "remote"` tables
| Pattern | Tool | Use when |
|---------|------|----------|
| **`agnes snapshot create`** (preferred) | materializes a filtered subset locally → query the snapshot | repeated questions on same slice, OR joining a remote table with a local one (snapshot the remote side, join locally) |
| **`agnes query --remote`** | one-shot, server-side execution against BigQuery (works for BASE TABLE rows directly + VIEW/MATERIALIZED_VIEW rows via the BQ jobs API; cost-guarded by a 5 GiB scan cap configurable in /admin/server-config) | single aggregate / cheap probe |
### Common mistakes — avoid on first try
- **`--estimate` is on `agnes snapshot create` ONLY.** Do NOT pass it to `agnes query` — fails with `No such option: --estimate`. The estimate flow is a snapshot-creation cost gate, not a query primitive.
- **Old `agnes fetch` / `da fetch` / `da query` references in stale docs** — the CLI is `agnes`; `agnes fetch` was renamed to `agnes snapshot create`. If you see those names, translate before running.
- **Don't attempt personal GCP auth** if a BQ query fails with permission errors. BQ access uses the **server's service account**, not your Google identity — escalate to admin instead.
- **Don't `agnes query --remote "SELECT * FROM <large_table>"`** without a `--where`. Even if the scan-byte gate refuses, you've wasted the round-trip; gate yourself first by reading `rough_size_hint` and `where_examples` from `agnes catalog --json`.
### Failure-mode dictionary — what each error means + the right response
| Error wording (substring) | Cause | Response |
|---|---|---|
| `Binder Error: Query execution exceeded the timeout. Job ID: ...` | BQ-side query took >~200 s wall-clock; the DuckDB BQ extension's `bq_query_timeout_ms` (default 90 s, server may bump to 600 s) elapsed | Narrow `--where` (especially partition column), drop unused columns from `--select`, or switch to `agnes snapshot create` to materialise once + query locally |
| `HTTP 400: remote_scan_too_large` | Server's `bq_max_scan_bytes` cost gate refused the query (default 5 GiB) | Tighten `--where`; consider `agnes snapshot create` so the cost is paid once, then local queries are free |
| `HTTP 401: ... unauthorized` | PAT expired or wrong | `agnes init --server-url ... --token <new-PAT>`; re-mint via the dashboard's "Personal Access Tokens" page |
| `HTTP 403: cross_project_forbidden` (with `serviceusage` mention) | Server SA lacks `serviceusage.services.use` on the BQ data project | Escalate to admin to set `data_source.bigquery.billing_project`; do NOT try personal auth |
| `ReadTimeout` (client-side) on `agnes query --remote` | CLI is older than 0.35.1 (had 30 s default) | `agnes --version`; if <0.35.1, upgrade with `uv tool install --force <wheel-from-server>` (the URL is in the `[update]` banner that prints on every command). Then retry. |
| `unknown columns: [...]` from `agnes snapshot create` | `--select` lists columns that don't exist | Run `agnes schema <table>` and copy column names verbatim |
### Cost discipline — every BQ query bills bytes scanned
A naive `SELECT * FROM <large_table>` can cost real money. ALWAYS:
- filter via `--where` on the partition column (typically a date) — read `where_examples` in `agnes catalog --json`
- list specific columns in `--select` — column-store BQ skips the rest
- run `--estimate` first (only valid on `agnes snapshot create`) when the table is partitioned/clustered or when `rough_size_hint` is unknown
### `agnes snapshot create` discipline
```
# 1. ESTIMATE first — refuses to fetch without knowing the cost
agnes snapshot create <table> --select col1,col2 --where "date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)" --estimate
# 2. If reasonable, fetch as a named snapshot
agnes snapshot create <table> --select col1,col2 --where "..." --as my_recent
# 3. Query the local snapshot
agnes query "SELECT col1, COUNT(*) FROM my_recent GROUP BY 1"
# 4. List + drop snapshots when done
agnes snapshot list
agnes snapshot drop my_recent
```
Rules of thumb:
- ALWAYS list specific columns in `--select`. Avoid implicit SELECT *.
- ALWAYS include a `--where` for remote tables; otherwise add `--limit`.
- ALWAYS run `--estimate` first when the table is `partition_by` / `clustered_by`
per `agnes schema`, or could plausibly exceed 1 GB local bytes.
- Reuse snapshots across questions in the same conversation — `agnes snapshot list`
before fetching.
### Snapshot freshness — when to refresh
Snapshots are point-in-time copies. They go stale as the source data updates (most BQ tables refresh daily; check `sync_schedule` per `agnes catalog`). For each new conversation:
```
agnes snapshot list # see existing snapshots + their ages
agnes snapshot drop my_recent # drop stale ones
agnes snapshot create <table> --select ... --where ... --as my_recent # re-fetch
```
If the question is time-sensitive (e.g. "today's orders"), assume any snapshot older than the table's `sync_schedule` is stale and refresh.
### Hybrid query example — local + remote in one query
To join a local table with remote BigQuery data: `agnes snapshot create` a
filtered slice of the remote table, then query the local join. Snapshot the
remote side aggressively (WHERE + only the columns you need) so the cached
parquet stays under ~100 MB; the join itself runs locally over the snapshot
+ the existing local table.
```
# 1. snapshot the remote side, narrow + aggregated
agnes snapshot create web_sessions \
--select date,country,views \
--where "date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)" \
--as traffic
# 2. local join (orders is a local table, traffic is the snapshot from step 1)
agnes query "SELECT o.date, o.country, o.revenue, t.views,
o.revenue / NULLIF(t.views, 0) AS rev_per_view
FROM orders o
JOIN traffic t ON o.date = t.date AND o.country = t.country
ORDER BY 1 DESC"
```
If the remote side is too big to snapshot even after filtering, push the
whole join server-side: `agnes query --remote "<SQL>"` runs on the server
where local tables and remote tables are already in the same DuckDB
session.
### BigQuery SQL flavor for `--where`
Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
- Date literal: `DATE '2026-01-01'`
- Timestamp literal: `TIMESTAMP '2026-01-01 00:00:00 UTC'`
- Now: `CURRENT_DATE()`, `CURRENT_TIMESTAMP()`
- Date arithmetic: `DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)`
- Regex: `REGEXP_CONTAINS(col, r'pattern')` (raw string!)
- Cast: `CAST(x AS INT64)` (NOT `INT`)
### When the table you want isn't in `agnes catalog`
The table may exist in BigQuery but not be registered with Agnes yet. Tell the user the table isn't registered and ask an admin to register it with `query_mode: "remote"` so it shows up in `agnes catalog` and supports `agnes snapshot create` / `agnes query --remote`. Direct `bq."<dataset>"."<table>"` paths are registry-gated server-side (403 `bq_path_not_registered`) — there's no analyst-side workaround for an unregistered table; registration is the right path.
### Deeper guidance
For the full protocol, including snapshot hygiene and when NOT to use
`agnes snapshot create`, run:
```
agnes skills show agnes-data-querying
```
## Corporate Memory
Rules injected by `agnes pull` from the server's corporate knowledge base live in `.claude/rules/km_*.md`. They are automatically loaded by Claude Code on every session start.
- `km_<id>.md` — mandatory rules (always enforced)
- `km_approved.md` — approved guidance (confidence × recency ranked)
Run `agnes pull` to refresh. Rules are pruned automatically when items are revoked.
## Directory Structure
- `server/parquet/*.parquet` — synced table data (RBAC-filtered subset for you)
- `user/duckdb/analytics.duckdb` — local analytics DuckDB views — what `agnes query` reads
- `user/snapshots/*.parquet` — ad-hoc materialized snapshots from `agnes snapshot create`
- `user/sessions/*.jsonl` — Claude Code session logs (uploaded on `agnes push`)
- `.claude/CLAUDE.local.md` — your personal notes + workspace customizations. **Never overwritten by `agnes init --force`.** Uploaded to the server on `agnes push`. Put any local-only Claude instructions, project-specific reminders, or temporary notes here — NOT in CLAUDE.md (this file is regenerated from a template).
_Hello {{ user.name or user.email }} — generated {{ today }}._