Customise the CLAUDE.md generated for analysts on da analyst setup.
+
Customise the banner shown above the setup commands on /setup.
{% if is_override %}
@@ -220,33 +219,32 @@
- Edit the template below to customise onboarding instructions for analysts on this instance.
- Leave empty or click Reset to default to revert to the OSS-shipped template.
- The override is rendered server-side — placeholders like
- {{ "{{ user.name }}" }} are substituted at delivery time.
+ This banner is shown above the setup commands on /setup. Empty by default.
+ Use it for organisation-specific notes: VPN requirements, support channel, data classification
+ policy, platform onboarding steps, etc.
+ The override is rendered server-side as HTML — Jinja2 placeholders like
+ {{ "{{ user.name }}" }} are substituted at render time.
+ Output is sanitised post-render: inline <script> tags and
+ on*= event handlers are stripped as a safety net.
Available placeholders
- {{ "{{ instance.name }}" }} — instance display name
-{{ "{{ instance.subtitle }}" }} — operator name
-{{ "{{ server.url }}" }} — full server URL
-{{ "{{ server.hostname }}" }} — host part
-{{ "{{ sync_interval }}" }} — refresh cadence (instance.yaml)
-{{ "{{ data_source.type }}" }} — keboola | bigquery | local
-{{ "{{ tables }}" }} — list of {name, description, query_mode}
-{{ "{{ metrics.count }}" }}, {{ "{{ metrics.categories }}" }}
-{{ "{{ marketplaces }}" }} — RBAC-filtered list of {slug, name, plugins[]}
+ {{ "{{ instance.name }}" }} — instance display name
+{{ "{{ instance.subtitle }}" }} — operator / org name
+{{ "{{ server.url }}" }} — full server URL
+{{ "{{ server.hostname }}" }} — host part only
{{ "{{ user.email }}" }}, {{ "{{ user.name }}" }}, {{ "{{ user.is_admin }}" }}, {{ "{{ user.groups }}" }}
-{{ "{{ now }}" }}, {{ "{{ today }}" }}
+ (user may be null for anonymous visitors — guard with {{ "{% if user %}" }})
+{{ "{{ now }}" }}, {{ "{{ today }}" }} — server time (UTC) / date string
-
+
Live preview
@@ -266,7 +264,7 @@
Reset to default?
-
Your override will be permanently removed. The OSS-shipped template will be used instead. This cannot be undone.
+
Your override will be permanently removed. No banner will be shown on /setup. This cannot be undone.
@@ -338,7 +336,7 @@ async function renderPreview() {
});
if (r.ok) {
const j = await r.json();
- previewBox.textContent = j.content;
+ previewBox.innerHTML = j.content;
previewErr.hidden = true;
} else {
let detail = r.statusText;
@@ -398,7 +396,7 @@ async function refreshStatus() {
if (!r.ok) return;
const data = await r.json();
setStatusChip(data);
- editor.setValue(data.content !== null ? data.content : data.default);
+ editor.setValue(data.content !== null ? data.content : "");
renderPreview();
}
diff --git a/config/claude_md_template.txt b/config/claude_md_template.txt
deleted file mode 100644
index 1e59d2f..0000000
--- a/config/claude_md_template.txt
+++ /dev/null
@@ -1,195 +0,0 @@
-{# Default analyst-onboarding welcome prompt for "da analyst setup".
- Rendered server-side by src/welcome_template.py. Edit this file to change
- the OSS default; admins override per-instance via /admin/welcome.
-
- Available context (see docs/welcome-template.md for the full reference):
- instance.name, instance.subtitle
- server.url, server.hostname
- sync_interval — string from instance.yaml
- data_source.type — keboola | bigquery | local
- tables — list of {name, description, query_mode}
- metrics.count, metrics.categories
- marketplaces — list of {slug, name, plugins:[name]}
- user.email, user.name, user.is_admin, user.groups
- now, today — datetime / date string
-#}
-# {{ instance.name }} — AI Data Analyst
-
-This workspace is connected to {{ server.url }}.
-{% if instance.subtitle %}Operated by **{{ instance.subtitle }}**.{% endif %}
-
-## Rules
-- Before computing any business metric: run `da metrics show /`
-- **For canonical table list with query modes: `da catalog`.** `data/metadata/schema.json` covers `query_mode: "local"` tables only — for remote/hybrid tables it's incomplete. Treat `da catalog` as source of truth.
-- Do not use DESCRIBE/SHOW COLUMNS — use `da schema <table>` instead
-- Save work output to `user/artifacts/`
-- Sync data regularly with `da sync`
-- **Personal customizations go in `.claude/CLAUDE.local.md`, NOT here.** This file is regenerated by `da analyst setup --force`; edits here will be lost. CLAUDE.local.md is preserved across regeneration and uploaded on `da sync --upload-only`.
-
-## Metrics Workflow
-1. `da metrics list` — find the relevant metric ({{ metrics.count }} available, categories: {{ metrics.categories | join(", ") or "none yet" }})
-2. `da metrics show /` — read SQL and business rules
-3. Use the canonical SQL from the metric definition, adapt to the question
-4. Never invent metric calculations — always check existing definitions first
-
-## Data Sync
-- `da sync` — download current data from server
-- `da sync --docs-only` — just metadata and metrics (fast refresh)
-- `da sync --upload-only` — upload sessions and local notes to server
-- Data on the server refreshes every {{ sync_interval }}
-
-## Available Datasets
-{% for t in tables -%}
-- `{{ t.name }}`{% if t.description %} — {{ t.description }}{% endif %}{% if t.query_mode == "remote" %} *(remote, queried on demand)*{% endif %}
-{% else -%}
-- _No tables registered yet — ask an admin to register tables in the dashboard._
-{% endfor %}
-
-{% if marketplaces -%}
-## Plugins available to you
-{% for mp in marketplaces -%}
-- **{{ mp.name }}** ({{ mp.slug }}): {{ mp.plugins | map(attribute="name") | join(", ") }}
-{% endfor %}
-{% endif -%}
-
-## Remote Queries (BigQuery) — when data isn't on the laptop
-
-Not every table is synced. Tables registered with `query_mode: "remote"` live in
-BigQuery, accessed server-side via DuckDB's BQ extension — no parquet on disk.
-Tables you don't see in `data/parquet/` may still be queryable.
-
-### Discovery first
-
-```
-da catalog --json | jq '.[] | {name, source_type, query_mode}' # see all tables + their modes
-da schema <table> # columns + types
-da describe <table> -n 5 # sample rows
-```
-
-For local-mode tables, query directly with `da query "SELECT … FROM <table>"`.
-
-### Three patterns for `query_mode: "remote"` tables
-
-| Pattern | Tool | Use when |
-|---------|------|----------|
-| **`da fetch`** (preferred) | materializes a filtered subset locally → query the snapshot | repeated questions on same slice |
-| **`da query --remote`** | one-shot, server-side execution against BigQuery | single aggregate / cheap probe |
-| **`da query --register-bq`** | hybrid joins between local snapshots and ad-hoc BQ subqueries | crossing local + remote |
-
-### Permission model + cost — important
-
-- BQ access goes through the **agnes server's GCE service account**, not your personal Google credentials. If a query fails with a permission error, the table is in a project the server SA cannot read — escalate to admin, do NOT try to authenticate yourself.
-- Every BQ query bills the SA's GCP project for **bytes scanned**. A naive `SELECT * FROM ` can cost real money. ALWAYS:
- - filter via `--where` on the partition column (typically a date)
- - list specific columns in `--select` — column-store BQ skips the rest, cheaper
- - run `--estimate` first when unsure of the table size or partitioning
-
-### `da fetch` discipline
-
-```
-# 1. ESTIMATE first — refuses to fetch without knowing the cost
-da fetch <table> --select col1,col2 --where "date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)" --estimate
-
-# 2. If reasonable, fetch as a named snapshot
-da fetch <table> --select col1,col2 --where "..." --as my_recent
-
-# 3. Query the local snapshot
-da query "SELECT col1, COUNT(*) FROM my_recent GROUP BY 1"
-
-# 4. List + drop snapshots when done
-da snapshot list
-da snapshot drop my_recent
-```
-
-Rules of thumb:
-- ALWAYS list specific columns in `--select`. Avoid implicit SELECT *.
-- ALWAYS include a `--where` for remote tables; otherwise add `--limit`.
-- ALWAYS run `--estimate` first when the table is `partition_by` / `clustered_by`
- per `da schema`, or could plausibly exceed 1 GB local bytes.
-- Reuse snapshots across questions in the same conversation — `da snapshot list`
- before fetching.
-
-### Snapshot freshness — when to refresh
-
-Snapshots are point-in-time copies. They go stale as the source data updates (most BQ tables refresh daily; check `sync_schedule` per `da catalog`). For each new conversation:
-
-```
-da snapshot list # see existing snapshots + their ages
-da snapshot drop my_recent # drop stale ones
-da fetch <table> --select ... --where ... --as my_recent # re-fetch
-```
-
-If the question is time-sensitive (e.g. "today's orders"), assume any snapshot older than the table's `sync_schedule` is stale and refresh.
-
-### Hybrid query example — local + remote in one query
-
-`da query --register-bq` lets a single SQL statement join a local table with an ad-hoc BQ subquery. The BQ subquery runs first (server-side), result registered as a DuckDB view, then the joined query runs locally.
-
-```
-da query \
- --register-bq "traffic=SELECT date, country, SUM(views) AS views \
- FROM \`prj.web_analytics.sessions\` \
- WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) \
- GROUP BY 1, 2" \
- --sql "SELECT o.date, o.country, o.revenue, t.views, o.revenue / NULLIF(t.views,0) AS rev_per_view \
- FROM orders o \
- JOIN traffic t ON o.date = t.date AND o.country = t.country \
- ORDER BY 1 DESC"
-```
-
-The BQ subquery MUST contain `WHERE` and/or `GROUP BY` to keep the registered result manageable (target: under 500K rows, well under 100 MB). Multiple `--register-bq` flags can compose multiple BQ sources. For complex SQL, use `--stdin` mode (`echo '{"register_bq":{...},"sql":"..."}' | da query --stdin`).
-
-### BigQuery SQL flavor for `--where`
-
-Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
-
-- Date literal: `DATE '2026-01-01'`
-- Timestamp literal: `TIMESTAMP '2026-01-01 00:00:00 UTC'`
-- Now: `CURRENT_DATE()`, `CURRENT_TIMESTAMP()`
-- Date arithmetic: `DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)`
-- Regex: `REGEXP_CONTAINS(col, r'pattern')` (raw string!)
-- Cast: `CAST(x AS INT64)` (NOT `INT`)
-
-### When the table you want isn't in `da catalog`
-
-The table may exist in BigQuery but not be registered with Agnes yet. Two options:
-
-1. **Ad-hoc one-shot** — register a BQ subquery as a view inline, no admin needed
- if the agnes server SA has BQ access:
- ```
- da query --register-bq "live=SELECT * FROM \`project.dataset.table\` WHERE date >= '...' LIMIT 1000" \
- --sql "SELECT * FROM live"
- ```
-2. **Ask admin to register** the table with `query_mode: "remote"` so it shows up
- in `da catalog` and supports `da fetch` / `da query --remote`. This is the
- right path for any table you'll query repeatedly.
-
-### Deeper guidance
-
-For the full protocol, including hybrid-query examples, snapshot hygiene, and
-when NOT to use `da fetch`, run:
-
-```
-da skills show agnes-data-querying
-```
-
-## Corporate Memory
-
-Rules injected by `da sync` from the server's corporate knowledge base live in `.claude/rules/km_*.md`. They are automatically loaded by Claude Code on every session start.
-
-- `km_.md` — mandatory rules (always enforced)
-- `km_approved.md` — approved guidance (confidence × recency ranked)
-
-Run `da sync` to refresh. Rules are pruned automatically when items are revoked.
-
-## Directory Structure
-- `data/` — read-only data downloaded from server
- - `data/parquet/` — table data in Parquet format
- - `data/duckdb/` — local analytics DuckDB database
- - `data/metadata/` — profiles, schema, metrics cache
-- `user/` — your workspace (persistent across syncs)
- - `user/artifacts/` — analysis outputs, reports, charts
- - `user/sessions/` — Claude Code session logs
-- `.claude/CLAUDE.local.md` — your personal notes + workspace customizations. **Never overwritten by `da analyst setup --force`.** Uploaded to the server on `da sync --upload-only`. Put any local-only Claude instructions, project-specific reminders, or temporary notes here — NOT in CLAUDE.md (this file is regenerated from a template).
-
-_Hello {{ user.name or user.email }} — generated {{ today }}._
diff --git a/docs/agent-setup-prompt.md b/docs/agent-setup-prompt.md
index 72c1575..96e7f17 100644
--- a/docs/agent-setup-prompt.md
+++ b/docs/agent-setup-prompt.md
@@ -1,94 +1,91 @@
# Agent Setup Prompt
-The agent setup prompt is the `CLAUDE.md` file generated in an analyst's local
-workspace by `da analyst setup`. It instructs Claude Code on how to behave in
-that workspace — which commands to use, where to read schema metadata, what
-metrics exist, what plugins are available.
+The agent setup prompt is an HTML banner shown **above the bash setup commands**
+on the `/setup` page. It is intended for organisation-specific operational notes
+that every new analyst should read before running the bootstrap script —
+for example: VPN requirements, support channel, data classification reminder,
+or platform-specific prerequisites.
-## Defaults
+## Default behaviour
-The OSS distribution ships a generic welcome prompt at
-`config/claude_md_template.txt`. Every Agnes instance starts with this default;
-no admin action is required.
+No banner is shown by default. The `/setup` page renders only the standard
+install steps until an admin configures an override.
-## Customizing per instance
+## Customising per instance
-Admins can override the template via:
+Admins configure the banner via:
-- **Admin UI:** `/admin/agent-prompt` — textarea editor with placeholder cheatsheet
- and live preview button. Save sends a `PUT` to `/api/admin/welcome-template`.
+- **Admin UI:** `/admin/agent-prompt` — Jinja2 HTML editor with a placeholder
+ cheatsheet, live preview, and save/reset actions.
- **REST API:**
- - `GET /api/admin/welcome-template` — returns `{content, default, updated_at, updated_by}`. `content` is `null` when no override is set.
- - `PUT /api/admin/welcome-template` with body `{"content": "..."}` — validates Jinja2 syntax, stores the override.
- - `DELETE /api/admin/welcome-template` — clears the override; renderer falls back to the shipped default.
- - `POST /api/admin/welcome-template/preview` with body `{"content": "..."}` — renders arbitrary content against the calling admin's live context without persisting. Used by the editor's Preview button.
+ - `GET /api/admin/welcome-template` — returns `{content, updated_at, updated_by}`.
+ `content` is `null` when no override is set (default = no banner).
+ - `PUT /api/admin/welcome-template` with body `{"content": "..."}` — validates
+ Jinja2 syntax and renders against a stub context before persisting.
+ Returns `400` on syntax errors or unknown placeholders.
+ - `DELETE /api/admin/welcome-template` — clears the override; no banner shown.
+ - `POST /api/admin/welcome-template/preview` with body `{"content": "..."}` —
+ renders arbitrary content against the calling admin's live context without
+ persisting. Used by the editor's Preview button.
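
As a rough illustration of the endpoints listed above, the following sketch builds (but does not send) the corresponding HTTP requests with Python's standard library. The base URL, the bearer-token header, and the helper name `build_request` are assumptions made for the example; the source does not say how admin calls are authenticated.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "https://agnes.example.com"  # hypothetical instance URL
ENDPOINT = "/api/admin/welcome-template"


def build_request(method: str, suffix: str = "", content: Optional[str] = None,
                  token: str = "ADMIN_TOKEN") -> Request:
    """Build a request for the banner endpoints without sending it."""
    # Assumption: a bearer token is used; the real auth scheme may differ.
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if content is not None:
        # PUT and preview both take a JSON body of the form {"content": "..."}.
        data = json.dumps({"content": content}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + ENDPOINT + suffix, data=data,
                   headers=headers, method=method)


put_req = build_request("PUT", content="<p>Connect to the VPN first.</p>")
preview_req = build_request("POST", suffix="/preview", content="<p>Hi</p>")
delete_req = build_request("DELETE")
```

Passing any of these objects to `urllib.request.urlopen` would perform the call; a `400` response to the `PUT` would surface as an `HTTPError`.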
The override lives in `system.duckdb` (table `welcome_template`, singleton
-row id=1). Resetting via the UI or `DELETE` simply NULL-s `content` — the
-audit trail (`updated_at`, `updated_by`) is preserved.
+row id=1). The `DELETE` endpoint NULLs `content`; the audit trail
+(`updated_at`, `updated_by`) is preserved.
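
The singleton-row behaviour described above can be sketched as follows. This uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the column types, the `CHECK (id = 1)` constraint, and the helper names are assumptions for illustration only; the actual schema in `system.duckdb` may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for system.duckdb
conn.execute(
    """
    CREATE TABLE welcome_template (
        id INTEGER PRIMARY KEY CHECK (id = 1),  -- assumed singleton guard
        content TEXT,
        updated_at TEXT,
        updated_by TEXT
    )
    """
)


def save_override(conn: sqlite3.Connection, content: str, user: str, when: str) -> None:
    """Store the override in the single row, replacing any previous value."""
    conn.execute(
        "INSERT OR REPLACE INTO welcome_template (id, content, updated_at, updated_by) "
        "VALUES (1, ?, ?, ?)",
        (content, when, user),
    )


def reset_override(conn: sqlite3.Connection) -> None:
    """Clear the banner by NULLing content; the audit columns are left untouched."""
    conn.execute("UPDATE welcome_template SET content = NULL WHERE id = 1")


save_override(conn, "<p>Use the VPN.</p>", "admin@example.com", "2026-01-01T00:00:00Z")
reset_override(conn)
row = conn.execute(
    "SELECT content, updated_at, updated_by FROM welcome_template WHERE id = 1"
).fetchone()
```

After the reset, `row` holds a `NULL` content alongside the earlier `updated_at` and `updated_by` values, which is one reading of "the audit trail is preserved".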
## Template language
-[Jinja2](https://jinja.palletsprojects.com/) with `StrictUndefined`. Any
-typo in a placeholder name raises an error at render time rather than
-silently emitting an empty string. Server returns HTTP 500 with a hint
-pointing at `/admin/agent-prompt`; the admin UI rejects syntax errors AND
-undefined-placeholder errors with HTTP 400 on save (validated by rendering
-the template against a stub context before persisting).
+[Jinja2](https://jinja.palletsprojects.com/) with `autoescape=True` and
+`StrictUndefined`. Autoescape is on because the output is rendered into HTML.
+Any typo in a placeholder name raises an error at PUT validation time rather
+than silently emitting an empty string — the editor reports the error
+immediately so the admin can fix it before saving.
## Available placeholders
-| Placeholder | Type | Source |
+| Placeholder | Type | Notes |
|---|---|---|
| `instance.name` | string | `instance.name` in `instance.yaml` |
| `instance.subtitle` | string | `instance.subtitle` in `instance.yaml` |
-| `server.url` | string | passed by the CLI (`?server_url=` query) |
-| `server.hostname` | string | parsed from `server.url` |
-| `sync_interval` | string | `instance.sync_interval` in `instance.yaml` (default `"1 hour"`) |
-| `data_source.type` | string | `keboola` \| `bigquery` \| `local` |
-| `tables` | list | rows from `table_registry`, each `{name, description, query_mode}` |
-| `metrics.count` | int | total rows in `metric_definitions` |
-| `metrics.categories` | list[str] | distinct categories from `metric_definitions` |
-| `marketplaces` | list | RBAC-filtered for the calling user, each `{slug, name, plugins:[{name}]}` |
-| `user.email` | string | calling user |
-| `user.name` | string | calling user |
-| `user.is_admin` | bool | calling user |
-| `user.groups` | list[str] | calling user's group names |
-| `now` | datetime (UTC, tz-aware) | server time at render |
-| `today` | string (`YYYY-MM-DD`) | server date |
+| `server.url` | string | Full server URL at render time |
+| `server.hostname` | string | Host part only |
+| `user` | object or `null` | `null` for anonymous `/setup` visitors |
+| `user.id` | string | Authenticated user ID |
+| `user.email` | string | Authenticated user email |
+| `user.name` | string | Authenticated user display name |
+| `user.is_admin` | bool | Whether the user is in the Admin group |
+| `user.groups` | list[str] | User's group names |
+| `now` | datetime (UTC, tz-aware) | Server time at render |
+| `today` | string (`YYYY-MM-DD`) | Server date |
-> **Timezone caveat:** `now` is tz-aware UTC, while DB-sourced timestamps elsewhere in the codebase are naive (DuckDB stores `TIMESTAMP`, not `TIMESTAMPTZ`). Don't subtract or compare `now` with naive timestamps inside templates without normalising first.
+**Anonymous visitors:** `user` is `null` on `/setup` when the visitor is not
+signed in. Guard any user-specific content with `{% if user %}…{% endif %}`.
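
To make the shape of the render context concrete, here is a minimal sketch of how the placeholders in the table might be assembled, assuming a hypothetical helper `build_banner_context`. The real renderer is not shown in this change, so the function name and argument layout are illustrative only.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlparse


def build_banner_context(server_url: str, instance: dict,
                         user: Optional[dict] = None,
                         now: Optional[datetime] = None) -> dict:
    """Assemble the placeholder values documented in the table above."""
    now = now or datetime.now(timezone.utc)  # tz-aware UTC, as documented
    return {
        "instance": {
            "name": instance.get("name", ""),
            "subtitle": instance.get("subtitle", ""),
        },
        "server": {
            "url": server_url,
            # Host part only: scheme, port, and path are dropped.
            "hostname": urlparse(server_url).hostname or "",
        },
        "user": user,  # None for anonymous /setup visitors
        "now": now,
        "today": now.strftime("%Y-%m-%d"),
    }


ctx = build_banner_context(
    "https://agnes.example.com:8443/setup",
    {"name": "Agnes", "subtitle": "Example Org"},
    now=datetime(2026, 1, 2, 3, 4, tzinfo=timezone.utc),
)
```

Because `user` is `None` here, a template that reads `user.email` without the `{% if user %}` guard would fail for this visitor.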
-## RBAC
+## Security
-`marketplaces` is filtered through `src.marketplace_filter.resolve_allowed_plugins`
-— the same logic that gates `/marketplace.zip`. Two analysts with different
-group memberships will see different plugin lists in their `CLAUDE.md`.
+Output is HTML-sanitised after the Jinja2 render as a defence-in-depth measure:
-> **Admin self-view caveat:** `Admin` group is treated like any other group for marketplace filtering — there is no god-mode shortcut. An admin viewing the editor's Preview will see an empty `marketplaces` list unless the admin's groups have plugin grants. To populate the list, grant plugins to the `Admin` group (or any group the admin is a member of).
+- `<script>` blocks are stripped.
+- `` elements are stripped.
+- `on*=` event handler attributes (e.g. `onclick=`, `onload=`) are stripped.
+- `javascript:` and `data:` URI schemes in `href`/`src`/`action` attributes
+ are replaced with `#`.
-## Example: minimal override
+Admins are trusted, but this prevents accidental XSS from copy-pasted snippets
+reaching the public `/setup` page.
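
The post-render pass described above could look roughly like the sketch below. It is a deliberately naive, regex-based illustration covering only three of the listed rules (script blocks, `on*=` attributes, and `javascript:`/`data:` URIs). The function name `sanitise_banner` and the patterns are assumptions, not the project's actual sanitizer, and regexes are not a robust way to sanitise untrusted HTML.

```python
import re

# <script ...> ... </script> blocks, including their contents.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)
# Attributes such as onclick="..." or onload='...' (quoted or bare values).
EVENT_ATTR = re.compile(r"""\s+on\w+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)
# Quoted href/src/action values that start with javascript: or data:.
UNSAFE_URI = re.compile(
    r"""\b(href|src|action)\s*=\s*(["'])\s*(?:javascript|data):.*?\2""",
    re.IGNORECASE | re.DOTALL,
)


def sanitise_banner(html: str) -> str:
    """Apply the three illustrative rules in order and return the result."""
    html = SCRIPT_BLOCK.sub("", html)
    html = EVENT_ATTR.sub("", html)
    # Keep the attribute and its quote style, but point it at "#".
    return UNSAFE_URI.sub(r"\1=\2#\2", html)


dirty = '<p onclick="x()">hi</p><script>alert(1)</script><a href="javascript:alert(1)">go</a>'
clean = sanitise_banner(dirty)
```

A sketch like this can also mangle ordinary text that happens to resemble an attribute, which is one reason a production sanitizer would normally parse the HTML instead of pattern-matching it.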
-```jinja2
-# {{ instance.name }}
+## Example: VPN and support banner
-This workspace is connected to {{ server.url }}.
-You have access to {{ tables | length }} dataset(s):
-{% for t in tables %}
-- `{{ t.name }}`{% if t.description %}: {{ t.description }}{% endif %}
-{%- endfor %}
+```html
+Before you start: This server is on the corporate VPN.
+Connect to vpn.example.com before running the install command.
+{% if user %}
+ Signed in as {{ user.email }} —
+ open a ticket if you need help.
+{% endif %}
```
-## Falling back to the default
+## Resetting to no banner
-Click **Reset to default** in the admin UI or `DELETE
-/api/admin/welcome-template`. The shipped default is always available as
-`response.default` in the GET endpoint, so admins can copy-paste it into
-the editor as a starting point for a new override.
-
-## Older-server compatibility
-
-The CLI (`da analyst setup`) tolerates older servers that don't yet
-implement `/api/welcome` — on a 404, it writes a minimal embedded fallback
-`CLAUDE.md` and prints a stderr warning on any other failure mode (5xx,
-network, auth). Upgrade the server to get the full feature.
+Click **Reset to default** in the admin UI, or call
+`DELETE /api/admin/welcome-template`. The `/setup` page will show only the
+standard install steps with no banner above them.