feat(admin-prompt): update editor UX + docs for banner context

- admin_welcome.html: update subtitle, description, placeholder cheatsheet
  (drop tables/metrics/marketplaces/sync_interval; add user-null note and
  security note). Textarea initial value is now empty (no default template
  to show). Preview pane uses innerHTML (HTML output). refreshStatus sets
  editor to empty when no override. Preview pane styled as light surface.
  Reset modal copy updated (no banner shown, not "OSS-shipped template").
- config/claude_md_template.txt: deleted (markdown template is gone;
  default is now no banner).
- docs/agent-setup-prompt.md: rewritten for variant C — describes the
  /setup banner, smaller placeholder table, security/sanitization notes,
  anonymous-user guard, example HTML snippet.
ZdenekSrotyr 2026-05-02 22:18:21 +02:00
parent 8db4c1645b
commit c4d23cf235
3 changed files with 85 additions and 285 deletions


@ -116,17 +116,16 @@
.welcome-preview-col {
border: 1px solid var(--border, #e5e7eb);
border-radius: 8px;
background: #1e1e2e;
color: #cdd6f4;
background: var(--surface, #fff);
color: var(--text-primary, #111827);
padding: 16px;
font-family: var(--font-mono, ui-monospace, "SF Mono", Menlo, monospace);
font-size: 13px;
white-space: pre-wrap;
font-family: var(--font-primary, system-ui, sans-serif);
font-size: 14px;
overflow: auto;
max-height: 600px;
}
.welcome-preview-col h4 {
color: #cdd6f4; margin: 0 0 8px; font-size: 12px; opacity: 0.7;
color: var(--text-secondary, #6b7280); margin: 0 0 8px; font-size: 12px;
text-transform: uppercase; letter-spacing: 0.5px;
}
.welcome-preview-error {
@ -203,7 +202,7 @@
<div class="welcome-toolbar">
<div>
<h2 class="welcome-title">Agent Setup Prompt</h2>
<p class="welcome-sub">Customise the <code>CLAUDE.md</code> generated for analysts on <code>da analyst setup</code>.</p>
<p class="welcome-sub">Customise the banner shown above the setup commands on <code>/setup</code>.</p>
</div>
<div id="status-chip">
{% if is_override %}
@ -220,33 +219,32 @@
<div class="welcome-card">
<div class="welcome-card-body">
<p class="welcome-desc">
Edit the template below to customise onboarding instructions for analysts on this instance.
Leave empty or click <strong>Reset to default</strong> to revert to the OSS-shipped template.
The override is rendered server-side — placeholders like
<code>{{ "{{ user.name }}" }}</code> are substituted at delivery time.
This banner is shown above the setup commands on <code>/setup</code>. Empty by default.
Use it for organisation-specific notes: VPN requirements, support channel, data classification
policy, platform onboarding steps, etc.
The override is rendered server-side as HTML — Jinja2 placeholders like
<code>{{ "{{ user.name }}" }}</code> are substituted at render time.
Output is sanitised post-render: inline <code>&lt;script&gt;</code> tags and
<code>on*=</code> event handlers are stripped as a safety net.
</p>
<details class="welcome-cheatsheet">
<summary>Available placeholders</summary>
<div class="code-block">
<span id="placeholder-text" class="code-body">{{ "{{ instance.name }}" }} — instance display name
{{ "{{ instance.subtitle }}" }} — operator name
{{ "{{ server.url }}" }} — full server URL
{{ "{{ server.hostname }}" }} — host part
{{ "{{ sync_interval }}" }} — refresh cadence (instance.yaml)
{{ "{{ data_source.type }}" }} — keboola | bigquery | local
{{ "{{ tables }}" }} — list of {name, description, query_mode}
{{ "{{ metrics.count }}" }}, {{ "{{ metrics.categories }}" }}
{{ "{{ marketplaces }}" }} — RBAC-filtered list of {slug, name, plugins[]}
<span id="placeholder-text" class="code-body">{{ "{{ instance.name }}" }} — instance display name
{{ "{{ instance.subtitle }}" }} — operator / org name
{{ "{{ server.url }}" }} — full server URL
{{ "{{ server.hostname }}" }} — host part only
{{ "{{ user.email }}" }}, {{ "{{ user.name }}" }}, {{ "{{ user.is_admin }}" }}, {{ "{{ user.groups }}" }}
{{ "{{ now }}" }}, {{ "{{ today }}" }}</span>
(user may be null for anonymous visitors — guard with {{ "{% if user %}" }})
{{ "{{ now }}" }}, {{ "{{ today }}" }} — server time (UTC) / date string</span>
<button class="btn-copy" data-copy-target="placeholder-text">Copy</button>
</div>
</details>
<div class="welcome-editor-row">
<div class="welcome-editor-col">
<textarea id="content" name="content">{{ current or default_template }}</textarea>
<textarea id="content" name="content">{{ current }}</textarea>
</div>
<div class="welcome-preview-col">
<h4>Live preview</h4>
@ -266,7 +264,7 @@
<div class="modal-backdrop" id="reset-modal" role="dialog" aria-modal="true" aria-labelledby="reset-modal-title">
<div class="modal-card">
<h3 id="reset-modal-title">Reset to default?</h3>
<p class="sub">Your override will be permanently removed. The OSS-shipped template will be used instead. This cannot be undone.</p>
<p class="sub">Your override will be permanently removed. No banner will be shown on <code>/setup</code>. This cannot be undone.</p>
<div class="modal-actions">
<button class="modal-btn" data-close-modal="reset-modal">Cancel</button>
<button class="modal-btn danger" id="reset-confirm-btn">Reset</button>
@ -338,7 +336,7 @@ async function renderPreview() {
});
if (r.ok) {
const j = await r.json();
previewBox.textContent = j.content;
previewBox.innerHTML = j.content;
previewErr.hidden = true;
} else {
let detail = r.statusText;
@ -398,7 +396,7 @@ async function refreshStatus() {
if (!r.ok) return;
const data = await r.json();
setStatusChip(data);
editor.setValue(data.content !== null ? data.content : data.default);
editor.setValue(data.content !== null ? data.content : "");
renderPreview();
}


@ -1,195 +0,0 @@
{# Default analyst-onboarding welcome prompt for "da analyst setup".
Rendered server-side by src/welcome_template.py. Edit this file to change
the OSS default; admins override per-instance via /admin/welcome.
Available context (see docs/welcome-template.md for the full reference):
instance.name, instance.subtitle
server.url, server.hostname
sync_interval — string from instance.yaml
data_source.type — keboola | bigquery | local
tables — list of {name, description, query_mode}
metrics.count, metrics.categories
marketplaces — list of {slug, name, plugins:[name]}
user.email, user.name, user.is_admin, user.groups
now, today — datetime / date string
#}
# {{ instance.name }} — AI Data Analyst
This workspace is connected to {{ server.url }}.
{% if instance.subtitle %}Operated by **{{ instance.subtitle }}**.{% endif %}
## Rules
- Before computing any business metric: run `da metrics show <category>/<name>`
- **For canonical table list with query modes: `da catalog`.** `data/metadata/schema.json` covers `query_mode: "local"` tables only — for remote/hybrid tables it's incomplete. Treat `da catalog` as source of truth.
- Do not use DESCRIBE/SHOW COLUMNS — use `da schema <table>` instead
- Save work output to `user/artifacts/`
- Sync data regularly with `da sync`
- **Personal customizations go in `.claude/CLAUDE.local.md`, NOT here.** This file is regenerated by `da analyst setup --force`; edits here will be lost. CLAUDE.local.md is preserved across regeneration and uploaded on `da sync --upload-only`.
## Metrics Workflow
1. `da metrics list` — find the relevant metric ({{ metrics.count }} available, categories: {{ metrics.categories | join(", ") or "none yet" }})
2. `da metrics show <category>/<name>` — read SQL and business rules
3. Use the canonical SQL from the metric definition, adapt to the question
4. Never invent metric calculations — always check existing definitions first
## Data Sync
- `da sync` — download current data from server
- `da sync --docs-only` — just metadata and metrics (fast refresh)
- `da sync --upload-only` — upload sessions and local notes to server
- Data on the server refreshes every {{ sync_interval }}
## Available Datasets
{% for t in tables -%}
- `{{ t.name }}`{% if t.description %} — {{ t.description }}{% endif %}{% if t.query_mode == "remote" %} *(remote, queried on demand)*{% endif %}
{% else -%}
- _No tables registered yet — ask an admin to register tables in the dashboard._
{% endfor %}
{% if marketplaces -%}
## Plugins available to you
{% for mp in marketplaces -%}
- **{{ mp.name }}** ({{ mp.slug }}): {{ mp.plugins | map(attribute="name") | join(", ") }}
{% endfor %}
{% endif -%}
## Remote Queries (BigQuery) — when data isn't on the laptop
Not every table is synced. Tables registered with `query_mode: "remote"` live in
BigQuery, accessed server-side via DuckDB's BQ extension — no parquet on disk.
Tables you don't see in `data/parquet/` may still be queryable.
### Discovery first
```
da catalog --json | jq '.[] | {name, source_type, query_mode}' # see all tables + their modes
da schema <table> # columns + types
da describe <table> -n 5 # sample rows
```
For local-mode tables, query directly with `da query "SELECT … FROM <table>"`.
### Three patterns for `query_mode: "remote"` tables
| Pattern | Tool | Use when |
|---------|------|----------|
| **`da fetch`** (preferred) | materializes a filtered subset locally → query the snapshot | repeated questions on same slice |
| **`da query --remote`** | one-shot, server-side execution against BigQuery | single aggregate / cheap probe |
| **`da query --register-bq`** | hybrid joins between local snapshots and ad-hoc BQ subqueries | crossing local + remote |
### Permission model + cost — important
- BQ access goes through the **agnes server's GCE service account**, not your personal Google credentials. If a query fails with a permission error, the table is in a project the server SA cannot read — escalate to admin, do NOT try to authenticate yourself.
- Every BQ query bills the SA's GCP project for **bytes scanned**. A naive `SELECT * FROM <large_table>` can cost real money. ALWAYS:
- filter via `--where` on the partition column (typically a date)
- list specific columns in `--select` — column-store BQ skips the rest, cheaper
- run `--estimate` first when unsure of the table size or partitioning
### `da fetch` discipline
```
# 1. ESTIMATE first — refuses to fetch without knowing the cost
da fetch <table> --select col1,col2 --where "date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)" --estimate
# 2. If reasonable, fetch as a named snapshot
da fetch <table> --select col1,col2 --where "..." --as my_recent
# 3. Query the local snapshot
da query "SELECT col1, COUNT(*) FROM my_recent GROUP BY 1"
# 4. List + drop snapshots when done
da snapshot list
da snapshot drop my_recent
```
Rules of thumb:
- ALWAYS list specific columns in `--select`. Avoid implicit SELECT *.
- ALWAYS include a `--where` for remote tables; otherwise add `--limit`.
- ALWAYS run `--estimate` first when the table is `partition_by` / `clustered_by`
per `da schema`, or could plausibly exceed 1 GB local bytes.
- Reuse snapshots across questions in the same conversation — `da snapshot list`
before fetching.
### Snapshot freshness — when to refresh
Snapshots are point-in-time copies. They go stale as the source data updates (most BQ tables refresh daily; check `sync_schedule` per `da catalog`). For each new conversation:
```
da snapshot list # see existing snapshots + their ages
da snapshot drop my_recent # drop stale ones
da fetch <table> --select ... --where ... --as my_recent # re-fetch
```
If the question is time-sensitive (e.g. "today's orders"), assume any snapshot older than the table's `sync_schedule` is stale and refresh.
### Hybrid query example — local + remote in one query
`da query --register-bq` lets a single SQL statement join a local table with an ad-hoc BQ subquery. The BQ subquery runs first (server-side), result registered as a DuckDB view, then the joined query runs locally.
```
da query \
--register-bq "traffic=SELECT date, country, SUM(views) AS views \
FROM \`prj.web_analytics.sessions\` \
WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) \
GROUP BY 1, 2" \
--sql "SELECT o.date, o.country, o.revenue, t.views, o.revenue / NULLIF(t.views,0) AS rev_per_view \
FROM orders o \
JOIN traffic t ON o.date = t.date AND o.country = t.country \
ORDER BY 1 DESC"
```
The BQ subquery MUST contain `WHERE` and/or `GROUP BY` to keep the registered result manageable (target: under 500K rows, well under 100 MB). Multiple `--register-bq` flags can compose multiple BQ sources. For complex SQL, use `--stdin` mode (`echo '{"register_bq":{...},"sql":"..."}' | da query --stdin`).
### BigQuery SQL flavor for `--where`
Source-typed `bigquery` tables use BigQuery dialect, not DuckDB:
- Date literal: `DATE '2026-01-01'`
- Timestamp literal: `TIMESTAMP '2026-01-01 00:00:00 UTC'`
- Now: `CURRENT_DATE()`, `CURRENT_TIMESTAMP()`
- Date arithmetic: `DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)`
- Regex: `REGEXP_CONTAINS(col, r'pattern')` (raw string!)
- Cast: `CAST(x AS INT64)` (NOT `INT`)
### When the table you want isn't in `da catalog`
The table may exist in BigQuery but not be registered with Agnes yet. Two options:
1. **Ad-hoc one-shot** — register a BQ subquery as a view inline, no admin needed
if the agnes server SA has BQ access:
```
da query --register-bq "live=SELECT * FROM \`project.dataset.table\` WHERE date >= '...' LIMIT 1000" \
--sql "SELECT * FROM live"
```
2. **Ask admin to register** the table with `query_mode: "remote"` so it shows up
in `da catalog` and supports `da fetch` / `da query --remote`. This is the
right path for any table you'll query repeatedly.
### Deeper guidance
For the full protocol, including hybrid-query examples, snapshot hygiene, and
when NOT to use `da fetch`, run:
```
da skills show agnes-data-querying
```
## Corporate Memory
Rules injected by `da sync` from the server's corporate knowledge base live in `.claude/rules/km_*.md`. They are automatically loaded by Claude Code on every session start.
- `km_<id>.md` — mandatory rules (always enforced)
- `km_approved.md` — approved guidance (confidence × recency ranked)
Run `da sync` to refresh. Rules are pruned automatically when items are revoked.
## Directory Structure
- `data/` — read-only data downloaded from server
- `data/parquet/` — table data in Parquet format
- `data/duckdb/` — local analytics DuckDB database
- `data/metadata/` — profiles, schema, metrics cache
- `user/` — your workspace (persistent across syncs)
- `user/artifacts/` — analysis outputs, reports, charts
- `user/sessions/` — Claude Code session logs
- `.claude/CLAUDE.local.md` — your personal notes + workspace customizations. **Never overwritten by `da analyst setup --force`.** Uploaded to the server on `da sync --upload-only`. Put any local-only Claude instructions, project-specific reminders, or temporary notes here — NOT in CLAUDE.md (this file is regenerated from a template).
_Hello {{ user.name or user.email }} — generated {{ today }}._


@ -1,94 +1,91 @@
# Agent Setup Prompt
The agent setup prompt is the `CLAUDE.md` file generated in an analyst's local
workspace by `da analyst setup`. It instructs Claude Code on how to behave in
that workspace — which commands to use, where to read schema metadata, what
metrics exist, what plugins are available.
The agent setup prompt is an HTML banner shown **above the bash setup commands**
on the `/setup` page. It is intended for organisation-specific operational notes
that every new analyst should read before running the bootstrap script —
for example: VPN requirements, support channel, data classification reminder,
or platform-specific prerequisites.
## Defaults
## Default behaviour
The OSS distribution ships a generic welcome prompt at
`config/claude_md_template.txt`. Every Agnes instance starts with this default;
no admin action is required.
No banner is shown by default. The `/setup` page renders only the standard
install steps until an admin configures an override.
## Customizing per instance
## Customising per instance
Admins can override the template via:
Admins configure the banner via:
- **Admin UI:** `/admin/agent-prompt`, a textarea editor with placeholder cheatsheet
and live preview button. Save sends a `PUT` to `/api/admin/welcome-template`.
- **Admin UI:** `/admin/agent-prompt`, a Jinja2 HTML editor with a placeholder
cheatsheet, live preview, and save/reset actions.
- **REST API:**
- `GET /api/admin/welcome-template` — returns `{content, default, updated_at, updated_by}`. `content` is `null` when no override is set.
- `PUT /api/admin/welcome-template` with body `{"content": "..."}` — validates Jinja2 syntax, stores the override.
- `DELETE /api/admin/welcome-template` — clears the override; renderer falls back to the shipped default.
- `POST /api/admin/welcome-template/preview` with body `{"content": "..."}` — renders arbitrary content against the calling admin's live context without persisting. Used by the editor's Preview button.
- `GET /api/admin/welcome-template` — returns `{content, updated_at, updated_by}`.
`content` is `null` when no override is set (default = no banner).
- `PUT /api/admin/welcome-template` with body `{"content": "..."}` — validates
Jinja2 syntax and renders against a stub context before persisting.
Returns `400` on syntax errors or unknown placeholders.
- `DELETE /api/admin/welcome-template` — clears the override; no banner shown.
- `POST /api/admin/welcome-template/preview` with body `{"content": "..."}`
renders arbitrary content against the calling admin's live context without
persisting. Used by the editor's Preview button.
The override lives in `system.duckdb` (table `welcome_template`, singleton
row id=1). Resetting via the UI or `DELETE` simply NULL-s `content` — the
audit trail (`updated_at`, `updated_by`) is preserved.
row id=1). The `DELETE` endpoint NULLs `content`; the audit trail
(`updated_at`, `updated_by`) is preserved.
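
The REST endpoints listed above can also be driven from a script. The following is a minimal sketch using only the Python standard library; it builds the requests without sending them. The endpoint path and the `{"content": "..."}` body come from this document, while `build_request`, the `agnes.example.com` host, and the bearer-token header are hypothetical (the docs do not say how admin calls are authenticated).

```python
import json
from urllib.request import Request

# Endpoint path taken from the docs above; everything else is illustrative.
ENDPOINT = "/api/admin/welcome-template"


def build_request(base_url, method, content=None, preview=False, token=None):
    """Build (but do not send) a request for the welcome-template admin API."""
    url = base_url.rstrip("/") + ENDPOINT + ("/preview" if preview else "")
    headers = {}
    body = None
    if content is not None:
        # PUT and POST /preview both take a JSON body of the form {"content": "..."}.
        body = json.dumps({"content": content}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        # ASSUMPTION: bearer-token auth; the real auth scheme is not documented here.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, data=body, headers=headers, method=method)


banner = "<strong>Connect to the VPN first.</strong>"
base = "https://agnes.example.com"  # hypothetical host

save = build_request(base, "PUT", content=banner)                     # store the override
preview = build_request(base, "POST", content=banner, preview=True)   # render only
reset = build_request(base, "DELETE")                                 # back to no banner
```

Passing one of these objects to `urllib.request.urlopen` would send it; a `400` from the `PUT` surfaces as `urllib.error.HTTPError` and indicates a syntax error or unknown placeholder.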
## Template language
[Jinja2](https://jinja.palletsprojects.com/) with `StrictUndefined`. Any
typo in a placeholder name raises an error at render time rather than
silently emitting an empty string. Server returns HTTP 500 with a hint
pointing at `/admin/agent-prompt`; the admin UI rejects syntax errors AND
undefined-placeholder errors with HTTP 400 on save (validated by rendering
the template against a stub context before persisting).
[Jinja2](https://jinja.palletsprojects.com/) with `autoescape=True` and
`StrictUndefined`. Autoescape is on because the output is rendered into HTML.
Any typo in a placeholder name raises an error at PUT validation time rather
than silently emitting an empty string — the editor reports the error
immediately so the admin can fix it before saving.
## Available placeholders
| Placeholder | Type | Source |
| Placeholder | Type | Notes |
|---|---|---|
| `instance.name` | string | `instance.name` in `instance.yaml` |
| `instance.subtitle` | string | `instance.subtitle` in `instance.yaml` |
| `server.url` | string | passed by the CLI (`?server_url=` query) |
| `server.hostname` | string | parsed from `server.url` |
| `sync_interval` | string | `instance.sync_interval` in `instance.yaml` (default `"1 hour"`) |
| `data_source.type` | string | `keboola` \| `bigquery` \| `local` |
| `tables` | list | rows from `table_registry`, each `{name, description, query_mode}` |
| `metrics.count` | int | total rows in `metric_definitions` |
| `metrics.categories` | list[str] | distinct categories from `metric_definitions` |
| `marketplaces` | list | RBAC-filtered for the calling user, each `{slug, name, plugins:[{name}]}` |
| `user.email` | string | calling user |
| `user.name` | string | calling user |
| `user.is_admin` | bool | calling user |
| `user.groups` | list[str] | calling user's group names |
| `now` | datetime (UTC, tz-aware) | server time at render |
| `today` | string (`YYYY-MM-DD`) | server date |
| `server.url` | string | Full server URL at render time |
| `server.hostname` | string | Host part only |
| `user` | object or `null` | `null` for anonymous `/setup` visitors |
| `user.id` | string | Authenticated user ID |
| `user.email` | string | Authenticated user email |
| `user.name` | string | Authenticated user display name |
| `user.is_admin` | bool | Whether the user is in the Admin group |
| `user.groups` | list[str] | User's group names |
| `now` | datetime (UTC, tz-aware) | Server time at render |
| `today` | string (`YYYY-MM-DD`) | Server date |
> **Timezone caveat:** `now` is tz-aware UTC, while DB-sourced timestamps elsewhere in the codebase are naive (DuckDB stores `TIMESTAMP`, not `TIMESTAMPTZ`). Don't subtract or compare `now` with naive timestamps inside templates without normalising first.
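
The mismatch described in the timezone caveat can be reproduced in plain Python. This is a sketch only: the timestamp values are made up, and it assumes the naive database value represents UTC, which the caveat does not state.

```python
from datetime import datetime, timezone

# Fixed values for illustration (hypothetical, not taken from any real row).
now = datetime(2026, 5, 2, 22, 0, 0, tzinfo=timezone.utc)  # tz-aware, like `now`
db_ts = datetime(2026, 5, 2, 20, 18, 21)                   # naive, like a DuckDB TIMESTAMP

# Mixing aware and naive datetimes in arithmetic raises TypeError.
try:
    now - db_ts
    mixed_ok = True
except TypeError:
    mixed_ok = False

# ASSUMPTION: the naive value is in UTC, so attaching tzinfo normalises it.
db_ts_utc = db_ts.replace(tzinfo=timezone.utc)
age = now - db_ts_utc  # a timedelta; both operands are now tz-aware
```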
**Anonymous visitors:** `user` is `null` on `/setup` when the visitor is not
signed in. Guard any user-specific content with `{% if user %}…{% endif %}`.
## RBAC
## Security
`marketplaces` is filtered through `src.marketplace_filter.resolve_allowed_plugins`
— the same logic that gates `/marketplace.zip`. Two analysts with different
group memberships will see different plugin lists in their `CLAUDE.md`.
Output is HTML-sanitized after Jinja2 render as a defense-in-depth measure:
> **Admin self-view caveat:** `Admin` group is treated like any other group for marketplace filtering — there is no god-mode shortcut. An admin viewing the editor's Preview will see an empty `marketplaces` list unless the admin's groups have plugin grants. To populate the list, grant plugins to the `Admin` group (or any group the admin is a member of).
- `<script>…</script>` blocks are stripped.
- `<iframe>…</iframe>` elements are stripped.
- `on*=` event handler attributes (e.g. `onclick=`, `onload=`) are stripped.
- `javascript:` and `data:` URI schemes in `href`/`src`/`action` attributes
are replaced with `#`.
## Example: minimal override
Admins are trusted, but this prevents accidental XSS from copy-pasted snippets
reaching the public `/setup` page.
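
To make the four rules above concrete, here is a minimal regex-based sketch in Python. The function name `sanitize_banner` and all patterns are hypothetical; this commit does not show the actual sanitizer, and a production implementation would more likely rely on an HTML parser or an allow-list library than on regular expressions.

```python
import re

# ASSUMPTION: a regex approximation of the documented rules, for illustration only.
# Rule 1 and 2: drop whole <script>...</script> and <iframe>...</iframe> elements.
_BLOCKS = re.compile(r"<(script|iframe)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
# Rule 3: drop on*= attributes with double-quoted, single-quoted, or bare values.
_ON_ATTR = re.compile(r"""\s+on[a-z]+\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)
# Rule 4: replace javascript: and data: URIs in quoted href/src/action values.
_BAD_URI = re.compile(
    r"""\b(href|src|action)\s*=\s*(["'])\s*(?:javascript|data):[^"']*\2""",
    re.IGNORECASE,
)


def sanitize_banner(html: str) -> str:
    """Apply the four documented stripping rules to already-rendered HTML."""
    html = _BLOCKS.sub("", html)
    html = _ON_ATTR.sub("", html)
    html = _BAD_URI.sub(r"\1=\2#\2", html)
    return html


cleaned = sanitize_banner(
    '<p onclick="x()">Read the <a href="javascript:alert(1)">policy</a>.</p>'
    "<script>alert(1)</script>"
)
```

Because the patterns work on raw text, they can also match look-alike strings in ordinary prose (for example ` one=1`), which is one reason to treat this as a safety net rather than a complete defence.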
```jinja2
# {{ instance.name }}
## Example: VPN and support banner
This workspace is connected to {{ server.url }}.
You have access to {{ tables | length }} dataset(s):
{% for t in tables %}
- `{{ t.name }}`{% if t.description %}: {{ t.description }}{% endif %}
{%- endfor %}
```html
<strong>Before you start:</strong> This server is on the corporate VPN.
Connect to <code>vpn.example.com</code> before running the install command.
{% if user %}
<br>Signed in as <strong>{{ user.email }}</strong>.
<a href="https://support.example.com">Open a ticket</a> if you need help.
{% endif %}
```
## Falling back to the default
## Resetting to no banner
Click **Reset to default** in the admin UI or `DELETE
/api/admin/welcome-template`. The shipped default is always available as
`response.default` in the GET endpoint, so admins can copy-paste it into
the editor as a starting point for a new override.
## Older-server compatibility
The CLI (`da analyst setup`) tolerates older servers that don't yet
implement `/api/welcome` — on a 404, it writes a minimal embedded fallback
`CLAUDE.md` and prints a stderr warning on any other failure mode (5xx,
network, auth). Upgrade the server to get the full feature.
Click **Reset to default** in the admin UI, or call
`DELETE /api/admin/welcome-template`. The `/setup` page will show only the
standard install steps with no banner above them.