diff --git a/CHANGELOG.md b/CHANGELOG.md index dba8ba3..f410f70 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,18 +10,19 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C ## [Unreleased] -## [0.31.0] — 2026-05-03 +## [0.31.0] — 2026-05-04 ### Added +- **Agent Workspace Prompt** — admin-editable Jinja2 markdown template for the analyst's `CLAUDE.md`, surfaced in their workspace by `da analyst setup`. Default = rich briefing with RBAC-filtered tables/metrics/marketplaces context. Edit at `/admin/workspace-prompt`. Endpoints: `GET /api/welcome` (analyst-facing, auth required), `GET/PUT/DELETE /api/admin/workspace-prompt-template`, `POST /api/admin/workspace-prompt-template/preview`. CLI: `da analyst setup` writes `CLAUDE.md` by default; new `--no-claude-md` flag opts out. See `docs/agent-workspace-prompt.md`. - **Agent Setup Prompt** — customizable bash setup script shown on `/setup` and copied by the dashboard clipboard CTA. Default = the live `setup_instructions.resolve_lines()` output (TLS trust bootstrap, CLI install, login, marketplace, skills). Admin override at `/admin/agent-prompt` — full replacement of the default, not a banner added on top. Override flows to both the `/setup` page display and the dashboard clipboard payload. Jinja2 is available for `{{ instance.name }}` etc.; `{server_url}` and `{token}` are JS-substituted at clipboard-copy time and survive Jinja2 rendering unchanged. REST API: `GET /api/admin/welcome-template` returns `{content, default, updated_at, updated_by}` (`content` is `null` when no override is set; `default` is always the live computed script); `PUT` to set an override; `DELETE` to clear; `POST /api/admin/welcome-template/preview` for live preview without persisting. Available Jinja2 placeholders: `instance.{name,subtitle}`, `server.{url,hostname}`, `user` (may be `null` for anonymous visitors), `now`, `today`. Override content is HTML-sanitized post-render (script/iframe/event-handler strip). 
See `docs/agent-setup-prompt.md`. -- DuckDB schema v21: `welcome_template` singleton table backing the banner override. Auto-migration v20→v21 on first start. +- DuckDB schema v21: `welcome_template` singleton table backing the Agent Setup Prompt override. Auto-migration v20→v21 on first start. - DuckDB schema v22: `setup_banner` table reserved (no consumers; retained for forward compatibility with already-migrated instances). +- DuckDB schema v23: `claude_md_template` singleton table backing the Agent Workspace Prompt override. Auto-migration v22→v23. ### Changed -- **BREAKING (CLI):** `da analyst setup` no longer generates a `CLAUDE.md` file in the analyst workspace. Workspace-context customisation is handled via the `/setup` page banner instead. Existing analysts with a server-generated `CLAUDE.md` may delete it manually if desired. -- **BREAKING (API):** `GET /api/welcome` removed. The endpoint was internal-only (consumed only by the CLI's now-removed `CLAUDE.md` generation step). +- `da analyst setup` writes `CLAUDE.md` to the analyst workspace from the server-rendered template (fetched via `GET /api/welcome`). Use `--no-claude-md` to opt out. Analysts who ran setup while CLAUDE.md generation was temporarily absent will have their file written on the next `da analyst setup` run. - `/install` page renamed to `/setup` ("Setup local agent" nav label) with 302 redirect from `/install`. - Dashboard "What Claude Code will receive" inline preview replaced with a link to `/setup` for the canonical view. 
diff --git a/CLAUDE.md b/CLAUDE.md index 8fe2f1a..f16c162 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -430,7 +430,7 @@ Module sets `lifecycle { ignore_changes = [metadata_startup_script] }` on `googl ## Key Implementation Details ### DuckDB Schema (src/db.py) -- Schema v22 with auto-migration v1→…→v22 (v5 adds `users.active`, v6 adds `personal_access_tokens`, v7 adds `personal_access_tokens.last_used_ip`, v8/v9 added the legacy internal_roles/role-grants tables, v10 added `view_ownership` for cross-connector view-name collision detection (issue #81 Group C), v11 added marketplace_registry + marketplace_plugins + user_groups + plugin_access, v12 added users.groups JSON + user_groups.is_system, **v13 replaces internal_roles/group_mappings/user_role_grants/plugin_access with user_group_members + resource_grants and drops users.groups JSON**, v14 adds FK constraints on user_group_members + resource_grants after orphan cleanup, v15 adds knowledge_items context-engineering columns + contradictions + session_extraction_state, v16 adds verification_evidence, v17 adds knowledge_item_relations, v18 drops stranded non-google memberships from google-managed groups, **v19 drops legacy `dataset_permissions`, `access_requests` tables and `users.role`, `table_registry.is_public` columns — table access is now exclusively per-group via `resource_grants(resource_type='table')`**, **v20 adds `source_query` TEXT to `table_registry` to back `query_mode='materialized'` (BigQuery scheduled-query parquet path)**, **v21 adds `welcome_template` singleton table backing the Agent Setup Prompt admin override (`/admin/agent-prompt`)**, **v22 reserves the `setup_banner` table — feature dropped mid-development; table retained for forward compatibility with already-migrated instances** — see CHANGELOG and docs/RBAC.md) +- Schema v23 with auto-migration v1→…→v23 (v5 adds `users.active`, v6 adds `personal_access_tokens`, v7 adds `personal_access_tokens.last_used_ip`, v8/v9 added the legacy 
internal_roles/role-grants tables, v10 added `view_ownership` for cross-connector view-name collision detection (issue #81 Group C), v11 added marketplace_registry + marketplace_plugins + user_groups + plugin_access, v12 added users.groups JSON + user_groups.is_system, **v13 replaces internal_roles/group_mappings/user_role_grants/plugin_access with user_group_members + resource_grants and drops users.groups JSON**, v14 adds FK constraints on user_group_members + resource_grants after orphan cleanup, v15 adds knowledge_items context-engineering columns + contradictions + session_extraction_state, v16 adds verification_evidence, v17 adds knowledge_item_relations, v18 drops stranded non-google memberships from google-managed groups, **v19 drops legacy `dataset_permissions`, `access_requests` tables and `users.role`, `table_registry.is_public` columns — table access is now exclusively per-group via `resource_grants(resource_type='table')`**, **v20 adds `source_query` TEXT to `table_registry` to back `query_mode='materialized'` (BigQuery scheduled-query parquet path)**, **v21 adds `welcome_template` singleton table backing the Agent Setup Prompt admin override (`/admin/agent-prompt`)**, **v22 reserves the `setup_banner` table — feature dropped mid-development; table retained for forward compatibility with already-migrated instances**, **v23 adds `claude_md_template` singleton table backing the Agent Workspace Prompt admin override (`/admin/workspace-prompt`)** — see CHANGELOG and docs/RBAC.md) - `table_registry`: id, name, source_type, bucket, source_table, query_mode, sync_schedule, etc. - `sync_state`, `sync_history`: track extraction progress - `users`, `audit_log`: account state + audit trail. RBAC lives in `user_groups` + `user_group_members` + `resource_grants`. 
diff --git a/app/api/claude_md.py b/app/api/claude_md.py new file mode 100644 index 0000000..dcb9d82 --- /dev/null +++ b/app/api/claude_md.py @@ -0,0 +1,206 @@ +"""REST endpoints for the agent-workspace-prompt (analyst CLAUDE.md). + +- GET /api/welcome : analyst-facing rendered CLAUDE.md (auth required) +- GET /api/admin/workspace-prompt-template : raw template override + live default (admin) +- PUT /api/admin/workspace-prompt-template : set override (admin) +- DELETE /api/admin/workspace-prompt-template : reset to default (admin) +- POST /api/admin/workspace-prompt-template/preview : live preview without persisting (admin) +""" + +import datetime +import logging +from typing import Optional +from urllib.parse import unquote + +import duckdb +from fastapi import APIRouter, Depends, HTTPException, Query, Request, Response +from jinja2 import Environment, StrictUndefined, TemplateError +from pydantic import BaseModel, Field + +from app.auth.access import require_admin +from app.auth.dependencies import _get_db, get_current_user +from src.repositories.claude_md_template import ClaudeMdTemplateRepository +from src.claude_md import build_claude_md_context, compute_default_claude_md, render_claude_md + +logger = logging.getLogger(__name__) + + +router = APIRouter(tags=["claude_md"]) + +# Stub context used to validate that a saved template renders end-to-end, +# not just that it parses. Mirrors the shape of build_claude_md_context() output. +# user is an authenticated user so templates that reference user.* are validated. 
+_VALIDATION_STUB_CONTEXT = { + "instance": {"name": "Example", "subtitle": "Example Org"}, + "server": {"url": "https://example.com", "hostname": "example.com"}, + "sync_interval": "1h", + "data_source": {"type": "keboola"}, + "tables": [{"name": "orders", "description": "Sample orders", "query_mode": "local"}], + "metrics": {"count": 3, "categories": ["revenue", "growth"]}, + "marketplaces": [{"slug": "example", "name": "Example Marketplace", "plugins": [{"name": "plugin-a"}]}], + "user": { + "id": "u", + "email": "user@example.com", + "name": "User", + "is_admin": False, + "groups": ["Everyone"], + }, + "now": datetime.datetime(2026, 1, 1, tzinfo=datetime.timezone.utc), + "today": "2026-01-01", +} + +# Same stub with an anonymous-style user context to validate templates against +# the case where a user dict is present but minimal (analyst). The CLAUDE.md +# endpoint always requires auth, so user is never None — but templates may +# accidentally reference fields that aren't in the context. 
+_VALIDATION_STUB_CONTEXT_ANON = { + **{k: v for k, v in _VALIDATION_STUB_CONTEXT.items() if k != "user"}, + "user": { + "id": "u2", + "email": "anon@example.com", + "name": "", + "is_admin": False, + "groups": ["Everyone"], + }, +} + + +class ClaudeMdResponse(BaseModel): + content: str + + +class TemplateGetResponse(BaseModel): + content: Optional[str] + default: str # live default rendered with calling admin's context + updated_at: Optional[str] = None + updated_by: Optional[str] = None + + +class TemplatePutRequest(BaseModel): + content: str = Field(..., min_length=1, max_length=200_000) + + +class TemplatePreviewRequest(BaseModel): + content: str = Field(..., min_length=1, max_length=200_000) + + +# --------------------------------------------------------------------------- +# Analyst-facing endpoint — returns rendered CLAUDE.md +# --------------------------------------------------------------------------- + +@router.get("/api/welcome", response_model=ClaudeMdResponse) +async def get_welcome( + request: Request, + server_url: Optional[str] = Query(None, description="Server URL used in rendered CLAUDE.md"), + user: dict = Depends(get_current_user), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + """Return the rendered CLAUDE.md for the authenticated analyst. + + The CLI calls this endpoint during ``da analyst setup`` to write + ``/CLAUDE.md``. The content is RBAC-filtered per the + calling user. + + ``server_url`` query param lets the CLI pass the origin it knows so + the rendered content references the correct server URL rather than the + request host (which may differ behind a proxy). 
+ """ + effective_url = server_url or str(request.base_url).rstrip("/") + try: + content = render_claude_md(conn, user=user, server_url=effective_url) + except TemplateError as exc: + logger.warning("render_claude_md failed (template error): %s", exc) + raise HTTPException(status_code=500, detail=f"Template render error: {exc}") + except Exception: + logger.exception("render_claude_md failed (unexpected)") + raise HTTPException(status_code=500, detail="Internal error rendering CLAUDE.md") + return ClaudeMdResponse(content=content) + + +# --------------------------------------------------------------------------- +# Admin endpoints — CRUD for the workspace-prompt template override +# --------------------------------------------------------------------------- + +@router.get("/api/admin/workspace-prompt-template", response_model=TemplateGetResponse) +async def admin_get_workspace_template( + request: Request, + user: dict = Depends(require_admin), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + row = ClaudeMdTemplateRepository(conn).get() + server_url = str(request.base_url).rstrip("/") + live_default = compute_default_claude_md(conn, user=user, server_url=server_url) + return TemplateGetResponse( + content=row["content"], + default=live_default, + updated_at=row["updated_at"].isoformat() if row["updated_at"] else None, + updated_by=row["updated_by"], + ) + + +@router.put("/api/admin/workspace-prompt-template") +async def admin_put_workspace_template( + payload: TemplatePutRequest, + user: dict = Depends(require_admin), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + """Save an admin override for the analyst CLAUDE.md template. + + Two-pass Jinja2 validation (autoescape=False, StrictUndefined): + - Pass 1: render with an authenticated user stub — catches undefined + placeholders and syntax errors. + - Pass 2: render with a minimal anon-style user stub — catches templates + that hard-depend on admin-only context fields. 
+ """ + env = Environment(undefined=StrictUndefined, autoescape=False) + try: + template = env.from_string(payload.content) + template.render(**_VALIDATION_STUB_CONTEXT) + except TemplateError as e: + raise HTTPException(status_code=400, detail=f"Template invalid: {e}") + + try: + template.render(**_VALIDATION_STUB_CONTEXT_ANON) + except TemplateError as e: + raise HTTPException( + status_code=400, + detail=( + f"Template fails for non-admin analyst users: {e}. " + # Plain (non-f) string literal: single braces, so the hint shows valid Jinja2. + "Wrap user-dependent expressions in {% if user.is_admin %}...{% endif %} " + "or ensure the template renders correctly for all users." + ), + ) + + ClaudeMdTemplateRepository(conn).set(payload.content, updated_by=user["email"]) + return {"status": "ok"} + + +@router.delete("/api/admin/workspace-prompt-template", status_code=204) +async def admin_reset_workspace_template( + user: dict = Depends(require_admin), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + ClaudeMdTemplateRepository(conn).reset(updated_by=user["email"]) + return Response(status_code=204) + + +@router.post("/api/admin/workspace-prompt-template/preview", response_model=ClaudeMdResponse) +async def admin_preview_workspace_template( + payload: TemplatePreviewRequest, + request: Request, + user: dict = Depends(require_admin), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + """Render arbitrary template content against the live RBAC context for the + calling admin, without persisting.
Used by the /admin/workspace-prompt editor's + Preview button so admins can see their edits before saving.""" + env = Environment(undefined=StrictUndefined, autoescape=False) + try: + template = env.from_string(payload.content) + ctx = build_claude_md_context( + conn, user=user, server_url=str(request.base_url).rstrip("/") + ) + rendered = template.render(**ctx) + except TemplateError as e: + raise HTTPException(status_code=400, detail=f"Template invalid: {e}") + return ClaudeMdResponse(content=rendered) diff --git a/app/main.py b/app/main.py index 296135b..b858749 100644 --- a/app/main.py +++ b/app/main.py @@ -122,6 +122,7 @@ from app.api.v2_sample import router as v2_sample_router from app.api.v2_scan import router as v2_scan_router from app.api.marketplaces import router as marketplaces_router from app.api.welcome import router as welcome_router +from app.api.claude_md import router as claude_md_router from app.marketplace_server.router import router as marketplace_server_router from app.marketplace_server.git_router import make_git_wsgi_app from app.web.router import router as web_router @@ -529,6 +530,7 @@ def create_app() -> FastAPI: app.include_router(v2_scan_router) app.include_router(marketplaces_router) app.include_router(welcome_router) + app.include_router(claude_md_router) app.include_router(marketplace_server_router) # Git smart-HTTP endpoint for Claude Code: /marketplace.git/* diff --git a/app/web/router.py b/app/web/router.py index 603bebb..3a5a3c5 100644 --- a/app/web/router.py +++ b/app/web/router.py @@ -950,6 +950,30 @@ async def admin_agent_prompt_page( return templates.TemplateResponse(request, "admin_welcome.html", ctx) +@router.get("/admin/workspace-prompt", response_class=HTMLResponse) +async def admin_workspace_prompt_page( + request: Request, + user: dict = Depends(require_admin), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + from src.repositories.claude_md_template import ClaudeMdTemplateRepository + from src.claude_md 
import compute_default_claude_md + + row = ClaudeMdTemplateRepository(conn).get() + server_url = str(request.base_url).rstrip("/") + default_template = compute_default_claude_md(conn, user=user, server_url=server_url) + ctx = _build_context( + request, + user=user, + current=row["content"] or "", + default_template=default_template, + updated_at=row["updated_at"], + updated_by=row["updated_by"], + is_override=row["content"] is not None, + ) + return templates.TemplateResponse(request, "admin_workspace_prompt.html", ctx) + + @router.get("/tokens", response_class=HTMLResponse) async def my_tokens_page( diff --git a/app/web/templates/_app_header.html b/app/web/templates/_app_header.html index 49a4790..8d360fb 100644 --- a/app/web/templates/_app_header.html +++ b/app/web/templates/_app_header.html @@ -14,7 +14,7 @@ Setup local agent {% if session.user.is_admin %} Marketplaces - {% set _admin_active = _path.startswith('/admin/tables') or _path.startswith('/admin/tokens') or _path.startswith('/admin/users') or _path.startswith('/admin/groups') or _path.startswith('/admin/access') or _path.startswith('/admin/server-config') or _path.startswith('/admin/agent-prompt') %} + {% set _admin_active = _path.startswith('/admin/tables') or _path.startswith('/admin/tokens') or _path.startswith('/admin/users') or _path.startswith('/admin/groups') or _path.startswith('/admin/access') or _path.startswith('/admin/server-config') or _path.startswith('/admin/agent-prompt') or _path.startswith('/admin/workspace-prompt') %}
{% endif %} diff --git a/app/web/templates/admin_workspace_prompt.html b/app/web/templates/admin_workspace_prompt.html new file mode 100644 index 0000000..e8c1d2b --- /dev/null +++ b/app/web/templates/admin_workspace_prompt.html @@ -0,0 +1,529 @@ +{% extends "base.html" %} +{% block title %}Agent Workspace Prompt — {{ config.INSTANCE_NAME }}{% endblock %} + +{% block content %} + + + + + + + + + +
+
+
+

Agent Workspace Prompt

+

Customize the CLAUDE.md Claude Code reads when it opens the analyst workspace.

+
+
+ {% if is_override %} + + Override active + + {% else %} + Using default + {% endif %} +
+
+ +
+
+

+ Default: a rich markdown briefing about Agnes commands, registered tables + (RBAC-filtered for the calling analyst), available metrics, and marketplace plugins. + Written to CLAUDE.md in the analyst workspace at da analyst setup time. + Use --no-claude-md to skip writing it. +

+

+ Template engine: Jinja2 with StrictUndefined — unknown placeholders + raise an error at save time. Use {{ "{% if user.is_admin %}" }}…{{ "{% endif %}" }} + to guard admin-only context. +

+ +
+ Available Jinja2 placeholders +
+ {{ "{{ instance.name }}" }} — instance display name +{{ "{{ instance.subtitle }}" }} — operator / org name +{{ "{{ server.url }}" }} — full server URL +{{ "{{ server.hostname }}" }} — host part only +{{ "{{ sync_interval }}" }} — e.g. "1h" +{{ "{{ data_source.type }}" }} — keboola | bigquery | local + +{{ "{{ tables }}" }} — list of {name, description, query_mode} +{{ "{% for t in tables %}" }} {{ "{{ t.name }}" }}, {{ "{{ t.description }}" }}, {{ "{{ t.query_mode }}" }} {{ "{% endfor %}" }} + +{{ "{{ metrics.count }}" }} — total number of metrics +{{ "{{ metrics.categories }}" }} — list of category names + +{{ "{{ marketplaces }}" }} — list of {slug, name, plugins:[{name}]} +{{ "{% for mp in marketplaces %}" }} {{ "{{ mp.name }}" }}, {{ "{{ mp.slug }}" }}, {{ "{{ mp.plugins }}" }} {{ "{% endfor %}" }} + +{{ "{{ user.id }}" }}, {{ "{{ user.email }}" }}, {{ "{{ user.name }}" }} +{{ "{{ user.is_admin }}" }}, {{ "{{ user.groups }}" }} + +{{ "{{ now }}" }} — tz-aware UTC datetime +{{ "{{ today }}" }} — ISO date string e.g. "2026-01-01" + +
+
+ +
+
+

Editor

+
+ +
+
+
+

Live preview

+
+
(rendering…)
+ +
+
+
+
+
+ + +
+
+
+ + + + +
+ + +{% endblock %} diff --git a/cli/commands/analyst.py b/cli/commands/analyst.py index c1a6374..411a768 100644 --- a/cli/commands/analyst.py +++ b/cli/commands/analyst.py @@ -297,11 +297,15 @@ def _install_claude_hooks(settings_path: Path) -> None: # Helper: initialise Claude workspace (.claude/ directory) # --------------------------------------------------------------------------- -def _init_claude_workspace(workspace: Path) -> None: +def _init_claude_workspace( + workspace: Path, + server_url: str = "", + token: str = "", +) -> None: """Initialise the .claude/ directory with placeholder files and hooks. - Does NOT write CLAUDE.md — workspace-context customisation is handled - server-side via the banner on /setup, not as a file in the workspace. + Writes CLAUDE.md from the server (GET /api/welcome) unless ``server_url`` + or ``token`` are empty, or the request fails (graceful degradation). """ local_md = workspace / ".claude" / "CLAUDE.local.md" if not local_md.exists(): @@ -320,6 +324,57 @@ def _init_claude_workspace(workspace: Path) -> None: _install_claude_hooks(settings_path) + # Write CLAUDE.md from the server + if server_url and token: + _write_claude_md(workspace, server_url, token) + + +def _write_claude_md(workspace: Path, server_url: str, token: str) -> None: + """Fetch the rendered CLAUDE.md from the server and write it to the workspace. + + Gracefully handles: + - 404: older server without the endpoint — skip with warning. + - Other HTTP errors / network errors — skip with warning. + """ + from urllib.parse import urlencode + import httpx + + server_url = server_url.rstrip("/") + params = urlencode({"server_url": server_url}) + url = f"{server_url}/api/welcome?{params}" + try: + resp = httpx.get( + url, + headers={"Authorization": f"Bearer {token}"}, + timeout=30.0, + ) + if resp.status_code == 404: + typer.echo( + "Warning: server does not support CLAUDE.md generation (older version). 
Skipping.", + err=True, + ) + return + if resp.status_code == 401 or resp.status_code == 403: + typer.echo( + f"Warning: CLAUDE.md fetch failed ({resp.status_code} {resp.reason_phrase}). Skipping.", + err=True, + ) + return + resp.raise_for_status() + data = resp.json() + content = data.get("content", "") + if content: + (workspace / "CLAUDE.md").write_text(content, encoding="utf-8") + else: + typer.echo("Warning: server returned empty CLAUDE.md content. Skipping.", err=True) + except httpx.HTTPStatusError as e: + typer.echo( + f"Warning: CLAUDE.md fetch failed (HTTP {e.response.status_code}). Skipping.", + err=True, + ) + except Exception as e: + typer.echo(f"Warning: CLAUDE.md fetch failed: {e}. Skipping.", err=True) + # --------------------------------------------------------------------------- # Helper: data freshness check (for returning-session detection) @@ -352,6 +407,7 @@ def setup( server_url: str = typer.Option(..., "--server-url", help="URL of the AI Data Analyst server"), force: bool = typer.Option(False, "--force", help="Re-initialise even if workspace already exists"), workspace_dir: Optional[str] = typer.Option(None, "--workspace", help="Workspace directory (default: current dir)"), + no_claude_md: bool = typer.Option(False, "--no-claude-md", help="Skip writing CLAUDE.md to workspace"), ): """Bootstrap a new analyst workspace from a remote server.""" workspace = Path(workspace_dir).resolve() if workspace_dir else Path.cwd() @@ -385,9 +441,13 @@ def setup( typer.echo("Initialising DuckDB views...") total_rows = _initialize_duckdb(workspace) - # 7. Initialise Claude workspace (.claude/ hooks + placeholder) + # 7. Initialise Claude workspace (.claude/ hooks + placeholder + CLAUDE.md) typer.echo("Initializing Claude workspace...") - _init_claude_workspace(workspace) + _init_claude_workspace( + workspace, + server_url=server_url if not no_claude_md else "", + token=token if not no_claude_md else "", + ) # 8. 
Summary typer.echo("") @@ -396,6 +456,8 @@ def setup( typer.echo(f" Tables : {n_downloaded} downloaded, {total_rows} total rows") typer.echo(f" Workspace: {workspace}") typer.echo(f" Hooks : SessionStart/End installed in {workspace}/.claude/settings.json") + if not no_claude_md: + typer.echo(f" CLAUDE.md: written from server template") typer.echo("") typer.echo("Next steps:") typer.echo(" da sync — refresh data") diff --git a/config/claude_md_template.txt b/config/claude_md_template.txt new file mode 100644 index 0000000..688f02e --- /dev/null +++ b/config/claude_md_template.txt @@ -0,0 +1,195 @@ +{# Default analyst-onboarding workspace prompt for "da analyst setup". + Rendered server-side by src/claude_md.py. Edit this file to change + the OSS default; admins override per-instance via /admin/workspace-prompt. + + Available context (see docs/agent-workspace-prompt.md for the full reference): + instance.name, instance.subtitle + server.url, server.hostname + sync_interval — string from instance.yaml + data_source.type — keboola | bigquery | local + tables — list of {name, description, query_mode} + metrics.count, metrics.categories + marketplaces — list of {slug, name, plugins:[{name}]} + user.id, user.email, user.name, user.is_admin, user.groups + now, today — datetime / date string +#} +# {{ instance.name }} — AI Data Analyst + +This workspace is connected to {{ server.url }}. +{% if instance.subtitle %}Operated by **{{ instance.subtitle }}**.{% endif %} + +## Rules +- Before computing any business metric: run `da metrics show /` +- **For canonical table list with query modes: `da catalog`.** `data/metadata/schema.json` covers `query_mode: "local"` tables only — for remote/hybrid tables it's incomplete. Treat `da catalog` as source of truth. 
+- Do not use DESCRIBE/SHOW COLUMNS — use `da schema ` instead +- Save work output to `user/artifacts/` +- Sync data regularly with `da sync` +- **Personal customizations go in `.claude/CLAUDE.local.md`, NOT here.** This file is regenerated by `da analyst setup --force`; edits here will be lost. CLAUDE.local.md is preserved across regeneration and uploaded on `da sync --upload-only`. + +## Metrics Workflow +1. `da metrics list` — find the relevant metric ({{ metrics.count }} available, categories: {{ metrics.categories | join(", ") or "none yet" }}) +2. `da metrics show /` — read SQL and business rules +3. Use the canonical SQL from the metric definition, adapt to the question +4. Never invent metric calculations — always check existing definitions first + +## Data Sync +- `da sync` — download current data from server +- `da sync --docs-only` — just metadata and metrics (fast refresh) +- `da sync --upload-only` — upload sessions and local notes to server +- Data on the server refreshes every {{ sync_interval }} + +## Available Datasets +{% for t in tables -%} +- `{{ t.name }}`{% if t.description %} — {{ t.description }}{% endif %}{% if t.query_mode == "remote" %} *(remote, queried on demand)*{% endif %} +{% else -%} +- _No tables registered yet — ask an admin to register tables in the dashboard._ +{% endfor %} + +{% if marketplaces -%} +## Plugins available to you +{% for mp in marketplaces -%} +- **{{ mp.name }}** ({{ mp.slug }}): {{ mp.plugins | map(attribute="name") | join(", ") }} +{% endfor %} +{% endif -%} + +## Remote Queries (BigQuery) — when data isn't on the laptop + +Not every table is synced. Tables registered with `query_mode: "remote"` live in +BigQuery, accessed server-side via DuckDB's BQ extension — no parquet on disk. +Tables you don't see in `data/parquet/` may still be queryable. + +### Discovery first + +``` +da catalog --json | jq '.[] | {name, source_type, query_mode}' # see all tables + their modes +da schema
# columns + types +da describe
-n 5 # sample rows +``` + +For local-mode tables, query directly with `da query "SELECT … FROM
"`. + +### Three patterns for `query_mode: "remote"` tables + +| Pattern | Tool | Use when | +|---------|------|----------| +| **`da fetch`** (preferred) | materializes a filtered subset locally → query the snapshot | repeated questions on same slice | +| **`da query --remote`** | one-shot, server-side execution against BigQuery (works for BASE TABLE rows directly + VIEW/MATERIALIZED_VIEW rows via the BQ jobs API; cost-guarded by a 5 GiB scan cap configurable in /admin/server-config) | single aggregate / cheap probe | +| **`da query --register-bq`** | hybrid joins between local snapshots and ad-hoc BQ subqueries | crossing local + remote | + +### Permission model + cost — important + +- BQ access goes through the **agnes server's GCE service account**, not your personal Google credentials. If a query fails with a permission error, the table is in a project the server SA cannot read — escalate to admin, do NOT try to authenticate yourself. +- Every BQ query bills the SA's GCP project for **bytes scanned**. A naive `SELECT * FROM ` can cost real money. ALWAYS: + - filter via `--where` on the partition column (typically a date) + - list specific columns in `--select` — column-store BQ skips the rest, cheaper + - run `--estimate` first when unsure of the table size or partitioning + +### `da fetch` discipline + +``` +# 1. ESTIMATE first — refuses to fetch without knowing the cost +da fetch
--select col1,col2 --where "date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)" --estimate + +# 2. If reasonable, fetch as a named snapshot +da fetch
--select col1,col2 --where "..." --as my_recent + +# 3. Query the local snapshot +da query "SELECT col1, COUNT(*) FROM my_recent GROUP BY 1" + +# 4. List + drop snapshots when done +da snapshot list +da snapshot drop my_recent +``` + +Rules of thumb: +- ALWAYS list specific columns in `--select`. Avoid implicit SELECT *. +- ALWAYS include a `--where` for remote tables; otherwise add `--limit`. +- ALWAYS run `--estimate` first when the table is `partition_by` / `clustered_by` + per `da schema`, or could plausibly exceed 1 GB local bytes. +- Reuse snapshots across questions in the same conversation — `da snapshot list` + before fetching. + +### Snapshot freshness — when to refresh + +Snapshots are point-in-time copies. They go stale as the source data updates (most BQ tables refresh daily; check `sync_schedule` per `da catalog`). For each new conversation: + +``` +da snapshot list # see existing snapshots + their ages +da snapshot drop my_recent # drop stale ones +da fetch
--select ... --where ... --as my_recent # re-fetch +``` + +If the question is time-sensitive (e.g. "today's orders"), assume any snapshot older than the table's `sync_schedule` is stale and refresh. + +### Hybrid query example — local + remote in one query + +`da query --register-bq` lets a single SQL statement join a local table with an ad-hoc BQ subquery. The BQ subquery runs first (server-side), result registered as a DuckDB view, then the joined query runs locally. + +``` +da query \ + --register-bq "traffic=SELECT date, country, SUM(views) AS views \ + FROM \`prj.web_analytics.sessions\` \ + WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) \ + GROUP BY 1, 2" \ + --sql "SELECT o.date, o.country, o.revenue, t.views, o.revenue / NULLIF(t.views,0) AS rev_per_view \ + FROM orders o \ + JOIN traffic t ON o.date = t.date AND o.country = t.country \ + ORDER BY 1 DESC" +``` + +The BQ subquery MUST contain `WHERE` and/or `GROUP BY` to keep the registered result manageable (target: under 500K rows, well under 100 MB). Multiple `--register-bq` flags can compose multiple BQ sources. For complex SQL, use `--stdin` mode (`echo '{"register_bq":{...},"sql":"..."}' | da query --stdin`). + +### BigQuery SQL flavor for `--where` + +Source-typed `bigquery` tables use BigQuery dialect, not DuckDB: + +- Date literal: `DATE '2026-01-01'` +- Timestamp literal: `TIMESTAMP '2026-01-01 00:00:00 UTC'` +- Now: `CURRENT_DATE()`, `CURRENT_TIMESTAMP()` +- Date arithmetic: `DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)` +- Regex: `REGEXP_CONTAINS(col, r'pattern')` (raw string!) +- Cast: `CAST(x AS INT64)` (NOT `INT`) + +### When the table you want isn't in `da catalog` + +The table may exist in BigQuery but not be registered with Agnes yet. Two options: + +1. **Ad-hoc one-shot** — register a BQ subquery as a view inline, no admin needed + if the agnes server SA has BQ access: + ``` + da query --register-bq "live=SELECT * FROM \`project.dataset.table\` WHERE date >= '...' 
LIMIT 1000" \ + --sql "SELECT * FROM live" + ``` +2. **Ask admin to register** the table with `query_mode: "remote"` so it shows up + in `da catalog` and supports `da fetch` / `da query --remote`. This is the + right path for any table you'll query repeatedly. + +### Deeper guidance + +For the full protocol, including hybrid-query examples, snapshot hygiene, and +when NOT to use `da fetch`, run: + +``` +da skills show agnes-data-querying +``` + +## Corporate Memory + +Rules injected by `da sync` from the server's corporate knowledge base live in `.claude/rules/km_*.md`. They are automatically loaded by Claude Code on every session start. + +- `km_.md` — mandatory rules (always enforced) +- `km_approved.md` — approved guidance (confidence × recency ranked) + +Run `da sync` to refresh. Rules are pruned automatically when items are revoked. + +## Directory Structure +- `data/` — read-only data downloaded from server + - `data/parquet/` — table data in Parquet format + - `data/duckdb/` — local analytics DuckDB database + - `data/metadata/` — profiles, schema, metrics cache +- `user/` — your workspace (persistent across syncs) + - `user/artifacts/` — analysis outputs, reports, charts + - `user/sessions/` — Claude Code session logs +- `.claude/CLAUDE.local.md` — your personal notes + workspace customizations. **Never overwritten by `da analyst setup --force`.** Uploaded to the server on `da sync --upload-only`. Put any local-only Claude instructions, project-specific reminders, or temporary notes here — NOT in CLAUDE.md (this file is regenerated from a template). + +_Hello {{ user.name or user.email }} — generated {{ today }}._ diff --git a/docs/agent-workspace-prompt.md b/docs/agent-workspace-prompt.md new file mode 100644 index 0000000..2488295 --- /dev/null +++ b/docs/agent-workspace-prompt.md @@ -0,0 +1,114 @@ +# Agent Workspace Prompt + +The agent workspace prompt is the `CLAUDE.md` file written to each analyst's +workspace by `da analyst setup`. 
It gives Claude Code context about the +connected instance: available tables (RBAC-filtered), business metrics, installed +plugins, and operational rules for the analyst. + +## When is CLAUDE.md written? + +`da analyst setup` fetches `GET /api/welcome` and writes the rendered markdown +to `/CLAUDE.md` on every run (including `--force` re-initialisation). + +To skip writing CLAUDE.md: + +```bash +da analyst setup --server-url https://agnes.example.com --no-claude-md +``` + +**Analysts who ran setup while CLAUDE.md generation was temporarily absent** will +have their file written on the next `da analyst setup` run. Any existing +`CLAUDE.md` is overwritten with the current server template. + +The companion `CLAUDE.local.md` (at `.claude/CLAUDE.local.md`) is **never** +overwritten — it is the analyst's personal customisation space. + +## Editing the template + +Admins configure the template via: + +- **Admin UI:** `/admin/workspace-prompt` — Jinja2 markdown editor with a + placeholder cheatsheet, live preview (rendered against the calling admin's + RBAC context), and save/reset actions. +- **REST API:** + - `GET /api/admin/workspace-prompt-template` — returns + `{content, default, updated_at, updated_by}`. `content` is `null` when no + override is set; `default` is always the live rendered default. + - `PUT /api/admin/workspace-prompt-template` with body `{"content": "..."}` — + validates Jinja2 syntax against two stubs (authenticated user, minimal user) + before persisting. Returns `400` on syntax errors or unknown placeholders. + - `DELETE /api/admin/workspace-prompt-template` — clears the override; reverts + to the rich default template from `config/claude_md_template.txt`. + - `POST /api/admin/workspace-prompt-template/preview` with + body `{"content": "..."}` — renders arbitrary content against the calling + admin's live RBAC context without persisting. Used by the editor's Preview + button. 
+ +The override lives in `system.duckdb` (table `claude_md_template`, singleton +row id=1). `DELETE` sets `content` to NULL and keeps the row; `updated_at` and +`updated_by` are updated to record who performed the reset. + +## Default template + +The default template is `config/claude_md_template.txt` (Jinja2 markdown). +When no admin override is set, this file is rendered for every `GET /api/welcome` +request. Operators can customise it per-instance via the UI, or ship a modified +default by editing the file before deployment. + +## Template language + +[Jinja2](https://jinja.palletsprojects.com/) with `autoescape=False` and +`StrictUndefined`. Autoescape is off because the rendered output is markdown, not +HTML. `StrictUndefined` means any typo in a placeholder name raises an error at +PUT validation time, so the admin is notified immediately. + +## Available placeholders + +| Placeholder | Type | Notes | +|---|---|---| +| `instance.name` | string | `instance.name` from `instance.yaml` | +| `instance.subtitle` | string | `instance.subtitle` from `instance.yaml` | +| `server.url` | string | Full server URL at render time | +| `server.hostname` | string | Host part only | +| `sync_interval` | string | e.g. 
`"1h"` from `instance.yaml` | +| `data_source.type` | string | `keboola`, `bigquery`, or `local` | +| `tables` | list[dict] | RBAC-filtered list of `{name, description, query_mode}` | +| `metrics.count` | int | Total metric definitions in DB | +| `metrics.categories` | list[str] | Sorted unique category names | +| `marketplaces` | list[dict] | RBAC-filtered `{slug, name, plugins:[{name}]}` | +| `user.id` | string | Analyst user ID | +| `user.email` | string | Analyst email | +| `user.name` | string | Analyst display name | +| `user.is_admin` | bool | Whether the user is in the Admin group | +| `user.groups` | list[str] | User's group names | +| `now` | datetime (UTC, tz-aware) | Server time at render | +| `today` | string (`YYYY-MM-DD`) | Server date | + +## Example: iterating tables + +```jinja2 +## Available Datasets +{% for t in tables -%} +- `{{ t.name }}`{% if t.description %} — {{ t.description }}{% endif %} +{% else -%} +- _No tables registered yet._ +{% endfor %} +``` + +## Example: conditional marketplace section + +```jinja2 +{% if marketplaces %} +## Plugins +{% for mp in marketplaces %} +- **{{ mp.name }}**: {{ mp.plugins | map(attribute="name") | join(", ") }} +{% endfor %} +{% endif %} +``` + +## Resetting to the built-in default + +Click **Reset to default** in the admin UI, or call +`DELETE /api/admin/workspace-prompt-template`. The next analyst who runs +`da analyst setup` will receive the rich default template from +`config/claude_md_template.txt`. diff --git a/src/claude_md.py b/src/claude_md.py new file mode 100644 index 0000000..0ab5b0b --- /dev/null +++ b/src/claude_md.py @@ -0,0 +1,233 @@ +"""Render the analyst-workspace CLAUDE.md prompt. + +The template source is admin-editable at /admin/workspace-prompt. When no +override is set, the default content is the Jinja2 markdown template shipped +at config/claude_md_template.txt. When an override is saved, it replaces the +default for every call to render_claude_md(). 
+ +Override content is a Jinja2 template (autoescape=False, StrictUndefined). +Available placeholders: instance.{name,subtitle}, server.{url,hostname}, +sync_interval, data_source.type, tables (list), metrics.{count,categories}, +marketplaces (RBAC-filtered list), user.{id,email,name,is_admin,groups}, +now, today. + +See also: surfaced as the "Agent Workspace Prompt" admin editor at +/admin/workspace-prompt. +""" + +from __future__ import annotations + +import logging +from datetime import datetime, timezone +from pathlib import Path +from typing import Any +from urllib.parse import urlparse + +import duckdb +from jinja2 import Environment, StrictUndefined, TemplateError + +from app.instance_config import ( + get_data_source_type, + get_instance_name, + get_instance_subtitle, + get_sync_interval, +) +from src.repositories.claude_md_template import ClaudeMdTemplateRepository + +logger = logging.getLogger(__name__) + +def _load_default_template() -> str: + """Load the shipped CLAUDE.md default template. + + Resolution order (first hit wins): + 1. importlib.resources lookup in the installed `config` package — works + in both editable installs and wheel-installed deployments. This is + the canonical path on container deployments where `/app/config/` + may be bind-mounted to overlay instance-specific config (instance.yaml) + and shadow the image-baked template file. + 2. Filesystem path relative to this module — for dev runs from a checkout. + 3. Last-resort embedded fallback so the renderer never fails outright. + """ + # 1. Package-resource path (preferred — works under wheel installs) + try: + from importlib import resources + + ref = resources.files("config").joinpath("claude_md_template.txt") + if ref.is_file(): + return ref.read_text(encoding="utf-8") + except (ModuleNotFoundError, FileNotFoundError, OSError): + pass + + # 2. 
Filesystem path relative to this module (dev checkout) + fs_path = Path(__file__).resolve().parent.parent / "config" / "claude_md_template.txt" + if fs_path.exists(): + return fs_path.read_text(encoding="utf-8") + + # 3. Embedded fallback (image stripped down, partial Docker COPY, etc.) + return ( + "# {{ instance.name }} — AI Data Analyst\n\n" + "This workspace is connected to {{ server.url }}.\n" + "Data refreshes every {{ sync_interval }}.\n" + ) + + +def _list_tables(conn: duckdb.DuckDBPyConnection, *, user: dict) -> list[dict[str, Any]]: + """Return registered tables filtered by the calling user's RBAC grants. + + For admins, returns all tables. For non-admins, returns only tables the + user has explicit ``resource_grants(resource_type='table')`` access to. + """ + from src.rbac import get_accessible_tables + try: + allowed_ids = get_accessible_tables(user, conn) # None=admin, list=non-admin + if allowed_ids is None: + rows = conn.execute( + "SELECT name, description, query_mode FROM table_registry ORDER BY name" + ).fetchall() + elif not allowed_ids: + return [] + else: + placeholders = ",".join(["?"] * len(allowed_ids)) + rows = conn.execute( + f"SELECT name, description, query_mode FROM table_registry " + f"WHERE id IN ({placeholders}) ORDER BY name", + allowed_ids, + ).fetchall() + except duckdb.CatalogException: + return [] + return [ + {"name": r[0], "description": r[1] or "", "query_mode": r[2] or "local"} + for r in rows + ] + + +def _metrics_summary(conn: duckdb.DuckDBPyConnection) -> dict[str, Any]: + try: + rows = conn.execute( + "SELECT category, COUNT(*) FROM metric_definitions GROUP BY category" + ).fetchall() + except duckdb.CatalogException: + return {"count": 0, "categories": []} + return { + "count": sum(r[1] for r in rows), + "categories": sorted({r[0] for r in rows if r[0]}), + } + + +def _marketplaces_for_user( + conn: duckdb.DuckDBPyConnection, user: dict[str, Any] +) -> list[dict[str, Any]]: + """Return marketplaces with the plugins the 
user is allowed to see. + + Delegates RBAC filtering entirely to resolve_allowed_plugins, which + returns List[dict] with marketplace_slug, original_name, etc. + Results are grouped by marketplace slug; display names are fetched + from marketplace_registry in a single query. + """ + try: + from src.marketplace_filter import resolve_allowed_plugins + allowed = resolve_allowed_plugins(conn, user) + except Exception: + logger.exception("_marketplaces_for_user: marketplace plugin resolution failed") + return [] + if not allowed: + return [] + + # Build slug → display name lookup from registry + slugs = list({p["marketplace_slug"] for p in allowed}) + placeholders = ",".join(["?"] * len(slugs)) + try: + name_rows = conn.execute( + f"SELECT id, name FROM marketplace_registry WHERE id IN ({placeholders})", + slugs, + ).fetchall() + except duckdb.CatalogException: + name_rows = [] + slug_to_name: dict[str, str] = {r[0]: r[1] for r in name_rows} + + grouped: dict[str, dict[str, Any]] = {} + for plugin in allowed: + slug = plugin["marketplace_slug"] + bucket = grouped.setdefault( + slug, + { + "slug": slug, + "name": slug_to_name.get(slug, slug), + "plugins": [], + }, + ) + bucket["plugins"].append({"name": plugin["original_name"]}) + + return list(grouped.values()) + + +def build_claude_md_context( + conn: duckdb.DuckDBPyConnection, + *, + user: dict[str, Any], + server_url: str, +) -> dict[str, Any]: + """Compose the Jinja2 render context for the CLAUDE.md template. 
Reads from the DB and instance config; performs no writes.""" + now = datetime.now(timezone.utc) + parsed = urlparse(server_url) + return { + "instance": { + "name": get_instance_name(), + "subtitle": get_instance_subtitle(), + }, + "server": { + "url": server_url, + "hostname": parsed.hostname or "", + }, + "sync_interval": get_sync_interval(), + "data_source": {"type": get_data_source_type()}, + "tables": _list_tables(conn, user=user), + "metrics": _metrics_summary(conn), + "marketplaces": _marketplaces_for_user(conn, user), + "user": { + "id": user.get("id", ""), + "email": user.get("email", ""), + "name": user.get("name") or "", + "is_admin": bool(user.get("is_admin")), + "groups": user.get("groups") or [], + }, + "now": now, + "today": now.date().isoformat(), + } + + +def compute_default_claude_md( + conn: duckdb.DuckDBPyConnection, + *, + user: dict[str, Any], + server_url: str, +) -> str: + """Return the rendered default CLAUDE.md from config/claude_md_template.txt. + + Renders the shipped Jinja2 template with the given user's RBAC context. + Raises on TemplateError; callers that want a graceful fallback should catch it. + """ + source = _load_default_template() + env = Environment(undefined=StrictUndefined, autoescape=False) + template = env.from_string(source) + return template.render(**build_claude_md_context(conn, user=user, server_url=server_url)) + + +def render_claude_md( + conn: duckdb.DuckDBPyConnection, + *, + user: dict[str, Any], + server_url: str, +) -> str: + """Resolve the active template (override or default) and render it for the given user. + + When an admin override is set, renders it via Jinja2 (StrictUndefined, autoescape=False). + When no override is set, renders the shipped default template. + + Raises on TemplateError; the API layer catches this and returns 400/500. 
+ """ + row = ClaudeMdTemplateRepository(conn).get() + source = row["content"] if row.get("content") else _load_default_template() + env = Environment(undefined=StrictUndefined, autoescape=False) + template = env.from_string(source) + return template.render(**build_claude_md_context(conn, user=user, server_url=server_url)) diff --git a/src/db.py b/src/db.py index d0b13fa..5c3d8a3 100644 --- a/src/db.py +++ b/src/db.py @@ -39,7 +39,7 @@ def _maybe_instrument(con, db_tag: str): _SAFE_IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,63}$") -SCHEMA_VERSION = 22 +SCHEMA_VERSION = 23 _SYSTEM_SCHEMA = """ CREATE TABLE IF NOT EXISTS schema_version ( @@ -427,6 +427,18 @@ CREATE TABLE IF NOT EXISTS setup_banner ( updated_by VARCHAR, CONSTRAINT singleton CHECK (id = 1) ); + +-- v23: customizable analyst-workspace CLAUDE.md template. +-- Singleton row (id=1). NULL content means "use the default template +-- shipped at config/claude_md_template.txt" (Jinja2 markdown). Admin override +-- stores the raw Jinja2 source string. +CREATE TABLE IF NOT EXISTS claude_md_template ( + id INTEGER PRIMARY KEY DEFAULT 1, + content TEXT, + updated_at TIMESTAMP, + updated_by VARCHAR, + CONSTRAINT singleton CHECK (id = 1) +); """ @@ -1658,6 +1670,17 @@ _V21_TO_V22_MIGRATIONS = [ "INSERT INTO setup_banner (id, content) VALUES (1, NULL) ON CONFLICT (id) DO NOTHING", ] +_V22_TO_V23_MIGRATIONS = [ + """CREATE TABLE IF NOT EXISTS claude_md_template ( + id INTEGER PRIMARY KEY DEFAULT 1, + content TEXT, + updated_at TIMESTAMP, + updated_by VARCHAR, + CONSTRAINT singleton CHECK (id = 1) + )""", + "INSERT INTO claude_md_template (id, content) VALUES (1, NULL) ON CONFLICT (id) DO NOTHING", +] + def _ensure_schema(conn: duckdb.DuckDBPyConnection) -> None: """Create tables if they don't exist. Apply migrations if schema version changed. 
@@ -1724,6 +1747,10 @@ def _ensure_schema(conn: duckdb.DuckDBPyConnection) -> None: "INSERT INTO setup_banner (id, content) VALUES (1, NULL) " "ON CONFLICT (id) DO NOTHING" ) + conn.execute( + "INSERT INTO claude_md_template (id, content) VALUES (1, NULL) " + "ON CONFLICT (id) DO NOTHING" + ) # Fresh-install seed is handled by the unconditional # _seed_core_roles call at the bottom of _ensure_schema — # left as a no-op branch here so the migration ladder still @@ -1807,6 +1834,9 @@ def _ensure_schema(conn: duckdb.DuckDBPyConnection) -> None: if current < 22: for sql in _V21_TO_V22_MIGRATIONS: conn.execute(sql) + if current < 23: + for sql in _V22_TO_V23_MIGRATIONS: + conn.execute(sql) conn.execute( "UPDATE schema_version SET version = ?, applied_at = current_timestamp", [SCHEMA_VERSION], diff --git a/src/repositories/claude_md_template.py b/src/repositories/claude_md_template.py new file mode 100644 index 0000000..e47393f --- /dev/null +++ b/src/repositories/claude_md_template.py @@ -0,0 +1,53 @@ +"""Repository for the per-instance CLAUDE.md template override (singleton row).""" + +from datetime import datetime, timezone +from typing import Any + +import duckdb + + +class ClaudeMdTemplateRepository: + def __init__(self, conn: duckdb.DuckDBPyConnection): + self.conn = conn + + def get(self) -> dict[str, Any]: + """Return the singleton row. Always exists post-migration; content + is None when no override is set (= use shipped default template).""" + row = self.conn.execute( + "SELECT id, content, updated_at, updated_by FROM claude_md_template WHERE id = 1" + ).fetchone() + if row is None: + # Defensive: re-seed if a previous admin manually deleted it. 
+ self.conn.execute( + "INSERT INTO claude_md_template (id, content) VALUES (1, NULL) " + "ON CONFLICT (id) DO NOTHING" + ) + return {"id": 1, "content": None, "updated_at": None, "updated_by": None} + return { + "id": row[0], + "content": row[1], + "updated_at": row[2], + "updated_by": row[3], + } + + def set(self, content: str, *, updated_by: str) -> None: + now = datetime.now(timezone.utc) + self.conn.execute( + """INSERT INTO claude_md_template (id, content, updated_at, updated_by) + VALUES (1, ?, ?, ?) + ON CONFLICT (id) DO UPDATE SET + content = excluded.content, + updated_at = excluded.updated_at, + updated_by = excluded.updated_by""", + [content, now, updated_by], + ) + + def reset(self, *, updated_by: str) -> None: + """Clear override; renderer falls back to shipped default template.""" + now = datetime.now(timezone.utc) + self.conn.execute( + """UPDATE claude_md_template + SET content = NULL, updated_at = ?, updated_by = ? + WHERE id = 1""", + [now, updated_by], + ) diff --git a/tests/snapshots/openapi.json b/tests/snapshots/openapi.json index de8cd5e..2a02dc5 100644 --- a/tests/snapshots/openapi.json +++ b/tests/snapshots/openapi.json @@ -397,6 +397,19 @@ "title": "BulkUpdateRequest", "type": "object" }, + "ClaudeMdResponse": { + "properties": { + "content": { + "title": "Content", + "type": "string" + } + }, + "required": [ + "content" + ], + "title": "ClaudeMdResponse", + "type": "object" + }, "ColumnMetadataItem": { "properties": { "basetype": { @@ -3407,6 +3420,55 @@ ] } }, + "/admin/workspace-prompt": { + "get": { + "operationId": "admin_workspace_prompt_page_admin_workspace_prompt_get", + "parameters": [ + { + "in": "header", + "name": "authorization", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "title": "Authorization" + } + } + ], + "responses": { + "200": { + "content": { + "text/html": { + "schema": { + "type": "string" + } + } + }, + "description": "Successful Response" + }, + 
"422": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + }, + "description": "Validation Error" + } + }, + "summary": "Admin Workspace Prompt Page", + "tags": [ + "web" + ] + } + }, "/api/admin/access-overview": { "get": { "description": "One-shot snapshot for the /admin/access page.\n\nReturns:\n - ``groups``: every user_group with member + grant counts\n - ``grants``: every (group_id, resource_type, resource_id) row\n - ``resources``: per-resource-type hierarchical layout, where each\n type has a list of *blocks* (parent entities, e.g. a marketplace)\n and each block has *items* (concrete grantable resources).\n\nUI stitches the three pieces into the two-column layout: groups on\nthe left, resources tree on the right with per-item checkboxes whose\nstate derives from ``grants``.", @@ -5552,6 +5614,211 @@ ] } }, + "/api/admin/workspace-prompt-template": { + "delete": { + "operationId": "admin_reset_workspace_template_api_admin_workspace_prompt_template_delete", + "parameters": [ + { + "in": "header", + "name": "authorization", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "title": "Authorization" + } + } + ], + "responses": { + "204": { + "description": "Successful Response" + }, + "422": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + }, + "description": "Validation Error" + } + }, + "summary": "Admin Reset Workspace Template", + "tags": [ + "claude_md" + ] + }, + "get": { + "operationId": "admin_get_workspace_template_api_admin_workspace_prompt_template_get", + "parameters": [ + { + "in": "header", + "name": "authorization", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "title": "Authorization" + } + } + ], + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "$ref": 
"#/components/schemas/TemplateGetResponse" + } + } + }, + "description": "Successful Response" + }, + "422": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + }, + "description": "Validation Error" + } + }, + "summary": "Admin Get Workspace Template", + "tags": [ + "claude_md" + ] + }, + "put": { + "description": "Save an admin override for the analyst CLAUDE.md template.\n\nTwo-pass Jinja2 validation (autoescape=False, StrictUndefined):\n- Pass 1: render with an authenticated user stub \u2014 catches undefined\n placeholders and syntax errors.\n- Pass 2: render with a minimal anon-style user stub \u2014 catches templates\n that hard-depend on admin-only context fields.", + "operationId": "admin_put_workspace_template_api_admin_workspace_prompt_template_put", + "parameters": [ + { + "in": "header", + "name": "authorization", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "title": "Authorization" + } + } + ], + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/TemplatePutRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "content": { + "application/json": { + "schema": {} + } + }, + "description": "Successful Response" + }, + "422": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + }, + "description": "Validation Error" + } + }, + "summary": "Admin Put Workspace Template", + "tags": [ + "claude_md" + ] + } + }, + "/api/admin/workspace-prompt-template/preview": { + "post": { + "description": "Render arbitrary template content against the live RBAC context for the\ncalling admin, without persisting. 
Used by the /admin/workspace-prompt editor's\nPreview button so admins can see their edits before saving.", + "operationId": "admin_preview_workspace_template_api_admin_workspace_prompt_template_preview_post", + "parameters": [ + { + "in": "header", + "name": "authorization", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "title": "Authorization" + } + } + ], + "requestBody": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/TemplatePreviewRequest" + } + } + }, + "required": true + }, + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ClaudeMdResponse" + } + } + }, + "description": "Successful Response" + }, + "422": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + }, + "description": "Validation Error" + } + }, + "summary": "Admin Preview Workspace Template", + "tags": [ + "claude_md" + ] + } + }, "/api/catalog/metrics/{metric_path}": { "get": { "deprecated": true, @@ -10289,6 +10556,74 @@ ] } }, + "/api/welcome": { + "get": { + "description": "Return the rendered CLAUDE.md for the authenticated analyst.\n\nThe CLI calls this endpoint during ``da analyst setup`` to write\n``/CLAUDE.md``. 
The content is RBAC-filtered per the\ncalling user.\n\n``server_url`` query param lets the CLI pass the origin it knows so\nthe rendered content references the correct server URL rather than the\nrequest host (which may differ behind a proxy).", + "operationId": "get_welcome_api_welcome_get", + "parameters": [ + { + "description": "Server URL used in rendered CLAUDE.md", + "in": "query", + "name": "server_url", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "description": "Server URL used in rendered CLAUDE.md", + "title": "Server Url" + } + }, + { + "in": "header", + "name": "authorization", + "required": false, + "schema": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "title": "Authorization" + } + } + ], + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ClaudeMdResponse" + } + } + }, + "description": "Successful Response" + }, + "422": { + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + }, + "description": "Validation Error" + } + }, + "summary": "Get Welcome", + "tags": [ + "claude_md" + ] + } + }, "/auth/admin/tokens": { "get": { "operationId": "admin_list_tokens_auth_admin_tokens_get", diff --git a/tests/test_analyst_bootstrap.py b/tests/test_analyst_bootstrap.py index 1f88e16..a960b6a 100644 --- a/tests/test_analyst_bootstrap.py +++ b/tests/test_analyst_bootstrap.py @@ -139,20 +139,65 @@ class TestCreateWorkspace: # --------------------------------------------------------------------------- class TestInitClaudeWorkspace: - """Tests for _init_claude_workspace: no CLAUDE.md written, but - .claude/CLAUDE.local.md placeholder and settings.json hooks are created. 
- """ + """Tests for _init_claude_workspace.""" - def test_does_not_write_claude_md(self, tmp_workspace): + def test_does_not_write_claude_md_when_no_server_url(self, tmp_workspace): + """Without server_url, CLAUDE.md must not be written.""" from cli.commands.analyst import _create_workspace, _init_claude_workspace _create_workspace(tmp_workspace) _init_claude_workspace(tmp_workspace) assert not (tmp_workspace / "CLAUDE.md").exists(), ( - "CLAUDE.md must NOT be written by _init_claude_workspace" + "CLAUDE.md must NOT be written when no server_url is provided" ) + def test_writes_claude_md_when_server_returns_200(self, tmp_workspace): + """When /api/welcome returns 200, CLAUDE.md is written.""" + from cli.commands.analyst import _create_workspace, _init_claude_workspace + from unittest.mock import MagicMock, patch + + _create_workspace(tmp_workspace) + + mock_resp = MagicMock() + mock_resp.status_code = 200 + mock_resp.json.return_value = {"content": "# My CLAUDE.md\nHello analyst."} + mock_resp.raise_for_status = MagicMock() + + with patch("cli.commands.analyst.httpx.get", return_value=mock_resp): + _init_claude_workspace(tmp_workspace, server_url="https://example.com", token="tok") + + claude_md = tmp_workspace / "CLAUDE.md" + assert claude_md.exists() + assert "My CLAUDE.md" in claude_md.read_text(encoding="utf-8") + + def test_does_not_write_claude_md_when_no_claude_md_flag(self, tmp_workspace): + """When server_url/token are empty (--no-claude-md path), CLAUDE.md is not written.""" + from cli.commands.analyst import _create_workspace, _init_claude_workspace + + _create_workspace(tmp_workspace) + _init_claude_workspace(tmp_workspace, server_url="", token="") + + assert not (tmp_workspace / "CLAUDE.md").exists() + + def test_does_not_write_claude_md_on_404(self, tmp_workspace): + """When /api/welcome returns 404 (older server), CLAUDE.md is skipped gracefully.""" + from cli.commands.analyst import _create_workspace, _init_claude_workspace + from unittest.mock 
import MagicMock, patch + import httpx + + _create_workspace(tmp_workspace) + + mock_resp = MagicMock() + mock_resp.status_code = 404 + mock_resp.raise_for_status = MagicMock() + + with patch("cli.commands.analyst.httpx.get", return_value=mock_resp): + # Must not raise + _init_claude_workspace(tmp_workspace, server_url="https://example.com", token="tok") + + assert not (tmp_workspace / "CLAUDE.md").exists() + def test_creates_claude_local_md_when_absent(self, tmp_workspace): from cli.commands.analyst import _create_workspace, _init_claude_workspace diff --git a/tests/test_claude_md_api.py b/tests/test_claude_md_api.py new file mode 100644 index 0000000..8ed6876 --- /dev/null +++ b/tests/test_claude_md_api.py @@ -0,0 +1,257 @@ +"""End-to-end tests for the agent-workspace-prompt API endpoints. + +GET /api/welcome — analyst-facing rendered CLAUDE.md +GET /api/admin/workspace-prompt-template — admin: get template + default +PUT /api/admin/workspace-prompt-template — admin: set override +DELETE /api/admin/workspace-prompt-template — admin: reset to default +POST /api/admin/workspace-prompt-template/preview — admin: live preview +""" + + +def _auth(token: str) -> dict[str, str]: + return {"Authorization": f"Bearer {token}"} + + +# --------------------------------------------------------------------------- +# GET /api/welcome — analyst-facing rendered CLAUDE.md +# --------------------------------------------------------------------------- + +def test_get_welcome_requires_auth(seeded_app): + """Unauthenticated GET /api/welcome must return 401 or 422.""" + c = seeded_app["client"] + resp = c.get("/api/welcome", params={"server_url": "https://example.com"}) + assert resp.status_code in (401, 422) + + +def test_get_welcome_returns_rendered_markdown(seeded_app): + c = seeded_app["client"] + analyst = _auth(seeded_app["analyst_token"]) + + resp = c.get( + "/api/welcome", + params={"server_url": "https://example.com"}, + headers=analyst, + ) + assert resp.status_code == 200 + 
body = resp.json() + assert "content" in body + assert isinstance(body["content"], str) + assert body["content"].strip() != "" + + +def test_get_welcome_uses_override_when_set(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + analyst = _auth(seeded_app["analyst_token"]) + + # Set an override + r = c.put( + "/api/admin/workspace-prompt-template", + json={"content": "# Custom CLAUDE.md for {{ user.email }}"}, + headers=admin, + ) + assert r.status_code == 200 + + # Analyst fetch should include the override + resp = c.get( + "/api/welcome", + params={"server_url": "https://example.com"}, + headers=analyst, + ) + assert resp.status_code == 200 + assert "Custom CLAUDE.md" in resp.json()["content"] + assert "analyst@test.com" in resp.json()["content"] + + # Reset + c.delete("/api/admin/workspace-prompt-template", headers=admin) + + +# --------------------------------------------------------------------------- +# GET /api/admin/workspace-prompt-template — admin get +# --------------------------------------------------------------------------- + +def test_admin_get_template_initially_null(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + + r = c.get("/api/admin/workspace-prompt-template", headers=admin) + assert r.status_code == 200 + body = r.json() + assert body["content"] is None + assert "default" in body + assert body["default"] # non-empty default + + +def test_admin_get_template_default_contains_instance_name(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + + r = c.get("/api/admin/workspace-prompt-template", headers=admin) + assert r.status_code == 200 + body = r.json() + # Default template renders the instance name + assert body["default"] != "" + + +def test_non_admin_cannot_get_template(seeded_app): + c = seeded_app["client"] + analyst = _auth(seeded_app["analyst_token"]) + r = c.get("/api/admin/workspace-prompt-template", headers=analyst) + assert 
r.status_code == 403 + + +# --------------------------------------------------------------------------- +# PUT /api/admin/workspace-prompt-template — save override +# --------------------------------------------------------------------------- + +def test_admin_can_set_and_reset_template(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + + # PUT override + r = c.put( + "/api/admin/workspace-prompt-template", + json={"content": "# Hello {{ user.email }}"}, + headers=admin, + ) + assert r.status_code == 200 + + # GET reflects override + r = c.get("/api/admin/workspace-prompt-template", headers=admin) + assert r.status_code == 200 + assert r.json()["content"] == "# Hello {{ user.email }}" + + # DELETE = reset + r = c.delete("/api/admin/workspace-prompt-template", headers=admin) + assert r.status_code == 204 + r = c.get("/api/admin/workspace-prompt-template", headers=admin) + assert r.json()["content"] is None + + +def test_non_admin_cannot_put_template(seeded_app): + c = seeded_app["client"] + analyst = _auth(seeded_app["analyst_token"]) + r = c.put( + "/api/admin/workspace-prompt-template", + json={"content": "# evil override"}, + headers=analyst, + ) + assert r.status_code == 403 + + +def test_invalid_jinja2_returns_400(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + r = c.put( + "/api/admin/workspace-prompt-template", + json={"content": "{% for x in y %}"}, # unclosed loop + headers=admin, + ) + assert r.status_code == 400 + assert "invalid" in r.json()["detail"].lower() + + +def test_put_rejects_undefined_placeholder(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + r = c.put( + "/api/admin/workspace-prompt-template", + json={"content": "{{ no_such_variable }}"}, + headers=admin, + ) + assert r.status_code == 400 + + +# --------------------------------------------------------------------------- +# DELETE /api/admin/workspace-prompt-template +# 
--------------------------------------------------------------------------- + +def test_non_admin_cannot_delete_template(seeded_app): + c = seeded_app["client"] + analyst = _auth(seeded_app["analyst_token"]) + r = c.delete("/api/admin/workspace-prompt-template", headers=analyst) + assert r.status_code == 403 + + +# --------------------------------------------------------------------------- +# POST /api/admin/workspace-prompt-template/preview +# --------------------------------------------------------------------------- + +def test_admin_preview_renders_content(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + r = c.post( + "/api/admin/workspace-prompt-template/preview", + json={"content": "# Preview for {{ user.email }}"}, + headers=admin, + ) + assert r.status_code == 200 + assert r.json()["content"].startswith("# Preview for admin@test.com") + + +def test_preview_rejects_invalid_template(seeded_app): + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + r = c.post( + "/api/admin/workspace-prompt-template/preview", + json={"content": "{% for x in y %}"}, + headers=admin, + ) + assert r.status_code == 400 + + +def test_preview_requires_admin(seeded_app): + c = seeded_app["client"] + analyst = _auth(seeded_app["analyst_token"]) + r = c.post( + "/api/admin/workspace-prompt-template/preview", + json={"content": "# Preview"}, + headers=analyst, + ) + assert r.status_code == 403 + + +def test_preview_uses_live_context(seeded_app): + """Preview should include live table data from context.""" + c = seeded_app["client"] + admin = _auth(seeded_app["admin_token"]) + r = c.post( + "/api/admin/workspace-prompt-template/preview", + json={"content": "tables: {{ tables | length }}, metrics: {{ metrics.count }}"}, + headers=admin, + ) + assert r.status_code == 200 + # Content must be a rendered string (not raise), numbers may be 0 on fresh DB + assert "tables:" in r.json()["content"] + + +# 
--------------------------------------------------------------------------- +# Validation stub vs. build_claude_md_context shape alignment +# --------------------------------------------------------------------------- + +def test_validation_stub_matches_build_context_shape(seeded_app, tmp_path): + """_VALIDATION_STUB_CONTEXT top-level keys must match build_claude_md_context() output.""" + from app.api.claude_md import _VALIDATION_STUB_CONTEXT + from src.db import _ensure_schema + import duckdb + + db_path = tmp_path / "system.duckdb" + c = duckdb.connect(str(db_path)) + _ensure_schema(c) + + user = { + "id": "u1", + "email": "admin@test.com", + "name": "Admin", + "is_admin": True, + "groups": ["Admin"], + } + from src.claude_md import build_claude_md_context + real_ctx = build_claude_md_context(c, user=user, server_url="https://example.com") + + assert set(_VALIDATION_STUB_CONTEXT.keys()) == set(real_ctx.keys()), ( + f"_VALIDATION_STUB_CONTEXT top-level keys differ from build_claude_md_context output. 
" + f"Stub: {set(_VALIDATION_STUB_CONTEXT.keys())}, " + f"real: {set(real_ctx.keys())}" + ) + c.close() diff --git a/tests/test_claude_md_renderer.py b/tests/test_claude_md_renderer.py new file mode 100644 index 0000000..133a2f3 --- /dev/null +++ b/tests/test_claude_md_renderer.py @@ -0,0 +1,274 @@ +"""Unit tests for the analyst-workspace CLAUDE.md renderer (src/claude_md.py).""" + +import duckdb +import pytest +from jinja2 import TemplateError + +from src.db import _ensure_schema +from src.repositories.claude_md_template import ClaudeMdTemplateRepository +from src.claude_md import ( + build_claude_md_context, + compute_default_claude_md, + render_claude_md, +) + + +@pytest.fixture +def conn(tmp_path, monkeypatch): + monkeypatch.setenv("DATA_DIR", str(tmp_path)) + db_path = tmp_path / "system.duckdb" + c = duckdb.connect(str(db_path)) + _ensure_schema(c) + yield c + c.close() + + +def _user(email="alice@example.com", is_admin=False): + return { + "id": "u1", + "email": email, + "name": "Alice", + "is_admin": is_admin, + "groups": ["Everyone"], + } + + +# --------------------------------------------------------------------------- +# Default (no override) — renders a non-empty markdown string +# --------------------------------------------------------------------------- + +def test_compute_default_returns_non_empty(conn): + out = compute_default_claude_md(conn, user=_user(), server_url="https://example.com") + assert out.strip() != "" + + +def test_default_contains_server_url(conn): + out = compute_default_claude_md(conn, user=_user(), server_url="https://myagnes.example.com") + assert "https://myagnes.example.com" in out + + +def test_default_contains_user_reference(conn): + # The footer uses `user.name or user.email` — a user with no name falls back to email. 
+ user_no_name = {"id": "u1", "email": "bob@example.com", "name": "", "is_admin": False, "groups": []} + out = compute_default_claude_md(conn, user=user_no_name, server_url="https://example.com") + assert "bob@example.com" in out + + +def test_render_uses_default_when_no_override(conn): + out = render_claude_md(conn, user=_user(), server_url="https://example.com") + assert out.strip() != "" + + +# --------------------------------------------------------------------------- +# Override renders correctly +# --------------------------------------------------------------------------- + +def test_render_uses_override_when_set(conn): + ClaudeMdTemplateRepository(conn).set( + "# {{ instance.name }} Workspace\n\nHello {{ user.email }}.", + updated_by="admin@example.com", + ) + out = render_claude_md(conn, user=_user("charlie@example.com"), server_url="https://example.com") + assert "charlie@example.com" in out + + +def test_render_override_tables_list(conn): + # Seed a table registry entry and ensure the test user is an admin so + # RBAC filtering does not hide the table. 
+ conn.execute( + "INSERT INTO table_registry (id, name, description, query_mode, source_type) " + "VALUES ('t1', 'orders', 'All orders', 'local', 'keboola')" + ) + from src.repositories.users import UserRepository + from src.repositories.user_group_members import UserGroupMembersRepository + UserRepository(conn).create(id="u1", email="alice@example.com", name="Alice") + admin_gid = conn.execute("SELECT id FROM user_groups WHERE name='Admin'").fetchone()[0] + UserGroupMembersRepository(conn).add_member("u1", admin_gid, source="admin") + ClaudeMdTemplateRepository(conn).set( + "{% for t in tables %}- {{ t.name }}: {{ t.description }}{% endfor %}", + updated_by="admin@example.com", + ) + out = render_claude_md(conn, user=_user(), server_url="https://example.com") + assert "orders" in out + assert "All orders" in out + + +def test_render_override_metrics_summary(conn): + # Seed a metric definition — must include NOT NULL columns: display_name, sql + conn.execute( + "INSERT INTO metric_definitions (id, name, display_name, category, sql) " + "VALUES ('m1', 'mrr', 'MRR', 'revenue', 'SELECT SUM(amount)')" + ) + ClaudeMdTemplateRepository(conn).set( + "Metrics: {{ metrics.count }}, cats: {{ metrics.categories | join(', ') }}", + updated_by="admin@example.com", + ) + out = render_claude_md(conn, user=_user(), server_url="https://example.com") + assert "1" in out # 1 metric + assert "revenue" in out + + +# --------------------------------------------------------------------------- +# RBAC-filtered marketplaces — two users with different grants render differently +# --------------------------------------------------------------------------- + +def test_marketplaces_empty_for_user_with_no_grants(conn): + # No grants seeded — _marketplaces_for_user returns [] + ClaudeMdTemplateRepository(conn).set( + "{% if marketplaces %}HAS_PLUGINS{% else %}NO_PLUGINS{% endif %}", + updated_by="admin@example.com", + ) + out = render_claude_md(conn, user=_user(), 
server_url="https://example.com") + assert "NO_PLUGINS" in out + + +# --------------------------------------------------------------------------- +# Anonymous / minimal user context +# --------------------------------------------------------------------------- + +def test_render_with_minimal_user_context(conn): + """Templates referencing user fields must work with minimal user dict.""" + ClaudeMdTemplateRepository(conn).set( + "User: {{ user.email }}, admin: {{ user.is_admin }}", + updated_by="admin@example.com", + ) + out = render_claude_md(conn, user=_user(), server_url="https://example.com") + assert "alice@example.com" in out + assert "False" in out + + +# --------------------------------------------------------------------------- +# Build context shape +# --------------------------------------------------------------------------- + +def test_context_exposes_all_documented_keys(conn): + ctx = build_claude_md_context(conn, user=_user(), server_url="https://example.com") + for key in ("instance", "server", "sync_interval", "data_source", "tables", "metrics", "marketplaces", "user", "now", "today"): + assert key in ctx, f"missing context key: {key}" + + +def test_context_tables_is_list(conn): + ctx = build_claude_md_context(conn, user=_user(), server_url="https://example.com") + assert isinstance(ctx["tables"], list) + + +def test_context_metrics_shape(conn): + ctx = build_claude_md_context(conn, user=_user(), server_url="https://example.com") + assert "count" in ctx["metrics"] + assert "categories" in ctx["metrics"] + + +def test_context_marketplaces_is_list(conn): + ctx = build_claude_md_context(conn, user=_user(), server_url="https://example.com") + assert isinstance(ctx["marketplaces"], list) + + +# --------------------------------------------------------------------------- +# Render failure raises (caller handles) +# --------------------------------------------------------------------------- + +def test_render_raises_on_template_error(conn): + 
ClaudeMdTemplateRepository(conn).set( + "{{ does_not_exist }}", updated_by="admin@example.com" + ) + with pytest.raises(TemplateError): + render_claude_md(conn, user=_user(), server_url="https://example.com") + + +# --------------------------------------------------------------------------- +# RBAC-filtered tables — two users with different grants see different tables +# --------------------------------------------------------------------------- + +def _make_user(conn, *, user_id: str, email: str) -> None: + from src.repositories.users import UserRepository + UserRepository(conn).create(id=user_id, email=email, name=email.split("@")[0]) + + +def _make_group(conn, *, name: str) -> str: + from src.repositories.user_groups import UserGroupsRepository + return UserGroupsRepository(conn).create(name=name)["id"] + + +def _add_member(conn, *, user_id: str, group_id: str) -> None: + from src.repositories.user_group_members import UserGroupMembersRepository + UserGroupMembersRepository(conn).add_member(user_id, group_id, source="admin") + + +def _grant_table(conn, *, group_id: str, table_id: str) -> None: + from src.repositories.resource_grants import ResourceGrantsRepository + ResourceGrantsRepository(conn).create( + group_id=group_id, resource_type="table", resource_id=table_id + ) + + +def test_render_tables_filtered_by_rbac(conn): + """Non-admin users see only tables granted to their groups.""" + # Seed two tables + conn.execute( + "INSERT INTO table_registry (id, name, description, query_mode, source_type) " + "VALUES ('t-a', 'orders', 'Order data', 'local', 'keboola')" + ) + conn.execute( + "INSERT INTO table_registry (id, name, description, query_mode, source_type) " + "VALUES ('t-b', 'revenue', 'Revenue data', 'local', 'keboola')" + ) + + # Two users, two groups + _make_user(conn, user_id="ua", email="alice@example.com") + _make_user(conn, user_id="ub", email="bob@example.com") + gid_a = _make_group(conn, name="group-a") + gid_b = _make_group(conn, name="group-b") + 
_add_member(conn, user_id="ua", group_id=gid_a) + _add_member(conn, user_id="ub", group_id=gid_b) + + # Grant: group-a → t-a, group-b → t-b + _grant_table(conn, group_id=gid_a, table_id="t-a") + _grant_table(conn, group_id=gid_b, table_id="t-b") + + user_a = {"id": "ua", "email": "alice@example.com", "name": "Alice", "is_admin": False, "groups": []} + user_b = {"id": "ub", "email": "bob@example.com", "name": "Bob", "is_admin": False, "groups": []} + + ctx_a = build_claude_md_context(conn, user=user_a, server_url="https://example.com") + table_names_a = {t["name"] for t in ctx_a["tables"]} + assert "orders" in table_names_a + assert "revenue" not in table_names_a + + ctx_b = build_claude_md_context(conn, user=user_b, server_url="https://example.com") + table_names_b = {t["name"] for t in ctx_b["tables"]} + assert "revenue" in table_names_b + assert "orders" not in table_names_b + + +def test_render_tables_admin_sees_all(conn): + """Admin users see all tables regardless of grants.""" + conn.execute( + "INSERT INTO table_registry (id, name, description, query_mode, source_type) " + "VALUES ('t-x', 'alpha', 'Alpha table', 'local', 'keboola')" + ) + conn.execute( + "INSERT INTO table_registry (id, name, description, query_mode, source_type) " + "VALUES ('t-y', 'beta', 'Beta table', 'local', 'keboola')" + ) + + # Admin user: member of the Admin system group + _make_user(conn, user_id="u-admin", email="admin@example.com") + admin_gid = conn.execute("SELECT id FROM user_groups WHERE name='Admin'").fetchone()[0] + _add_member(conn, user_id="u-admin", group_id=admin_gid) + + user_admin = {"id": "u-admin", "email": "admin@example.com", "name": "Admin", "is_admin": True, "groups": []} + ctx = build_claude_md_context(conn, user=user_admin, server_url="https://example.com") + table_names = {t["name"] for t in ctx["tables"]} + assert "alpha" in table_names + assert "beta" in table_names + + +def test_render_tables_empty_for_user_with_no_grants(conn): + """Non-admin with no grants 
sees no tables.""" + conn.execute( + "INSERT INTO table_registry (id, name, description, query_mode, source_type) " + "VALUES ('t-z', 'secret', 'Secret table', 'local', 'keboola')" + ) + _make_user(conn, user_id="u-none", email="none@example.com") + user_none = {"id": "u-none", "email": "none@example.com", "name": "None", "is_admin": False, "groups": []} + ctx = build_claude_md_context(conn, user=user_none, server_url="https://example.com") + assert ctx["tables"] == [] diff --git a/tests/test_claude_md_template_repo.py b/tests/test_claude_md_template_repo.py new file mode 100644 index 0000000..309047e --- /dev/null +++ b/tests/test_claude_md_template_repo.py @@ -0,0 +1,40 @@ +"""Unit tests for ClaudeMdTemplateRepository.""" + +import duckdb +import pytest + +from src.db import _ensure_schema +from src.repositories.claude_md_template import ClaudeMdTemplateRepository + + +@pytest.fixture +def conn(tmp_path): + db_path = tmp_path / "system.duckdb" + c = duckdb.connect(str(db_path)) + _ensure_schema(c) + yield c + c.close() + + +def test_get_returns_null_content_on_fresh_install(conn): + repo = ClaudeMdTemplateRepository(conn) + row = repo.get() + assert row is not None + assert row["content"] is None # default sentinel + + +def test_set_stores_content(conn): + repo = ClaudeMdTemplateRepository(conn) + repo.set("# {{ instance.name }}", updated_by="admin@example.com") + row = repo.get() + assert row["content"] == "# {{ instance.name }}" + assert row["updated_by"] == "admin@example.com" + assert row["updated_at"] is not None + + +def test_reset_clears_content(conn): + repo = ClaudeMdTemplateRepository(conn) + repo.set("custom template", updated_by="admin@example.com") + repo.reset(updated_by="admin@example.com") + row = repo.get() + assert row["content"] is None diff --git a/tests/test_db_schema_version.py b/tests/test_db_schema_version.py index 01551f0..b2bfc1a 100644 --- a/tests/test_db_schema_version.py +++ b/tests/test_db_schema_version.py @@ -13,8 +13,8 @@ import duckdb 
from src.db import SCHEMA_VERSION, _ensure_schema, get_schema_version -def test_schema_version_is_22(): - assert SCHEMA_VERSION == 22 +def test_schema_version_is_23(): + assert SCHEMA_VERSION == 23 def test_v20_adds_source_query(tmp_path): @@ -29,7 +29,29 @@ def test_v20_adds_source_query(tmp_path): ).fetchall() } assert "source_query" in cols, f"source_query missing from {cols}" - assert get_schema_version(conn) == 22 + assert get_schema_version(conn) == 23 + conn.close() + + +def test_v23_adds_claude_md_template(tmp_path): + """v23 must create the claude_md_template singleton table.""" + db_path = tmp_path / "system.duckdb" + conn = duckdb.connect(str(db_path)) + _ensure_schema(conn) + + tables = { + r[0] for r in conn.execute( + "SELECT table_name FROM information_schema.tables " + "WHERE table_schema = 'main'" + ).fetchall() + } + assert "claude_md_template" in tables, f"claude_md_template missing from {tables}" + + # Singleton row seeded + row = conn.execute("SELECT id, content FROM claude_md_template WHERE id = 1").fetchone() + assert row is not None + assert row[0] == 1 + assert row[1] is None # default = no override conn.close() @@ -61,7 +83,7 @@ def test_v19_db_migrates_to_v20(tmp_path): _ensure_schema(conn) - assert get_schema_version(conn) == 22 + assert get_schema_version(conn) == 23 cols = { r[0] for r in conn.execute( "SELECT column_name FROM information_schema.columns " diff --git a/tests/test_welcome_template_api.py b/tests/test_welcome_template_api.py index c3f9fda..7320ee3 100644 --- a/tests/test_welcome_template_api.py +++ b/tests/test_welcome_template_api.py @@ -1,7 +1,7 @@ """End-to-end tests for /api/admin/welcome-template (banner editor endpoints). -GET /api/welcome has been removed — the analyst-facing endpoint is gone. -These tests cover only the admin CRUD + preview endpoints. +These tests cover the admin CRUD + preview endpoints for the Agent Setup Prompt. +GET /api/welcome is handled by test_claude_md_api.py (Agent Workspace Prompt). 
""" import duckdb @@ -14,8 +14,8 @@ def _auth(token: str) -> dict[str, str]: return {"Authorization": f"Bearer {token}"} -def test_get_welcome_endpoint_removed(seeded_app): - """GET /api/welcome must return 404 — the endpoint was deleted.""" +def test_get_welcome_endpoint_exists(seeded_app): + """GET /api/welcome must return 200 for authenticated analysts (endpoint restored).""" c = seeded_app["client"] token = seeded_app["analyst_token"] resp = c.get( @@ -23,7 +23,8 @@ def test_get_welcome_endpoint_removed(seeded_app): params={"server_url": "https://example.com"}, headers=_auth(token), ) - assert resp.status_code == 404 + assert resp.status_code == 200 + assert "content" in resp.json() def test_admin_get_template_initially_null(seeded_app):