From fb8f55c335bb654268f82a54a23cd2d12d0dbdf0 Mon Sep 17 00:00:00 2001 From: ZdenekSrotyr Date: Mon, 4 May 2026 15:22:10 +0200 Subject: [PATCH] docs(plan): clean-analyst-bootstrap implementation plan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 25 tasks across 6 phases: - Phase 1: server-side foundation (PAT scope/TTL, legacy-strings scan, /setup?role= branching, claude_md_template) - Phase 2: cli/lib/ shared-library tree (hooks.py, pull.py) - Phase 3: new commands (init, pull, push, status, diagnose system, snapshot create, catalog --metrics, admin metrics) - Phase 4: wiring + delete obsolete (sync, fetch, analyst, metrics) - Phase 5: test fixtures + reader smoke matrix + clean-install integration - Phase 6: CHANGELOG + final verification TDD discipline throughout: test → fail → implement → pass → commit per task. --- .../2026-05-04-clean-analyst-bootstrap.md | 3371 +++++++++++++++++ 1 file changed, 3371 insertions(+) create mode 100644 docs/superpowers/plans/2026-05-04-clean-analyst-bootstrap.md diff --git a/docs/superpowers/plans/2026-05-04-clean-analyst-bootstrap.md b/docs/superpowers/plans/2026-05-04-clean-analyst-bootstrap.md new file mode 100644 index 0000000..d1777d9 --- /dev/null +++ b/docs/superpowers/plans/2026-05-04-clean-analyst-bootstrap.md @@ -0,0 +1,3371 @@ +# Clean Analyst Bootstrap Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Replace the interactive `da analyst setup` flow with a single web→paste→done bootstrap. New analyst pastes a clipboard prompt from `/setup?role=analyst` into Claude Code in an empty folder, and ends up with `CLAUDE.md`, `AGNES_WORKSPACE.md`, hooks, fresh data, and DuckDB views — fully ready to query. Drop dead workspace dirs (`data/parquet/`, `data/duckdb/`, `data/metadata/`, `user/artifacts/`). Establish a lazy-mkdir contract so nothing creates empty directories. + +**Architecture:** PAT-only auth. `da init` is a thin orchestrator that auths, fetches `CLAUDE.md` from `/api/welcome`, installs hooks, and calls `cli/lib/pull.py:run_pull` for first data refresh. CLI verbs renamed: `init/pull/push/status/snapshot create` (greenfield, no aliases). Server-side install prompt branches on `role` query param. `cli/lib/` shared library tree separates data primitives from Typer wrappers so `da init` can call them without subprocess. Reader contract: every reader handles missing dirs gracefully (exit 0 empty or exit 1 with friendly hint, never traceback). + +**Tech Stack:** Python 3.11+, FastAPI, Typer, Pydantic, DuckDB, httpx, pytest, Hatchling. + +**Spec:** `docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md` (revision 4, cleared for implementation). + +--- + +## File Structure + +### New files + +| Path | Responsibility | +|---|---| +| `cli/lib/__init__.py` | Empty — makes `cli/lib/` a package so Hatchling includes it in the wheel. | +| `cli/lib/pull.py` | `run_pull(server_url, token, workspace, *, dry_run) -> PullResult` — pure-function data refresh primitive (manifest, parquet download, DuckDB rebuild, memory bundle write). Lazy mkdir. | +| `cli/lib/hooks.py` | `install_claude_hooks(workspace)` — idempotent SessionStart/End hook installer for `/.claude/settings.json`. | +| `cli/commands/init.py` | `da init` Typer command — auth check, save config, write CLAUDE.md, install hooks, call `run_pull`, write `AGNES_WORKSPACE.md`. | +| `cli/commands/pull.py` | `da pull` Typer wrapper around `cli/lib/pull.py:run_pull`. Flags `--quiet`, `--json`, `--dry-run`. | +| `cli/commands/push.py` | `da push` Typer command — uploads `user/sessions/*.jsonl` and `.claude/CLAUDE.local.md`. | +| `cli/commands/admin_metrics.py` | `da admin metrics {import,export,validate}` sub-Typer (lifted from `cli/commands/metrics.py`). | +| `config/agnes_workspace_template.txt` | Static client-side template for `AGNES_WORKSPACE.md`. Three placeholders: `{created_at}`, `{server_url}`, `{workspace_path}`. | +| `tests/fixtures/analyst_bootstrap.py` | Test fixtures: `fastapi_test_server`, `test_pat`, `test_pat_no_grants`, `zero_grants_workspace`, `web_session`. | +| `tests/test_lib_hooks.py` | Tests for `install_claude_hooks` (idempotent, preserves third-party hooks, replaces old `da pull`/`da sync` entries). | +| `tests/test_lib_pull.py` | Tests for `run_pull` (lazy mkdir, partial failure handling, manifest empty case). | +| `tests/test_setup_instructions_analyst.py` | Tests `render_setup_instructions(role="analyst")` produces correct steps. | +| `tests/test_tokens_bootstrap_scope.py` | Tests `scope=bootstrap-analyst` PATs are TTL-clamped to ≤ 1 h; `ttl_seconds` upper bound; `ttl_seconds` wins over `expires_in_days`. | +| `tests/test_legacy_strings_scan.py` | Tests `_scan_legacy_strings` and `legacy_strings_detected` field on `GET /api/admin/workspace-prompt-template`. | +| `tests/test_clean_install_integration.py` | End-to-end clean-install integration tests (minimal grants, zero grants, force preserves, AGNES_WORKSPACE.md content). | +| `tests/test_reader_smoke_matrix.py` | Reader smoke matrix — every CLI command on a freshly-bootstrapped zero-grants workspace, asserts no traceback. | + +### Modified files + +| Path | Change | +|---|---| +| `app/api/tokens.py` | `CreateTokenRequest`: add `scope: str = "general"` and `ttl_seconds: Optional[int] = None`. Validate `ttl_seconds <= 315_360_000`. Resolution: `ttl_seconds` wins; fall back to `expires_in_days`. For `scope == "bootstrap-analyst"`, force-clamp resolved TTL ≤ 3600 s. Audit-log includes scope. | +| `app/api/claude_md.py` | Add module-level `_LEGACY_STRINGS = ("data/parquet", "da sync", "da fetch", "da analyst setup", "da metrics list", "da metrics show")`. Add helper `_scan_legacy_strings(text) -> list[str]`. Add field `legacy_strings_detected: list[str] = []` to `TemplateGetResponse`. Populate in `admin_get_workspace_template`. | +| `app/web/setup_instructions.py` | Add `role: Literal["analyst","admin"] = "admin"` to `resolve_lines()` and `render_setup_instructions()`. Analyst layout: TLS trust (when `ca_pem`) → install `da` → `da init --server-url X --token Y --workspace .` → `da catalog` smoke verify → confirm. Drop for analyst: marketplace, plugins, skills, diagnose, login, whoami. | +| `app/web/router.py` | `setup_page`: read `role` query param (default `"admin"`), pass to `render_setup_instructions(role=...)`. | +| `app/web/templates/setup.html` (or wherever `setup_page` renders) | Two role tiles (Analyst / Admin), POST `/auth/tokens` with matching `scope`. | +| `app/web/templates/admin_workspace_prompt.html` | Yellow banner above editor when `legacy_strings_detected` non-empty. | +| `config/claude_md_template.txt` | Update verb names: `da sync` → `da pull`, `da fetch` → `da snapshot create`, `da metrics list/show` → `da catalog --metrics`, `da analyst setup` → `da init`. Path strings: `data/parquet/` → `server/parquet/`, `data/duckdb/...` → `user/duckdb/analytics.duckdb`. | +| `cli/commands/snapshot.py` | Add `create` subcommand — moves logic from `cli/commands/fetch.py` verbatim. Add `if not db_path.exists(): exit 1` guard before `duckdb.connect()`. | +| `cli/commands/catalog.py` | Add `--metrics` flag (replaces `da metrics list`); `--metrics --show ` (replaces `da metrics show`). | +| `cli/commands/admin.py` | Register the new `admin_metrics_app` sub-Typer. | +| `cli/commands/query.py` | Update hint text "Run: da sync" → "Run: da pull" in two places. | +| `cli/commands/explore.py` | Update hint text "Run: da sync" → "Run: da pull". | +| `cli/main.py` | Drop registrations for `sync`, `analyst`, `metrics`, `fetch`, `status` (existing). Add `init`, `pull`, `push`. Re-register `status` to point at new workspace-status command. | +| `CLAUDE.md` (repo root) | Verb + path rewrites throughout. The "Local sync & Claude Code hooks" subsection rewrites verbatim with new commands. The "Querying Agnes data — agent rails" subsection keeps the 0.32.0 `query_mode='materialized'` and `query_mode='remote'` cost-guardrail prose verbatim, just verb-renaming `da fetch` → `da snapshot create`. | +| `CHANGELOG.md` | Entry under `[Unreleased]` per spec preview (Changed/Added/Fixed/Removed/Kept). | +| `pyproject.toml` | No change; `cli/lib/__init__.py` triggers Hatchling auto-discovery. | + +### Deleted files + +| Path | Reason | +|---|---| +| `cli/commands/sync.py` | Replaced by `cli/commands/pull.py` + `cli/commands/push.py` + `cli/lib/pull.py`. | +| `cli/commands/fetch.py` | Folded into `cli/commands/snapshot.py:create`. | +| `cli/commands/analyst.py` | Replaced by `cli/commands/init.py` + new `cli/commands/status.py` (workspace status). `_install_claude_hooks` lifted to `cli/lib/hooks.py`. | +| `cli/commands/metrics.py` | Read paths fold into `da catalog --metrics`; write paths move to `cli/commands/admin_metrics.py`. | + +### Existing `cli/commands/status.py` rename + +| Action | Detail | +|---|---| +| Existing `da status` ("System status") | Renamed to `da diagnose system` (subcommand under `diagnose_app`) — its content is a subset of what `da diagnose` already does. | +| New `da status` | Workspace status — fresh implementation, replaces `da analyst status`. Lives in `cli/commands/status.py` (overwrite). | + +--- + +## Phase 1 — Server-side foundation (PAT scope, legacy-strings scan, install-prompt branching) + +### Task 1: Add `scope` + `ttl_seconds` fields to `CreateTokenRequest` + +**Files:** +- Modify: `app/api/tokens.py:23-25` (`CreateTokenRequest`), `app/api/tokens.py:85-101` (`create_token` route) +- Test: `tests/test_tokens_bootstrap_scope.py` (new) + +- [ ] **Step 1: Write failing tests** + +```python +# tests/test_tokens_bootstrap_scope.py +"""Tests for PAT scope + ttl_seconds fields (clean-analyst-bootstrap spec).""" + +from __future__ import annotations + +import jwt +import pytest + + +@pytest.fixture +def web_session(client, db_with_admin_user): + """Authenticated test client with session cookie for admin user.""" + # Form-login endpoint — see fixtures/analyst_bootstrap.py + resp = client.post("/auth/password/login/web", + data={"email": "admin@example.com", "password": "test-password"}) + assert resp.status_code in (200, 302), f"login failed: {resp.text}" + return client + + +def _decode(pat: str) -> dict: + return jwt.decode(pat, options={"verify_signature": False}) + + +def test_bootstrap_pat_ttl_clamped_to_one_hour(web_session): + resp = web_session.post("/auth/tokens", json={ + "name": "init", + "scope": "bootstrap-analyst", + "ttl_seconds": 86400, # 1 day — must be ignored, clamped to 3600 + }) + assert resp.status_code == 201, resp.text + payload = _decode(resp.json()["token"]) + assert payload.get("scope") == "bootstrap-analyst" + assert payload["exp"] - payload["iat"] <= 3600 + 5 + + +def test_general_pat_uses_ttl_seconds_when_set(web_session): + resp = web_session.post("/auth/tokens", json={ + "name": "test", + "ttl_seconds": 7200, # 2 hours + }) + assert resp.status_code == 201 + payload = _decode(resp.json()["token"]) + assert payload["exp"] - payload["iat"] <= 7200 + 5 + + +def test_general_pat_falls_back_to_expires_in_days(web_session): + resp = web_session.post("/auth/tokens", json={ + "name": "test", "expires_in_days": 30, + }) + assert resp.status_code == 201 + payload = _decode(resp.json()["token"]) + assert payload["exp"] - payload["iat"] <= 30 * 86400 + 5 + + +def test_ttl_seconds_upper_bound(web_session): + # 3650 days * 86400 = 315_360_000 seconds. One past this must reject. + resp = web_session.post("/auth/tokens", json={ + "name": "test", "ttl_seconds": 315_360_001, + }) + assert resp.status_code == 400 + + +def test_ttl_seconds_must_be_positive(web_session): + resp = web_session.post("/auth/tokens", json={ + "name": "test", "ttl_seconds": 0, + }) + assert resp.status_code == 400 + + +def test_scope_default_is_general(web_session): + resp = web_session.post("/auth/tokens", json={"name": "test"}) + assert resp.status_code == 201 + payload = _decode(resp.json()["token"]) + # scope=general is informational; check audit_log carries it + # (skipped here — tested in test_audit_log_includes_scope below) + assert payload.get("scope", "general") == "general" +``` + +The `db_with_admin_user` fixture is part of the existing test suite or will be added in `tests/fixtures/analyst_bootstrap.py` (Task 22). For now, this test depends on it; if it doesn't exist, mark these as `pytest.skip` until Task 22. + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +cd "$(git rev-parse --show-toplevel)" +pytest tests/test_tokens_bootstrap_scope.py -v +``` + +Expected: tests FAIL with either fixture-missing error or `extra fields not permitted` from Pydantic (if the fixture exists). + +- [ ] **Step 3: Update `CreateTokenRequest` model** + +Replace `app/api/tokens.py:23-25`: + +```python +class CreateTokenRequest(BaseModel): + name: str + expires_in_days: Optional[int] = 90 # null = no expiry + scope: str = "general" # informational; "bootstrap-analyst" force-clamps TTL ≤ 1 h + ttl_seconds: Optional[int] = None # if set, wins over expires_in_days +``` + +- [ ] **Step 4: Update `create_token` route** + +Replace `app/api/tokens.py:85-118` (the `create_token` function body up through the `jwt_token = create_access_token(...)` call): + +```python +@router.post("", response_model=CreateTokenResponse, status_code=201) +async def create_token( + payload: CreateTokenRequest, + user: dict = Depends(require_session_token), + conn: duckdb.DuckDBPyConnection = Depends(_get_db), +): + if not payload.name.strip(): + raise HTTPException(status_code=400, detail="name is required") + if payload.expires_in_days is not None and payload.expires_in_days <= 0: + raise HTTPException(status_code=400, detail="expires_in_days must be a positive integer") + if payload.expires_in_days is not None and payload.expires_in_days > 3650: + raise HTTPException(status_code=400, detail="expires_in_days must not exceed 3650 (10 years)") + if payload.ttl_seconds is not None and payload.ttl_seconds <= 0: + raise HTTPException(status_code=400, detail="ttl_seconds must be a positive integer") + # Mirror the 3650-day cap on ttl_seconds so a hostile client can't + # bypass via field rename. 3650 days * 86400 = 315_360_000. + if payload.ttl_seconds is not None and payload.ttl_seconds > 315_360_000: + raise HTTPException(status_code=400, detail="ttl_seconds must not exceed 315360000 (10 years)") + + # Resolve TTL: ttl_seconds wins; fall back to expires_in_days. + expires_delta: Optional[timedelta] = None + omit_exp = False + if payload.ttl_seconds is not None: + expires_delta = timedelta(seconds=payload.ttl_seconds) + elif payload.expires_in_days is not None: + expires_delta = timedelta(days=payload.expires_in_days) + else: + omit_exp = True # "no expiry" + + # Force-clamp bootstrap-analyst PATs to ≤ 1 h regardless of request. + if payload.scope == "bootstrap-analyst": + ONE_HOUR = timedelta(hours=1) + if expires_delta is None or expires_delta > ONE_HOUR: + expires_delta = ONE_HOUR + omit_exp = False + + expires_at: Optional[datetime] = None + if expires_delta is not None: + expires_at = datetime.now(timezone.utc) + expires_delta + + repo = AccessTokenRepository(conn) + token_id = str(uuid.uuid4()) + + jwt_token = create_access_token( + user_id=user["id"], email=user["email"], + token_id=token_id, typ="pat", + expires_delta=expires_delta, omit_exp=omit_exp, + extra_claims={"scope": payload.scope}, + ) +``` + +- [ ] **Step 5: Update `create_access_token` to accept `extra_claims`** + +Find `app/auth/jwt.py:create_access_token` (read it first to get the current signature). Add an `extra_claims: dict | None = None` parameter that gets merged into the JWT payload before encoding. Show your edit: + +```bash +grep -n "def create_access_token" app/auth/jwt.py +# Read the function and update. +``` + +If the function already supports extra claims, this is a no-op. Otherwise add: + +```python +def create_access_token( + user_id: str, email: str, token_id: Optional[str] = None, + typ: str = "session", expires_delta: Optional[timedelta] = None, + omit_exp: bool = False, extra_claims: Optional[dict] = None, +) -> str: + payload = {"sub": user_id, "email": email, "typ": typ} + if token_id: + payload["jti"] = token_id + if not omit_exp: + payload["iat"] = int(datetime.now(timezone.utc).timestamp()) + if expires_delta: + payload["exp"] = int((datetime.now(timezone.utc) + expires_delta).timestamp()) + if extra_claims: + payload.update(extra_claims) + return jwt.encode(payload, _SECRET, algorithm="HS256") +``` + +(Adapt to the actual function shape after reading it.) + +- [ ] **Step 6: Update audit-log entry to include scope** + +Search `app/api/tokens.py` for the `_audit(...)` call inside `create_token` and add `scope` to the `params` dict: + +```python +_audit(conn, actor=user["id"], action="token.create", + target=token_id, + params={"name": payload.name, + "expires_at": str(expires_at) if expires_at else None, + "scope": payload.scope}) +``` + +- [ ] **Step 7: Run tests to verify they pass** + +```bash +pytest tests/test_tokens_bootstrap_scope.py -v +``` + +Expected: all PASS (or skip if `db_with_admin_user` fixture doesn't yet exist; in that case the failure mode is a clear fixture-not-found error, not a logic error). + +- [ ] **Step 8: Run the full token test suite to verify no regression** + +```bash +pytest tests/ -k token -v +``` + +Expected: all token-related tests PASS. + +- [ ] **Step 9: Commit** + +```bash +git add app/api/tokens.py app/auth/jwt.py tests/test_tokens_bootstrap_scope.py +git commit -m "feat(tokens): add scope + ttl_seconds fields with bootstrap-analyst clamp" +``` + +--- + +### Task 2: Add `_LEGACY_STRINGS` scan to admin workspace-prompt endpoint + +**Files:** +- Modify: `app/api/claude_md.py` (add `_LEGACY_STRINGS`, `_scan_legacy_strings`, augment `TemplateGetResponse`, populate in `admin_get_workspace_template`) +- Test: `tests/test_legacy_strings_scan.py` (new) + +- [ ] **Step 1: Write failing tests** + +```python +# tests/test_legacy_strings_scan.py +"""Tests for legacy-string scan in admin CLAUDE.md template endpoint.""" + +from app.api.claude_md import _scan_legacy_strings, _LEGACY_STRINGS + + +def test_scan_finds_all_known_legacy_strings(): + text = """ + Run `da sync` then `da fetch web_sessions --where ...`. + Old workspace at data/parquet/ — see `da analyst setup`. + Use `da metrics list` and `da metrics show `. + """ + hits = _scan_legacy_strings(text) + assert "da sync" in hits + assert "da fetch" in hits + assert "data/parquet" in hits + assert "da analyst setup" in hits + assert "da metrics list" in hits + assert "da metrics show" in hits + + +def test_scan_returns_empty_for_clean_text(): + text = "Use `da pull` to refresh, `da snapshot create` for ad-hoc, `server/parquet/`." + assert _scan_legacy_strings(text) == [] + + +def test_scan_returns_unique_sorted_hits(): + text = "da sync da sync data/parquet/ data/parquet/foo" + hits = _scan_legacy_strings(text) + assert hits == sorted(set(hits)) + + +def test_legacy_strings_constant_shape(): + assert isinstance(_LEGACY_STRINGS, tuple) + assert all(isinstance(s, str) for s in _LEGACY_STRINGS) + assert "da sync" in _LEGACY_STRINGS + assert "data/parquet" in _LEGACY_STRINGS +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +pytest tests/test_legacy_strings_scan.py -v +``` + +Expected: FAIL with `ImportError: cannot import name '_scan_legacy_strings' from 'app.api.claude_md'`. + +- [ ] **Step 3: Add `_LEGACY_STRINGS` and `_scan_legacy_strings` to `app/api/claude_md.py`** + +Insert near the other module-level constants (after the imports, before the class definitions — find a stable location, e.g., right before `class ClaudeMdResponse`): + +```python +# Substrings that, when found in an admin-saved CLAUDE.md override, signal +# the override is stale relative to the post-clean-bootstrap CLI surface. +# Surfaced via TemplateGetResponse.legacy_strings_detected so the admin UI +# can render a yellow banner prompting re-authoring. +_LEGACY_STRINGS = ( + "data/parquet", + "da sync", + "da fetch", + "da analyst setup", + "da metrics list", + "da metrics show", +) + + +def _scan_legacy_strings(text: str) -> list[str]: + """Return sorted unique substrings from _LEGACY_STRINGS present in text.""" + return sorted({s for s in _LEGACY_STRINGS if s in text}) +``` + +- [ ] **Step 4: Run tests to verify they pass** + +```bash +pytest tests/test_legacy_strings_scan.py -v +``` + +Expected: all PASS. + +- [ ] **Step 5: Augment `TemplateGetResponse`** + +Find `class TemplateGetResponse` (around `app/api/claude_md.py:72-76`) and add the field: + +```python +class TemplateGetResponse(BaseModel): + content: Optional[str] + default: str + updated_at: Optional[str] = None + updated_by: Optional[str] = None + legacy_strings_detected: list[str] = [] # populated when override contains stale verbs/paths +``` + +- [ ] **Step 6: Populate the field in `admin_get_workspace_template`** + +Find the route (search for `admin_get_workspace_template` in `app/api/claude_md.py`). Inside the function body, before constructing the response, add: + +```python +# Scan the saved override (not the live default) for legacy strings. +# A non-empty list triggers the yellow banner in the admin UI. +override_text = override.content if override else "" +legacy_hits = _scan_legacy_strings(override_text) +``` + +Then include `legacy_strings_detected=legacy_hits` in the `TemplateGetResponse(...)` construction. + +- [ ] **Step 7: Add an HTTP test for the populated field** + +Append to `tests/test_legacy_strings_scan.py`: + +```python +def test_admin_get_template_returns_legacy_strings_when_override_dirty(web_session): + """Setting an override containing legacy strings populates the field.""" + web_session.put("/api/admin/workspace-prompt-template", + json={"content": "Run `da sync` and check data/parquet/."}) + resp = web_session.get("/api/admin/workspace-prompt-template") + assert resp.status_code == 200 + body = resp.json() + assert "da sync" in body["legacy_strings_detected"] + assert "data/parquet" in body["legacy_strings_detected"] + + +def test_admin_get_template_returns_empty_when_clean(web_session): + web_session.put("/api/admin/workspace-prompt-template", + json={"content": "Use `da pull` and check `server/parquet/`."}) + resp = web_session.get("/api/admin/workspace-prompt-template") + assert resp.json()["legacy_strings_detected"] == [] +``` + +These depend on `web_session` fixture from Task 22; mark `pytest.skip` if not yet present. + +- [ ] **Step 8: Run all `claude_md` tests** + +```bash +pytest tests/test_legacy_strings_scan.py tests/ -k claude_md -v +``` + +Expected: PASS (skip the HTTP tests if fixture missing — that's OK). + +- [ ] **Step 9: Commit** + +```bash +git add app/api/claude_md.py tests/test_legacy_strings_scan.py +git commit -m "feat(admin): scan CLAUDE.md override for legacy strings" +``` + +--- + +### Task 3: Add `role` parameter to `setup_instructions.py` (analyst branch) + +**Files:** +- Modify: `app/web/setup_instructions.py` (add `role` parameter to `resolve_lines` and `render_setup_instructions`; add analyst-branch helper) +- Test: `tests/test_setup_instructions_analyst.py` (new) + +- [ ] **Step 1: Write failing tests** + +```python +# tests/test_setup_instructions_analyst.py +"""Tests for analyst-branch rendering of /setup paste prompt.""" + +from app.web.setup_instructions import render_setup_instructions + + +def test_render_analyst_role_basic(): + text = render_setup_instructions( + server_url="https://agnes.example.com", + token="agnes_pat_TEST", + wheel_filename="agnes-0.32.0-py3-none-any.whl", + role="analyst", + ) + # Required content for analyst role: + assert "uv tool install" in text + assert "da init" in text + assert "--token" in text and "agnes_pat_TEST" in text + assert "--server-url" in text and "https://agnes.example.com" in text + assert "da catalog" in text # smoke verify step + # Forbidden content (admin-only): + assert "marketplace" not in text + assert "claude plugin install" not in text + assert "da skills install" not in text # analyst doesn't bulk-install skills + assert "da diagnose" not in text # analyst smoke verify is `da catalog`, not diagnose + + +def test_render_admin_role_unchanged(): + """Default role=admin keeps the existing 6/8-step layout.""" + text = render_setup_instructions( + server_url="https://agnes.example.com", + token="agnes_pat_TEST", + wheel_filename="agnes-0.32.0-py3-none-any.whl", + # role omitted — defaults to "admin" + ) + assert "da auth import-token" in text # admin uses import-token, not da init + assert "da diagnose" in text # admin keeps diagnose + + +def test_render_analyst_with_ca_pem(): + """Analyst role + private CA → TLS trust block reused from admin path.""" + text = render_setup_instructions( + server_url="https://agnes.example.com", + token="agnes_pat_TEST", + wheel_filename="agnes-0.32.0-py3-none-any.whl", + role="analyst", + ca_pem="-----BEGIN CERTIFICATE-----\nMIIBxxx\n-----END CERTIFICATE-----", + ) + assert "AGNES_CA_PEM" in text # heredoc marker from trust block + assert "ca-bundle.pem" in text + assert "da init" in text # analyst-specific step still present +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +pytest tests/test_setup_instructions_analyst.py -v +``` + +Expected: FAIL — `render_setup_instructions()` doesn't accept `role` parameter. + +- [ ] **Step 3: Add analyst-branch helper functions** + +Insert after `_install_cli_lines` (around line 311 in `setup_instructions.py`): + +```python +def _analyst_init_lines(server_url_placeholder: str = "{server_url}") -> list[str]: + """Steps 2-3 — `da init` (auth + workspace bootstrap) + smoke verify. + + Replaces the admin-flow login + verify steps (today: `da auth import-token` + + `da auth whoami`). `da init` is non-interactive: `--token` carries the PAT, + `--server-url` carries the origin. The bootstrap PAT has a 1 h TTL — if the + user takes longer than that to paste this prompt, the init call returns 401 + and the user re-clicks "Generate prompt" on the install page. + """ + return [ + "", + "2) Bootstrap your analyst workspace in this directory:", + f" da init --server-url \"{server_url_placeholder}\" --token \"{{token}}\" --workspace .", + "", + " This authenticates with the PAT, fetches your CLAUDE.md (RBAC-filtered),", + " installs Claude Code SessionStart/End hooks (auto-refresh), and runs an", + " initial `da pull` so your DuckDB views are ready.", + "", + "3) Verify the data is queryable:", + " da catalog", + "", + " This should list the tables your account has grants for. Empty list", + " means your admin hasn't granted you access yet — contact them.", + ] + + +def _analyst_finale_lines(confirm_step_num: str, has_ca: bool) -> list[str]: + """Final Confirm step for analyst role. Shorter than admin: no marketplace, + no plugins, no skills.""" + bullets = [ + " - `da --version` output", + " - First few lines of `da catalog` (tables you can see)", + " - Confirmation that `./CLAUDE.md` and `./AGNES_WORKSPACE.md` exist", + " - Confirmation that `./.claude/settings.json` contains SessionStart/End hooks", + ] + if has_ca: + bullets.append( + " - Which CA bundle source got picked in step 0(d)" + ) + return [ + "", + f"{confirm_step_num}) Confirm:", + " Tell me \"Agnes analyst workspace is ready\" and summarize:", + *bullets, + ] +``` + +- [ ] **Step 4: Add `role` parameter to `resolve_lines` and `render_setup_instructions`** + +Find `def resolve_lines(...)` (around line 609). Modify the signature and dispatch: + +```python +from typing import Literal + +def resolve_lines( + wheel_filename: str, + *, + plugin_install_names: list[str] | None = None, + self_signed_tls: bool = False, + server_host: str = "", + ca_pem: str | None = None, + role: Literal["analyst", "admin"] = "admin", +) -> list[str]: + """...""" + if role == "analyst": + return _resolve_analyst_lines(wheel_filename, ca_pem=ca_pem) + # Existing admin path: + names = list(plugin_install_names or []) + has_marketplace = bool(names) + has_ca = bool(ca_pem and ca_pem.strip()) + # ... (existing body unchanged) +``` + +Add the new analyst dispatcher right after `resolve_lines`: + +```python +def _resolve_analyst_lines(wheel_filename: str, *, ca_pem: str | None) -> list[str]: + """Analyst workspace-bootstrap layout. Self-contained — no admin-only steps.""" + has_ca = bool(ca_pem and ca_pem.strip()) + confirm_step = "4" if has_ca else "4" # numbering: 0 (TLS optional), 1, 2, 3, 4 + + lines: list[str] = [] + if has_ca: + lines.extend(_tls_trust_block(ca_pem)) + lines.extend(_preamble_lines(has_ca=has_ca)) + lines.extend(_install_cli_lines(has_ca=has_ca)) # step 1 + lines.extend(_analyst_init_lines()) # steps 2-3 + lines.extend(_analyst_finale_lines(confirm_step, has_ca=has_ca)) # step 4 + + return [ + line.replace("{wheel_filename}", wheel_filename) + for line in lines + ] +``` + +Update `render_setup_instructions` to accept and forward `role`: + +```python +def render_setup_instructions( + server_url: str, + token: str, + wheel_filename: str = "agnes.whl", + *, + plugin_install_names: list[str] | None = None, + self_signed_tls: bool = False, + server_host: str = "", + ca_pem: str | None = None, + role: Literal["analyst", "admin"] = "admin", +) -> str: + lines = resolve_lines( + wheel_filename, + plugin_install_names=plugin_install_names, + self_signed_tls=self_signed_tls, + server_host=server_host, + ca_pem=ca_pem, + role=role, + ) + text = "\n".join(lines) + return text.replace("{server_url}", server_url).replace("{token}", token) +``` + +- [ ] **Step 5: Run tests to verify they pass** + +```bash +pytest tests/test_setup_instructions_analyst.py -v +``` + +Expected: all PASS. + +- [ ] **Step 6: Run regression on existing setup-instruction tests** + +```bash +pytest tests/ -k setup_instructions -v +``` + +Expected: existing admin-role tests still PASS (no regression). + +- [ ] **Step 7: Commit** + +```bash +git add app/web/setup_instructions.py tests/test_setup_instructions_analyst.py +git commit -m "feat(setup): add analyst role to install-prompt renderer" +``` + +--- + +### Task 4: Add `role` query branching to `/setup` route + +**Files:** +- Modify: `app/web/router.py` (`setup_page` around line 717 — read `role` query param, pass to renderer) +- Test: `tests/test_setup_page_roles.py` (new) + +- [ ] **Step 1: Read existing `setup_page` to understand its current shape** + +```bash +grep -n "setup_page\|/setup\|/install" app/web/router.py | head +``` + +Read the function (~30 lines around the match) to understand its current call sites and template rendering. + +- [ ] **Step 2: Write failing tests** + +```python +# tests/test_setup_page_roles.py +"""Tests for /setup role query-param branching.""" + + +def test_setup_page_default_role_is_admin(client): + resp = client.get("/setup") + assert resp.status_code == 200 + # Admin tile is active; analyst tile is linked. + assert "Admin CLI" in resp.text or "role=admin" in resp.text + + +def test_setup_page_analyst_role(client): + resp = client.get("/setup?role=analyst") + assert resp.status_code == 200 + assert "Analyst workspace" in resp.text or "role=analyst" in resp.text + + +def test_install_redirects_to_setup(client): + resp = client.get("/install", follow_redirects=False) + assert resp.status_code in (302, 307) + assert "/setup" in resp.headers["location"] +``` + +- [ ] **Step 3: Run tests to verify they fail** + +```bash +pytest tests/test_setup_page_roles.py -v +``` + +Expected: tests for analyst/role-branching content FAIL; redirect test may PASS (existing behavior). + +- [ ] **Step 4: Modify `setup_page` to read `role` query param** + +Find `setup_page` in `app/web/router.py`. Update its signature to add a `role` query param and pass it to the renderer: + +```python +from typing import Literal +from fastapi import Query + +@router.get("/setup", response_class=HTMLResponse) +async def setup_page( + request: Request, + role: Literal["analyst", "admin"] = Query(default="admin", description="Bootstrap target role"), + # ... existing dependencies (auth, etc.) +): + """Renders the role-specific install paste prompt.""" + # ... existing context-building code ... + ctx["role"] = role + return templates.TemplateResponse(request, "setup.html", ctx) +``` + +If `setup_page` already calls `render_setup_instructions(...)` server-side (vs. JS-rendered), pass `role` there too: + +```python +prompt_text = render_setup_instructions( + server_url=str(request.base_url).rstrip("/"), + token="{token}", # placeholder filled by JS at click time + wheel_filename=resolved_wheel, + plugin_install_names=plugin_install_names if role == "admin" else None, + self_signed_tls=..., + server_host=..., + ca_pem=..., + role=role, +) +``` + +- [ ] **Step 5: Update `setup.html` template to render role tiles** + +Find `app/web/templates/setup.html` (or whatever `setup_page` actually renders — `grep -n "setup.html\|TemplateResponse" app/web/router.py`). Add two role tiles near the top of the body: + +```html + +``` + +If the template is more sophisticated (e.g., with role-specific JS), wire the JS to use the `role` ctx variable when calling `POST /auth/tokens` for PAT minting: + +```javascript +const role = "{{ role }}"; +const scope = role === "analyst" ? "bootstrap-analyst" : "general"; +const ttlSeconds = role === "analyst" ? 3600 : 86400; // analyst short-lived + +await fetch('/auth/tokens', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ name: `setup-${role}`, scope, ttl_seconds: ttlSeconds }), +}); +``` + +- [ ] **Step 6: Run tests to verify they pass** + +```bash +pytest tests/test_setup_page_roles.py -v +``` + +Expected: all PASS. + +- [ ] **Step 7: Commit** + +```bash +git add app/web/router.py app/web/templates/setup.html tests/test_setup_page_roles.py +git commit -m "feat(setup): /setup?role=analyst|admin branching with role tiles" +``` + +--- + +### Task 5: Add legacy-strings banner to admin workspace-prompt template UI + +**Files:** +- Modify: `app/web/templates/admin_workspace_prompt.html` (add banner above editor when `legacy_strings_detected` non-empty) +- Test: `tests/test_legacy_strings_scan.py` (extend with HTML rendering test) + +- [ ] **Step 1: Read existing admin-prompt template** + +```bash +cat app/web/templates/admin_workspace_prompt.html +``` + +Find where the editor (textarea) is rendered. + +- [ ] **Step 2: Write extension test** + +Append to `tests/test_legacy_strings_scan.py`: + +```python +def test_admin_prompt_template_renders_banner_when_legacy_present(web_session): + web_session.put("/api/admin/workspace-prompt-template", + json={"content": "Run `da sync` daily."}) + resp = web_session.get("/admin/workspace-prompt") + assert resp.status_code == 200 + assert "yellow" in resp.text.lower() or "warning" in resp.text.lower() + assert "da sync" in resp.text # the hit is rendered in the banner + + +def test_admin_prompt_template_no_banner_when_clean(web_session): + web_session.put("/api/admin/workspace-prompt-template", + json={"content": "Run `da pull` daily."}) + resp = web_session.get("/admin/workspace-prompt") + assert resp.status_code == 200 + # The banner div is absent or empty + # Implementation: e.g., id="legacy-banner" wraps the warning; check empty + assert "legacy-banner" not in resp.text or "display: none" in resp.text or len( + [l for l in resp.text.split("\n") if "legacy-banner" in l and "hidden" not in l] + ) <= 2 +``` + +(The exact test is fragile — strengthen once the implementation lands.) + +- [ ] **Step 3: Run tests to verify they fail** + +```bash +pytest tests/test_legacy_strings_scan.py -v -k banner +``` + +- [ ] **Step 4: Modify `admin_workspace_prompt.html`** + +Find the spot above the editor `