diff --git a/docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md b/docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md index f57ad1e..62ca945 100644 --- a/docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md +++ b/docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md @@ -1,6 +1,6 @@ # Clean analyst bootstrap — design -**Date:** 2026-05-04 (revision 3 — round-2 review + rebase on `main` ≥ 0.32.0) +**Date:** 2026-05-04 (revision 4 — round-3 review fixes; cleared for implementation) **Branch:** `zs/clean-analyst-bootstrap-spec` **Status:** Draft (approved by user, pre-implementation) **Successor to:** today's `da analyst setup` flow (interactive email/password) and the empty-folder bug under `da sync`. @@ -172,19 +172,19 @@ Reader commands explicitly listed (`da explore`, `da disk-info`, `da snapshot re | Analyst install-prompt branch | `app/web/setup_instructions.py` | Add `role: Literal["analyst","admin"]="admin"` to `resolve_lines()` and `render_setup_instructions()`. Analyst layout: TLS trust block (reused, when `ca_pem` supplied) → install `da` (reused) → `da init --server-url X --token Y --workspace .` → `da catalog` smoke verify → confirm. Drop for analyst: marketplace, plugins, skills, diagnose, login, whoami (all subsumed by `da init`). | | `/setup?role=...` query branching | `app/web/router.py` `setup_page` (line 717) | Read `role` query param, default `"admin"`. Pass to `render_setup_instructions(role=...)`. Existing `/install` 302 redirect to `/setup` is preserved (legacy bookmarks keep working). | | `setup.html` UI | `app/web/templates/setup.html` (or wherever `setup_page` renders) | Two role tiles: "Analyst workspace" / "Admin CLI". PAT mint button per tile, posts to `/auth/tokens` with `scope` matching the tile. Renders the prompt for the selected role. | -| PAT scope + TTL clamp | `app/api/tokens.py` (`CreateTokenRequest` Pydantic model + `create_token` route) | Add two fields: `scope: str = "general"` and `ttl_seconds: int \| None = None` (alongside the existing `expires_in_days`). Resolution: when `ttl_seconds` is set, it wins; otherwise fall back to `expires_in_days`. For `scope == "bootstrap-analyst"`, server force-clamps the resolved TTL to ≤ 3600 s regardless of request. Audit-log entry includes the scope. The audit log is the only consumer of `scope` in this PR — per-endpoint enforcement is an explicit follow-up. | +| PAT scope + TTL clamp | `app/api/tokens.py` (`CreateTokenRequest` Pydantic model + `create_token` route) | Add two fields: `scope: str = "general"` and `ttl_seconds: int \| None = None` (alongside the existing `expires_in_days: Optional[int] = 90` at lines 23-25). Resolution: when `ttl_seconds` is set, it wins; otherwise fall back to `expires_in_days`. **Upper bound:** mirror the existing `expires_in_days <= 3650` cap at line 100 with `ttl_seconds <= 315_360_000` (3650 days × 86400 s) so a hostile client can't bypass the cap by switching field names. For `scope == "bootstrap-analyst"`, server force-clamps the resolved TTL to ≤ 3600 s regardless of request. Audit-log entry includes the scope. The audit log is the only consumer of `scope` in this PR — per-endpoint enforcement is an explicit follow-up. | | Server-side template rewrite | `config/claude_md_template.txt` (or wherever `render_claude_md` reads its default from) | Update path strings: `data/parquet/` → `server/parquet/`, `data/duckdb/...` → `user/duckdb/analytics.duckdb`. Replace `da sync` → `da pull`, `da fetch` → `da snapshot create`, `da metrics list` → `da catalog --metrics`. | -| Admin override migration | `claude_md_template` DB table (schema v23, exposed via `/admin/workspace-prompt` UI and `app/api/claude_md.py` admin CRUD) | Add a `legacy_strings_detected: list[str]` field to `GET /api/admin/workspace-prompt-template` response — server scans the saved override for `data/parquet`, `da sync`, `da fetch`, `da analyst setup`, `da metrics list/show`, returns the list of hits. UI in `app/web/templates/admin_workspace_prompt.html` (or whichever the editor template is) renders a yellow banner above the editor when the list is non-empty: "This override references CLI verbs / paths that were renamed in this release. Re-author and Save to clear the warning. See CHANGELOG for the rename list." Migration stays manual — admin re-authors and saves. | +| Admin override migration | `claude_md_template` DB table (schema v23, exposed via `/admin/workspace-prompt` UI and `app/api/claude_md.py` admin CRUD) | Add a module-level constant `_LEGACY_STRINGS = ("data/parquet", "da sync", "da fetch", "da analyst setup", "da metrics list", "da metrics show")` and a helper `def _scan_legacy_strings(text: str) -> list[str]` inside `app/api/claude_md.py`. Add a `legacy_strings_detected: list[str] = []` field to `TemplateGetResponse` (today defined at `app/api/claude_md.py:72-76`); `admin_get_workspace_template` populates it via `_scan_legacy_strings(override.content)`. UI in `app/web/templates/admin_workspace_prompt.html` (file confirmed to exist) renders a yellow banner above the editor when the list is non-empty: "This override references CLI verbs / paths that were renamed in this release. Re-author and Save to clear the warning. See CHANGELOG for the rename list." Migration stays manual — admin re-authors and saves. | | `/api/welcome` content unchanged | `app/api/claude_md.py:91` (`get_welcome`) | No code change — endpoint already serves rendered CLAUDE.md. Spec calls it out so implementer knows `da init`'s producer is here, not in the client. | -| Reuse `cli/error_render.py` (added in #160) | server: nothing — client-side only | Note: `cli/error_render.py:render_error()` already exists (added in 0.32.0 for typed BQ errors). `da init` and `da pull` adopt the same renderer for typed errors raised during bootstrap and refresh — `auth_failed`, `server_unreachable`, `manifest_unauthorized`, `disk_full`, etc. — so analyst-facing error UX is consistent across all CLI commands. No new server work; client-side only. | +| Adopt `cli/error_render.py` (added in #160) for client-side errors | server: nothing — client-side only | `cli/error_render.py:render_error(status_code, body)` was introduced in 0.32.0 for typed BQ errors served by `da query --remote` (recognizes `detail.kind` / `detail.reason` shapes; falls back to plain HTTP `{code}: {text}`). The renderer is structurally generic — no BQ-specific code. `da init` and `da pull` are **first-time adopters in the bootstrap path** (today's `sync.py`, `auth.py`, `fetch.py` don't import it). Pattern: synthesize a `{"detail": {"kind": "...", "hint": "...", "message": "..."}}` dict client-side and pass with a chosen `status_code` (0 or `-1` for purely client-side errors with no HTTP origin), exactly as `cli/commands/query.py:152, 165` already does for `RemoteQueryError` translation. New typed kinds added in this PR: `auth_failed`, `server_unreachable`, `manifest_unauthorized`, `disk_full`, `partial_state` — the renderer doesn't gate on a kind allowlist, so no renderer change is needed. No server work; client-side only. | ### Client-side (CLI, Python) | Component | File | Change | |---|---|---| -| `da init` (new) | `cli/commands/init.py` (new) | Required args: `--server-url`, `--token`. Optional: `--force`, `--workspace` (default `cwd`). Steps: (1) verify server reachability + PAT validity via `GET /api/catalog/tables` with `Authorization: Bearer ` — same endpoint `da auth import-token` already uses for this purpose (`cli/commands/auth.py:154`); exercises full PAT validation chain (revocation, expiry, hash) and 401s on bad PAT, unlike `/api/health` which is unauthenticated; (2) save server URL + PAT to `~/.config/da/{config.yaml,token.json}`; (3) `GET /api/welcome` and write its body to `/CLAUDE.md`; (4) write `.claude/settings.json` (model, permissions, hooks pointing at `da pull` and `da push`) — delegate hook installation to `cli/lib/hooks.py:install_claude_hooks` (see new module row below); (5) write `.claude/CLAUDE.local.md` (stub, only if absent); (6) call `cli/lib/pull.py:run_pull(server_url, token, workspace)` programmatically (no Typer round-trip); (7) write `AGNES_WORKSPACE.md` from a static client-side template with `{created_at}`, `{server_url}`, `{workspace_path}` substituted. `da init` does NOT call `da auth login`; the PAT from the paste-prompt is the only auth path during bootstrap. Errors render via `cli/error_render.py:render_error()` (typed kinds: `auth_failed`, `server_unreachable`, `partial_state`, `disk_full`, etc.). | -| `cli/lib/pull.py` (new module) | `cli/lib/pull.py` (new) — establish `cli/lib/` as the shared-library tree | Pure-function refactor of today's `cli/commands/sync.py:sync()` body, minus Typer decorators and stdout. Signature: `def run_pull(server_url: str, token: str, workspace: Path, *, dry_run: bool = False) -> PullResult`. Returns a structured `PullResult` (tables_updated, parquets_total, rules_count, duration_s, errors). Caller decides what to print (`da init` summarizes; `da pull` Typer wrapper prints per `--quiet`/`--json` flags). Tested directly without subprocess. | -| `cli/lib/hooks.py` (new module) | `cli/lib/hooks.py` (new) — replaces `cli/commands/analyst.py:_install_claude_hooks` | `def install_claude_hooks(workspace: Path) -> None`. Idempotent. Reads `/.claude/settings.json`, drops any prior entry whose every command is a `da pull`/`da sync`/`da push` invocation (covers both today's hook commands and the new ones during a transition window if anyone runs the new init in a folder that had old hooks), appends fresh entries: `SessionStart → da pull --quiet 2>/dev/null \|\| true`, `SessionEnd → da push --quiet 2>/dev/null \|\| true`. Workspace-level scope (`/.claude/settings.json`, not user-home), preserves third-party hooks. | +| `da init` (new) | `cli/commands/init.py` (new) | Required args: `--server-url`, `--token`. Optional: `--force`, `--workspace` (default `cwd`). Steps: (1) verify server reachability + PAT validity via `GET /api/catalog/tables` with `Authorization: Bearer ` — same endpoint `da auth import-token` already uses for this purpose (`cli/commands/auth.py:154`); exercises full PAT validation chain (revocation, expiry, hash) and 401s on bad PAT, unlike `/api/health` which is unauthenticated; (2) save server URL + PAT to `~/.config/da/{config.yaml,token.json}`; (3) `GET /api/welcome` and write its body to `/CLAUDE.md`; (4) write `.claude/settings.json` (model, permissions, hooks pointing at `da pull` and `da push`) — delegate hook installation to `cli/lib/hooks.py:install_claude_hooks` (see new module row below); (5) write `.claude/CLAUDE.local.md` (stub, only if absent); (6) call `cli/lib/pull.py:run_pull(server_url, token, workspace)` programmatically (no Typer round-trip); (7) write `AGNES_WORKSPACE.md` from a static client-side template with `{created_at}`, `{server_url}`, `{workspace_path}` substituted. `da init` does NOT call `da auth login`; the PAT from the paste-prompt is the only auth path during bootstrap. Errors are rendered by `cli/error_render.py:render_error()` — `da init` synthesizes `{"detail": {"kind": "...", "hint": "..."}}` dicts client-side (pattern: `cli/commands/query.py:152, 165`); typed kinds: `auth_failed`, `server_unreachable`, `partial_state`, `disk_full`. | +| `cli/lib/pull.py` (new module) | `cli/lib/pull.py` + `cli/lib/__init__.py` (new) — establish `cli/lib/` as the shared-library tree | Pure-function refactor of today's `cli/commands/sync.py:sync()` body, minus Typer decorators and stdout. Signature: `def run_pull(server_url: str, token: str, workspace: Path, *, dry_run: bool = False) -> PullResult`. Returns a structured `PullResult` (tables_updated, parquets_total, rules_count, duration_s, errors). Caller decides what to print (`da init` summarizes; `da pull` Typer wrapper prints per `--quiet`/`--json` flags). Tested directly without subprocess. **Packaging:** `cli/lib/__init__.py` (empty file) is required for Hatchling to include the dir in the wheel — `pyproject.toml:packages` already lists `cli`, sub-packages with `__init__.py` are picked up automatically. | +| `cli/lib/hooks.py` (new module) | `cli/lib/hooks.py` (new) — replaces `cli/commands/analyst.py:_install_claude_hooks` | `def install_claude_hooks(workspace: Path) -> None`. Idempotent. Reads `/.claude/settings.json`, drops any prior entry whose every command is a `da pull`/`da sync`/`da push` invocation (covers both today's hook commands and the new ones during a transition window if anyone runs the new init in a folder that had old hooks), appends fresh entries: `SessionStart → da pull --quiet 2>/dev/null \|\| true`, `SessionEnd → da push --quiet 2>/dev/null \|\| true`. Workspace-level scope (`/.claude/settings.json`, not user-home), preserves third-party hooks. Lives next to `cli/lib/pull.py` under the new `cli/lib/__init__.py` package. | | `da pull` (renamed from `da sync`) | `cli/commands/pull.py` (renamed from `cli/commands/sync.py`) | Behavior is today's `da sync` minus the `--upload-only` branch. Lazy-mkdir fixes (see below). Calls `cli/lib/pull.py:run_pull` and prints the result. Flags: `--quiet` (suppress success stdout, used by hook), `--json` (machine output of `PullResult`), `--dry-run` (compute deltas without writing — uses `dry_run=True`). Errors render via `cli/error_render.py`. | | `da push` (extracted from `da sync --upload-only`) | `cli/commands/push.py` (new) | Uploads `user/sessions/*.jsonl` and `.claude/CLAUDE.local.md`. Lazy: skip when nothing to upload (no `user/sessions/` mkdir if no sessions). Same auth as `da pull`. Flags: `--quiet`, `--json`, `--dry-run`. Errors render via `cli/error_render.py`. | | `da snapshot create` (renamed from `da fetch`) | `cli/commands/snapshot.py` | Move logic from `cli/commands/fetch.py` into a `create` subcommand of the existing `snapshot` group. Remove `cli/commands/fetch.py`. Carry over all flags: `--select`, `--where`, `--limit`, `--order-by`, `--as`, `--estimate`, `--no-estimate`, `--force`. Add existence check before opening DuckDB to avoid creating an empty DB file when no `da pull` has run yet (guard: `if not db_path.exists(): typer.echo("Local DuckDB not found. Run: da pull"); raise typer.Exit(1)`). Existing `da snapshot {refresh, prune, list, drop}` are unchanged. | @@ -272,7 +272,11 @@ Empty folder + Claude Code with paste prompt ├─ Step 2 — da init --server-url URL --token PAT --workspace . │ ├─ verify: GET /api/catalog/tables with Bearer PAT → 200 (PAT-validating endpoint) │ ├─ save: ~/.config/da/{config.yaml, token.json} -│ ├─ fetch: GET /api/welcome → write ./CLAUDE.md +│ ├─ fetch: GET /api/welcome?server_url= → write ./CLAUDE.md +│ │ (passing the analyst-facing URL explicitly so behind-proxy +│ │ installs render the operator-visible URL, not the FastAPI +│ │ internal hostname; endpoint default falls back to +│ │ request.base_url which equals --server-url in practice) │ ├─ write: ./.claude/settings.json (with hooks SessionStart→`da pull`, SessionEnd→`da push`) │ ├─ write: ./.claude/CLAUDE.local.md (stub, if absent) │ ├─ call: da pull (programmatic — calls cli/lib/pull.py:run_pull) @@ -413,12 +417,12 @@ Verification has three layers: (a) automated reader-smoke matrix that proves no | `test_pat` | string PAT for `analyst@example.com` | Group membership: `Everyone` only. `resource_grants` for the local + materialized tables (so manifest returns 2 rows for them). Two `mandatory` corporate-memory items granted via group. PAT TTL: 1 h. | | `test_pat_no_grants` | string PAT for `analyst@example.com` | Same user, but `resource_grants` is empty and `corporate_memory` has zero items granted to `Everyone`. Manifest returns `{"tables": []}`; memory bundle returns `{"mandatory": [], "approved": []}`. | | `zero_grants_workspace` | `tmp_path` after running `da init --token --server-url ` | A fully-bootstrapped workspace where every conditional dir is absent. Used by the reader smoke matrix. The fixture also exposes a sentinel constant `NONEXISTENT_TABLE = "__nonexistent__"` for tests that need a deliberately-unknown table id; readers must produce a friendly exit-1 (no traceback) when given this id. | -| `web_session` | authenticated `httpx.Client` with cookies | Calls `POST /auth/token` with `{"email": "admin@example.com", "password": }` (the test password is seeded into the same `users` row by `fastapi_test_server`). Returns a client with the resulting session cookie. Used to mint PATs in PAT-scope tests. Choice rationale: real-endpoint login over dependency-override keeps the auth path under test rather than bypassed. | +| `web_session` | authenticated `httpx.Client` with cookies | Calls `POST /auth/password/login/web` with form fields `email=admin@example.com` and `password=` (the test password is seeded into the same `users` row by `fastapi_test_server`). The form-login endpoint sets the session cookie that `POST /auth/tokens` requires (PAT mint route gates on `require_session_token`, see `app/api/tokens.py:88`). Used to mint PATs in PAT-scope tests. Choice rationale: real-endpoint login over dependency-override keeps the auth path under test rather than bypassed. | | `client` | `TestClient(app)` | Plain FastAPI test client with no auth. Used for endpoint-shape tests. | Fixtures live in `tests/conftest.py` (existing) plus a new `tests/fixtures/analyst_bootstrap.py`. -The autouse fixture `_reset_module_caches` in `tests/conftest.py:50-82` (added in 0.32.0 / #160 / commit `9ecbfd2a`) resets `app.instance_config._instance_config`, `connectors.bigquery.access.get_bq_access` lru cache, and `app.api.v2_quota._quota_singleton` between tests on the same xdist worker. The new bootstrap fixtures rely on this to keep `fastapi_test_server` invocations independent — no manual cache resets needed in test bodies. +The autouse fixture `_reset_module_caches` in `tests/conftest.py:50-83` (added in 0.32.0 / #160 / commit `9ecbfd2a`) resets `app.instance_config._instance_config`, `connectors.bigquery.access.get_bq_access` lru cache, and `app.api.v2_quota._quota_singleton` between tests on the same xdist worker. The new bootstrap fixtures rely on this to keep `fastapi_test_server` invocations independent — no manual cache resets needed in test bodies. ### 5.1 Reader smoke matrix (automated)