agnes-the-ai-analyst/docs/superpowers/plans/2026-05-04-clean-analyst-bootstrap.md
ZdenekSrotyr 841dcc8447 docs(spec+plan): round-4 review fixes — rename hygiene
5 must-fixes from rename-hygiene review:

- B1: test command arrays ["da", ...] -> ["agnes", ...] in spec lines
  433-446, 466, 502 and plan reader smoke matrix + clean-install
  integration tests + readers-in-pre-init test (39 occurrences)
- B2: ~/.local/bin/da -> ~/.local/bin/agnes (binary path string in
  spec data flow + plan AGNES_WORKSPACE.md uninstall table)
- B3: CHANGELOG missing BREAKING bullet for binary rename — added in
  both spec and plan CHANGELOG drafts
- B4: plan CHANGELOG typo "previous agnes status" -> "previous da
  status" (server-health overview was historically da status)
- B5: spec Components table missing row for the binary rename — added
  Client-side row mapping pyproject.toml + cli/main.py changes to
  plan Task 0
2026-05-04 15:57:07 +02:00

123 KiB

Clean Analyst Bootstrap Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Replace the interactive da analyst setup flow with a single web→paste→done bootstrap. New analyst pastes a clipboard prompt from /setup?role=analyst into Claude Code in an empty folder, and ends up with CLAUDE.md, AGNES_WORKSPACE.md, hooks, fresh data, and DuckDB views — fully ready to query. Drop dead workspace dirs (data/parquet/, data/duckdb/, data/metadata/, user/artifacts/). Establish a lazy-mkdir contract so nothing creates empty directories.

Architecture: PAT-only auth. agnes init is a thin orchestrator that auths, fetches CLAUDE.md from /api/welcome, installs hooks, and calls cli/lib/pull.py:run_pull for first data refresh. CLI verbs renamed: init/pull/push/status/snapshot create (greenfield, no aliases). Server-side install prompt branches on role query param. cli/lib/ shared library tree separates data primitives from Typer wrappers so agnes init can call them without subprocess. Reader contract: every reader handles missing dirs gracefully (exit 0 empty or exit 1 with friendly hint, never traceback).

Tech Stack: Python 3.11+, FastAPI, Typer, Pydantic, DuckDB, httpx, pytest, Hatchling.

Spec: docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md (revision 5, cleared for implementation).

CLI rename: As part of this plan, the binary changes from da to agnes. References to legacy commands (da sync, da fetch, da analyst setup, da metrics) keep their da prefix throughout this document — they're historical artifacts being removed. New commands and hook strings use agnes.


File Structure

New files

Path Responsibility
cli/lib/__init__.py Empty — makes cli/lib/ a package so Hatchling includes it in the wheel.
cli/lib/pull.py run_pull(server_url, token, workspace, *, dry_run) -> PullResult — pure-function data refresh primitive (manifest, parquet download, DuckDB rebuild, memory bundle write). Lazy mkdir.
cli/lib/hooks.py install_claude_hooks(workspace) — idempotent SessionStart/End hook installer for <workspace>/.claude/settings.json.
cli/commands/init.py agnes init Typer command — auth check, save config, write CLAUDE.md, install hooks, call run_pull, write AGNES_WORKSPACE.md.
cli/commands/pull.py agnes pull Typer wrapper around cli/lib/pull.py:run_pull. Flags --quiet, --json, --dry-run.
cli/commands/push.py agnes push Typer command — uploads user/sessions/*.jsonl and .claude/CLAUDE.local.md.
cli/commands/admin_metrics.py agnes admin metrics {import,export,validate} sub-Typer (lifted from cli/commands/metrics.py).
config/agnes_workspace_template.txt Static client-side template for AGNES_WORKSPACE.md. Three placeholders: {created_at}, {server_url}, {workspace_path}.
tests/fixtures/analyst_bootstrap.py Test fixtures: fastapi_test_server, test_pat, test_pat_no_grants, zero_grants_workspace, web_session.
tests/test_lib_hooks.py Tests for install_claude_hooks (idempotent, preserves third-party hooks, replaces old agnes pull/da sync entries).
tests/test_lib_pull.py Tests for run_pull (lazy mkdir, partial failure handling, manifest empty case).
tests/test_setup_instructions_analyst.py Tests render_setup_instructions(role="analyst") produces correct steps.
tests/test_tokens_bootstrap_scope.py Tests scope=bootstrap-analyst PATs are TTL-clamped to ≤ 1 h; ttl_seconds upper bound; ttl_seconds wins over expires_in_days.
tests/test_legacy_strings_scan.py Tests _scan_legacy_strings and legacy_strings_detected field on GET /api/admin/workspace-prompt-template.
tests/test_clean_install_integration.py End-to-end clean-install integration tests (minimal grants, zero grants, force preserves, AGNES_WORKSPACE.md content).
tests/test_reader_smoke_matrix.py Reader smoke matrix — every CLI command on a freshly-bootstrapped zero-grants workspace, asserts no traceback.

Modified files

Path Change
app/api/tokens.py CreateTokenRequest: add scope: str = "general" and ttl_seconds: Optional[int] = None. Validate ttl_seconds <= 315_360_000. Resolution: ttl_seconds wins; fall back to expires_in_days. For scope == "bootstrap-analyst", force-clamp resolved TTL ≤ 3600 s. Audit-log includes scope.
app/api/claude_md.py Add module-level _LEGACY_STRINGS = ("data/parquet", "da sync", "da fetch", "da analyst setup", "da metrics list", "da metrics show"). Add helper _scan_legacy_strings(text) -> list[str]. Add field legacy_strings_detected: list[str] = [] to TemplateGetResponse. Populate in admin_get_workspace_template.
app/web/setup_instructions.py Add role: Literal["analyst","admin"] = "admin" to resolve_lines() and render_setup_instructions(). Analyst layout: TLS trust (when ca_pem) → install agnesagnes init --server-url X --token Y --workspace .agnes catalog smoke verify → confirm. Drop for analyst: marketplace, plugins, skills, diagnose, login, whoami.
app/web/router.py setup_page: read role query param (default "admin"), pass to render_setup_instructions(role=...).
app/web/templates/setup.html (or wherever setup_page renders) Two role tiles (Analyst / Admin), POST /auth/tokens with matching scope.
app/web/templates/admin_workspace_prompt.html Yellow banner above editor when legacy_strings_detected non-empty.
config/claude_md_template.txt Update verb names: da syncagnes pull, da fetchagnes snapshot create, da metrics list/showagnes catalog --metrics, da analyst setupagnes init. Path strings: data/parquet/server/parquet/, data/duckdb/...user/duckdb/analytics.duckdb.
cli/commands/snapshot.py Add create subcommand — moves logic from cli/commands/fetch.py verbatim. Add if not db_path.exists(): exit 1 guard before duckdb.connect().
cli/commands/catalog.py Add --metrics flag (replaces da metrics list); --metrics --show <id> (replaces da metrics show).
cli/commands/admin.py Register the new admin_metrics_app sub-Typer.
cli/commands/query.py Update hint text "Run: da sync" → "Run: agnes pull" in two places.
cli/commands/explore.py Update hint text "Run: da sync" → "Run: agnes pull".
cli/main.py Drop registrations for sync, analyst, metrics, fetch, status (existing). Add init, pull, push. Re-register status to point at new workspace-status command.
CLAUDE.md (repo root) Verb + path rewrites throughout. The "Local sync & Claude Code hooks" subsection rewrites verbatim with new commands. The "Querying Agnes data — agent rails" subsection keeps the 0.32.0 query_mode='materialized' and query_mode='remote' cost-guardrail prose verbatim, just verb-renaming da fetchagnes snapshot create.
CHANGELOG.md Entry under [Unreleased] per spec preview (Changed/Added/Fixed/Removed/Kept).
pyproject.toml No change; cli/lib/__init__.py triggers Hatchling auto-discovery.

Deleted files

Path Reason
cli/commands/sync.py Replaced by cli/commands/pull.py + cli/commands/push.py + cli/lib/pull.py.
cli/commands/fetch.py Folded into cli/commands/snapshot.py:create.
cli/commands/analyst.py Replaced by cli/commands/init.py + new cli/commands/status.py (workspace status). _install_claude_hooks lifted to cli/lib/hooks.py.
cli/commands/metrics.py Read paths fold into agnes catalog --metrics; write paths move to cli/commands/admin_metrics.py.

Existing cli/commands/status.py rename

Action Detail
Existing agnes status ("System status") Renamed to agnes diagnose system (subcommand under diagnose_app) — its content is a subset of what agnes diagnose already does.
New agnes status Workspace status — fresh implementation, replaces da analyst status. Lives in cli/commands/status.py (overwrite).

Phase 0 — CLI binary rename (daagnes)

Task 0: Rename the CLI entry point

Files:

  • Modify: pyproject.toml ([project.scripts]), cli/main.py (Typer(name=...))
  • Test: tests/test_cli_binary_rename.py (new)

Why first: Every later task that registers Typer apps, writes hook command strings, or asserts CLI output uses agnes. Rename the binary up front so tests in subsequent tasks reference the right name.

  • Step 1: Read current entry points
grep -n "scripts\|^name\|tool.hatch" pyproject.toml | head
grep -n "Typer\|name=\"da\"\|name='da'" cli/main.py
  • Step 2: Update pyproject.toml

In [project.scripts], replace da = "cli.main:app" with:

[project.scripts]
agnes = "cli.main:app"

Single entry — no da alias kept. Greenfield.

  • Step 3: Update cli/main.py

Change the Typer app construction from name="da" to name="agnes" and update the help string:

app = typer.Typer(
    name="agnes",
    help="Agnes — AI Data Analyst CLI",
    no_args_is_help=True,
)
  • Step 4: Reinstall the editable package
uv pip install -e ".[dev]"
which agnes
agnes --version

Expected: agnes <version> prints; da --version now fails with "command not found".

  • Step 5: Write a binary-name regression test
# tests/test_cli_binary_rename.py
"""Confirm the wheel installs the binary as `agnes`, not `da`."""
import subprocess


def test_agnes_command_exists():
    result = subprocess.run(["agnes", "--version"], capture_output=True, text=True)
    assert result.returncode == 0


def test_da_command_no_longer_works():
    """Greenfield: no backward-compat alias."""
    result = subprocess.run(["bash", "-c", "command -v da"],
                            capture_output=True, text=True)
    assert result.returncode != 0, "da should NOT be on PATH after rename"
  • Step 6: Run the test
pytest tests/test_cli_binary_rename.py -v
  • Step 7: Commit
git add pyproject.toml cli/main.py tests/test_cli_binary_rename.py
git commit -m "feat(cli): rename binary from da to agnes (BREAKING)"

Phase 1 — Server-side foundation (PAT scope, legacy-strings scan, install-prompt branching)

Task 1: Add scope + ttl_seconds fields to CreateTokenRequest

Files:

  • Modify: app/api/tokens.py:23-25 (CreateTokenRequest), app/api/tokens.py:85-101 (create_token route)

  • Test: tests/test_tokens_bootstrap_scope.py (new)

  • Step 1: Write failing tests

# tests/test_tokens_bootstrap_scope.py
"""Tests for PAT scope + ttl_seconds fields (clean-analyst-bootstrap spec)."""

from __future__ import annotations

import jwt
import pytest


@pytest.fixture
def web_session(client, db_with_admin_user):
    """Authenticated test client with session cookie for admin user."""
    # Form-login endpoint — see fixtures/analyst_bootstrap.py
    resp = client.post("/auth/password/login/web",
                       data={"email": "admin@example.com", "password": "test-password"})
    assert resp.status_code in (200, 302), f"login failed: {resp.text}"
    return client


def _decode(pat: str) -> dict:
    return jwt.decode(pat, options={"verify_signature": False})


def test_bootstrap_pat_ttl_clamped_to_one_hour(web_session):
    resp = web_session.post("/auth/tokens", json={
        "name": "init",
        "scope": "bootstrap-analyst",
        "ttl_seconds": 86400,  # 1 day — must be ignored, clamped to 3600
    })
    assert resp.status_code == 201, resp.text
    payload = _decode(resp.json()["token"])
    assert payload.get("scope") == "bootstrap-analyst"
    assert payload["exp"] - payload["iat"] <= 3600 + 5


def test_general_pat_uses_ttl_seconds_when_set(web_session):
    resp = web_session.post("/auth/tokens", json={
        "name": "test",
        "ttl_seconds": 7200,  # 2 hours
    })
    assert resp.status_code == 201
    payload = _decode(resp.json()["token"])
    assert payload["exp"] - payload["iat"] <= 7200 + 5


def test_general_pat_falls_back_to_expires_in_days(web_session):
    resp = web_session.post("/auth/tokens", json={
        "name": "test", "expires_in_days": 30,
    })
    assert resp.status_code == 201
    payload = _decode(resp.json()["token"])
    assert payload["exp"] - payload["iat"] <= 30 * 86400 + 5


def test_ttl_seconds_upper_bound(web_session):
    # 3650 days * 86400 = 315_360_000 seconds. One past this must reject.
    resp = web_session.post("/auth/tokens", json={
        "name": "test", "ttl_seconds": 315_360_001,
    })
    assert resp.status_code == 400


def test_ttl_seconds_must_be_positive(web_session):
    resp = web_session.post("/auth/tokens", json={
        "name": "test", "ttl_seconds": 0,
    })
    assert resp.status_code == 400


def test_scope_default_is_general(web_session):
    resp = web_session.post("/auth/tokens", json={"name": "test"})
    assert resp.status_code == 201
    payload = _decode(resp.json()["token"])
    # scope=general is informational; check audit_log carries it
    # (skipped here — tested in test_audit_log_includes_scope below)
    assert payload.get("scope", "general") == "general"

The db_with_admin_user fixture is part of the existing test suite or will be added in tests/fixtures/analyst_bootstrap.py (Task 22). For now, this test depends on it; if it doesn't exist, mark these as pytest.skip until Task 22.

  • Step 2: Run tests to verify they fail
cd "$(git rev-parse --show-toplevel)"
pytest tests/test_tokens_bootstrap_scope.py -v

Expected: tests FAIL with either fixture-missing error or extra fields not permitted from Pydantic (if the fixture exists).

  • Step 3: Update CreateTokenRequest model

Replace app/api/tokens.py:23-25:

class CreateTokenRequest(BaseModel):
    name: str
    expires_in_days: Optional[int] = 90  # null = no expiry
    scope: str = "general"  # informational; "bootstrap-analyst" force-clamps TTL ≤ 1 h
    ttl_seconds: Optional[int] = None  # if set, wins over expires_in_days
  • Step 4: Update create_token route

Replace app/api/tokens.py:85-118 (the create_token function body up through the jwt_token = create_access_token(...) call):

@router.post("", response_model=CreateTokenResponse, status_code=201)
async def create_token(
    payload: CreateTokenRequest,
    user: dict = Depends(require_session_token),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    if not payload.name.strip():
        raise HTTPException(status_code=400, detail="name is required")
    if payload.expires_in_days is not None and payload.expires_in_days <= 0:
        raise HTTPException(status_code=400, detail="expires_in_days must be a positive integer")
    if payload.expires_in_days is not None and payload.expires_in_days > 3650:
        raise HTTPException(status_code=400, detail="expires_in_days must not exceed 3650 (10 years)")
    if payload.ttl_seconds is not None and payload.ttl_seconds <= 0:
        raise HTTPException(status_code=400, detail="ttl_seconds must be a positive integer")
    # Mirror the 3650-day cap on ttl_seconds so a hostile client can't
    # bypass via field rename. 3650 days * 86400 = 315_360_000.
    if payload.ttl_seconds is not None and payload.ttl_seconds > 315_360_000:
        raise HTTPException(status_code=400, detail="ttl_seconds must not exceed 315360000 (10 years)")

    # Resolve TTL: ttl_seconds wins; fall back to expires_in_days.
    expires_delta: Optional[timedelta] = None
    omit_exp = False
    if payload.ttl_seconds is not None:
        expires_delta = timedelta(seconds=payload.ttl_seconds)
    elif payload.expires_in_days is not None:
        expires_delta = timedelta(days=payload.expires_in_days)
    else:
        omit_exp = True  # "no expiry"

    # Force-clamp bootstrap-analyst PATs to ≤ 1 h regardless of request.
    if payload.scope == "bootstrap-analyst":
        ONE_HOUR = timedelta(hours=1)
        if expires_delta is None or expires_delta > ONE_HOUR:
            expires_delta = ONE_HOUR
        omit_exp = False

    expires_at: Optional[datetime] = None
    if expires_delta is not None:
        expires_at = datetime.now(timezone.utc) + expires_delta

    repo = AccessTokenRepository(conn)
    token_id = str(uuid.uuid4())

    jwt_token = create_access_token(
        user_id=user["id"], email=user["email"],
        token_id=token_id, typ="pat",
        expires_delta=expires_delta, omit_exp=omit_exp,
        extra_claims={"scope": payload.scope},
    )
  • Step 5: Update create_access_token to accept extra_claims

Find app/auth/jwt.py:create_access_token (read it first to get the current signature). Add an extra_claims: dict | None = None parameter that gets merged into the JWT payload before encoding. Show your edit:

grep -n "def create_access_token" app/auth/jwt.py
# Read the function and update.

If the function already supports extra claims, this is a no-op. Otherwise add:

def create_access_token(
    user_id: str, email: str, token_id: Optional[str] = None,
    typ: str = "session", expires_delta: Optional[timedelta] = None,
    omit_exp: bool = False, extra_claims: Optional[dict] = None,
) -> str:
    payload = {"sub": user_id, "email": email, "typ": typ}
    if token_id:
        payload["jti"] = token_id
    if not omit_exp:
        payload["iat"] = int(datetime.now(timezone.utc).timestamp())
        if expires_delta:
            payload["exp"] = int((datetime.now(timezone.utc) + expires_delta).timestamp())
    if extra_claims:
        payload.update(extra_claims)
    return jwt.encode(payload, _SECRET, algorithm="HS256")

(Adapt to the actual function shape after reading it.)

  • Step 6: Update audit-log entry to include scope

Search app/api/tokens.py for the _audit(...) call inside create_token and add scope to the params dict:

_audit(conn, actor=user["id"], action="token.create",
       target=token_id,
       params={"name": payload.name,
               "expires_at": str(expires_at) if expires_at else None,
               "scope": payload.scope})
  • Step 7: Run tests to verify they pass
pytest tests/test_tokens_bootstrap_scope.py -v

Expected: all PASS (or skip if db_with_admin_user fixture doesn't yet exist; in that case the failure mode is a clear fixture-not-found error, not a logic error).

  • Step 8: Run the full token test suite to verify no regression
pytest tests/ -k token -v

Expected: all token-related tests PASS.

  • Step 9: Commit
git add app/api/tokens.py app/auth/jwt.py tests/test_tokens_bootstrap_scope.py
git commit -m "feat(tokens): add scope + ttl_seconds fields with bootstrap-analyst clamp"

Task 2: Add _LEGACY_STRINGS scan to admin workspace-prompt endpoint

Files:

  • Modify: app/api/claude_md.py (add _LEGACY_STRINGS, _scan_legacy_strings, augment TemplateGetResponse, populate in admin_get_workspace_template)

  • Test: tests/test_legacy_strings_scan.py (new)

  • Step 1: Write failing tests

# tests/test_legacy_strings_scan.py
"""Tests for legacy-string scan in admin CLAUDE.md template endpoint."""

from app.api.claude_md import _scan_legacy_strings, _LEGACY_STRINGS


def test_scan_finds_all_known_legacy_strings():
    text = """
    Run `da sync` then `da fetch web_sessions --where ...`.
    Old workspace at data/parquet/ — see `da analyst setup`.
    Use `da metrics list` and `da metrics show <id>`.
    """
    hits = _scan_legacy_strings(text)
    assert "da sync" in hits
    assert "da fetch" in hits
    assert "data/parquet" in hits
    assert "da analyst setup" in hits
    assert "da metrics list" in hits
    assert "da metrics show" in hits


def test_scan_returns_empty_for_clean_text():
    text = "Use `agnes pull` to refresh, `agnes snapshot create` for ad-hoc, `server/parquet/`."
    assert _scan_legacy_strings(text) == []


def test_scan_returns_unique_sorted_hits():
    text = "da sync da sync data/parquet/ data/parquet/foo"
    hits = _scan_legacy_strings(text)
    assert hits == sorted(set(hits))


def test_legacy_strings_constant_shape():
    assert isinstance(_LEGACY_STRINGS, tuple)
    assert all(isinstance(s, str) for s in _LEGACY_STRINGS)
    assert "da sync" in _LEGACY_STRINGS
    assert "data/parquet" in _LEGACY_STRINGS
  • Step 2: Run tests to verify they fail
pytest tests/test_legacy_strings_scan.py -v

Expected: FAIL with ImportError: cannot import name '_scan_legacy_strings' from 'app.api.claude_md'.

  • Step 3: Add _LEGACY_STRINGS and _scan_legacy_strings to app/api/claude_md.py

Insert near the other module-level constants (after the imports, before the class definitions — find a stable location, e.g., right before class ClaudeMdResponse):

# Substrings that, when found in an admin-saved CLAUDE.md override, signal
# the override is stale relative to the post-clean-bootstrap CLI surface.
# Surfaced via TemplateGetResponse.legacy_strings_detected so the admin UI
# can render a yellow banner prompting re-authoring.
_LEGACY_STRINGS = (
    "data/parquet",
    "da sync",
    "da fetch",
    "da analyst setup",
    "da metrics list",
    "da metrics show",
)


def _scan_legacy_strings(text: str) -> list[str]:
    """Return sorted unique substrings from _LEGACY_STRINGS present in text."""
    return sorted({s for s in _LEGACY_STRINGS if s in text})
  • Step 4: Run tests to verify they pass
pytest tests/test_legacy_strings_scan.py -v

Expected: all PASS.

  • Step 5: Augment TemplateGetResponse

Find class TemplateGetResponse (around app/api/claude_md.py:72-76) and add the field:

class TemplateGetResponse(BaseModel):
    content: Optional[str]
    default: str
    updated_at: Optional[str] = None
    updated_by: Optional[str] = None
    legacy_strings_detected: list[str] = []  # populated when override contains stale verbs/paths
  • Step 6: Populate the field in admin_get_workspace_template

Find the route (search for admin_get_workspace_template in app/api/claude_md.py). Inside the function body, before constructing the response, add:

# Scan the saved override (not the live default) for legacy strings.
# A non-empty list triggers the yellow banner in the admin UI.
override_text = override.content if override else ""
legacy_hits = _scan_legacy_strings(override_text)

Then include legacy_strings_detected=legacy_hits in the TemplateGetResponse(...) construction.

  • Step 7: Add an HTTP test for the populated field

Append to tests/test_legacy_strings_scan.py:

def test_admin_get_template_returns_legacy_strings_when_override_dirty(web_session):
    """Setting an override containing legacy strings populates the field."""
    web_session.put("/api/admin/workspace-prompt-template",
                    json={"content": "Run `da sync` and check data/parquet/."})
    resp = web_session.get("/api/admin/workspace-prompt-template")
    assert resp.status_code == 200
    body = resp.json()
    assert "da sync" in body["legacy_strings_detected"]
    assert "data/parquet" in body["legacy_strings_detected"]


def test_admin_get_template_returns_empty_when_clean(web_session):
    web_session.put("/api/admin/workspace-prompt-template",
                    json={"content": "Use `agnes pull` and check `server/parquet/`."})
    resp = web_session.get("/api/admin/workspace-prompt-template")
    assert resp.json()["legacy_strings_detected"] == []

These depend on web_session fixture from Task 22; mark pytest.skip if not yet present.

  • Step 8: Run all claude_md tests
pytest tests/test_legacy_strings_scan.py tests/ -k claude_md -v

Expected: PASS (skip the HTTP tests if fixture missing — that's OK).

  • Step 9: Commit
git add app/api/claude_md.py tests/test_legacy_strings_scan.py
git commit -m "feat(admin): scan CLAUDE.md override for legacy strings"

Task 3: Add role parameter to setup_instructions.py (analyst branch)

Files:

  • Modify: app/web/setup_instructions.py (add role parameter to resolve_lines and render_setup_instructions; add analyst-branch helper)

  • Test: tests/test_setup_instructions_analyst.py (new)

  • Step 1: Write failing tests

# tests/test_setup_instructions_analyst.py
"""Tests for analyst-branch rendering of /setup paste prompt."""

from app.web.setup_instructions import render_setup_instructions


def test_render_analyst_role_basic():
    text = render_setup_instructions(
        server_url="https://agnes.example.com",
        token="agnes_pat_TEST",
        wheel_filename="agnes-0.32.0-py3-none-any.whl",
        role="analyst",
    )
    # Required content for analyst role:
    assert "uv tool install" in text
    assert "agnes init" in text
    assert "--token" in text and "agnes_pat_TEST" in text
    assert "--server-url" in text and "https://agnes.example.com" in text
    assert "agnes catalog" in text  # smoke verify step
    # Forbidden content (admin-only):
    assert "marketplace" not in text
    assert "claude plugin install" not in text
    assert "agnes skills install" not in text  # analyst doesn't bulk-install skills
    assert "agnes diagnose" not in text  # analyst smoke verify is `agnes catalog`, not diagnose


def test_render_admin_role_unchanged():
    """Default role=admin keeps the existing 6/8-step layout."""
    text = render_setup_instructions(
        server_url="https://agnes.example.com",
        token="agnes_pat_TEST",
        wheel_filename="agnes-0.32.0-py3-none-any.whl",
        # role omitted — defaults to "admin"
    )
    assert "agnes auth import-token" in text  # admin uses import-token, not agnes init
    assert "agnes diagnose" in text  # admin keeps diagnose


def test_render_analyst_with_ca_pem():
    """Analyst role + private CA → TLS trust block reused from admin path."""
    text = render_setup_instructions(
        server_url="https://agnes.example.com",
        token="agnes_pat_TEST",
        wheel_filename="agnes-0.32.0-py3-none-any.whl",
        role="analyst",
        ca_pem="-----BEGIN CERTIFICATE-----\nMIIBxxx\n-----END CERTIFICATE-----",
    )
    assert "AGNES_CA_PEM" in text  # heredoc marker from trust block
    assert "ca-bundle.pem" in text
    assert "agnes init" in text  # analyst-specific step still present
  • Step 2: Run tests to verify they fail
pytest tests/test_setup_instructions_analyst.py -v

Expected: FAIL — render_setup_instructions() doesn't accept role parameter.

  • Step 3: Add analyst-branch helper functions

Insert after _install_cli_lines (around line 311 in setup_instructions.py):

def _analyst_init_lines(server_url_placeholder: str = "{server_url}") -> list[str]:
    """Steps 2-3 — `agnes init` (auth + workspace bootstrap) + smoke verify.

    Replaces the admin-flow login + verify steps (today: `agnes auth import-token`
    + `agnes auth whoami`). `agnes init` is non-interactive: `--token` carries the PAT,
    `--server-url` carries the origin. The bootstrap PAT has a 1 h TTL — if the
    user takes longer than that to paste this prompt, the init call returns 401
    and the user re-clicks "Generate prompt" on the install page.
    """
    return [
        "",
        "2) Bootstrap your analyst workspace in this directory:",
        f"   agnes init --server-url \"{server_url_placeholder}\" --token \"{{token}}\" --workspace .",
        "",
        "   This authenticates with the PAT, fetches your CLAUDE.md (RBAC-filtered),",
        "   installs Claude Code SessionStart/End hooks (auto-refresh), and runs an",
        "   initial `agnes pull` so your DuckDB views are ready.",
        "",
        "3) Verify the data is queryable:",
        "   agnes catalog",
        "",
        "   This should list the tables your account has grants for. Empty list",
        "   means your admin hasn't granted you access yet — contact them.",
    ]


def _analyst_finale_lines(confirm_step_num: str, has_ca: bool) -> list[str]:
    """Final Confirm step for analyst role. Shorter than admin: no marketplace,
    no plugins, no skills."""
    bullets = [
        "   - `agnes --version` output",
        "   - First few lines of `agnes catalog` (tables you can see)",
        "   - Confirmation that `./CLAUDE.md` and `./AGNES_WORKSPACE.md` exist",
        "   - Confirmation that `./.claude/settings.json` contains SessionStart/End hooks",
    ]
    if has_ca:
        bullets.append(
            "   - Which CA bundle source got picked in step 0(d)"
        )
    return [
        "",
        f"{confirm_step_num}) Confirm:",
        "   Tell me \"Agnes analyst workspace is ready\" and summarize:",
        *bullets,
    ]
  • Step 4: Add role parameter to resolve_lines and render_setup_instructions

Find def resolve_lines(...) (around line 609). Modify the signature and dispatch:

from typing import Literal

def resolve_lines(
    wheel_filename: str,
    *,
    plugin_install_names: list[str] | None = None,
    self_signed_tls: bool = False,
    server_host: str = "",
    ca_pem: str | None = None,
    role: Literal["analyst", "admin"] = "admin",
) -> list[str]:
    """..."""
    if role == "analyst":
        return _resolve_analyst_lines(wheel_filename, ca_pem=ca_pem)
    # Existing admin path:
    names = list(plugin_install_names or [])
    has_marketplace = bool(names)
    has_ca = bool(ca_pem and ca_pem.strip())
    # ... (existing body unchanged)

Add the new analyst dispatcher right after resolve_lines:

def _resolve_analyst_lines(wheel_filename: str, *, ca_pem: str | None) -> list[str]:
    """Analyst workspace-bootstrap layout. Self-contained — no admin-only steps."""
    has_ca = bool(ca_pem and ca_pem.strip())
    confirm_step = "4" if has_ca else "4"  # numbering: 0 (TLS optional), 1, 2, 3, 4

    lines: list[str] = []
    if has_ca:
        lines.extend(_tls_trust_block(ca_pem))
    lines.extend(_preamble_lines(has_ca=has_ca))
    lines.extend(_install_cli_lines(has_ca=has_ca))   # step 1
    lines.extend(_analyst_init_lines())                # steps 2-3
    lines.extend(_analyst_finale_lines(confirm_step, has_ca=has_ca))  # step 4

    return [
        line.replace("{wheel_filename}", wheel_filename)
        for line in lines
    ]

Update render_setup_instructions to accept and forward role:

def render_setup_instructions(
    server_url: str,
    token: str,
    wheel_filename: str = "agnes.whl",
    *,
    plugin_install_names: list[str] | None = None,
    self_signed_tls: bool = False,
    server_host: str = "",
    ca_pem: str | None = None,
    role: Literal["analyst", "admin"] = "admin",
) -> str:
    lines = resolve_lines(
        wheel_filename,
        plugin_install_names=plugin_install_names,
        self_signed_tls=self_signed_tls,
        server_host=server_host,
        ca_pem=ca_pem,
        role=role,
    )
    text = "\n".join(lines)
    return text.replace("{server_url}", server_url).replace("{token}", token)
  • Step 5: Run tests to verify they pass
pytest tests/test_setup_instructions_analyst.py -v

Expected: all PASS.

  • Step 6: Run regression on existing setup-instruction tests
pytest tests/ -k setup_instructions -v

Expected: existing admin-role tests still PASS (no regression).

  • Step 7: Commit
git add app/web/setup_instructions.py tests/test_setup_instructions_analyst.py
git commit -m "feat(setup): add analyst role to install-prompt renderer"

Task 4: Add role query branching to /setup route

Files:

  • Modify: app/web/router.py (setup_page around line 717 — read role query param, pass to renderer)

  • Test: tests/test_setup_page_roles.py (new)

  • Step 1: Read existing setup_page to understand its current shape

grep -n "setup_page\|/setup\|/install" app/web/router.py | head

Read the function (~30 lines around the match) to understand its current call sites and template rendering.

  • Step 2: Write failing tests
# tests/test_setup_page_roles.py
"""Tests for /setup role query-param branching."""


def test_setup_page_default_role_is_admin(client):
    resp = client.get("/setup")
    assert resp.status_code == 200
    # Admin tile is active; analyst tile is linked.
    assert "Admin CLI" in resp.text or "role=admin" in resp.text


def test_setup_page_analyst_role(client):
    resp = client.get("/setup?role=analyst")
    assert resp.status_code == 200
    assert "Analyst workspace" in resp.text or "role=analyst" in resp.text


def test_install_redirects_to_setup(client):
    resp = client.get("/install", follow_redirects=False)
    assert resp.status_code in (302, 307)
    assert "/setup" in resp.headers["location"]
  • Step 3: Run tests to verify they fail
pytest tests/test_setup_page_roles.py -v

Expected: tests for analyst/role-branching content FAIL; redirect test may PASS (existing behavior).

  • Step 4: Modify setup_page to read role query param

Find setup_page in app/web/router.py. Update its signature to add a role query param and pass it to the renderer:

from typing import Literal
from fastapi import Query

@router.get("/setup", response_class=HTMLResponse)
async def setup_page(
    request: Request,
    role: Literal["analyst", "admin"] = Query(default="admin", description="Bootstrap target role"),
    # ... existing dependencies (auth, etc.)
):
    """Renders the role-specific install paste prompt."""
    # ... existing context-building code ...
    ctx["role"] = role
    return templates.TemplateResponse(request, "setup.html", ctx)

If setup_page already calls render_setup_instructions(...) server-side (vs. JS-rendered), pass role there too:

prompt_text = render_setup_instructions(
    server_url=str(request.base_url).rstrip("/"),
    token="{token}",  # placeholder filled by JS at click time
    wheel_filename=resolved_wheel,
    plugin_install_names=plugin_install_names if role == "admin" else None,
    self_signed_tls=...,
    server_host=...,
    ca_pem=...,
    role=role,
)
  • Step 5: Update setup.html template to render role tiles

Find app/web/templates/setup.html (or whatever setup_page actually renders — grep -n "setup.html\|TemplateResponse" app/web/router.py). Add two role tiles near the top of the body:

<div class="role-tiles" style="display:flex; gap:1rem; margin-bottom:2rem;">
  <a href="/setup?role=analyst"
     class="role-tile {% if role == 'analyst' %}is-active{% endif %}"
     style="flex:1; padding:1rem; border:2px solid {% if role == 'analyst' %}#0070f3{% else %}#ddd{% endif %}; border-radius:8px; text-decoration:none;">
    <h3>Analyst workspace</h3>
    <p>Bootstrap a workspace folder with CLAUDE.md, hooks, and synced data.</p>
  </a>
  <a href="/setup?role=admin"
     class="role-tile {% if role == 'admin' %}is-active{% endif %}"
     style="flex:1; padding:1rem; border:2px solid {% if role == 'admin' %}#0070f3{% else %}#ddd{% endif %}; border-radius:8px; text-decoration:none;">
    <h3>Admin CLI</h3>
    <p>Install the CLI, register the marketplace, set up admin tooling.</p>
  </a>
</div>

If the template is more sophisticated (e.g., with role-specific JS), wire the JS to use the role ctx variable when calling POST /auth/tokens for PAT minting:

const role = "{{ role }}";
const scope = role === "analyst" ? "bootstrap-analyst" : "general";
const ttlSeconds = role === "analyst" ? 3600 : 86400; // analyst short-lived

await fetch('/auth/tokens', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ name: `setup-${role}`, scope, ttl_seconds: ttlSeconds }),
});
  • Step 6: Run tests to verify they pass
pytest tests/test_setup_page_roles.py -v

Expected: all PASS.

  • Step 7: Commit
git add app/web/router.py app/web/templates/setup.html tests/test_setup_page_roles.py
git commit -m "feat(setup): /setup?role=analyst|admin branching with role tiles"

Task 5: Add legacy-strings banner to admin workspace-prompt template UI

Files:

  • Modify: app/web/templates/admin_workspace_prompt.html (add banner above editor when legacy_strings_detected non-empty)

  • Test: tests/test_legacy_strings_scan.py (extend with HTML rendering test)

  • Step 1: Read existing admin-prompt template

cat app/web/templates/admin_workspace_prompt.html

Find where the editor (textarea) is rendered.

  • Step 2: Write extension test

Append to tests/test_legacy_strings_scan.py:

def test_admin_prompt_template_renders_banner_when_legacy_present(web_session):
    web_session.put("/api/admin/workspace-prompt-template",
                    json={"content": "Run `da sync` daily."})
    resp = web_session.get("/admin/workspace-prompt")
    assert resp.status_code == 200
    assert "yellow" in resp.text.lower() or "warning" in resp.text.lower()
    assert "da sync" in resp.text  # the hit is rendered in the banner


def test_admin_prompt_template_no_banner_when_clean(web_session):
    web_session.put("/api/admin/workspace-prompt-template",
                    json={"content": "Run `agnes pull` daily."})
    resp = web_session.get("/admin/workspace-prompt")
    assert resp.status_code == 200
    # The banner div is absent or empty
    # Implementation: e.g., id="legacy-banner" wraps the warning; check empty
    assert "legacy-banner" not in resp.text or "display: none" in resp.text or len(
        [l for l in resp.text.split("\n") if "legacy-banner" in l and "hidden" not in l]
    ) <= 2

(The exact test is fragile — strengthen once the implementation lands.)

  • Step 3: Run tests to verify they fail
pytest tests/test_legacy_strings_scan.py -v -k banner
  • Step 4: Modify admin_workspace_prompt.html

Find the spot above the editor <textarea> and insert:

{% if legacy_strings_detected %}
<div id="legacy-banner" style="background:#fff3cd; border:1px solid #ffc107; padding:0.75rem 1rem; border-radius:4px; margin-bottom:1rem;">
  <strong>⚠ This override references CLI verbs / paths that were renamed:</strong>
  <ul style="margin:0.5rem 0 0 1.5rem;">
    {% for hit in legacy_strings_detected %}
    <li><code>{{ hit }}</code></li>
    {% endfor %}
  </ul>
  <p style="margin:0.5rem 0 0 0;">Re-author and Save to clear this warning. See CHANGELOG for the rename list.</p>
</div>
{% endif %}

In the route that renders this template (find via grep -n "admin_workspace_prompt\.html" app/web/router.py app/web/admin_router.py), pass legacy_strings_detected into the context. The data comes from the same _scan_legacy_strings(override_text) call as the API — DRY by importing from app.api.claude_md:

from app.api.claude_md import _scan_legacy_strings

@router.get("/admin/workspace-prompt", response_class=HTMLResponse)
async def admin_workspace_prompt_page(...):
    override = repo.get_workspace_prompt_template()
    ctx = {
        "override_content": override.content if override else "",
        "legacy_strings_detected": _scan_legacy_strings(override.content) if override else [],
        # ... rest of existing context
    }
    return templates.TemplateResponse(request, "admin_workspace_prompt.html", ctx)
  • Step 5: Run tests to verify they pass
pytest tests/test_legacy_strings_scan.py -v

Expected: PASS.

  • Step 6: Commit
git add app/web/templates/admin_workspace_prompt.html app/web/router.py tests/test_legacy_strings_scan.py
git commit -m "feat(admin): yellow banner for legacy CLI verbs in workspace-prompt override"

Task 6: Update config/claude_md_template.txt (server-side rendered to /api/welcome)

Files:

  • Modify: config/claude_md_template.txt (verb + path rewrites)

  • Step 1: Read the current template

cat config/claude_md_template.txt | head -100
wc -l config/claude_md_template.txt
  • Step 2: Apply systematic rewrites

Replace throughout the file:

  • da syncagnes pull (everywhere)
  • da analyst setupagnes init (everywhere)
  • da fetchagnes snapshot create
  • da metrics listagnes catalog --metrics
  • da metrics showagnes catalog --metrics --show
  • data/parquet/server/parquet/
  • data/duckdb/user/duckdb/
  • data/metadata/ → (delete references; the path no longer exists)

Use sed:

sed -i.bak \
  -e 's|da sync --upload-only|agnes push|g' \
  -e 's|da sync|agnes pull|g' \
  -e 's|da analyst setup|agnes init|g' \
  -e 's|da fetch|agnes snapshot create|g' \
  -e 's|da metrics list|agnes catalog --metrics|g' \
  -e 's|da metrics show|agnes catalog --metrics --show|g' \
  -e 's|data/parquet/|server/parquet/|g' \
  -e 's|data/duckdb/|user/duckdb/|g' \
  config/claude_md_template.txt

rm config/claude_md_template.txt.bak
  • Step 3: Read the result and review for any leftover legacy strings
grep -nE 'da sync|da fetch|da analyst|da metrics list|da metrics show|data/parquet|data/duckdb|data/metadata' config/claude_md_template.txt

Expected: no matches.

  • Step 4: Add a top-of-file pointer to AGNES_WORKSPACE.md

Insert near the top of the rendered template (e.g., after the # {instance_name} heading):

> Looking for human-readable workspace docs? Open `AGNES_WORKSPACE.md` in this directory — that file documents what `agnes init` installed, where files live, and how to uninstall.
  • Step 5: Render the template via /api/welcome (manual smoke)

(Defer the real test to Task 27 — clean-install integration.)

  • Step 6: Commit
git add config/claude_md_template.txt
git commit -m "docs(claude-md-template): rewrite verbs + paths for new CLI surface"

Phase 2 — Client-side library (cli/lib/)

Task 7: Establish cli/lib/ package + install_claude_hooks

Files:

  • Create: cli/lib/__init__.py, cli/lib/hooks.py

  • Test: tests/test_lib_hooks.py (new)

  • Step 1: Write failing tests

# tests/test_lib_hooks.py
"""Tests for cli/lib/hooks.py:install_claude_hooks."""

import json
from pathlib import Path

import pytest

from cli.lib.hooks import install_claude_hooks


def _read_settings(workspace: Path) -> dict:
    return json.loads((workspace / ".claude" / "settings.json").read_text())


def test_install_creates_settings_file(tmp_path):
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    assert cfg["hooks"]["SessionStart"]
    assert "agnes pull --quiet" in cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]
    assert cfg["hooks"]["SessionEnd"]
    assert "agnes push --quiet" in cfg["hooks"]["SessionEnd"][0]["hooks"][0]["command"]


def test_install_idempotent(tmp_path):
    install_claude_hooks(tmp_path)
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    # Only ONE entry per event after second install (not duplicated)
    assert len(cfg["hooks"]["SessionStart"]) == 1
    assert len(cfg["hooks"]["SessionEnd"]) == 1


def test_install_replaces_old_da_sync_entries(tmp_path):
    """Hook from a pre-rewrite workspace gets replaced cleanly."""
    settings_path = tmp_path / ".claude" / "settings.json"
    settings_path.parent.mkdir(parents=True)
    settings_path.write_text(json.dumps({
        "hooks": {
            "SessionStart": [{"hooks": [{"type": "command", "command": "da sync --quiet"}]}],
            "SessionEnd": [{"hooks": [{"type": "command", "command": "da sync --upload-only --quiet"}]}],
        }
    }))
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    assert len(cfg["hooks"]["SessionStart"]) == 1
    assert "agnes pull" in cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]
    assert "da sync" not in cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]


def test_install_preserves_third_party_hooks(tmp_path):
    settings_path = tmp_path / ".claude" / "settings.json"
    settings_path.parent.mkdir(parents=True)
    settings_path.write_text(json.dumps({
        "hooks": {
            "SessionStart": [{"hooks": [{"type": "command", "command": "echo hi from another tool"}]}],
            "PreToolUse": [{"hooks": [{"type": "command", "command": "echo pre"}]}],
        }
    }))
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    # Third-party SessionStart entry survives; our agnes pull entry appended
    starts = cfg["hooks"]["SessionStart"]
    assert any("echo hi from another tool" in s["hooks"][0]["command"] for s in starts)
    assert any("agnes pull" in s["hooks"][0]["command"] for s in starts)
    # PreToolUse untouched
    assert cfg["hooks"]["PreToolUse"][0]["hooks"][0]["command"] == "echo pre"


def test_install_handles_missing_settings_file(tmp_path):
    """No prior settings.json → create from scratch."""
    install_claude_hooks(tmp_path)
    assert (tmp_path / ".claude" / "settings.json").exists()


def test_install_handles_invalid_json(tmp_path, capsys):
    """Invalid existing settings.json → warn, skip."""
    settings_path = tmp_path / ".claude" / "settings.json"
    settings_path.parent.mkdir(parents=True)
    settings_path.write_text("not valid json {")
    # Should not raise; should print a warning
    install_claude_hooks(tmp_path)
    captured = capsys.readouterr()
    assert "not valid JSON" in captured.err or "warning" in captured.err.lower()
  • Step 2: Run tests to verify they fail
pytest tests/test_lib_hooks.py -v

Expected: ImportErrorcli.lib doesn't exist.

  • Step 3: Create cli/lib/__init__.py
touch cli/lib/__init__.py
  • Step 4: Create cli/lib/hooks.py
# cli/lib/hooks.py
"""Workspace-scoped Claude Code hook installer.

Replaces the in-place `_install_claude_hooks` from `cli/commands/analyst.py`
(deleted as part of the clean-analyst-bootstrap rewrite). Splits hook
installation into a pure-function library so `agnes init` and any future caller
can use it without dragging in the deleted command module.

Design notes:
- Workspace-scoped (`<workspace>/.claude/settings.json`), NOT user-home.
  The hooks fire only when Claude Code opens this workspace.
- Idempotent: second invocation drops a prior `agnes pull` / `da sync` /
  `agnes push` entry (matched by command substring) and appends fresh entries.
  Third-party hooks (mixed entries, foreign commands) are left alone.
- Uses `\\| true` so the hook never blocks a session on a transient sync error.
"""

from __future__ import annotations

import json
import sys
from pathlib import Path

_OUR_COMMAND_MARKERS = ("agnes pull", "agnes push", "da sync")


def install_claude_hooks(workspace: Path) -> None:
    """Install SessionStart→`agnes pull` and SessionEnd→`agnes push` hooks.

    Idempotent. Workspace-scoped (writes `<workspace>/.claude/settings.json`).
    Preserves third-party hooks and other event types.
    """
    settings_path = workspace / ".claude" / "settings.json"
    settings_path.parent.mkdir(parents=True, exist_ok=True)

    if settings_path.exists():
        try:
            cfg = json.loads(settings_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            print(
                f"Warning: {settings_path} is not valid JSON; skipping hook install.",
                file=sys.stderr,
            )
            return
    else:
        cfg = {}

    hooks = cfg.setdefault("hooks", {})

    def _replace_or_add(event: str, command: str) -> None:
        existing = hooks.setdefault(event, [])
        # Drop any prior entry whose every hook command matches one of our
        # markers. Mixed entries (third-party + ours) are left alone.
        for entry in list(existing):
            entry_cmds = [h.get("command", "") for h in entry.get("hooks", [])]
            if entry_cmds and all(
                any(marker in c for marker in _OUR_COMMAND_MARKERS) for c in entry_cmds
            ):
                existing.remove(entry)
        existing.append({"hooks": [{"type": "command", "command": command}]})

    _replace_or_add("SessionStart", "agnes pull --quiet 2>/dev/null || true")
    _replace_or_add("SessionEnd", "agnes push --quiet 2>/dev/null || true")

    settings_path.write_text(json.dumps(cfg, indent=2) + "\n", encoding="utf-8")
  • Step 5: Run tests to verify they pass
pytest tests/test_lib_hooks.py -v

Expected: all PASS.

  • Step 6: Commit
git add cli/lib/__init__.py cli/lib/hooks.py tests/test_lib_hooks.py
git commit -m "feat(cli-lib): cli/lib/hooks.py:install_claude_hooks"

Task 8: cli/lib/pull.py:run_pull — extract data-refresh primitive from sync.py

Files:

  • Create: cli/lib/pull.py

  • Test: tests/test_lib_pull.py (new)

  • Step 1: Read cli/commands/sync.py to identify the function body to lift

wc -l cli/commands/sync.py
grep -n "^def \|^class " cli/commands/sync.py

Identify:

  • The Typer command function (e.g., sync() decorated with @sync_app.command())

  • The helper functions called from it: _rebuild_duckdb_views, _fetch_and_write_rules, _is_valid_parquet, etc.

  • Step 2: Write failing tests

# tests/test_lib_pull.py
"""Tests for cli/lib/pull.py:run_pull."""

from __future__ import annotations

import json
from pathlib import Path
from unittest.mock import patch, MagicMock

import pytest

from cli.lib.pull import run_pull, PullResult


@pytest.fixture
def fake_server(monkeypatch):
    """Mock api_get to return canned manifest + memory bundle."""
    canned = {
        "/api/sync/manifest": {"tables": []},
        "/api/memory/bundle": {"mandatory": [], "approved": []},
    }
    def _api_get(path, *args, **kwargs):
        resp = MagicMock()
        resp.status_code = 200
        body = canned.get(path, {})
        resp.json.return_value = body
        resp.iter_bytes = lambda chunk_size=65536: iter([b""])
        resp.raise_for_status = lambda: None
        return resp
    monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
    return canned


def test_run_pull_empty_manifest_no_parquet_dir(tmp_path, fake_server):
    result = run_pull(server_url="http://x", token="t", workspace=tmp_path)
    assert isinstance(result, PullResult)
    assert result.tables_updated == 0
    assert not (tmp_path / "server" / "parquet").exists(), \
        "lazy mkdir: empty manifest must not create server/parquet/"


def test_run_pull_empty_memory_no_rules_dir(tmp_path, fake_server):
    run_pull(server_url="http://x", token="t", workspace=tmp_path)
    assert not (tmp_path / ".claude" / "rules").exists(), \
        "lazy mkdir: empty bundle must not create .claude/rules/"


def test_run_pull_creates_duckdb_unconditionally(tmp_path, fake_server):
    """Even with zero data, the DuckDB file is opened (it's the load-bearing
    artifact and other readers expect its parent dir to exist)."""
    run_pull(server_url="http://x", token="t", workspace=tmp_path)
    assert (tmp_path / "user" / "duckdb" / "analytics.duckdb").exists()


def test_run_pull_with_one_table(tmp_path, monkeypatch):
    """Manifest with one table → server/parquet/ created, parquet downloaded."""
    canned_manifest = {"tables": [{"id": "tbl1", "md5": "abc"}]}
    canned_memory = {"mandatory": [], "approved": []}
    parquet_bytes = b"PAR1" + b"\x00" * 1000 + b"PAR1"  # minimal valid parquet shape

    def _api_get(path, *args, **kwargs):
        resp = MagicMock()
        resp.status_code = 200
        if path == "/api/sync/manifest":
            resp.json.return_value = canned_manifest
        elif path == "/api/memory/bundle":
            resp.json.return_value = canned_memory
        elif path.startswith("/api/data/tbl1/download"):
            resp.iter_bytes = lambda chunk_size=65536: iter([parquet_bytes])
        return resp

    monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
    monkeypatch.setattr("cli.lib.pull._is_valid_parquet", lambda p: True, raising=False)

    result = run_pull(server_url="http://x", token="t", workspace=tmp_path)
    assert (tmp_path / "server" / "parquet").exists()
    assert (tmp_path / "server" / "parquet" / "tbl1.parquet").exists()
    assert result.tables_updated == 1


def test_run_pull_dry_run_writes_nothing(tmp_path, fake_server):
    run_pull(server_url="http://x", token="t", workspace=tmp_path, dry_run=True)
    assert not (tmp_path / "server").exists()
    assert not (tmp_path / "user" / "duckdb").exists()
  • Step 3: Run tests to verify they fail
pytest tests/test_lib_pull.py -v

Expected: ImportError on cli.lib.pull.

  • Step 4: Create cli/lib/pull.py

Lift the body of today's cli/commands/sync.py:sync() into a pure function. Specifically:

  • Move _rebuild_duckdb_views, _fetch_and_write_rules, _is_valid_parquet (private helpers) into cli/lib/pull.py.
  • Drop Typer decorators and typer.echo calls — replace with returning structured result.
  • Apply lazy-mkdir fixes:
    • _fetch_and_write_rules: check mandatory + approved non-empty before mkdir.
    • Per-table download loop: mkdir server/parquet/ inside the loop, only when about to write.
# cli/lib/pull.py
"""Pure-function data-refresh primitive — used by `agnes pull` and `agnes init`.

Extracted from `cli/commands/sync.py` (deleted in the clean-bootstrap rewrite).
This module has no Typer dependency, no stdout side effects, no exit calls.
Callers decide what to print and how to handle errors.

Lazy-mkdir contract:
- `<workspace>/server/parquet/` — created only when the manifest has at least one
  table to download. Empty manifest → directory is never created.
- `<workspace>/.claude/rules/` — created only when `/api/memory/bundle` returns
  at least one mandatory or approved item. Empty bundle → directory absent.
- `<workspace>/user/duckdb/analytics.duckdb` — created unconditionally. The DB
  file (not just the dir) is the load-bearing artifact every reader expects.
"""

from __future__ import annotations

import hashlib
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional

from cli.client import api_get


_SAFE_ID_RE = re.compile(r"^[a-zA-Z0-9_\-]+$")


@dataclass
class PullResult:
    tables_updated: int = 0
    parquets_total: int = 0
    rules_count: int = 0
    duration_s: float = 0.0
    errors: list[str] = field(default_factory=list)


def run_pull(
    server_url: str,
    token: str,
    workspace: Path,
    *,
    dry_run: bool = False,
) -> PullResult:
    """Refresh registered data into <workspace>/server/parquet + user/duckdb.

    Args:
        server_url: Base URL of the Agnes server.
        token: PAT for `Authorization: Bearer <token>`.
        workspace: Local workspace root. Lazy-mkdir applies inside this dir.
        dry_run: If True, computes deltas but writes nothing to disk.

    Returns:
        PullResult with summary counts.
    """
    import time
    start = time.time()
    result = PullResult()

    # 1. Fetch RBAC-filtered manifest
    resp = api_get("/api/sync/manifest", server_url=server_url, token=token)
    resp.raise_for_status()
    manifest = resp.json()
    tables = manifest.get("tables", [])

    # 2. Per-table download with lazy mkdir
    parquet_dir = workspace / "server" / "parquet"
    for tbl in tables:
        tbl_id = tbl.get("id", "")
        if not _SAFE_ID_RE.match(tbl_id):
            result.errors.append(f"unsafe table id skipped: {tbl_id!r}")
            continue
        target = parquet_dir / f"{tbl_id}.parquet"
        # Skip if local md5 matches remote
        remote_md5 = tbl.get("md5", "")
        if target.exists() and _file_md5(target) == remote_md5:
            continue
        if dry_run:
            result.tables_updated += 1
            continue
        # Lazy mkdir: only when about to write the FIRST parquet.
        parquet_dir.mkdir(parents=True, exist_ok=True)
        try:
            stream_resp = api_get(
                f"/api/data/{tbl_id}/download", server_url=server_url, token=token, stream=True,
            )
            stream_resp.raise_for_status()
            with open(target, "wb") as fh:
                for chunk in stream_resp.iter_bytes(chunk_size=65536):
                    fh.write(chunk)
            if not _is_valid_parquet(target):
                target.unlink()
                result.errors.append(f"{tbl_id}: invalid parquet (missing PAR1)")
                continue
            result.tables_updated += 1
        except Exception as exc:
            if target.exists():
                target.unlink()
            result.errors.append(f"{tbl_id}: {exc}")

    # 3. Rebuild DuckDB views
    if not dry_run:
        _rebuild_duckdb_views(workspace, parquet_dir)
        result.parquets_total = len(list(parquet_dir.glob("*.parquet"))) if parquet_dir.exists() else 0

    # 4. Fetch corporate-memory bundle (lazy mkdir for .claude/rules/)
    if not dry_run:
        result.rules_count = _fetch_and_write_rules(workspace, server_url, token)

    result.duration_s = time.time() - start
    return result


def _file_md5(path: Path) -> str:
    h = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def _is_valid_parquet(path: Path) -> bool:
    """Cheap structural check — parquet files begin and end with `PAR1`."""
    try:
        with open(path, "rb") as fh:
            head = fh.read(4)
            fh.seek(-4, 2)
            tail = fh.read(4)
        return head == b"PAR1" and tail == b"PAR1"
    except OSError:
        return False


def _rebuild_duckdb_views(workspace: Path, parquet_dir: Path) -> None:
    import duckdb

    db_path = workspace / "user" / "duckdb" / "analytics.duckdb"
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = duckdb.connect(str(db_path))
    try:
        # Drop all existing views
        views = conn.execute(
            "SELECT table_name FROM information_schema.tables WHERE table_type='VIEW'"
        ).fetchall()
        for (view_name,) in views:
            conn.execute(f'DROP VIEW IF EXISTS "{view_name}"')
        # Recreate from parquets (if any)
        if parquet_dir.exists():
            for pq_file in parquet_dir.glob("*.parquet"):
                view_name = pq_file.stem
                if not _SAFE_ID_RE.match(view_name):
                    continue
                if not _is_valid_parquet(pq_file):
                    continue
                abs_path = str(pq_file.resolve()).replace("'", "''")
                try:
                    conn.execute(
                        f'CREATE VIEW "{view_name}" AS SELECT * FROM read_parquet(\'{abs_path}\')'
                    )
                except duckdb.Error:
                    continue
    finally:
        conn.close()


def _fetch_and_write_rules(workspace: Path, server_url: str, token: str) -> int:
    """Fetch /api/memory/bundle and write .claude/rules/km_*.md files.

    Lazy mkdir: only creates `<workspace>/.claude/rules/` if the bundle is non-empty.
    Returns the number of rule files written.
    """
    rules_dir = workspace / ".claude" / "rules"
    try:
        resp = api_get("/api/memory/bundle", server_url=server_url, token=token)
        resp.raise_for_status()
        bundle = resp.json()
    except Exception:
        return 0

    items = list(bundle.get("mandatory", [])) + list(bundle.get("approved", []))
    if not items:
        return 0  # no mkdir, nothing to write

    rules_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for item in items:
        item_id = item.get("id", "")
        if not _SAFE_ID_RE.match(item_id):
            continue
        fname = f"km_{item_id}.md"
        body = _item_to_md(item)
        (rules_dir / fname).write_text(body, encoding="utf-8")
        written += 1
    return written


def _item_to_md(item: dict) -> str:
    title = item.get("title", "")
    body = item.get("body", "")
    return f"# {title}\n\n{body}\n"

(If cli/client.py:api_get doesn't accept the server_url/token/stream kwargs as shown, adapt the calls to the actual signature.)

  • Step 5: Run tests to verify they pass
pytest tests/test_lib_pull.py -v

Expected: all PASS.

  • Step 6: Commit
git add cli/lib/pull.py tests/test_lib_pull.py
git commit -m "feat(cli-lib): cli/lib/pull.py:run_pull primitive with lazy mkdir"

Phase 3 — New CLI commands

Task 9: agnes pull Typer wrapper

Files:

  • Create: cli/commands/pull.py

  • Test: tests/test_cli_pull.py (new)

  • Step 1: Write failing test

# tests/test_cli_pull.py
"""Tests for `agnes pull` Typer wrapper."""

from typer.testing import CliRunner
from cli.commands.pull import pull_app

runner = CliRunner()


def test_pull_help():
    result = runner.invoke(pull_app, ["--help"])
    assert result.exit_code == 0
    assert "--quiet" in result.output
    assert "--json" in result.output
    assert "--dry-run" in result.output
  • Step 2: Run test to verify it fails
pytest tests/test_cli_pull.py -v

Expected: ImportError.

  • Step 3: Create cli/commands/pull.py
# cli/commands/pull.py
"""`agnes pull` — refresh registered data into the workspace.

Thin Typer wrapper around `cli/lib/pull.py:run_pull`. Used by:
- Manual invocation: analyst types `agnes pull` to force a refresh.
- SessionStart hook: `agnes pull --quiet 2>/dev/null || true` runs at the start
  of every Claude Code session in this workspace.
"""

from __future__ import annotations

import json
import os
from pathlib import Path

import typer

from cli.config import get_server_url, load_token
from cli.error_render import render_error
from cli.lib.pull import run_pull, PullResult


pull_app = typer.Typer(help="Refresh registered data from the server")


@pull_app.callback(invoke_without_command=True)
def pull(
    quiet: bool = typer.Option(False, "--quiet", help="Suppress success stdout"),
    as_json: bool = typer.Option(False, "--json", help="Machine-readable output"),
    dry_run: bool = typer.Option(False, "--dry-run", help="Compute deltas without writing"),
):
    """Refresh data from the server into ./server/parquet + ./user/duckdb."""
    server_url = get_server_url()
    if not server_url:
        typer.echo(render_error(0, {"detail": {
            "kind": "server_unreachable",
            "hint": "No server configured. Run: agnes init --server-url <URL> --token <PAT>",
        }}), err=True)
        raise typer.Exit(1)

    token = load_token()
    if not token:
        typer.echo(render_error(0, {"detail": {
            "kind": "auth_failed",
            "hint": "No token. Run: agnes auth import-token --token <PAT>",
        }}), err=True)
        raise typer.Exit(1)

    workspace = Path(os.environ.get("DA_LOCAL_DIR", ".")).resolve()

    try:
        result: PullResult = run_pull(server_url, token, workspace, dry_run=dry_run)
    except Exception as exc:
        typer.echo(render_error(0, {"detail": {
            "kind": "manifest_unauthorized",
            "hint": f"Pull failed: {exc}",
            "message": str(exc),
        }}), err=True)
        raise typer.Exit(1)

    if as_json:
        typer.echo(json.dumps({
            "tables_updated": result.tables_updated,
            "parquets_total": result.parquets_total,
            "rules_count": result.rules_count,
            "duration_s": round(result.duration_s, 3),
            "errors": result.errors,
        }))
        return

    if quiet:
        if result.errors:
            for e in result.errors:
                typer.echo(f"warn: {e}", err=True)
        return

    typer.echo(f"Updated {result.tables_updated} tables ({result.parquets_total} total).")
    typer.echo(f"Rules: {result.rules_count}.")
    if result.errors:
        for e in result.errors:
            typer.echo(f"warn: {e}", err=True)
  • Step 4: Run test to verify it passes
pytest tests/test_cli_pull.py -v

Expected: PASS.

  • Step 5: Commit
git add cli/commands/pull.py tests/test_cli_pull.py
git commit -m "feat(cli): agnes pull command (Typer wrapper around lib.pull.run_pull)"

Task 10: agnes push command (extract from da sync --upload-only)

Files:

  • Create: cli/commands/push.py

  • Test: tests/test_cli_push.py (new)

  • Step 1: Write failing tests

# tests/test_cli_push.py
from pathlib import Path
from typer.testing import CliRunner

from cli.commands.push import push_app

runner = CliRunner()


def test_push_help():
    result = runner.invoke(push_app, ["--help"])
    assert result.exit_code == 0
    assert "--quiet" in result.output
    assert "--json" in result.output


def test_push_no_sessions_no_mkdir(tmp_path, monkeypatch):
    """Empty workspace → push exits 0, doesn't create user/sessions/."""
    monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
    monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
    monkeypatch.setattr("cli.commands.push.load_token", lambda: "test-pat")
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    assert not (tmp_path / "user" / "sessions").exists()
  • Step 2: Run test to verify it fails
pytest tests/test_cli_push.py -v
  • Step 3: Create cli/commands/push.py
# cli/commands/push.py
"""`agnes push` — upload local sessions and CLAUDE.local.md to the server.

Extracted from today's `da sync --upload-only`. Hook command:
`agnes push --quiet 2>/dev/null || true` (runs at SessionEnd).
"""

from __future__ import annotations

import json
import os
from pathlib import Path

import typer

from cli.client import api_post
from cli.config import get_server_url, load_token
from cli.error_render import render_error


push_app = typer.Typer(help="Upload local sessions and notes to the server")


@push_app.callback(invoke_without_command=True)
def push(
    quiet: bool = typer.Option(False, "--quiet", help="Suppress stdout"),
    as_json: bool = typer.Option(False, "--json", help="Machine-readable output"),
    dry_run: bool = typer.Option(False, "--dry-run", help="List what would upload, don't send"),
):
    server_url = get_server_url()
    token = load_token()
    if not server_url or not token:
        typer.echo(render_error(0, {"detail": {
            "kind": "auth_failed",
            "hint": "No server/token configured. Run: agnes init",
        }}), err=True)
        raise typer.Exit(1)

    workspace = Path(os.environ.get("DA_LOCAL_DIR", ".")).resolve()
    sessions_dir = workspace / "user" / "sessions"
    local_md = workspace / ".claude" / "CLAUDE.local.md"

    sessions = []
    if sessions_dir.exists():
        sessions = sorted(sessions_dir.glob("*.jsonl"))

    has_local_md = local_md.exists()

    summary = {"sessions_count": len(sessions), "local_md": has_local_md, "uploaded": 0}

    if dry_run:
        if as_json:
            typer.echo(json.dumps(summary))
        else:
            typer.echo(f"Would upload: {len(sessions)} sessions, local_md={has_local_md}")
        return

    for session_file in sessions:
        try:
            with open(session_file, "rb") as fh:
                resp = api_post("/api/upload/sessions", files={"file": (session_file.name, fh)})
                if resp.status_code == 200:
                    summary["uploaded"] += 1
        except Exception as exc:
            if not quiet:
                typer.echo(f"warn: failed to upload {session_file.name}: {exc}", err=True)

    if has_local_md:
        try:
            with open(local_md, "rb") as fh:
                api_post("/api/upload/local-md", files={"file": (local_md.name, fh)})
        except Exception as exc:
            if not quiet:
                typer.echo(f"warn: failed to upload CLAUDE.local.md: {exc}", err=True)

    if as_json:
        typer.echo(json.dumps(summary))
    elif not quiet:
        typer.echo(f"Uploaded {summary['uploaded']} sessions, local_md={has_local_md}")
  • Step 4: Run tests to verify they pass
pytest tests/test_cli_push.py -v
  • Step 5: Commit
git add cli/commands/push.py tests/test_cli_push.py
git commit -m "feat(cli): agnes push command (extracted from sync --upload-only)"

Task 11: agnes init — workspace bootstrap orchestrator

Files:

  • Create: cli/commands/init.py, config/agnes_workspace_template.txt

  • Test: tests/test_cli_init.py (new)

  • Step 1: Write the AGNES_WORKSPACE.md template

Create config/agnes_workspace_template.txt:

# Agnes analyst workspace

**Created:** {created_at}
**Server:** {server_url}
**Workspace:** {workspace_path}

This file documents what `agnes init` installed on this machine and in this folder.
Read this when you want to know "what is this thing", "how does it work", or
"how do I uninstall it". For Claude Code's instructions, see `CLAUDE.md`.

---

## What's installed (global, per-user)

| Path | What it is | How to remove |
|------|------------|---------------|
| `~/.local/bin/agnes` | The `agnes` CLI binary | `uv tool uninstall agnes-the-ai-analyst` |
| `~/.config/da/config.yaml` | Default Agnes server URL | `rm -rf ~/.config/da/` |
| `~/.config/da/token.json` | Personal access token (PAT) | `rm ~/.config/da/token.json` |
| `~/.agnes/ca.pem` | Server's CA cert (private CA installs only) | `rm ~/.agnes/ca.pem` |
| `~/.agnes/ca-bundle.pem` | Combined system + Agnes CA bundle | `rm ~/.agnes/ca-bundle.pem` |
| `~/.zshrc` / `~/.bashrc` block (marker `AGNES_CA_PEM_TRUST`) | `PATH` + `SSL_CERT_FILE` env | Edit rc, remove block |

---

## What's in this folder

| Path | What it is |
|------|------------|
| `./CLAUDE.md` | Rules + golden path for Claude Code (fetched from server's `/api/welcome`) |
| `./AGNES_WORKSPACE.md` | This file |
| `./.claude/settings.json` | Claude Code config: model, permissions, hooks |
| `./.claude/CLAUDE.local.md` | Your private notes (uploaded on session end) |
| `./.claude/rules/km_*.md` | Server-pushed corporate-knowledge rules (only when granted) |
| `./server/parquet/*.parquet` | Synced data — RBAC-filtered subset (only when grants exist) |
| `./user/duckdb/analytics.duckdb` | DuckDB views over the parquets — what `agnes query` reads |
| `./user/snapshots/*.parquet` | Ad-hoc materialized snapshots from `agnes snapshot create` |
| `./user/sessions/*.jsonl` | Captured Claude Code sessions (uploaded on session end) |

Some folders only exist when they have content — `agnes pull` and `agnes push`
only create them when there's something to write.

---

## How it stays fresh

Two hooks in `./.claude/settings.json` keep this workspace in sync without
you doing anything:

- **SessionStart** → `agnes pull --quiet` — new parquets, schema changes, and
  updated rules pull down before Claude Code answers. Failure is silent;
  your session continues with the last-known data.
- **SessionEnd** → `agnes push --quiet` — your session transcript and
  `CLAUDE.local.md` ship to the server.

Both are workspace-scoped — they only run when Claude Code opens this folder.

---

## Cheat sheet

```bash
# Tables you can read (server-side catalog, RBAC-filtered)
agnes catalog
agnes catalog --json | jq '.[] | select(.query_mode=="local")'

# Schema and sample
agnes schema opportunity
agnes describe opportunity -n 10

# Run a SQL query (DuckDB flavor against local parquets)
agnes query "SELECT count(*) FROM opportunity WHERE stage='Closed Won'"

# Remote BigQuery query (server-side, no local materialization)
agnes query --remote "SELECT count(*) FROM web_sessions_example"

# Materialize a remote subset locally
agnes snapshot create web_sessions_example \
  --select event_date,country_code \
  --where "event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)" \
  --as recent_sessions

# Manual data refresh (the SessionStart hook does this automatically)
agnes pull

# Workspace status (what's synced, when)
agnes status

# Re-generate this workspace from scratch (preserves CLAUDE.local.md)
agnes init --server-url https://agnes.example.com --token <PAT> --force

Uninstall

# 1. Remove the CLI
uv tool uninstall agnes-the-ai-analyst

# 2. Remove global config and trust artifacts
rm -rf ~/.config/da
rm -rf ~/.agnes

# 3. Remove the env-var block from your shell rc
# Open ~/.zshrc or ~/.bashrc, find the lines between
# "# AGNES_CA_PEM_TRUST — added by Agnes setup" and the next blank line, delete.

# 4. Remove this workspace
rm -rf ./CLAUDE.md ./AGNES_WORKSPACE.md ./.claude ./server ./user

- [ ] **Step 2: Write failing tests for `agnes init`**

```python
# tests/test_cli_init.py
"""Tests for `agnes init` orchestrator command."""

import json
from pathlib import Path
from unittest.mock import patch
from typer.testing import CliRunner

from cli.commands.init import init_app

runner = CliRunner()


def test_init_help():
    result = runner.invoke(init_app, ["--help"])
    assert result.exit_code == 0
    assert "--server-url" in result.output
    assert "--token" in result.output
    assert "--force" in result.output
    assert "--workspace" in result.output


def test_init_writes_expected_files(tmp_path, monkeypatch):
    """Mocked end-to-end: init writes CLAUDE.md, settings.json, AGNES_WORKSPACE.md."""
    # Mock all server-side calls
    def _api_get(path, *args, **kwargs):
        from unittest.mock import MagicMock
        resp = MagicMock()
        resp.status_code = 200
        if path == "/api/catalog/tables":
            resp.json.return_value = []
        elif path == "/api/welcome":
            resp.json.return_value = {"content": "# Test CLAUDE.md\n\nUse `agnes pull`.\n"}
        elif path == "/api/sync/manifest":
            resp.json.return_value = {"tables": []}
        elif path == "/api/memory/bundle":
            resp.json.return_value = {"mandatory": [], "approved": []}
        return resp
    monkeypatch.setattr("cli.commands.init.api_get", _api_get, raising=False)
    monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)

    result = runner.invoke(init_app, [
        "--server-url", "http://test.example.com",
        "--token", "test-pat",
        "--workspace", str(tmp_path),
    ])
    assert result.exit_code == 0, result.output
    assert (tmp_path / "CLAUDE.md").exists()
    assert "agnes pull" in (tmp_path / "CLAUDE.md").read_text()
    assert (tmp_path / ".claude" / "settings.json").exists()
    assert (tmp_path / ".claude" / "CLAUDE.local.md").exists()
    assert (tmp_path / "AGNES_WORKSPACE.md").exists()
    assert (tmp_path / "user" / "duckdb" / "analytics.duckdb").exists()


def test_init_no_dead_dirs_zero_grants(tmp_path, monkeypatch):
    """Zero grants → no .claude/rules, no server/parquet, no user/sessions."""
    # Same mock as above
    def _api_get(path, *args, **kwargs):
        from unittest.mock import MagicMock
        resp = MagicMock()
        resp.status_code = 200
        if path == "/api/catalog/tables":
            resp.json.return_value = []
        elif path == "/api/welcome":
            resp.json.return_value = {"content": "test"}
        elif path == "/api/sync/manifest":
            resp.json.return_value = {"tables": []}
        elif path == "/api/memory/bundle":
            resp.json.return_value = {"mandatory": [], "approved": []}
        return resp
    monkeypatch.setattr("cli.commands.init.api_get", _api_get, raising=False)
    monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)

    runner.invoke(init_app, [
        "--server-url", "http://x", "--token", "t", "--workspace", str(tmp_path),
    ])
    for forbidden in ["data/parquet", "data/duckdb", "data/metadata",
                      "user/artifacts", "user/sessions",
                      "server/parquet", ".claude/rules"]:
        assert not (tmp_path / forbidden).exists(), f"forbidden created: {forbidden}"


def test_init_force_preserves_local_md(tmp_path, monkeypatch):
    """--force regenerates CLAUDE.md but never touches CLAUDE.local.md."""
    # First init
    def _api_get(path, *args, **kwargs):
        from unittest.mock import MagicMock
        resp = MagicMock()
        resp.status_code = 200
        if path == "/api/catalog/tables": resp.json.return_value = []
        elif path == "/api/welcome": resp.json.return_value = {"content": "v1"}
        else: resp.json.return_value = {} if "manifest" in path else {"mandatory": [], "approved": []}
        return resp
    monkeypatch.setattr("cli.commands.init.api_get", _api_get, raising=False)
    monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)

    runner.invoke(init_app, ["--server-url", "http://x", "--token", "t", "--workspace", str(tmp_path)])
    (tmp_path / ".claude" / "CLAUDE.local.md").write_text("# my notes")

    # Second init with --force
    runner.invoke(init_app, ["--server-url", "http://x", "--token", "t",
                              "--workspace", str(tmp_path), "--force"])
    assert "my notes" in (tmp_path / ".claude" / "CLAUDE.local.md").read_text()
  • Step 3: Run tests to verify they fail
pytest tests/test_cli_init.py -v
  • Step 4: Create cli/commands/init.py
# cli/commands/init.py
"""`agnes init` — bootstrap an analyst workspace.

Single-paste flow: web user clicks "Generate prompt" on /setup?role=analyst,
pastes into Claude Code in an empty folder; Claude runs `agnes init` (among other
steps). Non-interactive: --token + --server-url required.
"""

from __future__ import annotations

import json
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

import typer

from cli.client import api_get
from cli.config import save_config, save_token
from cli.error_render import render_error
from cli.lib.hooks import install_claude_hooks
from cli.lib.pull import run_pull, PullResult


_INIT_MARKER = "AI Data Analyst"  # Detect existing workspace via CLAUDE.md substring


init_app = typer.Typer(help="Bootstrap an analyst workspace in this directory")


@init_app.callback(invoke_without_command=True)
def init(
    server_url: str = typer.Option(..., "--server-url", help="Agnes server URL"),
    token: str = typer.Option(..., "--token", help="Personal access token"),
    force: bool = typer.Option(False, "--force", help="Re-initialize an existing workspace"),
    workspace_str: Optional[str] = typer.Option(None, "--workspace", help="Target dir (default: cwd)"),
):
    """Bootstrap workspace: auth, CLAUDE.md, hooks, first pull, AGNES_WORKSPACE.md."""
    workspace = Path(workspace_str).resolve() if workspace_str else Path.cwd()
    server_url = server_url.rstrip("/")

    # Detect existing workspace
    claude_md = workspace / "CLAUDE.md"
    if claude_md.exists() and _INIT_MARKER in claude_md.read_text() and not force:
        typer.echo(render_error(0, {"detail": {
            "kind": "partial_state",
            "hint": "Workspace already initialized. Re-run with --force to redo.",
        }}), err=True)
        raise typer.Exit(1)

    # Step 1: Verify PAT via /api/catalog/tables (PAT-validating endpoint)
    try:
        resp = api_get("/api/catalog/tables", server_url=server_url, token=token)
        if resp.status_code == 401:
            typer.echo(render_error(401, {"detail": {
                "kind": "auth_failed",
                "hint": f"Token expired or invalid — get a fresh one at {server_url}/setup?role=analyst",
            }}), err=True)
            raise typer.Exit(1)
        resp.raise_for_status()
    except typer.Exit:
        raise
    except Exception as exc:
        typer.echo(render_error(0, {"detail": {
            "kind": "server_unreachable",
            "hint": f"Cannot reach {server_url} — check network or server status",
            "message": str(exc),
        }}), err=True)
        raise typer.Exit(1)

    # Step 2: Save config + token globally
    save_config({"server": server_url})
    save_token(token, email="")  # email empty — JWT carries it; we don't decode here

    # Step 3: Fetch CLAUDE.md from /api/welcome (server-rendered, RBAC-filtered)
    welcome_resp = api_get("/api/welcome", server_url=server_url, token=token,
                           params={"server_url": server_url})
    welcome_resp.raise_for_status()
    workspace.mkdir(parents=True, exist_ok=True)
    claude_md.write_text(welcome_resp.json()["content"], encoding="utf-8")

    # Step 4: Default settings.json + install hooks
    settings_path = workspace / ".claude" / "settings.json"
    if not settings_path.exists():
        settings_path.parent.mkdir(parents=True, exist_ok=True)
        settings_path.write_text(json.dumps(
            {"model": "sonnet", "permissions": {"allow": ["Read", "Bash", "Grep", "Glob"]}},
            indent=2,
        ))
    install_claude_hooks(workspace)

    # Step 5: CLAUDE.local.md stub (preserved on re-run)
    local_md = workspace / ".claude" / "CLAUDE.local.md"
    if not local_md.exists():
        local_md.write_text(
            "# My Notes\n\nPersonal notes for this workspace. Uploaded on `agnes push`.\n",
            encoding="utf-8",
        )

    # Step 6: First pull
    try:
        result: PullResult = run_pull(server_url, token, workspace)
    except Exception as exc:
        typer.echo(render_error(0, {"detail": {
            "kind": "manifest_unauthorized",
            "hint": "Initial pull failed — workspace partially set up",
            "message": str(exc),
        }}), err=True)
        raise typer.Exit(1)

    # Step 7: Write AGNES_WORKSPACE.md from client-side template
    here = Path(__file__).parent
    template_path = here.parent.parent / "config" / "agnes_workspace_template.txt"
    if template_path.exists():
        template = template_path.read_text(encoding="utf-8")
    else:
        template = "# Agnes workspace\n\nCreated: {created_at}\nServer: {server_url}\n"
    workspace_md = (template
                    .replace("{created_at}", datetime.now(timezone.utc).isoformat())
                    .replace("{server_url}", server_url)
                    .replace("{workspace_path}", str(workspace)))
    (workspace / "AGNES_WORKSPACE.md").write_text(workspace_md, encoding="utf-8")

    # Step 8: Summary
    typer.echo("Workspace ready.")
    typer.echo(f"  Server   : {server_url}")
    typer.echo(f"  Tables   : {result.tables_updated} synced ({result.parquets_total} total)")
    typer.echo(f"  Rules    : {result.rules_count}")
    typer.echo(f"  Workspace: {workspace}")
    typer.echo("")
    typer.echo("Try: agnes catalog")
  • Step 5: Run tests to verify they pass
pytest tests/test_cli_init.py -v
  • Step 6: Commit
git add cli/commands/init.py config/agnes_workspace_template.txt tests/test_cli_init.py
git commit -m "feat(cli): agnes init orchestrator + AGNES_WORKSPACE.md template"

Task 12: New agnes status (workspace status, replaces da analyst status)

Files:

  • Modify (overwrite): cli/commands/status.py

  • Test: tests/test_cli_status.py (new)

  • Step 1: Read existing status.py to understand what to replace

cat cli/commands/status.py

The existing agnes status shows server health. Per spec, this content moves to agnes diagnose system (Task 13); the file is repurposed for workspace status.

  • Step 2: Write failing tests
# tests/test_cli_status.py
from pathlib import Path
from typer.testing import CliRunner
from cli.commands.status import status_app

runner = CliRunner()


def test_status_uninitialized_workspace(tmp_path, monkeypatch):
    monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
    result = runner.invoke(status_app)
    assert result.exit_code in (0, 1)
    assert "not initialized" in result.output.lower() or "no workspace" in result.output.lower()


def test_status_initialized_workspace(tmp_path, monkeypatch):
    """A bootstrapped workspace shows 'initialized: yes' and basic stats."""
    monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
    (tmp_path / "CLAUDE.md").write_text("# AI Data Analyst\n")
    (tmp_path / "user" / "duckdb").mkdir(parents=True)
    (tmp_path / "user" / "duckdb" / "analytics.duckdb").touch()
    (tmp_path / "server" / "parquet").mkdir(parents=True)
    (tmp_path / "server" / "parquet" / "tbl1.parquet").touch()

    result = runner.invoke(status_app)
    assert result.exit_code == 0
    assert "initialized" in result.output.lower()
    assert "1" in result.output  # one parquet


def test_status_json(tmp_path, monkeypatch):
    monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
    (tmp_path / "CLAUDE.md").write_text("# AI Data Analyst\n")
    result = runner.invoke(status_app, ["--json"])
    assert result.exit_code == 0
    import json
    body = json.loads(result.output)
    assert "workspace" in body and "initialized" in body
  • Step 3: Run tests to verify they fail
pytest tests/test_cli_status.py -v
  • Step 4: Overwrite cli/commands/status.py
# cli/commands/status.py
"""`agnes status` — workspace status: initialized? data fresh? hooks active?"""

from __future__ import annotations

import json
import os
from datetime import datetime, timezone
from pathlib import Path

import typer


_INIT_MARKER = "AI Data Analyst"


status_app = typer.Typer(help="Workspace status (was `da analyst status`)")


@status_app.callback(invoke_without_command=True)
def status(
    as_json: bool = typer.Option(False, "--json", help="Machine-readable output"),
):
    workspace = Path(os.environ.get("DA_LOCAL_DIR", ".")).resolve()

    initialized = False
    claude_md = workspace / "CLAUDE.md"
    if claude_md.exists():
        initialized = _INIT_MARKER in claude_md.read_text()

    parquet_dir = workspace / "server" / "parquet"
    parquets = list(parquet_dir.glob("*.parquet")) if parquet_dir.exists() else []

    db_path = workspace / "user" / "duckdb" / "analytics.duckdb"
    last_synced = None
    if db_path.exists():
        last_synced = datetime.fromtimestamp(db_path.stat().st_mtime, tz=timezone.utc).isoformat()

    sessions_dir = workspace / "user" / "sessions"
    session_count = len(list(sessions_dir.glob("*.jsonl"))) if sessions_dir.exists() else 0

    info = {
        "workspace": str(workspace),
        "initialized": initialized,
        "parquet_tables": len(parquets),
        "duckdb_exists": db_path.exists(),
        "last_synced": last_synced,
        "sessions_pending_upload": session_count,
    }

    if as_json:
        typer.echo(json.dumps(info, indent=2))
        return

    typer.echo(f"Workspace : {workspace}")
    typer.echo(f"Initialized: {'yes' if initialized else 'no'}")
    typer.echo(f"Parquets  : {info['parquet_tables']}")
    typer.echo(f"DuckDB    : {'yes' if info['duckdb_exists'] else 'no'}")
    typer.echo(f"Last sync : {last_synced or 'never'}")
    typer.echo(f"Pending uploads: {session_count} sessions")

    if not initialized:
        typer.echo("")
        typer.echo("Run `agnes init --server-url <URL> --token <PAT>` to bootstrap.")
  • Step 5: Run tests to verify they pass
pytest tests/test_cli_status.py -v
  • Step 6: Commit
git add cli/commands/status.py tests/test_cli_status.py
git commit -m "feat(cli): agnes status now shows workspace state (was system health)"

Task 13: Move old agnes status content into agnes diagnose system

Files:

  • Modify: cli/commands/diagnose.py (add system subcommand with the old status logic)

  • Test: tests/test_cli_diagnose_system.py (new)

  • Step 1: Recover old agnes status logic

git show HEAD~12:cli/commands/status.py

(Adjust the ref — find the commit before the rewrite via git log --oneline cli/commands/status.py | head.) Save the body to a scratch file.

  • Step 2: Read existing diagnose command structure
cat cli/commands/diagnose.py
  • Step 3: Add system subcommand

Append the old status logic as a system subcommand of diagnose_app. Keep diagnose's existing default behavior (overall health) intact.

  • Step 4: Test
# tests/test_cli_diagnose_system.py
from typer.testing import CliRunner
from cli.commands.diagnose import diagnose_app


def test_diagnose_system_help():
    runner = CliRunner()
    result = runner.invoke(diagnose_app, ["system", "--help"])
    assert result.exit_code == 0
  • Step 5: Commit
git add cli/commands/diagnose.py tests/test_cli_diagnose_system.py
git commit -m "refactor(cli): move old `agnes status` health check to `agnes diagnose system`"

Task 14: agnes snapshot create — fold da fetch into snapshot group

Files:

  • Modify: cli/commands/snapshot.py (add create subcommand)

  • Test: tests/test_cli_snapshot_create.py (new)

  • Step 1: Read existing fetch.py and snapshot.py

cat cli/commands/fetch.py
cat cli/commands/snapshot.py
  • Step 2: Add create subcommand to snapshot_app

Move the body of fetch.py:fetch() into a new @snapshot_app.command("create"). Keep all flags. Update the existence check:

local_db = _local_dir() / "user" / "duckdb" / "analytics.duckdb"
if not local_db.exists():
    typer.echo("Local DuckDB not found. Run: agnes pull first.", err=True)
    raise typer.Exit(1)
# (then proceed with duckdb.connect — no longer creates an empty DB)
  • Step 3: Add tests
# tests/test_cli_snapshot_create.py
from typer.testing import CliRunner
from cli.commands.snapshot import snapshot_app


def test_snapshot_create_help():
    runner = CliRunner()
    result = runner.invoke(snapshot_app, ["create", "--help"])
    assert result.exit_code == 0
    for flag in ["--select", "--where", "--limit", "--order-by", "--as", "--estimate", "--no-estimate", "--force"]:
        assert flag in result.output


def test_snapshot_create_no_duckdb_friendly_exit(tmp_path, monkeypatch):
    monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
    runner = CliRunner()
    result = runner.invoke(snapshot_app, ["create", "any_table", "--as", "x", "--estimate"])
    assert result.exit_code == 1
    assert "Run: agnes pull" in result.output or "Run: agnes pull" in (result.stderr or "")
  • Step 4: Run tests
pytest tests/test_cli_snapshot_create.py -v
  • Step 5: Commit
git add cli/commands/snapshot.py tests/test_cli_snapshot_create.py
git commit -m "feat(cli): agnes snapshot create (folded from da fetch); friendly exit if no DuckDB"

Task 15: agnes catalog --metrics — fold da metrics list/show

Files:

  • Modify: cli/commands/catalog.py

  • Test: tests/test_cli_catalog_metrics.py (new)

  • Step 1: Read existing metrics list/show logic

grep -n "def list\|def show\|@" cli/commands/metrics.py | head
  • Step 2: Add --metrics and --metrics --show <id> to catalog

Modify cli/commands/catalog.py:

@catalog_app.callback(invoke_without_command=True)
def catalog(
    as_json: bool = typer.Option(False, "--json"),
    metrics: bool = typer.Option(False, "--metrics", help="Show metric definitions instead of tables"),
    show: Optional[str] = typer.Option(None, "--show", help="With --metrics: show one metric by id"),
):
    if metrics and show:
        return _show_one_metric(show, as_json)
    if metrics:
        return _list_metrics(as_json)
    return _list_tables(as_json)

(_list_metrics and _show_one_metric lift from metrics.py:list_metrics and metrics.py:show_metric.)

  • Step 3: Add tests
# tests/test_cli_catalog_metrics.py
from typer.testing import CliRunner
from cli.commands.catalog import catalog_app


def test_catalog_metrics_help():
    runner = CliRunner()
    result = runner.invoke(catalog_app, ["--help"])
    assert result.exit_code == 0
    assert "--metrics" in result.output
    assert "--show" in result.output
  • Step 4: Commit
git add cli/commands/catalog.py tests/test_cli_catalog_metrics.py
git commit -m "feat(cli): agnes catalog --metrics replaces da metrics list/show"

Task 16: Move da metrics import/export/validate to agnes admin metrics

Files:

  • Create: cli/commands/admin_metrics.py

  • Modify: cli/commands/admin.py (register the sub-Typer)

  • Step 1: Create cli/commands/admin_metrics.py

Lift import_metrics, export_metrics, validate_metrics from cli/commands/metrics.py. Wrap in a sub-Typer:

# cli/commands/admin_metrics.py
"""`agnes admin metrics {import,export,validate}` — lifted from metrics.py."""

import typer

admin_metrics_app = typer.Typer(help="Admin: metric definition management")


@admin_metrics_app.command("import")
def import_metrics(directory: str = typer.Argument(...)):
    # ... lifted logic from cli/commands/metrics.py:import_metrics
    pass


@admin_metrics_app.command("export")
def export_metrics(target: str = typer.Argument(...)):
    # ... lifted logic
    pass


@admin_metrics_app.command("validate")
def validate_metrics():
    # ... lifted logic
    pass

(Copy the actual implementations from metrics.py verbatim.)

  • Step 2: Register in admin app

In cli/commands/admin.py, add:

from cli.commands.admin_metrics import admin_metrics_app
admin_app.add_typer(admin_metrics_app, name="metrics")
  • Step 3: Test
# tests/test_cli_admin_metrics.py
from typer.testing import CliRunner
from cli.commands.admin import admin_app


def test_admin_metrics_subcommands_present():
    runner = CliRunner()
    result = runner.invoke(admin_app, ["metrics", "--help"])
    assert result.exit_code == 0
    assert "import" in result.output
    assert "export" in result.output
    assert "validate" in result.output
  • Step 4: Commit
git add cli/commands/admin_metrics.py cli/commands/admin.py tests/test_cli_admin_metrics.py
git commit -m "feat(cli): agnes admin metrics {import,export,validate}"

Phase 4 — Wiring + cleanup

Task 17: Update reader hint texts

Files:

  • Modify: cli/commands/query.py (two occurrences), cli/commands/explore.py

  • Step 1: Find all "Run: da sync" strings

grep -rn "Run: da sync" cli/
  • Step 2: Replace with "Run: agnes pull"
sed -i.bak 's/Run: da sync/Run: agnes pull/g' cli/commands/query.py cli/commands/explore.py
rm cli/commands/query.py.bak cli/commands/explore.py.bak
  • Step 3: Verify no leftover
grep -rn "Run: da sync" cli/

Expected: no matches.

  • Step 4: Commit
git add cli/commands/query.py cli/commands/explore.py
git commit -m "fix(cli): hint text 'Run: da sync' → 'Run: agnes pull'"

Task 18: Update cli/main.py registrations + delete obsolete commands

Files:

  • Modify: cli/main.py

  • Delete: cli/commands/sync.py, cli/commands/fetch.py, cli/commands/analyst.py, cli/commands/metrics.py

  • Step 1: Update cli/main.py

Replace lines 11-28 (imports) and 91-109 (registrations):

from cli.commands.auth import auth_app
from cli.commands.init import init_app
from cli.commands.pull import pull_app
from cli.commands.push import push_app
from cli.commands.query import query_command
from cli.commands.status import status_app
from cli.commands.admin import admin_app
from cli.commands.diagnose import diagnose_app
from cli.commands.skills import skills_app
from cli.commands.setup import setup_app
from cli.commands.server import server_app
from cli.commands.explore import explore_app
from cli.commands.catalog import catalog_app
from cli.commands.schema import schema_app
from cli.commands.describe import describe_app
from cli.commands.snapshot import snapshot_app
from cli.commands.disk_info import disk_info_app
# Register subcommands
app.add_typer(auth_app, name="auth")
app.add_typer(init_app, name="init")
app.add_typer(pull_app, name="pull")
app.add_typer(push_app, name="push")
app.command("query")(query_command)
app.add_typer(status_app, name="status")
app.add_typer(admin_app, name="admin")
app.add_typer(diagnose_app, name="diagnose")
app.add_typer(skills_app, name="skills")
app.add_typer(setup_app, name="setup")
app.add_typer(server_app, name="server")
app.add_typer(explore_app, name="explore")
app.add_typer(catalog_app, name="catalog")
app.add_typer(schema_app, name="schema")
app.add_typer(describe_app, name="describe")
app.add_typer(snapshot_app, name="snapshot")
app.add_typer(disk_info_app, name="disk-info")
  • Step 2: Delete obsolete files
git rm cli/commands/sync.py cli/commands/fetch.py cli/commands/analyst.py cli/commands/metrics.py
  • Step 3: Verify no other code imports them
grep -rn "from cli.commands.sync\|from cli.commands.fetch\|from cli.commands.analyst\|from cli.commands.metrics" .

Expected: no matches (anything found needs to be updated to use the new homes).

  • Step 4: Run the full test suite (smoke)
pytest tests/ -x --ignore=tests/test_clean_install_integration.py --ignore=tests/test_reader_smoke_matrix.py 2>&1 | tail -30

Expected: tests for moved/deleted commands fail with import errors — those tests are also being deleted (or already updated in earlier tasks). Other tests should pass.

If old test files reference the deleted commands, git rm them too:

git rm tests/test_analyst*.py tests/test_sync*.py tests/test_fetch*.py tests/test_metrics_cli*.py 2>/dev/null || true
  • Step 5: Commit
git add cli/main.py
git rm cli/commands/{sync,fetch,analyst,metrics}.py 2>/dev/null
git commit -m "refactor(cli): drop sync/fetch/analyst/metrics; register init/pull/push"

Task 19: Update repo-root CLAUDE.md

Files:

  • Modify: CLAUDE.md

  • Step 1: Apply systematic rewrites

sed -i.bak \
  -e 's|da sync --upload-only|agnes push|g' \
  -e 's|da sync|agnes pull|g' \
  -e 's|da analyst setup|agnes init|g' \
  -e 's|da fetch|agnes snapshot create|g' \
  -e 's|da metrics list|agnes catalog --metrics|g' \
  -e 's|da metrics show|agnes catalog --metrics --show|g' \
  -e 's|da metrics import|agnes admin metrics import|g' \
  -e 's|data/parquet/|server/parquet/|g' \
  -e 's|data/duckdb/|user/duckdb/|g' \
  CLAUDE.md
rm CLAUDE.md.bak
  • Step 2: Manually rewrite the "Local sync & Claude Code hooks" subsection

Find the section. Replace the surrounding prose so it describes agnes pull + agnes push hooks:

### Local sync & Claude Code hooks

`agnes pull` is the canonical analyst-side distribution path: pulls the
RBAC-filtered manifest from the server, downloads parquets whose MD5 changed
(skipping `query_mode='remote'` rows), rebuilds local DuckDB views over them.
`agnes push` mirrors it for the upload direction (sessions, CLAUDE.local.md).

`agnes init` writes two hooks into `<workspace>/.claude/settings.json`:

- `SessionStart``agnes pull --quiet` — pulls fresh parquets at the start of every Claude Code session
- `SessionEnd``agnes push --quiet` — uploads session jsonl + `CLAUDE.local.md` to the server

Both pass `--quiet` so they don't pollute Claude Code stdout, and trail with `|| true` so a server outage never blocks a session. Workspace-level (not user-home) so the hooks fire only when Claude Code opens this analyst workspace, not in unrelated sessions on the same machine.

Admin RBAC for auto-sync: `query_mode IN ('local', 'materialized')` plus a `resource_grants` row for one of the analyst's groups → table appears in their manifest → `agnes pull` downloads it. No per-user sync config; the admin layer is the single source of truth.
  • Step 3: Verify no leftover legacy strings
grep -nE 'da sync|da fetch|da analyst|da metrics list|da metrics show|data/parquet/|data/duckdb/' CLAUDE.md

Expected: no matches.

  • Step 4: Commit
git add CLAUDE.md
git commit -m "docs(claude-md): rewrite verbs + paths for new CLI surface"

Phase 5 — Test fixtures and integration tests

Task 20: Create tests/fixtures/analyst_bootstrap.py

Files:

  • Create: tests/fixtures/analyst_bootstrap.py

  • Modify: tests/conftest.py (import the fixtures)

  • Step 1: Read existing test infrastructure

grep -n "fastapi\|TestClient\|tmp_path" tests/conftest.py | head -30
  • Step 2: Create the fixtures
# tests/fixtures/analyst_bootstrap.py
"""Test fixtures for the clean-bootstrap test suite.

Per spec §"Test fixtures":
- fastapi_test_server, test_pat, test_pat_no_grants, zero_grants_workspace,
  web_session, client.
"""

from __future__ import annotations

import json
import subprocess
import threading
import time
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator

import httpx
import pytest
import uvicorn


NONEXISTENT_TABLE = "__nonexistent__"  # Sentinel for reader smoke matrix


class _ServerHandle:
    def __init__(self, url: str, server: uvicorn.Server, thread: threading.Thread):
        self.url = url
        self._server = server
        self._thread = thread

    def shutdown(self):
        self._server.should_exit = True
        self._thread.join(timeout=5)


def _seed_db(data_dir: Path):
    """Initialize a fresh system.duckdb with seeded admin/analyst users + tables.

    Imports app modules at function scope to avoid circular imports during
    collection.
    """
    import os
    os.environ["DATA_DIR"] = str(data_dir)
    from src.db import get_db_connection
    from src.repositories.users import UserRepository
    from src.repositories.user_groups import UserGroupRepository
    from src.repositories.table_registry import TableRegistryRepository
    from src.repositories.user_group_members import UserGroupMembersRepository
    from src.repositories.resource_grants import ResourceGrantsRepository
    from app.auth.providers.password import _hash_password

    conn = get_db_connection()

    # Seed users
    user_repo = UserRepository(conn)
    admin_id = user_repo.create(email="admin@example.com", name="Admin",
                                password_hash=_hash_password("test-password"),
                                is_admin=True)
    analyst_id = user_repo.create(email="analyst@example.com", name="Analyst",
                                  password_hash=_hash_password("analyst-pw"),
                                  is_admin=False)

    # Seed groups (Admin + Everyone are seeded as is_system=TRUE on first run)
    grp_repo = UserGroupRepository(conn)
    admin_group = grp_repo.find_by_name("Admin")
    everyone_group = grp_repo.find_by_name("Everyone")

    # Memberships
    members = UserGroupMembersRepository(conn)
    members.add(user_id=admin_id, group_id=admin_group.id, source="system_seed")
    members.add(user_id=analyst_id, group_id=everyone_group.id, source="system_seed")

    # Tables
    tbl_repo = TableRegistryRepository(conn)
    tbl_repo.create(id="local_tbl", name="local_tbl", source_type="keboola",
                    bucket="test", source_table="local_tbl", query_mode="local")
    tbl_repo.create(id="materialized_tbl", name="materialized_tbl", source_type="bigquery",
                    bucket="test", source_table="materialized_tbl", query_mode="materialized")
    tbl_repo.create(id="remote_tbl", name="remote_tbl", source_type="bigquery",
                    bucket="test", source_table="remote_tbl", query_mode="remote")

    return {"admin_id": admin_id, "analyst_id": analyst_id,
            "admin_group_id": admin_group.id, "everyone_group_id": everyone_group.id}


@pytest.fixture
def fastapi_test_server(tmp_path) -> Iterator[_ServerHandle]:
    """Boot a real FastAPI server in a background thread against tmp_path DATA_DIR."""
    data_dir = tmp_path / "agnes-data"
    data_dir.mkdir()
    seeded = _seed_db(data_dir)
    handle_port = 18712 + (id(tmp_path) % 1000)

    from app.main import app as fastapi_app
    config = uvicorn.Config(fastapi_app, host="127.0.0.1", port=handle_port, log_level="warning")
    server = uvicorn.Server(config)
    thread = threading.Thread(target=server.run, daemon=True)
    thread.start()

    # Wait for server up
    url = f"http://127.0.0.1:{handle_port}"
    for _ in range(50):
        try:
            httpx.get(f"{url}/api/health", timeout=0.2)
            break
        except Exception:
            time.sleep(0.1)
    else:
        pytest.fail("fastapi_test_server failed to start")

    handle = _ServerHandle(url, server, thread)
    handle._seeded = seeded
    handle.NONEXISTENT_TABLE = NONEXISTENT_TABLE
    yield handle
    handle.shutdown()


@pytest.fixture
def web_session(fastapi_test_server) -> Iterator[httpx.Client]:
    """Authenticated httpx.Client using cookie session for admin@example.com."""
    client = httpx.Client(base_url=fastapi_test_server.url, follow_redirects=False)
    resp = client.post("/auth/password/login/web",
                       data={"email": "admin@example.com", "password": "test-password"})
    assert resp.status_code in (200, 302), f"web_session login failed: {resp.text}"
    yield client
    client.close()


@pytest.fixture
def test_pat(web_session) -> str:
    """Mint a PAT for analyst@example.com with 2 grants + 2 mandatory rules."""
    # First, grant the analyst access to local_tbl + materialized_tbl
    web_session.post("/api/admin/grants",
                     json={"group_id": "...everyone...", "resource_type": "table",
                           "resource_id": "local_tbl"})
    # ... similarly for materialized_tbl + 2 mandatory memory items
    # (Use the actual admin endpoints for grants and memory items.)

    # Mint PAT (as analyst — log in as analyst first, then mint)
    analyst_session = httpx.Client(base_url=web_session.base_url, follow_redirects=False)
    analyst_session.post("/auth/password/login/web",
                         data={"email": "analyst@example.com", "password": "analyst-pw"})
    resp = analyst_session.post("/auth/tokens",
                                json={"name": "test", "ttl_seconds": 3600})
    assert resp.status_code == 201, resp.text
    return resp.json()["token"]


@pytest.fixture
def test_pat_no_grants(web_session) -> str:
    analyst_session = httpx.Client(base_url=web_session.base_url, follow_redirects=False)
    analyst_session.post("/auth/password/login/web",
                         data={"email": "analyst@example.com", "password": "analyst-pw"})
    resp = analyst_session.post("/auth/tokens",
                                json={"name": "test-nogrants", "ttl_seconds": 3600})
    return resp.json()["token"]


@pytest.fixture
def zero_grants_workspace(tmp_path, fastapi_test_server, test_pat_no_grants) -> Path:
    """Run `agnes init` against a no-grants PAT; return the workspace path."""
    workspace = tmp_path / "workspace"
    workspace.mkdir()
    result = subprocess.run([
        "da", "init",
        "--server-url", fastapi_test_server.url,
        "--token", test_pat_no_grants,
        "--workspace", str(workspace),
    ], capture_output=True, text=True)
    assert result.returncode == 0, f"init failed: {result.stderr}"
    return workspace

(Adapt API calls as needed — the actual repository/route names may differ. Spec the goal; implementer adapts.)

  • Step 3: Wire fixtures into conftest

In tests/conftest.py, append:

from tests.fixtures.analyst_bootstrap import (
    fastapi_test_server, web_session, test_pat, test_pat_no_grants,
    zero_grants_workspace, NONEXISTENT_TABLE,
)
  • Step 4: Smoke-test fixture creation
# tests/test_fixtures_smoke.py
def test_server_boots(fastapi_test_server):
    import httpx
    resp = httpx.get(f"{fastapi_test_server.url}/api/health")
    assert resp.status_code == 200


def test_zero_grants_workspace_minimal(zero_grants_workspace):
    assert (zero_grants_workspace / "CLAUDE.md").exists()
    assert (zero_grants_workspace / "AGNES_WORKSPACE.md").exists()
    assert not (zero_grants_workspace / "server" / "parquet").exists()
    assert not (zero_grants_workspace / ".claude" / "rules").exists()
pytest tests/test_fixtures_smoke.py -v
  • Step 5: Commit
git add tests/fixtures/analyst_bootstrap.py tests/conftest.py tests/test_fixtures_smoke.py
git commit -m "test: clean-bootstrap fixtures (fastapi_test_server, test_pat, etc.)"

Task 21: Reader smoke matrix

Files:

  • Create: tests/test_reader_smoke_matrix.py

  • Step 1: Write the matrix

# tests/test_reader_smoke_matrix.py
"""Reader smoke matrix — every CLI command on a freshly-bootstrapped
zero-grants workspace, asserts no traceback. The load-bearing test for
'nothing crashes on missing dirs'."""

import subprocess

import pytest

from tests.fixtures.analyst_bootstrap import NONEXISTENT_TABLE


READER_COMMANDS = [
    ["agnes", "catalog"],
    ["agnes", "catalog", "--metrics"],
    ["agnes", "schema", NONEXISTENT_TABLE],
    ["agnes", "describe", NONEXISTENT_TABLE],
    ["agnes", "query", "SELECT 1"],
    ["agnes", "explore", NONEXISTENT_TABLE],
    ["agnes", "disk-info"],
    ["agnes", "snapshot", "list"],
    ["agnes", "snapshot", "create", NONEXISTENT_TABLE, "--as", "x", "--estimate"],
    ["agnes", "status"],
    ["agnes", "diagnose"],
    ["agnes", "auth", "whoami"],
    ["agnes", "skills", "list"],
    ["agnes", "skills", "show", "agnes-data-querying"],
]


@pytest.mark.parametrize("cmd", READER_COMMANDS, ids=lambda c: " ".join(c))
def test_reader_does_not_crash_on_zero_grants(zero_grants_workspace, cmd):
    """Exit 0 (success) or exit 1 (friendly hint) is OK; tracebacks are forbidden."""
    result = subprocess.run(cmd, cwd=zero_grants_workspace,
                            capture_output=True, text=True, timeout=30)
    assert result.returncode in (0, 1), \
        f"{cmd} crashed: rc={result.returncode}, stderr={result.stderr}"
    assert "Traceback" not in result.stderr, f"{cmd} threw: {result.stderr}"
  • Step 2: Run
pytest tests/test_reader_smoke_matrix.py -v

Expected: all parametrized cases PASS.

  • Step 3: Commit
git add tests/test_reader_smoke_matrix.py
git commit -m "test: reader smoke matrix on zero-grants workspace"

Task 22: Clean-install integration tests

Files:

  • Create: tests/test_clean_install_integration.py

  • Step 1: Write the integration tests per spec §5.2

# tests/test_clean_install_integration.py
"""End-to-end clean-install integration tests for `agnes init`."""

import json
import subprocess
from pathlib import Path


def assert_no_dead_dirs(workspace: Path):
    forbidden_unconditional = ["data/parquet", "data/duckdb", "data/metadata",
                               "user/artifacts", ".agnes"]
    for d in forbidden_unconditional:
        assert not (workspace / d).exists(), f"forbidden dir created: {d}"
    for d in [".claude/rules", "server/parquet", "user/sessions", "user/snapshots"]:
        path = workspace / d
        if path.exists():
            assert any(path.iterdir()), f"{d} exists but is empty"


def test_clean_install_minimal_grants(fastapi_test_server, tmp_path, test_pat):
    workspace = tmp_path / "ws"
    workspace.mkdir()
    result = subprocess.run([
        "da", "init",
        "--server-url", fastapi_test_server.url,
        "--token", test_pat,
        "--workspace", str(workspace),
    ], capture_output=True, text=True)
    assert result.returncode == 0, result.stderr

    for must in ["CLAUDE.md", "AGNES_WORKSPACE.md",
                 ".claude/settings.json", ".claude/CLAUDE.local.md",
                 "user/duckdb/analytics.duckdb"]:
        assert (workspace / must).exists(), f"missing: {must}"

    parquets = list((workspace / "server" / "parquet").glob("*.parquet"))
    assert len(parquets) == 2, "expected 2 parquets (local + materialized grants)"

    rules = list((workspace / ".claude" / "rules").iterdir())
    assert len(rules) == 2, "expected 2 mandatory rules"

    assert_no_dead_dirs(workspace)

    settings = json.loads((workspace / ".claude" / "settings.json").read_text())
    assert any("agnes pull" in h["hooks"][0]["command"]
               for h in settings["hooks"]["SessionStart"])
    assert any("agnes push" in h["hooks"][0]["command"]
               for h in settings["hooks"]["SessionEnd"])

    claude_md = (workspace / "CLAUDE.md").read_text()
    assert "agnes pull" in claude_md
    assert "da sync" not in claude_md

    workspace_md = (workspace / "AGNES_WORKSPACE.md").read_text()
    assert test_pat not in workspace_md, "PAT must not leak into AGNES_WORKSPACE.md"
    for placeholder in ["{created_at}", "{server_url}", "{workspace_path}"]:
        assert placeholder not in workspace_md, f"placeholder leaked: {placeholder}"
    assert fastapi_test_server.url in workspace_md
    assert str(workspace) in workspace_md
    assert "agnes pull" in workspace_md


def test_clean_install_zero_grants(fastapi_test_server, tmp_path, test_pat_no_grants):
    workspace = tmp_path / "ws"
    workspace.mkdir()
    subprocess.run([
        "da", "init",
        "--server-url", fastapi_test_server.url,
        "--token", test_pat_no_grants,
        "--workspace", str(workspace),
    ], check=True)

    must_exist = {"CLAUDE.md", "AGNES_WORKSPACE.md",
                  ".claude/settings.json", ".claude/CLAUDE.local.md",
                  "user/duckdb/analytics.duckdb"}
    must_not_exist = {".claude/rules", "server/parquet", "data/parquet",
                      "data/duckdb", "data/metadata", "user/artifacts",
                      "user/sessions", "user/snapshots", ".agnes"}
    for p in must_exist:
        assert (workspace / p).exists(), f"missing: {p}"
    for p in must_not_exist:
        assert not (workspace / p).exists(), f"unexpected: {p}"
    assert_no_dead_dirs(workspace)


def test_init_force_preserves_local_md(fastapi_test_server, tmp_path, test_pat):
    workspace = tmp_path / "ws"
    workspace.mkdir()
    subprocess.run(["agnes", "init", "--server-url", fastapi_test_server.url,
                    "--token", test_pat, "--workspace", str(workspace)], check=True)
    (workspace / ".claude" / "CLAUDE.local.md").write_text("# my private notes\n")

    subprocess.run(["agnes", "init", "--server-url", fastapi_test_server.url,
                    "--token", test_pat, "--workspace", str(workspace),
                    "--force"], check=True)
    assert "my private notes" in (workspace / ".claude" / "CLAUDE.local.md").read_text()


def test_readers_in_pre_init_dir(tmp_path):
    """Reader commands in a folder that never had `agnes init`. Friendly hints, no tracebacks."""
    for cmd in [["agnes", "query", "SELECT 1"],
                ["agnes", "snapshot", "create", "x", "--as", "y", "--estimate"],
                ["agnes", "explore", "x"],
                ["agnes", "snapshot", "list"]]:
        result = subprocess.run(cmd, cwd=tmp_path,
                                capture_output=True, text=True, timeout=15)
        assert "Traceback" not in result.stderr
  • Step 2: Run
pytest tests/test_clean_install_integration.py -v
  • Step 3: Commit
git add tests/test_clean_install_integration.py
git commit -m "test: clean-install integration suite (minimal/zero grants, force, pre-init)"

Task 23: Manual clean-install protocol — document in RELEASE_CHECKLIST.md

Files:

  • Modify or create: docs/RELEASE_CHECKLIST.md

  • Step 1: Add the manual protocol from spec §5.5

If docs/RELEASE_CHECKLIST.md exists, append; otherwise create with header:

# Release Checklist

## Bootstrap path changes (mandatory pre-merge)

For any PR touching the analyst-bootstrap path (`agnes init`, `cli/lib/pull.py`,
`cli/lib/hooks.py`, `app/web/setup_instructions.py`, `/api/welcome`), run
this protocol locally before requesting review:

1. `git clean -fdx` in the repo (no build artifacts).
2. Boot FastAPI locally against a clean test instance state.
3. Empty terminal in `/tmp/test-analyst-1`. From the web `/setup?role=analyst`, paste prompt.
4. `tree -a /tmp/test-analyst-1` and compare with the expected tree from
   `docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md` §5.2.
5. `claude` in that folder. Three queries: "what tables can I see",
   "SELECT count(*) FROM <t>", "show me last 5 rows of <t>". All must work
   without further intervention.
6. `/exit`. Verify SessionEnd hook ran (server-side audit log shows `agnes push`;
   `du -sh /tmp/test-analyst-1/user/sessions/` non-empty).
7. Second `claude` in same folder. Verify SessionStart hook fires
   (`agnes pull` request in audit log).
8. Second workspace `/tmp/test-analyst-2` with the same PAT (within TTL).
   Repeat 3-5. Verify global `~/.config/da/` is not duplicated; the second
   workspace has its own DuckDB.
  • Step 2: Commit
git add docs/RELEASE_CHECKLIST.md
git commit -m "docs: clean-install manual protocol in release checklist"

Phase 6 — CHANGELOG and final verification

Task 24: Update CHANGELOG

Files:

  • Modify: CHANGELOG.md

  • Step 1: Add an entry under [Unreleased]

Open CHANGELOG.md, find the topmost ## [Unreleased] section (it sits above the most recent released version, currently ## [0.32.0]). Add the entry from the spec §"CHANGELOG entry (preview)":

## [Unreleased]

### Changed
- **BREAKING** CLI binary renamed from `da` to `agnes`. No backward-compat alias is shipped. Update shell aliases, hook commands in any pre-existing `.claude/settings.json`, scripts, and cron jobs. Reinstall via `uv tool install <wheel>`; the wheel now ships an `agnes` entry point.
- **BREAKING** Analyst bootstrap rewritten end-to-end. `da analyst setup` is removed; replaced by `agnes init` (non-interactive, requires `--server-url` and `--token`). `da sync` is split into `agnes pull` (refresh) and `agnes push` (upload). `da fetch` is folded into `agnes snapshot create`. `da metrics list/show` is folded into `agnes catalog --metrics`; `da metrics import/export/validate` move to `agnes admin metrics {import,export,validate}`. The `da analyst` namespace is removed; the workspace status command is now `agnes status`. The previous `da status` (server-health overview) becomes `agnes diagnose system`.
- **BREAKING** Workspace layout simplified. Removed: `data/parquet/`, `data/duckdb/`, `data/metadata/`, `user/artifacts/`. Canonical paths: `server/parquet/` (synced parquets), `user/duckdb/analytics.duckdb` (DuckDB views), `user/snapshots/` (ad-hoc snapshots), `user/sessions/` (recorded sessions).
- The `/setup` web page now branches on a `role` query parameter: `/setup?role=analyst` renders the analyst workspace bootstrap prompt; `/setup?role=admin` renders the admin CLI install prompt. `/install` continues to 302 to `/setup`.
- `CLAUDE.md` server-side template + repo-root `CLAUDE.md` updated to reference the new CLI verbs and workspace paths. The admin UI for the `claude_md_template` DB override (`/admin/workspace-prompt`) renders a yellow banner when the saved override contains legacy strings (`data/parquet/`, `da sync`, `da fetch`, `da analyst setup`, `da metrics list/show`); admins re-author and save to clear it. Migration is manual.

### Added
- `AGNES_WORKSPACE.md` — human-readable workspace docs file generated by `agnes init` in the workspace root. Documents global install, workspace layout, hooks, cheat sheet, uninstall recipe.
- PAT request body now accepts `scope: str = "general"` and `ttl_seconds: int | None = None` fields. PATs minted with `scope="bootstrap-analyst"` are TTL-clamped to ≤ 1 h server-side. Existing `expires_in_days` field continues to work; `ttl_seconds` wins when both are set. `ttl_seconds` upper bound is 315_360_000 (matches `expires_in_days <= 3650` cap).
- `cli/lib/` shared-library tree, with `cli/lib/pull.py:run_pull` (data-refresh primitive callable from both the Typer wrapper and `agnes init`) and `cli/lib/hooks.py:install_claude_hooks` (workspace-scoped Claude Code hook installer).

### Fixed
- `agnes pull` (formerly `da sync`) no longer creates `.claude/rules/` when the corporate-memory bundle is empty.
- `agnes pull` no longer creates `server/parquet/` when the manifest is empty.
- `agnes snapshot create` (formerly `da fetch`) no longer materializes an empty `user/duckdb/analytics.duckdb` when run before any `agnes pull`.
- Workspace `agnes status` reads from the canonical `server/parquet/` and `user/duckdb/analytics.duckdb` paths (was reading legacy `data/parquet/`, `data/metadata/last_sync.json`).
- `agnes init` and `agnes pull` errors now use the `cli/error_render.py` typed-error renderer (added in 0.32.0), so analyst-facing error UX matches the structured shape `agnes query --remote` already produces.

### Removed
- `da analyst setup`, `da analyst status`, `da sync`, `da fetch`. See "Changed" above for replacements.
- `da metrics` namespace as a top-level group (subcommands moved to `agnes catalog --metrics` for read-only views and `agnes admin metrics …` for write operations).
- Legacy workspace directories `data/parquet/`, `data/duckdb/`, `data/metadata/`, `user/artifacts/`. Existing analyst workspaces should be reinitialized with `agnes init --server-url ... --token ... --force` (a fresh empty folder is recommended).

### Internal
- `cli/lib/__init__.py` (empty) makes `cli/lib/` a proper package picked up by Hatchling for wheel inclusion.
- `tests/fixtures/analyst_bootstrap.py` — reusable test fixtures for clean-install verification.
  • Step 2: Commit
git add CHANGELOG.md
git commit -m "docs(changelog): clean-analyst-bootstrap rewrite (BREAKING)"

Task 25: Final verification — full test suite

Files: none (just runs)

  • Step 1: Run the full test suite
pytest tests/ -v 2>&1 | tail -50

Expected: all PASS (or expected skips for fixtures that depend on slow/external resources).

  • Step 2: Lint check (if project has one)
ruff check . 2>&1 | tail -20

If ruff reports issues in code we touched, fix them.

  • Step 3: Manually run the clean-install protocol per Task 23

(If on-machine: do steps 1-8 from docs/RELEASE_CHECKLIST.md. If headless CI: skip — the integration tests cover this.)

  • Step 4: If everything passes, declare ready for review
git log --oneline origin/main..HEAD

Expected: a clean linear history of one commit per task. Push the branch and open a PR using the Spec + this Plan as the PR body context.

git push -u origin zs/clean-analyst-bootstrap-spec

Then gh pr create with the spec + CHANGELOG entry + ### Test plan per CLAUDE.md discipline.


Self-Review (post-write)

After writing the plan, I reviewed it against the spec for:

  1. Spec coverage:

    • Server-side: PAT scope/TTL (Task 1), legacy-strings scan (Tasks 2, 5), setup_instructions analyst branch (Task 3), /setup?role= branching (Task 4), claude_md_template rewrite (Task 6).
    • Client-side library: cli/lib/init.py + hooks.py (Task 7), cli/lib/pull.py (Task 8).
    • CLI commands: pull (9), push (10), init (11), status (12), diagnose system (13), snapshot create (14), catalog --metrics (15), admin metrics (16).
    • Wiring: hint texts (17), main.py + deletes (18), CLAUDE.md (19).
    • Tests: fixtures (20), reader smoke (21), clean install (22), manual protocol (23).
    • CHANGELOG (24), final verification (25).
  2. Placeholder scan: No "TBD"/"TODO" remain. Each step has the actual code or shell command. The cli/commands/admin_metrics.py task says "lift X from metrics.py verbatim" rather than restating — that's intentional since the engineer can git show HEAD~N:cli/commands/metrics.py to see exactly what to copy.

  3. Type consistency: PullResult shape consistent between cli/lib/pull.py and cli/commands/pull.py. install_claude_hooks signature (workspace: Path) -> None consistent across hooks.py + init.py. _LEGACY_STRINGS tuple shape used identically in tests and module.

  4. Known fragility: Some shell-based test assertions (Task 5 legacy-banner div presence) are heuristic; implementer may need to tighten once HTML lands. Marked with comment.

  5. Open questions in spec stay in spec: Per-endpoint PAT scope enforcement (deferred), layered config (deferred), hook performance budget (monitoring-only), anti-coupling test (deferred). Not in this plan.

The plan is implementable. Tasks are roughly ordered by dependency: Phase 1 server foundation, Phase 2 client library, Phase 3 commands, Phase 4 wiring/cleanup, Phase 5 fixtures + tests, Phase 6 changelog/verification.