CLAUDE.md rewritten (708 -> ~320 lines): four overlapping release sections collapsed to one, stale v1->v35 schema history dropped (it lives in CHANGELOG), marketplace endpoint internals and verbose process sections moved out or tightened. New focused docs: - docs/RELEASING.md - release process, deploy workflows, CI quirks (RELEASE_TEMPLATE.md folded in as an appendix) - docs/marketplace.md - marketplace ingestion + re-serving internals - docs/README.md - documentation index by audience, linked from README.md and CLAUDE.md Archived under docs/archive/: docs/superpowers/ (52 historical planning artifacts), HACKATHON.md, pd-ps-comments.md, security-audit-2026-04.md, future/NOTIFICATIONS.md. Removed the docs/auto-install.md stub. Fixed dangling links in connectors/jira/README.md and dev_docs/README.md, repointed code/doc references to archived paths.
123 KiB
Clean Analyst Bootstrap Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Replace the interactive da analyst setup flow with a single web→paste→done bootstrap. New analyst pastes a clipboard prompt from /setup?role=analyst into Claude Code in an empty folder, and ends up with CLAUDE.md, AGNES_WORKSPACE.md, hooks, fresh data, and DuckDB views — fully ready to query. Drop dead workspace dirs (data/parquet/, data/duckdb/, data/metadata/, user/artifacts/). Establish a lazy-mkdir contract so nothing creates empty directories.
Architecture: PAT-only auth. agnes init is a thin orchestrator that auths, fetches CLAUDE.md from /api/welcome, installs hooks, and calls cli/lib/pull.py:run_pull for first data refresh. CLI verbs renamed: init/pull/push/status/snapshot create (greenfield, no aliases). Server-side install prompt branches on role query param. cli/lib/ shared library tree separates data primitives from Typer wrappers so agnes init can call them without subprocess. Reader contract: every reader handles missing dirs gracefully (exit 0 empty or exit 1 with friendly hint, never traceback).
Tech Stack: Python 3.11+, FastAPI, Typer, Pydantic, DuckDB, httpx, pytest, Hatchling.
Spec: docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md (revision 5, cleared for implementation).
CLI rename: As part of this plan, the binary changes from da to agnes. References to legacy commands (da sync, da fetch, da analyst setup, da metrics) keep their da prefix throughout this document — they're historical artifacts being removed. New commands and hook strings use agnes.
File Structure
New files
| Path | Responsibility |
|---|---|
cli/lib/__init__.py |
Empty — makes cli/lib/ a package so Hatchling includes it in the wheel. |
cli/lib/pull.py |
run_pull(server_url, token, workspace, *, dry_run) -> PullResult — pure-function data refresh primitive (manifest, parquet download, DuckDB rebuild, memory bundle write). Lazy mkdir. |
cli/lib/hooks.py |
install_claude_hooks(workspace) — idempotent SessionStart/End hook installer for <workspace>/.claude/settings.json. |
cli/commands/init.py |
agnes init Typer command — auth check, save config, write CLAUDE.md, install hooks, call run_pull, write AGNES_WORKSPACE.md. |
cli/commands/pull.py |
agnes pull Typer wrapper around cli/lib/pull.py:run_pull. Flags --quiet, --json, --dry-run. |
cli/commands/push.py |
agnes push Typer command — uploads user/sessions/*.jsonl and .claude/CLAUDE.local.md. |
cli/commands/admin_metrics.py |
agnes admin metrics {import,export,validate} sub-Typer (lifted from cli/commands/metrics.py). |
config/agnes_workspace_template.txt |
Static client-side template for AGNES_WORKSPACE.md. Three placeholders: {created_at}, {server_url}, {workspace_path}. |
tests/fixtures/analyst_bootstrap.py |
Test fixtures: fastapi_test_server, test_pat, test_pat_no_grants, zero_grants_workspace, web_session. |
tests/test_lib_hooks.py |
Tests for install_claude_hooks (idempotent, preserves third-party hooks, replaces old agnes pull/da sync entries). |
tests/test_lib_pull.py |
Tests for run_pull (lazy mkdir, partial failure handling, manifest empty case). |
tests/test_setup_instructions_analyst.py |
Tests render_setup_instructions(role="analyst") produces correct steps. |
tests/test_tokens_bootstrap_scope.py |
Tests scope=bootstrap-analyst PATs are TTL-clamped to ≤ 1 h; ttl_seconds upper bound; ttl_seconds wins over expires_in_days. |
tests/test_legacy_strings_scan.py |
Tests _scan_legacy_strings and legacy_strings_detected field on GET /api/admin/workspace-prompt-template. |
tests/test_clean_install_integration.py |
End-to-end clean-install integration tests (minimal grants, zero grants, force preserves, AGNES_WORKSPACE.md content). |
tests/test_reader_smoke_matrix.py |
Reader smoke matrix — every CLI command on a freshly-bootstrapped zero-grants workspace, asserts no traceback. |
Modified files
| Path | Change |
|---|---|
app/api/tokens.py |
CreateTokenRequest: add scope: str = "general" and ttl_seconds: Optional[int] = None. Validate ttl_seconds <= 315_360_000. Resolution: ttl_seconds wins; fall back to expires_in_days. For scope == "bootstrap-analyst", force-clamp resolved TTL ≤ 3600 s. Audit-log includes scope. |
app/api/claude_md.py |
Add module-level _LEGACY_STRINGS = ("data/parquet", "da sync", "da fetch", "da analyst setup", "da metrics list", "da metrics show"). Add helper _scan_legacy_strings(text) -> list[str]. Add field legacy_strings_detected: list[str] = [] to TemplateGetResponse. Populate in admin_get_workspace_template. |
app/web/setup_instructions.py |
Add role: Literal["analyst","admin"] = "admin" to resolve_lines() and render_setup_instructions(). Analyst layout: TLS trust (when ca_pem) → install agnes → agnes init --server-url X --token Y --workspace . → agnes catalog smoke verify → confirm. Drop for analyst: marketplace, plugins, skills, diagnose, login, whoami. |
app/web/router.py |
setup_page: read role query param (default "admin"), pass to render_setup_instructions(role=...). |
app/web/templates/setup.html (or wherever setup_page renders) |
Two role tiles (Analyst / Admin), POST /auth/tokens with matching scope. |
app/web/templates/admin_workspace_prompt.html |
Yellow banner above editor when legacy_strings_detected non-empty. |
config/claude_md_template.txt |
Update verb names: da sync → agnes pull, da fetch → agnes snapshot create, da metrics list/show → agnes catalog --metrics, da analyst setup → agnes init. Path strings: data/parquet/ → server/parquet/, data/duckdb/... → user/duckdb/analytics.duckdb. |
cli/commands/snapshot.py |
Add create subcommand — moves logic from cli/commands/fetch.py verbatim. Add if not db_path.exists(): exit 1 guard before duckdb.connect(). |
cli/commands/catalog.py |
Add --metrics flag (replaces da metrics list); --metrics --show <id> (replaces da metrics show). |
cli/commands/admin.py |
Register the new admin_metrics_app sub-Typer. |
cli/commands/query.py |
Update hint text "Run: da sync" → "Run: agnes pull" in two places. |
cli/commands/explore.py |
Update hint text "Run: da sync" → "Run: agnes pull". |
cli/main.py |
Drop registrations for sync, analyst, metrics, fetch, status (existing). Add init, pull, push. Re-register status to point at new workspace-status command. |
CLAUDE.md (repo root) |
Verb + path rewrites throughout. The "Local sync & Claude Code hooks" subsection rewrites verbatim with new commands. The "Querying Agnes data — agent rails" subsection keeps the 0.32.0 query_mode='materialized' and query_mode='remote' cost-guardrail prose verbatim, just verb-renaming da fetch → agnes snapshot create. |
CHANGELOG.md |
Entry under [Unreleased] per spec preview (Changed/Added/Fixed/Removed/Kept). |
pyproject.toml |
No change; cli/lib/__init__.py triggers Hatchling auto-discovery. |
Deleted files
| Path | Reason |
|---|---|
cli/commands/sync.py |
Replaced by cli/commands/pull.py + cli/commands/push.py + cli/lib/pull.py. |
cli/commands/fetch.py |
Folded into cli/commands/snapshot.py:create. |
cli/commands/analyst.py |
Replaced by cli/commands/init.py + new cli/commands/status.py (workspace status). _install_claude_hooks lifted to cli/lib/hooks.py. |
cli/commands/metrics.py |
Read paths fold into agnes catalog --metrics; write paths move to cli/commands/admin_metrics.py. |
Existing cli/commands/status.py rename
| Action | Detail |
|---|---|
Existing agnes status ("System status") |
Renamed to agnes diagnose system (subcommand under diagnose_app) — its content is a subset of what agnes diagnose already does. |
New agnes status |
Workspace status — fresh implementation, replaces da analyst status. Lives in cli/commands/status.py (overwrite). |
Phase 0 — CLI binary rename (da → agnes)
Task 0: Rename the CLI entry point
Files:
- Modify:
pyproject.toml([project.scripts]),cli/main.py(Typer(name=...)) - Test:
tests/test_cli_binary_rename.py(new)
Why first: Every later task that registers Typer apps, writes hook command strings, or asserts CLI output uses agnes. Rename the binary up front so tests in subsequent tasks reference the right name.
- Step 1: Read current entry points
grep -n "scripts\|^name\|tool.hatch" pyproject.toml | head
grep -n "Typer\|name=\"da\"\|name='da'" cli/main.py
- Step 2: Update
pyproject.toml
In [project.scripts], replace da = "cli.main:app" with:
[project.scripts]
agnes = "cli.main:app"
Single entry — no da alias kept. Greenfield.
- Step 3: Update
cli/main.py
Change the Typer app construction from name="da" to name="agnes" and update the help string:
app = typer.Typer(
name="agnes",
help="Agnes — AI Data Analyst CLI",
no_args_is_help=True,
)
- Step 4: Reinstall the editable package
uv pip install -e ".[dev]"
which agnes
agnes --version
Expected: agnes <version> prints; da --version now fails with "command not found".
- Step 5: Write a binary-name regression test
# tests/test_cli_binary_rename.py
"""Confirm the wheel installs the binary as `agnes`, not `da`."""
import subprocess
def test_agnes_command_exists():
result = subprocess.run(["agnes", "--version"], capture_output=True, text=True)
assert result.returncode == 0
def test_da_command_no_longer_works():
"""Greenfield: no backward-compat alias."""
result = subprocess.run(["bash", "-c", "command -v da"],
capture_output=True, text=True)
assert result.returncode != 0, "da should NOT be on PATH after rename"
- Step 6: Run the test
pytest tests/test_cli_binary_rename.py -v
- Step 7: Commit
git add pyproject.toml cli/main.py tests/test_cli_binary_rename.py
git commit -m "feat(cli): rename binary from da to agnes (BREAKING)"
Phase 1 — Server-side foundation (PAT scope, legacy-strings scan, install-prompt branching)
Task 1: Add scope + ttl_seconds fields to CreateTokenRequest
Files:
-
Modify:
app/api/tokens.py:23-25(CreateTokenRequest),app/api/tokens.py:85-101(create_tokenroute) -
Test:
tests/test_tokens_bootstrap_scope.py(new) -
Step 1: Write failing tests
# tests/test_tokens_bootstrap_scope.py
"""Tests for PAT scope + ttl_seconds fields (clean-analyst-bootstrap spec)."""
from __future__ import annotations
import jwt
import pytest
@pytest.fixture
def web_session(client, db_with_admin_user):
"""Authenticated test client with session cookie for admin user."""
# Form-login endpoint — see fixtures/analyst_bootstrap.py
resp = client.post("/auth/password/login/web",
data={"email": "admin@example.com", "password": "test-password"})
assert resp.status_code in (200, 302), f"login failed: {resp.text}"
return client
def _decode(pat: str) -> dict:
return jwt.decode(pat, options={"verify_signature": False})
def test_bootstrap_pat_ttl_clamped_to_one_hour(web_session):
resp = web_session.post("/auth/tokens", json={
"name": "init",
"scope": "bootstrap-analyst",
"ttl_seconds": 86400, # 1 day — must be ignored, clamped to 3600
})
assert resp.status_code == 201, resp.text
payload = _decode(resp.json()["token"])
assert payload.get("scope") == "bootstrap-analyst"
assert payload["exp"] - payload["iat"] <= 3600 + 5
def test_general_pat_uses_ttl_seconds_when_set(web_session):
resp = web_session.post("/auth/tokens", json={
"name": "test",
"ttl_seconds": 7200, # 2 hours
})
assert resp.status_code == 201
payload = _decode(resp.json()["token"])
assert payload["exp"] - payload["iat"] <= 7200 + 5
def test_general_pat_falls_back_to_expires_in_days(web_session):
resp = web_session.post("/auth/tokens", json={
"name": "test", "expires_in_days": 30,
})
assert resp.status_code == 201
payload = _decode(resp.json()["token"])
assert payload["exp"] - payload["iat"] <= 30 * 86400 + 5
def test_ttl_seconds_upper_bound(web_session):
# 3650 days * 86400 = 315_360_000 seconds. One past this must reject.
resp = web_session.post("/auth/tokens", json={
"name": "test", "ttl_seconds": 315_360_001,
})
assert resp.status_code == 400
def test_ttl_seconds_must_be_positive(web_session):
resp = web_session.post("/auth/tokens", json={
"name": "test", "ttl_seconds": 0,
})
assert resp.status_code == 400
def test_scope_default_is_general(web_session):
resp = web_session.post("/auth/tokens", json={"name": "test"})
assert resp.status_code == 201
payload = _decode(resp.json()["token"])
# scope=general is informational; check audit_log carries it
# (skipped here — tested in test_audit_log_includes_scope below)
assert payload.get("scope", "general") == "general"
The db_with_admin_user fixture is part of the existing test suite or will be added in tests/fixtures/analyst_bootstrap.py (Task 22). For now, this test depends on it; if it doesn't exist, mark these as pytest.skip until Task 22.
- Step 2: Run tests to verify they fail
cd "$(git rev-parse --show-toplevel)"
pytest tests/test_tokens_bootstrap_scope.py -v
Expected: tests FAIL with either fixture-missing error or extra fields not permitted from Pydantic (if the fixture exists).
- Step 3: Update
CreateTokenRequestmodel
Replace app/api/tokens.py:23-25:
class CreateTokenRequest(BaseModel):
name: str
expires_in_days: Optional[int] = 90 # null = no expiry
scope: str = "general" # informational; "bootstrap-analyst" force-clamps TTL ≤ 1 h
ttl_seconds: Optional[int] = None # if set, wins over expires_in_days
- Step 4: Update
create_tokenroute
Replace app/api/tokens.py:85-118 (the create_token function body up through the jwt_token = create_access_token(...) call):
@router.post("", response_model=CreateTokenResponse, status_code=201)
async def create_token(
payload: CreateTokenRequest,
user: dict = Depends(require_session_token),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
if not payload.name.strip():
raise HTTPException(status_code=400, detail="name is required")
if payload.expires_in_days is not None and payload.expires_in_days <= 0:
raise HTTPException(status_code=400, detail="expires_in_days must be a positive integer")
if payload.expires_in_days is not None and payload.expires_in_days > 3650:
raise HTTPException(status_code=400, detail="expires_in_days must not exceed 3650 (10 years)")
if payload.ttl_seconds is not None and payload.ttl_seconds <= 0:
raise HTTPException(status_code=400, detail="ttl_seconds must be a positive integer")
# Mirror the 3650-day cap on ttl_seconds so a hostile client can't
# bypass via field rename. 3650 days * 86400 = 315_360_000.
if payload.ttl_seconds is not None and payload.ttl_seconds > 315_360_000:
raise HTTPException(status_code=400, detail="ttl_seconds must not exceed 315360000 (10 years)")
# Resolve TTL: ttl_seconds wins; fall back to expires_in_days.
expires_delta: Optional[timedelta] = None
omit_exp = False
if payload.ttl_seconds is not None:
expires_delta = timedelta(seconds=payload.ttl_seconds)
elif payload.expires_in_days is not None:
expires_delta = timedelta(days=payload.expires_in_days)
else:
omit_exp = True # "no expiry"
# Force-clamp bootstrap-analyst PATs to ≤ 1 h regardless of request.
if payload.scope == "bootstrap-analyst":
ONE_HOUR = timedelta(hours=1)
if expires_delta is None or expires_delta > ONE_HOUR:
expires_delta = ONE_HOUR
omit_exp = False
expires_at: Optional[datetime] = None
if expires_delta is not None:
expires_at = datetime.now(timezone.utc) + expires_delta
repo = AccessTokenRepository(conn)
token_id = str(uuid.uuid4())
jwt_token = create_access_token(
user_id=user["id"], email=user["email"],
token_id=token_id, typ="pat",
expires_delta=expires_delta, omit_exp=omit_exp,
extra_claims={"scope": payload.scope},
)
- Step 5: Update
create_access_tokento acceptextra_claims
Find app/auth/jwt.py:create_access_token (read it first to get the current signature). Add an extra_claims: dict | None = None parameter that gets merged into the JWT payload before encoding. Show your edit:
grep -n "def create_access_token" app/auth/jwt.py
# Read the function and update.
If the function already supports extra claims, this is a no-op. Otherwise add:
def create_access_token(
user_id: str, email: str, token_id: Optional[str] = None,
typ: str = "session", expires_delta: Optional[timedelta] = None,
omit_exp: bool = False, extra_claims: Optional[dict] = None,
) -> str:
payload = {"sub": user_id, "email": email, "typ": typ}
if token_id:
payload["jti"] = token_id
if not omit_exp:
payload["iat"] = int(datetime.now(timezone.utc).timestamp())
if expires_delta:
payload["exp"] = int((datetime.now(timezone.utc) + expires_delta).timestamp())
if extra_claims:
payload.update(extra_claims)
return jwt.encode(payload, _SECRET, algorithm="HS256")
(Adapt to the actual function shape after reading it.)
- Step 6: Update audit-log entry to include scope
Search app/api/tokens.py for the _audit(...) call inside create_token and add scope to the params dict:
_audit(conn, actor=user["id"], action="token.create",
target=token_id,
params={"name": payload.name,
"expires_at": str(expires_at) if expires_at else None,
"scope": payload.scope})
- Step 7: Run tests to verify they pass
pytest tests/test_tokens_bootstrap_scope.py -v
Expected: all PASS (or skip if db_with_admin_user fixture doesn't yet exist; in that case the failure mode is a clear fixture-not-found error, not a logic error).
- Step 8: Run the full token test suite to verify no regression
pytest tests/ -k token -v
Expected: all token-related tests PASS.
- Step 9: Commit
git add app/api/tokens.py app/auth/jwt.py tests/test_tokens_bootstrap_scope.py
git commit -m "feat(tokens): add scope + ttl_seconds fields with bootstrap-analyst clamp"
Task 2: Add _LEGACY_STRINGS scan to admin workspace-prompt endpoint
Files:
-
Modify:
app/api/claude_md.py(add_LEGACY_STRINGS,_scan_legacy_strings, augmentTemplateGetResponse, populate inadmin_get_workspace_template) -
Test:
tests/test_legacy_strings_scan.py(new) -
Step 1: Write failing tests
# tests/test_legacy_strings_scan.py
"""Tests for legacy-string scan in admin CLAUDE.md template endpoint."""
from app.api.claude_md import _scan_legacy_strings, _LEGACY_STRINGS
def test_scan_finds_all_known_legacy_strings():
text = """
Run `da sync` then `da fetch web_sessions --where ...`.
Old workspace at data/parquet/ — see `da analyst setup`.
Use `da metrics list` and `da metrics show <id>`.
"""
hits = _scan_legacy_strings(text)
assert "da sync" in hits
assert "da fetch" in hits
assert "data/parquet" in hits
assert "da analyst setup" in hits
assert "da metrics list" in hits
assert "da metrics show" in hits
def test_scan_returns_empty_for_clean_text():
text = "Use `agnes pull` to refresh, `agnes snapshot create` for ad-hoc, `server/parquet/`."
assert _scan_legacy_strings(text) == []
def test_scan_returns_unique_sorted_hits():
text = "da sync da sync data/parquet/ data/parquet/foo"
hits = _scan_legacy_strings(text)
assert hits == sorted(set(hits))
def test_legacy_strings_constant_shape():
assert isinstance(_LEGACY_STRINGS, tuple)
assert all(isinstance(s, str) for s in _LEGACY_STRINGS)
assert "da sync" in _LEGACY_STRINGS
assert "data/parquet" in _LEGACY_STRINGS
- Step 2: Run tests to verify they fail
pytest tests/test_legacy_strings_scan.py -v
Expected: FAIL with ImportError: cannot import name '_scan_legacy_strings' from 'app.api.claude_md'.
- Step 3: Add
_LEGACY_STRINGSand_scan_legacy_stringstoapp/api/claude_md.py
Insert near the other module-level constants (after the imports, before the class definitions — find a stable location, e.g., right before class ClaudeMdResponse):
# Substrings that, when found in an admin-saved CLAUDE.md override, signal
# the override is stale relative to the post-clean-bootstrap CLI surface.
# Surfaced via TemplateGetResponse.legacy_strings_detected so the admin UI
# can render a yellow banner prompting re-authoring.
_LEGACY_STRINGS = (
"data/parquet",
"da sync",
"da fetch",
"da analyst setup",
"da metrics list",
"da metrics show",
)
def _scan_legacy_strings(text: str) -> list[str]:
"""Return sorted unique substrings from _LEGACY_STRINGS present in text."""
return sorted({s for s in _LEGACY_STRINGS if s in text})
- Step 4: Run tests to verify they pass
pytest tests/test_legacy_strings_scan.py -v
Expected: all PASS.
- Step 5: Augment
TemplateGetResponse
Find class TemplateGetResponse (around app/api/claude_md.py:72-76) and add the field:
class TemplateGetResponse(BaseModel):
content: Optional[str]
default: str
updated_at: Optional[str] = None
updated_by: Optional[str] = None
legacy_strings_detected: list[str] = [] # populated when override contains stale verbs/paths
- Step 6: Populate the field in
admin_get_workspace_template
Find the route (search for admin_get_workspace_template in app/api/claude_md.py). Inside the function body, before constructing the response, add:
# Scan the saved override (not the live default) for legacy strings.
# A non-empty list triggers the yellow banner in the admin UI.
override_text = override.content if override else ""
legacy_hits = _scan_legacy_strings(override_text)
Then include legacy_strings_detected=legacy_hits in the TemplateGetResponse(...) construction.
- Step 7: Add an HTTP test for the populated field
Append to tests/test_legacy_strings_scan.py:
def test_admin_get_template_returns_legacy_strings_when_override_dirty(web_session):
"""Setting an override containing legacy strings populates the field."""
web_session.put("/api/admin/workspace-prompt-template",
json={"content": "Run `da sync` and check data/parquet/."})
resp = web_session.get("/api/admin/workspace-prompt-template")
assert resp.status_code == 200
body = resp.json()
assert "da sync" in body["legacy_strings_detected"]
assert "data/parquet" in body["legacy_strings_detected"]
def test_admin_get_template_returns_empty_when_clean(web_session):
web_session.put("/api/admin/workspace-prompt-template",
json={"content": "Use `agnes pull` and check `server/parquet/`."})
resp = web_session.get("/api/admin/workspace-prompt-template")
assert resp.json()["legacy_strings_detected"] == []
These depend on web_session fixture from Task 22; mark pytest.skip if not yet present.
- Step 8: Run all
claude_mdtests
pytest tests/test_legacy_strings_scan.py tests/ -k claude_md -v
Expected: PASS (skip the HTTP tests if fixture missing — that's OK).
- Step 9: Commit
git add app/api/claude_md.py tests/test_legacy_strings_scan.py
git commit -m "feat(admin): scan CLAUDE.md override for legacy strings"
Task 3: Add role parameter to setup_instructions.py (analyst branch)
Files:
-
Modify:
app/web/setup_instructions.py(addroleparameter toresolve_linesandrender_setup_instructions; add analyst-branch helper) -
Test:
tests/test_setup_instructions_analyst.py(new) -
Step 1: Write failing tests
# tests/test_setup_instructions_analyst.py
"""Tests for analyst-branch rendering of /setup paste prompt."""
from app.web.setup_instructions import render_setup_instructions
def test_render_analyst_role_basic():
text = render_setup_instructions(
server_url="https://agnes.example.com",
token="agnes_pat_TEST",
wheel_filename="agnes-0.32.0-py3-none-any.whl",
role="analyst",
)
# Required content for analyst role:
assert "uv tool install" in text
assert "agnes init" in text
assert "--token" in text and "agnes_pat_TEST" in text
assert "--server-url" in text and "https://agnes.example.com" in text
assert "agnes catalog" in text # smoke verify step
# Forbidden content (admin-only):
assert "marketplace" not in text
assert "claude plugin install" not in text
assert "agnes skills install" not in text # analyst doesn't bulk-install skills
assert "agnes diagnose" not in text # analyst smoke verify is `agnes catalog`, not diagnose
def test_render_admin_role_unchanged():
"""Default role=admin keeps the existing 6/8-step layout."""
text = render_setup_instructions(
server_url="https://agnes.example.com",
token="agnes_pat_TEST",
wheel_filename="agnes-0.32.0-py3-none-any.whl",
# role omitted — defaults to "admin"
)
assert "agnes auth import-token" in text # admin uses import-token, not agnes init
assert "agnes diagnose" in text # admin keeps diagnose
def test_render_analyst_with_ca_pem():
"""Analyst role + private CA → TLS trust block reused from admin path."""
text = render_setup_instructions(
server_url="https://agnes.example.com",
token="agnes_pat_TEST",
wheel_filename="agnes-0.32.0-py3-none-any.whl",
role="analyst",
ca_pem="-----BEGIN CERTIFICATE-----\nMIIBxxx\n-----END CERTIFICATE-----",
)
assert "AGNES_CA_PEM" in text # heredoc marker from trust block
assert "ca-bundle.pem" in text
assert "agnes init" in text # analyst-specific step still present
- Step 2: Run tests to verify they fail
pytest tests/test_setup_instructions_analyst.py -v
Expected: FAIL — render_setup_instructions() doesn't accept role parameter.
- Step 3: Add analyst-branch helper functions
Insert after _install_cli_lines (around line 311 in setup_instructions.py):
def _analyst_init_lines(server_url_placeholder: str = "{server_url}") -> list[str]:
"""Steps 2-3 — `agnes init` (auth + workspace bootstrap) + smoke verify.
Replaces the admin-flow login + verify steps (today: `agnes auth import-token`
+ `agnes auth whoami`). `agnes init` is non-interactive: `--token` carries the PAT,
`--server-url` carries the origin. The bootstrap PAT has a 1 h TTL — if the
user takes longer than that to paste this prompt, the init call returns 401
and the user re-clicks "Generate prompt" on the install page.
"""
return [
"",
"2) Bootstrap your analyst workspace in this directory:",
f" agnes init --server-url \"{server_url_placeholder}\" --token \"{{token}}\" --workspace .",
"",
" This authenticates with the PAT, fetches your CLAUDE.md (RBAC-filtered),",
" installs Claude Code SessionStart/End hooks (auto-refresh), and runs an",
" initial `agnes pull` so your DuckDB views are ready.",
"",
"3) Verify the data is queryable:",
" agnes catalog",
"",
" This should list the tables your account has grants for. Empty list",
" means your admin hasn't granted you access yet — contact them.",
]
def _analyst_finale_lines(confirm_step_num: str, has_ca: bool) -> list[str]:
"""Final Confirm step for analyst role. Shorter than admin: no marketplace,
no plugins, no skills."""
bullets = [
" - `agnes --version` output",
" - First few lines of `agnes catalog` (tables you can see)",
" - Confirmation that `./CLAUDE.md` and `./AGNES_WORKSPACE.md` exist",
" - Confirmation that `./.claude/settings.json` contains SessionStart/End hooks",
]
if has_ca:
bullets.append(
" - Which CA bundle source got picked in step 0(d)"
)
return [
"",
f"{confirm_step_num}) Confirm:",
" Tell me \"Agnes analyst workspace is ready\" and summarize:",
*bullets,
]
- Step 4: Add
roleparameter toresolve_linesandrender_setup_instructions
Find def resolve_lines(...) (around line 609). Modify the signature and dispatch:
from typing import Literal
def resolve_lines(
wheel_filename: str,
*,
plugin_install_names: list[str] | None = None,
self_signed_tls: bool = False,
server_host: str = "",
ca_pem: str | None = None,
role: Literal["analyst", "admin"] = "admin",
) -> list[str]:
"""..."""
if role == "analyst":
return _resolve_analyst_lines(wheel_filename, ca_pem=ca_pem)
# Existing admin path:
names = list(plugin_install_names or [])
has_marketplace = bool(names)
has_ca = bool(ca_pem and ca_pem.strip())
# ... (existing body unchanged)
Add the new analyst dispatcher right after resolve_lines:
def _resolve_analyst_lines(wheel_filename: str, *, ca_pem: str | None) -> list[str]:
"""Analyst workspace-bootstrap layout. Self-contained — no admin-only steps."""
has_ca = bool(ca_pem and ca_pem.strip())
confirm_step = "4" if has_ca else "4" # numbering: 0 (TLS optional), 1, 2, 3, 4
lines: list[str] = []
if has_ca:
lines.extend(_tls_trust_block(ca_pem))
lines.extend(_preamble_lines(has_ca=has_ca))
lines.extend(_install_cli_lines(has_ca=has_ca)) # step 1
lines.extend(_analyst_init_lines()) # steps 2-3
lines.extend(_analyst_finale_lines(confirm_step, has_ca=has_ca)) # step 4
return [
line.replace("{wheel_filename}", wheel_filename)
for line in lines
]
Update render_setup_instructions to accept and forward role:
def render_setup_instructions(
server_url: str,
token: str,
wheel_filename: str = "agnes.whl",
*,
plugin_install_names: list[str] | None = None,
self_signed_tls: bool = False,
server_host: str = "",
ca_pem: str | None = None,
role: Literal["analyst", "admin"] = "admin",
) -> str:
lines = resolve_lines(
wheel_filename,
plugin_install_names=plugin_install_names,
self_signed_tls=self_signed_tls,
server_host=server_host,
ca_pem=ca_pem,
role=role,
)
text = "\n".join(lines)
return text.replace("{server_url}", server_url).replace("{token}", token)
- Step 5: Run tests to verify they pass
pytest tests/test_setup_instructions_analyst.py -v
Expected: all PASS.
- Step 6: Run regression on existing setup-instruction tests
pytest tests/ -k setup_instructions -v
Expected: existing admin-role tests still PASS (no regression).
- Step 7: Commit
git add app/web/setup_instructions.py tests/test_setup_instructions_analyst.py
git commit -m "feat(setup): add analyst role to install-prompt renderer"
Task 4: Add role query branching to /setup route
Files:
-
Modify:
app/web/router.py(setup_pagearound line 717 — readrolequery param, pass to renderer) -
Test:
tests/test_setup_page_roles.py(new) -
Step 1: Read existing
setup_pageto understand its current shape
grep -n "setup_page\|/setup\|/install" app/web/router.py | head
Read the function (~30 lines around the match) to understand its current call sites and template rendering.
- Step 2: Write failing tests
# tests/test_setup_page_roles.py
"""Tests for /setup role query-param branching."""
def test_setup_page_default_role_is_admin(client):
resp = client.get("/setup")
assert resp.status_code == 200
# Admin tile is active; analyst tile is linked.
assert "Admin CLI" in resp.text or "role=admin" in resp.text
def test_setup_page_analyst_role(client):
resp = client.get("/setup?role=analyst")
assert resp.status_code == 200
assert "Analyst workspace" in resp.text or "role=analyst" in resp.text
def test_install_redirects_to_setup(client):
resp = client.get("/install", follow_redirects=False)
assert resp.status_code in (302, 307)
assert "/setup" in resp.headers["location"]
- Step 3: Run tests to verify they fail
pytest tests/test_setup_page_roles.py -v
Expected: tests for analyst/role-branching content FAIL; redirect test may PASS (existing behavior).
- Step 4: Modify
setup_pageto readrolequery param
Find setup_page in app/web/router.py. Update its signature to add a role query param and pass it to the renderer:
from typing import Literal
from fastapi import Query
@router.get("/setup", response_class=HTMLResponse)
async def setup_page(
request: Request,
role: Literal["analyst", "admin"] = Query(default="admin", description="Bootstrap target role"),
# ... existing dependencies (auth, etc.)
):
"""Renders the role-specific install paste prompt."""
# ... existing context-building code ...
ctx["role"] = role
return templates.TemplateResponse(request, "setup.html", ctx)
If setup_page already calls render_setup_instructions(...) server-side (vs. JS-rendered), pass role there too:
prompt_text = render_setup_instructions(
server_url=str(request.base_url).rstrip("/"),
token="{token}", # placeholder filled by JS at click time
wheel_filename=resolved_wheel,
plugin_install_names=plugin_install_names if role == "admin" else None,
self_signed_tls=...,
server_host=...,
ca_pem=...,
role=role,
)
- Step 5: Update
setup.htmltemplate to render role tiles
Find app/web/templates/setup.html (or whatever setup_page actually renders — grep -n "setup.html\|TemplateResponse" app/web/router.py). Add two role tiles near the top of the body:
<div class="role-tiles" style="display:flex; gap:1rem; margin-bottom:2rem;">
<a href="/setup?role=analyst"
class="role-tile {% if role == 'analyst' %}is-active{% endif %}"
style="flex:1; padding:1rem; border:2px solid {% if role == 'analyst' %}#0070f3{% else %}#ddd{% endif %}; border-radius:8px; text-decoration:none;">
<h3>Analyst workspace</h3>
<p>Bootstrap a workspace folder with CLAUDE.md, hooks, and synced data.</p>
</a>
<a href="/setup?role=admin"
class="role-tile {% if role == 'admin' %}is-active{% endif %}"
style="flex:1; padding:1rem; border:2px solid {% if role == 'admin' %}#0070f3{% else %}#ddd{% endif %}; border-radius:8px; text-decoration:none;">
<h3>Admin CLI</h3>
<p>Install the CLI, register the marketplace, set up admin tooling.</p>
</a>
</div>
If the template is more sophisticated (e.g., with role-specific JS), wire the JS to use the role ctx variable when calling POST /auth/tokens for PAT minting:
const role = "{{ role }}";
const scope = role === "analyst" ? "bootstrap-analyst" : "general";
const ttlSeconds = role === "analyst" ? 3600 : 86400; // analyst short-lived
await fetch('/auth/tokens', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name: `setup-${role}`, scope, ttl_seconds: ttlSeconds }),
});
- Step 6: Run tests to verify they pass
pytest tests/test_setup_page_roles.py -v
Expected: all PASS.
- Step 7: Commit
git add app/web/router.py app/web/templates/setup.html tests/test_setup_page_roles.py
git commit -m "feat(setup): /setup?role=analyst|admin branching with role tiles"
Task 5: Add legacy-strings banner to admin workspace-prompt template UI
Files:
-
Modify:
app/web/templates/admin_workspace_prompt.html(add banner above editor whenlegacy_strings_detectednon-empty) -
Test:
tests/test_legacy_strings_scan.py(extend with HTML rendering test) -
Step 1: Read existing admin-prompt template
cat app/web/templates/admin_workspace_prompt.html
Find where the editor (textarea) is rendered.
- Step 2: Write extension test
Append to tests/test_legacy_strings_scan.py:
def test_admin_prompt_template_renders_banner_when_legacy_present(web_session):
web_session.put("/api/admin/workspace-prompt-template",
json={"content": "Run `da sync` daily."})
resp = web_session.get("/admin/workspace-prompt")
assert resp.status_code == 200
assert "yellow" in resp.text.lower() or "warning" in resp.text.lower()
assert "da sync" in resp.text # the hit is rendered in the banner
def test_admin_prompt_template_no_banner_when_clean(web_session):
web_session.put("/api/admin/workspace-prompt-template",
json={"content": "Run `agnes pull` daily."})
resp = web_session.get("/admin/workspace-prompt")
assert resp.status_code == 200
# The banner div is absent or empty
# Implementation: e.g., id="legacy-banner" wraps the warning; check empty
assert "legacy-banner" not in resp.text or "display: none" in resp.text or len(
[l for l in resp.text.split("\n") if "legacy-banner" in l and "hidden" not in l]
) <= 2
(The exact test is fragile — strengthen once the implementation lands.)
- Step 3: Run tests to verify they fail
pytest tests/test_legacy_strings_scan.py -v -k banner
- Step 4: Modify
admin_workspace_prompt.html
Find the spot above the editor <textarea> and insert:
{% if legacy_strings_detected %}
<div id="legacy-banner" style="background:#fff3cd; border:1px solid #ffc107; padding:0.75rem 1rem; border-radius:4px; margin-bottom:1rem;">
<strong>⚠ This override references CLI verbs / paths that were renamed:</strong>
<ul style="margin:0.5rem 0 0 1.5rem;">
{% for hit in legacy_strings_detected %}
<li><code>{{ hit }}</code></li>
{% endfor %}
</ul>
<p style="margin:0.5rem 0 0 0;">Re-author and Save to clear this warning. See CHANGELOG for the rename list.</p>
</div>
{% endif %}
In the route that renders this template (find via grep -n "admin_workspace_prompt\.html" app/web/router.py app/web/admin_router.py), pass legacy_strings_detected into the context. The data comes from the same _scan_legacy_strings(override_text) call as the API — DRY by importing from app.api.claude_md:
from app.api.claude_md import _scan_legacy_strings
@router.get("/admin/workspace-prompt", response_class=HTMLResponse)
async def admin_workspace_prompt_page(...):
override = repo.get_workspace_prompt_template()
ctx = {
"override_content": override.content if override else "",
"legacy_strings_detected": _scan_legacy_strings(override.content) if override else [],
# ... rest of existing context
}
return templates.TemplateResponse(request, "admin_workspace_prompt.html", ctx)
- Step 5: Run tests to verify they pass
pytest tests/test_legacy_strings_scan.py -v
Expected: PASS.
- Step 6: Commit
git add app/web/templates/admin_workspace_prompt.html app/web/router.py tests/test_legacy_strings_scan.py
git commit -m "feat(admin): yellow banner for legacy CLI verbs in workspace-prompt override"
Task 6: Update config/claude_md_template.txt (server-side rendered to /api/welcome)
Files:
-
Modify:
config/claude_md_template.txt(verb + path rewrites) -
Step 1: Read the current template
cat config/claude_md_template.txt | head -100
wc -l config/claude_md_template.txt
- Step 2: Apply systematic rewrites
Replace throughout the file:
da sync→agnes pull(everywhere)da analyst setup→agnes init(everywhere)da fetch→agnes snapshot createda metrics list→agnes catalog --metricsda metrics show→agnes catalog --metrics --showdata/parquet/→server/parquet/data/duckdb/→user/duckdb/data/metadata/→ (delete references; the path no longer exists)
Use sed:
sed -i.bak \
-e 's|da sync --upload-only|agnes push|g' \
-e 's|da sync|agnes pull|g' \
-e 's|da analyst setup|agnes init|g' \
-e 's|da fetch|agnes snapshot create|g' \
-e 's|da metrics list|agnes catalog --metrics|g' \
-e 's|da metrics show|agnes catalog --metrics --show|g' \
-e 's|data/parquet/|server/parquet/|g' \
-e 's|data/duckdb/|user/duckdb/|g' \
config/claude_md_template.txt
rm config/claude_md_template.txt.bak
- Step 3: Read the result and review for any leftover legacy strings
grep -nE 'da sync|da fetch|da analyst|da metrics list|da metrics show|data/parquet|data/duckdb|data/metadata' config/claude_md_template.txt
Expected: no matches.
- Step 4: Add a top-of-file pointer to AGNES_WORKSPACE.md
Insert near the top of the rendered template (e.g., after the # {instance_name} heading):
> Looking for human-readable workspace docs? Open `AGNES_WORKSPACE.md` in this directory — that file documents what `agnes init` installed, where files live, and how to uninstall.
- Step 5: Render the template via
/api/welcome(manual smoke)
(Defer the real test to Task 27 — clean-install integration.)
- Step 6: Commit
git add config/claude_md_template.txt
git commit -m "docs(claude-md-template): rewrite verbs + paths for new CLI surface"
Phase 2 — Client-side library (cli/lib/)
Task 7: Establish cli/lib/ package + install_claude_hooks
Files:
-
Create:
cli/lib/__init__.py,cli/lib/hooks.py -
Test:
tests/test_lib_hooks.py(new) -
Step 1: Write failing tests
# tests/test_lib_hooks.py
"""Tests for cli/lib/hooks.py:install_claude_hooks."""
import json
from pathlib import Path
import pytest
from cli.lib.hooks import install_claude_hooks
def _read_settings(workspace: Path) -> dict:
return json.loads((workspace / ".claude" / "settings.json").read_text())
def test_install_creates_settings_file(tmp_path):
install_claude_hooks(tmp_path)
cfg = _read_settings(tmp_path)
assert cfg["hooks"]["SessionStart"]
assert "agnes pull --quiet" in cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]
assert cfg["hooks"]["SessionEnd"]
assert "agnes push --quiet" in cfg["hooks"]["SessionEnd"][0]["hooks"][0]["command"]
def test_install_idempotent(tmp_path):
install_claude_hooks(tmp_path)
install_claude_hooks(tmp_path)
cfg = _read_settings(tmp_path)
# Only ONE entry per event after second install (not duplicated)
assert len(cfg["hooks"]["SessionStart"]) == 1
assert len(cfg["hooks"]["SessionEnd"]) == 1
def test_install_replaces_old_da_sync_entries(tmp_path):
"""Hook from a pre-rewrite workspace gets replaced cleanly."""
settings_path = tmp_path / ".claude" / "settings.json"
settings_path.parent.mkdir(parents=True)
settings_path.write_text(json.dumps({
"hooks": {
"SessionStart": [{"hooks": [{"type": "command", "command": "da sync --quiet"}]}],
"SessionEnd": [{"hooks": [{"type": "command", "command": "da sync --upload-only --quiet"}]}],
}
}))
install_claude_hooks(tmp_path)
cfg = _read_settings(tmp_path)
assert len(cfg["hooks"]["SessionStart"]) == 1
assert "agnes pull" in cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]
assert "da sync" not in cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]
def test_install_preserves_third_party_hooks(tmp_path):
settings_path = tmp_path / ".claude" / "settings.json"
settings_path.parent.mkdir(parents=True)
settings_path.write_text(json.dumps({
"hooks": {
"SessionStart": [{"hooks": [{"type": "command", "command": "echo hi from another tool"}]}],
"PreToolUse": [{"hooks": [{"type": "command", "command": "echo pre"}]}],
}
}))
install_claude_hooks(tmp_path)
cfg = _read_settings(tmp_path)
# Third-party SessionStart entry survives; our agnes pull entry appended
starts = cfg["hooks"]["SessionStart"]
assert any("echo hi from another tool" in s["hooks"][0]["command"] for s in starts)
assert any("agnes pull" in s["hooks"][0]["command"] for s in starts)
# PreToolUse untouched
assert cfg["hooks"]["PreToolUse"][0]["hooks"][0]["command"] == "echo pre"
def test_install_handles_missing_settings_file(tmp_path):
"""No prior settings.json → create from scratch."""
install_claude_hooks(tmp_path)
assert (tmp_path / ".claude" / "settings.json").exists()
def test_install_handles_invalid_json(tmp_path, capsys):
"""Invalid existing settings.json → warn, skip."""
settings_path = tmp_path / ".claude" / "settings.json"
settings_path.parent.mkdir(parents=True)
settings_path.write_text("not valid json {")
# Should not raise; should print a warning
install_claude_hooks(tmp_path)
captured = capsys.readouterr()
assert "not valid JSON" in captured.err or "warning" in captured.err.lower()
- Step 2: Run tests to verify they fail
pytest tests/test_lib_hooks.py -v
Expected: ImportError — cli.lib doesn't exist.
- Step 3: Create
cli/lib/__init__.py
touch cli/lib/__init__.py
- Step 4: Create
cli/lib/hooks.py
# cli/lib/hooks.py
"""Workspace-scoped Claude Code hook installer.
Replaces the in-place `_install_claude_hooks` from `cli/commands/analyst.py`
(deleted as part of the clean-analyst-bootstrap rewrite). Splits hook
installation into a pure-function library so `agnes init` and any future caller
can use it without dragging in the deleted command module.
Design notes:
- Workspace-scoped (`<workspace>/.claude/settings.json`), NOT user-home.
The hooks fire only when Claude Code opens this workspace.
- Idempotent: second invocation drops a prior `agnes pull` / `da sync` /
`agnes push` entry (matched by command substring) and appends fresh entries.
Third-party hooks (mixed entries, foreign commands) are left alone.
- Uses `\\| true` so the hook never blocks a session on a transient sync error.
"""
from __future__ import annotations
import json
import sys
from pathlib import Path
_OUR_COMMAND_MARKERS = ("agnes pull", "agnes push", "da sync")
def install_claude_hooks(workspace: Path) -> None:
"""Install SessionStart→`agnes pull` and SessionEnd→`agnes push` hooks.
Idempotent. Workspace-scoped (writes `<workspace>/.claude/settings.json`).
Preserves third-party hooks and other event types.
"""
settings_path = workspace / ".claude" / "settings.json"
settings_path.parent.mkdir(parents=True, exist_ok=True)
if settings_path.exists():
try:
cfg = json.loads(settings_path.read_text(encoding="utf-8"))
except json.JSONDecodeError:
print(
f"Warning: {settings_path} is not valid JSON; skipping hook install.",
file=sys.stderr,
)
return
else:
cfg = {}
hooks = cfg.setdefault("hooks", {})
def _replace_or_add(event: str, command: str) -> None:
existing = hooks.setdefault(event, [])
# Drop any prior entry whose every hook command matches one of our
# markers. Mixed entries (third-party + ours) are left alone.
for entry in list(existing):
entry_cmds = [h.get("command", "") for h in entry.get("hooks", [])]
if entry_cmds and all(
any(marker in c for marker in _OUR_COMMAND_MARKERS) for c in entry_cmds
):
existing.remove(entry)
existing.append({"hooks": [{"type": "command", "command": command}]})
_replace_or_add("SessionStart", "agnes pull --quiet 2>/dev/null || true")
_replace_or_add("SessionEnd", "agnes push --quiet 2>/dev/null || true")
settings_path.write_text(json.dumps(cfg, indent=2) + "\n", encoding="utf-8")
- Step 5: Run tests to verify they pass
pytest tests/test_lib_hooks.py -v
Expected: all PASS.
- Step 6: Commit
git add cli/lib/__init__.py cli/lib/hooks.py tests/test_lib_hooks.py
git commit -m "feat(cli-lib): cli/lib/hooks.py:install_claude_hooks"
Task 8: cli/lib/pull.py:run_pull — extract data-refresh primitive from sync.py
Files:
-
Create:
cli/lib/pull.py -
Test:
tests/test_lib_pull.py(new) -
Step 1: Read
cli/commands/sync.pyto identify the function body to lift
wc -l cli/commands/sync.py
grep -n "^def \|^class " cli/commands/sync.py
Identify:
-
The Typer command function (e.g.,
sync()decorated with@sync_app.command()) -
The helper functions called from it:
_rebuild_duckdb_views,_fetch_and_write_rules,_is_valid_parquet, etc. -
Step 2: Write failing tests
# tests/test_lib_pull.py
"""Tests for cli/lib/pull.py:run_pull."""
from __future__ import annotations
import json
from pathlib import Path
from unittest.mock import patch, MagicMock
import pytest
from cli.lib.pull import run_pull, PullResult
@pytest.fixture
def fake_server(monkeypatch):
"""Mock api_get to return canned manifest + memory bundle."""
canned = {
"/api/sync/manifest": {"tables": []},
"/api/memory/bundle": {"mandatory": [], "approved": []},
}
def _api_get(path, *args, **kwargs):
resp = MagicMock()
resp.status_code = 200
body = canned.get(path, {})
resp.json.return_value = body
resp.iter_bytes = lambda chunk_size=65536: iter([b""])
resp.raise_for_status = lambda: None
return resp
monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
return canned
def test_run_pull_empty_manifest_no_parquet_dir(tmp_path, fake_server):
result = run_pull(server_url="http://x", token="t", workspace=tmp_path)
assert isinstance(result, PullResult)
assert result.tables_updated == 0
assert not (tmp_path / "server" / "parquet").exists(), \
"lazy mkdir: empty manifest must not create server/parquet/"
def test_run_pull_empty_memory_no_rules_dir(tmp_path, fake_server):
run_pull(server_url="http://x", token="t", workspace=tmp_path)
assert not (tmp_path / ".claude" / "rules").exists(), \
"lazy mkdir: empty bundle must not create .claude/rules/"
def test_run_pull_creates_duckdb_unconditionally(tmp_path, fake_server):
"""Even with zero data, the DuckDB file is opened (it's the load-bearing
artifact and other readers expect its parent dir to exist)."""
run_pull(server_url="http://x", token="t", workspace=tmp_path)
assert (tmp_path / "user" / "duckdb" / "analytics.duckdb").exists()
def test_run_pull_with_one_table(tmp_path, monkeypatch):
"""Manifest with one table → server/parquet/ created, parquet downloaded."""
canned_manifest = {"tables": [{"id": "tbl1", "md5": "abc"}]}
canned_memory = {"mandatory": [], "approved": []}
parquet_bytes = b"PAR1" + b"\x00" * 1000 + b"PAR1" # minimal valid parquet shape
def _api_get(path, *args, **kwargs):
resp = MagicMock()
resp.status_code = 200
if path == "/api/sync/manifest":
resp.json.return_value = canned_manifest
elif path == "/api/memory/bundle":
resp.json.return_value = canned_memory
elif path.startswith("/api/data/tbl1/download"):
resp.iter_bytes = lambda chunk_size=65536: iter([parquet_bytes])
return resp
monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
monkeypatch.setattr("cli.lib.pull._is_valid_parquet", lambda p: True, raising=False)
result = run_pull(server_url="http://x", token="t", workspace=tmp_path)
assert (tmp_path / "server" / "parquet").exists()
assert (tmp_path / "server" / "parquet" / "tbl1.parquet").exists()
assert result.tables_updated == 1
def test_run_pull_dry_run_writes_nothing(tmp_path, fake_server):
run_pull(server_url="http://x", token="t", workspace=tmp_path, dry_run=True)
assert not (tmp_path / "server").exists()
assert not (tmp_path / "user" / "duckdb").exists()
- Step 3: Run tests to verify they fail
pytest tests/test_lib_pull.py -v
Expected: ImportError on cli.lib.pull.
- Step 4: Create
cli/lib/pull.py
Lift the body of today's cli/commands/sync.py:sync() into a pure function. Specifically:
- Move
_rebuild_duckdb_views,_fetch_and_write_rules,_is_valid_parquet(private helpers) intocli/lib/pull.py. - Drop Typer decorators and
typer.echocalls — replace with returning structured result. - Apply lazy-mkdir fixes:
_fetch_and_write_rules: checkmandatory + approvednon-empty before mkdir.- Per-table download loop: mkdir
server/parquet/inside the loop, only when about to write.
# cli/lib/pull.py
"""Pure-function data-refresh primitive — used by `agnes pull` and `agnes init`.
Extracted from `cli/commands/sync.py` (deleted in the clean-bootstrap rewrite).
This module has no Typer dependency, no stdout side effects, no exit calls.
Callers decide what to print and how to handle errors.
Lazy-mkdir contract:
- `<workspace>/server/parquet/` — created only when the manifest has at least one
table to download. Empty manifest → directory is never created.
- `<workspace>/.claude/rules/` — created only when `/api/memory/bundle` returns
at least one mandatory or approved item. Empty bundle → directory absent.
- `<workspace>/user/duckdb/analytics.duckdb` — created unconditionally. The DB
file (not just the dir) is the load-bearing artifact every reader expects.
"""
from __future__ import annotations
import hashlib
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional
from cli.client import api_get
_SAFE_ID_RE = re.compile(r"^[a-zA-Z0-9_\-]+$")
@dataclass
class PullResult:
tables_updated: int = 0
parquets_total: int = 0
rules_count: int = 0
duration_s: float = 0.0
errors: list[str] = field(default_factory=list)
def run_pull(
server_url: str,
token: str,
workspace: Path,
*,
dry_run: bool = False,
) -> PullResult:
"""Refresh registered data into <workspace>/server/parquet + user/duckdb.
Args:
server_url: Base URL of the Agnes server.
token: PAT for `Authorization: Bearer <token>`.
workspace: Local workspace root. Lazy-mkdir applies inside this dir.
dry_run: If True, computes deltas but writes nothing to disk.
Returns:
PullResult with summary counts.
"""
import time
start = time.time()
result = PullResult()
# 1. Fetch RBAC-filtered manifest
resp = api_get("/api/sync/manifest", server_url=server_url, token=token)
resp.raise_for_status()
manifest = resp.json()
tables = manifest.get("tables", [])
# 2. Per-table download with lazy mkdir
parquet_dir = workspace / "server" / "parquet"
for tbl in tables:
tbl_id = tbl.get("id", "")
if not _SAFE_ID_RE.match(tbl_id):
result.errors.append(f"unsafe table id skipped: {tbl_id!r}")
continue
target = parquet_dir / f"{tbl_id}.parquet"
# Skip if local md5 matches remote
remote_md5 = tbl.get("md5", "")
if target.exists() and _file_md5(target) == remote_md5:
continue
if dry_run:
result.tables_updated += 1
continue
# Lazy mkdir: only when about to write the FIRST parquet.
parquet_dir.mkdir(parents=True, exist_ok=True)
try:
stream_resp = api_get(
f"/api/data/{tbl_id}/download", server_url=server_url, token=token, stream=True,
)
stream_resp.raise_for_status()
with open(target, "wb") as fh:
for chunk in stream_resp.iter_bytes(chunk_size=65536):
fh.write(chunk)
if not _is_valid_parquet(target):
target.unlink()
result.errors.append(f"{tbl_id}: invalid parquet (missing PAR1)")
continue
result.tables_updated += 1
except Exception as exc:
if target.exists():
target.unlink()
result.errors.append(f"{tbl_id}: {exc}")
# 3. Rebuild DuckDB views
if not dry_run:
_rebuild_duckdb_views(workspace, parquet_dir)
result.parquets_total = len(list(parquet_dir.glob("*.parquet"))) if parquet_dir.exists() else 0
# 4. Fetch corporate-memory bundle (lazy mkdir for .claude/rules/)
if not dry_run:
result.rules_count = _fetch_and_write_rules(workspace, server_url, token)
result.duration_s = time.time() - start
return result
def _file_md5(path: Path) -> str:
h = hashlib.md5()
with open(path, "rb") as fh:
for chunk in iter(lambda: fh.read(8192), b""):
h.update(chunk)
return h.hexdigest()
def _is_valid_parquet(path: Path) -> bool:
"""Cheap structural check — parquet files begin and end with `PAR1`."""
try:
with open(path, "rb") as fh:
head = fh.read(4)
fh.seek(-4, 2)
tail = fh.read(4)
return head == b"PAR1" and tail == b"PAR1"
except OSError:
return False
def _rebuild_duckdb_views(workspace: Path, parquet_dir: Path) -> None:
import duckdb
db_path = workspace / "user" / "duckdb" / "analytics.duckdb"
db_path.parent.mkdir(parents=True, exist_ok=True)
conn = duckdb.connect(str(db_path))
try:
# Drop all existing views
views = conn.execute(
"SELECT table_name FROM information_schema.tables WHERE table_type='VIEW'"
).fetchall()
for (view_name,) in views:
conn.execute(f'DROP VIEW IF EXISTS "{view_name}"')
# Recreate from parquets (if any)
if parquet_dir.exists():
for pq_file in parquet_dir.glob("*.parquet"):
view_name = pq_file.stem
if not _SAFE_ID_RE.match(view_name):
continue
if not _is_valid_parquet(pq_file):
continue
abs_path = str(pq_file.resolve()).replace("'", "''")
try:
conn.execute(
f'CREATE VIEW "{view_name}" AS SELECT * FROM read_parquet(\'{abs_path}\')'
)
except duckdb.Error:
continue
finally:
conn.close()
def _fetch_and_write_rules(workspace: Path, server_url: str, token: str) -> int:
"""Fetch /api/memory/bundle and write .claude/rules/km_*.md files.
Lazy mkdir: only creates `<workspace>/.claude/rules/` if the bundle is non-empty.
Returns the number of rule files written.
"""
rules_dir = workspace / ".claude" / "rules"
try:
resp = api_get("/api/memory/bundle", server_url=server_url, token=token)
resp.raise_for_status()
bundle = resp.json()
except Exception:
return 0
items = list(bundle.get("mandatory", [])) + list(bundle.get("approved", []))
if not items:
return 0 # no mkdir, nothing to write
rules_dir.mkdir(parents=True, exist_ok=True)
written = 0
for item in items:
item_id = item.get("id", "")
if not _SAFE_ID_RE.match(item_id):
continue
fname = f"km_{item_id}.md"
body = _item_to_md(item)
(rules_dir / fname).write_text(body, encoding="utf-8")
written += 1
return written
def _item_to_md(item: dict) -> str:
title = item.get("title", "")
body = item.get("body", "")
return f"# {title}\n\n{body}\n"
(If cli/client.py:api_get doesn't accept the server_url/token/stream kwargs as shown, adapt the calls to the actual signature.)
- Step 5: Run tests to verify they pass
pytest tests/test_lib_pull.py -v
Expected: all PASS.
- Step 6: Commit
git add cli/lib/pull.py tests/test_lib_pull.py
git commit -m "feat(cli-lib): cli/lib/pull.py:run_pull primitive with lazy mkdir"
Phase 3 — New CLI commands
Task 9: agnes pull Typer wrapper
Files:
-
Create:
cli/commands/pull.py -
Test:
tests/test_cli_pull.py(new) -
Step 1: Write failing test
# tests/test_cli_pull.py
"""Tests for `agnes pull` Typer wrapper."""
from typer.testing import CliRunner
from cli.commands.pull import pull_app
runner = CliRunner()
def test_pull_help():
result = runner.invoke(pull_app, ["--help"])
assert result.exit_code == 0
assert "--quiet" in result.output
assert "--json" in result.output
assert "--dry-run" in result.output
- Step 2: Run test to verify it fails
pytest tests/test_cli_pull.py -v
Expected: ImportError.
- Step 3: Create
cli/commands/pull.py
# cli/commands/pull.py
"""`agnes pull` — refresh registered data into the workspace.
Thin Typer wrapper around `cli/lib/pull.py:run_pull`. Used by:
- Manual invocation: analyst types `agnes pull` to force a refresh.
- SessionStart hook: `agnes pull --quiet 2>/dev/null || true` runs at the start
of every Claude Code session in this workspace.
"""
from __future__ import annotations
import json
import os
from pathlib import Path
import typer
from cli.config import get_server_url, load_token
from cli.error_render import render_error
from cli.lib.pull import run_pull, PullResult
pull_app = typer.Typer(help="Refresh registered data from the server")
@pull_app.callback(invoke_without_command=True)
def pull(
quiet: bool = typer.Option(False, "--quiet", help="Suppress success stdout"),
as_json: bool = typer.Option(False, "--json", help="Machine-readable output"),
dry_run: bool = typer.Option(False, "--dry-run", help="Compute deltas without writing"),
):
"""Refresh data from the server into ./server/parquet + ./user/duckdb."""
server_url = get_server_url()
if not server_url:
typer.echo(render_error(0, {"detail": {
"kind": "server_unreachable",
"hint": "No server configured. Run: agnes init --server-url <URL> --token <PAT>",
}}), err=True)
raise typer.Exit(1)
token = load_token()
if not token:
typer.echo(render_error(0, {"detail": {
"kind": "auth_failed",
"hint": "No token. Run: agnes auth import-token --token <PAT>",
}}), err=True)
raise typer.Exit(1)
workspace = Path(os.environ.get("DA_LOCAL_DIR", ".")).resolve()
try:
result: PullResult = run_pull(server_url, token, workspace, dry_run=dry_run)
except Exception as exc:
typer.echo(render_error(0, {"detail": {
"kind": "manifest_unauthorized",
"hint": f"Pull failed: {exc}",
"message": str(exc),
}}), err=True)
raise typer.Exit(1)
if as_json:
typer.echo(json.dumps({
"tables_updated": result.tables_updated,
"parquets_total": result.parquets_total,
"rules_count": result.rules_count,
"duration_s": round(result.duration_s, 3),
"errors": result.errors,
}))
return
if quiet:
if result.errors:
for e in result.errors:
typer.echo(f"warn: {e}", err=True)
return
typer.echo(f"Updated {result.tables_updated} tables ({result.parquets_total} total).")
typer.echo(f"Rules: {result.rules_count}.")
if result.errors:
for e in result.errors:
typer.echo(f"warn: {e}", err=True)
- Step 4: Run test to verify it passes
pytest tests/test_cli_pull.py -v
Expected: PASS.
- Step 5: Commit
git add cli/commands/pull.py tests/test_cli_pull.py
git commit -m "feat(cli): agnes pull command (Typer wrapper around lib.pull.run_pull)"
Task 10: agnes push command (extract from da sync --upload-only)
Files:
-
Create:
cli/commands/push.py -
Test:
tests/test_cli_push.py(new) -
Step 1: Write failing tests
# tests/test_cli_push.py
from pathlib import Path
from typer.testing import CliRunner
from cli.commands.push import push_app
runner = CliRunner()
def test_push_help():
result = runner.invoke(push_app, ["--help"])
assert result.exit_code == 0
assert "--quiet" in result.output
assert "--json" in result.output
def test_push_no_sessions_no_mkdir(tmp_path, monkeypatch):
"""Empty workspace → push exits 0, doesn't create user/sessions/."""
monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
monkeypatch.setattr("cli.commands.push.load_token", lambda: "test-pat")
result = runner.invoke(push_app, ["--quiet"])
assert result.exit_code == 0
assert not (tmp_path / "user" / "sessions").exists()
- Step 2: Run test to verify it fails
pytest tests/test_cli_push.py -v
- Step 3: Create
cli/commands/push.py
# cli/commands/push.py
"""`agnes push` — upload local sessions and CLAUDE.local.md to the server.
Extracted from today's `da sync --upload-only`. Hook command:
`agnes push --quiet 2>/dev/null || true` (runs at SessionEnd).
"""
from __future__ import annotations
import json
import os
from pathlib import Path
import typer
from cli.client import api_post
from cli.config import get_server_url, load_token
from cli.error_render import render_error
push_app = typer.Typer(help="Upload local sessions and notes to the server")
@push_app.callback(invoke_without_command=True)
def push(
quiet: bool = typer.Option(False, "--quiet", help="Suppress stdout"),
as_json: bool = typer.Option(False, "--json", help="Machine-readable output"),
dry_run: bool = typer.Option(False, "--dry-run", help="List what would upload, don't send"),
):
server_url = get_server_url()
token = load_token()
if not server_url or not token:
typer.echo(render_error(0, {"detail": {
"kind": "auth_failed",
"hint": "No server/token configured. Run: agnes init",
}}), err=True)
raise typer.Exit(1)
workspace = Path(os.environ.get("DA_LOCAL_DIR", ".")).resolve()
sessions_dir = workspace / "user" / "sessions"
local_md = workspace / ".claude" / "CLAUDE.local.md"
sessions = []
if sessions_dir.exists():
sessions = sorted(sessions_dir.glob("*.jsonl"))
has_local_md = local_md.exists()
summary = {"sessions_count": len(sessions), "local_md": has_local_md, "uploaded": 0}
if dry_run:
if as_json:
typer.echo(json.dumps(summary))
else:
typer.echo(f"Would upload: {len(sessions)} sessions, local_md={has_local_md}")
return
for session_file in sessions:
try:
with open(session_file, "rb") as fh:
resp = api_post("/api/upload/sessions", files={"file": (session_file.name, fh)})
if resp.status_code == 200:
summary["uploaded"] += 1
except Exception as exc:
if not quiet:
typer.echo(f"warn: failed to upload {session_file.name}: {exc}", err=True)
if has_local_md:
try:
with open(local_md, "rb") as fh:
api_post("/api/upload/local-md", files={"file": (local_md.name, fh)})
except Exception as exc:
if not quiet:
typer.echo(f"warn: failed to upload CLAUDE.local.md: {exc}", err=True)
if as_json:
typer.echo(json.dumps(summary))
elif not quiet:
typer.echo(f"Uploaded {summary['uploaded']} sessions, local_md={has_local_md}")
- Step 4: Run tests to verify they pass
pytest tests/test_cli_push.py -v
- Step 5: Commit
git add cli/commands/push.py tests/test_cli_push.py
git commit -m "feat(cli): agnes push command (extracted from sync --upload-only)"
Task 11: agnes init — workspace bootstrap orchestrator
Files:
-
Create:
cli/commands/init.py,config/agnes_workspace_template.txt -
Test:
tests/test_cli_init.py(new) -
Step 1: Write the AGNES_WORKSPACE.md template
Create config/agnes_workspace_template.txt:
# Agnes analyst workspace
**Created:** {created_at}
**Server:** {server_url}
**Workspace:** {workspace_path}
This file documents what `agnes init` installed on this machine and in this folder.
Read this when you want to know "what is this thing", "how does it work", or
"how do I uninstall it". For Claude Code's instructions, see `CLAUDE.md`.
---
## What's installed (global, per-user)
| Path | What it is | How to remove |
|------|------------|---------------|
| `~/.local/bin/agnes` | The `agnes` CLI binary | `uv tool uninstall agnes-the-ai-analyst` |
| `~/.config/da/config.yaml` | Default Agnes server URL | `rm -rf ~/.config/da/` |
| `~/.config/da/token.json` | Personal access token (PAT) | `rm ~/.config/da/token.json` |
| `~/.agnes/ca.pem` | Server's CA cert (private CA installs only) | `rm ~/.agnes/ca.pem` |
| `~/.agnes/ca-bundle.pem` | Combined system + Agnes CA bundle | `rm ~/.agnes/ca-bundle.pem` |
| `~/.zshrc` / `~/.bashrc` block (marker `AGNES_CA_PEM_TRUST`) | `PATH` + `SSL_CERT_FILE` env | Edit rc, remove block |
---
## What's in this folder
| Path | What it is |
|------|------------|
| `./CLAUDE.md` | Rules + golden path for Claude Code (fetched from server's `/api/welcome`) |
| `./AGNES_WORKSPACE.md` | This file |
| `./.claude/settings.json` | Claude Code config: model, permissions, hooks |
| `./.claude/CLAUDE.local.md` | Your private notes (uploaded on session end) |
| `./.claude/rules/km_*.md` | Server-pushed corporate-knowledge rules (only when granted) |
| `./server/parquet/*.parquet` | Synced data — RBAC-filtered subset (only when grants exist) |
| `./user/duckdb/analytics.duckdb` | DuckDB views over the parquets — what `agnes query` reads |
| `./user/snapshots/*.parquet` | Ad-hoc materialized snapshots from `agnes snapshot create` |
| `./user/sessions/*.jsonl` | Captured Claude Code sessions (uploaded on session end) |
Some folders only exist when they have content — `agnes pull` and `agnes push`
only create them when there's something to write.
---
## How it stays fresh
Two hooks in `./.claude/settings.json` keep this workspace in sync without
you doing anything:
- **SessionStart** → `agnes pull --quiet` — new parquets, schema changes, and
updated rules pull down before Claude Code answers. Failure is silent;
your session continues with the last-known data.
- **SessionEnd** → `agnes push --quiet` — your session transcript and
`CLAUDE.local.md` ship to the server.
Both are workspace-scoped — they only run when Claude Code opens this folder.
---
## Cheat sheet
```bash
# Tables you can read (server-side catalog, RBAC-filtered)
agnes catalog
agnes catalog --json | jq '.[] | select(.query_mode=="local")'
# Schema and sample
agnes schema opportunity
agnes describe opportunity -n 10
# Run a SQL query (DuckDB flavor against local parquets)
agnes query "SELECT count(*) FROM opportunity WHERE stage='Closed Won'"
# Remote BigQuery query (server-side, no local materialization)
agnes query --remote "SELECT count(*) FROM web_sessions_example"
# Materialize a remote subset locally
agnes snapshot create web_sessions_example \
--select event_date,country_code \
--where "event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)" \
--as recent_sessions
# Manual data refresh (the SessionStart hook does this automatically)
agnes pull
# Workspace status (what's synced, when)
agnes status
# Re-generate this workspace from scratch (preserves CLAUDE.local.md)
agnes init --server-url https://agnes.example.com --token <PAT> --force
Uninstall
# 1. Remove the CLI
uv tool uninstall agnes-the-ai-analyst
# 2. Remove global config and trust artifacts
rm -rf ~/.config/da
rm -rf ~/.agnes
# 3. Remove the env-var block from your shell rc
# Open ~/.zshrc or ~/.bashrc, find the lines between
# "# AGNES_CA_PEM_TRUST — added by Agnes setup" and the next blank line, delete.
# 4. Remove this workspace
rm -rf ./CLAUDE.md ./AGNES_WORKSPACE.md ./.claude ./server ./user
- [ ] **Step 2: Write failing tests for `agnes init`**
```python
# tests/test_cli_init.py
"""Tests for `agnes init` orchestrator command."""
import json
from pathlib import Path
from unittest.mock import patch
from typer.testing import CliRunner
from cli.commands.init import init_app
runner = CliRunner()
def test_init_help():
result = runner.invoke(init_app, ["--help"])
assert result.exit_code == 0
assert "--server-url" in result.output
assert "--token" in result.output
assert "--force" in result.output
assert "--workspace" in result.output
def test_init_writes_expected_files(tmp_path, monkeypatch):
"""Mocked end-to-end: init writes CLAUDE.md, settings.json, AGNES_WORKSPACE.md."""
# Mock all server-side calls
def _api_get(path, *args, **kwargs):
from unittest.mock import MagicMock
resp = MagicMock()
resp.status_code = 200
if path == "/api/catalog/tables":
resp.json.return_value = []
elif path == "/api/welcome":
resp.json.return_value = {"content": "# Test CLAUDE.md\n\nUse `agnes pull`.\n"}
elif path == "/api/sync/manifest":
resp.json.return_value = {"tables": []}
elif path == "/api/memory/bundle":
resp.json.return_value = {"mandatory": [], "approved": []}
return resp
monkeypatch.setattr("cli.commands.init.api_get", _api_get, raising=False)
monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
result = runner.invoke(init_app, [
"--server-url", "http://test.example.com",
"--token", "test-pat",
"--workspace", str(tmp_path),
])
assert result.exit_code == 0, result.output
assert (tmp_path / "CLAUDE.md").exists()
assert "agnes pull" in (tmp_path / "CLAUDE.md").read_text()
assert (tmp_path / ".claude" / "settings.json").exists()
assert (tmp_path / ".claude" / "CLAUDE.local.md").exists()
assert (tmp_path / "AGNES_WORKSPACE.md").exists()
assert (tmp_path / "user" / "duckdb" / "analytics.duckdb").exists()
def test_init_no_dead_dirs_zero_grants(tmp_path, monkeypatch):
"""Zero grants → no .claude/rules, no server/parquet, no user/sessions."""
# Same mock as above
def _api_get(path, *args, **kwargs):
from unittest.mock import MagicMock
resp = MagicMock()
resp.status_code = 200
if path == "/api/catalog/tables":
resp.json.return_value = []
elif path == "/api/welcome":
resp.json.return_value = {"content": "test"}
elif path == "/api/sync/manifest":
resp.json.return_value = {"tables": []}
elif path == "/api/memory/bundle":
resp.json.return_value = {"mandatory": [], "approved": []}
return resp
monkeypatch.setattr("cli.commands.init.api_get", _api_get, raising=False)
monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
runner.invoke(init_app, [
"--server-url", "http://x", "--token", "t", "--workspace", str(tmp_path),
])
for forbidden in ["data/parquet", "data/duckdb", "data/metadata",
"user/artifacts", "user/sessions",
"server/parquet", ".claude/rules"]:
assert not (tmp_path / forbidden).exists(), f"forbidden created: {forbidden}"
def test_init_force_preserves_local_md(tmp_path, monkeypatch):
"""--force regenerates CLAUDE.md but never touches CLAUDE.local.md."""
# First init
def _api_get(path, *args, **kwargs):
from unittest.mock import MagicMock
resp = MagicMock()
resp.status_code = 200
if path == "/api/catalog/tables": resp.json.return_value = []
elif path == "/api/welcome": resp.json.return_value = {"content": "v1"}
else: resp.json.return_value = {} if "manifest" in path else {"mandatory": [], "approved": []}
return resp
monkeypatch.setattr("cli.commands.init.api_get", _api_get, raising=False)
monkeypatch.setattr("cli.lib.pull.api_get", _api_get, raising=False)
runner.invoke(init_app, ["--server-url", "http://x", "--token", "t", "--workspace", str(tmp_path)])
(tmp_path / ".claude" / "CLAUDE.local.md").write_text("# my notes")
# Second init with --force
runner.invoke(init_app, ["--server-url", "http://x", "--token", "t",
"--workspace", str(tmp_path), "--force"])
assert "my notes" in (tmp_path / ".claude" / "CLAUDE.local.md").read_text()
- Step 3: Run tests to verify they fail
pytest tests/test_cli_init.py -v
- Step 4: Create
cli/commands/init.py
# cli/commands/init.py
"""`agnes init` — bootstrap an analyst workspace.
Single-paste flow: web user clicks "Generate prompt" on /setup?role=analyst,
pastes into Claude Code in an empty folder; Claude runs `agnes init` (among other
steps). Non-interactive: --token + --server-url required.
"""
from __future__ import annotations
import json
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional
import typer
from cli.client import api_get
from cli.config import save_config, save_token
from cli.error_render import render_error
from cli.lib.hooks import install_claude_hooks
from cli.lib.pull import run_pull, PullResult
_INIT_MARKER = "AI Data Analyst" # Detect existing workspace via CLAUDE.md substring
init_app = typer.Typer(help="Bootstrap an analyst workspace in this directory")
@init_app.callback(invoke_without_command=True)
def init(
server_url: str = typer.Option(..., "--server-url", help="Agnes server URL"),
token: str = typer.Option(..., "--token", help="Personal access token"),
force: bool = typer.Option(False, "--force", help="Re-initialize an existing workspace"),
workspace_str: Optional[str] = typer.Option(None, "--workspace", help="Target dir (default: cwd)"),
):
"""Bootstrap workspace: auth, CLAUDE.md, hooks, first pull, AGNES_WORKSPACE.md."""
workspace = Path(workspace_str).resolve() if workspace_str else Path.cwd()
server_url = server_url.rstrip("/")
# Detect existing workspace
claude_md = workspace / "CLAUDE.md"
if claude_md.exists() and _INIT_MARKER in claude_md.read_text() and not force:
typer.echo(render_error(0, {"detail": {
"kind": "partial_state",
"hint": "Workspace already initialized. Re-run with --force to redo.",
}}), err=True)
raise typer.Exit(1)
# Step 1: Verify PAT via /api/catalog/tables (PAT-validating endpoint)
try:
resp = api_get("/api/catalog/tables", server_url=server_url, token=token)
if resp.status_code == 401:
typer.echo(render_error(401, {"detail": {
"kind": "auth_failed",
"hint": f"Token expired or invalid — get a fresh one at {server_url}/setup?role=analyst",
}}), err=True)
raise typer.Exit(1)
resp.raise_for_status()
except typer.Exit:
raise
except Exception as exc:
typer.echo(render_error(0, {"detail": {
"kind": "server_unreachable",
"hint": f"Cannot reach {server_url} — check network or server status",
"message": str(exc),
}}), err=True)
raise typer.Exit(1)
# Step 2: Save config + token globally
save_config({"server": server_url})
save_token(token, email="") # email empty — JWT carries it; we don't decode here
# Step 3: Fetch CLAUDE.md from /api/welcome (server-rendered, RBAC-filtered)
welcome_resp = api_get("/api/welcome", server_url=server_url, token=token,
params={"server_url": server_url})
welcome_resp.raise_for_status()
workspace.mkdir(parents=True, exist_ok=True)
claude_md.write_text(welcome_resp.json()["content"], encoding="utf-8")
# Step 4: Default settings.json + install hooks
settings_path = workspace / ".claude" / "settings.json"
if not settings_path.exists():
settings_path.parent.mkdir(parents=True, exist_ok=True)
settings_path.write_text(json.dumps(
{"model": "sonnet", "permissions": {"allow": ["Read", "Bash", "Grep", "Glob"]}},
indent=2,
))
install_claude_hooks(workspace)
# Step 5: CLAUDE.local.md stub (preserved on re-run)
local_md = workspace / ".claude" / "CLAUDE.local.md"
if not local_md.exists():
local_md.write_text(
"# My Notes\n\nPersonal notes for this workspace. Uploaded on `agnes push`.\n",
encoding="utf-8",
)
# Step 6: First pull
try:
result: PullResult = run_pull(server_url, token, workspace)
except Exception as exc:
typer.echo(render_error(0, {"detail": {
"kind": "manifest_unauthorized",
"hint": "Initial pull failed — workspace partially set up",
"message": str(exc),
}}), err=True)
raise typer.Exit(1)
# Step 7: Write AGNES_WORKSPACE.md from client-side template
here = Path(__file__).parent
template_path = here.parent.parent / "config" / "agnes_workspace_template.txt"
if template_path.exists():
template = template_path.read_text(encoding="utf-8")
else:
template = "# Agnes workspace\n\nCreated: {created_at}\nServer: {server_url}\n"
workspace_md = (template
.replace("{created_at}", datetime.now(timezone.utc).isoformat())
.replace("{server_url}", server_url)
.replace("{workspace_path}", str(workspace)))
(workspace / "AGNES_WORKSPACE.md").write_text(workspace_md, encoding="utf-8")
# Step 8: Summary
typer.echo("Workspace ready.")
typer.echo(f" Server : {server_url}")
typer.echo(f" Tables : {result.tables_updated} synced ({result.parquets_total} total)")
typer.echo(f" Rules : {result.rules_count}")
typer.echo(f" Workspace: {workspace}")
typer.echo("")
typer.echo("Try: agnes catalog")
- Step 5: Run tests to verify they pass
pytest tests/test_cli_init.py -v
- Step 6: Commit
git add cli/commands/init.py config/agnes_workspace_template.txt tests/test_cli_init.py
git commit -m "feat(cli): agnes init orchestrator + AGNES_WORKSPACE.md template"
Task 12: New agnes status (workspace status, replaces da analyst status)
Files:
-
Modify (overwrite):
cli/commands/status.py -
Test:
tests/test_cli_status.py(new) -
Step 1: Read existing status.py to understand what to replace
cat cli/commands/status.py
The existing agnes status shows server health. Per spec, this content moves to agnes diagnose system (Task 13); the file is repurposed for workspace status.
- Step 2: Write failing tests
# tests/test_cli_status.py
from pathlib import Path
from typer.testing import CliRunner
from cli.commands.status import status_app
runner = CliRunner()
def test_status_uninitialized_workspace(tmp_path, monkeypatch):
monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
result = runner.invoke(status_app)
assert result.exit_code in (0, 1)
assert "not initialized" in result.output.lower() or "no workspace" in result.output.lower()
def test_status_initialized_workspace(tmp_path, monkeypatch):
"""A bootstrapped workspace shows 'initialized: yes' and basic stats."""
monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
(tmp_path / "CLAUDE.md").write_text("# AI Data Analyst\n")
(tmp_path / "user" / "duckdb").mkdir(parents=True)
(tmp_path / "user" / "duckdb" / "analytics.duckdb").touch()
(tmp_path / "server" / "parquet").mkdir(parents=True)
(tmp_path / "server" / "parquet" / "tbl1.parquet").touch()
result = runner.invoke(status_app)
assert result.exit_code == 0
assert "initialized" in result.output.lower()
assert "1" in result.output # one parquet
def test_status_json(tmp_path, monkeypatch):
monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
(tmp_path / "CLAUDE.md").write_text("# AI Data Analyst\n")
result = runner.invoke(status_app, ["--json"])
assert result.exit_code == 0
import json
body = json.loads(result.output)
assert "workspace" in body and "initialized" in body
- Step 3: Run tests to verify they fail
pytest tests/test_cli_status.py -v
- Step 4: Overwrite
cli/commands/status.py
# cli/commands/status.py
"""`agnes status` — workspace status: initialized? data fresh? hooks active?"""
from __future__ import annotations
import json
import os
from datetime import datetime, timezone
from pathlib import Path
import typer
_INIT_MARKER = "AI Data Analyst"
status_app = typer.Typer(help="Workspace status (was `da analyst status`)")
@status_app.callback(invoke_without_command=True)
def status(
as_json: bool = typer.Option(False, "--json", help="Machine-readable output"),
):
workspace = Path(os.environ.get("DA_LOCAL_DIR", ".")).resolve()
initialized = False
claude_md = workspace / "CLAUDE.md"
if claude_md.exists():
initialized = _INIT_MARKER in claude_md.read_text()
parquet_dir = workspace / "server" / "parquet"
parquets = list(parquet_dir.glob("*.parquet")) if parquet_dir.exists() else []
db_path = workspace / "user" / "duckdb" / "analytics.duckdb"
last_synced = None
if db_path.exists():
last_synced = datetime.fromtimestamp(db_path.stat().st_mtime, tz=timezone.utc).isoformat()
sessions_dir = workspace / "user" / "sessions"
session_count = len(list(sessions_dir.glob("*.jsonl"))) if sessions_dir.exists() else 0
info = {
"workspace": str(workspace),
"initialized": initialized,
"parquet_tables": len(parquets),
"duckdb_exists": db_path.exists(),
"last_synced": last_synced,
"sessions_pending_upload": session_count,
}
if as_json:
typer.echo(json.dumps(info, indent=2))
return
typer.echo(f"Workspace : {workspace}")
typer.echo(f"Initialized: {'yes' if initialized else 'no'}")
typer.echo(f"Parquets : {info['parquet_tables']}")
typer.echo(f"DuckDB : {'yes' if info['duckdb_exists'] else 'no'}")
typer.echo(f"Last sync : {last_synced or 'never'}")
typer.echo(f"Pending uploads: {session_count} sessions")
if not initialized:
typer.echo("")
typer.echo("Run `agnes init --server-url <URL> --token <PAT>` to bootstrap.")
- Step 5: Run tests to verify they pass
pytest tests/test_cli_status.py -v
- Step 6: Commit
git add cli/commands/status.py tests/test_cli_status.py
git commit -m "feat(cli): agnes status now shows workspace state (was system health)"
Task 13: Move old agnes status content into agnes diagnose system
Files:
-
Modify:
cli/commands/diagnose.py(addsystemsubcommand with the old status logic) -
Test:
tests/test_cli_diagnose_system.py(new) -
Step 1: Recover old
agnes statuslogic
git show HEAD~12:cli/commands/status.py
(Adjust the ref — find the commit before the rewrite via git log --oneline cli/commands/status.py | head.) Save the body to a scratch file.
- Step 2: Read existing diagnose command structure
cat cli/commands/diagnose.py
- Step 3: Add
systemsubcommand
Append the old status logic as a system subcommand of diagnose_app. Keep diagnose's existing default behavior (overall health) intact.
- Step 4: Test
# tests/test_cli_diagnose_system.py
from typer.testing import CliRunner
from cli.commands.diagnose import diagnose_app
def test_diagnose_system_help():
runner = CliRunner()
result = runner.invoke(diagnose_app, ["system", "--help"])
assert result.exit_code == 0
- Step 5: Commit
git add cli/commands/diagnose.py tests/test_cli_diagnose_system.py
git commit -m "refactor(cli): move old `agnes status` health check to `agnes diagnose system`"
Task 14: agnes snapshot create — fold da fetch into snapshot group
Files:
-
Modify:
cli/commands/snapshot.py(addcreatesubcommand) -
Test:
tests/test_cli_snapshot_create.py(new) -
Step 1: Read existing fetch.py and snapshot.py
cat cli/commands/fetch.py
cat cli/commands/snapshot.py
- Step 2: Add
createsubcommand tosnapshot_app
Move the body of fetch.py:fetch() into a new @snapshot_app.command("create"). Keep all flags. Update the existence check:
local_db = _local_dir() / "user" / "duckdb" / "analytics.duckdb"
if not local_db.exists():
typer.echo("Local DuckDB not found. Run: agnes pull first.", err=True)
raise typer.Exit(1)
# (then proceed with duckdb.connect — no longer creates an empty DB)
- Step 3: Add tests
# tests/test_cli_snapshot_create.py
from typer.testing import CliRunner
from cli.commands.snapshot import snapshot_app
def test_snapshot_create_help():
runner = CliRunner()
result = runner.invoke(snapshot_app, ["create", "--help"])
assert result.exit_code == 0
for flag in ["--select", "--where", "--limit", "--order-by", "--as", "--estimate", "--no-estimate", "--force"]:
assert flag in result.output
def test_snapshot_create_no_duckdb_friendly_exit(tmp_path, monkeypatch):
monkeypatch.setenv("DA_LOCAL_DIR", str(tmp_path))
runner = CliRunner()
result = runner.invoke(snapshot_app, ["create", "any_table", "--as", "x", "--estimate"])
assert result.exit_code == 1
assert "Run: agnes pull" in result.output or "Run: agnes pull" in (result.stderr or "")
- Step 4: Run tests
pytest tests/test_cli_snapshot_create.py -v
- Step 5: Commit
git add cli/commands/snapshot.py tests/test_cli_snapshot_create.py
git commit -m "feat(cli): agnes snapshot create (folded from da fetch); friendly exit if no DuckDB"
Task 15: agnes catalog --metrics — fold da metrics list/show
Files:
-
Modify:
cli/commands/catalog.py -
Test:
tests/test_cli_catalog_metrics.py(new) -
Step 1: Read existing metrics list/show logic
grep -n "def list\|def show\|@" cli/commands/metrics.py | head
- Step 2: Add
--metricsand--metrics --show <id>to catalog
Modify cli/commands/catalog.py:
@catalog_app.callback(invoke_without_command=True)
def catalog(
as_json: bool = typer.Option(False, "--json"),
metrics: bool = typer.Option(False, "--metrics", help="Show metric definitions instead of tables"),
show: Optional[str] = typer.Option(None, "--show", help="With --metrics: show one metric by id"),
):
if metrics and show:
return _show_one_metric(show, as_json)
if metrics:
return _list_metrics(as_json)
return _list_tables(as_json)
(_list_metrics and _show_one_metric lift from metrics.py:list_metrics and metrics.py:show_metric.)
- Step 3: Add tests
# tests/test_cli_catalog_metrics.py
from typer.testing import CliRunner
from cli.commands.catalog import catalog_app
def test_catalog_metrics_help():
runner = CliRunner()
result = runner.invoke(catalog_app, ["--help"])
assert result.exit_code == 0
assert "--metrics" in result.output
assert "--show" in result.output
- Step 4: Commit
git add cli/commands/catalog.py tests/test_cli_catalog_metrics.py
git commit -m "feat(cli): agnes catalog --metrics replaces da metrics list/show"
Task 16: Move da metrics import/export/validate to agnes admin metrics
Files:
-
Create:
cli/commands/admin_metrics.py -
Modify:
cli/commands/admin.py(register the sub-Typer) -
Step 1: Create
cli/commands/admin_metrics.py
Lift import_metrics, export_metrics, validate_metrics from cli/commands/metrics.py. Wrap in a sub-Typer:
# cli/commands/admin_metrics.py
"""`agnes admin metrics {import,export,validate}` — lifted from metrics.py."""
import typer
admin_metrics_app = typer.Typer(help="Admin: metric definition management")
@admin_metrics_app.command("import")
def import_metrics(directory: str = typer.Argument(...)):
# ... lifted logic from cli/commands/metrics.py:import_metrics
pass
@admin_metrics_app.command("export")
def export_metrics(target: str = typer.Argument(...)):
# ... lifted logic
pass
@admin_metrics_app.command("validate")
def validate_metrics():
# ... lifted logic
pass
(Copy the actual implementations from metrics.py verbatim.)
- Step 2: Register in admin app
In cli/commands/admin.py, add:
from cli.commands.admin_metrics import admin_metrics_app
admin_app.add_typer(admin_metrics_app, name="metrics")
- Step 3: Test
# tests/test_cli_admin_metrics.py
from typer.testing import CliRunner
from cli.commands.admin import admin_app
def test_admin_metrics_subcommands_present():
runner = CliRunner()
result = runner.invoke(admin_app, ["metrics", "--help"])
assert result.exit_code == 0
assert "import" in result.output
assert "export" in result.output
assert "validate" in result.output
- Step 4: Commit
git add cli/commands/admin_metrics.py cli/commands/admin.py tests/test_cli_admin_metrics.py
git commit -m "feat(cli): agnes admin metrics {import,export,validate}"
Phase 4 — Wiring + cleanup
Task 17: Update reader hint texts
Files:
-
Modify:
cli/commands/query.py(two occurrences),cli/commands/explore.py -
Step 1: Find all "Run: da sync" strings
grep -rn "Run: da sync" cli/
- Step 2: Replace with "Run: agnes pull"
sed -i.bak 's/Run: da sync/Run: agnes pull/g' cli/commands/query.py cli/commands/explore.py
rm cli/commands/query.py.bak cli/commands/explore.py.bak
- Step 3: Verify no leftover
grep -rn "Run: da sync" cli/
Expected: no matches.
- Step 4: Commit
git add cli/commands/query.py cli/commands/explore.py
git commit -m "fix(cli): hint text 'Run: da sync' → 'Run: agnes pull'"
Task 18: Update cli/main.py registrations + delete obsolete commands
Files:
-
Modify:
cli/main.py -
Delete:
cli/commands/sync.py,cli/commands/fetch.py,cli/commands/analyst.py,cli/commands/metrics.py -
Step 1: Update
cli/main.py
Replace lines 11-28 (imports) and 91-109 (registrations):
from cli.commands.auth import auth_app
from cli.commands.init import init_app
from cli.commands.pull import pull_app
from cli.commands.push import push_app
from cli.commands.query import query_command
from cli.commands.status import status_app
from cli.commands.admin import admin_app
from cli.commands.diagnose import diagnose_app
from cli.commands.skills import skills_app
from cli.commands.setup import setup_app
from cli.commands.server import server_app
from cli.commands.explore import explore_app
from cli.commands.catalog import catalog_app
from cli.commands.schema import schema_app
from cli.commands.describe import describe_app
from cli.commands.snapshot import snapshot_app
from cli.commands.disk_info import disk_info_app
# Register subcommands
app.add_typer(auth_app, name="auth")
app.add_typer(init_app, name="init")
app.add_typer(pull_app, name="pull")
app.add_typer(push_app, name="push")
app.command("query")(query_command)
app.add_typer(status_app, name="status")
app.add_typer(admin_app, name="admin")
app.add_typer(diagnose_app, name="diagnose")
app.add_typer(skills_app, name="skills")
app.add_typer(setup_app, name="setup")
app.add_typer(server_app, name="server")
app.add_typer(explore_app, name="explore")
app.add_typer(catalog_app, name="catalog")
app.add_typer(schema_app, name="schema")
app.add_typer(describe_app, name="describe")
app.add_typer(snapshot_app, name="snapshot")
app.add_typer(disk_info_app, name="disk-info")
- Step 2: Delete obsolete files
git rm cli/commands/sync.py cli/commands/fetch.py cli/commands/analyst.py cli/commands/metrics.py
- Step 3: Verify no other code imports them
grep -rn "from cli.commands.sync\|from cli.commands.fetch\|from cli.commands.analyst\|from cli.commands.metrics" .
Expected: no matches (anything found needs to be updated to use the new homes).
- Step 4: Run the full test suite (smoke)
pytest tests/ -x --ignore=tests/test_clean_install_integration.py --ignore=tests/test_reader_smoke_matrix.py 2>&1 | tail -30
Expected: tests for moved/deleted commands fail with import errors — those tests are also being deleted (or already updated in earlier tasks). Other tests should pass.
If old test files reference the deleted commands, git rm them too:
git rm tests/test_analyst*.py tests/test_sync*.py tests/test_fetch*.py tests/test_metrics_cli*.py 2>/dev/null || true
- Step 5: Commit
git add cli/main.py
git rm cli/commands/{sync,fetch,analyst,metrics}.py 2>/dev/null
git commit -m "refactor(cli): drop sync/fetch/analyst/metrics; register init/pull/push"
Task 19: Update repo-root CLAUDE.md
Files:
-
Modify:
CLAUDE.md -
Step 1: Apply systematic rewrites
sed -i.bak \
-e 's|da sync --upload-only|agnes push|g' \
-e 's|da sync|agnes pull|g' \
-e 's|da analyst setup|agnes init|g' \
-e 's|da fetch|agnes snapshot create|g' \
-e 's|da metrics list|agnes catalog --metrics|g' \
-e 's|da metrics show|agnes catalog --metrics --show|g' \
-e 's|da metrics import|agnes admin metrics import|g' \
-e 's|data/parquet/|server/parquet/|g' \
-e 's|data/duckdb/|user/duckdb/|g' \
CLAUDE.md
rm CLAUDE.md.bak
- Step 2: Manually rewrite the "Local sync & Claude Code hooks" subsection
Find the section. Replace the surrounding prose so it describes agnes pull + agnes push hooks:
### Local sync & Claude Code hooks
`agnes pull` is the canonical analyst-side distribution path: pulls the
RBAC-filtered manifest from the server, downloads parquets whose MD5 changed
(skipping `query_mode='remote'` rows), rebuilds local DuckDB views over them.
`agnes push` mirrors it for the upload direction (sessions, CLAUDE.local.md).
`agnes init` writes two hooks into `<workspace>/.claude/settings.json`:
- `SessionStart` → `agnes pull --quiet` — pulls fresh parquets at the start of every Claude Code session
- `SessionEnd` → `agnes push --quiet` — uploads session jsonl + `CLAUDE.local.md` to the server
Both pass `--quiet` so they don't pollute Claude Code stdout, and trail with `|| true` so a server outage never blocks a session. Workspace-level (not user-home) so the hooks fire only when Claude Code opens this analyst workspace, not in unrelated sessions on the same machine.
Admin RBAC for auto-sync: `query_mode IN ('local', 'materialized')` plus a `resource_grants` row for one of the analyst's groups → table appears in their manifest → `agnes pull` downloads it. No per-user sync config; the admin layer is the single source of truth.
- Step 3: Verify no leftover legacy strings
grep -nE 'da sync|da fetch|da analyst|da metrics list|da metrics show|data/parquet/|data/duckdb/' CLAUDE.md
Expected: no matches.
- Step 4: Commit
git add CLAUDE.md
git commit -m "docs(claude-md): rewrite verbs + paths for new CLI surface"
Phase 5 — Test fixtures and integration tests
Task 20: Create tests/fixtures/analyst_bootstrap.py
Files:
-
Create:
tests/fixtures/analyst_bootstrap.py -
Modify:
tests/conftest.py(import the fixtures) -
Step 1: Read existing test infrastructure
grep -n "fastapi\|TestClient\|tmp_path" tests/conftest.py | head -30
- Step 2: Create the fixtures
# tests/fixtures/analyst_bootstrap.py
"""Test fixtures for the clean-bootstrap test suite.
Per spec §"Test fixtures":
- fastapi_test_server, test_pat, test_pat_no_grants, zero_grants_workspace,
web_session, client.
"""
from __future__ import annotations
import json
import subprocess
import threading
import time
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator
import httpx
import pytest
import uvicorn
NONEXISTENT_TABLE = "__nonexistent__" # Sentinel for reader smoke matrix
class _ServerHandle:
def __init__(self, url: str, server: uvicorn.Server, thread: threading.Thread):
self.url = url
self._server = server
self._thread = thread
def shutdown(self):
self._server.should_exit = True
self._thread.join(timeout=5)
def _seed_db(data_dir: Path):
"""Initialize a fresh system.duckdb with seeded admin/analyst users + tables.
Imports app modules at function scope to avoid circular imports during
collection.
"""
import os
os.environ["DATA_DIR"] = str(data_dir)
from src.db import get_db_connection
from src.repositories.users import UserRepository
from src.repositories.user_groups import UserGroupRepository
from src.repositories.table_registry import TableRegistryRepository
from src.repositories.user_group_members import UserGroupMembersRepository
from src.repositories.resource_grants import ResourceGrantsRepository
from app.auth.providers.password import _hash_password
conn = get_db_connection()
# Seed users
user_repo = UserRepository(conn)
admin_id = user_repo.create(email="admin@example.com", name="Admin",
password_hash=_hash_password("test-password"),
is_admin=True)
analyst_id = user_repo.create(email="analyst@example.com", name="Analyst",
password_hash=_hash_password("analyst-pw"),
is_admin=False)
# Seed groups (Admin + Everyone are seeded as is_system=TRUE on first run)
grp_repo = UserGroupRepository(conn)
admin_group = grp_repo.find_by_name("Admin")
everyone_group = grp_repo.find_by_name("Everyone")
# Memberships
members = UserGroupMembersRepository(conn)
members.add(user_id=admin_id, group_id=admin_group.id, source="system_seed")
members.add(user_id=analyst_id, group_id=everyone_group.id, source="system_seed")
# Tables
tbl_repo = TableRegistryRepository(conn)
tbl_repo.create(id="local_tbl", name="local_tbl", source_type="keboola",
bucket="test", source_table="local_tbl", query_mode="local")
tbl_repo.create(id="materialized_tbl", name="materialized_tbl", source_type="bigquery",
bucket="test", source_table="materialized_tbl", query_mode="materialized")
tbl_repo.create(id="remote_tbl", name="remote_tbl", source_type="bigquery",
bucket="test", source_table="remote_tbl", query_mode="remote")
return {"admin_id": admin_id, "analyst_id": analyst_id,
"admin_group_id": admin_group.id, "everyone_group_id": everyone_group.id}
@pytest.fixture
def fastapi_test_server(tmp_path) -> Iterator[_ServerHandle]:
"""Boot a real FastAPI server in a background thread against tmp_path DATA_DIR."""
data_dir = tmp_path / "agnes-data"
data_dir.mkdir()
seeded = _seed_db(data_dir)
handle_port = 18712 + (id(tmp_path) % 1000)
from app.main import app as fastapi_app
config = uvicorn.Config(fastapi_app, host="127.0.0.1", port=handle_port, log_level="warning")
server = uvicorn.Server(config)
thread = threading.Thread(target=server.run, daemon=True)
thread.start()
# Wait for server up
url = f"http://127.0.0.1:{handle_port}"
for _ in range(50):
try:
httpx.get(f"{url}/api/health", timeout=0.2)
break
except Exception:
time.sleep(0.1)
else:
pytest.fail("fastapi_test_server failed to start")
handle = _ServerHandle(url, server, thread)
handle._seeded = seeded
handle.NONEXISTENT_TABLE = NONEXISTENT_TABLE
yield handle
handle.shutdown()
@pytest.fixture
def web_session(fastapi_test_server) -> Iterator[httpx.Client]:
"""Authenticated httpx.Client using cookie session for admin@example.com."""
client = httpx.Client(base_url=fastapi_test_server.url, follow_redirects=False)
resp = client.post("/auth/password/login/web",
data={"email": "admin@example.com", "password": "test-password"})
assert resp.status_code in (200, 302), f"web_session login failed: {resp.text}"
yield client
client.close()
@pytest.fixture
def test_pat(web_session) -> str:
"""Mint a PAT for analyst@example.com with 2 grants + 2 mandatory rules."""
# First, grant the analyst access to local_tbl + materialized_tbl
web_session.post("/api/admin/grants",
json={"group_id": "...everyone...", "resource_type": "table",
"resource_id": "local_tbl"})
# ... similarly for materialized_tbl + 2 mandatory memory items
# (Use the actual admin endpoints for grants and memory items.)
# Mint PAT (as analyst — log in as analyst first, then mint)
analyst_session = httpx.Client(base_url=web_session.base_url, follow_redirects=False)
analyst_session.post("/auth/password/login/web",
data={"email": "analyst@example.com", "password": "analyst-pw"})
resp = analyst_session.post("/auth/tokens",
json={"name": "test", "ttl_seconds": 3600})
assert resp.status_code == 201, resp.text
return resp.json()["token"]
@pytest.fixture
def test_pat_no_grants(web_session) -> str:
analyst_session = httpx.Client(base_url=web_session.base_url, follow_redirects=False)
analyst_session.post("/auth/password/login/web",
data={"email": "analyst@example.com", "password": "analyst-pw"})
resp = analyst_session.post("/auth/tokens",
json={"name": "test-nogrants", "ttl_seconds": 3600})
return resp.json()["token"]
@pytest.fixture
def zero_grants_workspace(tmp_path, fastapi_test_server, test_pat_no_grants) -> Path:
"""Run `agnes init` against a no-grants PAT; return the workspace path."""
workspace = tmp_path / "workspace"
workspace.mkdir()
result = subprocess.run([
"da", "init",
"--server-url", fastapi_test_server.url,
"--token", test_pat_no_grants,
"--workspace", str(workspace),
], capture_output=True, text=True)
assert result.returncode == 0, f"init failed: {result.stderr}"
return workspace
(Adapt API calls as needed — the actual repository/route names may differ. Spec the goal; implementer adapts.)
- Step 3: Wire fixtures into conftest
In tests/conftest.py, append:
from tests.fixtures.analyst_bootstrap import (
fastapi_test_server, web_session, test_pat, test_pat_no_grants,
zero_grants_workspace, NONEXISTENT_TABLE,
)
- Step 4: Smoke-test fixture creation
# tests/test_fixtures_smoke.py
def test_server_boots(fastapi_test_server):
import httpx
resp = httpx.get(f"{fastapi_test_server.url}/api/health")
assert resp.status_code == 200
def test_zero_grants_workspace_minimal(zero_grants_workspace):
assert (zero_grants_workspace / "CLAUDE.md").exists()
assert (zero_grants_workspace / "AGNES_WORKSPACE.md").exists()
assert not (zero_grants_workspace / "server" / "parquet").exists()
assert not (zero_grants_workspace / ".claude" / "rules").exists()
pytest tests/test_fixtures_smoke.py -v
- Step 5: Commit
git add tests/fixtures/analyst_bootstrap.py tests/conftest.py tests/test_fixtures_smoke.py
git commit -m "test: clean-bootstrap fixtures (fastapi_test_server, test_pat, etc.)"
Task 21: Reader smoke matrix
Files:
-
Create:
tests/test_reader_smoke_matrix.py -
Step 1: Write the matrix
# tests/test_reader_smoke_matrix.py
"""Reader smoke matrix — every CLI command on a freshly-bootstrapped
zero-grants workspace, asserts no traceback. The load-bearing test for
'nothing crashes on missing dirs'."""
import subprocess
import pytest
from tests.fixtures.analyst_bootstrap import NONEXISTENT_TABLE
READER_COMMANDS = [
["agnes", "catalog"],
["agnes", "catalog", "--metrics"],
["agnes", "schema", NONEXISTENT_TABLE],
["agnes", "describe", NONEXISTENT_TABLE],
["agnes", "query", "SELECT 1"],
["agnes", "explore", NONEXISTENT_TABLE],
["agnes", "disk-info"],
["agnes", "snapshot", "list"],
["agnes", "snapshot", "create", NONEXISTENT_TABLE, "--as", "x", "--estimate"],
["agnes", "status"],
["agnes", "diagnose"],
["agnes", "auth", "whoami"],
["agnes", "skills", "list"],
["agnes", "skills", "show", "agnes-data-querying"],
]
@pytest.mark.parametrize("cmd", READER_COMMANDS, ids=lambda c: " ".join(c))
def test_reader_does_not_crash_on_zero_grants(zero_grants_workspace, cmd):
"""Exit 0 (success) or exit 1 (friendly hint) is OK; tracebacks are forbidden."""
result = subprocess.run(cmd, cwd=zero_grants_workspace,
capture_output=True, text=True, timeout=30)
assert result.returncode in (0, 1), \
f"{cmd} crashed: rc={result.returncode}, stderr={result.stderr}"
assert "Traceback" not in result.stderr, f"{cmd} threw: {result.stderr}"
- Step 2: Run
pytest tests/test_reader_smoke_matrix.py -v
Expected: all parametrized cases PASS.
- Step 3: Commit
git add tests/test_reader_smoke_matrix.py
git commit -m "test: reader smoke matrix on zero-grants workspace"
Task 22: Clean-install integration tests
Files:
-
Create:
tests/test_clean_install_integration.py -
Step 1: Write the integration tests per spec §5.2
# tests/test_clean_install_integration.py
"""End-to-end clean-install integration tests for `agnes init`."""
import json
import subprocess
from pathlib import Path
def assert_no_dead_dirs(workspace: Path):
forbidden_unconditional = ["data/parquet", "data/duckdb", "data/metadata",
"user/artifacts", ".agnes"]
for d in forbidden_unconditional:
assert not (workspace / d).exists(), f"forbidden dir created: {d}"
for d in [".claude/rules", "server/parquet", "user/sessions", "user/snapshots"]:
path = workspace / d
if path.exists():
assert any(path.iterdir()), f"{d} exists but is empty"
def test_clean_install_minimal_grants(fastapi_test_server, tmp_path, test_pat):
workspace = tmp_path / "ws"
workspace.mkdir()
result = subprocess.run([
"da", "init",
"--server-url", fastapi_test_server.url,
"--token", test_pat,
"--workspace", str(workspace),
], capture_output=True, text=True)
assert result.returncode == 0, result.stderr
for must in ["CLAUDE.md", "AGNES_WORKSPACE.md",
".claude/settings.json", ".claude/CLAUDE.local.md",
"user/duckdb/analytics.duckdb"]:
assert (workspace / must).exists(), f"missing: {must}"
parquets = list((workspace / "server" / "parquet").glob("*.parquet"))
assert len(parquets) == 2, "expected 2 parquets (local + materialized grants)"
rules = list((workspace / ".claude" / "rules").iterdir())
assert len(rules) == 2, "expected 2 mandatory rules"
assert_no_dead_dirs(workspace)
settings = json.loads((workspace / ".claude" / "settings.json").read_text())
assert any("agnes pull" in h["hooks"][0]["command"]
for h in settings["hooks"]["SessionStart"])
assert any("agnes push" in h["hooks"][0]["command"]
for h in settings["hooks"]["SessionEnd"])
claude_md = (workspace / "CLAUDE.md").read_text()
assert "agnes pull" in claude_md
assert "da sync" not in claude_md
workspace_md = (workspace / "AGNES_WORKSPACE.md").read_text()
assert test_pat not in workspace_md, "PAT must not leak into AGNES_WORKSPACE.md"
for placeholder in ["{created_at}", "{server_url}", "{workspace_path}"]:
assert placeholder not in workspace_md, f"placeholder leaked: {placeholder}"
assert fastapi_test_server.url in workspace_md
assert str(workspace) in workspace_md
assert "agnes pull" in workspace_md
def test_clean_install_zero_grants(fastapi_test_server, tmp_path, test_pat_no_grants):
workspace = tmp_path / "ws"
workspace.mkdir()
subprocess.run([
"da", "init",
"--server-url", fastapi_test_server.url,
"--token", test_pat_no_grants,
"--workspace", str(workspace),
], check=True)
must_exist = {"CLAUDE.md", "AGNES_WORKSPACE.md",
".claude/settings.json", ".claude/CLAUDE.local.md",
"user/duckdb/analytics.duckdb"}
must_not_exist = {".claude/rules", "server/parquet", "data/parquet",
"data/duckdb", "data/metadata", "user/artifacts",
"user/sessions", "user/snapshots", ".agnes"}
for p in must_exist:
assert (workspace / p).exists(), f"missing: {p}"
for p in must_not_exist:
assert not (workspace / p).exists(), f"unexpected: {p}"
assert_no_dead_dirs(workspace)
def test_init_force_preserves_local_md(fastapi_test_server, tmp_path, test_pat):
workspace = tmp_path / "ws"
workspace.mkdir()
subprocess.run(["agnes", "init", "--server-url", fastapi_test_server.url,
"--token", test_pat, "--workspace", str(workspace)], check=True)
(workspace / ".claude" / "CLAUDE.local.md").write_text("# my private notes\n")
subprocess.run(["agnes", "init", "--server-url", fastapi_test_server.url,
"--token", test_pat, "--workspace", str(workspace),
"--force"], check=True)
assert "my private notes" in (workspace / ".claude" / "CLAUDE.local.md").read_text()
def test_readers_in_pre_init_dir(tmp_path):
"""Reader commands in a folder that never had `agnes init`. Friendly hints, no tracebacks."""
for cmd in [["agnes", "query", "SELECT 1"],
["agnes", "snapshot", "create", "x", "--as", "y", "--estimate"],
["agnes", "explore", "x"],
["agnes", "snapshot", "list"]]:
result = subprocess.run(cmd, cwd=tmp_path,
capture_output=True, text=True, timeout=15)
assert "Traceback" not in result.stderr
- Step 2: Run
pytest tests/test_clean_install_integration.py -v
- Step 3: Commit
git add tests/test_clean_install_integration.py
git commit -m "test: clean-install integration suite (minimal/zero grants, force, pre-init)"
Task 23: Manual clean-install protocol — document in RELEASE_CHECKLIST.md
Files:
-
Modify or create:
docs/RELEASE_CHECKLIST.md -
Step 1: Add the manual protocol from spec §5.5
If docs/RELEASE_CHECKLIST.md exists, append; otherwise create with header:
# Release Checklist
## Bootstrap path changes (mandatory pre-merge)
For any PR touching the analyst-bootstrap path (`agnes init`, `cli/lib/pull.py`,
`cli/lib/hooks.py`, `app/web/setup_instructions.py`, `/api/welcome`), run
this protocol locally before requesting review:
1. `git clean -fdx` in the repo (no build artifacts).
2. Boot FastAPI locally against a clean test instance state.
3. Empty terminal in `/tmp/test-analyst-1`. From the web `/setup?role=analyst`, paste prompt.
4. `tree -a /tmp/test-analyst-1` and compare with the expected tree from
`docs/superpowers/specs/2026-05-04-clean-analyst-bootstrap-design.md` §5.2.
5. `claude` in that folder. Three queries: "what tables can I see",
"SELECT count(*) FROM <t>", "show me last 5 rows of <t>". All must work
without further intervention.
6. `/exit`. Verify SessionEnd hook ran (server-side audit log shows `agnes push`;
`du -sh /tmp/test-analyst-1/user/sessions/` non-empty).
7. Second `claude` in same folder. Verify SessionStart hook fires
(`agnes pull` request in audit log).
8. Second workspace `/tmp/test-analyst-2` with the same PAT (within TTL).
Repeat 3-5. Verify global `~/.config/da/` is not duplicated; the second
workspace has its own DuckDB.
- Step 2: Commit
git add docs/RELEASE_CHECKLIST.md
git commit -m "docs: clean-install manual protocol in release checklist"
Phase 6 — CHANGELOG and final verification
Task 24: Update CHANGELOG
Files:
-
Modify:
CHANGELOG.md -
Step 1: Add an entry under
[Unreleased]
Open CHANGELOG.md, find the topmost ## [Unreleased] section (it sits above the most recent released version, currently ## [0.32.0]). Add the entry from the spec §"CHANGELOG entry (preview)":
## [Unreleased]
### Changed
- **BREAKING** CLI binary renamed from `da` to `agnes`. No backward-compat alias is shipped. Update shell aliases, hook commands in any pre-existing `.claude/settings.json`, scripts, and cron jobs. Reinstall via `uv tool install <wheel>`; the wheel now ships an `agnes` entry point.
- **BREAKING** Analyst bootstrap rewritten end-to-end. `da analyst setup` is removed; replaced by `agnes init` (non-interactive, requires `--server-url` and `--token`). `da sync` is split into `agnes pull` (refresh) and `agnes push` (upload). `da fetch` is folded into `agnes snapshot create`. `da metrics list/show` is folded into `agnes catalog --metrics`; `da metrics import/export/validate` move to `agnes admin metrics {import,export,validate}`. The `da analyst` namespace is removed; the workspace status command is now `agnes status`. The previous `da status` (server-health overview) becomes `agnes diagnose system`.
- **BREAKING** Workspace layout simplified. Removed: `data/parquet/`, `data/duckdb/`, `data/metadata/`, `user/artifacts/`. Canonical paths: `server/parquet/` (synced parquets), `user/duckdb/analytics.duckdb` (DuckDB views), `user/snapshots/` (ad-hoc snapshots), `user/sessions/` (recorded sessions).
- The `/setup` web page now branches on a `role` query parameter: `/setup?role=analyst` renders the analyst workspace bootstrap prompt; `/setup?role=admin` renders the admin CLI install prompt. `/install` continues to 302 to `/setup`.
- `CLAUDE.md` server-side template + repo-root `CLAUDE.md` updated to reference the new CLI verbs and workspace paths. The admin UI for the `claude_md_template` DB override (`/admin/workspace-prompt`) renders a yellow banner when the saved override contains legacy strings (`data/parquet/`, `da sync`, `da fetch`, `da analyst setup`, `da metrics list/show`); admins re-author and save to clear it. Migration is manual.
### Added
- `AGNES_WORKSPACE.md` — human-readable workspace docs file generated by `agnes init` in the workspace root. Documents global install, workspace layout, hooks, cheat sheet, uninstall recipe.
- PAT request body now accepts `scope: str = "general"` and `ttl_seconds: int | None = None` fields. PATs minted with `scope="bootstrap-analyst"` are TTL-clamped to ≤ 1 h server-side. Existing `expires_in_days` field continues to work; `ttl_seconds` wins when both are set. `ttl_seconds` upper bound is 315_360_000 (matches `expires_in_days <= 3650` cap).
- `cli/lib/` shared-library tree, with `cli/lib/pull.py:run_pull` (data-refresh primitive callable from both the Typer wrapper and `agnes init`) and `cli/lib/hooks.py:install_claude_hooks` (workspace-scoped Claude Code hook installer).
### Fixed
- `agnes pull` (formerly `da sync`) no longer creates `.claude/rules/` when the corporate-memory bundle is empty.
- `agnes pull` no longer creates `server/parquet/` when the manifest is empty.
- `agnes snapshot create` (formerly `da fetch`) no longer materializes an empty `user/duckdb/analytics.duckdb` when run before any `agnes pull`.
- Workspace `agnes status` reads from the canonical `server/parquet/` and `user/duckdb/analytics.duckdb` paths (was reading legacy `data/parquet/`, `data/metadata/last_sync.json`).
- `agnes init` and `agnes pull` errors now use the `cli/error_render.py` typed-error renderer (added in 0.32.0), so analyst-facing error UX matches the structured shape `agnes query --remote` already produces.
### Removed
- `da analyst setup`, `da analyst status`, `da sync`, `da fetch`. See "Changed" above for replacements.
- `da metrics` namespace as a top-level group (subcommands moved to `agnes catalog --metrics` for read-only views and `agnes admin metrics …` for write operations).
- Legacy workspace directories `data/parquet/`, `data/duckdb/`, `data/metadata/`, `user/artifacts/`. Existing analyst workspaces should be reinitialized with `agnes init --server-url ... --token ... --force` (a fresh empty folder is recommended).
### Internal
- `cli/lib/__init__.py` (empty) makes `cli/lib/` a proper package picked up by Hatchling for wheel inclusion.
- `tests/fixtures/analyst_bootstrap.py` — reusable test fixtures for clean-install verification.
- Step 2: Commit
git add CHANGELOG.md
git commit -m "docs(changelog): clean-analyst-bootstrap rewrite (BREAKING)"
Task 25: Final verification — full test suite
Files: none (just runs)
- Step 1: Run the full test suite
pytest tests/ -v 2>&1 | tail -50
Expected: all PASS (or expected skips for fixtures that depend on slow/external resources).
- Step 2: Lint check (if project has one)
ruff check . 2>&1 | tail -20
If ruff reports issues in code we touched, fix them.
- Step 3: Manually run the clean-install protocol per Task 23
(If on-machine: do steps 1-8 from docs/RELEASE_CHECKLIST.md. If headless CI: skip — the integration tests cover this.)
- Step 4: If everything passes, declare ready for review
git log --oneline origin/main..HEAD
Expected: a clean linear history of one commit per task. Push the branch and open a PR using the Spec + this Plan as the PR body context.
git push -u origin zs/clean-analyst-bootstrap-spec
Then gh pr create with the spec + CHANGELOG entry + ### Test plan per CLAUDE.md discipline.
Self-Review (post-write)
After writing the plan, I reviewed it against the spec for:
-
Spec coverage:
- Server-side: PAT scope/TTL ✅ (Task 1), legacy-strings scan ✅ (Tasks 2, 5), setup_instructions analyst branch ✅ (Task 3), /setup?role= branching ✅ (Task 4), claude_md_template rewrite ✅ (Task 6).
- Client-side library: cli/lib/init.py + hooks.py ✅ (Task 7), cli/lib/pull.py ✅ (Task 8).
- CLI commands: pull ✅ (9), push ✅ (10), init ✅ (11), status ✅ (12), diagnose system ✅ (13), snapshot create ✅ (14), catalog --metrics ✅ (15), admin metrics ✅ (16).
- Wiring: hint texts ✅ (17), main.py + deletes ✅ (18), CLAUDE.md ✅ (19).
- Tests: fixtures ✅ (20), reader smoke ✅ (21), clean install ✅ (22), manual protocol ✅ (23).
- CHANGELOG ✅ (24), final verification ✅ (25).
-
Placeholder scan: No "TBD"/"TODO" remain. Each step has the actual code or shell command. The
cli/commands/admin_metrics.pytask says "lift X from metrics.py verbatim" rather than restating — that's intentional since the engineer cangit show HEAD~N:cli/commands/metrics.pyto see exactly what to copy. -
Type consistency:
PullResultshape consistent betweencli/lib/pull.pyandcli/commands/pull.py.install_claude_hookssignature(workspace: Path) -> Noneconsistent across hooks.py + init.py._LEGACY_STRINGStuple shape used identically in tests and module. -
Known fragility: Some shell-based test assertions (Task 5
legacy-bannerdiv presence) are heuristic; implementer may need to tighten once HTML lands. Marked with comment. -
Open questions in spec stay in spec: Per-endpoint PAT scope enforcement (deferred), layered config (deferred), hook performance budget (monitoring-only), anti-coupling test (deferred). Not in this plan.
The plan is implementable. Tasks are roughly ordered by dependency: Phase 1 server foundation, Phase 2 client library, Phase 3 commands, Phase 4 wiring/cleanup, Phase 5 fixtures + tests, Phase 6 changelog/verification.