ZdenekSrotyr
cd8dd9508c
docs(testing): add coverage honesty + prerequisites to E2E plan
...
Adds three sections to the E2E plan:
- "Coverage honesty": explicit list of what the plan reveals (✅) and
what it does NOT (❌, with reasoning per gap)
- "Recommended additional coverage layers": Tier 1/2/3 with realistic
coverage estimates (~70 % / ~80 % / ~95 % / ~98 %)
- "Prerequisites" table: what's needed on the VM, with fallback
behavior per missing item
The plan is intentionally not exhaustive. Goal is to surface the worst
contract violations fast, not to prove correctness across all real-world
environments. Documenting the gap explicitly so operators don't ship
on a false sense of "tests passed = production-ready."
2026-05-04 19:59:47 +02:00
ZdenekSrotyr
5fa1c94b5c
fix(tests): smoke matrix asserts no-traceback only (per-command rc varies)
2026-05-04 19:47:18 +02:00
ZdenekSrotyr
5162c488bb
fix(tests): strip ANSI escapes from --help output before substring asserts
...
Typer/rich emits ANSI styling in CI's --help output (e.g. `--metrics`
becomes `-\x1b[0m\x1b[1;36m-metrics`), so literal substring asserts
like `assert "--metrics" in result.output` fail. Locally the test runner
auto-detects no-TTY and produces plain text, masking the issue.
Add a small `_clean()` helper per test file that strips ANSI escape
codes (`\x1b\[[0-9;]*m`) before substring containment checks.
2026-05-04 19:43:47 +02:00
ZdenekSrotyr
d311b07d5d
docs(testing): E2E verification plan for clean-analyst-bootstrap (PR #173)
2026-05-04 19:41:50 +02:00
ZdenekSrotyr
5bffec641f
chore(lint): final ruff fixes
2026-05-04 19:32:52 +02:00
ZdenekSrotyr
675f8e1909
chore(lint): drop unused imports from new test files (ruff F401)
2026-05-04 19:32:31 +02:00
ZdenekSrotyr
20bb9efc0e
chore(lint): drop unused os import from init.py
2026-05-04 19:32:18 +02:00
ZdenekSrotyr
d44cace17c
docs(changelog): clean-analyst-bootstrap rewrite (BREAKING)
2026-05-04 19:25:38 +02:00
ZdenekSrotyr
cc84222216
docs: clean-install manual protocol in release checklist
2026-05-04 19:23:01 +02:00
ZdenekSrotyr
8403529fcd
test: clean-install integration suite (minimal/zero grants, force, pre-init)
2026-05-04 19:22:24 +02:00
ZdenekSrotyr
42e108ae5e
test: reader smoke matrix on zero-grants workspace
2026-05-04 19:15:39 +02:00
ZdenekSrotyr
a47c2be282
test: clean-bootstrap fixtures (fastapi_test_server, test_pat, zero_grants_workspace)
...
Task 20: reusable pytest fixtures for the clean-bootstrap test suite.
Tasks 21 and 22 (reader smoke matrix + init smoke matrix) consume them.
- fastapi_test_server boots a real uvicorn subprocess against a tmp DATA_DIR,
pre-seeded with admin@example.com (Admin group), analyst@example.com
(Everyone group), and three tables (one per query_mode: local /
materialized / remote).
- web_session: cookie-authenticated httpx.Client for the admin user.
- test_pat: minted JWT for the analyst with table grants on local +
materialized.
- test_pat_no_grants: same shape, zero resource_grants.
- zero_grants_workspace: subprocess invocation of `agnes init` against the
no-grants PAT; returns the bootstrapped workspace path.
- NONEXISTENT_TABLE: module-level sentinel for the upcoming reader matrix.
Subprocess uvicorn (mirrors tests/test_e2e_corporate_memory.py) instead of
in-thread so DATA_DIR + module-level singletons in src.db don't bleed
across tests. agnes CLI invoked via `python -m cli.main` instead of the
.venv/bin/agnes shim, which depends on .pth file visibility that iCloud
Drive intermittently re-hides on macOS.
2026-05-04 19:11:54 +02:00
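The `python -m cli.main` invocation mentioned in a47c2be282 might look roughly like the sketch below. The helper names `cli_command` and `run_cli` are hypothetical, and the sketch assumes the `cli` package is importable from the interpreter running the tests. Only `cli.main` and `AGNES_CONFIG_DIR` appear in the commit messages.

```python
import os
import subprocess
import sys
from pathlib import Path


def cli_command(*args: str) -> list[str]:
    """Build an argv that runs the CLI as a module of the current interpreter.

    Using sys.executable avoids depending on the .venv/bin/agnes shim.
    """
    return [sys.executable, "-m", "cli.main", *args]


def run_cli(*args: str, config_dir: Path) -> subprocess.CompletedProcess:
    """Run the CLI in a subprocess with an isolated config directory.

    Hypothetical helper: the real fixtures may pass more environment.
    """
    env = {**os.environ, "AGNES_CONFIG_DIR": str(config_dir)}
    return subprocess.run(
        cli_command(*args),
        env=env,
        capture_output=True,
        text=True,
        timeout=120,
    )
```

Running in a subprocess also keeps module-level state out of the test process, which is the same reason the commit gives for booting uvicorn out of process.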
ZdenekSrotyr
8d9323c99e
docs(claude-md): sweep surviving-verb da X references (Task 19 follow-up)
2026-05-04 19:01:27 +02:00
ZdenekSrotyr
3990fb0d85
docs(claude-md): rewrite verbs + paths for new CLI surface
2026-05-04 19:00:31 +02:00
ZdenekSrotyr
7e1dd1adba
refactor(cli): drop sync/fetch/analyst/metrics; register init/pull/push (BREAKING)
2026-05-04 18:59:51 +02:00
ZdenekSrotyr
5551f12bb0
fix(cli): hint text 'Run: da sync' → 'Run: agnes pull'
2026-05-04 18:42:21 +02:00
ZdenekSrotyr
ff5da0af90
feat(cli): agnes admin metrics {import,export,validate}
2026-05-04 18:39:05 +02:00
ZdenekSrotyr
42b8d0309b
feat(cli): agnes catalog --metrics replaces da metrics list/show
2026-05-04 18:33:17 +02:00
ZdenekSrotyr
8309141705
feat(cli): agnes snapshot create (folded from da fetch); friendly exit if no DuckDB
2026-05-04 18:32:30 +02:00
ZdenekSrotyr
5e1e8c4e14
feat(cli): agnes status = workspace state; old health check moves to agnes diagnose system
2026-05-04 18:29:15 +02:00
ZdenekSrotyr
b799aa534a
fix(cli): I1+I2 review — surface manifest_unauthorized + add 3 typed-error tests
2026-05-04 18:19:35 +02:00
ZdenekSrotyr
9b70ca3069
feat(cli): agnes init orchestrator + AGNES_WORKSPACE.md template
2026-05-04 18:15:08 +02:00
ZdenekSrotyr
60b6fbed97
feat(cli): agnes push command (extracted from sync --upload-only)
2026-05-04 18:09:57 +02:00
ZdenekSrotyr
7f89e1d594
feat(cli): agnes pull command (Typer wrapper around lib.pull.run_pull)
2026-05-04 18:07:28 +02:00
ZdenekSrotyr
15004126de
fix(cli-lib): I1+I2+I3 review fixes — token-precedence note, sync-state TODO, dry-run hermeticity test
2026-05-04 18:04:56 +02:00
ZdenekSrotyr
37da602060
feat(cli-lib): cli/lib/pull.py:run_pull primitive with lazy mkdir
2026-05-04 18:00:57 +02:00
ZdenekSrotyr
2b3d62fbf5
chore(.gitignore): allowlist cli/lib/ from generic lib/ rule (Task 7 follow-up)
2026-05-04 17:54:00 +02:00
ZdenekSrotyr
5aebeabf23
feat(cli-lib): cli/lib/hooks.py:install_claude_hooks
2026-05-04 17:53:20 +02:00
ZdenekSrotyr
d25d075ed2
docs(claude-md-template): rewrite verbs + paths for new CLI surface (Task 6)
...
- Verb renames (da X -> agnes X for surviving verbs; legacy verbs already
absent from this default template — admin overrides with legacy verbs are
caught by Task 2's _LEGACY_STRINGS scan + Task 5's admin banner).
- Path renames: data/parquet/ -> server/parquet/, data/duckdb/ ->
user/duckdb/, data/metadata/ removed entirely (no longer exists per spec).
- Drop user/artifacts/ from directory structure (spec workspace layout
drops it; surviving paths: server/parquet/, user/duckdb/, user/snapshots/,
user/sessions/).
- Add AGNES_WORKSPACE.md pointer near top-of-template so analysts know
where to find human-readable docs.
Cleans Task 0.5's missed sweep on this file (was not in cli/ tree but is
user-visible via /api/welcome).
81 claude_md/welcome_template tests pass.
2026-05-04 17:51:14 +02:00
ZdenekSrotyr
a92c624dba
feat(admin): yellow banner for legacy CLI verbs in workspace-prompt override
2026-05-04 17:46:50 +02:00
ZdenekSrotyr
8091620d33
fix(setup): role-aware clipboard render + JSON-escape ROLE injection
...
Two Task 4 review fixes for app/web/templates/install.html:
1. JSON-escape `ROLE` JS const via `{{ role | tojson }}` (defense in
depth — removes the dependency on Jinja autoescape semantics for JS
contexts; FastAPI's Literal validator already constrains role values).
2. Verify the analyst tile's clipboard payload is the analyst layout.
The pre-existing role-aware plumbing (compute_default_agent_prompt
threading role into setup_instructions_lines, picked up by the JS
SETUP_INSTRUCTIONS_TEMPLATE array) was correct; adding regression tests
that pin to the JS clipboard block specifically so a future inversion
would fail loudly.
Tests: analyst clipboard contains `agnes init` + `agnes catalog` and
NOT `agnes auth import-token` / `agnes skills`; admin clipboard is the
inverse. Plus an explicit assertion that ROLE is rendered via tojson.
2026-05-04 17:43:46 +02:00
ZdenekSrotyr
44234ba3ae
test(setup): add mutation-resistant ternary-direction assertion (Task 4 polish)
2026-05-04 17:37:54 +02:00
ZdenekSrotyr
7965f8021d
fix(setup): role-aware PAT scope+TTL in setupNewClaude JS (Task 4 spec fix)
2026-05-04 17:34:30 +02:00
ZdenekSrotyr
f731ee7897
feat(setup): /setup?role=analyst|admin branching with role tiles
2026-05-04 17:28:47 +02:00
ZdenekSrotyr
54f83c281c
test(setup): I1+I2 review fixes — AGNES_WORKSPACE.md alignment + step-number pin
2026-05-04 17:23:15 +02:00
ZdenekSrotyr
ae00945cbf
fix(setup): clean stale 'da' refs in setup_instructions.py (Task 0.5 missed sweep)
2026-05-04 17:19:55 +02:00
ZdenekSrotyr
29e28ccbd3
feat(setup): add analyst role to install-prompt renderer
2026-05-04 17:17:59 +02:00
ZdenekSrotyr
59324f9361
feat(admin): scan CLAUDE.md override for legacy strings
2026-05-04 17:10:58 +02:00
ZdenekSrotyr
68639e54cf
test(tokens): tighten scope-default + add precedence + audit + reserved-key tests
2026-05-04 17:07:02 +02:00
ZdenekSrotyr
4ee7323436
feat(tokens): add scope + ttl_seconds fields with bootstrap-analyst clamp
2026-05-04 17:00:54 +02:00
ZdenekSrotyr
8fbf4c7873
refactor: Task 0.5 amendments — README/ARCHITECTURE sweep + main.py install hint + drop dead AGNES_SERVER_URL
2026-05-04 16:55:55 +02:00
ZdenekSrotyr
ed371c84d1
refactor(docs): sweep DA_* env vars + surviving da-verbs in docs/*.md (Task 0.5 fix)
2026-05-04 16:43:15 +02:00
ZdenekSrotyr
1563b05f2e
refactor(cli): hard-cutover env vars + config dir to AGNES_*
...
Task 0.5 of clean-analyst-bootstrap. Greenfield rewrite — no fallback,
no aliases. Existing dev environments lose their cached PAT and must
re-authenticate.
Env var renames (hard cutover):
- DA_CONFIG_DIR -> AGNES_CONFIG_DIR
- DA_SERVER -> AGNES_SERVER
- DA_SERVER_URL -> AGNES_SERVER_URL (test-only stale ref, not in spec)
- DA_NO_UPDATE_CHECK -> AGNES_NO_UPDATE_CHECK
- DA_LOCAL_DIR -> AGNES_LOCAL_DIR
- DA_TOKEN -> AGNES_TOKEN
- DA_STREAM_RETRIES -> AGNES_STREAM_RETRIES
Config dir rename: ~/.config/da/ -> ~/.config/agnes/ (across code,
comments, docstrings, error messages, install templates, dev scripts).
Stale `da X` references in CLI source (and adjacent app/, tests/):
swept docstrings, comments, help text, and error messages where the
verb survives the rewrite (init, pull, push, catalog, status, diagnose,
auth, admin, skills, query, schema, describe, explore, disk-info,
snapshot, login, logout, whoami, server, setup) and replaced `da X`
with `agnes X`. Intentionally kept `da sync`, `da fetch`, `da analyst`,
`da metrics` — those verbs are removed in later tasks; the legacy
strings will be detected by `_LEGACY_STRINGS` (added in Task 2).
Test fixes:
- TestCLIVersion now asserts output starts with `agnes ` (was `da `).
Test results: 2675 passed, 25 skipped (full pytest run, excluding 9
pre-existing test_db.py / test_user_management.py / test_e2e_extract.py
/ test_cli_binary_rename.py failures unrelated to this rename).
2026-05-04 16:35:44 +02:00
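Because 1563b05f2e provides no fallback or aliases, a developer may want to check which old variables are still exported. The sketch below is a hypothetical migration aid, not something the commit adds. The mapping mirrors the rename list above, minus the test-only `DA_SERVER_URL` reference; the function name is an assumption.

```python
import os
from typing import Mapping

# Old name -> new name, as listed in the commit message.
RENAMED_ENV_VARS = {
    "DA_CONFIG_DIR": "AGNES_CONFIG_DIR",
    "DA_SERVER": "AGNES_SERVER",
    "DA_NO_UPDATE_CHECK": "AGNES_NO_UPDATE_CHECK",
    "DA_LOCAL_DIR": "AGNES_LOCAL_DIR",
    "DA_TOKEN": "AGNES_TOKEN",
    "DA_STREAM_RETRIES": "AGNES_STREAM_RETRIES",
}


def stale_env_vars(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return {old_name: new_name} for every legacy variable still set."""
    return {old: new for old, new in RENAMED_ENV_VARS.items() if old in environ}


if __name__ == "__main__":
    for old, new in stale_env_vars().items():
        print(f"{old} is set but no longer read; export {new} instead")
```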
ZdenekSrotyr
8c8cdf6a6a
feat(cli): rename binary from da to agnes (BREAKING)
2026-05-04 16:05:14 +02:00
ZdenekSrotyr
841dcc8447
docs(spec+plan): round-4 review fixes — rename hygiene
...
5 must-fixes from rename-hygiene review:
- B1: test command arrays ["da", ...] -> ["agnes", ...] in spec lines
433-446, 466, 502 and plan reader smoke matrix + clean-install
integration tests + readers-in-pre-init test (39 occurrences)
- B2: ~/.local/bin/da -> ~/.local/bin/agnes (binary path string in
spec data flow + plan AGNES_WORKSPACE.md uninstall table)
- B3: CHANGELOG missing BREAKING bullet for binary rename — added in
both spec and plan CHANGELOG drafts
- B4: plan CHANGELOG typo "previous agnes status" -> "previous da
status" (server-health overview was historically da status)
- B5: spec Components table missing row for the binary rename — added
Client-side row mapping pyproject.toml + cli/main.py changes to
plan Task 0
2026-05-04 15:57:07 +02:00
ZdenekSrotyr
5e7fa418d1
docs(spec+plan): rename CLI binary from da to agnes (BREAKING)
...
- Spec rev 5: branding consistency. New CLI verbs use agnes prefix
(agnes init, agnes pull, agnes push, agnes catalog, agnes status,
agnes snapshot create, agnes admin, …).
- Plan: add Phase 0 / Task 0 — pyproject.toml [project.scripts] entry
rename to "agnes = cli.main:app" + Typer(name="agnes") in cli/main.py.
- Legacy command references (da sync, da fetch, da analyst setup,
da metrics) keep their da prefix throughout — they're historical
artifacts being removed (preserved in CHANGELOG Removed section,
_LEGACY_STRINGS constant for admin override scan, etc.).
Bulk rename via Python regex with verb whitelist: 286 verb refs
rewritten in plan, 265 in spec; 104+72 legacy refs restored to "da"
post-pass (false positives where the doc was describing the legacy
flow being replaced).
2026-05-04 15:50:44 +02:00
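The whitelist-driven rename in 5e7fa418d1 could be approximated as below. The actual script is not shown in the log, so this is only a sketch: the verb list is borrowed from the surviving verbs enumerated in the Task 0.5 commit (1563b05f2e), and `rename_verbs` is a hypothetical name. It does not attempt the post-pass that restored legacy references.

```python
import re

# Verbs that survive the rewrite; legacy verbs (sync, fetch, analyst,
# metrics) are deliberately absent so they keep their "da" prefix.
SURVIVING_VERBS = (
    "init", "pull", "push", "catalog", "status", "diagnose", "auth",
    "admin", "skills", "query", "schema", "describe", "explore",
    "disk-info", "snapshot", "login", "logout", "whoami", "server", "setup",
)

# "\bda " anchors on a standalone "da" so words like "panda" are untouched.
_DA_VERB = re.compile(
    r"\bda (" + "|".join(re.escape(v) for v in SURVIVING_VERBS) + r")\b"
)


def rename_verbs(text: str) -> str:
    """Rewrite `da <verb>` to `agnes <verb>` for whitelisted verbs only."""
    return _DA_VERB.sub(r"agnes \1", text)


assert rename_verbs("Run da pull, not da sync") == "Run agnes pull, not da sync"
```

A whitelist keeps the pass conservative, but it cannot tell a current instruction from prose that describes the old flow, which is presumably why the commit reports restoring some references by hand afterwards.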
ZdenekSrotyr
fb8f55c335
docs(plan): clean-analyst-bootstrap implementation plan
...
25 tasks across 6 phases:
- Phase 1: server-side foundation (PAT scope/TTL, legacy-strings scan,
/setup?role= branching, claude_md_template)
- Phase 2: cli/lib/ shared-library tree (hooks.py, pull.py)
- Phase 3: new commands (init, pull, push, status, diagnose system,
snapshot create, catalog --metrics, admin metrics)
- Phase 4: wiring + delete obsolete (sync, fetch, analyst, metrics)
- Phase 5: test fixtures + reader smoke matrix + clean-install integration
- Phase 6: CHANGELOG + final verification
TDD discipline throughout: test → fail → implement → pass → commit per task.
2026-05-04 15:22:10 +02:00
ZdenekSrotyr
b2cc6517aa
docs(spec): rev 4 — round-3 review fixes; cleared for implementation
...
Round-3 review (2 load-bearing + 5 quality-of-life):
- web_session fixture endpoint: POST /auth/password/login/web (form),
not POST /auth/token (which doesn't exist); PAT mint requires session
cookie via require_session_token
- Renderer wording: "adopt" not "reuse" — da init/da pull are first-time
adopters; synthetic {"detail": {"kind": ..., "hint": ...}} pattern
documented (cli/commands/query.py:152, 165)
- cli/lib/__init__.py mention for Hatchling wheel inclusion
- _LEGACY_STRINGS constant + _scan_legacy_strings helper home named in
app/api/claude_md.py
- /api/welcome?server_url= query param explicit in data flow
- ttl_seconds upper bound 315_360_000 (matches existing 3650 d cap)
- Conftest line range 50-82 -> 50-83
2026-05-04 15:10:40 +02:00
ZdenekSrotyr
b52a37b680
docs(spec): rev 3 — round-2 review fixes + main-sync (post-0.32.0)
...
Round-2 review (N1-N10):
- N1 Token CLI keeps current `da auth token` location (not top-level)
- N2 `da catalog --metrics --show <id>` decided in Components, dropped
from Open questions
- N3 `_install_claude_hooks` migrated to new `cli/lib/hooks.py` module
- N4 Test sentinel `__nonexistent__` documented in fixtures
- N5 `web_session` fixture uses real `POST /auth/token` with seeded password
- N6 `AGNES_WORKSPACE.md` content asserts (PAT not leaked, placeholders
substituted) added to clean-install integration test
- N7 Admin UI legacy-strings banner concretized: `legacy_strings_detected`
field + yellow banner in `/admin/workspace-prompt` editor
- N8 `da metrics export/validate` relocate to `da admin metrics …`
alongside `import`
- N9 Bootstrap PAT verify endpoint switched from `/api/health` (unauth)
to `/api/catalog/tables` (PAT-validating, matches `da auth import-token`)
- N10 New `cli/lib/pull.py` and `cli/lib/hooks.py` modules inventoried
Main-sync (rebased on 0.32.0 / #160):
- Reconsidered: keep `da skills list / show` as analyst-side discovery
(skill content was strengthened by #160 cost-guardrail/registry rails)
- Bigger CLAUDE.md (repo-root) rewrite scope acknowledges new sections
- `cli/error_render.py` (added in 0.32.0) reused by `da init` and
`da pull` for consistent typed-error UX
- Test fixtures piggyback on autouse `_reset_module_caches` from
tests/conftest.py:50-82 (added in 0.32.0)
2026-05-04 15:05:44 +02:00
ZdenekSrotyr
973e96e6af
docs(spec): apply round-1 review — orphan/wiring fixes
...
11 minimum edits + minor cleanups from first review pass:
- TTL field: add `ttl_seconds` alongside existing `expires_in_days` in
PAT request, force-clamp to 1h for scope=bootstrap-analyst
- Drop layered per-workspace config (no defined producer; multi-instance
is edge case) — moved to Open questions
- CLAUDE.md producer: explicit `GET /api/welcome` flow, address admin
DB-override migration
- URL: `/install` → `/setup?role=...` (matches existing routing; legacy
`/install` keeps 302 redirect)
- `da metrics import` → relocated under `da admin metrics import`
- Repo-root CLAUDE.md added to rewritten files list
- CLI surface: enumerate `da explore`, `da disk-info`, `da snapshot
refresh/prune`, full `da snapshot create` flag list, `da push` flags
- Test fixtures section: contracts for `fastapi_test_server`, `test_pat`,
`test_pat_no_grants`, `zero_grants_workspace`, `web_session`, `client`
- Cleanups: line ref for `_install_claude_hooks`, admin-namespace
disclaimer, `da init`-doesn't-call-`da auth login` note, audit-log
consumer note for PAT scope, `{workspace_path}` placeholder usage
2026-05-04 14:59:40 +02:00
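The TTL rule from the round-1 notes (clamp to 1h for `scope=bootstrap-analyst`), combined with the 315_360_000 upper bound from rev 4, can be sketched as below. The function name, the `"general"` scope used in the example, rejecting out-of-range values with `ValueError`, and reading "force-clamp" as "cap at 1h" are all assumptions; the server code is not shown in the log.

```python
BOOTSTRAP_SCOPE = "bootstrap-analyst"
BOOTSTRAP_MAX_TTL = 3600            # 1 hour
MAX_TTL_SECONDS = 315_360_000       # 3650 days * 86_400 s


def effective_ttl(scope: str, ttl_seconds: int) -> int:
    """Validate ttl_seconds and apply the bootstrap-analyst cap.

    Assumption: out-of-range requests are rejected, while bootstrap
    tokens are silently capped rather than rejected.
    """
    if not 1 <= ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError(f"ttl_seconds must be in 1..{MAX_TTL_SECONDS}")
    if scope == BOOTSTRAP_SCOPE:
        return min(ttl_seconds, BOOTSTRAP_MAX_TTL)
    return ttl_seconds


assert effective_ttl("bootstrap-analyst", 86_400) == 3600
assert effective_ttl("general", 86_400) == 86_400  # "general" is a made-up scope
```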