Session capture queue, private session, and setup-prompt fixes (#242)
* Capture session paths via SessionStart hook + lock parallel pushes Replace the encoding-based scan of ~/.claude/projects/<encoded-cwd>/ with a queue file populated by a new `agnes capture-session` SessionStart hook. The hook reads the documented `transcript_path` field from Claude Code's hook stdin JSON, sidestepping the cwd-to-folder encoding (which is an internal implementation detail and varies by Claude Code version). - New `agnes capture-session` subcommand appends transcript_path to <workspace>/.claude/agnes-sessions.txt. Silent on all malformed input so a hook chain failure doesn't clutter Claude Code startup. - `agnes push` now consumes the queue: atomic snapshot rename guards against hooks writing during the push window, successful uploads land in agnes-sessions-uploaded.txt (TSV: timestamp + path), failed paths are requeued. - Cross-platform single-instance lock via the filelock package (fcntl on POSIX, msvcrt on Windows). Concurrent SessionEnd hooks — common when the user closes several sessions at once — silent-exit on the losing side instead of all racing the upload. - Recovery: pre-existing snapshot files from a crashed push are picked up and processed before the live queue. - The SessionStart `agnes push` self-heal entry is dropped — it became redundant once the queue persists across runs (orphans from headless / crashed sessions ship out on the next interactive SessionEnd push). Existing workspaces auto-migrate via the marker-based replace logic. - Legacy encoding scan stays available behind `--legacy-scan` for one- off backfills of sessions predating the queue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Add /agnes-private + statusLine indicator for private sessions Users handling sensitive data inside Claude Code can now opt a session out of the Agnes upload pipeline, either proactively (right after session start) or reactively (mid-session). The `/agnes-private` slash command runs `agnes mark-private` deterministically via `!`-prefix direct bash — no AI in the loop. A workspace-installed statusLine surfaces a `🔒 agnes-private` indicator in Claude Code's status bar so the user sees the state at a glance. Authoritative source of "do not upload" is a separate file `<workspace>/.claude/agnes-sessions-private.txt` (one session_id per line). Both `capture-session` (queue writer) and `push` (queue reader) consult the list. This makes the slash-command / SessionStart-hook race impossible by construction: whichever runs first, the session is correctly filtered out. - `agnes mark-private` reads `CLAUDE_CODE_SESSION_ID` from env (set by Claude Code in every bash subprocess it spawns — stable documented API) and appends to the private list. - `agnes statusline` reads the session JSON Claude Code pipes on stdin, checks the private list, and emits the indicator or nothing. Optimized for the high call frequency of statusLine renders. - `capture-session` extracts session_id from hook stdin and skips queue write when the ID is already on the private list (race protection). - `push` filters snapshot entries by the private list and appends to a per-workspace audit log `agnes-sessions-private-skipped.txt`. - Queue format migrated from `<path>` to `<session_id>\t<path>`; legacy one-column lines still parse (empty session_id, still upload, can't be marked private retroactively — fine, they pre-date the feature). - `install_claude_hooks` writes a workspace statusLine unless the user already has a custom one (warn + preserve). Idempotent re-init. - `install_claude_commands` ships `agnes-private.md` alongside `update-agnes-plugins.md`. Per-template fallback so a missing template doesn't get clobbered with the wrong content. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix setup-prompt + CLAUDE.md marketplace copy + drop skills step Three issues against the post-PR-#240 / post-PR-#237 state: 1. Setup prompt's marketplace block trailer (both has-stack and empty-stack variants) claimed the SessionStart hook keeps the marketplace clone in sync via `agnes refresh-marketplace --quiet` on every session and that admin grants land automatically — both false since PR #237 (0.47.x) moved the install/update path out of the hook into the `/update-agnes-plugins` slash command. The hook is `--check`-only: detects server-side changes, prompts the user to run the slash command, which does the full reconcile interactively with output visible in the transcript. 2. The empty-stack variant framed composition as "admin grants only", missing the actual three-source served stack: (admin RBAC ∩ /marketplace subscriptions) ∪ system-mandatory plugins (admin-pinned, auto-applied) ∪ Flea market installs (skills/agents bundled, plugins standalone) Updated copy spells out all three sources so analysts know where their stack picks live, and what the SessionStart hook actually does on change detection. 3. CLAUDE.md template's "Agnes Marketplace" section conflated eligibility (`resolve_allowed_plugins` — what's listed) with served stack (`resolve_user_marketplace` — what actually reaches Claude Code). The two are different: a user can be RBAC-eligible for a plugin without having subscribed to it on /marketplace. Rewrote the section to distinguish the eligibility set from the served stack and to describe the `--check`-only hook accurately. Plus: deleted the setup prompt's interactive Skills step (final step before Confirm). The named-opinion question — "do you want me to bulk-copy every skill into ~/.claude/skills/agnes/ or pull on-demand via `agnes skills show <name>`?" — had no obvious right answer for new users at the tail end of a wall of technical steps. On-demand lookup is the one-size-fits-all default; `agnes skills list/show` remain discoverable and the CLAUDE.md template references specific skills inline (e.g. agnes-data-querying in the BigQuery section) where they're relevant. Layout: Confirm shifts from step 9 to step 8. Tests updated, full setup/marketplace/welcome surface green (115 passed). Remaining full-suite failures are pre-existing (BQ/Keboola fixtures, Windows charmap collection error in test_v26_keboola_e2e) — verified against a clean stash, unrelated to this diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Fix session-queue race + snapshot PID-reuse data loss Two blocker fixes from the PR #242 review: 1. Concurrent SessionStart hooks could corrupt the queue file on Windows. Python's `open(path, "a")` is not atomic there — the CRT does not pass FILE_APPEND_DATA to CreateFile, so concurrent appenders (user opening several Claude Code windows simultaneously) could interleave bytes mid-line. The malformed lines then silently fail the parser and the entries are dropped. Fix: wrap append_to_queue, requeue_failed, and snapshot_queue in a short-lived FileLock on a dedicated `agnes-queue.lock`. Separate from `agnes-push.lock` so capture-session hooks don't block on the push command. New test_append_concurrent_threads_no_corruption reproduces the race with 4 threads x 50 appends. 2. Snapshot filenames embedded only the PID (`agnes-sessions.snapshot. <PID>.txt`). After a crashed push left a snapshot on disk and the OS recycled the PID for a new push, `os.rename` would atomically overwrite the recovery snapshot — every entry in it lost, silently. Fix: append a uuid8 hex tail (`agnes-sessions.snapshot.<PID>. <uuid8>.txt`). find_recovery_snapshots already globs the prefix so it picks up both old and new format. New test_snapshot_filename_is_unique_per_call asserts two consecutive snapshots under the same PID don't collide. Targeted tests green (47/47 in session_queue/capture_session/cli_push). Full suite failures unchanged from baseline (pre-existing BQ/Keboola fixture issues per CLAUDE.md). * Auto-refresh workspace hooks + bash-wrap all hook entries (Windows) Fixes from PR #242 second review (ZdenekSrotyr): 1. `uv.lock` regenerated to include `filelock 3.29.0` (declared in pyproject.toml but missing from the lock file — CI's lockfile-consistency check would fail; `uv pip install` on a clean cache would silently miss the dep). 2. `agnes self-upgrade` now auto-refreshes the workspace Claude Code hooks via the new `cli.lib.hooks.maybe_refresh_claude_hooks`. Closes the silent-stop migration gap: a v0.48 workspace would auto-upgrade the CLI from its existing SessionStart self-upgrade entry but never pick up the new `agnes capture-session` SessionStart hook, leaving the queue empty and `agnes push` uploading nothing. The refresh fires on both the "info is None" fast path (CLI already current — catches the second SessionStart after a prior upgrade) and the install-success path. Guarded by `workspace_has_agnes_hooks` so it never writes `.claude/settings.json` into directories that aren't Agnes workspaces (e.g. `agnes self-upgrade` invoked from `~/`). Errors are surfaced on stderr but never flip the upgrade exit code. 3. All Agnes-managed hooks are now wrapped in `bash -c "..."`. The self-upgrade+pull chained SessionStart entry was the only one still shipping unwrapped — Claude Code on Windows runs hook commands directly without a shell, so the `;` chain + `2>/dev/null` + `|| true` shell syntax silently no-op'd on native Windows installs without Git Bash on PATH. Workspaces still on the old form auto-upgrade via the refresh path above. Tests: +12 in test_lib_hooks.py (guard semantics, v0.48→v0.49 migration end-to-end, third-party-hook preservation, bash-wrap invariant). +5 in test_self_upgrade.py (refresh fires on info=None, fires on install success, skipped on failure, skipped on --check-only, refresh failure never flips exit code). 130 targeted tests green. The 2 pre-existing Windows path-separator failures in `test_smoke_test_detects_version_mismatch[uv|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes` in test asserts, pre-PR baseline). * CHANGELOG: document PR-242 main features Closes ZdenekSrotyr #4: the [Unreleased] block was missing entries for the PR's primary surface — only the post-merge fix bullets and the unrelated setup-prompt copy change were captured. Adds: - ### Added: 6 bullets covering the session capture queue + new `agnes capture-session` subcommand, `/agnes-private` slash + `agnes mark-private`, `agnes statusline` + statusLine wiring, `--legacy-scan` opt-in fallback, single-instance push lock, and the new `filelock` runtime dep. - ### Changed: BREAKING bullet on the SessionStart / SessionEnd hook wire format change (capture-session as first SessionStart entry, push self-heal removed, SessionEnd push detached via nohup, all entries bash-wrapped). Folds the prior standalone bash-wrap bullet into this consolidated entry — Z's review flagged the layout shift as BREAKING, and grouping the related sub-changes makes the migration story readable in one place. - Operator migration is auto-handled by `maybe_refresh_claude_hooks` invoked from `agnes self-upgrade` (separate Changed entry below). No `agnes init` re-run required. Pre-queue session jsonls on upgrading workspaces still need a one-off `agnes push --legacy-scan` — flagged in the BREAKING bullet. No code change; doc only. * Drop permanent 4xx uploads instead of requeueing forever Closes ZdenekSrotyr #5. Previously the push retry path requeued any non-200 response except the literal "file not found on disk", so 401 (token expired), 403 (RBAC denial), 413 (payload too large), 400 (server-side validation) cycled through every push run forever — the queue grew without bound and each run re-bombarded the server with the same deterministically-failing upload. Now 4xx (except 408 Request Timeout + 429 Too Many Requests, which the HTTP spec marks as transient) is dropped and audit-logged to `<workspace>/.claude/agnes-sessions-failed.txt`: <iso_ts>\t<session_id>\t<status>\t<transcript_path> 5xx and network errors continue to requeue — those reflect server / transport state that can change between runs, so retry is the right behavior. The audit log piggybacks on the push single-instance lock (agnes-push.lock) — push is the only writer to this file, same as the existing `mark_uploaded` and `mark_private_skipped` paths, so no separate filelock is needed. `agnes push --json` surfaces a new `dropped_permanent` counter; non- quiet stdout mentions the audit-log path so operators tailing the output have a pointer to the forensic trail. Tests: +7 in test_cli_push.py (401/400/403/413 → drop; 408/429 → requeue; 500/502/503 → requeue; network exception → requeue; --json `dropped_permanent` counter; stdout audit-log pointer). +1 in test_session_queue.py (mark_failed_permanent TSV format). 127/129 targeted tests green. The 2 pre-existing Windows path-separator failures in `test_smoke_test_detects_version_mismatch [uv|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes` in test asserts, pre-PR baseline). * Catch OSError in push lock acquisition Closes ZdenekSrotyr #8. `acquire_or_skip` in `cli/lib/push_lock.py` previously caught only `filelock.Timeout`. Any `OSError` from `FileLock.acquire` — read-only filesystem, permission denied on `.claude/`, disk full, hardware I/O failure — propagated as an unhandled traceback. Two visible failure modes: - SessionEnd hook: `|| true` in the wrapper swallowed the error, so daily pushes silently never ran. Operator had no signal. - Manual `agnes push`: ugly Python traceback dumped to the terminal instead of a clean exit. Now `OSError` is treated the same as `Timeout` — yield `None`, caller returns cleanly with rc=0. The operator's environment in these scenarios has bigger problems than missing session uploads, so we swallow rather than retry-loop or surface a noisy warning. Test: `test_push_silent_exit_when_filelock_raises_oserror` patches the `FileLock` used inside `push_lock` to raise OSError on acquire, verifies push exits 0 with no traceback and the queue is preserved for the next attempt. * Address remaining S2 items from PR-242 review Four items from ZdenekSrotyr's S2 list: S2.10 — `_install_statusline` truthy check (cli/lib/hooks.py): replace `if existing:` with explicit `if existing is None or existing == "":`. Documents and tests the behavior for both edge cases (explicit-null and empty-string `statusLine`) — both treated as "not configured" rather than "explicit user opt-out", so we install ours. Two new tests in test_lib_hooks.py pin the contract. S2.6 — onboarding docs for /agnes-private. New "Private sessions" subsection in `config/claude_md_template.txt` (next to Data Sync) covering the slash command, statusbar indicator, and audit-log location. One-line tip in `app/web/setup_instructions.py` so the feature is discoverable at onboarding. S2.9 — e2e privacy test (tests/test_e2e_privacy.py). Wires capture_session → mark_private → push against a recording fake api_post and asserts zero session uploads for the marked one. Three cases: mark-before-capture (queue write skipped), mark-after-capture (push-side filter catches it + audit-logs), control (unmarked sessions upload normally). David #8 — `--legacy-scan` help text now documents the private-list gap (legacy entries carry empty session_id, so the filter is not consulted). The practical impact is bounded — pre-queue sessions cannot have been marked private since the private list is a queue-era feature — but the disclaimer in the help text means an operator running a backfill is not surprised. 68 targeted tests green (3 new e2e + 2 new truthy edge tests + existing). 2 pre-existing Windows path-separator failures in test_smoke_test_detects_version_mismatch[uv|pip] unchanged. Remaining S2 items (statusline mkdir push-back, capture-session silent-fail follow-up) handled in PR comment + follow-up issue respectively. * Address remaining S2 follow-ups (David #8, S2.7, David #11) Three items left over from Mina's bbf63472 batch — that commit addressed S2.6/S2.9/S2.10 + documented David #8 in help text but deferred the actual implementations of S2.7, David #11, and the real David #8 fix to follow-ups. This commit closes them. David #8 — `agnes push --legacy-scan` now consults the private list. Claude Code names jsonls `<session-id>.jsonl`, so the file stem IS the session id; the legacy-scan path can apply the same private filter the queue path uses. Both the dry-run and live-upload code paths fixed. Help text updated (no longer warns the filter is bypassed). Two new tests in test_cli_push.py cover the upload-skip path + the dry-run `would_skip_private` segregation. S2.7 — `statusline`/`is_private` no longer mkdir-pollutes arbitrary workdirs. Split `_claude_dir` into `_claude_dir_writable` (used only from `add_private`) and `_claude_dir_readonly` (no mkdir). The read-only public helpers (`private_list_path`, `read_all_private`, `is_private`) compose the no-mkdir variant by default; `add_private` opts in via `writable=True`. Added a process-local mtime-keyed cache around `read_all_private` so in-process callers (push doing one stat per upload candidate, future `agnes diagnose`) don't re-parse the file on every check. Cache eviction on `add_private` so a sub-second write+read sequence doesn't see stale data even on coarse-mtime filesystems. Two new tests pin the no-mkdir contract + the in-same-second add+read consistency. David #11 — `agnes capture-session` writes a breadcrumb log on every invocation. New `<workspace>/.claude/agnes-capture-session.log` TSV: `<iso_ts>\t<outcome>\t<detail>` where outcome covers every silent- exit path (`ok`, `private_skip`, `empty_stdin`, `bad_json`, `not_object`, `no_transcript_path`, `stdin_read_error`, `write_error`). Gives operators a signal to detect "hook fires but queue stays empty" — without it, an upstream Claude Code stdin- contract change is invisible because the hook always exits 0. Log rolls at 256 KiB so it doesn't grow unbounded on long-lived workspaces. Best-effort: a breadcrumb-write failure is itself swallowed so the hook contract stays "exit 0 always". Skipped in non-Agnes workdirs (no `.claude/` exists) so opening Claude Code in `~/` doesn't pollute it. Five new tests in test_capture_session.py cover the success / bad_json / no_transcript_path / private_skip / no-pollute paths. 115 targeted tests green (test_cli_push, test_capture_session, test_private_list, test_session_queue, test_e2e_privacy, test_lib_hooks, test_statusline, test_mark_private). --------- Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
This commit is contained in:
parent
9de679c714
commit
19c5a7592a
29 changed files with 3566 additions and 320 deletions
240
CHANGELOG.md
240
CHANGELOG.md
|
|
@ -10,6 +10,246 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
|
|||
|
||||
## [Unreleased]
|
||||
|
||||
### Fixed (PR #242 follow-ups)
|
||||
|
||||
- **`/agnes-private` legacy-scan gap closed (David #8 from PR review).**
|
||||
`agnes push --legacy-scan` now consults the private list using the
|
||||
jsonl file stem as the session id (Claude Code names them
|
||||
`<session-id>.jsonl`). Previously legacy-scan entries carried an
|
||||
empty session_id, so `--legacy-scan` would upload every transcript
|
||||
on disk regardless of whether the user later marked it private.
|
||||
- **`statusline`/`is_private` no longer mkdir-pollutes arbitrary
|
||||
workdirs (S2.7 from PR review).** Read paths now use a side-effect-
|
||||
free helper that returns the `.claude/` path WITHOUT creating it;
|
||||
only `add_private` materializes the dir. Adds a process-local
|
||||
mtime-keyed cache around `read_all_private` so in-process callers
|
||||
(push doing one stat per upload candidate, `agnes diagnose`
|
||||
scanning workspaces) don't re-parse the file every time.
|
||||
- **`agnes capture-session` writes an operability breadcrumb log
|
||||
(David #11 from PR review).** Every invocation appends one TSV
|
||||
line to `<workspace>/.claude/agnes-capture-session.log` with the
|
||||
outcome (`ok`, `private_skip`, `bad_json`, `no_transcript_path`,
|
||||
…). Gives operators a signal to detect "hook fires but queue stays
|
||||
empty" — without it, an upstream Claude Code stdin-contract change
|
||||
is invisible because the hook always exits 0. Log rolls at 256 KiB.
|
||||
Best-effort: a breadcrumb-write failure is swallowed so the hook
|
||||
contract stays "exit 0 always". Skipped in non-Agnes workdirs (no
|
||||
`.claude/`) so opening Claude Code in `~/` doesn't pollute it.
|
||||
|
||||
### Added
|
||||
|
||||
- **Session capture queue + new `agnes capture-session` SessionStart
|
||||
subcommand.** Replaces the previous encoding-based scan of
|
||||
`~/.claude/projects/` for session jsonls (which depended on Claude
|
||||
Code's cwd-to-folder encoding — a moving target across versions).
|
||||
The hook reads Claude Code's documented stdin JSON
|
||||
(`transcript_path`) and appends `<session_id>\t<transcript_path>` to
|
||||
`<workspace>/.claude/agnes-sessions.txt`. `agnes push` then atomically
|
||||
renames that queue to a snapshot, processes it, and re-queues failed
|
||||
uploads. Recovery snapshots from a crashed push are picked up on the
|
||||
next run. Concurrent SessionStart hooks (multiple Claude Code windows
|
||||
opening at once) are serialized by a short-lived `agnes-queue.lock`
|
||||
so the queue is race-free on every OS.
|
||||
|
||||
- **`/agnes-private` slash command + `agnes mark-private` subcommand.**
|
||||
Mark the current Claude Code session as private — its transcript is
|
||||
skipped by `agnes push` and audit-logged to
|
||||
`<workspace>/.claude/agnes-sessions-private-skipped.txt` instead.
|
||||
The slash command runs deterministically via `!`-prefix bash (no AI
|
||||
in the loop). State lives in
|
||||
`<workspace>/.claude/agnes-sessions-private.txt` (one session_id per
|
||||
line) and is the authoritative source — both `capture-session` and
|
||||
`push` consult it, so the slash-command-before-capture and
|
||||
capture-before-slash-command races both resolve safely without an
|
||||
ordering dependency. Requires the `CLAUDE_CODE_SESSION_ID`
|
||||
environment variable that Claude Code sets in every bash subprocess
|
||||
it spawns; `agnes mark-private` exits 1 if missing (defends against
|
||||
accidental invocations from a regular terminal).
|
||||
|
||||
- **`agnes statusline` subcommand + statusLine wiring.** Renders
|
||||
`🔒 agnes-private` in the Claude Code status bar when the current
|
||||
session is marked private; empty string otherwise. `agnes init` wires
|
||||
it to Claude Code's `statusLine` setting. Polite to existing
|
||||
customizations — if the workspace `settings.json` already has a
|
||||
`statusLine`, the install preserves it untouched and emits a
|
||||
one-line stderr warning instructing the operator how to compose
|
||||
`agnes statusline` into their own command.
|
||||
|
||||
- **`agnes push --legacy-scan` opt-in fallback** scans
|
||||
`~/.claude/projects/` via the pre-queue encoding-based path. Use
|
||||
for one-off backfill of session jsonls that pre-date the queue
|
||||
mechanism on workspaces upgrading from < v0.49. Note: legacy-scanned
|
||||
entries have empty `session_id`, so the `/agnes-private` list filter
|
||||
never matches — backfill uploads bypass the private list. Document
|
||||
this gap before running a backfill on a workspace that has
|
||||
previously-marked-private sessions in the encoding-based location.
|
||||
|
||||
- **Single-instance lock for `agnes push`** (cross-platform via
|
||||
`filelock`: `fcntl.flock` on POSIX, `msvcrt.locking` on Windows).
|
||||
When the user closes several Claude Code sessions simultaneously,
|
||||
every SessionEnd hook fires its own `agnes push` — exactly one
|
||||
acquires `<workspace>/.claude/agnes-push.lock` and runs, the rest
|
||||
silent-exit. Prevents concurrent uploads from each other's queues
|
||||
and matches the existing `bash -c "( nohup ... & ) ; true"`
|
||||
SessionEnd wrapping (push must survive Claude Code's ~1s SIGTERM
|
||||
in `-p` headless mode).
|
||||
|
||||
- **New `filelock>=3.13,<4` runtime dependency.** Backs both the
|
||||
push single-instance lock and the queue-write serialization above.
|
||||
|
||||
### Changed
|
||||
|
||||
- **BREAKING: SessionStart / SessionEnd hook wire format.**
|
||||
`agnes init` (and the new `agnes self-upgrade` auto-refresh path)
|
||||
write a different hook layout than v0.48:
|
||||
- SessionStart gains `agnes capture-session` as the very first
|
||||
entry — feeds the new session-capture queue that powers
|
||||
`agnes push`. Must run before any other SessionStart hook so the
|
||||
`transcript_path` is captured even if a later hook fails.
|
||||
- SessionStart's previous `agnes push` self-heal entry is removed
|
||||
— the queue persists across runs so orphan jsonls from headless /
|
||||
crashed sessions ship out on the next SessionEnd push naturally.
|
||||
Workspaces upgrading from < v0.49 with sessions that pre-date the
|
||||
queue mechanism need a one-off `agnes push --legacy-scan` to
|
||||
backfill them; see `--legacy-scan` entry above.
|
||||
- SessionEnd `agnes push` is wrapped in a `nohup` subshell so the
|
||||
upload survives Claude Code's `-p` headless SIGTERM (~1s after
|
||||
hook fires) and completes the full upload cycle. The synchronous
|
||||
form would lose 5-30s of uploads to the kill.
|
||||
- All entries are wrapped in `bash -c "..."` for Windows
|
||||
compatibility — Claude Code on Windows runs hook commands directly
|
||||
without a shell, so any `;` chain / `2>/dev/null` redirection /
|
||||
`|| true` short-circuit silently no-op'd previously.
|
||||
|
||||
Existing workspaces auto-migrate to the new layout on the next
|
||||
session-start via `maybe_refresh_claude_hooks` invoked from
|
||||
`agnes self-upgrade` (see separate Changed entry). No operator
|
||||
action required.
|
||||
|
||||
- **`agnes self-upgrade` now auto-refreshes the workspace Claude Code
|
||||
hooks** so an existing Agnes workspace picks up the new SessionStart /
|
||||
SessionEnd layout the moment its CLI is upgraded — no need to re-run
|
||||
`agnes init` after a release. Without this, an existing v0.48
|
||||
workspace would auto-upgrade the CLI via its own SessionStart
|
||||
self-upgrade entry, but the new `agnes capture-session` hook (added
|
||||
in this release) would never get installed, the queue would stay
|
||||
empty, and `agnes push` would silently stop uploading sessions. The
|
||||
refresh fires on both the "info is None" fast path (CLI already
|
||||
current — handles the second SessionStart after a prior upgrade) and
|
||||
after a successful install. Guarded by
|
||||
`cli.lib.hooks.workspace_has_agnes_hooks` so it never writes
|
||||
`.claude/settings.json` into directories that aren't Agnes workspaces
|
||||
(e.g. `agnes self-upgrade` from `~/`). Failures are best-effort —
|
||||
they're surfaced on stderr but never flip the upgrade exit code.
|
||||
|
||||
### Added
|
||||
|
||||
- **Onboarding docs for the `/agnes-private` privacy feature.**
|
||||
`config/claude_md_template.txt` gains a short "Private sessions"
|
||||
subsection (next to "Data Sync") covering the slash command,
|
||||
statusbar indicator, and audit-log location. The web-served setup
|
||||
prompt (`app/web/setup_instructions.py`) gets a one-line mention so
|
||||
analysts learn the feature exists at onboarding instead of by
|
||||
accident.
|
||||
|
||||
### Changed
|
||||
|
||||
- **`_install_statusline` distinguishes explicit `null` / empty-string
|
||||
`statusLine` from absent key.** Previously the `if existing:` truthy
|
||||
check silently took the same path for all three cases. The new
|
||||
`existing is None or existing == ""` branch documents and tests the
|
||||
behavior (install ours — treated as "not configured" rather than
|
||||
"explicit user opt-out"). Two new tests pin both edge cases.
|
||||
|
||||
### Fixed
|
||||
|
||||
- **`agnes push --legacy-scan` help text documents the private-list
|
||||
gap.** Legacy-scan entries carry an empty `session_id`, so the
|
||||
`/agnes-private` filter is not consulted. The practical impact is
|
||||
bounded — pre-queue sessions cannot have been marked private (the
|
||||
private list is a queue-era feature) — but the help text now spells
|
||||
out the gap so an operator running a backfill is not surprised.
|
||||
|
||||
- **`agnes push` no longer crashes on filesystem errors when acquiring
|
||||
the single-instance lock.** `acquire_or_skip` in
|
||||
`cli/lib/push_lock.py` now treats `OSError` (read-only filesystem,
|
||||
permission denied on `.claude/`, disk full, hardware I/O failure) the
|
||||
same as `filelock.Timeout` — yields `None`, push exits cleanly.
|
||||
Previously the `OSError` propagated as an unhandled traceback;
|
||||
invisible in the SessionEnd hook context (the `|| true` wrapper
|
||||
swallowed it), but ugly in a manual `agnes push` invocation.
|
||||
|
||||
- **`agnes push` no longer infinite-loops on permanent 4xx failures.**
|
||||
Previously any non-200 response except the literal `file not found
|
||||
on disk` was re-queued, so 401 (token expired), 403 (RBAC denial),
|
||||
413 (payload too large), 400 (server-side validation error) cycled
|
||||
through every push run forever — the queue grew without bound and
|
||||
each run re-bombarded the server with the same failing upload.
|
||||
4xx (except 408 Request Timeout + 429 Too Many Requests, which the
|
||||
HTTP spec marks as transient) is now dropped + audit-logged to
|
||||
`<workspace>/.claude/agnes-sessions-failed.txt` instead (TSV:
|
||||
`<iso_ts>\t<session_id>\t<status>\t<transcript_path>`). 5xx and
|
||||
network errors continue to re-queue (genuinely transient — server
|
||||
or transport state can change between runs). `agnes push --json`
|
||||
surfaces a new `dropped_permanent` counter; non-quiet stdout
|
||||
mentions the audit-log path so operators tailing the output have a
|
||||
pointer to the forensic trail.
|
||||
|
||||
- **Session capture queue: concurrent SessionStart hooks no longer
|
||||
corrupt the queue file on Windows.** `append_to_queue`,
|
||||
`requeue_failed`, and `snapshot_queue` in `cli/lib/session_queue.py`
|
||||
now hold a short-lived `agnes-queue.lock` (filelock) while writing.
|
||||
Previously the code assumed Python's `open(path, "a")` is atomic on
|
||||
NTFS for small writes; it isn't — the Windows CRT does not pass
|
||||
`FILE_APPEND_DATA` to `CreateFile`, so concurrent appenders (e.g.
|
||||
user opens several Claude Code windows simultaneously) could
|
||||
interleave bytes mid-line and the parser would silently drop the
|
||||
malformed entries. The lock is separate from `agnes-push.lock` —
|
||||
capture-session hooks don't block on the push command.
|
||||
- **Session capture queue: snapshot filenames now include a uuid8 tail
|
||||
so a recycled OS PID cannot silently overwrite a recovery snapshot
|
||||
left behind by a crashed push.** `snapshot_queue` previously named
|
||||
files `agnes-sessions.snapshot.<PID>.txt`; after a crash + PID reuse
|
||||
(Linux default `kernel.pid_max=32768`), `os.rename` atomically
|
||||
replaces the recovery file with the new snapshot, losing every entry
|
||||
in it. New format: `agnes-sessions.snapshot.<PID>.<uuid8>.txt`;
|
||||
`find_recovery_snapshots` already uses a glob so the change is
|
||||
backward-compatible with snapshots written by older CLI versions.
|
||||
|
||||
### Changed
|
||||
|
||||
- **Setup prompt + CLAUDE.md template: marketplace copy now reflects the
|
||||
actual three-source served stack composition + `--check`-only
|
||||
SessionStart hook.** Previous text (shipped in 0.48.0 / PR #240) said
|
||||
the SessionStart hook keeps the marketplace clone in sync via
|
||||
`agnes refresh-marketplace --quiet` on every session, and that admin
|
||||
grants land automatically without re-running setup — both false since
|
||||
PR #237 (0.47.x) moved the install/update path out of the hook into
|
||||
the `/update-agnes-plugins` slash command. The hook is `--check`-only:
|
||||
it detects server-side changes and prompts the user to run the slash
|
||||
command, which does the full reconcile interactively with output
|
||||
visible in the transcript. Updated copy spells out the real
|
||||
composition of the served stack — `(admin RBAC ∩ /marketplace
|
||||
subscriptions) ∪ system-mandatory plugins ∪ Flea market installs` —
|
||||
rather than the admin-grants-only framing the previous copy implied.
|
||||
Affects: `app/web/setup_instructions.py:_marketplace_block` (both
|
||||
trailer variants) and `config/claude_md_template.txt` (Agnes
|
||||
Marketplace section).
|
||||
|
||||
### Removed
|
||||
|
||||
- **Setup prompt's interactive Skills step deleted.** The final step
|
||||
before Confirm used to ask the user verbatim whether to bulk-copy
|
||||
every `agnes skills` markdown file into `~/.claude/skills/agnes/` or
|
||||
pull them on-demand via `agnes skills show <name>`. The named-opinion
|
||||
question with no obvious right answer was confusing for new users at
|
||||
the tail end of a wall of technical steps. On-demand lookup via
|
||||
`agnes skills show <name>` is the one-size-fits-all default — the
|
||||
CLI knowledge base remains discoverable through `agnes skills list`
|
||||
and the CLAUDE.md template references specific skills (e.g.
|
||||
`agnes-data-querying`) inline where they're relevant. Layout: Confirm
|
||||
shifts from step 9 to step 8 across all variants.
|
||||
|
||||
## [0.48.0] — 2026-05-10
|
||||
|
||||
### Fixed
|
||||
|
|
|
|||
|
|
@ -91,26 +91,22 @@ The numbered steps are arranged so that:
|
|||
CLAUDE.md fetch, and Claude Code SessionStart/End hooks into one
|
||||
non-interactive call. Replaces the old `agnes auth import-token` +
|
||||
`agnes auth whoami` pair.
|
||||
- The interactive question (skills copy vs on-demand) is the LAST step
|
||||
before Confirm — by that point everything else is done, the user only
|
||||
needs to decide one thing, and the assistant blocks on their answer.
|
||||
- `agnes diagnose` runs late so it doubles as a final smoke test after
|
||||
plugins are in place, instead of gating them.
|
||||
plugins are in place, instead of gating them. It is also the last
|
||||
step before Confirm — the whole prompt is non-interactive, no
|
||||
decision questions for the user.
|
||||
|
||||
Layout (with marketplace plugins to install):
|
||||
Layout:
|
||||
0 TLS trust block (only when ca_pem is supplied)
|
||||
1 Install CLI
|
||||
2 agnes init (auth + workspace bootstrap)
|
||||
3 agnes catalog (smoke verify)
|
||||
4 Pre-flight: git + claude
|
||||
5 Marketplace + plugins
|
||||
6 Diagnose
|
||||
7 Skills (interactive — assistant waits for user)
|
||||
5 Marketplace (always, even with empty served stack)
|
||||
6 MCP servers (Atlassian Remote MCP)
|
||||
7 Diagnose
|
||||
8 Confirm
|
||||
|
||||
Layout (no plugins): steps 4-5 collapse out, diagnose/skills/confirm
|
||||
renumber to 4-5-6.
|
||||
|
||||
The combined-bundle source uses a fallback chain so the prompt still works
|
||||
on machines without the system Python `certifi`: we try (a) `python3 -c
|
||||
'import certifi'`, (b) the platform's curl/openssl bundle path, (c)
|
||||
|
|
@ -358,23 +354,31 @@ def _init_lines(server_url_placeholder: str = "{server_url}") -> list[str]:
|
|||
"",
|
||||
" This should list the tables your account has grants for. Empty list",
|
||||
" means your admin hasn't granted you access yet — contact them.",
|
||||
"",
|
||||
" Tip: type `/agnes-private` inside any Claude Code session to mark it",
|
||||
" private — its transcript is skipped by `agnes push` (audit-logged to",
|
||||
" `.claude/agnes-sessions-private-skipped.txt`). The statusbar shows",
|
||||
" `🔒 agnes-private` while you're in a private session.",
|
||||
]
|
||||
|
||||
|
||||
def _diagnose_skills_lines(*, diagnose_num: str, skills_num: str) -> list[str]:
|
||||
"""Diagnose + skills steps — moved AFTER the marketplace block.
|
||||
def _diagnose_lines(*, diagnose_num: str) -> list[str]:
|
||||
"""Diagnose step — runs AFTER the marketplace + MCP blocks.
|
||||
|
||||
Putting these last (instead of right after `whoami`) means: by the time
|
||||
we ask the user the skills question, all installation work is finished —
|
||||
the only thing the prompt is still waiting on is one human-loop answer.
|
||||
`agnes diagnose` then doubles as a server-health smoke test that runs after
|
||||
plugins are in place, not as a gate before them. With the new ordering
|
||||
skills is the LAST step before Confirm, so the assistant must wait for
|
||||
the user's answer before finalizing — there's no "run other steps in
|
||||
parallel" affordance any more (and it isn't needed).
|
||||
Putting it last (instead of right after `whoami`) means it doubles as
|
||||
a server-health smoke test that runs once everything else is in place,
|
||||
not as a gate before them. It is the last step before Confirm — the
|
||||
whole prompt is non-interactive.
|
||||
|
||||
Step numbers are filled in by the caller because they shift between
|
||||
the no-marketplace layout (4, 5) and the marketplace layout (6, 7).
|
||||
The bundled `agnes skills` knowledge base (markdown documents listable
|
||||
via `agnes skills list` / readable via `agnes skills show <name>`) is
|
||||
no longer surfaced from this prompt: discovery happens organically
|
||||
when CLAUDE.md or another skill references a specific entry (see the
|
||||
`agnes skills show agnes-data-querying` mention in the CLAUDE.md
|
||||
template's BigQuery section). Bulk-copying every skill into
|
||||
`~/.claude/skills/agnes/` at setup time was an interactive opinion
|
||||
question with no obvious right answer; on-demand lookup is the
|
||||
one-size-fits-all default.
|
||||
"""
|
||||
return [
|
||||
"",
|
||||
|
|
@ -387,25 +391,6 @@ def _diagnose_skills_lines(*, diagnose_num: str, skills_num: str) -> list[str]:
|
|||
" - non-admin roles (e.g. `analyst`) that don't have grants to read",
|
||||
" the system schema even on populated instances.",
|
||||
" Only flag actual yellow/red checks (api / duckdb_state / users).",
|
||||
"",
|
||||
f"{skills_num}) Skills (ask the user — this is the last interactive step before Confirm):",
|
||||
" The CLI ships with reusable markdown skills (setup, connectors,",
|
||||
" corporate-memory, deploy, notifications, security, troubleshoot),",
|
||||
" listable via `agnes skills list` and readable via `agnes skills show <name>`.",
|
||||
"",
|
||||
" Ask the user verbatim: \"Do you want me to copy the Agnes skills into",
|
||||
" ~/.claude/skills/agnes/ so they are always loaded in Claude Code,",
|
||||
" or should I pull them on-demand via `agnes skills show <name>` when",
|
||||
" needed?\"",
|
||||
"",
|
||||
" Wait for the user's answer before moving to Confirm.",
|
||||
"",
|
||||
" If they say copy:",
|
||||
" mkdir -p ~/.claude/skills/agnes",
|
||||
" for s in $(agnes skills list | awk '{print $1}'); do",
|
||||
" agnes skills show \"$s\" > ~/.claude/skills/agnes/\"$s\".md",
|
||||
" done",
|
||||
" echo \"Copied skills to ~/.claude/skills/agnes/\"",
|
||||
]
|
||||
|
||||
|
||||
|
|
@ -417,16 +402,16 @@ def _finale_lines(*, confirm_step_num: str, has_ca: bool) -> list[str]:
|
|||
the trust block ran (`has_ca`). The marketplace clone bullet is
|
||||
unconditional now — preflight + marketplace are always emitted (Fix B
|
||||
in the 2026-05-10 init-report response). Init + catalog + diagnose +
|
||||
skills + version always render, so their bullets are unconditional."""
|
||||
version always render, so their bullets are unconditional."""
|
||||
bullets = [
|
||||
" - `agnes --version` output",
|
||||
" - First few lines of `agnes catalog` (tables you can see)",
|
||||
" - Confirmation that `./CLAUDE.md` and `./AGNES_WORKSPACE.md` exist",
|
||||
" - Confirmation that `./.claude/settings.json` contains SessionStart/End hooks",
|
||||
" - The `agnes diagnose` overall status",
|
||||
" - Whether skills were copied or left on-demand",
|
||||
" - Confirmation that `~/.agnes/marketplace/.git/` exists "
|
||||
"(the marketplace clone) and that any granted plugins installed",
|
||||
"(the marketplace clone) and that any plugins currently in the "
|
||||
"served stack installed cleanly",
|
||||
" - Reminder to scroll to the connector cards on /home and connect "
|
||||
"Asana / Google Workspace / Atlassian (those run separately from this script)",
|
||||
]
|
||||
|
|
@ -493,13 +478,26 @@ def _marketplace_block(
|
|||
) -> list[str]:
|
||||
"""Build the marketplace + plugin-install block.
|
||||
|
||||
`plugin_install_names` may be empty: registering the per-user
|
||||
marketplace clone with Claude Code is useful even when the operator
|
||||
has zero plugin grants, because it pre-wires the SessionStart hook
|
||||
and the grant flow — admin grants land on the next Claude Code
|
||||
session without re-running setup. The block copy adapts for the
|
||||
empty case so the comment-bullet doesn't promise plugin installs
|
||||
that won't happen.
|
||||
`plugin_install_names` is the user's current *served stack* as
|
||||
computed by `src/marketplace_filter.py:resolve_user_marketplace` —
|
||||
i.e.::
|
||||
|
||||
(admin_RBAC ∩ /marketplace subscriptions)
|
||||
∪ system-mandatory plugins (admin-pinned, auto-applied)
|
||||
∪ Flea market installs (skills/agents bundled, plugins standalone)
|
||||
|
||||
May be empty: the served stack is curated by the user on the
|
||||
`/marketplace` page (admin grants are eligibility only — the user
|
||||
opts in via "Add to stack") plus whatever the admin pinned as
|
||||
system-mandatory plus the user's own Flea market picks. A brand-new
|
||||
account with no system plugins and no curation has an empty stack
|
||||
until something lands in any of those three buckets.
|
||||
|
||||
Registering the marketplace clone is unconditional regardless —
|
||||
Claude Code learns about the `agnes` marketplace at bootstrap, and
|
||||
the moment the served stack becomes non-empty, the user's next
|
||||
`/update-agnes-plugins` run installs the diff. No need to re-run
|
||||
setup when the stack changes server-side.
|
||||
|
||||
`step_num` is parameterized because step ordering shifted between
|
||||
layouts (this block now runs before diagnose/skills, so it's step 5
|
||||
|
|
@ -555,29 +553,41 @@ def _marketplace_block(
|
|||
"""
|
||||
has_plugins = bool(plugin_install_names)
|
||||
header = (
|
||||
"Register the Agnes Claude Code marketplace and install plugins:"
|
||||
"Register the Agnes Claude Code marketplace and install your current stack:"
|
||||
if has_plugins
|
||||
else "Register the Agnes Claude Code marketplace (no plugins granted yet):"
|
||||
else "Register the Agnes Claude Code marketplace (your stack is empty for now):"
|
||||
)
|
||||
bullet_5 = (
|
||||
" # 5. install every plugin listed in the served manifest"
|
||||
" # 5. install every plugin currently in your served stack"
|
||||
if has_plugins
|
||||
else " # 5. (no plugins to install — your account has zero grants)"
|
||||
else " # 5. (your served stack is empty right now — nothing to install yet)"
|
||||
)
|
||||
if has_plugins:
|
||||
trailer = [
|
||||
" These run non-interactively. After they finish, tell the user to /exit",
|
||||
" and run `claude` again so the new plugins load. From then on, the",
|
||||
" SessionStart hook keeps the marketplace clone in sync via",
|
||||
" `agnes refresh-marketplace --quiet` on every Claude Code session.",
|
||||
" and run `claude` again so the new plugins load.",
|
||||
"",
|
||||
" Stack curation lives on the server — visit /marketplace to add or",
|
||||
" remove items (admin-granted opt-ins, system plugins your org pinned,",
|
||||
" and uploads from the Flea market tab). The SessionStart hook checks",
|
||||
" for server-side changes on every Claude Code session and, when it",
|
||||
" detects a diff, prompts you to run `/update-agnes-plugins` inside",
|
||||
" Claude Code to apply it. No silent auto-install at session start —",
|
||||
" the slash command runs full reconcile with output visible in the",
|
||||
" transcript, under your control.",
|
||||
]
|
||||
else:
|
||||
trailer = [
|
||||
" Your account has no plugin grants right now, but registering the",
|
||||
" marketplace anyway pre-wires the SessionStart hook. When an admin",
|
||||
" grants you a plugin later, `agnes refresh-marketplace` (run by the",
|
||||
" hook on every Claude Code session) will install it automatically —",
|
||||
" no need to re-run this setup script.",
|
||||
" Your served stack is empty right now — nothing to install yet.",
|
||||
" Registering the marketplace clone anyway pre-wires Claude Code so",
|
||||
" future picks land cleanly: visit /marketplace to add plugins to",
|
||||
" your stack (admin-granted opt-ins, uploads from the Flea market",
|
||||
" tab), or wait for your admin to pin something as system-mandatory.",
|
||||
"",
|
||||
" When your stack becomes non-empty, the SessionStart hook detects",
|
||||
" the change on the next Claude Code session and prompts you to run",
|
||||
" `/update-agnes-plugins` inside Claude Code to install the new",
|
||||
" items. No need to re-run this setup script.",
|
||||
]
|
||||
return [
|
||||
"",
|
||||
|
|
@ -670,29 +680,24 @@ def _preamble_lines(*, has_ca: bool) -> list[str]:
|
|||
return lines
|
||||
|
||||
|
||||
def _step_numbers(*, has_skills: bool = True) -> dict[str, str]:
|
||||
def _step_numbers() -> dict[str, str]:
|
||||
"""Compute the step numbers for the unified layout.
|
||||
|
||||
Returns a dict keyed by logical step name; values are stringified
|
||||
1-based step numbers (preserving the existing string-based helper API
|
||||
so call sites stay diff-minimal).
|
||||
|
||||
Mandatory steps (always emitted): install (1), init (2), catalog (3),
|
||||
preflight (4), marketplace (5), mcp_servers (6), diagnose (7), skills
|
||||
(8), confirm (9). Preflight + marketplace + mcp_servers are all
|
||||
always-on:
|
||||
- Marketplace registration is useful even when the operator has
|
||||
zero plugin grants (SessionStart hook reconciles future grants
|
||||
automatically).
|
||||
Steps (always emitted): install (1), init (2), catalog (3),
|
||||
preflight (4), marketplace (5), mcp_servers (6), diagnose (7),
|
||||
confirm (8). Preflight + marketplace + mcp_servers are all always-on:
|
||||
- Marketplace registration is useful even with an empty served
|
||||
stack (future admin grants / system pins / Flea installs land
|
||||
cleanly without re-running setup).
|
||||
- Atlassian MCP registration is unattended-safe (hosted Remote MCP
|
||||
with Claude Code-managed OAuth) and applies to every analyst
|
||||
whose work touches Jira/Confluence — high enough hit rate to
|
||||
justify default-on.
|
||||
|
||||
`has_skills` is kept as a parameter for future flexibility; default
|
||||
True (the Resolved-Question section in the original plan settled on
|
||||
always-on).
|
||||
|
||||
Step-0 (TLS trust block) sits outside this numbering — it is gated by
|
||||
has_ca and has its own "0)" header rendered inside the trust block
|
||||
helper.
|
||||
|
|
@ -702,16 +707,12 @@ def _step_numbers(*, has_skills: bool = True) -> dict[str, str]:
|
|||
marketplace = str(n); n += 1
|
||||
mcp_servers = str(n); n += 1
|
||||
diagnose = str(n); n += 1
|
||||
skills = str(n) if has_skills else ""
|
||||
if has_skills:
|
||||
n += 1
|
||||
confirm = str(n)
|
||||
return {
|
||||
"preflight": preflight,
|
||||
"marketplace": marketplace,
|
||||
"mcp_servers": mcp_servers,
|
||||
"diagnose": diagnose,
|
||||
"skills": skills,
|
||||
"confirm": confirm,
|
||||
}
|
||||
|
||||
|
|
@ -729,10 +730,12 @@ def resolve_lines(
|
|||
`{server_url}` and `{token}` as placeholders for click-time JS
|
||||
substitution (or for `render_setup_instructions()` below).
|
||||
|
||||
When `plugin_install_names` is empty/None, the output matches the
|
||||
six-step no-marketplace layout (Confirm = step 6). When non-empty, a
|
||||
step-4 pre-flight + step-5 marketplace block are inserted and Confirm
|
||||
becomes step 8.
|
||||
The layout is the same regardless of `plugin_install_names`: install
|
||||
(1), init (2), catalog (3), preflight (4), marketplace (5),
|
||||
mcp_servers (6), diagnose (7), confirm (8). The marketplace block's
|
||||
copy adapts to an empty served stack but the step is always emitted
|
||||
so future stack changes (admin grants, system pins, Flea installs)
|
||||
land cleanly without re-running setup.
|
||||
|
||||
`ca_pem` (PEM-encoded fullchain of the Agnes server's TLS cert) gates
|
||||
the cross-platform step-0 trust-bootstrap block AND switches step 1 to
|
||||
|
|
@ -749,14 +752,10 @@ def resolve_lines(
|
|||
names = list(plugin_install_names or [])
|
||||
has_ca = bool(ca_pem and ca_pem.strip())
|
||||
|
||||
# Step layout. Preflight + marketplace go BEFORE diagnose/skills so the
|
||||
# human-loop skills question is the last step before Confirm. Both are
|
||||
# always emitted: the marketplace registration is useful even with zero
|
||||
# plugin grants (the SessionStart hook reconciles future admin grants
|
||||
# automatically without re-running setup). `_step_numbers` returns the
|
||||
# renumbered step labels in one place — no branch on every helper — so
|
||||
# the layout is unambiguous and trivially extendable.
|
||||
steps = _step_numbers(has_skills=True)
|
||||
# Step layout — single fixed shape; `_step_numbers` returns the
|
||||
# renumbered step labels in one place so the layout is unambiguous
|
||||
# and trivially extendable when a future step is added.
|
||||
steps = _step_numbers()
|
||||
|
||||
lines: list[str] = []
|
||||
if has_ca:
|
||||
|
|
@ -767,10 +766,9 @@ def resolve_lines(
|
|||
lines.extend(_preflight_block(steps["preflight"])) # 4
|
||||
lines.extend(_marketplace_block(names, step_num=steps["marketplace"])) # 5
|
||||
lines.extend(_mcp_servers_block(steps["mcp_servers"])) # 6
|
||||
# Diagnose + skills come AFTER marketplace + MCP wiring.
|
||||
lines.extend(_diagnose_skills_lines(
|
||||
diagnose_num=steps["diagnose"], skills_num=steps["skills"],
|
||||
))
|
||||
# Diagnose runs AFTER marketplace + MCP wiring so it doubles as a
|
||||
# final smoke test, not a pre-install gate.
|
||||
lines.extend(_diagnose_lines(diagnose_num=steps["diagnose"])) # 7
|
||||
lines.append("")
|
||||
lines.extend(_finale_lines(
|
||||
confirm_step_num=steps["confirm"],
|
||||
|
|
|
|||
174
cli/commands/capture_session.py
Normal file
174
cli/commands/capture_session.py
Normal file
|
|
@ -0,0 +1,174 @@
|
|||
"""`agnes capture-session` — SessionStart hook helper.
|
||||
|
||||
Reads the Claude Code hook payload from stdin (a JSON object containing
|
||||
``transcript_path``), extracts the absolute path to the current session's
|
||||
``.jsonl`` transcript, and appends it to ``<workspace>/.claude/agnes-sessions.txt``.
|
||||
|
||||
The queue file feeds ``agnes push``: rather than reverse-engineer Claude
|
||||
Code's cwd-to-folder encoding (an internal implementation detail), we use
|
||||
the ``transcript_path`` field of the hook stdin JSON, which is part of the
|
||||
documented hook contract.
|
||||
|
||||
Failure modes — silent exit code 0 in all cases, since this command runs
|
||||
inside a SessionStart hook chain and a noisy failure would clutter Claude
|
||||
Code's startup output:
|
||||
- stdin not JSON → no-op
|
||||
- JSON missing ``transcript_path`` → no-op
|
||||
- ``transcript_path`` empty → no-op
|
||||
- Workspace ``.claude/`` not writable → no-op (best-effort, hook continues)
|
||||
|
||||
Diagnostic stderr output only when ``--verbose`` is set, for debugging
|
||||
hook misconfiguration. The hook command in ``cli/lib/hooks.py`` does NOT
|
||||
pass ``--verbose`` in production.
|
||||
|
||||
Operability breadcrumb: every invocation (success OR silent-failure path)
|
||||
appends one line to ``<workspace>/.claude/agnes-capture-session.log``
|
||||
with the timestamp and outcome. ``agnes diagnose`` (and ad-hoc ops
|
||||
inspection) can read the tail of this log to flag "hook is firing but
|
||||
queue stays empty" — without it, an upstream Claude Code contract change
|
||||
(e.g. stdin payload schema shifts) is invisible to operators because the
|
||||
hook always exits 0 (David's #11 from the PR review).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
import typer
|
||||
|
||||
from cli.lib.private_list import is_private
|
||||
from cli.lib.session_queue import append_to_queue
|
||||
|
||||
|
||||
_BREADCRUMB_FILENAME = "agnes-capture-session.log"
|
||||
_BREADCRUMB_MAX_BYTES = 256 * 1024 # 256 KiB rolling cap; ~3000 lines
|
||||
|
||||
|
||||
def _record_breadcrumb(workspace: Path, outcome: str, detail: str = "") -> None:
|
||||
"""Append one TSV line to the capture-session breadcrumb log.
|
||||
|
||||
Format: ``<iso_ts>\\t<outcome>\\t<detail>``. Outcomes:
|
||||
``ok`` (queued), ``private_skip`` (matched private list),
|
||||
``empty_stdin``, ``bad_json``, ``not_object``, ``no_transcript_path``,
|
||||
``stdin_read_error``, ``write_error``.
|
||||
|
||||
Best-effort: a failure here MUST NOT escape — this is the hook's
|
||||
silent-failure observability layer; raising would defeat the point.
|
||||
Rolls the log file when it crosses 256 KiB so it doesn't grow
|
||||
unboundedly on a long-lived workspace.
|
||||
"""
|
||||
try:
|
||||
claude_dir = workspace / ".claude"
|
||||
if not claude_dir.exists():
|
||||
# Don't materialize .claude/ in non-Agnes directories — same
|
||||
# rationale as private_list._claude_dir_readonly. If the dir
|
||||
# doesn't exist, capture-session is a no-op anyway.
|
||||
return
|
||||
path = claude_dir / _BREADCRUMB_FILENAME
|
||||
try:
|
||||
if path.stat().st_size > _BREADCRUMB_MAX_BYTES:
|
||||
path.unlink()
|
||||
except OSError:
|
||||
pass
|
||||
line = "\t".join([
|
||||
datetime.now(timezone.utc).isoformat(timespec="seconds"),
|
||||
outcome,
|
||||
detail.replace("\t", " ").replace("\n", " ")[:200],
|
||||
]) + "\n"
|
||||
with open(path, "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
except Exception:
|
||||
# Truly best-effort: any breadcrumb failure is swallowed so the
|
||||
# capture-session contract stays "exit 0 always".
|
||||
pass
|
||||
|
||||
|
||||
capture_session_app = typer.Typer(
|
||||
help="Capture the current Claude Code session's transcript path into the upload queue.",
|
||||
)
|
||||
|
||||
|
||||
@capture_session_app.callback(invoke_without_command=True)
|
||||
def capture_session(
|
||||
verbose: bool = typer.Option(
|
||||
False, "--verbose", help="Log diagnostic info to stderr (off by default)."
|
||||
),
|
||||
) -> None:
|
||||
"""Read SessionStart hook stdin JSON and append (session_id, transcript_path) to queue.
|
||||
|
||||
Honors the private list: if the session_id is already marked private
|
||||
(e.g. user ran `/agnes-private` before this hook chain reached
|
||||
capture-session), the queue write is skipped so the session never
|
||||
enters the upload pipeline.
|
||||
"""
|
||||
workspace = Path(os.environ.get("AGNES_LOCAL_DIR", ".")).resolve()
|
||||
|
||||
try:
|
||||
raw = sys.stdin.read()
|
||||
except Exception as exc:
|
||||
if verbose:
|
||||
typer.echo(f"capture-session: stdin read failed: {exc}", err=True)
|
||||
_record_breadcrumb(workspace, "stdin_read_error", str(exc))
|
||||
return
|
||||
|
||||
if not raw.strip():
|
||||
if verbose:
|
||||
typer.echo("capture-session: empty stdin, nothing to capture.", err=True)
|
||||
_record_breadcrumb(workspace, "empty_stdin")
|
||||
return
|
||||
|
||||
try:
|
||||
payload = json.loads(raw)
|
||||
except json.JSONDecodeError as exc:
|
||||
if verbose:
|
||||
typer.echo(f"capture-session: stdin not valid JSON: {exc}", err=True)
|
||||
_record_breadcrumb(workspace, "bad_json", str(exc))
|
||||
return
|
||||
|
||||
if not isinstance(payload, dict):
|
||||
if verbose:
|
||||
typer.echo("capture-session: payload is not a JSON object.", err=True)
|
||||
_record_breadcrumb(workspace, "not_object", type(payload).__name__)
|
||||
return
|
||||
|
||||
transcript_path = payload.get("transcript_path")
|
||||
if not transcript_path or not isinstance(transcript_path, str):
|
||||
if verbose:
|
||||
typer.echo("capture-session: payload missing transcript_path.", err=True)
|
||||
_record_breadcrumb(workspace, "no_transcript_path")
|
||||
return
|
||||
|
||||
session_id = payload.get("session_id") or ""
|
||||
if not isinstance(session_id, str):
|
||||
session_id = ""
|
||||
|
||||
# Race protection: user may have run /agnes-private BEFORE this hook
|
||||
# got a chance to write. Skip the queue append in that case — the
|
||||
# private list is the authoritative source for "do not upload".
|
||||
if session_id and is_private(workspace, session_id):
|
||||
if verbose:
|
||||
typer.echo(
|
||||
f"capture-session: session {session_id} is private; skipping queue.",
|
||||
err=True,
|
||||
)
|
||||
_record_breadcrumb(workspace, "private_skip", session_id)
|
||||
return
|
||||
|
||||
try:
|
||||
append_to_queue(workspace, session_id, transcript_path)
|
||||
except OSError as exc:
|
||||
if verbose:
|
||||
typer.echo(
|
||||
f"capture-session: append to queue failed ({workspace}): {exc}",
|
||||
err=True,
|
||||
)
|
||||
_record_breadcrumb(workspace, "write_error", str(exc))
|
||||
return
|
||||
|
||||
_record_breadcrumb(workspace, "ok", session_id or "(no-sid)")
|
||||
if verbose:
|
||||
typer.echo(f"capture-session: queued {transcript_path}", err=True)
|
||||
55
cli/commands/mark_private.py
Normal file
55
cli/commands/mark_private.py
Normal file
|
|
@ -0,0 +1,55 @@
|
|||
"""`agnes mark-private` — mark the current Claude Code session as private.
|
||||
|
||||
Invoked by the `/agnes-private` slash command (deterministic ``!``-prefix
|
||||
direct bash, no AI in the loop). Reads ``CLAUDE_CODE_SESSION_ID`` from the
|
||||
environment — Claude Code sets this variable in every Bash/PowerShell
|
||||
subprocess it spawns (documented stable API).
|
||||
|
||||
Adds the session_id to ``<workspace>/.claude/agnes-sessions-private.txt``.
|
||||
That file is the authoritative source for "do not upload" — both
|
||||
``capture-session`` and ``push`` consult it. Adding the ID here is enough
|
||||
to keep the session out of the upload pipeline regardless of whether
|
||||
``capture-session`` already ran (race-safe by design — see
|
||||
``cli/lib/private_list.py`` docstring).
|
||||
|
||||
Refuses to run outside a Claude Code session (no ``CLAUDE_CODE_SESSION_ID``)
|
||||
to make accidental CLI invocations from a regular terminal obvious.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
import typer
|
||||
|
||||
from cli.lib.private_list import add_private
|
||||
|
||||
|
||||
mark_private_app = typer.Typer(
|
||||
help="Mark the current Claude Code session as private — exclude it from `agnes push`.",
|
||||
)
|
||||
|
||||
|
||||
@mark_private_app.callback(invoke_without_command=True)
|
||||
def mark_private() -> None:
|
||||
"""Add CLAUDE_CODE_SESSION_ID to the workspace private list."""
|
||||
session_id = os.environ.get("CLAUDE_CODE_SESSION_ID", "").strip()
|
||||
if not session_id:
|
||||
typer.echo(
|
||||
"Error: CLAUDE_CODE_SESSION_ID is not set. "
|
||||
"Run this inside a Claude Code session (via /agnes-private).",
|
||||
err=True,
|
||||
)
|
||||
raise typer.Exit(1)
|
||||
|
||||
workspace = Path(os.environ.get("AGNES_LOCAL_DIR", ".")).resolve()
|
||||
newly_added = add_private(workspace, session_id)
|
||||
|
||||
if newly_added:
|
||||
typer.echo(
|
||||
f"Session {session_id} marked as private. "
|
||||
"Its transcript will not be uploaded by `agnes push`."
|
||||
)
|
||||
else:
|
||||
typer.echo(f"Session {session_id} is already marked as private. No change.")
|
||||
|
|
@ -1,28 +1,39 @@
|
|||
"""`agnes push` - upload session jsonl + CLAUDE.local.md to the server.
|
||||
"""`agnes push` — upload session jsonls + CLAUDE.local.md to the server.
|
||||
|
||||
Thin Typer wrapper extracted from the legacy `agnes sync --upload-only`
|
||||
path in `cli/commands/sync.py`. Used by:
|
||||
- Manual invocation: analyst types `agnes push` to force an upload.
|
||||
- SessionEnd hook: `agnes push --quiet 2>/dev/null || true` runs at the
|
||||
end of every Claude Code session in this workspace.
|
||||
The push command consumes a workspace-local queue file
|
||||
(``<workspace>/.claude/agnes-sessions.txt``) populated by the
|
||||
``agnes capture-session`` SessionStart hook. Each line is a TSV pair:
|
||||
``<session_id>\\t<transcript_path>``. The session_id lets push consult
|
||||
the private list (``cli/lib/private_list.py``) and skip uploads that
|
||||
the user explicitly marked via ``/agnes-private``.
|
||||
|
||||
Lazy on-disk contract: when there are no `user/sessions/*.jsonl` files
|
||||
and no `.claude/CLAUDE.local.md`, this command must NOT create
|
||||
`user/sessions/` (or any other directory). Tests pin the lazy mkdir
|
||||
contract so the empty-workspace case stays a true no-op on disk.
|
||||
Concurrency: a single-instance lock (``filelock`` via ``cli/lib/push_lock.py``)
|
||||
ensures only one push runs at a time. When the user closes several Claude
|
||||
Code sessions simultaneously, every SessionEnd hook fires its own
|
||||
``agnes push``; exactly one runs, the rest exit silently.
|
||||
|
||||
Errors render via `cli/error_render.py:render_error()` for typed-error
|
||||
shape consistency with `agnes pull`.
|
||||
Race protection: the queue file is atomically renamed to a snapshot before
|
||||
processing. SessionStart hooks that fire during the push window write to
|
||||
a freshly-created queue, so their entries aren't lost.
|
||||
|
||||
Task 18 will register `push_app` on the root Typer app and delete the
|
||||
legacy `agnes sync --upload-only` flag. Until then this module is
|
||||
callable only via direct import (which is exactly what the test does).
|
||||
Recovery: if a previous push crashed mid-run, its snapshot file persists.
|
||||
The next push picks it up before processing the current queue.
|
||||
|
||||
Private filter: even if a marked-private session_id slipped into the
|
||||
queue before ``/agnes-private`` was run, push re-checks the private list
|
||||
per entry. Skipped entries are audit-logged to
|
||||
``<workspace>/.claude/agnes-sessions-private-skipped.txt``.
|
||||
|
||||
Legacy fallback: the encoding-based ``list_session_files`` path remains
|
||||
available behind ``--legacy-scan`` for one-off backfills of sessions that
|
||||
predate the queue mechanism.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
import typer
|
||||
|
|
@ -30,23 +41,131 @@ import typer
|
|||
from cli.client import api_post
|
||||
from cli.config import get_server_url, get_token
|
||||
from cli.error_render import render_error
|
||||
from cli.lib.private_list import read_all_private
|
||||
from cli.lib.push_lock import acquire_or_skip
|
||||
from cli.lib.session_queue import (
|
||||
discard_snapshot,
|
||||
failed_log_path,
|
||||
find_recovery_snapshots,
|
||||
mark_failed_permanent,
|
||||
mark_private_skipped,
|
||||
mark_uploaded,
|
||||
queue_path,
|
||||
read_entries_from_snapshot,
|
||||
requeue_failed,
|
||||
snapshot_queue,
|
||||
uploaded_log_path,
|
||||
)
|
||||
|
||||
|
||||
push_app = typer.Typer(help="Upload sessions and CLAUDE.local.md to the server")
|
||||
|
||||
|
||||
def _collect_snapshots(workspace: Path) -> list[Path]:
|
||||
"""Recovery snapshots first (oldest first), then a fresh snapshot of the
|
||||
current queue. Either may be absent. Returns the list of snapshot paths
|
||||
to process in order.
|
||||
"""
|
||||
snapshots = find_recovery_snapshots(workspace)
|
||||
fresh = snapshot_queue(workspace)
|
||||
if fresh is not None:
|
||||
snapshots.append(fresh)
|
||||
return snapshots
|
||||
|
||||
|
||||
def _gather_entries_for_dry_run(workspace: Path) -> list[tuple[str, Path]]:
|
||||
"""Read (session_id, path) entries that *would* be uploaded without
|
||||
consuming the queue. Combines existing recovery snapshots + the
|
||||
current live queue. Does NOT rename the live queue (dry-run is
|
||||
read-only).
|
||||
"""
|
||||
out: list[tuple[str, Path]] = []
|
||||
seen: set[tuple[str, str]] = set()
|
||||
|
||||
def _add(entries: list[tuple[str, Path]]) -> None:
|
||||
for sid, p in entries:
|
||||
key = (sid, str(p))
|
||||
if key in seen:
|
||||
continue
|
||||
seen.add(key)
|
||||
out.append((sid, p))
|
||||
|
||||
for snap in find_recovery_snapshots(workspace):
|
||||
_add(read_entries_from_snapshot(snap))
|
||||
|
||||
live = queue_path(workspace)
|
||||
if live.exists():
|
||||
_add(read_entries_from_snapshot(live))
|
||||
|
||||
return out
|
||||
|
||||
|
||||
def _is_permanent_failure(info: dict) -> bool:
|
||||
"""True iff the server's response indicates a deterministic failure
|
||||
that retrying won't help. We treat 4xx (except 408 / 429) as
|
||||
permanent — 401 (token expired), 403 (RBAC denial), 413 (payload
|
||||
too large), 400 (validation error) all have the same property:
|
||||
re-uploading the same file produces the same answer, so a
|
||||
requeue-loop only wastes bytes and grows the queue forever. 5xx
|
||||
and network exceptions stay transient — those reflect server or
|
||||
transport state that can change between push runs.
|
||||
|
||||
408 Request Timeout and 429 Too Many Requests are flagged transient
|
||||
by the HTTP spec (RFC 7231 / RFC 6585); the server is telling us
|
||||
to back off and try again later, not that the request is invalid.
|
||||
"""
|
||||
status = info.get("status")
|
||||
if not isinstance(status, int):
|
||||
return False # network error / exception — transient
|
||||
if status in (408, 429):
|
||||
return False
|
||||
return 400 <= status < 500
|
||||
|
||||
|
||||
def _upload_one(transcript: Path) -> tuple[bool, dict]:
|
||||
"""Upload a single session jsonl. Returns (success, error_or_meta)."""
|
||||
if not transcript.exists():
|
||||
return False, {"file": transcript.name, "error": "file not found on disk"}
|
||||
try:
|
||||
with open(transcript, "rb") as fh:
|
||||
resp = api_post("/api/upload/sessions", files={"file": (transcript.name, fh)})
|
||||
except Exception as exc:
|
||||
return False, {"file": transcript.name, "error": str(exc)}
|
||||
if resp.status_code == 200:
|
||||
return True, {"file": transcript.name}
|
||||
return False, {"file": transcript.name, "status": resp.status_code}
|
||||
|
||||
|
||||
@push_app.callback(invoke_without_command=True)
|
||||
def push(
|
||||
quiet: bool = typer.Option(False, "--quiet", help="Suppress success stdout (errors still surface on stderr)"),
|
||||
as_json: bool = typer.Option(False, "--json", help="Emit a single JSON object summarizing the upload"),
|
||||
dry_run: bool = typer.Option(False, "--dry-run", help="List what would be uploaded without sending anything"),
|
||||
quiet: bool = typer.Option(
|
||||
False,
|
||||
"--quiet",
|
||||
help="Suppress success stdout (errors still surface on stderr).",
|
||||
),
|
||||
as_json: bool = typer.Option(
|
||||
False, "--json", help="Emit a single JSON object summarizing the upload."
|
||||
),
|
||||
dry_run: bool = typer.Option(
|
||||
False,
|
||||
"--dry-run",
|
||||
help="List what would be uploaded without sending anything.",
|
||||
),
|
||||
legacy_scan: bool = typer.Option(
|
||||
False,
|
||||
"--legacy-scan",
|
||||
help=(
|
||||
"Fallback: also include sessions found by the encoding-based scan "
|
||||
"of ~/.claude/projects/. Use for one-off backfill of sessions "
|
||||
"predating the queue mechanism. The /agnes-private list IS "
|
||||
"consulted — Claude Code names jsonls ``<session-id>.jsonl`` so "
|
||||
"the file stem provides the session id even for legacy entries."
|
||||
),
|
||||
),
|
||||
):
|
||||
"""Upload session jsonl + CLAUDE.local.md from ./user/sessions and ./.claude."""
|
||||
"""Upload queued session jsonls + CLAUDE.local.md from this workspace."""
|
||||
server_url = get_server_url()
|
||||
if not server_url:
|
||||
# `get_server_url()` falls back to a localhost default today, so this
|
||||
# branch is mostly a defensive guard for a future config change that
|
||||
# might return an empty string.
|
||||
typer.echo(
|
||||
render_error(0, {"detail": {
|
||||
"kind": "server_unreachable",
|
||||
|
|
@ -69,27 +188,49 @@ def push(
|
|||
|
||||
workspace = Path(os.environ.get("AGNES_LOCAL_DIR", ".")).resolve()
|
||||
local_md = workspace / ".claude" / "CLAUDE.local.md"
|
||||
|
||||
# Claude Code writes session jsonls to ~/.claude/projects/<encoded-cwd>/
|
||||
# — the encoding varies by Claude Code version (older: `/` -> `-`,
|
||||
# newer: all non-alphanumeric -> `-`). The helper tries both encodings
|
||||
# and also falls back to the legacy <workspace>/user/sessions/ for
|
||||
# setups that mirror sessions there explicitly. See
|
||||
# cli/lib/claude_sessions.py for details.
|
||||
from cli.lib.claude_sessions import list_session_files
|
||||
session_files = list_session_files(workspace)
|
||||
has_local_md = local_md.exists()
|
||||
|
||||
# ---- DRY RUN ----------------------------------------------------------
|
||||
if dry_run:
|
||||
candidates = _gather_entries_for_dry_run(workspace)
|
||||
private_ids = read_all_private(workspace)
|
||||
non_private = [(sid, p) for sid, p in candidates if not (sid and sid in private_ids)]
|
||||
private_skipped = [(sid, p) for sid, p in candidates if sid and sid in private_ids]
|
||||
|
||||
if legacy_scan:
|
||||
from cli.lib.claude_sessions import list_session_files
|
||||
seen = {str(p) for _sid, p in non_private}
|
||||
for p in list_session_files(workspace):
|
||||
if str(p) in seen:
|
||||
continue
|
||||
# Apply the private filter to legacy-scan candidates too.
|
||||
# Claude Code names jsonls ``<session-id>.jsonl``, so the
|
||||
# file stem IS the session id and we can apply the same
|
||||
# filter the queue path uses. Closes the gap David #8
|
||||
# raised: legacy-scan would otherwise upload everything
|
||||
# on disk, including sessions the user later marked
|
||||
# private.
|
||||
sid_from_path = p.stem
|
||||
if sid_from_path and sid_from_path in private_ids:
|
||||
private_skipped.append((sid_from_path, p))
|
||||
else:
|
||||
non_private.append((sid_from_path, p))
|
||||
seen.add(str(p))
|
||||
|
||||
plan = {
|
||||
"dry_run": True,
|
||||
"would_upload": {
|
||||
"sessions": [str(f) for f in session_files],
|
||||
"sessions": [str(p) for _sid, p in non_private],
|
||||
"local_md": str(local_md) if has_local_md else None,
|
||||
},
|
||||
"would_skip_private": [
|
||||
{"session_id": sid, "path": str(p)} for sid, p in private_skipped
|
||||
],
|
||||
"summary": {
|
||||
"sessions_count": len(session_files),
|
||||
"sessions_count": len(non_private),
|
||||
"private_skipped_count": len(private_skipped),
|
||||
"local_md_present": has_local_md,
|
||||
"uploaded_log": str(uploaded_log_path(workspace)),
|
||||
},
|
||||
}
|
||||
if as_json:
|
||||
|
|
@ -97,60 +238,154 @@ def push(
|
|||
return
|
||||
if quiet:
|
||||
return
|
||||
typer.echo(f"Dry run - would upload {len(session_files)} session file(s)")
|
||||
for f in session_files:
|
||||
typer.echo(f" {f}")
|
||||
typer.echo(f"Dry run - would upload {len(non_private)} session file(s)")
|
||||
for _sid, p in non_private:
|
||||
typer.echo(f" {p}")
|
||||
if private_skipped:
|
||||
typer.echo(f"Would skip {len(private_skipped)} private session(s):")
|
||||
for sid, p in private_skipped:
|
||||
typer.echo(f" [{sid}] {p}")
|
||||
if has_local_md:
|
||||
typer.echo(f"Would upload CLAUDE.local.md ({local_md})")
|
||||
else:
|
||||
typer.echo("No CLAUDE.local.md to upload")
|
||||
return
|
||||
|
||||
results = {"sessions": 0, "local_md": False, "errors": []}
|
||||
# ---- REAL RUN ---------------------------------------------------------
|
||||
# Acquire single-instance lock. Silent exit if another push is already
|
||||
# running — typical when several SessionEnd hooks fire at once.
|
||||
with acquire_or_skip(workspace) as lock:
|
||||
if lock is None:
|
||||
return # another push has the lock; this one no-ops
|
||||
|
||||
# Upload sessions. Per-file failures are recorded into `errors` and the
|
||||
# loop continues - one corrupt jsonl mustn't block the rest, and a
|
||||
# transient 5xx on one file shouldn't poison the whole upload.
|
||||
for f in session_files:
|
||||
try:
|
||||
with open(f, "rb") as fh:
|
||||
resp = api_post("/api/upload/sessions", files={"file": (f.name, fh)})
|
||||
if resp.status_code == 200:
|
||||
results["sessions"] += 1
|
||||
else:
|
||||
results["errors"].append(
|
||||
{"file": f.name, "status": resp.status_code}
|
||||
)
|
||||
except Exception as exc:
|
||||
results["errors"].append({"file": f.name, "error": str(exc)})
|
||||
results = {
|
||||
"sessions": 0,
|
||||
"local_md": False,
|
||||
"errors": [],
|
||||
"skipped": 0,
|
||||
"private_skipped": 0,
|
||||
"dropped_permanent": 0,
|
||||
}
|
||||
|
||||
# Upload CLAUDE.local.md
|
||||
if has_local_md:
|
||||
try:
|
||||
content = local_md.read_text(encoding="utf-8")
|
||||
resp = api_post("/api/upload/local-md", json={"content": content})
|
||||
if resp.status_code == 200:
|
||||
results["local_md"] = True
|
||||
else:
|
||||
results["errors"].append(
|
||||
{"file": "CLAUDE.local.md", "status": resp.status_code}
|
||||
)
|
||||
except Exception as exc:
|
||||
results["errors"].append({"file": "CLAUDE.local.md", "error": str(exc)})
|
||||
# Snapshot the private list once at the start of the run. Adding
|
||||
# a new private ID between snapshot and the per-entry check is
|
||||
# benign (worst case: one more upload of a session the user just
|
||||
# marked, which next push will skip).
|
||||
private_ids = read_all_private(workspace)
|
||||
|
||||
# Process snapshots: recovery (from prior crash) first, then fresh.
|
||||
snapshots = _collect_snapshots(workspace)
|
||||
all_failed_entries: list[tuple[str, Path]] = []
|
||||
|
||||
for snapshot in snapshots:
|
||||
entries = read_entries_from_snapshot(snapshot)
|
||||
failed_in_snapshot: list[tuple[str, Path]] = []
|
||||
now = datetime.now(timezone.utc)
|
||||
for session_id, transcript in entries:
|
||||
if session_id and session_id in private_ids:
|
||||
# Skip private; audit-log and move on. Do not requeue —
|
||||
# this is the user's explicit "do not upload" intent.
|
||||
mark_private_skipped(workspace, session_id, transcript, now)
|
||||
results["private_skipped"] += 1
|
||||
continue
|
||||
ok, info = _upload_one(transcript)
|
||||
if ok:
|
||||
results["sessions"] += 1
|
||||
mark_uploaded(workspace, transcript, now)
|
||||
else:
|
||||
if info.get("error") == "file not found on disk":
|
||||
# Stale queue entry (Claude Code auto-cleanup deleted
|
||||
# the jsonl). Skip without re-queuing — retry would
|
||||
# loop forever.
|
||||
results["skipped"] += 1
|
||||
results["errors"].append(info)
|
||||
elif _is_permanent_failure(info):
|
||||
# 4xx (except 408 / 429): server says this request
|
||||
# will never succeed. Drop + audit-log instead of
|
||||
# requeueing forever. Closes the prior loop bug
|
||||
# where 401 (token expired), 413 (file too large),
|
||||
# 400 (validation), etc. cycled through every
|
||||
# push run, growing the queue without bound.
|
||||
mark_failed_permanent(
|
||||
workspace, session_id, transcript,
|
||||
info["status"], now,
|
||||
)
|
||||
results["dropped_permanent"] += 1
|
||||
results["errors"].append(info)
|
||||
else:
|
||||
# 5xx, 408, 429, network errors — genuinely
|
||||
# transient. Requeue for the next push.
|
||||
results["errors"].append(info)
|
||||
failed_in_snapshot.append((session_id, transcript))
|
||||
# Failed entries from this snapshot get re-queued on the live file.
|
||||
all_failed_entries.extend(failed_in_snapshot)
|
||||
discard_snapshot(snapshot)
|
||||
|
||||
if all_failed_entries:
|
||||
requeue_failed(workspace, all_failed_entries)
|
||||
|
||||
# Optional: legacy scan to backfill sessions outside the queue.
|
||||
# Honors the private list — Claude Code names jsonls
|
||||
# ``<session-id>.jsonl``, so the file stem IS the session id and
|
||||
# we can apply the same filter the queue path uses. Without this
|
||||
# filter, an operator running ``--legacy-scan`` to backfill old
|
||||
# sessions would silently upload every transcript on disk,
|
||||
# including ones the user later marked private (David's #8 from
|
||||
# the PR review).
|
||||
if legacy_scan:
|
||||
from cli.lib.claude_sessions import list_session_files
|
||||
private_ids = read_all_private(workspace)
|
||||
for transcript in list_session_files(workspace):
|
||||
sid_from_path = transcript.stem
|
||||
if sid_from_path and sid_from_path in private_ids:
|
||||
mark_private_skipped(workspace, sid_from_path, str(transcript))
|
||||
results["private_skipped"] = results.get("private_skipped", 0) + 1
|
||||
continue
|
||||
ok, info = _upload_one(transcript)
|
||||
if ok:
|
||||
results["sessions"] += 1
|
||||
mark_uploaded(workspace, transcript, datetime.now(timezone.utc))
|
||||
else:
|
||||
results["errors"].append(info)
|
||||
|
||||
# Upload CLAUDE.local.md.
|
||||
if has_local_md:
|
||||
try:
|
||||
content = local_md.read_text(encoding="utf-8")
|
||||
resp = api_post("/api/upload/local-md", json={"content": content})
|
||||
if resp.status_code == 200:
|
||||
results["local_md"] = True
|
||||
else:
|
||||
results["errors"].append(
|
||||
{"file": "CLAUDE.local.md", "status": resp.status_code}
|
||||
)
|
||||
except Exception as exc:
|
||||
results["errors"].append({"file": "CLAUDE.local.md", "error": str(exc)})
|
||||
|
||||
# Render output.
|
||||
if as_json:
|
||||
typer.echo(json.dumps(results))
|
||||
return
|
||||
|
||||
if quiet:
|
||||
# Quiet mode is for the SessionEnd hook - silent on success so
|
||||
# Claude Code's stdout stays clean. Errors still flow to stderr.
|
||||
if results["errors"]:
|
||||
for e in results["errors"]:
|
||||
typer.echo(f"warn: {e}", err=True)
|
||||
return
|
||||
|
||||
typer.echo(f"Uploaded {results['sessions']} sessions")
|
||||
if results["private_skipped"]:
|
||||
typer.echo(
|
||||
f"Skipped {results['private_skipped']} private session(s) "
|
||||
f"(see .claude/agnes-sessions-private-skipped.txt)"
|
||||
)
|
||||
if results["skipped"]:
|
||||
typer.echo(f"Skipped {results['skipped']} stale queue entries (file missing)")
|
||||
if results["dropped_permanent"]:
|
||||
typer.echo(
|
||||
f"Dropped {results['dropped_permanent']} session(s) with permanent failure "
|
||||
f"(see .claude/agnes-sessions-failed.txt)"
|
||||
)
|
||||
if results["local_md"]:
|
||||
typer.echo("Uploaded CLAUDE.local.md")
|
||||
if results["errors"]:
|
||||
|
|
|
|||
|
|
@ -15,6 +15,7 @@ from typing import Optional, Union
|
|||
import typer
|
||||
|
||||
from cli.config import _config_dir, get_server_url
|
||||
from cli.lib.hooks import maybe_refresh_claude_hooks
|
||||
from cli.update_check import UpdateInfo, check, format_outdated_notice
|
||||
|
||||
self_upgrade_app = typer.Typer(
|
||||
|
|
@ -169,6 +170,28 @@ def _smoke_test_new_binary(install_method: str, expected_version: str) -> tuple[
|
|||
return False, f"{type(e).__name__}: {e}"
|
||||
|
||||
|
||||
def _try_refresh_hooks(*, quiet: bool) -> None:
|
||||
"""Best-effort idempotent refresh of the workspace's Claude Code hooks.
|
||||
|
||||
Resolves the workspace via ``AGNES_LOCAL_DIR`` (set by Claude Code's
|
||||
hook subprocess to the workspace root) or the current working directory.
|
||||
Delegates the actual decision to :func:`maybe_refresh_claude_hooks`,
|
||||
which guards against writing into non-Agnes directories.
|
||||
|
||||
Swallows any exception — a partially-broken settings.json or a
|
||||
permissions issue must not flip the exit code of a successful
|
||||
upgrade. When ``quiet`` is False the failure is surfaced to stderr
|
||||
so an operator running ``agnes self-upgrade`` interactively still sees
|
||||
it; under ``--quiet`` (the SessionStart case) it stays silent.
|
||||
"""
|
||||
workspace = Path(os.environ.get("AGNES_LOCAL_DIR", ".")).resolve()
|
||||
try:
|
||||
maybe_refresh_claude_hooks(workspace)
|
||||
except Exception as exc: # pragma: no cover — defensive
|
||||
if not quiet:
|
||||
sys.stderr.write(f"agnes self-upgrade: hook refresh failed: {exc}\n")
|
||||
|
||||
|
||||
def _resolve_info(force: bool) -> Union[UpdateInfo, _Unreachable, None]:
|
||||
"""Returns:
|
||||
UpdateInfo — install this wheel
|
||||
|
|
@ -282,9 +305,24 @@ def self_upgrade(
|
|||
raise typer.Exit(1)
|
||||
|
||||
if info is None:
|
||||
raise typer.Exit(0) # nothing to do, silent
|
||||
# CLI already current — still attempt hook refresh in case the
|
||||
# workspace was initialized on an older CLI whose hook layout
|
||||
# has since changed (e.g. v0.48 → v0.49 introduced the
|
||||
# capture-session SessionStart entry). The refresh is a no-op
|
||||
# for directories that don't look like Agnes workspaces, so
|
||||
# an `agnes self-upgrade` invoked from ~/ won't write there.
|
||||
_try_refresh_hooks(quiet=quiet)
|
||||
raise typer.Exit(0)
|
||||
|
||||
rc = _do_install_with_smoke_and_rollback(info, quiet=quiet)
|
||||
if rc == 0:
|
||||
# After a successful install of the new wheel, refresh the
|
||||
# workspace hooks so any wire-format change in the new release
|
||||
# lands on the next session-start without re-running
|
||||
# `agnes init`. Failure here must not turn a successful
|
||||
# upgrade into a non-zero exit — the rollback path has already
|
||||
# finished. Errors are surfaced on stderr only.
|
||||
_try_refresh_hooks(quiet=quiet)
|
||||
raise typer.Exit(rc)
|
||||
finally:
|
||||
if prior_sentinel is None:
|
||||
|
|
|
|||
62
cli/commands/statusline.py
Normal file
62
cli/commands/statusline.py
Normal file
|
|
@ -0,0 +1,62 @@
|
|||
"""`agnes statusline` — Claude Code statusLine helper.
|
||||
|
||||
Claude Code's ``statusLine`` setting in ``settings.json`` invokes a shell
|
||||
command on every status-bar render and displays the first line of its
|
||||
stdout. The session payload (a JSON object containing ``session_id``,
|
||||
``transcript_path``, ``workspace``, ``model``, etc.) arrives on stdin.
|
||||
|
||||
This command:
|
||||
1. Reads the JSON payload from stdin.
|
||||
2. Extracts ``session_id``.
|
||||
3. Checks the workspace private list.
|
||||
4. Prints ``🔒 agnes-private`` if the session is private; otherwise prints
|
||||
nothing (empty status bar segment — lets other tools paint the rest).
|
||||
|
||||
Performance: this is invoked on EVERY status-bar refresh, which can be
|
||||
once per second or more. The implementation is intentionally minimal —
|
||||
stdlib-only, single file read, no API calls, no heavy imports.
|
||||
|
||||
All failure modes (malformed JSON, missing session_id, unreadable list)
|
||||
end with exit 0 + empty stdout. Polluting Claude Code's status bar with
|
||||
errors would be worse than a missing indicator.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import typer
|
||||
|
||||
from cli.lib.private_list import is_private
|
||||
|
||||
|
||||
statusline_app = typer.Typer(
|
||||
help="Status-line helper for Claude Code — prints '🔒 agnes-private' when the current session is private.",
|
||||
)
|
||||
|
||||
|
||||
@statusline_app.callback(invoke_without_command=True)
|
||||
def statusline() -> None:
|
||||
"""Read stdin session JSON; emit private indicator if the session is private."""
|
||||
try:
|
||||
raw = sys.stdin.read()
|
||||
payload = json.loads(raw) if raw.strip() else {}
|
||||
except (json.JSONDecodeError, OSError):
|
||||
return # silent — never poison the status bar
|
||||
|
||||
if not isinstance(payload, dict):
|
||||
return
|
||||
|
||||
session_id = payload.get("session_id")
|
||||
if not isinstance(session_id, str) or not session_id:
|
||||
return
|
||||
|
||||
workspace = Path(os.environ.get("AGNES_LOCAL_DIR", ".")).resolve()
|
||||
try:
|
||||
if is_private(workspace, session_id):
|
||||
typer.echo("🔒 agnes-private")
|
||||
except OSError:
|
||||
return # filesystem hiccup — empty output is better than a crash
|
||||
|
|
@ -33,20 +33,30 @@ from pathlib import Path
|
|||
# `/<command>` slug exposed to Claude Code).
|
||||
_MANAGED_COMMANDS: tuple[tuple[str, str], ...] = (
|
||||
("update-agnes-plugins.md", "update-agnes-plugins.md"),
|
||||
("agnes-private.md", "agnes-private.md"),
|
||||
)
|
||||
|
||||
|
||||
# Defensive fallback used when the bundled template is missing on disk
|
||||
# (broken install, stripped-down test environment). Mirrors the
|
||||
# `agnes_workspace_template.txt` fallback in `cli/commands/init.py` —
|
||||
# better to write a usable stub than to crash `agnes init`.
|
||||
_FALLBACK_BODY = (
|
||||
"---\n"
|
||||
"description: Update Agnes marketplace plugins to latest versions\n"
|
||||
"---\n"
|
||||
"\n"
|
||||
"Run `agnes refresh-marketplace` and report the output.\n"
|
||||
)
|
||||
# Defensive fallbacks used when the bundled template is missing on disk
|
||||
# (broken install, stripped-down test environment). Keyed by source
|
||||
# template filename so a missing `agnes-private.md` doesn't get
|
||||
# clobbered with `update-agnes-plugins` content.
|
||||
_FALLBACK_BODIES: dict[str, str] = {
|
||||
"update-agnes-plugins.md": (
|
||||
"---\n"
|
||||
"description: Update Agnes marketplace plugins to latest versions\n"
|
||||
"---\n"
|
||||
"\n"
|
||||
"Run `agnes refresh-marketplace` and report the output.\n"
|
||||
),
|
||||
"agnes-private.md": (
|
||||
"---\n"
|
||||
"description: Mark the current Claude Code session as private\n"
|
||||
"---\n"
|
||||
"\n"
|
||||
"!`agnes mark-private`\n"
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
def _templates_dir() -> Path:
|
||||
|
|
@ -78,5 +88,5 @@ def install_claude_commands(workspace: Path) -> None:
|
|||
f"{source_path} missing; writing defensive fallback.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
body = _FALLBACK_BODY
|
||||
body = _FALLBACK_BODIES.get(source_name, "")
|
||||
(commands_dir / dest_name).write_text(body, encoding="utf-8")
|
||||
|
|
|
|||
162
cli/lib/hooks.py
162
cli/lib/hooks.py
|
|
@ -14,11 +14,17 @@ Design notes:
|
|||
- Uses `|| true` in the hook command so the hook never blocks a session on
|
||||
a transient sync error.
|
||||
- SessionStart gets three entries:
|
||||
1. Chained `agnes self-upgrade; agnes pull` — self-upgrade runs first
|
||||
1. `agnes capture-session` — reads the SessionStart stdin JSON payload
|
||||
(`transcript_path`) and appends the absolute path to the queue file
|
||||
`<workspace>/.claude/agnes-sessions.txt`. This feeds `agnes push`
|
||||
without reverse-engineering Claude Code's cwd-to-folder encoding.
|
||||
Runs first so the path is captured before any subsequent hook can
|
||||
fail and prevent later hooks from firing.
|
||||
2. Chained `agnes self-upgrade; agnes pull` — self-upgrade runs first
|
||||
so any wire-protocol bump lands before pull tries to use the new
|
||||
CLI version. Both `|| true`-guarded so an upgrade failure doesn't
|
||||
block the pull.
|
||||
2. `agnes refresh-marketplace --check` — independent entry. Detector-
|
||||
3. `agnes refresh-marketplace --check` — independent entry. Detector-
|
||||
only (since the slash-command split): runs `git fetch` against the
|
||||
marketplace clone and emits a Claude Code hook JSON message
|
||||
hinting the user at `/update-agnes-plugins` when remote content
|
||||
|
|
@ -26,11 +32,13 @@ Design notes:
|
|||
command does that interactively, with full output visible in the
|
||||
Claude Code transcript and under user control. Failure (no clone,
|
||||
no token) silently no-ops via the surrounding `|| true`.
|
||||
3. `agnes push` — uploads any session JSONLs that haven't reached the
|
||||
server yet (orphans from `claude -p` headless mode where Claude Code
|
||||
does NOT fire SessionEnd, or from abnormal session exits). Symmetric
|
||||
with `agnes pull` so the workspace heals on the next interactive
|
||||
session start.
|
||||
|
||||
The previous SessionStart `agnes push` self-heal entry was removed once
|
||||
the capture-queue mechanism made it redundant: orphan session JSONLs from
|
||||
headless / crashed sessions stay in the queue and get uploaded by the
|
||||
next SessionEnd push (queue file persists across runs). Workspaces with
|
||||
the old entry are migrated cleanly — `_replace_or_add` strips any
|
||||
matching `agnes push` from SessionStart on the next `agnes init`.
|
||||
|
||||
- SessionEnd gets one entry: `agnes push --quiet`, wrapped to detach into
|
||||
the background. Claude Code in `-p` (headless) mode terminates SessionEnd
|
||||
|
|
@ -61,6 +69,9 @@ _OUR_COMMAND_MARKERS = (
|
|||
"agnes pull",
|
||||
"agnes push",
|
||||
"agnes refresh-marketplace",
|
||||
"agnes capture-session",
|
||||
"agnes mark-private",
|
||||
"agnes statusline",
|
||||
"da sync",
|
||||
)
|
||||
|
||||
|
|
@ -108,13 +119,15 @@ def install_claude_hooks(workspace: Path) -> None:
|
|||
for cmd in commands:
|
||||
existing.append({"hooks": [{"type": "command", "command": cmd}]})
|
||||
|
||||
# `refresh-marketplace` is wrapped in `bash -c` because Claude Code on
|
||||
# Windows runs hook commands directly (no shell), so the `2>/dev/null
|
||||
# || true` redirection + short-circuit syntax never gets interpreted.
|
||||
# The self-upgrade+pull chained entry pre-dates the Windows fix and
|
||||
# isn't churned for parity (the same redirection fluff applies but
|
||||
# changing the existing wire would force every workspace to re-write
|
||||
# its settings.json on the next `agnes init` for no behaviour gain).
|
||||
# All entries are wrapped in `bash -c "..."` for Windows compatibility:
|
||||
# Claude Code on Windows runs hook commands directly (no shell), so
|
||||
# the `; ` chain operator, `2>/dev/null` redirection, and `|| true`
|
||||
# short-circuit never get interpreted unless we explicitly invoke
|
||||
# bash. (Git Bash on PATH or WSL satisfies this.) The self-upgrade +
|
||||
# pull chain previously shipped unwrapped — that pre-dates the
|
||||
# Windows fix; ship it wrapped now so every entry uses the same
|
||||
# contract. Workspaces still on the older unwrapped form auto-upgrade
|
||||
# via `maybe_refresh_claude_hooks` on the next `agnes self-upgrade`.
|
||||
#
|
||||
# `--check` makes the marketplace entry a detector only: the actual
|
||||
# plugin install/update happens in the `/update-agnes-plugins` slash
|
||||
|
|
@ -122,11 +135,23 @@ def install_claude_hooks(workspace: Path) -> None:
|
|||
# Workspaces still on the older `--quiet` form auto-upgrade here
|
||||
# because `_OUR_COMMAND_MARKERS` matches by substring on the
|
||||
# `agnes refresh-marketplace` prefix.
|
||||
# `agnes capture-session` reads the SessionStart hook stdin (a JSON
|
||||
# payload with `transcript_path`) and appends the path to
|
||||
# `.claude/agnes-sessions.txt`. That queue file feeds `agnes push`,
|
||||
# avoiding any reverse-engineering of Claude Code's cwd-to-folder
|
||||
# encoding.
|
||||
#
|
||||
# The previous SessionStart `agnes push` self-heal entry was dropped
|
||||
# once the queue mechanism made it redundant — orphans from headless
|
||||
# / crashed sessions remain in the queue and ship out with the next
|
||||
# SessionEnd push. The marker substring "agnes push" stays in
|
||||
# _OUR_COMMAND_MARKERS so the old entry is cleanly removed from any
|
||||
# pre-existing settings.json on the next init.
|
||||
_replace_or_add("SessionStart", [
|
||||
"agnes self-upgrade --quiet 2>/dev/null || true; "
|
||||
"agnes pull --quiet 2>/dev/null || true",
|
||||
'bash -c "agnes capture-session 2>/dev/null || true"',
|
||||
'bash -c "agnes self-upgrade --quiet 2>/dev/null || true; '
|
||||
'agnes pull --quiet 2>/dev/null || true"',
|
||||
'bash -c "agnes refresh-marketplace --check 2>/dev/null || true"',
|
||||
'bash -c "agnes push --quiet 2>/dev/null || true"',
|
||||
])
|
||||
# SessionEnd push must run detached. Claude Code in `-p` (headless) mode
|
||||
# SIGTERMs hook subprocesses after ~1 second regardless of work in
|
||||
|
|
@ -142,4 +167,107 @@ def install_claude_hooks(workspace: Path) -> None:
|
|||
'bash -c "( nohup agnes push --quiet </dev/null >/dev/null 2>&1 & ) ; true"',
|
||||
])
|
||||
|
||||
_install_statusline(cfg)
|
||||
|
||||
settings_path.write_text(json.dumps(cfg, indent=2) + "\n", encoding="utf-8")
|
||||
|
||||
|
||||
# Claude Code's `statusLine` setting tells the editor to invoke a command
|
||||
# on every status-bar refresh and display the first line of stdout. We
|
||||
# wire it to `agnes statusline`, which surfaces the `🔒 agnes-private`
|
||||
# indicator when the current session is marked private.
|
||||
#
|
||||
# Politeness: if the user (or another tool) has already set a `statusLine`
|
||||
# in the workspace `settings.json`, we leave it untouched and emit a
|
||||
# one-line stderr warning. Customizing the status bar is a personal
|
||||
# preference and Agnes shouldn't clobber it. Operators who want both
|
||||
# their custom output and the private indicator can compose `agnes
|
||||
# statusline` into their own status-line command manually.
|
||||
_STATUSLINE_MARKER = "agnes statusline"
|
||||
|
||||
|
||||
def _install_statusline(cfg: dict) -> None:
|
||||
existing = cfg.get("statusLine")
|
||||
# Distinguish "key absent" / "key=null" / "key=empty string" from any
|
||||
# real value. A `None` or `""` value is legal JSON but conveys "no
|
||||
# statusLine wanted" rather than "default" — overwriting it would
|
||||
# silently undo the user's explicit opt-out.
|
||||
if existing is None or existing == "":
|
||||
cfg["statusLine"] = {"type": "command", "command": "agnes statusline"}
|
||||
return
|
||||
if isinstance(existing, dict) and _STATUSLINE_MARKER in str(existing.get("command", "")):
|
||||
return # already ours — idempotent re-init
|
||||
print(
|
||||
"Warning: existing statusLine in .claude/settings.json preserved. "
|
||||
"To show the agnes-private indicator alongside your custom status, "
|
||||
"add `agnes statusline` to your command.",
|
||||
file=sys.stderr,
|
||||
)
|
||||
|
||||
|
||||
def workspace_has_agnes_hooks(workspace: Path) -> bool:
|
||||
"""True iff ``workspace/.claude/settings.json`` already shows signs of a
|
||||
prior ``agnes init``: at least one Agnes-managed hook entry, or our
|
||||
statusLine command.
|
||||
|
||||
Used as a guard by :func:`maybe_refresh_claude_hooks` so that
|
||||
``agnes self-upgrade`` (which fires from a SessionStart hook in every
|
||||
Agnes workspace) does not accidentally install hooks into a directory
|
||||
that is not an Agnes workspace — e.g. the user's home dir, if they
|
||||
invoke ``agnes self-upgrade`` manually from there.
|
||||
|
||||
Returns False on missing / malformed settings.json — the caller treats
|
||||
that as "not an Agnes workspace", so the refresh skips.
|
||||
"""
|
||||
settings_path = workspace / ".claude" / "settings.json"
|
||||
if not settings_path.exists():
|
||||
return False
|
||||
try:
|
||||
cfg = json.loads(settings_path.read_text(encoding="utf-8"))
|
||||
except (json.JSONDecodeError, OSError):
|
||||
return False
|
||||
if not isinstance(cfg, dict):
|
||||
return False
|
||||
sl = cfg.get("statusLine")
|
||||
if isinstance(sl, dict) and _STATUSLINE_MARKER in str(sl.get("command", "")):
|
||||
return True
|
||||
hooks = cfg.get("hooks", {})
|
||||
if not isinstance(hooks, dict):
|
||||
return False
|
||||
for entries in hooks.values():
|
||||
if not isinstance(entries, list):
|
||||
continue
|
||||
for entry in entries:
|
||||
if not isinstance(entry, dict):
|
||||
continue
|
||||
for h in entry.get("hooks", []) or []:
|
||||
cmd = h.get("command", "") if isinstance(h, dict) else ""
|
||||
if any(marker in cmd for marker in _OUR_COMMAND_MARKERS):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def maybe_refresh_claude_hooks(workspace: Path) -> bool:
|
||||
"""Idempotently re-install Agnes hooks if ``workspace`` already looks like
|
||||
an Agnes workspace, otherwise no-op.
|
||||
|
||||
Called from ``agnes self-upgrade`` so that operators on workspaces
|
||||
initialized with an older CLI version pick up new hook layout (e.g.
|
||||
the v0.49 SessionStart ``agnes capture-session`` entry) on the next
|
||||
session-start, without needing to re-run ``agnes init`` manually.
|
||||
Without this, an existing v0.48 workspace would auto-upgrade the CLI
|
||||
via its own SessionStart self-upgrade entry, but the new
|
||||
capture-session hook would never get installed — the queue would stay
|
||||
empty and ``agnes push`` would silently stop uploading sessions.
|
||||
|
||||
The guard (``workspace_has_agnes_hooks``) makes this safe to call from
|
||||
any working directory: ``agnes self-upgrade`` invoked from ``~/`` will
|
||||
not create ``~/.claude/`` or write hooks there.
|
||||
|
||||
Returns True if hooks were refreshed; False if the workspace looked
|
||||
non-Agnes and we skipped.
|
||||
"""
|
||||
if not workspace_has_agnes_hooks(workspace):
|
||||
return False
|
||||
install_claude_hooks(workspace)
|
||||
return True
|
||||
|
|
|
|||
160
cli/lib/private_list.py
Normal file
160
cli/lib/private_list.py
Normal file
|
|
@ -0,0 +1,160 @@
|
|||
"""Private session list — authoritative state for "do not upload".
|
||||
|
||||
A session is "private" iff its session_id appears (as a line) in
|
||||
``<workspace>/.claude/agnes-sessions-private.txt``. The list is the single
|
||||
source of truth — both ``agnes capture-session`` (queue writer) and
|
||||
``agnes push`` (queue reader) consult it and skip private session IDs.
|
||||
|
||||
By making the list authoritative rather than treating queue removal as
|
||||
authoritative, the slash-command / SessionStart-hook race condition is
|
||||
impossible by construction:
|
||||
|
||||
- If ``mark-private`` writes the ID before ``capture-session`` runs,
|
||||
``capture-session`` sees it on the list and skips the queue write.
|
||||
- If ``capture-session`` runs first, ``mark-private`` adds the ID to the
|
||||
list. ``push`` re-checks the list before each upload — the list itself
|
||||
is the source of truth, queue removal is incidental.
|
||||
|
||||
File format: one session_id per line, UTF-8, LF terminators. The file
|
||||
is never rewritten — only appended to (with dedup on add). The growth
|
||||
rate is one line per privately-marked session, so manual cleanup (if
|
||||
ever needed) is fine.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from pathlib import Path
|
||||
from threading import Lock
|
||||
|
||||
|
||||
_PRIVATE_LIST_FILENAME = "agnes-sessions-private.txt"
|
||||
|
||||
|
||||
def _claude_dir_writable(workspace: Path) -> Path:
|
||||
"""Return the workspace ``.claude/`` dir, creating it if needed.
|
||||
|
||||
Use ONLY from write paths (``add_private``). Read paths must use
|
||||
``_claude_dir_readonly`` so ``agnes statusline`` doesn't materialize
|
||||
``.claude/`` directories in arbitrary working dirs (it gets called on
|
||||
every prompt redraw, and the user may be in `~/` or another non-Agnes
|
||||
directory at the time).
|
||||
"""
|
||||
d = workspace / ".claude"
|
||||
d.mkdir(parents=True, exist_ok=True)
|
||||
return d
|
||||
|
||||
|
||||
def _claude_dir_readonly(workspace: Path) -> Path:
|
||||
"""Return the workspace ``.claude/`` path WITHOUT mkdir.
|
||||
|
||||
Read paths can compose this with ``Path.exists()`` to short-circuit
|
||||
cleanly when the directory doesn't exist — e.g. the user opened
|
||||
Claude Code in a non-Agnes directory; the statusline still calls
|
||||
``is_private`` once per redraw and creating a ``.claude/`` shadow
|
||||
in every directory the user opens is hostile.
|
||||
"""
|
||||
return workspace / ".claude"
|
||||
|
||||
|
||||
def private_list_path(workspace: Path, *, writable: bool = False) -> Path:
|
||||
"""Path to the private-session list.
|
||||
|
||||
``writable=True`` ensures the parent ``.claude/`` exists (use from
|
||||
``add_private``); the read default keeps the function side-effect-
|
||||
free so ``statusline`` and other hot-path readers don't churn the
|
||||
filesystem.
|
||||
"""
|
||||
parent = _claude_dir_writable(workspace) if writable else _claude_dir_readonly(workspace)
|
||||
return parent / _PRIVATE_LIST_FILENAME
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# mtime-keyed read cache. ``agnes statusline`` is documented to run once per
|
||||
# Claude Code prompt redraw (sub-second on an active session). The private
|
||||
# list is append-only, so mtime is sufficient cache-keying: if the file's
|
||||
# mtime hasn't moved, the cached set is still accurate. Cache keyed per
|
||||
# absolute file path so multiple workspaces in the same process don't
|
||||
# share state.
|
||||
#
|
||||
# The cache is process-local. statusline is invoked as a fresh subprocess
|
||||
# per redraw (Claude Code's ``statusLine`` setting spawns the configured
|
||||
# command), so the cache doesn't help the per-redraw path on its own —
|
||||
# subprocess spawn cost dominates. What it DOES help is in-process callers
|
||||
# (push doing one mtime stat per upload candidate, ``agnes diagnose``
|
||||
# scanning workspaces) and prevents the previous behaviour of calling
|
||||
# ``mkdir`` on every read (S2.7 from the PR review), which polluted
|
||||
# arbitrary directories with stray ``.claude/`` shadows.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_CACHE: dict[str, tuple[float, set[str]]] = {}
|
||||
_CACHE_LOCK = Lock()
|
||||
|
||||
|
||||
def _read_cached(path: Path) -> set[str]:
|
||||
key = os.fspath(path)
|
||||
try:
|
||||
mtime = path.stat().st_mtime
|
||||
except OSError:
|
||||
# File doesn't exist (or unreadable) — drop any cached entry and
|
||||
# return empty. Don't raise; statusline must never paint errors.
|
||||
with _CACHE_LOCK:
|
||||
_CACHE.pop(key, None)
|
||||
return set()
|
||||
with _CACHE_LOCK:
|
||||
cached = _CACHE.get(key)
|
||||
if cached and cached[0] == mtime:
|
||||
return cached[1]
|
||||
# Cache miss / stale — read outside the lock so concurrent readers
|
||||
# don't serialize on disk I/O. Last-writer-wins on the cache write
|
||||
# is fine: both readers will store the same (mtime, set) pair.
|
||||
out: set[str] = set()
|
||||
try:
|
||||
for raw in path.read_text(encoding="utf-8").splitlines():
|
||||
s = raw.strip()
|
||||
if s:
|
||||
out.add(s)
|
||||
except OSError:
|
||||
return set()
|
||||
with _CACHE_LOCK:
|
||||
_CACHE[key] = (mtime, out)
|
||||
return out
|
||||
|
||||
|
||||
def read_all_private(workspace: Path) -> set[str]:
|
||||
"""Return all private session IDs as a set for O(1) membership checks.
|
||||
|
||||
Returns an empty set if the file doesn't exist. Whitespace-only lines
|
||||
are skipped. Cached by file mtime — repeated calls within the same
|
||||
process return the cached set when the file hasn't changed.
|
||||
"""
|
||||
path = private_list_path(workspace)
|
||||
return _read_cached(path)
|
||||
|
||||
|
||||
def is_private(workspace: Path, session_id: str) -> bool:
|
||||
"""True iff ``session_id`` is on the private list. Empty session_id → False."""
|
||||
if not session_id:
|
||||
return False
|
||||
return session_id in read_all_private(workspace)
|
||||
|
||||
|
||||
def add_private(workspace: Path, session_id: str) -> bool:
|
||||
"""Append ``session_id`` to the private list if not already present.
|
||||
|
||||
Returns True if the entry was added, False if it was already present
|
||||
(idempotent re-mark). Empty session_id → no-op, returns False.
|
||||
"""
|
||||
if not session_id:
|
||||
return False
|
||||
if is_private(workspace, session_id):
|
||||
return False
|
||||
line = session_id.rstrip("\n") + "\n"
|
||||
with open(private_list_path(workspace, writable=True), "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
# Explicit cache eviction prevents a sub-second window where add+
|
||||
# is_private from the same process disagrees (mtime granularity on
|
||||
# some filesystems is 1s).
|
||||
with _CACHE_LOCK:
|
||||
_CACHE.pop(os.fspath(private_list_path(workspace)), None)
|
||||
return True
|
||||
64
cli/lib/push_lock.py
Normal file
64
cli/lib/push_lock.py
Normal file
|
|
@ -0,0 +1,64 @@
|
|||
"""Cross-platform single-instance lock for `agnes push`.
|
||||
|
||||
Wraps :class:`filelock.FileLock` (which delegates to ``fcntl.flock`` on POSIX
|
||||
and ``msvcrt.locking`` on Windows) into a context manager that returns
|
||||
``None`` when another push is already running. Callers use it via:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
with acquire_or_skip(workspace) as lock:
|
||||
if lock is None:
|
||||
return # silent exit — another push holds the lock
|
||||
do_push()
|
||||
|
||||
The OS releases the lock automatically when the holding process exits
|
||||
(including crashes), so we do NOT track PIDs or stale-lock ages. The lock
|
||||
file persists between runs but stays empty — :class:`filelock` only uses
|
||||
it for the kernel-level lock handle.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from contextlib import contextmanager
|
||||
from pathlib import Path
|
||||
from typing import Iterator
|
||||
|
||||
from filelock import FileLock, Timeout
|
||||
|
||||
|
||||
_LOCK_FILENAME = "agnes-push.lock"
|
||||
|
||||
|
||||
def lock_path(workspace: Path) -> Path:
|
||||
"""Resolve ``<workspace>/.claude/agnes-push.lock``."""
|
||||
claude_dir = workspace / ".claude"
|
||||
claude_dir.mkdir(parents=True, exist_ok=True)
|
||||
return claude_dir / _LOCK_FILENAME
|
||||
|
||||
|
||||
@contextmanager
|
||||
def acquire_or_skip(workspace: Path) -> Iterator[FileLock | None]:
|
||||
"""Yield the held lock, or ``None`` if the lock can't be acquired.
|
||||
|
||||
Non-blocking (``timeout=0``). Two ways acquisition can fail, both
|
||||
treated the same way (yield ``None`` so the caller can ``return`` /
|
||||
``sys.exit(0)`` quietly):
|
||||
|
||||
- ``filelock.Timeout`` — another push is already running. Expected
|
||||
when multiple SessionEnd hooks fire simultaneously after the user
|
||||
closes several Claude Code sessions at once: exactly one acquires
|
||||
the lock and runs, the rest no-op.
|
||||
- ``OSError`` — the lock file can't be created or opened (read-only
|
||||
filesystem, ``.claude/`` not writable, disk full, hardware I/O
|
||||
error). Rare; when it happens the operator's environment has
|
||||
bigger problems than missing session uploads. We swallow it so
|
||||
``agnes push`` exits cleanly instead of dumping an opaque
|
||||
traceback to stderr or, in the SessionEnd hook context, crashing
|
||||
silently under ``|| true``.
|
||||
"""
|
||||
lock = FileLock(str(lock_path(workspace)))
|
||||
try:
|
||||
with lock.acquire(timeout=0):
|
||||
yield lock
|
||||
except (Timeout, OSError):
|
||||
yield None
|
||||
289
cli/lib/session_queue.py
Normal file
289
cli/lib/session_queue.py
Normal file
|
|
@ -0,0 +1,289 @@
|
|||
"""Session queue and uploaded-log management for `agnes push`.
|
||||
|
||||
The push command operates on a queue file
|
||||
(``<workspace>/.claude/agnes-sessions.txt``) populated by the
|
||||
``agnes capture-session`` SessionStart hook. Each line is a TSV pair:
|
||||
``<session_id>\\t<transcript_path>``. session_id is needed so the
|
||||
push and slash-command machinery can filter against the private
|
||||
list (``cli/lib/private_list.py``).
|
||||
|
||||
Backward compatibility: legacy lines without a tab (just an absolute
|
||||
path) are accepted and treated as having an empty session_id. They
|
||||
still upload via push but cannot be marked private retroactively —
|
||||
which is fine, since by definition they pre-date the feature.
|
||||
|
||||
Race protection: push atomically renames the queue to a snapshot file
|
||||
before processing. New SessionStart hooks write to a freshly-created
|
||||
queue without their entries being clobbered by the eventual rewrite.
|
||||
A short-lived ``agnes-queue.lock`` (filelock) serializes the rename
|
||||
against in-flight appends so the queue file is never written to and
|
||||
renamed concurrently — required on Windows, where ``os.rename`` fails
|
||||
if another handle has the file open, and where ``open(path, "a")`` is
|
||||
not atomic across writers.
|
||||
|
||||
Recovery: if push crashes mid-snapshot, the snapshot file persists. The
|
||||
next push picks it up via :func:`find_recovery_snapshots` and processes
|
||||
it before touching the live queue.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
from filelock import FileLock
|
||||
|
||||
|
||||
_QUEUE_FILENAME = "agnes-sessions.txt"
|
||||
_UPLOADED_FILENAME = "agnes-sessions-uploaded.txt"
|
||||
_PRIVATE_SKIPPED_FILENAME = "agnes-sessions-private-skipped.txt"
|
||||
_FAILED_FILENAME = "agnes-sessions-failed.txt"
|
||||
_SNAPSHOT_PREFIX = "agnes-sessions.snapshot."
|
||||
_SNAPSHOT_SUFFIX = ".txt"
|
||||
_QUEUE_LOCK_FILENAME = "agnes-queue.lock"
|
||||
|
||||
|
||||
def _claude_dir(workspace: Path) -> Path:
|
||||
"""Return ``<workspace>/.claude``, creating it if missing."""
|
||||
d = workspace / ".claude"
|
||||
d.mkdir(parents=True, exist_ok=True)
|
||||
return d
|
||||
|
||||
|
||||
def queue_path(workspace: Path) -> Path:
|
||||
return _claude_dir(workspace) / _QUEUE_FILENAME
|
||||
|
||||
|
||||
def uploaded_log_path(workspace: Path) -> Path:
|
||||
return _claude_dir(workspace) / _UPLOADED_FILENAME
|
||||
|
||||
|
||||
def private_skipped_log_path(workspace: Path) -> Path:
|
||||
return _claude_dir(workspace) / _PRIVATE_SKIPPED_FILENAME
|
||||
|
||||
|
||||
def failed_log_path(workspace: Path) -> Path:
|
||||
return _claude_dir(workspace) / _FAILED_FILENAME
|
||||
|
||||
|
||||
def _queue_lock_path(workspace: Path) -> Path:
|
||||
"""Lock file serializing concurrent writers to the queue file.
|
||||
|
||||
Separate from ``agnes-push.lock`` — that one serializes the push
|
||||
command end-to-end; this one is short-lived (held only for the
|
||||
duration of a single append or rename).
|
||||
"""
|
||||
return _claude_dir(workspace) / _QUEUE_LOCK_FILENAME
|
||||
|
||||
|
||||
def append_to_queue(workspace: Path, session_id: str, transcript_path: str) -> None:
|
||||
"""Append a ``<session_id>\\t<transcript_path>`` line to the queue.
|
||||
|
||||
Held under ``agnes-queue.lock`` to serialize concurrent SessionStart
|
||||
hooks. Python's ``open(path, "a")`` is NOT atomic on Windows — the
|
||||
CRT does not pass ``FILE_APPEND_DATA`` to ``CreateFile``, so it's a
|
||||
plain seek-to-end + write that can interleave bytes mid-line under
|
||||
concurrent writers (e.g. user opens several Claude Code windows
|
||||
simultaneously). The lock makes the append safe on every platform.
|
||||
|
||||
No deduplication here: duplicates may legitimately appear (resume
|
||||
scenario re-writes the same path). Dedup happens at read time.
|
||||
"""
|
||||
sid = (session_id or "").rstrip("\n").rstrip("\t")
|
||||
tp = transcript_path.rstrip("\n")
|
||||
line = f"{sid}\t{tp}\n"
|
||||
with FileLock(str(_queue_lock_path(workspace))):
|
||||
with open(queue_path(workspace), "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
|
||||
|
||||
def snapshot_queue(workspace: Path) -> Path | None:
|
||||
"""Atomically rename the live queue to a snapshot for processing.
|
||||
|
||||
Returns the snapshot path, or None if the queue doesn't exist (no work
|
||||
to do). The snapshot filename embeds the current PID *and* a random
|
||||
uuid8 hex tail: PID alone is not unique after the OS recycles it
|
||||
(Linux wraps at ~32768 by default), so a crashed push leaving a
|
||||
snapshot on disk could be silently overwritten by a future push with
|
||||
the same PID — ``os.rename`` atomically replaces the destination on
|
||||
POSIX and Windows alike, so data loss would be silent. The uuid tail
|
||||
makes every snapshot filename unique regardless of PID reuse.
|
||||
|
||||
Held under ``agnes-queue.lock`` to serialize against in-flight
|
||||
``append_to_queue`` calls: on Windows, ``os.rename`` would fail with
|
||||
``PermissionError`` if another handle has the queue open for write,
|
||||
so the lock prevents that race. The lock is short-lived (single
|
||||
rename), so it doesn't meaningfully delay concurrent capture-session
|
||||
hooks.
|
||||
"""
|
||||
queue = queue_path(workspace)
|
||||
if not queue.exists():
|
||||
return None
|
||||
unique = uuid.uuid4().hex[:8]
|
||||
snapshot = (
|
||||
_claude_dir(workspace)
|
||||
/ f"{_SNAPSHOT_PREFIX}{os.getpid()}.{unique}{_SNAPSHOT_SUFFIX}"
|
||||
)
|
||||
with FileLock(str(_queue_lock_path(workspace))):
|
||||
try:
|
||||
os.rename(queue, snapshot)
|
||||
except FileNotFoundError:
|
||||
return None # race: queue removed between exists() and rename()
|
||||
return snapshot
|
||||
|
||||
|
||||
def _parse_queue_line(raw: str) -> tuple[str, Path] | None:
|
||||
"""Parse one queue line into (session_id, path), or None if blank/invalid."""
|
||||
s = raw.strip()
|
||||
if not s:
|
||||
return None
|
||||
if "\t" in s:
|
||||
sid, _, p = s.partition("\t")
|
||||
sid = sid.strip()
|
||||
p = p.strip()
|
||||
else:
|
||||
# Legacy format: bare path, no session_id known.
|
||||
sid = ""
|
||||
p = s
|
||||
if not p:
|
||||
return None
|
||||
return sid, Path(p)
|
||||
|
||||
|
||||
def read_entries_from_snapshot(snapshot: Path) -> list[tuple[str, Path]]:
|
||||
"""Read (session_id, path) entries from a snapshot, deduplicated.
|
||||
|
||||
Deduplication is by the (session_id, path) pair — preserves first-seen
|
||||
order. Blank lines and lines without a path are skipped. Mixed legacy
|
||||
(1-column) and new (2-column) lines coexist.
|
||||
|
||||
Repeats from the resume scenario collapse into a single entry: the
|
||||
server-side overwrite makes a second upload of the same path redundant
|
||||
within one push run.
|
||||
"""
|
||||
if not snapshot.exists():
|
||||
return []
|
||||
seen: set[tuple[str, str]] = set()
|
||||
out: list[tuple[str, Path]] = []
|
||||
for raw in snapshot.read_text(encoding="utf-8").splitlines():
|
||||
parsed = _parse_queue_line(raw)
|
||||
if parsed is None:
|
||||
continue
|
||||
sid, path = parsed
|
||||
key = (sid, str(path))
|
||||
if key in seen:
|
||||
continue
|
||||
seen.add(key)
|
||||
out.append(parsed)
|
||||
return out
|
||||
|
||||
|
||||
# Backward-compatible alias for code that only needs paths. Returns just
|
||||
# the paths (preserving the old ``list[Path]`` shape) for callers that
|
||||
# don't care about session_id. Internally used by the dry-run preview
|
||||
# path which only displays files.
|
||||
def read_paths_from_snapshot(snapshot: Path) -> list[Path]:
|
||||
return [path for _sid, path in read_entries_from_snapshot(snapshot)]
|
||||
|
||||
|
||||
def find_recovery_snapshots(workspace: Path) -> list[Path]:
|
||||
"""Return any pre-existing snapshot files left behind by a crashed push."""
|
||||
return sorted(_claude_dir(workspace).glob(f"{_SNAPSHOT_PREFIX}*{_SNAPSHOT_SUFFIX}"))
|
||||
|
||||
|
||||
def discard_snapshot(snapshot: Path) -> None:
|
||||
"""Delete a fully-processed snapshot file. Idempotent."""
|
||||
try:
|
||||
snapshot.unlink()
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
|
||||
|
||||
def mark_uploaded(
|
||||
workspace: Path,
|
||||
transcript_path: Path,
|
||||
when: datetime | None = None,
|
||||
) -> None:
|
||||
"""Append `<iso_timestamp>\\t<absolute_path>\\n` to the uploaded log."""
|
||||
if when is None:
|
||||
when = datetime.now(timezone.utc)
|
||||
ts = when.strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
line = f"{ts}\t{transcript_path}\n"
|
||||
with open(uploaded_log_path(workspace), "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
|
||||
|
||||
def mark_private_skipped(
|
||||
workspace: Path,
|
||||
session_id: str,
|
||||
transcript_path: Path,
|
||||
when: datetime | None = None,
|
||||
) -> None:
|
||||
"""Append `<iso_timestamp>\\t<session_id>\\t<path>` to the private-skipped audit log.
|
||||
|
||||
Called by push when it filters out an entry whose session_id is on
|
||||
the private list. The audit log is append-only — its purpose is to
|
||||
surface (during incident review or user support) which sessions were
|
||||
intentionally NOT uploaded.
|
||||
"""
|
||||
if when is None:
|
||||
when = datetime.now(timezone.utc)
|
||||
ts = when.strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
line = f"{ts}\t{session_id}\t{transcript_path}\n"
|
||||
with open(private_skipped_log_path(workspace), "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
|
||||
|
||||
def mark_failed_permanent(
|
||||
workspace: Path,
|
||||
session_id: str,
|
||||
transcript_path: Path,
|
||||
status_code: int,
|
||||
when: datetime | None = None,
|
||||
) -> None:
|
||||
"""Append `<iso_timestamp>\\t<session_id>\\t<status>\\t<path>` to the
|
||||
permanent-failure audit log.
|
||||
|
||||
Called by push when the server returns a 4xx (other than 408 / 429)
|
||||
— deterministic failures where retrying never succeeds (401 token
|
||||
expired, 403 RBAC denial, 413 payload too large, 400 server
|
||||
validation, etc.). The transcript path is logged here instead of
|
||||
silently dropped so operators have a forensic trail; the entry is
|
||||
NOT re-queued, breaking the prior infinite-loop bug where every
|
||||
push run would re-bombard the server with the same failing upload.
|
||||
|
||||
No separate lock: piggybacks on `agnes-push.lock` (the
|
||||
single-instance push lock), same as `mark_uploaded` and
|
||||
`mark_private_skipped`. Push is the only writer to this file.
|
||||
"""
|
||||
if when is None:
|
||||
when = datetime.now(timezone.utc)
|
||||
ts = when.strftime("%Y-%m-%dT%H:%M:%SZ")
|
||||
line = f"{ts}\t{session_id}\t{status_code}\t{transcript_path}\n"
|
||||
with open(failed_log_path(workspace), "a", encoding="utf-8") as f:
|
||||
f.write(line)
|
||||
|
||||
|
||||
def requeue_failed(
|
||||
workspace: Path,
|
||||
entries: list[tuple[str, Path]],
|
||||
) -> None:
|
||||
"""Append failed (session_id, path) entries back to the live queue.
|
||||
|
||||
Failed entries land at the end of the queue alongside any fresh
|
||||
appends that hooks wrote during this push run. Relative ordering
|
||||
vs. those fresh entries is best-effort — order doesn't affect
|
||||
correctness.
|
||||
|
||||
Held under ``agnes-queue.lock`` because concurrent ``capture-session``
|
||||
hooks (which don't hold the push lock) may be appending at the same
|
||||
time — same Windows non-atomicity concern as ``append_to_queue``.
|
||||
"""
|
||||
if not entries:
|
||||
return
|
||||
with FileLock(str(_queue_lock_path(workspace))):
|
||||
with open(queue_path(workspace), "a", encoding="utf-8") as f:
|
||||
for sid, p in entries:
|
||||
f.write(f"{sid}\t{p}\n")
|
||||
|
|
@ -25,11 +25,14 @@ if sys.platform == "win32":
|
|||
pass
|
||||
|
||||
from cli.commands.auth import auth_app
|
||||
from cli.commands.capture_session import capture_session_app
|
||||
from cli.commands.init import init_app
|
||||
from cli.commands.mark_private import mark_private_app
|
||||
from cli.commands.onboarded import onboarded_app
|
||||
from cli.commands.pull import pull_app
|
||||
from cli.commands.push import push_app
|
||||
from cli.commands.refresh_marketplace import refresh_marketplace_app
|
||||
from cli.commands.statusline import statusline_app
|
||||
from cli.commands.query import query_command
|
||||
from cli.commands.status import status_app
|
||||
from cli.commands.admin import admin_app
|
||||
|
|
@ -114,6 +117,9 @@ app.add_typer(init_app, name="init")
|
|||
app.add_typer(onboarded_app, name="onboarded")
|
||||
app.add_typer(pull_app, name="pull")
|
||||
app.add_typer(push_app, name="push")
|
||||
app.add_typer(capture_session_app, name="capture-session")
|
||||
app.add_typer(mark_private_app, name="mark-private")
|
||||
app.add_typer(statusline_app, name="statusline")
|
||||
app.add_typer(refresh_marketplace_app, name="refresh-marketplace")
|
||||
app.command("query")(query_command)
|
||||
app.add_typer(status_app, name="status")
|
||||
|
|
|
|||
8
cli/templates/commands/agnes-private.md
Normal file
8
cli/templates/commands/agnes-private.md
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
---
|
||||
description: Mark the current Claude Code session as private — exclude it from Agnes upload
|
||||
---
|
||||
|
||||
!`agnes mark-private`
|
||||
|
||||
The session above has been marked as private. Its transcript will be skipped
|
||||
the next time `agnes push` runs.
|
||||
|
|
@ -37,6 +37,24 @@ This workspace is connected to {{ server.url }}.
|
|||
- `agnes push` — upload sessions and local notes to server
|
||||
- Data on the server refreshes every {{ sync_interval }}
|
||||
|
||||
## Private sessions
|
||||
|
||||
If a Claude Code session covers something you do **not** want uploaded
|
||||
to the Agnes server (sensitive PII walk-through, exploratory work that
|
||||
shouldn't shape org-wide context, etc.), mark it private:
|
||||
|
||||
- Run `/agnes-private` from inside the session — the deterministic
|
||||
`!`-prefix bash adds the current `CLAUDE_CODE_SESSION_ID` to
|
||||
`<workspace>/.claude/agnes-sessions-private.txt`. Subsequent
|
||||
`agnes push` runs skip its transcript and audit-log the skip to
|
||||
`<workspace>/.claude/agnes-sessions-private-skipped.txt`.
|
||||
- The Claude Code statusbar shows `🔒 agnes-private` while you are
|
||||
in a private session, so the marking is visible at a glance.
|
||||
- Marking before `capture-session` fires (the SessionStart hook that
|
||||
queues the transcript) means the session never enters the upload
|
||||
queue at all; marking after it fires is also safe — the push-side
|
||||
re-check on the private list is the authoritative filter.
|
||||
|
||||
## Discovering tables — never enumerate from memory
|
||||
|
||||
Tables, columns, sizes, descriptions, and `query_mode` change as admins
|
||||
|
|
@ -67,15 +85,28 @@ and use `agnes snapshot create --estimate` to size-check before fetching.
|
|||
{% if marketplaces -%}
|
||||
## Agnes Marketplace — plugins available to you
|
||||
|
||||
These plugins reach Claude Code through the per-user **`agnes`** marketplace
|
||||
served by this server (an aggregated, RBAC-filtered view of the upstream
|
||||
marketplaces below). When you install or invoke one of these plugins inside
|
||||
Claude Code, address it as `<plugin>@agnes` regardless of which upstream it
|
||||
came from — e.g. `claude plugin install <plugin>@agnes`. The
|
||||
`agnes refresh-marketplace` command (run by the SessionStart hook every
|
||||
session) keeps the local clone in sync.
|
||||
The Agnes server publishes a per-user **`agnes`** Claude Code marketplace,
|
||||
aggregated from upstream marketplaces and RBAC-filtered. The list below is
|
||||
your **eligibility set** — plugins your account is allowed to add to its
|
||||
stack on the `/marketplace` page. Eligibility ≠ installed.
|
||||
|
||||
Upstream marketplaces folded into your `agnes` view:
|
||||
What actually reaches Claude Code is your **served stack**, which is:
|
||||
- plugins you opted in to on `/marketplace` (from the eligibility set above),
|
||||
- plus any plugin your admin pinned as system-mandatory (auto-applied to every user),
|
||||
- plus skills / agents / plugins you installed from the **Flea market** tab on `/marketplace`.
|
||||
|
||||
When you install or invoke a stack plugin inside Claude Code, address it
|
||||
as `<plugin>@agnes` regardless of which upstream marketplace it came from
|
||||
— e.g. `claude plugin install <plugin>@agnes`.
|
||||
|
||||
Updates: the SessionStart hook runs `agnes refresh-marketplace --check`
|
||||
on every Claude Code session — it only **detects** server-side changes
|
||||
(new admin grants, new system plugins, your own `/marketplace` picks) and
|
||||
prompts you to run `/update-agnes-plugins` inside Claude Code to install
|
||||
the diff. The slash command does the full reconcile with output visible
|
||||
in the transcript; no silent auto-install at session start.
|
||||
|
||||
Upstream marketplaces folded into your `agnes` view (eligibility):
|
||||
{% for mp in marketplaces -%}
|
||||
- **{{ mp.name }}** ({{ mp.slug }}): {{ mp.plugins | map(attribute="name") | join(", ") }}
|
||||
{% endfor %}
|
||||
|
|
|
|||
|
|
@ -88,6 +88,11 @@ dependencies = [
|
|||
# src/welcome_template.py was vulnerable to. Pre-built wheels published
|
||||
# for all supported (mac/linux/windows × arm64/x86_64) targets.
|
||||
"nh3>=0.2",
|
||||
# Cross-platform advisory file locking for the `agnes push` single-instance
|
||||
# guard. Wraps fcntl.flock on POSIX and msvcrt.locking on Windows behind
|
||||
# a uniform API; OS releases the lock automatically on process exit (no
|
||||
# stale-lock detection required). Used by cli/lib/push_lock.py.
|
||||
"filelock>=3.13,<4",
|
||||
]
|
||||
|
||||
[project.optional-dependencies]
|
||||
|
|
|
|||
197
tests/test_capture_session.py
Normal file
197
tests/test_capture_session.py
Normal file
|
|
@ -0,0 +1,197 @@
|
|||
"""Tests for `agnes capture-session` — SessionStart hook helper."""
|
||||
|
||||
import json
|
||||
|
||||
from typer.testing import CliRunner
|
||||
|
||||
from cli.commands.capture_session import capture_session_app
|
||||
from cli.lib.private_list import add_private
|
||||
from cli.lib.session_queue import queue_path
|
||||
|
||||
runner = CliRunner()
|
||||
|
||||
|
||||
def test_capture_appends_session_id_and_transcript_path(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
payload = json.dumps({
|
||||
"session_id": "abc-123",
|
||||
"transcript_path": "/Users/foo/.claude/projects/.../abc.jsonl",
|
||||
"cwd": "/Users/foo/work",
|
||||
"hook_event_name": "SessionStart",
|
||||
})
|
||||
result = runner.invoke(capture_session_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == \
|
||||
"abc-123\t/Users/foo/.claude/projects/.../abc.jsonl\n"
|
||||
|
||||
|
||||
def test_capture_writes_empty_session_id_when_field_missing(tmp_path, monkeypatch):
|
||||
"""Payload missing session_id: still write to queue (legacy behavior).
|
||||
The empty session_id means the entry cannot be marked private later
|
||||
via /agnes-private (which keys on session_id), but it'll still upload."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
payload = json.dumps({"transcript_path": "/abc.jsonl"})
|
||||
result = runner.invoke(capture_session_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == "\t/abc.jsonl\n"
|
||||
|
||||
|
||||
def test_capture_silently_skips_when_field_missing(tmp_path, monkeypatch):
|
||||
"""transcript_path is required; absence is a no-op."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(capture_session_app, [], input='{"session_id": "abc"}')
|
||||
assert result.exit_code == 0
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_capture_silently_skips_invalid_json(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(capture_session_app, [], input="not json {")
|
||||
assert result.exit_code == 0
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_capture_silently_skips_empty_stdin(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(capture_session_app, [], input="")
|
||||
assert result.exit_code == 0
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_capture_silently_skips_non_string_transcript_path(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
payload = json.dumps({"transcript_path": 123})
|
||||
result = runner.invoke(capture_session_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_capture_silently_skips_when_payload_not_object(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(capture_session_app, [], input='["array", "not object"]')
|
||||
assert result.exit_code == 0
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_capture_appends_multiple_calls(tmp_path, monkeypatch):
|
||||
"""Resume scenario: SessionStart fires again — same path appended twice."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
p1 = json.dumps({"session_id": "abc", "transcript_path": "/abc.jsonl"})
|
||||
p2 = json.dumps({"session_id": "abc", "transcript_path": "/abc.jsonl"})
|
||||
runner.invoke(capture_session_app, [], input=p1)
|
||||
runner.invoke(capture_session_app, [], input=p2)
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == (
|
||||
"abc\t/abc.jsonl\n"
|
||||
"abc\t/abc.jsonl\n"
|
||||
)
|
||||
|
||||
|
||||
def test_capture_verbose_emits_diagnostic(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
payload = json.dumps({"session_id": "abc", "transcript_path": "/x.jsonl"})
|
||||
result = runner.invoke(capture_session_app, ["--verbose"], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert "/x.jsonl" in result.output
|
||||
|
||||
|
||||
def test_capture_skips_queue_when_session_is_already_private(tmp_path, monkeypatch):
|
||||
"""Race scenario: user ran /agnes-private BEFORE the SessionStart hook
|
||||
chain reached capture-session. The private list write happened first;
|
||||
capture-session sees it and skips the queue append."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
add_private(tmp_path, "abc-123")
|
||||
payload = json.dumps({
|
||||
"session_id": "abc-123",
|
||||
"transcript_path": "/abc.jsonl",
|
||||
})
|
||||
result = runner.invoke(capture_session_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
# Queue was NOT written.
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_capture_writes_when_unrelated_session_is_private(tmp_path, monkeypatch):
|
||||
"""Marking session X private must not block capture of unrelated session Y."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
add_private(tmp_path, "other-session")
|
||||
payload = json.dumps({
|
||||
"session_id": "abc-123",
|
||||
"transcript_path": "/abc.jsonl",
|
||||
})
|
||||
result = runner.invoke(capture_session_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == "abc-123\t/abc.jsonl\n"
|
||||
|
||||
|
||||
# ---------- Breadcrumb tests (David #11 from PR review) ---------------------
|
||||
#
|
||||
# capture-session is a SessionStart hook that always exits 0 (it must NOT
|
||||
# fail loudly inside Claude Code's startup chain). Without an external
|
||||
# observability signal, an upstream contract change (Claude Code's stdin
|
||||
# JSON shape shifts; the queue mysteriously stays empty) is invisible to
|
||||
# operators. The breadcrumb log gives `agnes diagnose` something to
|
||||
# inspect.
|
||||
|
||||
from cli.commands.capture_session import _BREADCRUMB_FILENAME
|
||||
|
||||
|
||||
def _breadcrumb_lines(workspace) -> list[str]:
|
||||
"""Read all breadcrumb lines, dropping trailing newline."""
|
||||
path = workspace / ".claude" / _BREADCRUMB_FILENAME
|
||||
if not path.exists():
|
||||
return []
|
||||
return [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln]
|
||||
|
||||
|
||||
def test_capture_writes_ok_breadcrumb_on_success(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
payload = json.dumps({"session_id": "abc-123", "transcript_path": "/abc.jsonl"})
|
||||
runner.invoke(capture_session_app, [], input=payload)
|
||||
lines = _breadcrumb_lines(tmp_path)
|
||||
assert len(lines) == 1
|
||||
parts = lines[0].split("\t")
|
||||
assert parts[1] == "ok"
|
||||
assert parts[2] == "abc-123"
|
||||
|
||||
|
||||
def test_capture_writes_bad_json_breadcrumb_on_invalid_input(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
# Pre-create .claude/ so breadcrumb has somewhere to land (the breadcrumb
|
||||
# is read-only re: dir creation — see _record_breadcrumb).
|
||||
(tmp_path / ".claude").mkdir()
|
||||
runner.invoke(capture_session_app, [], input="not json at all")
|
||||
lines = _breadcrumb_lines(tmp_path)
|
||||
assert len(lines) == 1
|
||||
assert lines[0].split("\t")[1] == "bad_json"
|
||||
|
||||
|
||||
def test_capture_writes_no_transcript_breadcrumb_when_field_missing(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
(tmp_path / ".claude").mkdir()
|
||||
runner.invoke(capture_session_app, [], input=json.dumps({"session_id": "x"}))
|
||||
lines = _breadcrumb_lines(tmp_path)
|
||||
assert len(lines) == 1
|
||||
assert lines[0].split("\t")[1] == "no_transcript_path"
|
||||
|
||||
|
||||
def test_capture_writes_private_skip_breadcrumb_on_marked_session(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
add_private(tmp_path, "private-sid")
|
||||
runner.invoke(
|
||||
capture_session_app, [],
|
||||
input=json.dumps({"session_id": "private-sid", "transcript_path": "/x"}),
|
||||
)
|
||||
lines = _breadcrumb_lines(tmp_path)
|
||||
assert lines[-1].split("\t")[1] == "private_skip"
|
||||
assert lines[-1].split("\t")[2] == "private-sid"
|
||||
|
||||
|
||||
def test_breadcrumb_does_not_create_claude_dir_in_arbitrary_workspaces(tmp_path, monkeypatch):
|
||||
"""If the workspace has no .claude/ directory, capture-session
|
||||
never materializes one — same rationale as the read-only path in
|
||||
private_list.py. Hooks fire in directories the user just opened
|
||||
Claude Code in; the breadcrumb log shouldn't pollute those."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
runner.invoke(capture_session_app, [], input="malformed")
|
||||
# No .claude/ created, no breadcrumb written.
|
||||
assert not (tmp_path / ".claude").exists()
|
||||
|
|
@ -1,30 +1,70 @@
|
|||
"""Tests for agnes push command (SessionEnd uploader)."""
|
||||
|
||||
import json
|
||||
import re
|
||||
from contextlib import contextmanager
|
||||
|
||||
from typer.testing import CliRunner
|
||||
|
||||
from cli.commands.push import push_app
|
||||
from cli.lib.private_list import add_private
|
||||
from cli.lib.session_queue import (
|
||||
append_to_queue,
|
||||
failed_log_path,
|
||||
private_skipped_log_path,
|
||||
queue_path,
|
||||
uploaded_log_path,
|
||||
)
|
||||
|
||||
# CI-safety: Typer/rich emits ANSI escapes in --help output. Strip before asserts.
|
||||
_ANSI_RE = __import__("re").compile(r"\x1b\[[0-9;]*m")
|
||||
_ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")
|
||||
|
||||
|
||||
def _clean(s: str) -> str:
|
||||
return _ANSI_RE.sub("", s)
|
||||
|
||||
from cli.commands.push import push_app
|
||||
|
||||
runner = CliRunner()
|
||||
|
||||
|
||||
class _FakeResp:
|
||||
def __init__(self, status_code: int = 200) -> None:
|
||||
self.status_code = status_code
|
||||
|
||||
|
||||
def _stub_config(monkeypatch) -> None:
|
||||
monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
|
||||
monkeypatch.setattr("cli.commands.push.get_token", lambda: "test-pat")
|
||||
|
||||
|
||||
def _record_uploads(monkeypatch) -> list[tuple[str, dict]]:
|
||||
"""Patch api_post to record calls and return success. Returns the recorder list."""
|
||||
calls: list[tuple[str, dict]] = []
|
||||
|
||||
def _fake(endpoint, **kwargs):
|
||||
calls.append((endpoint, kwargs))
|
||||
return _FakeResp(200)
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _fake)
|
||||
return calls
|
||||
|
||||
|
||||
# ---------- Smoke + dry-run --------------------------------------------------
|
||||
|
||||
|
||||
def test_push_help():
|
||||
result = runner.invoke(push_app, ["--help"])
|
||||
assert result.exit_code == 0
|
||||
assert "--quiet" in _clean(result.output)
|
||||
assert "--json" in _clean(result.output)
|
||||
assert "--dry-run" in _clean(result.output)
|
||||
assert "--legacy-scan" in _clean(result.output)
|
||||
|
||||
|
||||
def test_push_no_sessions_no_mkdir(tmp_path, monkeypatch):
|
||||
"""Empty workspace -> push exits 0, doesn't create user/sessions/."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
|
||||
monkeypatch.setattr("cli.commands.push.get_token", lambda: "test-pat")
|
||||
_stub_config(monkeypatch)
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
assert not (tmp_path / "user" / "sessions").exists(), \
|
||||
|
|
@ -34,17 +74,526 @@ def test_push_no_sessions_no_mkdir(tmp_path, monkeypatch):
|
|||
def test_push_dry_run_no_writes(tmp_path, monkeypatch):
|
||||
"""--dry-run lists what would upload but sends nothing."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
|
||||
monkeypatch.setattr("cli.commands.push.get_token", lambda: "test-pat")
|
||||
# Create a session file
|
||||
sessions_dir = tmp_path / "user" / "sessions"
|
||||
sessions_dir.mkdir(parents=True)
|
||||
(sessions_dir / "abc.jsonl").write_text('{"event":"test"}\n')
|
||||
_stub_config(monkeypatch)
|
||||
transcript = tmp_path / "abc.jsonl"
|
||||
transcript.write_text('{"event":"test"}\n')
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
# No api_post should be called - patch it to fail loudly if invoked
|
||||
def _raise(*a, **kw):
|
||||
raise AssertionError("api_post was called during --dry-run")
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _raise)
|
||||
|
||||
result = runner.invoke(push_app, ["--dry-run"])
|
||||
assert result.exit_code == 0
|
||||
assert queue_path(tmp_path).exists() # not consumed
|
||||
|
||||
|
||||
# ---------- Queue happy path + dedup + lock + recovery ----------------------
|
||||
|
||||
|
||||
def test_push_uploads_queued_session_and_clears_queue(tmp_path, monkeypatch):
|
||||
"""Happy path: queue has one session, push uploads it, queue cleared, log written."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
transcript = tmp_path / "abc.jsonl"
|
||||
transcript.write_text('{"event":"test"}\n')
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
|
||||
assert len(sessions_calls) == 1
|
||||
assert not queue_path(tmp_path).exists()
|
||||
log = uploaded_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert str(transcript) in log
|
||||
assert "\t" in log
|
||||
|
||||
|
||||
def test_push_dedups_duplicate_paths_in_queue(tmp_path, monkeypatch):
|
||||
"""Resume scenario: same (session_id, path) queued twice — push uploads once."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
transcript = tmp_path / "abc.jsonl"
|
||||
transcript.write_text('{"event":"test"}\n')
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
|
||||
assert len(sessions_calls) == 1
|
||||
|
||||
|
||||
def test_push_silent_exit_when_lock_held(tmp_path, monkeypatch):
|
||||
"""Concurrent SessionEnd hooks: only one push runs, others silent-exit."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
@contextmanager
|
||||
def _yield_none(workspace):
|
||||
yield None
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.acquire_or_skip", _yield_none)
|
||||
|
||||
def _raise(*a, **kw):
|
||||
raise AssertionError("api_post called when lock unavailable")
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _raise)
|
||||
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
assert result.output == ""
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == f"sid-1\t{transcript}\n"
|
||||
|
||||
|
||||
def test_push_silent_exit_when_filelock_raises_oserror(tmp_path, monkeypatch):
|
||||
"""OSError from filelock (read-only FS, permission denied, disk full)
|
||||
must not crash push with an unhandled traceback. Exercises the real
|
||||
acquire_or_skip by replacing it with a context manager that raises
|
||||
OSError on entry — simulates what filelock.FileLock.acquire raises
|
||||
on a read-only mount."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
# Queue setup must happen BEFORE we install the failing lock — the
|
||||
# `append_to_queue` path holds its own `agnes-queue.lock` and a
|
||||
# blanket `FileLock.acquire` patch would break that one too.
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
# Wrap the real acquire_or_skip with one that raises OSError before
|
||||
# yielding. We can't just patch `cli.commands.push.acquire_or_skip`
|
||||
# because the new behaviour lives inside `acquire_or_skip` itself —
|
||||
# we have to exercise its except handler. So we patch `FileLock`
|
||||
# used inside push_lock: subclass with overridden `acquire` that
|
||||
# raises OSError.
|
||||
from cli.lib import push_lock as pl
|
||||
|
||||
class _BrokenLock:
|
||||
def __init__(self, path: str) -> None:
|
||||
self._path = path
|
||||
|
||||
def acquire(self, timeout: float = -1):
|
||||
raise OSError("read-only filesystem")
|
||||
|
||||
monkeypatch.setattr(pl, "FileLock", _BrokenLock)
|
||||
|
||||
def _api_should_not_be_called(*a, **kw):
|
||||
raise AssertionError("api_post called when lock acquisition raised OSError")
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _api_should_not_be_called)
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0, f"push must exit 0 on OSError, got: {result.output}"
|
||||
# No traceback in output
|
||||
assert "Traceback" not in result.output
|
||||
# Queue preserved for next push attempt
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == f"sid-1\t{transcript}\n"
|
||||
|
||||
|
||||
def test_push_processes_recovery_snapshot_first(tmp_path, monkeypatch):
|
||||
"""Pre-existing snapshot from a crashed push gets picked up."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
claude = tmp_path / ".claude"
|
||||
claude.mkdir(parents=True, exist_ok=True)
|
||||
recovery = claude / "agnes-sessions.snapshot.99999.txt"
|
||||
crashed_jsonl = tmp_path / "crashed.jsonl"
|
||||
crashed_jsonl.write_text("{}\n")
|
||||
recovery.write_text(f"sid-old\t{crashed_jsonl}\n", encoding="utf-8")
|
||||
|
||||
fresh_jsonl = tmp_path / "fresh.jsonl"
|
||||
fresh_jsonl.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-new", str(fresh_jsonl))
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
|
||||
assert len(sessions_calls) == 2
|
||||
assert not recovery.exists()
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_push_skips_stale_queue_entry(tmp_path, monkeypatch):
|
||||
"""Queue entry pointing to a deleted file: skipped, not retried forever."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
def _raise(*a, **kw):
|
||||
raise AssertionError("api_post should not be called for missing file")
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _raise)
|
||||
|
||||
append_to_queue(tmp_path, "sid-1", str(tmp_path / "ghost.jsonl"))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
assert not queue_path(tmp_path).exists()
|
||||
assert not uploaded_log_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_push_requeues_failed_uploads(tmp_path, monkeypatch):
|
||||
"""Server returns 500 → path stays in queue for next push."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
def _fail(*a, **kw):
|
||||
return _FakeResp(500)
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _fail)
|
||||
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == f"sid-1\t{transcript}\n"
|
||||
assert not uploaded_log_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_push_uploads_local_md(tmp_path, monkeypatch):
|
||||
"""CLAUDE.local.md uploaded when present."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
claude = tmp_path / ".claude"
|
||||
claude.mkdir()
|
||||
(claude / "CLAUDE.local.md").write_text("notes")
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
md_calls = [c for c in calls if c[0] == "/api/upload/local-md"]
|
||||
assert len(md_calls) == 1
|
||||
|
||||
|
||||
def test_push_json_output(tmp_path, monkeypatch):
|
||||
"""--json emits a single JSON object with results."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
_record_uploads(monkeypatch)
|
||||
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--json"])
|
||||
assert result.exit_code == 0
|
||||
data = json.loads(result.output.strip())
|
||||
assert data["sessions"] == 1
|
||||
assert data["errors"] == []
|
||||
assert data["private_skipped"] == 0
|
||||
|
||||
|
||||
# ---------- Private filter tests --------------------------------------------
|
||||
|
||||
|
||||
def test_push_skips_private_session_and_audit_logs(tmp_path, monkeypatch):
|
||||
"""Queue contains a private session_id → no upload, audit log appended."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
transcript = tmp_path / "secret.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
add_private(tmp_path, "sid-private")
|
||||
append_to_queue(tmp_path, "sid-private", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
|
||||
assert sessions_calls == [], "private session must NOT be uploaded"
|
||||
|
||||
# Audit log entry written
|
||||
audit = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert "sid-private" in audit
|
||||
assert str(transcript) in audit
|
||||
|
||||
# Queue consumed (snapshot processed and discarded — private entry not requeued)
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_push_mixes_private_and_public_correctly(tmp_path, monkeypatch):
|
||||
"""A push run with one private + one public session uploads only the public one."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
secret = tmp_path / "secret.jsonl"
|
||||
secret.write_text("{}\n")
|
||||
public = tmp_path / "public.jsonl"
|
||||
public.write_text("{}\n")
|
||||
|
||||
add_private(tmp_path, "sid-secret")
|
||||
append_to_queue(tmp_path, "sid-secret", str(secret))
|
||||
append_to_queue(tmp_path, "sid-public", str(public))
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
|
||||
assert len(sessions_calls) == 1
|
||||
|
||||
audit = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert "sid-secret" in audit
|
||||
assert "sid-public" not in audit
|
||||
|
||||
|
||||
def test_push_dry_run_shows_private_skip(tmp_path, monkeypatch):
|
||||
"""--dry-run preview reports private-skipped count separately."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
def _raise(*a, **kw):
|
||||
raise AssertionError("api_post was called during --dry-run")
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _raise)
|
||||
|
||||
transcript = tmp_path / "secret.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
add_private(tmp_path, "sid-priv")
|
||||
append_to_queue(tmp_path, "sid-priv", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--dry-run"])
|
||||
assert result.exit_code == 0
|
||||
assert "1 private session" in result.output
|
||||
assert "sid-priv" in result.output
|
||||
|
||||
|
||||
# ---------- 4xx permanent-failure handling -----------------------------------
|
||||
|
||||
|
||||
def _stub_api_post_status(monkeypatch, status: int) -> None:
|
||||
"""Patch api_post to always return the given status code."""
|
||||
def _fixed(*a, **kw):
|
||||
return _FakeResp(status)
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _fixed)
|
||||
|
||||
|
||||
def test_push_drops_4xx_to_audit_log_not_requeue(tmp_path, monkeypatch):
|
||||
"""4xx (here: 401 token expired) → drop + audit, no requeue.
|
||||
Closes the prior infinite-loop bug where every non-200 except
|
||||
`file not found on disk` was requeued forever."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
_stub_api_post_status(monkeypatch, 401)
|
||||
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
# Entry must NOT be in the live queue any more.
|
||||
assert not queue_path(tmp_path).exists() or \
|
||||
queue_path(tmp_path).read_text(encoding="utf-8") == ""
|
||||
# Audit log must record the drop with status + session_id + path.
|
||||
log = failed_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert "\t401\t" in log
|
||||
assert "sid-1" in log
|
||||
assert str(transcript) in log
|
||||
|
||||
|
||||
def test_push_drops_each_4xx_status(tmp_path, monkeypatch):
|
||||
"""403, 413, 400 → all drop (not just 401)."""
|
||||
for status in (400, 403, 413):
|
||||
ws = tmp_path / f"ws-{status}"
|
||||
ws.mkdir()
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(ws))
|
||||
_stub_config(monkeypatch)
|
||||
_stub_api_post_status(monkeypatch, status)
|
||||
transcript = ws / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(ws, f"sid-{status}", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0, (status, result.output)
|
||||
log = failed_log_path(ws).read_text(encoding="utf-8")
|
||||
assert f"\t{status}\t" in log, (status, log)
|
||||
|
||||
|
||||
def test_push_requeues_408_and_429(tmp_path, monkeypatch):
|
||||
"""408 Request Timeout + 429 Too Many Requests are transient per
|
||||
HTTP spec — server is asking us to retry, not telling us the
|
||||
request is invalid. Must requeue, not drop."""
|
||||
for status in (408, 429):
|
||||
ws = tmp_path / f"ws-{status}"
|
||||
ws.mkdir()
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(ws))
|
||||
_stub_config(monkeypatch)
|
||||
_stub_api_post_status(monkeypatch, status)
|
||||
transcript = ws / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(ws, f"sid-{status}", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
# Requeued → entry back in live queue.
|
||||
live = queue_path(ws).read_text(encoding="utf-8")
|
||||
assert f"sid-{status}\t{transcript}\n" == live
|
||||
# NOT in the failed audit log.
|
||||
assert not failed_log_path(ws).exists()
|
||||
|
||||
|
||||
def test_push_requeues_5xx(tmp_path, monkeypatch):
|
||||
"""5xx is genuine server-side failure: request was valid but server
|
||||
couldn't honor it right now. Requeue for the next push."""
|
||||
for status in (500, 502, 503):
|
||||
ws = tmp_path / f"ws-{status}"
|
||||
ws.mkdir()
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(ws))
|
||||
_stub_config(monkeypatch)
|
||||
_stub_api_post_status(monkeypatch, status)
|
||||
transcript = ws / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(ws, f"sid-{status}", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
live = queue_path(ws).read_text(encoding="utf-8")
|
||||
assert f"sid-{status}\t{transcript}\n" == live
|
||||
assert not failed_log_path(ws).exists()
|
||||
|
||||
|
||||
def test_push_requeues_network_exception(tmp_path, monkeypatch):
|
||||
"""Connection error / DNS / timeout — no status code from server.
|
||||
Treat as transient: requeue rather than drop."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
def _raise(*a, **kw):
|
||||
raise ConnectionError("server unreachable")
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _raise)
|
||||
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-net", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
live = queue_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert f"sid-net\t{transcript}\n" == live
|
||||
assert not failed_log_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_push_4xx_drop_count_in_json_output(tmp_path, monkeypatch):
|
||||
"""--json surfaces the new `dropped_permanent` counter so operators
|
||||
can pipe it into monitoring / scripts."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
_stub_api_post_status(monkeypatch, 401)
|
||||
|
||||
transcript = tmp_path / "x.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-1", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, ["--json"])
|
||||
assert result.exit_code == 0
|
||||
payload = json.loads(result.output)
|
||||
assert payload["dropped_permanent"] == 1
|
||||
assert payload["sessions"] == 0
|
||||
|
||||
|
||||
def test_push_4xx_drop_visible_in_quiet_stdout(tmp_path, monkeypatch):
|
||||
"""Non-quiet stdout mentions the audit-log path so operators tailing
|
||||
`agnes push` output get a pointer to the forensic trail."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
_stub_api_post_status(monkeypatch, 413)
|
||||
|
||||
transcript = tmp_path / "huge.jsonl"
|
||||
transcript.write_text("{}\n")
|
||||
append_to_queue(tmp_path, "sid-big", str(transcript))
|
||||
|
||||
result = runner.invoke(push_app, [])
|
||||
assert result.exit_code == 0
|
||||
assert "agnes-sessions-failed.txt" in result.output
|
||||
assert "permanent failure" in result.output
|
||||
|
||||
|
||||
# ---------- David #8: legacy-scan honors the private list -------------------
|
||||
#
|
||||
# `--legacy-scan` walks ~/.claude/projects/<encoded-cwd>/*.jsonl. Claude Code
|
||||
# names jsonls `<session-id>.jsonl`, so the file stem IS the session id —
|
||||
# the same private filter that protects queue uploads must apply. Without
|
||||
# this, an operator running `agnes push --legacy-scan` to backfill old
|
||||
# sessions would silently upload everything on disk.
|
||||
|
||||
|
||||
def test_push_legacy_scan_skips_private_session(tmp_path, monkeypatch):
|
||||
"""Legacy-scan picks up `<sid>.jsonl` from the projects dir; if the
|
||||
sid is on the private list, it must be skipped + audit-logged."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
projects_dir = tmp_path / "projects-fake"
|
||||
projects_dir.mkdir()
|
||||
pub = projects_dir / "sid-public.jsonl"
|
||||
pub.write_text("{}\n")
|
||||
priv = projects_dir / "sid-private.jsonl"
|
||||
priv.write_text("{}\n")
|
||||
|
||||
monkeypatch.setattr(
|
||||
"cli.lib.claude_sessions.list_session_files",
|
||||
lambda _w: [pub, priv],
|
||||
)
|
||||
add_private(tmp_path, "sid-private")
|
||||
|
||||
result = runner.invoke(push_app, ["--legacy-scan", "--quiet"])
|
||||
assert result.exit_code == 0
|
||||
|
||||
sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
|
||||
assert len(sessions_calls) == 1
|
||||
uploaded_path = sessions_calls[0][1]["files"]["file"][0]
|
||||
assert uploaded_path == "sid-public.jsonl"
|
||||
|
||||
audit = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert "sid-private" in audit
|
||||
assert str(priv) in audit
|
||||
|
||||
|
||||
def test_push_legacy_scan_dry_run_segregates_private(tmp_path, monkeypatch):
|
||||
"""Dry-run JSON shape: legacy-scan candidates surface in
|
||||
would_upload OR would_skip_private depending on private membership."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
_stub_config(monkeypatch)
|
||||
|
||||
projects_dir = tmp_path / "projects-fake"
|
||||
projects_dir.mkdir()
|
||||
public_jsonl = projects_dir / "sid-pub.jsonl"
|
||||
public_jsonl.write_text("{}\n")
|
||||
private_jsonl = projects_dir / "sid-priv.jsonl"
|
||||
private_jsonl.write_text("{}\n")
|
||||
|
||||
monkeypatch.setattr(
|
||||
"cli.lib.claude_sessions.list_session_files",
|
||||
lambda _w: [public_jsonl, private_jsonl],
|
||||
)
|
||||
add_private(tmp_path, "sid-priv")
|
||||
|
||||
result = runner.invoke(push_app, ["--legacy-scan", "--dry-run", "--json"])
|
||||
assert result.exit_code == 0
|
||||
payload = json.loads(result.output)
|
||||
assert str(public_jsonl) in payload["would_upload"]["sessions"]
|
||||
assert str(private_jsonl) not in payload["would_upload"]["sessions"]
|
||||
skipped_paths = [e["path"] for e in payload["would_skip_private"]]
|
||||
assert str(private_jsonl) in skipped_paths
|
||||
|
|
|
|||
208
tests/test_e2e_privacy.py
Normal file
208
tests/test_e2e_privacy.py
Normal file
|
|
@ -0,0 +1,208 @@
|
|||
"""End-to-end test pinning the user-visible privacy promise:
|
||||
**a session marked /agnes-private never reaches the server.**
|
||||
|
||||
Each unit test (capture_session, mark_private, push, private_list) covers
|
||||
its own component edge cases. This file wires them together against a
|
||||
recording fake `api_post` to verify the cross-component contract — closes
|
||||
ZdenekSrotyr S2.9 in PR #242.
|
||||
|
||||
The test exercises both race orderings:
|
||||
- mark BEFORE capture (capture-session sees the private flag and skips
|
||||
the queue write — entry never enters the upload pipeline)
|
||||
- capture BEFORE mark (entry sits in the queue; push's per-entry
|
||||
re-check against the private list filters it out and audit-logs)
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
|
||||
from typer.testing import CliRunner
|
||||
|
||||
from cli.commands.capture_session import capture_session_app
|
||||
from cli.commands.mark_private import mark_private_app
|
||||
from cli.commands.push import push_app
|
||||
from cli.lib.session_queue import (
|
||||
failed_log_path,
|
||||
private_skipped_log_path,
|
||||
queue_path,
|
||||
uploaded_log_path,
|
||||
)
|
||||
|
||||
|
||||
runner = CliRunner()
|
||||
|
||||
|
||||
class _FakeResp:
|
||||
def __init__(self, status_code: int = 200) -> None:
|
||||
self.status_code = status_code
|
||||
|
||||
|
||||
def _record_uploads(monkeypatch) -> list[tuple[str, dict]]:
|
||||
"""Patch api_post to record every call. Sessions endpoint succeeds with
|
||||
200; local-md endpoint also succeeds. Returns the recorder list."""
|
||||
calls: list[tuple[str, dict]] = []
|
||||
|
||||
def _fake(endpoint, **kwargs):
|
||||
calls.append((endpoint, kwargs))
|
||||
return _FakeResp(200)
|
||||
|
||||
monkeypatch.setattr("cli.commands.push.api_post", _fake)
|
||||
return calls
|
||||
|
||||
|
||||
def _stub_push_config(monkeypatch) -> None:
|
||||
monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
|
||||
monkeypatch.setattr("cli.commands.push.get_token", lambda: "test-pat")
|
||||
|
||||
|
||||
def _capture(workspace, session_id, transcript_path, monkeypatch) -> None:
|
||||
"""Simulate a SessionStart hook firing `agnes capture-session` with the
|
||||
real CC stdin payload shape."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(workspace))
|
||||
payload = json.dumps({
|
||||
"session_id": session_id,
|
||||
"transcript_path": str(transcript_path),
|
||||
"cwd": str(workspace),
|
||||
"hook_event_name": "SessionStart",
|
||||
})
|
||||
result = runner.invoke(capture_session_app, [], input=payload)
|
||||
assert result.exit_code == 0, result.output
|
||||
|
||||
|
||||
def _mark_private(workspace, session_id, monkeypatch) -> None:
|
||||
"""Simulate the user typing `/agnes-private` inside Claude Code —
|
||||
`!`-prefix bash spawns `agnes mark-private` with CLAUDE_CODE_SESSION_ID
|
||||
set to the active session."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(workspace))
|
||||
monkeypatch.setenv("CLAUDE_CODE_SESSION_ID", session_id)
|
||||
result = runner.invoke(mark_private_app, [])
|
||||
assert result.exit_code == 0, result.output
|
||||
|
||||
|
||||
def _uploaded_session_names(calls: list[tuple[str, dict]]) -> list[str]:
|
||||
"""Extract the `name` field from every recorded /api/upload/sessions
|
||||
multipart upload. Returns the list of jsonl filenames that reached
|
||||
the (fake) server."""
|
||||
out: list[str] = []
|
||||
for endpoint, kwargs in calls:
|
||||
if endpoint != "/api/upload/sessions":
|
||||
continue
|
||||
files = kwargs.get("files", {})
|
||||
file_field = files.get("file")
|
||||
if file_field is None:
|
||||
continue
|
||||
# `files={"file": (name, fh)}` — `_upload_one` in push.py
|
||||
name = file_field[0] if isinstance(file_field, tuple) else None
|
||||
if name is not None:
|
||||
out.append(name)
|
||||
return out
|
||||
|
||||
|
||||
def test_e2e_mark_before_capture_session_never_uploads(tmp_path, monkeypatch):
|
||||
"""User marks /agnes-private *before* the SessionStart hook runs
|
||||
capture-session. The capture-session call sees the session_id is on
|
||||
the private list and skips the queue write entirely — the upload
|
||||
pipeline never even sees it."""
|
||||
_stub_push_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
private_id = "private-session"
|
||||
private_transcript = tmp_path / "private.jsonl"
|
||||
private_transcript.write_text("{}\n")
|
||||
|
||||
# Step 1: user runs /agnes-private (e.g. typed it before any
|
||||
# SessionStart hook reached capture-session).
|
||||
_mark_private(tmp_path, private_id, monkeypatch)
|
||||
|
||||
# Step 2: SessionStart hook fires capture-session. It must see the
|
||||
# private flag and skip the queue write.
|
||||
_capture(tmp_path, private_id, private_transcript, monkeypatch)
|
||||
|
||||
# Queue must be empty / absent — capture-session bailed before write.
|
||||
assert not queue_path(tmp_path).exists() or \
|
||||
queue_path(tmp_path).read_text(encoding="utf-8") == ""
|
||||
|
||||
# Step 3: push runs (e.g. SessionEnd fires it).
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0, result.output
|
||||
|
||||
# Assert: nothing reached the (fake) server.
|
||||
assert _uploaded_session_names(calls) == [], (
|
||||
f"Private session must not upload; got: {_uploaded_session_names(calls)}"
|
||||
)
|
||||
# And no uploaded-log entry was written.
|
||||
assert not uploaded_log_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_e2e_capture_before_mark_filters_at_push(tmp_path, monkeypatch):
|
||||
"""User marks /agnes-private *after* capture-session has already
|
||||
queued the transcript (typical scenario — user types the slash
|
||||
command mid-session). The entry sits in the queue; push's per-entry
|
||||
re-check against the private list filters it out at the last moment
|
||||
and audit-logs the skip."""
|
||||
_stub_push_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
private_id = "private-session"
|
||||
private_transcript = tmp_path / "private.jsonl"
|
||||
private_transcript.write_text("{}\n")
|
||||
public_id = "public-session"
|
||||
public_transcript = tmp_path / "public.jsonl"
|
||||
public_transcript.write_text("{}\n")
|
||||
|
||||
# Step 1: SessionStart hooks fire capture-session for BOTH sessions.
|
||||
# Neither is private yet, so both land in the queue.
|
||||
_capture(tmp_path, private_id, private_transcript, monkeypatch)
|
||||
_capture(tmp_path, public_id, public_transcript, monkeypatch)
|
||||
# Sanity: queue has both entries.
|
||||
queue_lines = queue_path(tmp_path).read_text(encoding="utf-8").splitlines()
|
||||
assert len(queue_lines) == 2
|
||||
|
||||
# Step 2: user marks one of them /agnes-private mid-session.
|
||||
_mark_private(tmp_path, private_id, monkeypatch)
|
||||
|
||||
# Step 3: push runs.
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0, result.output
|
||||
|
||||
# Assert: ONLY the public session reached the (fake) server.
|
||||
uploaded = _uploaded_session_names(calls)
|
||||
assert uploaded == ["public.jsonl"], (
|
||||
f"Only the non-private session should upload; got: {uploaded}"
|
||||
)
|
||||
|
||||
# Audit log records the skipped private entry — the forensic trail.
|
||||
skipped_log = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert private_id in skipped_log
|
||||
assert "private.jsonl" in skipped_log
|
||||
|
||||
# Failed log MUST be empty — private skip is not a failure, just
|
||||
# an intentional exclusion.
|
||||
assert not failed_log_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_e2e_unmarked_sessions_upload_normally(tmp_path, monkeypatch):
|
||||
"""Control case: with no /agnes-private call, both sessions upload.
|
||||
Proves the e2e tests above are exercising the privacy filter, not
|
||||
some unrelated reason the upload didn't fire."""
|
||||
_stub_push_config(monkeypatch)
|
||||
calls = _record_uploads(monkeypatch)
|
||||
|
||||
t1 = tmp_path / "a.jsonl"
|
||||
t1.write_text("{}\n")
|
||||
t2 = tmp_path / "b.jsonl"
|
||||
t2.write_text("{}\n")
|
||||
|
||||
_capture(tmp_path, "sid-a", t1, monkeypatch)
|
||||
_capture(tmp_path, "sid-b", t2, monkeypatch)
|
||||
|
||||
result = runner.invoke(push_app, ["--quiet"])
|
||||
assert result.exit_code == 0, result.output
|
||||
|
||||
uploaded = sorted(_uploaded_session_names(calls))
|
||||
assert uploaded == ["a.jsonl", "b.jsonl"], (
|
||||
f"Both unmarked sessions should upload; got: {uploaded}"
|
||||
)
|
||||
# No private-skipped audit entries either — nothing was marked private.
|
||||
assert not private_skipped_log_path(tmp_path).exists()
|
||||
|
|
@ -78,6 +78,20 @@ def test_install_idempotent(tmp_path):
|
|||
install_claude_commands(tmp_path)
|
||||
second = _read_managed_command(tmp_path)
|
||||
assert first == second
|
||||
# And no extra files appeared.
|
||||
# Exactly the managed files, no strays.
|
||||
cmd_dir_files = sorted(p.name for p in (tmp_path / ".claude" / "commands").iterdir())
|
||||
assert cmd_dir_files == ["update-agnes-plugins.md"]
|
||||
assert cmd_dir_files == ["agnes-private.md", "update-agnes-plugins.md"]
|
||||
|
||||
|
||||
def test_install_writes_agnes_private_slash_command(tmp_path):
|
||||
"""The /agnes-private slash command is shipped alongside update-agnes-plugins
|
||||
and triggers `agnes mark-private` deterministically (no AI in the loop).
|
||||
Pin the marker so a template rewrite that drops the `!`-prefix line is caught."""
|
||||
install_claude_commands(tmp_path)
|
||||
body = (tmp_path / ".claude" / "commands" / "agnes-private.md").read_text(
|
||||
encoding="utf-8"
|
||||
)
|
||||
assert body.startswith("---"), body[:120]
|
||||
assert "description:" in body.split("---", 2)[1]
|
||||
# `!`-prefix runs as bash directly — deterministic, no AI tokens.
|
||||
assert "agnes mark-private" in body
|
||||
|
|
|
|||
|
|
@ -4,7 +4,11 @@ import json
|
|||
from pathlib import Path
|
||||
|
||||
|
||||
from cli.lib.hooks import install_claude_hooks
|
||||
from cli.lib.hooks import (
|
||||
install_claude_hooks,
|
||||
maybe_refresh_claude_hooks,
|
||||
workspace_has_agnes_hooks,
|
||||
)
|
||||
|
||||
|
||||
def _read_settings(workspace: Path) -> dict:
|
||||
|
|
@ -27,14 +31,26 @@ def test_install_creates_settings_file(tmp_path):
|
|||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
starts = _commands_for(cfg, "SessionStart")
|
||||
# SessionStart has three entries: (1) chained self-upgrade ; pull —
|
||||
# self-upgrade runs first so a wire-protocol bump lands before pull
|
||||
# tries to use the new CLI; (2) refresh-marketplace as a separate
|
||||
# entry so a failure (e.g. fresh workspace with no clone) doesn't
|
||||
# suppress the data pull above; (3) push as a self-heal for orphan
|
||||
# session JSONLs from `claude -p` headless mode (where Claude Code
|
||||
# does NOT fire SessionEnd) or abnormal exits.
|
||||
# SessionStart has three entries: (1) capture-session as the very first
|
||||
# so the hook stdin (transcript_path) is appended to the queue before
|
||||
# any other hook runs; (2) chained self-upgrade ; pull — self-upgrade
|
||||
# runs first so a wire-protocol bump lands before pull tries to use
|
||||
# the new CLI; (3) refresh-marketplace as a separate entry so a
|
||||
# failure (e.g. fresh workspace with no clone) doesn't suppress the
|
||||
# data pull above.
|
||||
#
|
||||
# `agnes push` is NOT in SessionStart — the queue mechanism handles
|
||||
# orphans on the next SessionEnd, so the old self-heal entry was
|
||||
# redundant + would re-upload the just-starting (empty) session.
|
||||
assert len(starts) == 3
|
||||
capture = next((c for c in starts if "agnes capture-session" in c), None)
|
||||
assert capture is not None, "Expected SessionStart capture-session entry"
|
||||
assert capture.startswith("bash -c "), (
|
||||
f"capture-session hook must be wrapped in bash -c for Windows; got: {capture!r}"
|
||||
)
|
||||
assert not any("agnes push" in c for c in starts), (
|
||||
f"agnes push must NOT be in SessionStart; got: {starts!r}"
|
||||
)
|
||||
chain = next(
|
||||
(c for c in starts if "agnes self-upgrade" in c and "agnes pull" in c),
|
||||
None,
|
||||
|
|
@ -62,25 +78,36 @@ def test_install_creates_settings_file(tmp_path):
|
|||
assert "--quiet" not in refresh, (
|
||||
f"refresh-marketplace hook must NOT use --quiet (removed flag); got: {refresh!r}"
|
||||
)
|
||||
# The push self-heal entry is also bash-c-wrapped for Windows parity.
|
||||
push_start = next((c for c in starts if "agnes push" in c), None)
|
||||
assert push_start is not None, (
|
||||
"Expected SessionStart self-heal `agnes push` entry for orphan JSONLs"
|
||||
)
|
||||
assert push_start.startswith("bash -c "), (
|
||||
f"push self-heal hook must be wrapped in bash -c for Windows; got: {push_start!r}"
|
||||
)
|
||||
ends = _commands_for(cfg, "SessionEnd")
|
||||
assert len(ends) == 1
|
||||
assert "agnes push --quiet" in ends[0]
|
||||
|
||||
|
||||
def test_all_installed_hooks_are_bash_wrapped(tmp_path):
|
||||
"""Pin the invariant: every Agnes-managed SessionStart / SessionEnd
|
||||
entry must be wrapped in `bash -c "..."`. Claude Code on Windows
|
||||
runs hook commands directly (no shell), so any unwrapped entry that
|
||||
relies on shell syntax (`;` chains, `2>/dev/null` redirection,
|
||||
`|| true` short-circuit) fails silently on Windows. The previous
|
||||
self-upgrade+pull chain shipped unwrapped — this test catches a
|
||||
regression to that state."""
|
||||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
starts = _commands_for(cfg, "SessionStart")
|
||||
ends = _commands_for(cfg, "SessionEnd")
|
||||
for cmd in starts + ends:
|
||||
assert cmd.startswith("bash -c "), (
|
||||
f"Hook must be wrapped in bash -c for Windows; got: {cmd!r}"
|
||||
)
|
||||
|
||||
|
||||
def test_install_idempotent(tmp_path):
|
||||
install_claude_hooks(tmp_path)
|
||||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
# Three SessionStart entries (pull + refresh-marketplace + push self-heal),
|
||||
# one SessionEnd entry (push). Re-install must NOT duplicate them.
|
||||
# Three SessionStart entries (capture-session + chained self-upgrade/pull
|
||||
# + refresh-marketplace), one SessionEnd entry (push). Re-install must
|
||||
# NOT duplicate them.
|
||||
assert len(cfg["hooks"]["SessionStart"]) == 3
|
||||
assert len(cfg["hooks"]["SessionEnd"]) == 1
|
||||
|
||||
|
|
@ -100,9 +127,11 @@ def test_install_replaces_old_da_sync_entries(tmp_path):
|
|||
cfg = _read_settings(tmp_path)
|
||||
starts = _commands_for(cfg, "SessionStart")
|
||||
assert len(starts) == 3
|
||||
assert any("agnes capture-session" in c for c in starts)
|
||||
assert any("agnes pull" in c for c in starts)
|
||||
assert any("agnes refresh-marketplace" in c for c in starts)
|
||||
assert any("agnes push" in c for c in starts)
|
||||
# `agnes push` lives only in SessionEnd now.
|
||||
assert not any("agnes push" in c for c in starts)
|
||||
# Legacy command must be gone from BOTH starts.
|
||||
assert not any("da sync" in c for c in starts)
|
||||
|
||||
|
|
@ -111,7 +140,7 @@ def test_install_replaces_prior_single_pull_entry(tmp_path):
|
|||
"""Workspaces bootstrapped by a CLI version that only installed a
|
||||
single SessionStart entry (`agnes pull`, no refresh-marketplace) must
|
||||
upgrade to the three-entry layout on the next install — not end up
|
||||
with four entries (one old + three new)."""
|
||||
stacking the new entries on top of the old one."""
|
||||
settings_path = tmp_path / ".claude" / "settings.json"
|
||||
settings_path.parent.mkdir(parents=True)
|
||||
settings_path.write_text(json.dumps({
|
||||
|
|
@ -125,9 +154,10 @@ def test_install_replaces_prior_single_pull_entry(tmp_path):
|
|||
cfg = _read_settings(tmp_path)
|
||||
starts = _commands_for(cfg, "SessionStart")
|
||||
assert len(starts) == 3
|
||||
assert any("agnes capture-session" in c for c in starts)
|
||||
assert any("agnes pull" in c for c in starts)
|
||||
assert any("agnes refresh-marketplace" in c for c in starts)
|
||||
assert any("agnes push" in c for c in starts)
|
||||
assert not any("agnes push" in c for c in starts)
|
||||
|
||||
|
||||
def test_install_replaces_v0_43_chained_self_upgrade_pull_entry(tmp_path):
|
||||
|
|
@ -163,8 +193,9 @@ def test_install_replaces_v0_43_chained_self_upgrade_pull_entry(tmp_path):
|
|||
None,
|
||||
)
|
||||
assert chain is not None
|
||||
assert any("agnes capture-session" in c for c in starts)
|
||||
assert any("agnes refresh-marketplace" in c for c in starts)
|
||||
assert any("agnes push" in c for c in starts)
|
||||
assert not any("agnes push" in c for c in starts)
|
||||
# SessionEnd untouched (single push entry).
|
||||
ends = _commands_for(cfg, "SessionEnd")
|
||||
assert len(ends) == 1
|
||||
|
|
@ -233,9 +264,10 @@ def test_install_preserves_third_party_hooks(tmp_path):
|
|||
# Third-party entry stays + all three agnes entries get added.
|
||||
assert len(starts) == 4
|
||||
assert any("echo hi from another tool" in c for c in starts)
|
||||
assert any("agnes capture-session" in c for c in starts)
|
||||
assert any("agnes pull" in c for c in starts)
|
||||
assert any("agnes refresh-marketplace" in c for c in starts)
|
||||
assert any("agnes push" in c for c in starts)
|
||||
assert not any("agnes push" in c for c in starts)
|
||||
# Other event types untouched.
|
||||
assert cfg["hooks"]["PreToolUse"][0]["hooks"][0]["command"] == "echo pre"
|
||||
|
||||
|
|
@ -279,9 +311,8 @@ def test_install_idempotent_chained_entry(tmp_path):
|
|||
install_claude_hooks(tmp_path)
|
||||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
# Three SessionStart entries (chained self-upgrade+pull, refresh-
|
||||
# marketplace, and the push self-heal) — re-install must not
|
||||
# duplicate any of them.
|
||||
# Three SessionStart entries (capture-session, chained self-upgrade+pull,
|
||||
# refresh-marketplace) — re-install must not duplicate any of them.
|
||||
assert len(cfg["hooks"]["SessionStart"]) == 3
|
||||
assert len(cfg["hooks"]["SessionEnd"]) == 1
|
||||
|
||||
|
|
@ -328,6 +359,80 @@ def test_session_end_push_is_detached(tmp_path):
|
|||
)
|
||||
|
||||
|
||||
def test_install_writes_statusline_when_absent(tmp_path):
|
||||
"""Greenfield install: no prior statusLine → we write ours."""
|
||||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
assert "statusLine" in cfg
|
||||
assert cfg["statusLine"]["type"] == "command"
|
||||
assert "agnes statusline" in cfg["statusLine"]["command"]
|
||||
|
||||
|
||||
def test_install_preserves_existing_user_statusline(tmp_path, capsys):
|
||||
"""User has their own statusLine — we leave it alone and warn on stderr.
|
||||
Customizing the status bar is a personal preference; agnes shouldn't
|
||||
clobber it. Operators who want the private indicator alongside their
|
||||
own content can compose `agnes statusline` into their command."""
|
||||
settings_path = tmp_path / ".claude" / "settings.json"
|
||||
settings_path.parent.mkdir(parents=True)
|
||||
user_statusline = {"type": "command", "command": "my-custom-status"}
|
||||
settings_path.write_text(json.dumps({"statusLine": user_statusline}))
|
||||
|
||||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
# User's statusLine intact.
|
||||
assert cfg["statusLine"] == user_statusline
|
||||
# Warning surfaced.
|
||||
captured = capsys.readouterr()
|
||||
assert "statusLine" in captured.err
|
||||
|
||||
|
||||
def test_install_idempotent_when_statusline_already_ours(tmp_path):
|
||||
"""Re-running install when our statusLine is already in place is a no-op,
|
||||
NOT a warning (idempotent re-init shouldn't spam the user)."""
|
||||
install_claude_hooks(tmp_path)
|
||||
install_claude_hooks(tmp_path)
|
||||
cfg = _read_settings(tmp_path)
|
||||
assert "agnes statusline" in cfg["statusLine"]["command"]
|
||||
|
||||
|
||||
def test_install_treats_explicit_null_statusline_as_unconfigured(tmp_path, capsys):
|
||||
"""`"statusLine": null` (legal JSON, unusual) is treated as
|
||||
"no statusLine configured" rather than "user explicitly opted out".
|
||||
The previous truthy check (`if existing:`) silently took this path
|
||||
without distinguishing it from absent-key; the new check makes
|
||||
the behavior explicit and tested. No warning — the user didn't
|
||||
configure anything actionable."""
|
||||
settings_path = tmp_path / ".claude" / "settings.json"
|
||||
settings_path.parent.mkdir(parents=True)
|
||||
settings_path.write_text(json.dumps({"statusLine": None}))
|
||||
|
||||
install_claude_hooks(tmp_path)
|
||||
|
||||
cfg = _read_settings(tmp_path)
|
||||
assert isinstance(cfg["statusLine"], dict)
|
||||
assert "agnes statusline" in cfg["statusLine"]["command"]
|
||||
captured = capsys.readouterr()
|
||||
assert "preserved" not in captured.err # no spurious warning
|
||||
|
||||
|
||||
def test_install_treats_empty_statusline_as_unconfigured(tmp_path, capsys):
|
||||
"""Empty string is the falsy sibling of None — same treatment.
|
||||
Documents the boundary so future changes can't accidentally treat
|
||||
`""` as a non-empty truthy command."""
|
||||
settings_path = tmp_path / ".claude" / "settings.json"
|
||||
settings_path.parent.mkdir(parents=True)
|
||||
settings_path.write_text(json.dumps({"statusLine": ""}))
|
||||
|
||||
install_claude_hooks(tmp_path)
|
||||
|
||||
cfg = _read_settings(tmp_path)
|
||||
assert isinstance(cfg["statusLine"], dict)
|
||||
assert "agnes statusline" in cfg["statusLine"]["command"]
|
||||
captured = capsys.readouterr()
|
||||
assert "preserved" not in captured.err
|
||||
|
||||
|
||||
def test_install_replaces_old_synchronous_session_end_push(tmp_path):
|
||||
"""A workspace bootstrapped before the detachment fix has the old
|
||||
synchronous `agnes push --quiet 2>/dev/null || true` SessionEnd entry.
|
||||
|
|
@ -350,3 +455,135 @@ def test_install_replaces_old_synchronous_session_end_push(tmp_path):
|
|||
assert "nohup" in ends[0], (
|
||||
f"Old synchronous push entry must have been replaced with the detached form; got: {ends!r}"
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# workspace_has_agnes_hooks / maybe_refresh_claude_hooks
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_workspace_has_agnes_hooks_false_for_missing_settings(tmp_path):
|
||||
"""Fresh dir without `.claude/settings.json` is not an Agnes workspace."""
|
||||
assert workspace_has_agnes_hooks(tmp_path) is False
|
||||
|
||||
|
||||
def test_workspace_has_agnes_hooks_false_for_empty_settings(tmp_path):
|
||||
"""`.claude/settings.json` exists but is empty `{}` — not an Agnes workspace."""
|
||||
sp = tmp_path / ".claude" / "settings.json"
|
||||
sp.parent.mkdir(parents=True)
|
||||
sp.write_text(json.dumps({}), encoding="utf-8")
|
||||
assert workspace_has_agnes_hooks(tmp_path) is False
|
||||
|
||||
|
||||
def test_workspace_has_agnes_hooks_false_for_invalid_json(tmp_path):
|
||||
sp = tmp_path / ".claude" / "settings.json"
|
||||
sp.parent.mkdir(parents=True)
|
||||
sp.write_text("not json", encoding="utf-8")
|
||||
assert workspace_has_agnes_hooks(tmp_path) is False
|
||||
|
||||
|
||||
def test_workspace_has_agnes_hooks_false_for_third_party_only_hook(tmp_path):
|
||||
"""A settings.json with only third-party hooks (no agnes marker) is
|
||||
NOT an Agnes workspace — important so `agnes self-upgrade` from such
|
||||
a dir doesn't auto-install Agnes hooks behind the user's back."""
|
||||
sp = tmp_path / ".claude" / "settings.json"
|
||||
sp.parent.mkdir(parents=True)
|
||||
sp.write_text(json.dumps({
|
||||
"hooks": {
|
||||
"SessionStart": [
|
||||
{"hooks": [{"type": "command", "command": "echo hello"}]},
|
||||
],
|
||||
}
|
||||
}), encoding="utf-8")
|
||||
assert workspace_has_agnes_hooks(tmp_path) is False
|
||||
|
||||
|
||||
def test_workspace_has_agnes_hooks_true_for_agnes_hook(tmp_path):
|
||||
install_claude_hooks(tmp_path)
|
||||
assert workspace_has_agnes_hooks(tmp_path) is True
|
||||
|
||||
|
||||
def test_workspace_has_agnes_hooks_true_for_just_statusline(tmp_path):
|
||||
"""statusLine alone (no hook entries) still signals an Agnes workspace."""
|
||||
sp = tmp_path / ".claude" / "settings.json"
|
||||
sp.parent.mkdir(parents=True)
|
||||
sp.write_text(json.dumps({
|
||||
"statusLine": {"type": "command", "command": "agnes statusline"},
|
||||
}), encoding="utf-8")
|
||||
assert workspace_has_agnes_hooks(tmp_path) is True
|
||||
|
||||
|
||||
def test_maybe_refresh_noop_in_non_agnes_directory(tmp_path):
|
||||
"""Critical safety: `agnes self-upgrade` invoked from a non-Agnes dir
|
||||
(e.g. ~/) must NOT create `.claude/settings.json` there."""
|
||||
refreshed = maybe_refresh_claude_hooks(tmp_path)
|
||||
assert refreshed is False
|
||||
assert not (tmp_path / ".claude").exists()
|
||||
|
||||
|
||||
def test_maybe_refresh_writes_new_layout_into_v048_workspace(tmp_path):
|
||||
"""Simulate a v0.48 workspace (pre-PR-242 hook layout — chained
|
||||
self-upgrade+pull and old refresh-marketplace --quiet, no
|
||||
capture-session, no statusLine) and assert that
|
||||
maybe_refresh_claude_hooks brings it up to the current layout."""
|
||||
sp = tmp_path / ".claude" / "settings.json"
|
||||
sp.parent.mkdir(parents=True)
|
||||
sp.write_text(json.dumps({
|
||||
"hooks": {
|
||||
"SessionStart": [
|
||||
{"hooks": [{"type": "command",
|
||||
"command": "agnes self-upgrade --quiet 2>/dev/null || true; "
|
||||
"agnes pull --quiet 2>/dev/null || true"}]},
|
||||
{"hooks": [{"type": "command",
|
||||
"command": 'bash -c "agnes refresh-marketplace --quiet 2>/dev/null || true"'}]},
|
||||
],
|
||||
"SessionEnd": [
|
||||
{"hooks": [{"type": "command",
|
||||
"command": "agnes push --quiet 2>/dev/null || true"}]},
|
||||
],
|
||||
}
|
||||
}), encoding="utf-8")
|
||||
|
||||
refreshed = maybe_refresh_claude_hooks(tmp_path)
|
||||
assert refreshed is True
|
||||
|
||||
cfg = _read_settings(tmp_path)
|
||||
starts = _commands_for(cfg, "SessionStart")
|
||||
# New layout: capture-session is now the very first SessionStart entry.
|
||||
assert any("agnes capture-session" in c for c in starts), (
|
||||
f"v0.48→v0.49 migration must install capture-session; got: {starts!r}"
|
||||
)
|
||||
# And refresh-marketplace must be on --check, not --quiet.
|
||||
refresh = next((c for c in starts if "agnes refresh-marketplace" in c), None)
|
||||
assert refresh is not None and "--check" in refresh, refresh
|
||||
|
||||
ends = _commands_for(cfg, "SessionEnd")
|
||||
# SessionEnd push is now detached via nohup (replaces the old
|
||||
# synchronous form).
|
||||
assert any("nohup" in c for c in ends), (
|
||||
f"v0.48→v0.49 migration must detach SessionEnd push; got: {ends!r}"
|
||||
)
|
||||
# statusLine for the agnes-private indicator must also be installed.
|
||||
assert cfg.get("statusLine", {}).get("command", "").startswith("agnes statusline")
|
||||
|
||||
|
||||
def test_maybe_refresh_preserves_third_party_hooks(tmp_path):
|
||||
"""Refresh must keep third-party hooks intact (same guarantee as
|
||||
install_claude_hooks — the migration path uses the same machinery)."""
|
||||
sp = tmp_path / ".claude" / "settings.json"
|
||||
sp.parent.mkdir(parents=True)
|
||||
sp.write_text(json.dumps({
|
||||
"hooks": {
|
||||
"SessionStart": [
|
||||
{"hooks": [{"type": "command", "command": "agnes self-upgrade --quiet || true"}]},
|
||||
{"hooks": [{"type": "command", "command": "echo hi from another tool"}]},
|
||||
],
|
||||
}
|
||||
}), encoding="utf-8")
|
||||
refreshed = maybe_refresh_claude_hooks(tmp_path)
|
||||
assert refreshed is True
|
||||
cfg = _read_settings(tmp_path)
|
||||
starts = _commands_for(cfg, "SessionStart")
|
||||
assert "echo hi from another tool" in starts, (
|
||||
f"Third-party hook must survive refresh; got: {starts!r}"
|
||||
)
|
||||
|
|
|
|||
47
tests/test_mark_private.py
Normal file
47
tests/test_mark_private.py
Normal file
|
|
@ -0,0 +1,47 @@
|
|||
"""Tests for `agnes mark-private` — slash-command-driven private flag."""
|
||||
|
||||
from typer.testing import CliRunner
|
||||
|
||||
from cli.commands.mark_private import mark_private_app
|
||||
from cli.lib.private_list import is_private
|
||||
|
||||
runner = CliRunner()
|
||||
|
||||
|
||||
def test_mark_private_requires_session_id_env(tmp_path, monkeypatch):
|
||||
"""Without CLAUDE_CODE_SESSION_ID set (= ran outside a Claude session),
|
||||
the command must error out with exit 1, NOT silently no-op."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
monkeypatch.delenv("CLAUDE_CODE_SESSION_ID", raising=False)
|
||||
|
||||
result = runner.invoke(mark_private_app, [])
|
||||
assert result.exit_code == 1
|
||||
assert "CLAUDE_CODE_SESSION_ID" in result.output
|
||||
|
||||
|
||||
def test_mark_private_writes_to_private_list(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
monkeypatch.setenv("CLAUDE_CODE_SESSION_ID", "abc-123")
|
||||
|
||||
result = runner.invoke(mark_private_app, [])
|
||||
assert result.exit_code == 0
|
||||
assert is_private(tmp_path, "abc-123")
|
||||
assert "abc-123" in result.output
|
||||
|
||||
|
||||
def test_mark_private_is_idempotent(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
monkeypatch.setenv("CLAUDE_CODE_SESSION_ID", "abc-123")
|
||||
|
||||
runner.invoke(mark_private_app, [])
|
||||
result = runner.invoke(mark_private_app, [])
|
||||
assert result.exit_code == 0
|
||||
assert "already marked" in result.output.lower()
|
||||
|
||||
|
||||
def test_mark_private_blank_session_id_treated_as_missing(tmp_path, monkeypatch):
|
||||
"""Whitespace-only env var is the same as unset."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
monkeypatch.setenv("CLAUDE_CODE_SESSION_ID", " ")
|
||||
result = runner.invoke(mark_private_app, [])
|
||||
assert result.exit_code == 1
|
||||
86
tests/test_private_list.py
Normal file
86
tests/test_private_list.py
Normal file
|
|
@ -0,0 +1,86 @@
|
|||
"""Tests for cli.lib.private_list — authoritative "do not upload" state."""
|
||||
|
||||
from cli.lib.private_list import (
|
||||
add_private,
|
||||
is_private,
|
||||
private_list_path,
|
||||
read_all_private,
|
||||
)
|
||||
|
||||
|
||||
def test_is_private_false_when_file_missing(tmp_path):
|
||||
assert is_private(tmp_path, "abc") is False
|
||||
|
||||
|
||||
def test_is_private_false_for_empty_session_id(tmp_path):
|
||||
add_private(tmp_path, "abc")
|
||||
assert is_private(tmp_path, "") is False
|
||||
|
||||
|
||||
def test_add_private_creates_claude_dir_and_writes(tmp_path):
|
||||
assert add_private(tmp_path, "abc") is True
|
||||
assert private_list_path(tmp_path).exists()
|
||||
assert is_private(tmp_path, "abc") is True
|
||||
|
||||
|
||||
def test_add_private_is_idempotent(tmp_path):
|
||||
assert add_private(tmp_path, "abc") is True
|
||||
assert add_private(tmp_path, "abc") is False # second call: already present
|
||||
# File has exactly one line
|
||||
contents = private_list_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert contents == "abc\n"
|
||||
|
||||
|
||||
def test_add_private_empty_id_noop(tmp_path):
|
||||
assert add_private(tmp_path, "") is False
|
||||
assert not private_list_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_read_all_private_returns_set(tmp_path):
|
||||
add_private(tmp_path, "abc")
|
||||
add_private(tmp_path, "def")
|
||||
assert read_all_private(tmp_path) == {"abc", "def"}
|
||||
|
||||
|
||||
def test_read_all_private_skips_blank_lines(tmp_path):
|
||||
# writable=True so the parent .claude/ exists for the manual write below.
|
||||
path = private_list_path(tmp_path, writable=True)
|
||||
path.write_text("abc\n\n \ndef\n", encoding="utf-8")
|
||||
assert read_all_private(tmp_path) == {"abc", "def"}
|
||||
|
||||
|
||||
def test_read_all_private_empty_when_file_missing(tmp_path):
|
||||
assert read_all_private(tmp_path) == set()
|
||||
|
||||
|
||||
def test_read_only_path_does_not_create_claude_dir(tmp_path):
|
||||
"""Hot-path reader (statusline → is_private → private_list_path)
|
||||
must NOT materialize ``.claude/`` in arbitrary directories. S2.7 from
|
||||
the PR review."""
|
||||
_ = private_list_path(tmp_path) # default writable=False
|
||||
assert not (tmp_path / ".claude").exists()
|
||||
# Read-side helpers also stay side-effect-free.
|
||||
_ = read_all_private(tmp_path)
|
||||
assert is_private(tmp_path, "missing") is False
|
||||
assert not (tmp_path / ".claude").exists()
|
||||
|
||||
|
||||
def test_read_cache_returns_same_set_across_calls(tmp_path):
|
||||
"""mtime-keyed cache: repeated reads of an unchanged file return the
|
||||
same set (same identity is fine; same value is the contract)."""
|
||||
add_private(tmp_path, "abc")
|
||||
a = read_all_private(tmp_path)
|
||||
b = read_all_private(tmp_path)
|
||||
assert a == b == {"abc"}
|
||||
|
||||
|
||||
def test_add_private_evicts_cache_so_subsequent_read_sees_new_id(tmp_path):
|
||||
"""Within a sub-second window after add, is_private must reflect
|
||||
the new entry — even when filesystem mtime granularity is 1s and
|
||||
a naive mtime check would still treat the cache as fresh."""
|
||||
add_private(tmp_path, "first")
|
||||
assert read_all_private(tmp_path) == {"first"}
|
||||
# Second add within the same second; cache eviction makes the next
|
||||
# read see both entries even though mtime may be identical.
|
||||
add_private(tmp_path, "second")
|
||||
assert read_all_private(tmp_path) == {"first", "second"}
|
||||
|
|
@ -369,3 +369,83 @@ def test_smoke_test_passes_with_pep440_local_version():
|
|||
ok, detail = su._smoke_test_new_binary("uv", expected_version="0.40.0")
|
||||
assert ok is False
|
||||
assert "version mismatch" in detail
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Workspace hook auto-refresh (PR #242 — ZdenekSrotyr #2 silent-stop fix)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
def test_hook_refresh_fires_when_cli_already_current(monkeypatch):
|
||||
"""The info-is-None fast path must still refresh hooks. Covers the
|
||||
v0.48→v0.49 migration moment when the operator already self-upgraded
|
||||
the CLI (so the second self-upgrade call from a SessionStart hook
|
||||
finds nothing to install), but their workspace settings.json was
|
||||
written by the older CLI version and lacks the new capture-session
|
||||
hook entry."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", "/fake/workspace")
|
||||
with patch("cli.commands.self_upgrade.check", return_value=_current_info()), \
|
||||
patch("cli.commands.self_upgrade.maybe_refresh_claude_hooks") as mock_refresh:
|
||||
result = runner.invoke(app, ["self-upgrade"])
|
||||
assert result.exit_code == 0
|
||||
mock_refresh.assert_called_once()
|
||||
|
||||
|
||||
def test_hook_refresh_fires_after_successful_install(monkeypatch):
|
||||
"""The install-success path must refresh hooks AFTER the new wheel is
|
||||
in place — so any wire-format change in the new release lands on the
|
||||
next session-start without re-running `agnes init`."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", "/fake/workspace")
|
||||
with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
|
||||
patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
|
||||
patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
|
||||
patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_pass()), \
|
||||
patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
|
||||
patch("cli.commands.self_upgrade._record_last_known_good"), \
|
||||
patch("cli.commands.self_upgrade._invalidate_update_cache"), \
|
||||
patch("cli.commands.self_upgrade.maybe_refresh_claude_hooks") as mock_refresh:
|
||||
mock_run.return_value = MagicMock(returncode=0)
|
||||
result = runner.invoke(app, ["self-upgrade"])
|
||||
assert result.exit_code == 0
|
||||
mock_refresh.assert_called_once()
|
||||
|
||||
|
||||
def test_hook_refresh_skipped_on_install_failure(monkeypatch):
|
||||
"""Failed install: do NOT refresh hooks — the rollback has already
|
||||
run and the workspace is in a known-prior state; rewriting hooks now
|
||||
could pin a layout that doesn't match the rolled-back binary."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", "/fake/workspace")
|
||||
with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
|
||||
patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
|
||||
patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
|
||||
patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_fail()), \
|
||||
patch("cli.commands.self_upgrade._read_last_known_good", return_value=_PRIOR_URL), \
|
||||
patch("cli.commands.self_upgrade._record_last_known_good"), \
|
||||
patch("cli.commands.self_upgrade._invalidate_update_cache"), \
|
||||
patch("cli.commands.self_upgrade.maybe_refresh_claude_hooks") as mock_refresh:
|
||||
mock_run.return_value = MagicMock(returncode=0) # install rc=0 but smoke failed
|
||||
result = runner.invoke(app, ["self-upgrade"])
|
||||
assert result.exit_code == 1
|
||||
mock_refresh.assert_not_called()
|
||||
|
||||
|
||||
def test_hook_refresh_skipped_when_check_only(monkeypatch):
|
||||
"""--check-only is read-only intent; never touch the workspace."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", "/fake/workspace")
|
||||
with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
|
||||
patch("cli.commands.self_upgrade.maybe_refresh_claude_hooks") as mock_refresh:
|
||||
result = runner.invoke(app, ["self-upgrade", "--check-only"])
|
||||
# exit 1 because outdated — see test_check_only_when_outdated_exits_1
|
||||
assert result.exit_code == 1
|
||||
mock_refresh.assert_not_called()
|
||||
|
||||
|
||||
def test_hook_refresh_failure_does_not_flip_exit_code(monkeypatch):
|
||||
"""An exception inside maybe_refresh_claude_hooks must NOT turn a
|
||||
successful upgrade into rc=1. The refresh is best-effort."""
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", "/fake/workspace")
|
||||
with patch("cli.commands.self_upgrade.check", return_value=_current_info()), \
|
||||
patch("cli.commands.self_upgrade.maybe_refresh_claude_hooks",
|
||||
side_effect=PermissionError("settings.json read-only")):
|
||||
result = runner.invoke(app, ["self-upgrade"])
|
||||
assert result.exit_code == 0
|
||||
|
|
|
|||
258
tests/test_session_queue.py
Normal file
258
tests/test_session_queue.py
Normal file
|
|
@ -0,0 +1,258 @@
|
|||
"""Tests for cli.lib.session_queue — queue and uploaded-log helpers."""
|
||||
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
from cli.lib.session_queue import (
|
||||
append_to_queue,
|
||||
discard_snapshot,
|
||||
failed_log_path,
|
||||
find_recovery_snapshots,
|
||||
mark_failed_permanent,
|
||||
mark_private_skipped,
|
||||
mark_uploaded,
|
||||
private_skipped_log_path,
|
||||
queue_path,
|
||||
read_entries_from_snapshot,
|
||||
read_paths_from_snapshot,
|
||||
requeue_failed,
|
||||
snapshot_queue,
|
||||
uploaded_log_path,
|
||||
)
|
||||
|
||||
|
||||
def test_append_creates_claude_dir_and_appends(tmp_path):
|
||||
append_to_queue(tmp_path, "sid-1", "/abc/123.jsonl")
|
||||
queue = queue_path(tmp_path)
|
||||
assert queue.exists()
|
||||
assert queue.read_text(encoding="utf-8") == "sid-1\t/abc/123.jsonl\n"
|
||||
|
||||
append_to_queue(tmp_path, "sid-2", "/def/456.jsonl")
|
||||
assert queue.read_text(encoding="utf-8") == (
|
||||
"sid-1\t/abc/123.jsonl\n"
|
||||
"sid-2\t/def/456.jsonl\n"
|
||||
)
|
||||
|
||||
|
||||
def test_append_strips_trailing_newline_from_input(tmp_path):
|
||||
"""Defensive: appender adds exactly one '\\n' even if caller passes one."""
|
||||
append_to_queue(tmp_path, "sid-1", "/abc.jsonl\n")
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == "sid-1\t/abc.jsonl\n"
|
||||
|
||||
|
||||
def test_append_empty_session_id_writes_leading_tab(tmp_path):
|
||||
"""Empty session_id is legal — line starts with the tab separator.
|
||||
Used by code paths where session_id isn't yet known (none today, but
|
||||
keeps the door open for backfill tooling)."""
|
||||
append_to_queue(tmp_path, "", "/abc.jsonl")
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == "\t/abc.jsonl\n"
|
||||
|
||||
|
||||
def test_snapshot_returns_none_when_no_queue(tmp_path):
|
||||
assert snapshot_queue(tmp_path) is None
|
||||
|
||||
|
||||
def test_snapshot_renames_queue_atomically(tmp_path):
|
||||
append_to_queue(tmp_path, "sid-1", "/abc.jsonl")
|
||||
snap = snapshot_queue(tmp_path)
|
||||
assert snap is not None
|
||||
assert snap.exists()
|
||||
assert not queue_path(tmp_path).exists()
|
||||
assert snap.read_text(encoding="utf-8") == "sid-1\t/abc.jsonl\n"
|
||||
|
||||
|
||||
def test_snapshot_filename_carries_pid(tmp_path):
|
||||
import os
|
||||
append_to_queue(tmp_path, "sid", "/x.jsonl")
|
||||
snap = snapshot_queue(tmp_path)
|
||||
assert snap is not None
|
||||
assert str(os.getpid()) in snap.name
|
||||
|
||||
|
||||
def test_snapshot_filename_is_unique_per_call(tmp_path):
|
||||
"""Two consecutive snapshots under the same PID must not collide.
|
||||
Guards against the data-loss scenario where a crashed push left a
|
||||
snapshot on disk and the OS later reused its PID for a new push:
|
||||
without the uuid suffix, os.rename would atomically overwrite the
|
||||
recovery snapshot, silently losing its entries."""
|
||||
append_to_queue(tmp_path, "sid-1", "/a.jsonl")
|
||||
snap1 = snapshot_queue(tmp_path)
|
||||
append_to_queue(tmp_path, "sid-2", "/b.jsonl")
|
||||
snap2 = snapshot_queue(tmp_path)
|
||||
assert snap1 is not None and snap2 is not None
|
||||
assert snap1 != snap2
|
||||
assert snap1.exists() and snap2.exists()
|
||||
|
||||
|
||||
def test_read_entries_dedups_and_preserves_order(tmp_path):
|
||||
append_to_queue(tmp_path, "sid-1", "/abc.jsonl")
|
||||
append_to_queue(tmp_path, "sid-2", "/def.jsonl")
|
||||
append_to_queue(tmp_path, "sid-1", "/abc.jsonl") # exact duplicate
|
||||
snap = snapshot_queue(tmp_path)
|
||||
assert snap is not None
|
||||
entries = read_entries_from_snapshot(snap)
|
||||
assert entries == [("sid-1", Path("/abc.jsonl")), ("sid-2", Path("/def.jsonl"))]
|
||||
|
||||
|
||||
def test_read_entries_different_session_id_same_path_kept(tmp_path):
|
||||
"""Same path with different session IDs are distinct entries — they
|
||||
represent two sessions that wrote to the same transcript file (rare
|
||||
but possible if Claude Code renames sessions)."""
|
||||
append_to_queue(tmp_path, "sid-1", "/abc.jsonl")
|
||||
append_to_queue(tmp_path, "sid-2", "/abc.jsonl")
|
||||
snap = snapshot_queue(tmp_path)
|
||||
entries = read_entries_from_snapshot(snap)
|
||||
assert len(entries) == 2
|
||||
|
||||
|
||||
def test_read_entries_accepts_legacy_one_column_lines(tmp_path):
|
||||
"""Forward compat: pre-feature workspaces had bare-path lines. Those
|
||||
still upload, just with empty session_id (can't be marked private
|
||||
retroactively, which is fine — they pre-date the feature)."""
|
||||
queue = queue_path(tmp_path)
|
||||
queue.write_text("/legacy.jsonl\nsid-1\t/new.jsonl\n", encoding="utf-8")
|
||||
entries = read_entries_from_snapshot(queue)
|
||||
assert entries == [("", Path("/legacy.jsonl")), ("sid-1", Path("/new.jsonl"))]
|
||||
|
||||
|
||||
def test_read_entries_skips_blank_lines(tmp_path):
|
||||
queue = queue_path(tmp_path)
|
||||
queue.write_text("sid-1\t/abc.jsonl\n\n \nsid-2\t/def.jsonl\n", encoding="utf-8")
|
||||
entries = read_entries_from_snapshot(queue)
|
||||
assert entries == [("sid-1", Path("/abc.jsonl")), ("sid-2", Path("/def.jsonl"))]
|
||||
|
||||
|
||||
def test_read_entries_returns_empty_for_missing_file(tmp_path):
|
||||
assert read_entries_from_snapshot(tmp_path / "does-not-exist.txt") == []
|
||||
|
||||
|
||||
def test_read_paths_compat_wrapper(tmp_path):
|
||||
"""read_paths_from_snapshot returns list[Path] only — used by code
|
||||
paths (dry-run preview) that don't need session_id."""
|
||||
append_to_queue(tmp_path, "sid-1", "/abc.jsonl")
|
||||
append_to_queue(tmp_path, "sid-2", "/def.jsonl")
|
||||
snap = snapshot_queue(tmp_path)
|
||||
paths = read_paths_from_snapshot(snap)
|
||||
assert paths == [Path("/abc.jsonl"), Path("/def.jsonl")]
|
||||
|
||||
|
||||
def test_mark_uploaded_appends_tsv(tmp_path):
|
||||
when = datetime(2026, 5, 10, 14, 32, 18, tzinfo=timezone.utc)
|
||||
p1 = tmp_path / "abc.jsonl"
|
||||
mark_uploaded(tmp_path, p1, when=when)
|
||||
log = uploaded_log_path(tmp_path)
|
||||
assert log.read_text(encoding="utf-8") == f"2026-05-10T14:32:18Z\t{p1}\n"
|
||||
|
||||
when2 = datetime(2026, 5, 10, 14, 32, 19, tzinfo=timezone.utc)
|
||||
p2 = tmp_path / "def.jsonl"
|
||||
mark_uploaded(tmp_path, p2, when=when2)
|
||||
assert log.read_text(encoding="utf-8") == (
|
||||
f"2026-05-10T14:32:18Z\t{p1}\n"
|
||||
f"2026-05-10T14:32:19Z\t{p2}\n"
|
||||
)
|
||||
|
||||
|
||||
def test_mark_uploaded_default_timestamp_is_utc(tmp_path):
|
||||
p = tmp_path / "x.jsonl"
|
||||
mark_uploaded(tmp_path, p)
|
||||
line = uploaded_log_path(tmp_path).read_text(encoding="utf-8")
|
||||
assert line.endswith(f"\t{p}\n")
|
||||
assert line.startswith("20")
|
||||
assert line.split("\t")[0].endswith("Z")
|
||||
|
||||
|
||||
def test_mark_private_skipped_appends_audit_log(tmp_path):
|
||||
when = datetime(2026, 5, 10, 14, 32, 18, tzinfo=timezone.utc)
|
||||
p = tmp_path / "abc.jsonl"
|
||||
mark_private_skipped(tmp_path, "sid-1", p, when=when)
|
||||
log = private_skipped_log_path(tmp_path)
|
||||
assert log.read_text(encoding="utf-8") == f"2026-05-10T14:32:18Z\tsid-1\t{p}\n"
|
||||
|
||||
|
||||
def test_mark_failed_permanent_appends_tsv(tmp_path):
|
||||
when = datetime(2026, 5, 10, 14, 32, 18, tzinfo=timezone.utc)
|
||||
p = tmp_path / "abc.jsonl"
|
||||
mark_failed_permanent(tmp_path, "sid-1", p, 401, when=when)
|
||||
log = failed_log_path(tmp_path)
|
||||
assert log.read_text(encoding="utf-8") == f"2026-05-10T14:32:18Z\tsid-1\t401\t{p}\n"
|
||||
|
||||
when2 = datetime(2026, 5, 10, 14, 32, 19, tzinfo=timezone.utc)
|
||||
p2 = tmp_path / "def.jsonl"
|
||||
mark_failed_permanent(tmp_path, "sid-2", p2, 413, when=when2)
|
||||
assert log.read_text(encoding="utf-8") == (
|
||||
f"2026-05-10T14:32:18Z\tsid-1\t401\t{p}\n"
|
||||
f"2026-05-10T14:32:19Z\tsid-2\t413\t{p2}\n"
|
||||
)
|
||||
|
||||
|
||||
def test_requeue_failed_appends_to_live_queue(tmp_path):
|
||||
fresh = tmp_path / "fresh.jsonl"
|
||||
failed = tmp_path / "failed.jsonl"
|
||||
append_to_queue(tmp_path, "sid-fresh", str(fresh))
|
||||
requeue_failed(tmp_path, [("sid-failed", failed)])
|
||||
assert queue_path(tmp_path).read_text(encoding="utf-8") == (
|
||||
f"sid-fresh\t{fresh}\n"
|
||||
f"sid-failed\t{failed}\n"
|
||||
)
|
||||
|
||||
|
||||
def test_requeue_empty_is_noop(tmp_path):
|
||||
requeue_failed(tmp_path, [])
|
||||
assert not queue_path(tmp_path).exists()
|
||||
|
||||
|
||||
def test_find_recovery_snapshots_returns_existing(tmp_path):
|
||||
claude = tmp_path / ".claude"
|
||||
claude.mkdir()
|
||||
(claude / "agnes-sessions.snapshot.111.txt").write_text("sid-a\t/a.jsonl\n")
|
||||
(claude / "agnes-sessions.snapshot.222.txt").write_text("sid-b\t/b.jsonl\n")
|
||||
snaps = find_recovery_snapshots(tmp_path)
|
||||
assert len(snaps) == 2
|
||||
assert all(s.name.startswith("agnes-sessions.snapshot.") for s in snaps)
|
||||
|
||||
|
||||
def test_find_recovery_snapshots_empty_when_none(tmp_path):
|
||||
assert find_recovery_snapshots(tmp_path) == []
|
||||
|
||||
|
||||
def test_discard_snapshot_idempotent(tmp_path):
|
||||
append_to_queue(tmp_path, "sid", "/a.jsonl")
|
||||
snap = snapshot_queue(tmp_path)
|
||||
assert snap is not None
|
||||
discard_snapshot(snap)
|
||||
assert not snap.exists()
|
||||
discard_snapshot(snap) # second call must not raise
|
||||
|
||||
|
||||
def test_append_concurrent_threads_no_corruption(tmp_path):
|
||||
"""Concurrent appends from multiple threads must not interleave bytes
|
||||
mid-line. Guards against the Windows non-atomic open(path, 'a')
|
||||
regression that the queue lock was added to prevent."""
|
||||
import threading
|
||||
|
||||
per_worker = 50
|
||||
n_workers = 4
|
||||
|
||||
def worker(start: int) -> None:
|
||||
for i in range(per_worker):
|
||||
append_to_queue(tmp_path, f"sid-{start}-{i}", f"/p/{start}-{i}.jsonl")
|
||||
|
||||
threads = [threading.Thread(target=worker, args=(s,)) for s in range(n_workers)]
|
||||
for t in threads:
|
||||
t.start()
|
||||
for t in threads:
|
||||
t.join()
|
||||
|
||||
text = queue_path(tmp_path).read_text(encoding="utf-8")
|
||||
lines = text.splitlines()
|
||||
assert len(lines) == n_workers * per_worker
|
||||
# Every line must be well-formed: exactly one tab separator, non-empty
|
||||
# sid + non-empty path. Interleaved bytes would produce malformed lines.
|
||||
seen: set[str] = set()
|
||||
for line in lines:
|
||||
assert line.count("\t") == 1, f"corrupted line: {line!r}"
|
||||
sid, _, path = line.partition("\t")
|
||||
assert sid.startswith("sid-")
|
||||
assert path.startswith("/p/")
|
||||
seen.add(line)
|
||||
assert len(seen) == n_workers * per_worker # no dropped writes
|
||||
|
|
@ -48,13 +48,15 @@ def test_render_setup_instructions_wires_all_placeholders():
|
|||
|
||||
|
||||
def test_resolve_lines_no_plugins_unified_layout():
|
||||
"""Unified always-on layout (Fix B + Fix C in 2026-05-10 init-report
|
||||
response): 1 install, 2 init, 3 catalog, 4 preflight, 5 marketplace,
|
||||
6 mcp_servers, 7 diagnose, 8 skills, 9 confirm. Preflight +
|
||||
marketplace + MCP block are emitted even when the operator has zero
|
||||
plugin grants — registering the per-user marketplace clone pre-wires
|
||||
the SessionStart hook, and the Atlassian Remote MCP applies to every
|
||||
analyst whose work touches Jira/Confluence."""
|
||||
"""Unified always-on layout: 1 install, 2 init, 3 catalog, 4 preflight,
|
||||
5 marketplace, 6 mcp_servers, 7 diagnose, 8 confirm. Preflight +
|
||||
marketplace + MCP block are emitted even when the operator's served
|
||||
stack is empty — registering the per-user marketplace clone pre-wires
|
||||
Claude Code for future stack changes (admin grants, system pins,
|
||||
Flea installs), and the Atlassian Remote MCP applies to every analyst
|
||||
whose work touches Jira/Confluence. Skills step deleted — the
|
||||
interactive copy-or-on-demand question was confusing and the
|
||||
on-demand path is the one-size-fits-all default."""
|
||||
from app.web.setup_instructions import resolve_lines
|
||||
|
||||
joined = "\n".join(resolve_lines("agnes.whl"))
|
||||
|
|
@ -66,14 +68,19 @@ def test_resolve_lines_no_plugins_unified_layout():
|
|||
assert "5) Register the Agnes Claude Code marketplace" in joined
|
||||
assert "6) Register the Atlassian MCP server" in joined
|
||||
assert "7) Run diagnostics:" in joined
|
||||
assert "8) Skills" in joined
|
||||
assert "9) Confirm:" in joined
|
||||
assert "8) Confirm:" in joined
|
||||
# No stray Confirms at other positions.
|
||||
assert "9) Confirm:" not in joined
|
||||
assert "10) Confirm:" not in joined
|
||||
assert "6) Confirm:" not in joined
|
||||
# The marketplace step header adapts to "no plugins granted yet" copy
|
||||
# No skills step in any form.
|
||||
assert "Skills (ask the user" not in joined
|
||||
assert "Skills" not in joined or "agnes skills" in joined # comment refs still OK
|
||||
assert "8) Skills" not in joined
|
||||
assert "~/.claude/skills/agnes/" not in joined
|
||||
# The marketplace step header adapts to the empty-stack copy
|
||||
# rather than the plugin-installing variant.
|
||||
assert "no plugins granted yet" in joined
|
||||
assert "your stack is empty for now" in joined
|
||||
assert "agnes refresh-marketplace --bootstrap" in joined
|
||||
# MCP step uses SSE transport for Atlassian's hosted Remote MCP.
|
||||
assert "claude mcp add --transport sse atlassian https://mcp.atlassian.com/v1/sse" in joined
|
||||
|
|
@ -250,9 +257,10 @@ def test_trust_block_step_0c_does_not_reference_stale_step_number():
|
|||
|
||||
def test_resolve_lines_with_plugins_uses_install_first_diagnose_last_layout():
|
||||
"""Marketplace layout puts install/init/catalog/preflight/marketplace
|
||||
BEFORE diagnose and skills, so the human-loop skills question is the
|
||||
final blocking step before Confirm. Step numbers: 4 preflight,
|
||||
5 marketplace, 6 diagnose, 7 skills, 8 confirm."""
|
||||
BEFORE diagnose, so diagnose is the final smoke test before Confirm.
|
||||
Step numbers: 4 preflight, 5 marketplace, 6 mcp, 7 diagnose,
|
||||
8 confirm. No skills step — interactive copy-or-on-demand question
|
||||
was confusing; on-demand `agnes skills show` is the default."""
|
||||
from app.web.setup_instructions import resolve_lines
|
||||
|
||||
lines = resolve_lines(
|
||||
|
|
@ -268,13 +276,13 @@ def test_resolve_lines_with_plugins_uses_install_first_diagnose_last_layout():
|
|||
assert "brew install git" in joined
|
||||
assert "winget install --id Git.Git -e --source winget --silent" in joined
|
||||
assert "sudo apt-get install git" in joined or "sudo dnf install git" in joined
|
||||
# Step 5 — marketplace + plugins. Collapsed to a single CLI call:
|
||||
# Step 5 — marketplace + stack install. Collapsed to a single CLI call:
|
||||
# `agnes refresh-marketplace --bootstrap` does clone + PAT-strip +
|
||||
# chmod + register-with-Claude + auto-install-from-manifest internally.
|
||||
# Pulling that out of the inline shell script avoided Claude Code's
|
||||
# agent-driven `rm -rf` permission gate that the old multi-line
|
||||
# sequence tripped on.
|
||||
assert "5) Register the Agnes Claude Code marketplace and install plugins" in joined
|
||||
assert "5) Register the Agnes Claude Code marketplace and install your current stack" in joined
|
||||
assert "agnes refresh-marketplace --bootstrap" in joined
|
||||
# The destructive prep + per-plugin install commands are now inside
|
||||
# the CLI; the prompt must not emit the inline shell forms in
|
||||
|
|
@ -287,15 +295,17 @@ def test_resolve_lines_with_plugins_uses_install_first_diagnose_last_layout():
|
|||
assert "claude plugin marketplace add" not in executable
|
||||
assert "claude plugin install foo@agnes" not in executable
|
||||
assert "claude plugin install bar@agnes" not in executable
|
||||
# Step 6 — Atlassian MCP registration (Fix C in 2026-05-10 init-report response).
|
||||
# Step 6 — Atlassian MCP registration.
|
||||
assert "6) Register the Atlassian MCP server" in joined
|
||||
# Step 7 — diagnose now AFTER marketplace + MCP wiring.
|
||||
assert "7) Run diagnostics:" in joined
|
||||
# Step 8 — skills, the last interactive step before Confirm.
|
||||
assert "8) Skills" in joined
|
||||
# Step 9 — Confirm renumbered (no stray Confirms at other positions).
|
||||
assert "9) Confirm:" in joined
|
||||
for stray in ("4) Confirm:", "5) Confirm:", "6) Confirm:", "7) Confirm:", "8) Confirm:"):
|
||||
# Step 8 — Confirm.
|
||||
assert "8) Confirm:" in joined
|
||||
# No skills step in any form.
|
||||
assert "Skills (ask the user" not in joined
|
||||
assert "8) Skills" not in joined
|
||||
assert "~/.claude/skills/agnes/" not in joined
|
||||
for stray in ("4) Confirm:", "5) Confirm:", "6) Confirm:", "7) Confirm:", "9) Confirm:"):
|
||||
assert stray not in joined
|
||||
# Crucial ordering invariants for the new layout.
|
||||
install_idx = joined.index("1) Install the CLI")
|
||||
|
|
@ -305,9 +315,8 @@ def test_resolve_lines_with_plugins_uses_install_first_diagnose_last_layout():
|
|||
market_idx = joined.index("5) Register the Agnes Claude Code marketplace")
|
||||
mcp_idx = joined.index("6) Register the Atlassian MCP server")
|
||||
diag_idx = joined.index("7) Run diagnostics:")
|
||||
skills_idx = joined.index("8) Skills")
|
||||
confirm_idx = joined.index("9) Confirm:")
|
||||
assert install_idx < init_idx < catalog_idx < git_idx < market_idx < mcp_idx < diag_idx < skills_idx < confirm_idx
|
||||
confirm_idx = joined.index("8) Confirm:")
|
||||
assert install_idx < init_idx < catalog_idx < git_idx < market_idx < mcp_idx < diag_idx < confirm_idx
|
||||
# Legacy `git config sslVerify=false` downgrade is gone — see CHANGELOG.
|
||||
assert "git config --global" not in joined
|
||||
# server_host is server-side substituted; the placeholder must be gone.
|
||||
|
|
@ -631,14 +640,13 @@ def test_resolve_lines_ca_pem_empty_string_is_treated_as_absent():
|
|||
|
||||
def test_resolve_lines_ca_pem_works_without_plugins():
|
||||
"""Trust block is independent of the marketplace + MCP blocks — emit
|
||||
step 0 even when plugin list is empty. Confirm step is at 9 in the
|
||||
always-on layout (Fix B + Fix C in 2026-05-10 init-report response).
|
||||
Step 0 is preamble, not numbered."""
|
||||
step 0 even when plugin list is empty. Confirm step is at 8 in the
|
||||
post-skills-removal layout. Step 0 is preamble, not numbered."""
|
||||
from app.web.setup_instructions import resolve_lines
|
||||
|
||||
joined = "\n".join(resolve_lines("agnes.whl", ca_pem=_FAKE_CA_PEM))
|
||||
assert "0) Trust the Agnes TLS certificate" in joined
|
||||
assert "9) Confirm:" in joined
|
||||
assert "8) Confirm:" in joined
|
||||
# Marketplace block is now emitted unconditionally; the bootstrap
|
||||
# one-liner does the `claude plugin marketplace add` internally so
|
||||
# the literal string isn't in the prompt text — the user-facing
|
||||
|
|
@ -683,51 +691,43 @@ def test_diagnose_step_documents_normal_states():
|
|||
assert "NORMAL" in joined or "normal" in joined
|
||||
|
||||
|
||||
def test_skills_step_is_last_blocking_step_before_confirm():
|
||||
"""In the new layout, skills is the LAST interactive step before Confirm
|
||||
(it used to come right after diagnose and before git/marketplace, which
|
||||
invited the assistant to "do the rest in parallel"). We've moved the
|
||||
install work earlier, so the skills question is now a single clear gate
|
||||
— there's nothing left to do in parallel and the assistant must wait
|
||||
for the user's answer.
|
||||
def test_no_skills_step_emitted():
|
||||
"""Skills step was removed: the interactive copy-or-on-demand question
|
||||
was confusing for new users (named opinion call with no obvious right
|
||||
answer after a wall of technical steps). On-demand lookup via
|
||||
`agnes skills show <name>` is the one-size-fits-all default; CLAUDE.md
|
||||
references specific skills (e.g. agnes-data-querying) when relevant.
|
||||
|
||||
Assert two things:
|
||||
(a) The prompt explicitly tells the assistant to wait for the answer.
|
||||
(b) The skills step appears AFTER the marketplace step in the rendered
|
||||
line order — i.e., the legacy "skills before marketplace" flow
|
||||
isn't accidentally back."""
|
||||
Regression guard: the rendered prompt must not contain a numbered
|
||||
Skills step or the bulk-copy shell loop into ~/.claude/skills/agnes/.
|
||||
"""
|
||||
from app.web.setup_instructions import resolve_lines
|
||||
|
||||
joined = "\n".join(resolve_lines("agnes.whl", plugin_install_names=["foo"], server_host="h"))
|
||||
flattened = " ".join(joined.split())
|
||||
|
||||
# (a) The prompt must instruct the assistant to wait — and must NOT
|
||||
# contain the obsolete "you can continue in parallel" hint.
|
||||
assert "Wait for the user's answer" in joined
|
||||
assert "don't depend on the answer" not in flattened
|
||||
assert "do not depend on the answer" not in flattened
|
||||
|
||||
# (b) Skills comes after marketplace in the rendered line order.
|
||||
market_idx = joined.index("Register the Agnes Claude Code marketplace")
|
||||
skills_idx = joined.index("Skills (ask the user")
|
||||
assert market_idx < skills_idx
|
||||
for kwargs in (
|
||||
{},
|
||||
{"plugin_install_names": ["foo"], "server_host": "h"},
|
||||
):
|
||||
joined = "\n".join(resolve_lines("agnes.whl", **kwargs))
|
||||
assert "Skills (ask the user" not in joined
|
||||
assert "8) Skills" not in joined
|
||||
assert "9) Skills" not in joined
|
||||
assert "~/.claude/skills/agnes/" not in joined
|
||||
assert "for s in $(agnes skills list" not in joined
|
||||
assert "Wait for the user's answer" not in joined
|
||||
|
||||
|
||||
def test_no_plugins_layout_keeps_diagnose_before_skills():
|
||||
"""Always-on layout (Fix B + Fix C in 2026-05-10 init-report response):
|
||||
def test_no_plugins_layout_diagnose_before_confirm():
|
||||
"""Always-on layout (post-skills-removal):
|
||||
install → init → catalog → preflight → marketplace → mcp_servers →
|
||||
diagnose → skills → confirm, regardless of plugin grants. Step
|
||||
numbers: 7 diagnose, 8 skills, 9 confirm."""
|
||||
diagnose → confirm. Step numbers: 7 diagnose, 8 confirm."""
|
||||
from app.web.setup_instructions import resolve_lines
|
||||
|
||||
joined = "\n".join(resolve_lines("agnes.whl"))
|
||||
assert "7) Run diagnostics:" in joined
|
||||
assert "8) Skills" in joined
|
||||
assert "9) Confirm:" in joined
|
||||
assert "8) Confirm:" in joined
|
||||
diag_idx = joined.index("7) Run diagnostics:")
|
||||
skills_idx = joined.index("8) Skills")
|
||||
confirm_idx = joined.index("9) Confirm:")
|
||||
assert diag_idx < skills_idx < confirm_idx
|
||||
confirm_idx = joined.index("8) Confirm:")
|
||||
assert diag_idx < confirm_idx
|
||||
|
||||
|
||||
def test_unified_flow_uses_only_agnes_verbs():
|
||||
|
|
|
|||
|
|
@ -37,10 +37,10 @@ def test_setup_page_renders_unified_layout(client):
|
|||
|
||||
- `agnes init` is mandatory (subsumes the old admin-only
|
||||
`agnes auth import-token` + `agnes auth whoami` pair).
|
||||
- Marketplace block is always emitted (Fix B in 2026-05-10
|
||||
init-report response): anonymous visitors with no plugin grants
|
||||
still get the marketplace registration step so the SessionStart
|
||||
hook is pre-wired. Confirm = step 8.
|
||||
- Marketplace block is always emitted: anonymous visitors with no
|
||||
plugin grants still get the marketplace registration step so
|
||||
future stack changes land cleanly. Confirm = step 8 in the
|
||||
post-skills-removal layout.
|
||||
"""
|
||||
resp = client.get("/setup", follow_redirects=True)
|
||||
assert resp.status_code == 200
|
||||
|
|
@ -49,9 +49,9 @@ def test_setup_page_renders_unified_layout(client):
|
|||
assert "agnes init" in text
|
||||
# Legacy admin-only login verbs are gone from the rendered prompt.
|
||||
assert "agnes auth import-token" not in text
|
||||
# Always-on layout (preflight + marketplace + MCP block all unconditional):
|
||||
# Confirm = step 9.
|
||||
assert "9) Confirm:" in text
|
||||
# Always-on layout (preflight + marketplace + MCP all unconditional,
|
||||
# skills removed): Confirm = step 8.
|
||||
assert "8) Confirm:" in text
|
||||
|
||||
|
||||
def test_setup_page_ignores_role_query_param(client):
|
||||
|
|
@ -72,14 +72,15 @@ def test_setup_page_ignores_role_query_param(client):
|
|||
|
||||
|
||||
def test_setup_page_renders_marketplace_for_user_with_grants(client, monkeypatch):
|
||||
"""When the caller has plugin grants in `resource_grants`, the
|
||||
unified flow inserts the marketplace + plugins block (step 5) and
|
||||
Confirm shifts to step 8.
|
||||
"""When the caller has a non-empty served stack, the marketplace block
|
||||
renders the "install your current stack" copy variant. Confirm stays
|
||||
at step 8 in the post-skills-removal layout (preflight + marketplace
|
||||
+ MCP all always-on regardless of stack contents).
|
||||
|
||||
Stub `marketplace_filter.resolve_user_marketplace` to return a
|
||||
plugin so we don't have to seed the full marketplace plumbing in
|
||||
this test — we're verifying the layout switch, not the RBAC
|
||||
resolver itself (covered by `test_marketplace_filter`).
|
||||
this test — we're verifying the layout, not the RBAC resolver
|
||||
itself (covered by `test_marketplace_filter`).
|
||||
|
||||
Post-Model B (v28+): the setup page reads from
|
||||
`resolve_user_marketplace` (which gates on explicit subscriptions)
|
||||
|
|
@ -109,11 +110,11 @@ def test_setup_page_renders_marketplace_for_user_with_grants(client, monkeypatch
|
|||
# Marketplace block marker. The per-plugin install lines moved inside
|
||||
# `agnes refresh-marketplace --bootstrap`, so we check the section
|
||||
# header + the one-liner instead of `claude plugin install <name>@agnes`.
|
||||
assert "Register the Agnes Claude Code marketplace" in text
|
||||
# Non-empty stack → "install your current stack" header variant.
|
||||
assert "Register the Agnes Claude Code marketplace and install your current stack" in text
|
||||
assert "agnes refresh-marketplace --bootstrap" in text
|
||||
# Layout shift: Confirm is now step 9 (preflight + marketplace + MCP all
|
||||
# always-on per Fix B + Fix C in 2026-05-10 init-report response).
|
||||
assert "9) Confirm:" in text
|
||||
# Post-skills-removal layout: Confirm is step 8.
|
||||
assert "8) Confirm:" in text
|
||||
# Pre-flight is in the rendered prompt at step 4.
|
||||
assert "Make sure git and claude are installed" in text
|
||||
# Atlassian MCP registration is at step 6.
|
||||
|
|
|
|||
55
tests/test_statusline.py
Normal file
55
tests/test_statusline.py
Normal file
|
|
@ -0,0 +1,55 @@
|
|||
"""Tests for `agnes statusline` — Claude Code statusLine helper."""
|
||||
|
||||
import json
|
||||
|
||||
from typer.testing import CliRunner
|
||||
|
||||
from cli.commands.statusline import statusline_app
|
||||
from cli.lib.private_list import add_private
|
||||
|
||||
runner = CliRunner()
|
||||
|
||||
|
||||
def test_statusline_emits_private_indicator(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
add_private(tmp_path, "abc-123")
|
||||
payload = json.dumps({"session_id": "abc-123"})
|
||||
result = runner.invoke(statusline_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert "agnes-private" in result.output
|
||||
|
||||
|
||||
def test_statusline_silent_for_non_private(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
payload = json.dumps({"session_id": "abc-123"})
|
||||
result = runner.invoke(statusline_app, [], input=payload)
|
||||
assert result.exit_code == 0
|
||||
assert "agnes-private" not in result.output
|
||||
|
||||
|
||||
def test_statusline_silent_on_malformed_stdin(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(statusline_app, [], input="not json {")
|
||||
assert result.exit_code == 0
|
||||
assert result.output.strip() == ""
|
||||
|
||||
|
||||
def test_statusline_silent_on_missing_session_id(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(statusline_app, [], input='{"foo": "bar"}')
|
||||
assert result.exit_code == 0
|
||||
assert result.output.strip() == ""
|
||||
|
||||
|
||||
def test_statusline_silent_on_empty_stdin(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(statusline_app, [], input="")
|
||||
assert result.exit_code == 0
|
||||
assert result.output.strip() == ""
|
||||
|
||||
|
||||
def test_statusline_silent_when_payload_not_object(tmp_path, monkeypatch):
|
||||
monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
|
||||
result = runner.invoke(statusline_app, [], input='["array"]')
|
||||
assert result.exit_code == 0
|
||||
assert result.output.strip() == ""
|
||||
13
uv.lock
13
uv.lock
|
|
@ -21,7 +21,7 @@ wheels = [
|
|||
|
||||
[[package]]
|
||||
name = "agnes-the-ai-analyst"
|
||||
version = "0.47.4"
|
||||
version = "0.48.0"
|
||||
source = { editable = "." }
|
||||
dependencies = [
|
||||
{ name = "a2wsgi" },
|
||||
|
|
@ -32,6 +32,7 @@ dependencies = [
|
|||
{ name = "duckdb" },
|
||||
{ name = "dulwich" },
|
||||
{ name = "fastapi" },
|
||||
{ name = "filelock" },
|
||||
{ name = "google-api-python-client" },
|
||||
{ name = "google-cloud-bigquery" },
|
||||
{ name = "google-cloud-bigquery-storage" },
|
||||
|
|
@ -97,6 +98,7 @@ requires-dist = [
|
|||
{ name = "faker", marker = "extra == 'dev'", specifier = ">=24.0.0" },
|
||||
{ name = "fastapi", specifier = ">=0.115.0" },
|
||||
{ name = "fastapi-debug-toolbar", marker = "extra == 'dev'", specifier = ">=0.6.3" },
|
||||
{ name = "filelock", specifier = ">=3.13,<4" },
|
||||
{ name = "google-api-python-client", specifier = ">=2.0.0" },
|
||||
{ name = "google-cloud-bigquery", specifier = ">=3.0.0" },
|
||||
{ name = "google-cloud-bigquery-storage", specifier = ">=2.0.0" },
|
||||
|
|
@ -712,6 +714,15 @@ wheels = [
|
|||
{ url = "https://files.pythonhosted.org/packages/a7/09/b01c8c60d608e93faa257a9a3fc9a9739c70c9ac926edfbbe92baebbf33b/fastapi_debug_toolbar-0.6.3-py3-none-any.whl", hash = "sha256:077b5ffeb10426c49387ef090cf307a316d5c71c69bbffe270d9898d2554429e", size = 42904, upload-time = "2024-05-04T17:03:11.585Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "filelock"
|
||||
version = "3.29.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/b5/fe/997687a931ab51049acce6fa1f23e8f01216374ea81374ddee763c493db5/filelock-3.29.0.tar.gz", hash = "sha256:69974355e960702e789734cb4871f884ea6fe50bd8404051a3530bc07809cf90", size = 57571, upload-time = "2026-04-19T15:39:10.068Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/81/47/dd9a212ef6e343a6857485ffe25bba537304f1913bdbed446a23f7f592e1/filelock-3.29.0-py3-none-any.whl", hash = "sha256:96f5f6344709aa1572bbf631c640e4ebeeb519e08da902c39a001882f30ac258", size = 39812, upload-time = "2026-04-19T15:39:08.752Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "fonttools"
|
||||
version = "4.62.1"
|
||||
|
|
|
|||
Loading…
Reference in a new issue