agnes-the-ai-analyst/tests/test_cli_push.py
minasarustamyan 19c5a7592a
Session capture queue, private session, and setup-prompt fixes (#242)
* Capture session paths via SessionStart hook + lock parallel pushes

Replace the encoding-based scan of ~/.claude/projects/<encoded-cwd>/ with
a queue file populated by a new `agnes capture-session` SessionStart hook.
The hook reads the documented `transcript_path` field from Claude Code's
hook stdin JSON, sidestepping the cwd-to-folder encoding (which is an
internal implementation detail and varies by Claude Code version).

- New `agnes capture-session` subcommand appends transcript_path to
  <workspace>/.claude/agnes-sessions.txt. Silent on all malformed input
  so a hook chain failure doesn't clutter Claude Code startup.
- `agnes push` now consumes the queue: atomic snapshot rename guards
  against hooks writing during the push window, successful uploads land
  in agnes-sessions-uploaded.txt (TSV: timestamp + path), failed paths
  are requeued.
- Cross-platform single-instance lock via the filelock package (fcntl
  on POSIX, msvcrt on Windows). Concurrent SessionEnd hooks — common
  when the user closes several sessions at once — silent-exit on the
  losing side instead of all racing the upload.
- Recovery: pre-existing snapshot files from a crashed push are picked
  up and processed before the live queue.
- The SessionStart `agnes push` self-heal entry is dropped — it became
  redundant once the queue persists across runs (orphans from headless /
  crashed sessions ship out on the next interactive SessionEnd push).
  Existing workspaces auto-migrate via the marker-based replace logic.
- Legacy encoding scan stays available behind `--legacy-scan` for one-
  off backfills of sessions predating the queue.
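
As an illustration only, the capture and snapshot steps listed above could look roughly like the sketch below. The function names `capture` and `snapshot`, the snapshot file name, and the one-path-per-line layout are assumptions made for this example; only the `transcript_path` field and the `agnes-sessions.txt` queue file come from the description above.

```python
import json
import os
import uuid
from pathlib import Path
from typing import Optional

QUEUE_NAME = "agnes-sessions.txt"


def capture(workspace: Path, hook_stdin: str) -> bool:
    """Append transcript_path from the hook's stdin JSON to the queue.

    Returns False (and writes nothing) on any malformed input, so a
    broken hook payload never raises.
    """
    try:
        path = json.loads(hook_stdin)["transcript_path"]
    except (ValueError, KeyError, TypeError):
        return False
    if not isinstance(path, str) or not path:
        return False
    queue = workspace / ".claude" / QUEUE_NAME
    queue.parent.mkdir(parents=True, exist_ok=True)
    with queue.open("a", encoding="utf-8") as fh:
        fh.write(path + "\n")
    return True


def snapshot(workspace: Path) -> Optional[Path]:
    """Move the live queue aside in one rename; None if there is no queue."""
    queue = workspace / ".claude" / QUEUE_NAME
    # Hypothetical snapshot name, chosen only to be unique per call.
    snap = queue.with_name(f"agnes-sessions.snapshot.{uuid.uuid4().hex}.txt")
    try:
        os.rename(queue, snap)
    except FileNotFoundError:
        return None
    return snap
```

Because the live queue is renamed before it is read, a hook that fires during the push window simply starts a fresh queue file, which the next push picks up.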

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add /agnes-private + statusLine indicator for private sessions

Users handling sensitive data inside Claude Code can now opt a session
out of the Agnes upload pipeline, either proactively (right after session
start) or reactively (mid-session). The `/agnes-private` slash command
runs `agnes mark-private` deterministically via `!`-prefix direct bash —
no AI in the loop. A workspace-installed statusLine surfaces a
`🔒 agnes-private` indicator in Claude Code's status bar so the user
sees the state at a glance.

Authoritative source of "do not upload" is a separate file
`<workspace>/.claude/agnes-sessions-private.txt` (one session_id per
line). Both `capture-session` (queue writer) and `push` (queue reader)
consult the list. This makes the slash-command / SessionStart-hook race
impossible by construction: whichever runs first, the session is correctly
filtered out.

- `agnes mark-private` reads `CLAUDE_CODE_SESSION_ID` from env (set by
  Claude Code in every bash subprocess it spawns — stable documented API)
  and appends to the private list.
- `agnes statusline` reads the session JSON Claude Code pipes on stdin,
  checks the private list, and emits the indicator or nothing. Optimized
  for the high call frequency of statusLine renders.
- `capture-session` extracts session_id from hook stdin and skips queue
  write when the ID is already on the private list (race protection).
- `push` filters snapshot entries by the private list and appends to a
  per-workspace audit log `agnes-sessions-private-skipped.txt`.
- Queue format migrated from `<path>` to `<session_id>\t<path>`; legacy
  one-column lines still parse (empty session_id, still upload, can't be
  marked private retroactively — fine, they pre-date the feature).
- `install_claude_hooks` writes a workspace statusLine unless the user
  already has a custom one (warn + preserve). Idempotent re-init.
- `install_claude_commands` ships `agnes-private.md` alongside
  `update-agnes-plugins.md`. Per-template fallback so a missing template
  doesn't get clobbered with the wrong content.
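
The "both writer and reader consult the list" design above can be sketched as follows. This is a simplified illustration, not the shipped code: the function names and signatures are hypothetical and take the `.claude` directory directly. It shows why the order of `mark-private` and `capture-session` does not matter.

```python
from pathlib import Path

QUEUE = "agnes-sessions.txt"
PRIVATE = "agnes-sessions-private.txt"


def read_private(claude_dir: Path) -> set:
    """Return the session ids marked private (empty if no list exists)."""
    try:
        text = (claude_dir / PRIVATE).read_text(encoding="utf-8")
    except FileNotFoundError:
        return set()
    return {line.strip() for line in text.splitlines() if line.strip()}


def mark_private(claude_dir: Path, session_id: str) -> None:
    with (claude_dir / PRIVATE).open("a", encoding="utf-8") as fh:
        fh.write(session_id + "\n")


def capture(claude_dir: Path, session_id: str, transcript: str) -> bool:
    """Queue writer: skip the write when the session is already private."""
    if session_id in read_private(claude_dir):
        return False
    with (claude_dir / QUEUE).open("a", encoding="utf-8") as fh:
        fh.write(f"{session_id}\t{transcript}\n")
    return True


def uploadable(claude_dir: Path) -> list:
    """Queue reader: parse entries and drop any whose id is private."""
    private = read_private(claude_dir)
    try:
        lines = (claude_dir / QUEUE).read_text(encoding="utf-8").splitlines()
    except FileNotFoundError:
        return []
    entries = []
    for line in lines:
        if not line:
            continue
        session_id, sep, path = line.partition("\t")
        if not sep:  # legacy one-column line: path only, empty session id
            session_id, path = "", line
        if session_id and session_id in private:
            continue
        entries.append((session_id, path))
    return entries
```

If the session is marked first, the writer never queues it; if it is captured first, the reader filters it out. Legacy one-column lines carry no session id, so they always pass the filter.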

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix setup-prompt + CLAUDE.md marketplace copy + drop skills step

Three issues against the post-PR-#240 / post-PR-#237 state:

1. Setup prompt's marketplace block trailer (both has-stack and
   empty-stack variants) claimed the SessionStart hook keeps the
   marketplace clone in sync via `agnes refresh-marketplace --quiet`
   on every session and that admin grants land automatically — both
   false since PR #237 (0.47.x) moved the install/update path out of
   the hook into the `/update-agnes-plugins` slash command. The hook
   is `--check`-only: detects server-side changes, prompts the user
   to run the slash command, which does the full reconcile
   interactively with output visible in the transcript.

2. The empty-stack variant framed composition as "admin grants only",
   missing the actual three-source served stack:
     (admin RBAC ∩ /marketplace subscriptions)
       ∪ system-mandatory plugins (admin-pinned, auto-applied)
       ∪ Flea market installs (skills/agents bundled, plugins standalone)
   Updated copy spells out all three sources so analysts know where
   their stack picks live, and what the SessionStart hook actually
   does on change detection.

3. CLAUDE.md template's "Agnes Marketplace" section conflated
   eligibility (`resolve_allowed_plugins` — what's listed) with served
   stack (`resolve_user_marketplace` — what actually reaches Claude
   Code). The two are different: a user can be RBAC-eligible for a
   plugin without having subscribed to it on /marketplace. Rewrote
   the section to distinguish the eligibility set from the served
   stack and to describe the `--check`-only hook accurately.

Plus: deleted the setup prompt's interactive Skills step (final step
before Confirm). The named-opinion question — "do you want me to
bulk-copy every skill into ~/.claude/skills/agnes/ or pull on-demand
via `agnes skills show <name>`?" — had no obvious right answer for
new users at the tail end of a wall of technical steps. On-demand
lookup is the one-size-fits-all default; `agnes skills list/show`
remain discoverable and the CLAUDE.md template references specific
skills inline (e.g. agnes-data-querying in the BigQuery section)
where they're relevant. Layout: Confirm shifts from step 9 to step 8.

Tests updated, full setup/marketplace/welcome surface green (115
passed). Remaining full-suite failures are pre-existing (BQ/Keboola
fixtures, Windows charmap collection error in test_v26_keboola_e2e)
— verified against a clean stash, unrelated to this diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix session-queue race + snapshot PID-reuse data loss

Two blocker fixes from the PR #242 review:

1. Concurrent SessionStart hooks could corrupt the queue file on
   Windows. Python's `open(path, "a")` is not atomic there — the CRT
   does not pass FILE_APPEND_DATA to CreateFile, so concurrent
   appenders (user opening several Claude Code windows simultaneously)
   could interleave bytes mid-line. The malformed lines then silently
   fail the parser and the entries are dropped.

   Fix: wrap append_to_queue, requeue_failed, and snapshot_queue in a
   short-lived FileLock on a dedicated `agnes-queue.lock`. Separate
   from `agnes-push.lock` so capture-session hooks don't block on the
   push command. New test_append_concurrent_threads_no_corruption
   reproduces the race with 4 threads x 50 appends.

2. Snapshot filenames embedded only the PID
   (`agnes-sessions.snapshot.<PID>.txt`). After a crashed push left a
   snapshot on disk and the OS recycled the PID for a new push,
   `os.rename` would atomically overwrite the recovery snapshot,
   silently losing every entry in it.

   Fix: append a uuid8 hex tail
   (`agnes-sessions.snapshot.<PID>.<uuid8>.txt`).
   find_recovery_snapshots already globs the prefix so it picks up
   both old and new format. New
   test_snapshot_filename_is_unique_per_call asserts two consecutive
   snapshots under the same PID don't collide.
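
A minimal sketch of the naming fix, assuming "uuid8" means the first 8 hex characters of a random UUID. `snapshot_name` and `is_recovery_snapshot` are hypothetical helper names, not the functions in the codebase.

```python
import fnmatch
import os
import uuid
from typing import Optional

PREFIX = "agnes-sessions.snapshot."


def snapshot_name(pid: Optional[int] = None) -> str:
    """Build a snapshot file name that is unique even when the PID repeats."""
    pid = os.getpid() if pid is None else pid
    return f"{PREFIX}{pid}.{uuid.uuid4().hex[:8]}.txt"


def is_recovery_snapshot(name: str) -> bool:
    """Prefix glob: matches both the old PID-only and the new PID+uuid form."""
    return fnmatch.fnmatchcase(name, PREFIX + "*.txt")
```

Since recovery matches on the prefix alone, snapshots left behind by an older CLI version would still be picked up after the upgrade.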

Targeted tests green (47/47 in session_queue/capture_session/cli_push).
Full suite failures unchanged from baseline (pre-existing BQ/Keboola
fixture issues per CLAUDE.md).

* Auto-refresh workspace hooks + bash-wrap all hook entries (Windows)

Fixes from PR #242 second review (ZdenekSrotyr):

1. `uv.lock` regenerated to include `filelock 3.29.0` (declared in
   pyproject.toml but missing from the lock file — CI's
   lockfile-consistency check would fail; `uv pip install` on a clean
   cache would silently miss the dep).

2. `agnes self-upgrade` now auto-refreshes the workspace Claude Code
   hooks via the new `cli.lib.hooks.maybe_refresh_claude_hooks`. Closes
   the silent-stop migration gap: a v0.48 workspace would auto-upgrade
   the CLI from its existing SessionStart self-upgrade entry but never
   pick up the new `agnes capture-session` SessionStart hook, leaving
   the queue empty and `agnes push` uploading nothing.

   The refresh fires on both the "info is None" fast path (CLI already
   current — catches the second SessionStart after a prior upgrade)
   and the install-success path. Guarded by `workspace_has_agnes_hooks`
   so it never writes `.claude/settings.json` into directories that
   aren't Agnes workspaces (e.g. `agnes self-upgrade` invoked from
   `~/`). Errors are surfaced on stderr but never flip the upgrade exit
   code.

3. All Agnes-managed hooks are now wrapped in `bash -c "..."`. The
   self-upgrade+pull chained SessionStart entry was the only one still
   shipping unwrapped — Claude Code on Windows runs hook commands
   directly without a shell, so the `;` chain + `2>/dev/null` +
   `|| true` shell syntax silently no-op'd on native Windows installs
   without Git Bash on PATH. Workspaces still on the old form
   auto-upgrade via the refresh path above.

Tests: +12 in test_lib_hooks.py (guard semantics, v0.48→v0.49
migration end-to-end, third-party-hook preservation, bash-wrap
invariant). +5 in test_self_upgrade.py (refresh fires on info=None,
fires on install success, skipped on failure, skipped on --check-only,
refresh failure never flips exit code).

130 targeted tests green. The 2 pre-existing Windows path-separator
failures in `test_smoke_test_detects_version_mismatch[uv|pip]` are
unrelated (path mismatch `\fake\uv\bin\agnes` vs `/fake/uv/bin/agnes`
in test asserts, pre-PR baseline).

* CHANGELOG: document PR-242 main features

Closes ZdenekSrotyr #4: the [Unreleased] block was missing entries for
the PR's primary surface — only the post-merge fix bullets and the
unrelated setup-prompt copy change were captured. Adds:

- ### Added: 6 bullets covering the session capture queue + new
  `agnes capture-session` subcommand, `/agnes-private` slash + `agnes
  mark-private`, `agnes statusline` + statusLine wiring, `--legacy-scan`
  opt-in fallback, single-instance push lock, and the new `filelock`
  runtime dep.

- ### Changed: BREAKING bullet on the SessionStart / SessionEnd hook
  wire format change (capture-session as first SessionStart entry,
  push self-heal removed, SessionEnd push detached via nohup, all
  entries bash-wrapped). Folds the prior standalone bash-wrap bullet
  into this consolidated entry — Z's review flagged the layout shift
  as BREAKING, and grouping the related sub-changes makes the
  migration story readable in one place.

- Operator migration is auto-handled by `maybe_refresh_claude_hooks`
  invoked from `agnes self-upgrade` (separate Changed entry below).
  No `agnes init` re-run required. Pre-queue session jsonls on
  upgrading workspaces still need a one-off `agnes push --legacy-scan`
  — flagged in the BREAKING bullet.

No code change; doc only.

* Drop permanent 4xx uploads instead of requeueing forever

Closes ZdenekSrotyr #5. Previously the push retry path requeued any
non-200 response except the literal "file not found on disk", so 401
(token expired), 403 (RBAC denial), 413 (payload too large), 400
(server-side validation) cycled through every push run forever — the
queue grew without bound and each run re-bombarded the server with the
same deterministically-failing upload.

Now 4xx (except 408 Request Timeout + 429 Too Many Requests, which the
HTTP spec marks as transient) is dropped and audit-logged to
`<workspace>/.claude/agnes-sessions-failed.txt`:

    <iso_ts>\t<session_id>\t<status>\t<transcript_path>

5xx and network errors continue to requeue — those reflect server /
transport state that can change between runs, so retry is the right
behavior.
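
The retry policy described above can be summarized in a small classifier. This is a sketch under stated assumptions: `classify` and `audit_line` are hypothetical names, `None` stands in for a network error, and the treatment of non-200 responses outside the 4xx/5xx ranges (requeue) is a guess, since the text does not specify it.

```python
from typing import Optional

# 4xx codes that signal "try again later" rather than "request is invalid".
TRANSIENT_4XX = frozenset({408, 429})


def classify(status: Optional[int]) -> str:
    """Map an upload outcome to 'uploaded', 'drop', or 'requeue'.

    status=None represents a network/transport error (no response).
    """
    if status is None:
        return "requeue"
    if status == 200:
        return "uploaded"
    if 400 <= status < 500 and status not in TRANSIENT_4XX:
        return "drop"  # permanent: audit-log it, do not retry
    return "requeue"  # 5xx, 408, 429 (and, by assumption, anything else)


def audit_line(iso_ts: str, session_id: str, status: int, path: str) -> str:
    """Format one TSV row in the documented failed-log layout."""
    return f"{iso_ts}\t{session_id}\t{status}\t{path}\n"
```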

The audit log piggybacks on the push single-instance lock
(agnes-push.lock) — push is the only writer to this file, same as the
existing `mark_uploaded` and `mark_private_skipped` paths, so no
separate filelock is needed.

`agnes push --json` surfaces a new `dropped_permanent` counter; non-
quiet stdout mentions the audit-log path so operators tailing the
output have a pointer to the forensic trail.

Tests: +7 in test_cli_push.py (401/400/403/413 → drop; 408/429 →
requeue; 500/502/503 → requeue; network exception → requeue;
--json `dropped_permanent` counter; stdout audit-log pointer). +1 in
test_session_queue.py (mark_failed_permanent TSV format).

127/129 targeted tests green. The 2 pre-existing Windows
path-separator failures in `test_smoke_test_detects_version_mismatch
[uv|pip]` are unrelated (path mismatch `\fake\uv\bin\agnes` vs
`/fake/uv/bin/agnes` in test asserts, pre-PR baseline).

* Catch OSError in push lock acquisition

Closes ZdenekSrotyr #8. `acquire_or_skip` in `cli/lib/push_lock.py`
previously caught only `filelock.Timeout`. Any `OSError` from
`FileLock.acquire` — read-only filesystem, permission denied on
`.claude/`, disk full, hardware I/O failure — propagated as an
unhandled traceback.

Two visible failure modes:
- SessionEnd hook: `|| true` in the wrapper swallowed the error, so
  daily pushes silently never ran. Operator had no signal.
- Manual `agnes push`: ugly Python traceback dumped to the terminal
  instead of a clean exit.

Now `OSError` is treated the same as `Timeout` — yield `None`, caller
returns cleanly with rc=0. The operator's environment in these
scenarios has bigger problems than missing session uploads, so we
swallow rather than retry-loop or surface a noisy warning.
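
The shape of that handler can be sketched without the `filelock` dependency. In the sketch below `LockBusy` is a stand-in for `filelock.Timeout`, and `acquire_or_skip_sketch` takes a ready-made lock object; both are assumptions for illustration, not the real `cli/lib/push_lock.py` signature.

```python
from contextlib import contextmanager


class LockBusy(Exception):
    """Stand-in for filelock.Timeout in this sketch."""


@contextmanager
def acquire_or_skip_sketch(lock):
    """Yield the lock if acquired, or None if it is busy or unusable."""
    try:
        lock.acquire()
    except (LockBusy, OSError):
        # Busy lock and broken filesystem are handled identically:
        # the caller sees None and returns cleanly.
        yield None
        return
    try:
        yield lock
    finally:
        lock.release()
```

A caller would then write `with acquire_or_skip_sketch(lock) as held:` and return early with exit code 0 when `held is None`.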

Test: `test_push_silent_exit_when_filelock_raises_oserror` patches
the `FileLock` used inside `push_lock` to raise OSError on acquire,
verifies push exits 0 with no traceback and the queue is preserved
for the next attempt.

* Address remaining S2 items from PR-242 review

Four items from ZdenekSrotyr's S2 list:

S2.10 — `_install_statusline` truthy check (cli/lib/hooks.py): replace
`if existing:` with explicit `if existing is None or existing == "":`.
Documents and tests the behavior for both edge cases (explicit-null
and empty-string `statusLine`) — both treated as "not configured"
rather than "explicit user opt-out", so we install ours. Two new
tests in test_lib_hooks.py pin the contract.
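
The contract pinned by those tests amounts to the following check. `should_install_statusline` is a hypothetical name and the settings dict shape is assumed; the actual `_install_statusline` may be structured differently.

```python
def should_install_statusline(settings: dict) -> bool:
    """True when no statusLine is configured (missing, null, or empty string)."""
    existing = settings.get("statusLine")
    return existing is None or existing == ""
```

Any other value, including a user's custom command, counts as configured and is preserved.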

S2.6 — onboarding docs for /agnes-private. New "Private sessions"
subsection in `config/claude_md_template.txt` (next to Data Sync)
covering the slash command, statusbar indicator, and audit-log
location. One-line tip in `app/web/setup_instructions.py` so the
feature is discoverable at onboarding.

S2.9 — e2e privacy test (tests/test_e2e_privacy.py). Wires
capture_session → mark_private → push against a recording fake
api_post and asserts zero session uploads for the marked one.
Three cases: mark-before-capture (queue write skipped),
mark-after-capture (push-side filter catches it + audit-logs),
control (unmarked sessions upload normally).

David #8 — `--legacy-scan` help text now documents the
private-list gap (legacy entries carry empty session_id, so
the filter is not consulted). The practical impact is bounded —
pre-queue sessions cannot have been marked private since the
private list is a queue-era feature — but the disclaimer in the
help text means an operator running a backfill is not surprised.

68 targeted tests green (3 new e2e + 2 new truthy edge tests +
existing). 2 pre-existing Windows path-separator failures in
test_smoke_test_detects_version_mismatch[uv|pip] unchanged.

Remaining S2 items (statusline mkdir push-back, capture-session
silent-fail follow-up) handled in PR comment + follow-up issue
respectively.

* Address remaining S2 follow-ups (David #8, S2.7, David #11)

Three items left over from Mina's bbf63472 batch — that commit
addressed S2.6/S2.9/S2.10 + documented David #8 in help text but
deferred the actual implementations of S2.7, David #11, and the real
David #8 fix to follow-ups. This commit closes them.

David #8 — `agnes push --legacy-scan` now consults the private list.
Claude Code names jsonls `<session-id>.jsonl`, so the file stem IS
the session id; the legacy-scan path can apply the same private filter
the queue path uses. Both the dry-run and live-upload code paths fixed.
Help text updated (no longer warns the filter is bypassed). Two new
tests in test_cli_push.py cover the upload-skip path + the dry-run
`would_skip_private` segregation.

S2.7 — `statusline`/`is_private` no longer mkdir-pollutes arbitrary
workdirs. Split `_claude_dir` into `_claude_dir_writable` (used only
from `add_private`) and `_claude_dir_readonly` (no mkdir). The
read-only public helpers (`private_list_path`, `read_all_private`,
`is_private`) compose the no-mkdir variant by default; `add_private`
opts in via `writable=True`. Added a process-local mtime-keyed cache
around `read_all_private` so in-process callers (push doing one stat
per upload candidate, future `agnes diagnose`) don't re-parse the
file on every check. Cache eviction on `add_private` so a sub-second
write+read sequence doesn't see stale data even on coarse-mtime
filesystems. Two new tests pin the no-mkdir contract + the
in-same-second add+read consistency.
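
A rough sketch of an mtime-keyed cache with eviction on write, as described above. The names `read_ids_cached` and `add_id` and the path-based signatures are assumptions for this example, and the write path here does not create directories.

```python
import os
from pathlib import Path

# path -> (mtime in ns at last parse, parsed ids)
_cache: dict = {}


def read_ids_cached(path: Path) -> frozenset:
    """Parse the list file, reusing the last result while mtime is unchanged."""
    key = str(path)
    try:
        mtime = os.stat(path).st_mtime_ns
    except FileNotFoundError:
        _cache.pop(key, None)
        return frozenset()  # read path never creates anything
    hit = _cache.get(key)
    if hit is not None and hit[0] == mtime:
        return hit[1]
    text = path.read_text(encoding="utf-8")
    ids = frozenset(ln.strip() for ln in text.splitlines() if ln.strip())
    _cache[key] = (mtime, ids)
    return ids


def add_id(path: Path, session_id: str) -> None:
    """Append an id and evict the cache entry, so the next read re-parses
    even if the filesystem's mtime granularity hides the write."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(session_id + "\n")
    _cache.pop(str(path), None)
```

The explicit eviction is what keeps a write followed immediately by a read consistent; relying on mtime alone would not be enough on filesystems with coarse timestamps.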

David #11 — `agnes capture-session` writes a breadcrumb log on every
invocation. New `<workspace>/.claude/agnes-capture-session.log` TSV:
`<iso_ts>\t<outcome>\t<detail>` where outcome covers every silent-
exit path (`ok`, `private_skip`, `empty_stdin`, `bad_json`,
`not_object`, `no_transcript_path`, `stdin_read_error`,
`write_error`). Gives operators a signal to detect "hook fires but
queue stays empty" — without it, an upstream Claude Code stdin-
contract change is invisible because the hook always exits 0. Log
rolls at 256 KiB so it doesn't grow unbounded on long-lived
workspaces. Best-effort: a breadcrumb-write failure is itself
swallowed so the hook contract stays "exit 0 always". Skipped in
non-Agnes workdirs (no `.claude/` exists) so opening Claude Code
in `~/` doesn't pollute it. Five new tests in test_capture_session.py
cover the success / bad_json / no_transcript_path / private_skip /
no-pollute paths.
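
A best-effort breadcrumb writer along these lines might look like the sketch below. The function name, the rename-to-`.1` rolling strategy, and the sanitizing of tabs and newlines in `detail` are assumptions; the text above only fixes the file name, the TSV layout, and the 256 KiB threshold.

```python
import os
from datetime import datetime, timezone
from pathlib import Path

MAX_BYTES = 256 * 1024
LOG_NAME = "agnes-capture-session.log"


def breadcrumb(claude_dir: Path, outcome: str, detail: str = "") -> None:
    """Append '<iso_ts>\\t<outcome>\\t<detail>'; never raise, never mkdir."""
    try:
        if not claude_dir.is_dir():
            return  # not an Agnes workspace: leave the directory untouched
        log = claude_dir / LOG_NAME
        if log.exists() and log.stat().st_size >= MAX_BYTES:
            # Assumed rolling strategy: keep one previous generation.
            os.replace(log, log.with_name(LOG_NAME + ".1"))
        ts = datetime.now(timezone.utc).isoformat()
        safe = detail.replace("\t", " ").replace("\n", " ")
        with log.open("a", encoding="utf-8") as fh:
            fh.write(f"{ts}\t{outcome}\t{safe}\n")
    except OSError:
        pass  # a failed breadcrumb must not change the hook's exit code
```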

115 targeted tests green (test_cli_push, test_capture_session,
test_private_list, test_session_queue, test_e2e_privacy,
test_lib_hooks, test_statusline, test_mark_private).

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-05-11 13:31:16 +00:00


"""Tests for agnes push command (SessionEnd uploader)."""
import json
import re
from contextlib import contextmanager

from typer.testing import CliRunner

from cli.commands.push import push_app
from cli.lib.private_list import add_private
from cli.lib.session_queue import (
    append_to_queue,
    failed_log_path,
    private_skipped_log_path,
    queue_path,
    uploaded_log_path,
)

# CI-safety: Typer/rich emits ANSI escapes in --help output. Strip before asserts.
_ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def _clean(s: str) -> str:
    return _ANSI_RE.sub("", s)


runner = CliRunner()


class _FakeResp:
    def __init__(self, status_code: int = 200) -> None:
        self.status_code = status_code


def _stub_config(monkeypatch) -> None:
    monkeypatch.setattr("cli.commands.push.get_server_url", lambda: "http://x")
    monkeypatch.setattr("cli.commands.push.get_token", lambda: "test-pat")


def _record_uploads(monkeypatch) -> list[tuple[str, dict]]:
    """Patch api_post to record calls and return success. Returns the recorder list."""
    calls: list[tuple[str, dict]] = []

    def _fake(endpoint, **kwargs):
        calls.append((endpoint, kwargs))
        return _FakeResp(200)

    monkeypatch.setattr("cli.commands.push.api_post", _fake)
    return calls


# ---------- Smoke + dry-run --------------------------------------------------


def test_push_help():
    result = runner.invoke(push_app, ["--help"])
    assert result.exit_code == 0
    assert "--quiet" in _clean(result.output)
    assert "--json" in _clean(result.output)
    assert "--dry-run" in _clean(result.output)
    assert "--legacy-scan" in _clean(result.output)


def test_push_no_sessions_no_mkdir(tmp_path, monkeypatch):
    """Empty workspace -> push exits 0, doesn't create user/sessions/."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    assert not (tmp_path / "user" / "sessions").exists(), \
        "lazy mkdir: nothing to upload must not create user/sessions/"


def test_push_dry_run_no_writes(tmp_path, monkeypatch):
    """--dry-run lists what would upload but sends nothing."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    transcript = tmp_path / "abc.jsonl"
    transcript.write_text('{"event":"test"}\n')
    append_to_queue(tmp_path, "sid-1", str(transcript))

    def _raise(*a, **kw):
        raise AssertionError("api_post was called during --dry-run")

    monkeypatch.setattr("cli.commands.push.api_post", _raise)
    result = runner.invoke(push_app, ["--dry-run"])
    assert result.exit_code == 0
    assert queue_path(tmp_path).exists()  # not consumed


# ---------- Queue happy path + dedup + lock + recovery ----------------------


def test_push_uploads_queued_session_and_clears_queue(tmp_path, monkeypatch):
    """Happy path: queue has one session, push uploads it, queue cleared, log written."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    transcript = tmp_path / "abc.jsonl"
    transcript.write_text('{"event":"test"}\n')
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
    assert len(sessions_calls) == 1
    assert not queue_path(tmp_path).exists()
    log = uploaded_log_path(tmp_path).read_text(encoding="utf-8")
    assert str(transcript) in log
    assert "\t" in log


def test_push_dedups_duplicate_paths_in_queue(tmp_path, monkeypatch):
    """Resume scenario: same (session_id, path) queued twice, push uploads once."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    transcript = tmp_path / "abc.jsonl"
    transcript.write_text('{"event":"test"}\n')
    append_to_queue(tmp_path, "sid-1", str(transcript))
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
    assert len(sessions_calls) == 1


def test_push_silent_exit_when_lock_held(tmp_path, monkeypatch):
    """Concurrent SessionEnd hooks: only one push runs, others silent-exit."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)

    @contextmanager
    def _yield_none(workspace):
        yield None

    monkeypatch.setattr("cli.commands.push.acquire_or_skip", _yield_none)

    def _raise(*a, **kw):
        raise AssertionError("api_post called when lock unavailable")

    monkeypatch.setattr("cli.commands.push.api_post", _raise)
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    assert result.output == ""
    assert queue_path(tmp_path).read_text(encoding="utf-8") == f"sid-1\t{transcript}\n"


def test_push_silent_exit_when_filelock_raises_oserror(tmp_path, monkeypatch):
    """OSError from filelock (read-only FS, permission denied, disk full)
    must not crash push with an unhandled traceback. Exercises the real
    acquire_or_skip by swapping the FileLock it uses for a stand-in whose
    acquire raises OSError, simulating what filelock.FileLock.acquire
    raises on a read-only mount."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    # Queue setup must happen BEFORE we install the failing lock: the
    # `append_to_queue` path holds its own `agnes-queue.lock` and a
    # blanket `FileLock.acquire` patch would break that one too.
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-1", str(transcript))
    # We can't just patch `cli.commands.push.acquire_or_skip`, because
    # the new behaviour lives inside `acquire_or_skip` itself and we
    # have to exercise its except handler. So we patch the `FileLock`
    # name used inside push_lock with a stand-in class whose `acquire`
    # raises OSError.
    from cli.lib import push_lock as pl

    class _BrokenLock:
        def __init__(self, path: str) -> None:
            self._path = path

        def acquire(self, timeout: float = -1):
            raise OSError("read-only filesystem")

    monkeypatch.setattr(pl, "FileLock", _BrokenLock)

    def _api_should_not_be_called(*a, **kw):
        raise AssertionError("api_post called when lock acquisition raised OSError")

    monkeypatch.setattr("cli.commands.push.api_post", _api_should_not_be_called)
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0, f"push must exit 0 on OSError, got: {result.output}"
    # No traceback in output
    assert "Traceback" not in result.output
    # Queue preserved for next push attempt
    assert queue_path(tmp_path).read_text(encoding="utf-8") == f"sid-1\t{transcript}\n"


def test_push_processes_recovery_snapshot_first(tmp_path, monkeypatch):
    """Pre-existing snapshot from a crashed push gets picked up."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    claude = tmp_path / ".claude"
    claude.mkdir(parents=True, exist_ok=True)
    recovery = claude / "agnes-sessions.snapshot.99999.txt"
    crashed_jsonl = tmp_path / "crashed.jsonl"
    crashed_jsonl.write_text("{}\n")
    recovery.write_text(f"sid-old\t{crashed_jsonl}\n", encoding="utf-8")
    fresh_jsonl = tmp_path / "fresh.jsonl"
    fresh_jsonl.write_text("{}\n")
    append_to_queue(tmp_path, "sid-new", str(fresh_jsonl))
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
    assert len(sessions_calls) == 2
    assert not recovery.exists()
    assert not queue_path(tmp_path).exists()


def test_push_skips_stale_queue_entry(tmp_path, monkeypatch):
    """Queue entry pointing to a deleted file: skipped, not retried forever."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)

    def _raise(*a, **kw):
        raise AssertionError("api_post should not be called for missing file")

    monkeypatch.setattr("cli.commands.push.api_post", _raise)
    append_to_queue(tmp_path, "sid-1", str(tmp_path / "ghost.jsonl"))
    result = runner.invoke(push_app, [])
    assert result.exit_code == 0
    assert not queue_path(tmp_path).exists()
    assert not uploaded_log_path(tmp_path).exists()


def test_push_requeues_failed_uploads(tmp_path, monkeypatch):
    """Server returns 500 → path stays in queue for next push."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)

    def _fail(*a, **kw):
        return _FakeResp(500)

    monkeypatch.setattr("cli.commands.push.api_post", _fail)
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, [])
    assert result.exit_code == 0
    assert queue_path(tmp_path).read_text(encoding="utf-8") == f"sid-1\t{transcript}\n"
    assert not uploaded_log_path(tmp_path).exists()


def test_push_uploads_local_md(tmp_path, monkeypatch):
    """CLAUDE.local.md uploaded when present."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    claude = tmp_path / ".claude"
    claude.mkdir()
    (claude / "CLAUDE.local.md").write_text("notes")
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    md_calls = [c for c in calls if c[0] == "/api/upload/local-md"]
    assert len(md_calls) == 1


def test_push_json_output(tmp_path, monkeypatch):
    """--json emits a single JSON object with results."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    _record_uploads(monkeypatch)
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, ["--json"])
    assert result.exit_code == 0
    data = json.loads(result.output.strip())
    assert data["sessions"] == 1
    assert data["errors"] == []
    assert data["private_skipped"] == 0


# ---------- Private filter tests --------------------------------------------


def test_push_skips_private_session_and_audit_logs(tmp_path, monkeypatch):
    """Queue contains a private session_id → no upload, audit log appended."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    transcript = tmp_path / "secret.jsonl"
    transcript.write_text("{}\n")
    add_private(tmp_path, "sid-private")
    append_to_queue(tmp_path, "sid-private", str(transcript))
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
    assert sessions_calls == [], "private session must NOT be uploaded"
    # Audit log entry written
    audit = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
    assert "sid-private" in audit
    assert str(transcript) in audit
    # Queue consumed (snapshot processed and discarded; private entry not requeued)
    assert not queue_path(tmp_path).exists()


def test_push_mixes_private_and_public_correctly(tmp_path, monkeypatch):
    """A push run with one private + one public session uploads only the public one."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    secret = tmp_path / "secret.jsonl"
    secret.write_text("{}\n")
    public = tmp_path / "public.jsonl"
    public.write_text("{}\n")
    add_private(tmp_path, "sid-secret")
    append_to_queue(tmp_path, "sid-secret", str(secret))
    append_to_queue(tmp_path, "sid-public", str(public))
    result = runner.invoke(push_app, ["--quiet"])
    assert result.exit_code == 0
    sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
    assert len(sessions_calls) == 1
    audit = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
    assert "sid-secret" in audit
    assert "sid-public" not in audit


def test_push_dry_run_shows_private_skip(tmp_path, monkeypatch):
    """--dry-run preview reports private-skipped count separately."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)

    def _raise(*a, **kw):
        raise AssertionError("api_post was called during --dry-run")

    monkeypatch.setattr("cli.commands.push.api_post", _raise)
    transcript = tmp_path / "secret.jsonl"
    transcript.write_text("{}\n")
    add_private(tmp_path, "sid-priv")
    append_to_queue(tmp_path, "sid-priv", str(transcript))
    result = runner.invoke(push_app, ["--dry-run"])
    assert result.exit_code == 0
    assert "1 private session" in result.output
    assert "sid-priv" in result.output
# ---------- 4xx permanent-failure handling -----------------------------------
def _stub_api_post_status(monkeypatch, status: int) -> None:
"""Patch api_post to always return the given status code."""
def _fixed(*a, **kw):
return _FakeResp(status)
monkeypatch.setattr("cli.commands.push.api_post", _fixed)


def test_push_drops_4xx_to_audit_log_not_requeue(tmp_path, monkeypatch):
    """4xx (here: 401 token expired) → drop + audit, no requeue.

    Closes the prior infinite-loop bug where every non-200 except
    `file not found on disk` was requeued forever."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    _stub_api_post_status(monkeypatch, 401)
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, [])
    assert result.exit_code == 0
    # Entry must NOT be in the live queue any more.
    assert not queue_path(tmp_path).exists() or \
        queue_path(tmp_path).read_text(encoding="utf-8") == ""
    # Audit log must record the drop with status + session_id + path.
    log = failed_log_path(tmp_path).read_text(encoding="utf-8")
    assert "\t401\t" in log
    assert "sid-1" in log
    assert str(transcript) in log


def test_push_drops_each_4xx_status(tmp_path, monkeypatch):
    """403, 413, 400 → all drop (not just 401)."""
    for status in (400, 403, 413):
        ws = tmp_path / f"ws-{status}"
        ws.mkdir()
        monkeypatch.setenv("AGNES_LOCAL_DIR", str(ws))
        _stub_config(monkeypatch)
        _stub_api_post_status(monkeypatch, status)
        transcript = ws / "x.jsonl"
        transcript.write_text("{}\n")
        append_to_queue(ws, f"sid-{status}", str(transcript))
        result = runner.invoke(push_app, [])
        assert result.exit_code == 0, (status, result.output)
        log = failed_log_path(ws).read_text(encoding="utf-8")
        assert f"\t{status}\t" in log, (status, log)


def test_push_requeues_408_and_429(tmp_path, monkeypatch):
    """408 Request Timeout + 429 Too Many Requests are transient per
    HTTP spec: the server is asking us to retry, not telling us the
    request is invalid. Must requeue, not drop."""
    for status in (408, 429):
        ws = tmp_path / f"ws-{status}"
        ws.mkdir()
        monkeypatch.setenv("AGNES_LOCAL_DIR", str(ws))
        _stub_config(monkeypatch)
        _stub_api_post_status(monkeypatch, status)
        transcript = ws / "x.jsonl"
        transcript.write_text("{}\n")
        append_to_queue(ws, f"sid-{status}", str(transcript))
        result = runner.invoke(push_app, [])
        assert result.exit_code == 0
        # Requeued → entry back in live queue.
        live = queue_path(ws).read_text(encoding="utf-8")
        assert f"sid-{status}\t{transcript}\n" == live
        # NOT in the failed audit log.
        assert not failed_log_path(ws).exists()


def test_push_requeues_5xx(tmp_path, monkeypatch):
    """5xx is a genuine server-side failure: the request was valid but the
    server couldn't honor it right now. Requeue for the next push."""
    for status in (500, 502, 503):
        ws = tmp_path / f"ws-{status}"
        ws.mkdir()
        monkeypatch.setenv("AGNES_LOCAL_DIR", str(ws))
        _stub_config(monkeypatch)
        _stub_api_post_status(monkeypatch, status)
        transcript = ws / "x.jsonl"
        transcript.write_text("{}\n")
        append_to_queue(ws, f"sid-{status}", str(transcript))
        result = runner.invoke(push_app, [])
        assert result.exit_code == 0
        live = queue_path(ws).read_text(encoding="utf-8")
        assert f"sid-{status}\t{transcript}\n" == live
        assert not failed_log_path(ws).exists()


def test_push_requeues_network_exception(tmp_path, monkeypatch):
    """Connection error / DNS / timeout: no status code from the server.
    Treat as transient: requeue rather than drop."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)

    def _raise(*a, **kw):
        raise ConnectionError("server unreachable")

    monkeypatch.setattr("cli.commands.push.api_post", _raise)
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-net", str(transcript))
    result = runner.invoke(push_app, [])
    assert result.exit_code == 0
    live = queue_path(tmp_path).read_text(encoding="utf-8")
    assert f"sid-net\t{transcript}\n" == live
    assert not failed_log_path(tmp_path).exists()


def test_push_4xx_drop_count_in_json_output(tmp_path, monkeypatch):
    """--json surfaces the new `dropped_permanent` counter so operators
    can pipe it into monitoring / scripts."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    _stub_api_post_status(monkeypatch, 401)
    transcript = tmp_path / "x.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-1", str(transcript))
    result = runner.invoke(push_app, ["--json"])
    assert result.exit_code == 0
    payload = json.loads(result.output)
    assert payload["dropped_permanent"] == 1
    assert payload["sessions"] == 0


def test_push_4xx_drop_visible_in_non_quiet_stdout(tmp_path, monkeypatch):
    """Non-quiet stdout mentions the audit-log path so operators tailing
    `agnes push` output get a pointer to the forensic trail."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    _stub_api_post_status(monkeypatch, 413)
    transcript = tmp_path / "huge.jsonl"
    transcript.write_text("{}\n")
    append_to_queue(tmp_path, "sid-big", str(transcript))
    # No --quiet flag: the permanent-failure notice must reach stdout.
    result = runner.invoke(push_app, [])
    assert result.exit_code == 0
    assert "agnes-sessions-failed.txt" in result.output
    assert "permanent failure" in result.output
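

# Illustrative sketch only, not the implementation under test. The
# status-handling tests above encode a retry policy: 4xx responses are dropped
# to the audit log, except 408 and 429, which are requeued along with 5xx
# responses and network exceptions. The helper name and its return labels are
# hypothetical; the real logic lives in `cli.commands.push` and may be
# structured differently. Treating any 2xx as success and requeueing
# everything else unclassified are assumptions of this sketch.
def _classify_upload_outcome_sketch(status):
    """Return "ok", "drop", or "requeue" for one upload attempt.

    `status` is the HTTP status code, or None when the request raised
    before any response arrived (connection error, DNS failure, timeout).
    """
    if status is None:
        return "requeue"  # no response from the server: treat as transient
    if 200 <= status < 300:
        return "ok"  # assumption: any 2xx counts as a successful upload
    if status in (408, 429):
        return "requeue"  # the server is asking for a retry
    if 400 <= status < 500:
        return "drop"  # permanent client error: audit-log it, never retry
    return "requeue"  # 5xx, plus anything unclassified (assumption)

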
# ---------- David #8: legacy-scan honors the private list -------------------
#
# `--legacy-scan` walks ~/.claude/projects/<encoded-cwd>/*.jsonl. Claude Code
# names jsonls `<session-id>.jsonl`, so the file stem IS the session id —
# the same private filter that protects queue uploads must apply. Without
# this, an operator running `agnes push --legacy-scan` to backfill old
# sessions would silently upload everything on disk.


def test_push_legacy_scan_skips_private_session(tmp_path, monkeypatch):
    """Legacy-scan picks up `<sid>.jsonl` from the projects dir; if the
    sid is on the private list, it must be skipped + audit-logged."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    calls = _record_uploads(monkeypatch)
    projects_dir = tmp_path / "projects-fake"
    projects_dir.mkdir()
    pub = projects_dir / "sid-public.jsonl"
    pub.write_text("{}\n")
    priv = projects_dir / "sid-private.jsonl"
    priv.write_text("{}\n")
    monkeypatch.setattr(
        "cli.lib.claude_sessions.list_session_files",
        lambda _w: [pub, priv],
    )
    add_private(tmp_path, "sid-private")
    result = runner.invoke(push_app, ["--legacy-scan", "--quiet"])
    assert result.exit_code == 0
    sessions_calls = [c for c in calls if c[0] == "/api/upload/sessions"]
    assert len(sessions_calls) == 1
    uploaded_path = sessions_calls[0][1]["files"]["file"][0]
    assert uploaded_path == "sid-public.jsonl"
    audit = private_skipped_log_path(tmp_path).read_text(encoding="utf-8")
    assert "sid-private" in audit
    assert str(priv) in audit


def test_push_legacy_scan_dry_run_segregates_private(tmp_path, monkeypatch):
    """Dry-run JSON shape: legacy-scan candidates surface in
    would_upload OR would_skip_private depending on private membership."""
    monkeypatch.setenv("AGNES_LOCAL_DIR", str(tmp_path))
    _stub_config(monkeypatch)
    projects_dir = tmp_path / "projects-fake"
    projects_dir.mkdir()
    public_jsonl = projects_dir / "sid-pub.jsonl"
    public_jsonl.write_text("{}\n")
    private_jsonl = projects_dir / "sid-priv.jsonl"
    private_jsonl.write_text("{}\n")
    monkeypatch.setattr(
        "cli.lib.claude_sessions.list_session_files",
        lambda _w: [public_jsonl, private_jsonl],
    )
    add_private(tmp_path, "sid-priv")
    result = runner.invoke(push_app, ["--legacy-scan", "--dry-run", "--json"])
    assert result.exit_code == 0
    payload = json.loads(result.output)
    assert str(public_jsonl) in payload["would_upload"]["sessions"]
    assert str(private_jsonl) not in payload["would_upload"]["sessions"]
    skipped_paths = [e["path"] for e in payload["would_skip_private"]]
    assert str(private_jsonl) in skipped_paths