Single hook entry: 'agnes self-upgrade --quiet ... || true; agnes pull
--quiet ... || true'. Shell semicolon guarantees ordering across every
Claude Code version (no reliance on undocumented multi-hook execution
semantics); each segment's || true preserves the original property
that an upgrade failure does not abort the pull.
Reuses cli.update_check.check() for the version probe — extended with
bypass_disabled=True so explicit user-typed self-upgrade is not silenced
by AGNES_NO_UPDATE_CHECK (which is for the implicit warning loop).
Install path: uv tool install --force when uv is on PATH; otherwise
curl + pip via sys.executable (NOT system python3, NOT --user — both
would land outside the agnes venv and silently no-op the upgrade).
Smoke test execs the binary at the install-resolved path (uv tool dir
joined with agnes-the-ai-analyst/bin/agnes, or sys.executable's sibling
agnes for pip) — never via shutil.which, which can resolve a stale shadow
on PATH and produce a false-positive smoke pass on the OLD version. Smoke
also asserts --version output contains info.latest via PEP 440 Version()
equality (so 0.40.0 does not falsely match 0.40.10).
On smoke fail: rollback to last_known_good.json (written only after a
previous run's smoke passed). Rollback rc is captured and surfaced on
stderr if it also fails. First-ever upgrade or unrecoverable rollback
prints the canonical bootstrap recovery: curl -fsSL <server>/cli/install.sh | bash.
AGNES_SELF_UPGRADE_IN_PROGRESS=1 is set for the duration of the run
and propagated to the smoke-test subprocess. Layer B's _check_version_headers
honors the sentinel and skips the < min hard-stop, so an in-flight
upgrade can never sys.exit(2) itself.
--force invalidates the update_check cache BEFORE probing. --force +
offline = exit 1 with explicit stderr (without --force, offline is silent).
--quiet suppresses progress output but never gags failure stderr.
Every API response is inspected via httpx event_hooks. When the server
reports X-Agnes-Min-Version > local, CLI prints a remediation message
and exits 2. Latest-version drift continues to be handled by the
update_check warning loop — no double-warning on every API call.
R3 final pass surfaced one issue, addressed:
R2#2 introduced PID-suffixed <target>.{pid}.tmp / .{pid}.partN to
prevent concurrent agnes pull invocations from yanking each other's
in-progress writes. The pre-clean inside _download_chunked /
_download_single_stream only deletes leftovers from the CURRENT
process's PID — files from a SIGKILL'd or crashed prior pull (any
other PID) are never touched and accumulate on disk forever.
Add _reap_dead_pid_leftovers(target_path) called at the start of both
download paths. Globs <target>.*.tmp / <target>.*.partN, extracts the
embedded PID, calls os.kill(pid, 0) to test liveness (POSIX standard
no-op probe), and unlinks files whose process no longer exists.
Permission-denied = process is alive but owned by another user → keep
the file (conservative). Windows users get the conservative 'keep'
default.
Two new tests pin the behavior — live-PID file preserved, dead-PID
.tmp + .partN reaped, bare-name (legacy) untouched, garbage filenames
skipped without raise.
R2 adversarial review surfaced 3 issues, all addressed:
#1 cli/client.py:572-577 outer try/except wrapped both _get_shared_client()
AND the actual download. A 401/403/404/5xx from the server triggered a
full second download attempt with a fresh client — wasted bandwidth on
hard failures, no fail-fast on revoked PAT. Narrowed the try to only
the shared-client construction; the download itself is no longer
retried under the fallback except.
#2 concurrent agnes pull invocations (e.g. SessionStart hook + manual
run) collided on bare <target>.tmp / <target>.partN paths — one process's
in-progress write got yanked by the other's cleanup, manifest hash
check then failed spuriously. Per-process suffix (<target>.{pid}.tmp,
<target>.{pid}.partN) makes intermediate files disjoint; the final
os.replace to the bare target is atomic so last-writer-wins.
#3 _looks_like_bq_rewrite_parse_error patterns 'Syntax error' could
false-positive on a query like WHERE log_msg = 'Syntax error in foo'
that fails for an unrelated reason (quota, network) and has the
literal substring echoed in the error text. Anchored to 'Syntax error: '
(with trailing colon) — BQ always emits the colon in this error
format, user SQL string literals normally don't.
R1 adversarial review surfaced 5 issues, all addressed:
#1 chunked download silently disabled in non-Caddy deployments (HEAD on
GET-only FastAPI route returns 405). _probe_range_support now falls back
to GET with Range: bytes=0-0 when HEAD fails — works against both
Caddy file_server (HEAD-friendly) and dev FastAPI direct (GET-only).
#2 parse-error fallback heuristic too broad — matched on Unrecognized
name / Function not found / No matching signature / Invalid cast,
which BQ surfaces for ordinary user-column typos. That triggered slow
ATTACH-catalog retry on every typo (2× latency tax). Narrowed to just
'Syntax error' / 'syntax error' which are the genuine DuckDB-vs-BQ
dialect mismatch markers.
#3 apply_bq_session_settings was only run on fresh-built pool entries,
not on reuse. An operator's /admin/server-config change to bq_query
_timeout_ms wouldn't propagate to long-lived pooled sessions until
restart. Fixed: re-apply on every pool acquire (idempotent + fail-soft).
#4 content-length sanity bound — a misconfigured proxy returning a
wildly inflated Content-Length would cause overlapping chunked Range
requests against the actual file → corrupt assembled output (caught
by manifest hash check, but only after wasted bandwidth). Cap at 100
GiB; above that, drop to single-stream.
#5 rewriter assumed every BQ row resolves under the single
bq.projects.data project. Bucket containing '.' suggests a project-
qualified bucket (multi-project deployment); rewriter would silently
target the wrong project. Conservative skip with regression test.
Two improvements to `agnes pull` progress reporting:
1. **Aggregated per-file progress across chunked downloads**: the
existing Rich progress bar already used one task per file, but the
chunked-download contract (one file = N parallel chunk callbacks
summing to file size) meant we needed to verify that all chunk
threads advance the same task. They do — the per-file callback is
constructed once per tid and routes every chunk's byte delta to the
same task / textual entry, so the bar shows one aggregated bytes-
downloaded total rather than N separate sub-bars.
2. **Textual fallback for non-TTY stderr**: when stderr is not a
terminal (SessionStart hook, CI runner, Docker log capture), Rich
either suppresses output (silent multi-minute pull on a 5 GB
parquet) or emits raw control sequences. The new `_TextualProgress`
helper instead emits one plain-text line per file at most every
10%-of-total-bytes or 30 s, plus a final `100% done` line per file.
Format: `[N/T files] <tid>: 25% (16 MB / 66 MB) at 1.5 MB/s`.
The TTY path is unchanged. Detection uses `sys.stderr.isatty()` —
`show_progress=True` flips into the textual fallback when that returns
False. `show_progress=False` (the SessionStart hook) still emits no
progress text in either mode.
Pool the httpx.Client used by `stream_download` so N parquet downloads
share a single TLS handshake instead of one handshake each. With the
optional `h2` package installed, HTTP/2 multiplexing further lets all
chunk Range requests share a single TCP connection — synergizes with
the range-chunked download path added in the previous commit.
The shared client is created lazily on first stream-download call, kept
alive for the duration of the process via a module-level slot, and
closed at exit via `atexit.register`. Construction wraps in a
try/except: when `h2` is unavailable (slim install), httpx raises
ImportError on `http2=True` and we transparently fall back to an
HTTP/1.1 client — pooling alone still amortizes TLS handshakes.
`agnes pull` must never crash on a missing optional package, so the
fallback path is non-negotiable. `h2>=4.1.0` is added to the core
dependency set; downstream slim installs that drop it lose the HTTP/2
benefit but keep correctness.
When the server advertises `accept-ranges: bytes` and a parquet exceeds
`AGNES_PULL_CHUNK_THRESHOLD_BYTES` (default 50 MB), `stream_download`
now splits the file into N parallel HTTP Range requests
(`AGNES_PULL_CHUNK_PARALLELISM`, default 4, capped 1..16) and
assembles the parts into the destination atomically.
Targets the per-flow-shaped network (corp VPN with per-TCP-connection
rate-limiting) where single-stream throughput is throttled but N parallel
streams over the same connection scale roughly linearly. Manifests with
1 large materialized parquet + N remote tables previously left the
existing across-files `AGNES_PULL_PARALLELISM=4` pool with 1 active
worker = single-stream throughput; this fixes that.
Falls back to single-stream when:
- HEAD doesn't advertise `accept-ranges: bytes`
- Server returns 200 instead of 206 to a Range probe
- File size below the threshold
Cleanup discipline: every part file removed before return (success or
failure); destination written via `<target>.tmp` and renamed atomically.
Per-chunk retry on transient network blips (bounded by AGNES_STREAM_RETRIES).
Renames the [Unreleased] section to [0.36.0] in CHANGELOG, adds the
top-level summary, drops a fresh empty [Unreleased] above, and bumps
pyproject from 0.35.1.
Also fixes the third Devin Review finding on this PR: the CLI
ReadTimeout message hardcoded QUERY_TIMEOUT_S (300s) so a 30s-default
call (agnes catalog, agnes auth, …) reported a wait window that
didn't match reality. _translate_transport_error now takes the actual
httpx timeout from the calling helper; the BQ-job advisory only
appears for calls where the timeout was set ≥ 60s.
🚩 /api/v2/catalog still async def while now calling sync stat()
`/api/v2/catalog` was left as `async def` when the rest of Tier 1 was
converted, on the assumption it was lightweight. The new
`_materialized_size_hint` populator added in this PR calls
`Path.stat()` / `Path.exists()` for every visible row to bucket the
parquet size — on a local FS that's microseconds, but on a
network-mounted DATA_DIR (NFS / CIFS / GCS-FUSE) those syscalls
can block the event loop. Convert to plain `def` so FastAPI
auto-offloads to the thread pool, mirroring /api/query etc.
🔴 stream_download translates HTTPStatusError as generic transport error
`response.raise_for_status()` inside the retry loop raises
`httpx.HTTPStatusError` on 4xx/5xx. After retries exhaust, the new
`isinstance(last_exc, httpx.HTTPError)` check at line 219 was eating
the status code: HTTPStatusError is a subclass of HTTPError, so the
generic transport translation produced "Unexpected error: HTTPStatusError"
instead of the informative "Client error '401 Unauthorized' for url …"
that callers expect. Fix: short-circuit HTTPStatusError before the
HTTPError branch — it re-raises verbatim so the caller's status-code
handling + the rich server error body (e.g. 401 expired token, 403
cross_project_forbidden) reach the analyst.
api_get / api_post / api_delete / api_patch don't have the same bug:
httpx Client.get/etc. don't raise HTTPStatusError unless the caller
explicitly calls .raise_for_status(), and our wrappers don't.
Only stream_download does, hence the targeted fix there.
Three first-try-failure-surface fixes from Pavel's #185 trace + the
template guidance question, all under PR #188's umbrella so they land
together with the file_server / parallel pull / Tier 1 work.
1. CLI clean-error wrapper — new AgnesTransportError raised by the
api_*/stream_download helpers when httpx times out / drops /
refuses, plus a top-level Typer wrapper (cli/main.py) that prints
one-line "Error: …" + actionable hint and exits non-zero. Full
traceback goes to ~/.config/agnes/last-error.log for support
forwarding. Unhandled Exceptions are caught at the same boundary
so no Python traceback ever leaks to the analyst's terminal.
Pavel's #185 Phase 3B: a 30-frame httpx traceback from a slow BQ
--remote query made it look like a CLI bug. Now: clean message +
hint pointing at `agnes snapshot create` / partition-column
guidance.
Entry point in pyproject.toml flipped from `cli.main:app` →
`cli.main:_run_with_clean_errors` so the wrapper actually runs
under the installed `agnes` binary.
2. agnes init / agnes pull --skip-materialize + progress bar.
--skip-materialize omits query_mode='materialized' rows from the
download set so a first init doesn't spend 44 minutes silently
pulling a single 6 GB parquet (Pavel's #185 Phase 1). Rich-driven
per-file progress bar with label/bytes/rate/ETA renders to stderr
when not --quiet and not --json. Aggregates across the parallel
ThreadPoolExecutor workers added earlier in this PR.
3. config/claude_md_template.txt: explicit one-line snippet pointing
at `agnes catalog --json | jq '.tables[] | select(.id=="<id>")'`
for per-table descriptions + restated invariant: "the description
field on each catalog row is the authoritative business-rules
text — re-read live, never copy into this file." Resolves the
regression-or-feature debate between Pavel (wants annotations)
and the user feedback that landed in the prior commit (don't
embed table-specific content; tables change). Catalog command
stays the source of truth.
Three concrete changes addressing the "analyst Claude misuses the CLI"
class of bugs (image.png table — issues #3, #5, plus the recurrent
"how big is this table" guesswork):
1. config/claude_md_template.txt — the template agnes init writes to
<workspace>/CLAUDE.md. Surfaces every catalog-row field with a why,
adds a query_mode-based decision tree, explicit --estimate scoping
(snapshot create ONLY — was the #1 first-try error), an agnes fetch
→ agnes snapshot create rename note, and a 6-row failure-mode table
that maps each common error wording to its right next step.
2. app/api/v2_catalog.py — populate rough_size_hint for local +
materialized rows from the on-disk parquet size, bucketed
small/medium/large/very_large. Was hardcoded null with a TODO; AI
couldn't tell "is this 6.8 GB" without a failed --remote round-trip.
3. cli/update_check.py — the [update] banner survived the da→agnes
rename and printed "[update] da X is out of date" on every command,
training analysts to associate the binary with the old name.
Verified by rendering the template against representative contexts
(33/33 tests pass) and running every use case from the original
screenshot through the real CLI against a dev VM.
The download loop in cli/lib/pull.py was strictly serial — N tables took
Σ stream_download(t_i). With the Caddy file_server change in this PR,
the server can now sustain many parallel sendfile transfers without
blocking app workers, so the client-side serialization became the new
bottleneck.
Switch to ThreadPoolExecutor capped by AGNES_PULL_PARALLELISM (default 4,
set 1 to restore pre-PR serial). 4 matches typical home-broadband
saturation without over-subscribing the analyst's NIC. Drops to serial
when len(to_download) <= 1 to avoid executor overhead in the common
single-table case.
Per-table error semantics preserved via (tid, entry, err) tuple — a
failure on one parquet doesn't abort the rest of the batch.
Verified end-to-end against a dev VM with the new Caddy file_server
deployed: 2-table pull through agnes CLI works under the new concurrency.
The httpx client behind 'agnes query --remote' used the default 30s
timeout, killing every BigQuery SELECT that took longer than half a
minute — i.e. most non-trivial remote queries.
cli/client.py now exposes QUERY_TIMEOUT_S (default 300s, override via
AGNES_QUERY_TIMEOUT) and propagates a kw-only 'timeout' through
api_get/post/delete/patch. _query_remote passes QUERY_TIMEOUT_S so only
the long-running /api/query path gets the bump; every other CLI call
keeps the 30s default.
Server-side has no read deadline on /api/query, so the client cap was
the sole bottleneck.
Backup-orchestration commands were split across two namespaces (pull in
agnes store, push in agnes admin store), which broke the operator
mental model — pull/push are a paired operation and should sit
together.
Move pull + info into agnes admin store so all bulk operations share
one help screen. Add agnes store mine as the user-facing equivalent —
calls the same /api/store/bundle.zip endpoint with ?owner=me, which
the server resolves to the caller's user_id. Authors can archive
their own uploads without admin role; whole-Store bulk reads stay
admin-flavored as a discoverability hint.
Server: 3-line addition to export_bundle handles owner='me' as a
magic alias for the caller. No new endpoint.
Tests updated: pull/info expectations move from agnes store to
agnes admin store; new tests cover agnes store mine and the
?owner=me server resolution. 69/69 store tests green locally.
Adds whole-Store backup/restore primitives so an external CI/CD job can
mirror the Store to a git repo (and restore back from one).
REST:
- GET /api/store/bundle.zip — deterministic ZIP of all (filtered) Store
entities. Layout: manifest.json + entities/<id>/{plugin,assets}/.
Manifest carries owner_email for cross-instance restore. Auth: any
authenticated user (Store is community-open).
- POST /api/store/import-bundle — admin-only restore. Modes
merge|replace|skip; owner resolution by email with stub-disabled-user
fallback when the email is unknown on the target instance.
CLI:
- agnes store update <id> [--description X] [--zip PATH] ... — in-place
edit (server PUT permits owner OR admin per F4). Closes the missing
edit affordance for analysts who want to fix a typo or push a new
ZIP without losing install_count.
- agnes store pull [-o store.zip] [--unpack DIR] — download the bundle.
--unpack streams + extracts so an external git-backup workflow can
drop the tree straight into a repo and `git add .`.
- agnes store info [--json] — counts + size summary.
- agnes admin store push <zip-or-dir> [--mode ...] — admin-only restore.
Auto-zips a directory client-side so a working-tree → server
round-trip is one command.
cli/v2_client.py gains api_get_stream helper for binary downloads.
Tests: 5 new server tests (bundle shape + filters + round-trip + stub
user creation + skip mode + admin-only gate) + 11 new CLI tests
(update, pull/unpack, info, admin push). 66/66 store-related tests
green locally.
Three items from operator feedback after running the actual flow:
(1) Help docstring lied: "--bucket / --source-table ignored" for
materialized rows. Reality: --bucket is load-bearing because
`agnes schema <name>` builds the BQ identifier as
`bq.<bucket>.<source_table>`. An empty bucket registered the row but
broke schema/describe with HTTP 400 "unsafe BQ identifier in
registry". Fix: docstring rewritten to reflect reality, plus
client-side validation rejects materialized + empty bucket with a
clear error pointing at the right knob.
(2) Post-register UX cliff: `agnes pull` after register-table reports
"Updated 0 tables (1 total)" because registration adds a registry
row but does NOT trigger a parquet build. Operators routinely
assume something's broken when they need to run
`agnes setup first-sync` to kick off the materialization. Hint
emitted on success now points at first-sync.
(3) RBAC gotcha: `agnes catalog` is RBAC-filtered via
`resource_grants`, so non-admin users don't see freshly-registered
rows until a grant is created. Hint emitted on success now points at
`agnes admin grant create <group> table <name>`.
Tests: 8/8 in test_cli_admin_materialized.py, including two new
regression tests for the validation + the hint output.
Three Devin Review findings on PR #173 addressed in one commit since
they're in adjacent code paths:
1. cli/commands/init.py:99 (\u{1F534}): `agnes init --token NEW` ran
step 2 verify against the OLD on-disk token because `get_token()`
read `~/.config/agnes/token.json` before the env var, and
`_override_server_env` only set the env var. So `agnes init --force`
on a machine with a stale token.json failed 401 with a confusing
'token expired' even though the --token arg was valid.
Fix: ContextVar-based override in `cli.config._token_override`
checked by `get_token()` BEFORE the on-disk read.
`_with_token_override` context manager scopes the override.
`_override_server_env` now also sets the contextvar via
`_with_token_override(token)`, so both env var and contextvar
carry the override (env for back-compat with anything bypassing
get_token; contextvar is the authoritative source).
Async-safe (each task sees its own override) and leak-proof
(resets on context exit).
2 new tests: regression on stale-disk-token + scope leak guard.
2. cli/commands/status.py:43 (\u{1F7E1}): sessions_pending_upload only
checked legacy `<workspace>/user/sessions/` and always reported 0
in workspaces bootstrapped with `agnes init` (Claude Code writes
to `~/.claude/projects/`, not the legacy path). Same bug we fixed
for `agnes push` in 08e49591.
Fix: route through `cli.lib.claude_sessions.list_session_files()`
so status and push agree on what counts as a pending session.
3. connectors/bigquery/extractor.py:111 (\u{1F7E1}): docstring claimed
"a live holder still wins the second flock attempt" — incorrect on
Linux. After `unlink()` + `open()`, the new file is a new inode;
fcntl.flock keys per-inode, so the old holder's lock does NOT block
the new acquisition. In a genuine TTL-overrun scenario two writers
CAN race the parquet.tmp.
Fix: documentation only. Comment now honestly describes the
inode-recreation behavior, names the threading.Lock as the actual
in-process guard, and flags pid-gating as the next-iteration fix
if real corruption surfaces. The 24h default TTL is well above
typical COPY durations so the practical risk is low.
Tests: 17/17 across test_cli_init.py + test_lib_pull.py + the broader
regression set.
Sweep operator runbooks (docs/QUICKSTART, docs/HEADLESS_USAGE,
docs/architecture, docs/sample-data, docs/agent-workspace-prompt,
docs/metrics/metrics.yml, dev_docs/server, dev_docs/disaster-recovery),
the corporate-memory service README, the jira connector README + backfill
scripts, the deploy skill, and test docstrings. Replaces `da sync` →
`agnes pull`, `da analyst setup` → `agnes init`, `da metrics ...` →
`agnes catalog --metrics` / `agnes admin metrics ...`, `da fetch` →
`agnes snapshot create`, plus the matching docker-compose admin
invocations.
Vendor-specific `/opt/data-analyst/` install paths in jira backfill /
consistency scripts and operator docs are replaced with the
placeholder `<install-dir>` and a new `AGNES_ENV_FILE` env-var override
that lets a deployment inject its actual install path without a code
change. Aligns with the OSS vendor-agnostic policy in CLAUDE.md.
CHANGELOG `### Internal` entry summarizes the audit and reaffirms the
intentional stale-marker tuples (`_LEGACY_STRINGS`, `_OUR_COMMAND_MARKERS`)
that must keep referencing `da sync` / `da fetch` / etc. for hook upgrade
and override-detection logic.
Pre-fix `agnes pull` decided what to download from sync_state hash
equality alone:
if server_hash != local_hash or tid not in local_tables or not server_hash:
to_download.append(tid)
If the recorded local hash matched server but the actual parquet had
been deleted from disk, the download was skipped. The next DuckDB
view rebuild then fails on a missing file. Repro: `rm
server/parquet/X.parquet && agnes pull` → 'Updated 0 tables', X
still missing.
Failure modes that produce hash-equal-but-file-missing:
- manual `rm` of a single parquet
- operator-side cleanup of `server/parquet/`
- two workspaces sharing one user's
`~/.config/agnes/sync_state.json` (TODO(workspace-scoped-sync-state)
in pull.py): one workspace writes its parquets, the other reads
sync_state and concludes 'I already have these'
- disk corruption / partial restore from backup
Fix: existence check runs alongside the hash compare. Missing file
forces a re-download regardless of hash equality. `parquet_dir` is
hoisted above the loop so the existence check is in scope when the
download set is built.
Tests: regression test for the hash-equal-but-missing-file case +
counterpart for the fast-path (hash-equal-and-file-present must
still skip).
Bring admin UI, audit-log messages, code comments, and analyst-facing
skill docs in line with the post-bootstrap CLI surface (`agnes pull`,
`agnes push`, `agnes init`, `agnes snapshot create`). The legacy
`_LEGACY_STRINGS` detection tuple in `app/api/claude_md.py` and the hook
upgrade markers in `cli/lib/hooks.py` are intentionally left as-is —
they exist precisely to flag pre-rewrite content for re-authoring.
Strip "(folded from `da metrics list`)" / "(lifted from `da metrics
show`)" / "Replaces the old `da analyst status`" docstring noise — the
rename history is in CHANGELOG.md, not in module docstrings.
Ports Minas's PR #172 (against pre-rename `da` CLI on main) and applies
the principle to the post-rename `agnes` CLI. Two distinct failure modes
on Windows consoles whose default codepage is cp1250 (cs-CZ) / cp1252
(en-US):
1. `agnes pull` and other Rich-progress codepaths
UnicodeEncodeError on Braille spinner glyphs. Fix: `cli/main.py`
reconfigures stdout/stderr to UTF-8 with errors='replace' at import
time on `sys.platform == 'win32'` so Rich's legacy-Windows render
path emits decodable bytes. Wrapped in try/except so pytest's
captured streams (which aren't TextIOWrapper) don't break.
2. `agnes skills list` and `agnes skills show`
UnicodeDecodeError when reading skill markdown containing em-dashes /
accented chars. Default `Path.read_text()` uses
locale.getpreferredencoding(False), which is the broken codepage on
Windows. Fix: every call site passes encoding='utf-8' explicitly.
Broader scope than #172 because:
- The bootstrap rewrite renamed/removed several files Minas's PR
patched (`cli/commands/analyst.py` -> rolled into init.py;
`cli/commands/sync.py` -> split into pull/push). Those targets no
longer exist; the equivalent code lives in init.py.
- Other call sites Minas didn't touch (still bare in his branch) are
patched here too — config.py / update_check.py / snapshot_meta.py /
setup.py / skills.py — so the codebase has zero locale-default text
I/O in cli/.
Side cleanup: stale `Run `da`` reference in snapshot_meta.py:88 fixed
to `agnes` while touching the file.
CI failure: test_readers_in_pre_init_dir asserted no Traceback in stderr
when running `agnes snapshot create x --as y --estimate` in a folder
that never saw `agnes init`. The estimate-guard fix in 3d587681 let
`--estimate` skip the local_db check and reach `api_post_json`, but the
existing `except V2ClientError` doesn't cover transport-layer failures.
With no server configured the URL defaults to http://localhost:8000;
httpx raises ConnectError → ConnectError isn't a V2ClientError → the
exception bubbles up through Typer/rich as a full traceback.
Add `except httpx.HTTPError` next to V2ClientError so connection /
DNS / TLS / timeout failures all render the friendly hint
`Run `agnes init …` first` instead of leaking transport noise.
Real bug: `agnes push` was reading `<workspace>/user/sessions/`, but
Claude Code writes session jsonls to `~/.claude/projects/<encoded-cwd>/`
and nothing on the analyst side ever copies them across. The SessionEnd
hook ran `agnes push` happily and uploaded zero sessions every time.
`cli/lib/claude_sessions.py` probes both Claude Code encoding variants
(older `/`→`-` keeping spaces+tildes; newer all-non-alphanumeric→`-`
with collapsed runs) and unions whichever exist. Users who upgraded
Claude Code mid-project end up with both encoded dirs side-by-side on
disk; the union ensures no session is left behind. Same-named jsonl in
both dirs → newest mtime wins. `<workspace>/user/sessions/` survives as
a fallback for any setup that explicitly mirrors sessions there.
Verified on real disk: helper returns 2 dirs + 8 unioned session files
for the Agnes-test workspace where the previous code returned 0.
13 Devin findings across 10 files:
🔴 Critical:
- app/api/v2_catalog.py:42 — `_fetch_hint` returns `da fetch` in /api/v2/catalog
responses (user-visible in every catalog list)
- cli/skills/agnes-data-querying.md — 11 stale `da fetch`/`da sync` refs in the
bundled skill markdown
- config/claude_md_template.txt:38 — referenced `agnes pull --docs-only` flag
that does NOT exist in agnes pull (removed; spec only ships --quiet/--json/
--dry-run)
🟡 Important:
- app/api/admin.py:252 — `da fetch` in bq_max_scan_bytes hint
- cli/commands/auth.py:119 — `da sync` in import-token docstring (--help text)
- cli/commands/tokens.py:48 — "Export it so `da` can use it" prose
- ARCHITECTURE.md — 4 stale rows in CLI commands table
- README.md — stale paragraphs for analysts (da sync, da analyst setup)
🚩 Substantive observations addressed:
- app/api/query.py:249,302,489 — server-side error/help strings still said
`da sync`/`da fetch` (returned in API responses to clients)
- cli/commands/snapshot.py:235-241 — DuckDB existence guard incorrectly
blocked `--estimate` (server-side dry-run that never opens local DB).
Added test ensuring estimate path skips the guard.
Skipped (intentionally historical):
- app/api/admin.py:2377,2429,2437 — historical comments describing past
manifest-vs-sync_state bug; past tense, accurate to keep as `da sync`.