Commit graph

12 commits

Author SHA1 Message Date
Minas Arustamyan
50e0463501 feat(marketplace): clone-based plugin setup + auto-refresh SessionStart hook
Adds end-to-end flow for installing and keeping the per-user filtered
Claude Code marketplace in sync with the user's Agnes stack
(admin RBAC grants \ MyAIStack opt-outs U /store installs).

Setup (one-liner in install prompt step 5):
  `agnes refresh-marketplace --bootstrap` clones the per-user marketplace
  bare repo to ~/.agnes/marketplace, strips PAT from the cloned origin
  URL, registers the local path with Claude Code, and installs every
  plugin in the served manifest at --scope project. Replaces a 15-line
  inline shell sequence that tripped Claude Code's agent-driven `rm -rf`
  permission gate.

Auto-refresh (SessionStart hook installed by `agnes init`):
  `agnes refresh-marketplace --quiet` runs every Claude Code session,
  fetches+resets the clone (server rebuilds as orphan commits, so
  pull --ff-only is impossible), and version-aware reconciles:
    - missing in workspace -> claude plugin install <name>@agnes --scope project
    - version differs       -> claude plugin update <name>@agnes
    - matches               -> skip
  Don't auto-uninstall plugins that disappeared from the manifest --
  a transient empty manifest from the server would wipe the stack.

Hook output: when --quiet AND something actually changed, emits Claude
Code hook JSON on stdout -- `systemMessage` (transient toast) and
`hookSpecificOutput.additionalContext` (model-side system reminder),
both carrying the change summary plus a "/exit + restart Claude Code"
instruction (Claude only scans plugins at session start).

Windows hook compatibility: the refresh-marketplace hook command is
wrapped in `bash -c "..."` because Claude Code on Windows runs hook
commands directly without invoking a shell, so `2>/dev/null || true`
would otherwise be passed as literal argv tokens.

Cross-cutting:
  - cli/lib/marketplace.py: shared CLONE_DIR + MARKETPLACE_NAME constants.
  - cli/lib/hooks.py: SessionStart now has two independent entries
    (pull + refresh-marketplace) so a failure in one doesn't suppress
    the other; legacy `da sync` and prior single-pull layouts upgrade
    cleanly on re-init.
  - PAT injection on every git fetch via per-invocation credential
    helper (token in \$AGNES_TOKEN env, never in argv or .git/config).
  - Pre-snapshot of installed plugins captured BEFORE
    `claude plugin marketplace update` so silent auto-applied version
    bumps still fire notifications.
  - scripts/dev/agnes-client-reset.sh: cleans ~/.claude/plugins/marketplaces/agnes,
    ~/.claude/plugins/cache/agnes, drops uv build cache, documents
    workspace-scoped residue that can't be enumerated from the script.
  - app/web/setup_instructions.py: legacy AGNES_DEBUG_AUTH path also
    uses clone (direct HTTPS marketplace add is broken end-to-end on
    every Claude Code distribution -- stores response as single file,
    plugin source paths then 404).

28 new tests (test_cli_refresh_marketplace.py) + extended hook + setup
template tests cover bootstrap, fetch+reset ordering, version-aware
reconcile, project-path filtering, hook JSON shape, and the bash-c
Windows wrapper invariant.
2026-05-07 06:59:13 +02:00
ZdenekSrotyr
73d2896fa6 docs(hooks): update install_claude_hooks docstring for chained SessionStart 2026-05-06 23:23:23 +02:00
ZdenekSrotyr
be62ce61b8 feat(cli): install SessionStart hook chaining self-upgrade then pull
Single hook entry: 'agnes self-upgrade --quiet ... || true; agnes pull
--quiet ... || true'. Shell semicolon guarantees ordering across every
Claude Code version (no reliance on undocumented multi-hook execution
semantics); each segment's || true preserves the original property
that an upgrade failure does not abort the pull.
2026-05-06 23:23:23 +02:00
ZdenekSrotyr
e72ff259f9 feat(pull): aggregated progress + non-TTY textual fallback
Two improvements to `agnes pull` progress reporting:

1. **Aggregated per-file progress across chunked downloads**: the
   existing Rich progress bar already used one task per file, but the
   chunked-download contract (one file = N parallel chunk callbacks
   summing to file size) meant we needed to verify that all chunk
   threads advance the same task. They do — the per-file callback is
   constructed once per tid and routes every chunk's byte delta to the
   same task / textual entry, so the bar shows one aggregated bytes-
   downloaded total rather than N separate sub-bars.

2. **Textual fallback for non-TTY stderr**: when stderr is not a
   terminal (SessionStart hook, CI runner, Docker log capture), Rich
   either suppresses output (silent multi-minute pull on a 5 GB
   parquet) or emits raw control sequences. The new `_TextualProgress`
   helper instead emits one plain-text line per file at most every
   10%-of-total-bytes or 30 s, plus a final `100% done` line per file.
   Format: `[N/T files] <tid>: 25% (16 MB / 66 MB) at 1.5 MB/s`.

The TTY path is unchanged. Detection uses `sys.stderr.isatty()` —
`show_progress=True` flips into the textual fallback when that returns
False. `show_progress=False` (the SessionStart hook) still emits no
progress text in either mode.
2026-05-06 13:09:37 +02:00
ZdenekSrotyr
28423907fd feat: clean CLI errors + init progress + skip-materialize + claude.md catalog pointer
Three first-try-failure-surface fixes from Pavel's #185 trace + the
template guidance question, all under PR #188's umbrella so they land
together with the file_server / parallel pull / Tier 1 work.

1. CLI clean-error wrapper — new AgnesTransportError raised by the
   api_*/stream_download helpers when httpx times out / drops /
   refuses, plus a top-level Typer wrapper (cli/main.py) that prints
   one-line "Error: …" + actionable hint and exits non-zero. Full
   traceback goes to ~/.config/agnes/last-error.log for support
   forwarding. Unhandled Exceptions are caught at the same boundary
   so no Python traceback ever leaks to the analyst's terminal.

   Pavel's #185 Phase 3B: a 30-frame httpx traceback from a slow BQ
   --remote query made it look like a CLI bug. Now: clean message +
   hint pointing at `agnes snapshot create` / partition-column
   guidance.

   Entry point in pyproject.toml flipped from `cli.main:app` →
   `cli.main:_run_with_clean_errors` so the wrapper actually runs
   under the installed `agnes` binary.

2. agnes init / agnes pull --skip-materialize + progress bar.
   --skip-materialize omits query_mode='materialized' rows from the
   download set so a first init doesn't spend 44 minutes silently
   pulling a single 6 GB parquet (Pavel's #185 Phase 1). Rich-driven
   per-file progress bar with label/bytes/rate/ETA renders to stderr
   when not --quiet and not --json. Aggregates across the parallel
   ThreadPoolExecutor workers added earlier in this PR.

3. config/claude_md_template.txt: explicit one-line snippet pointing
   at `agnes catalog --json | jq '.tables[] | select(.id=="<id>")'`
   for per-table descriptions + restated invariant: "the description
   field on each catalog row is the authoritative business-rules
   text — re-read live, never copy into this file." Resolves the
   regression-or-feature debate between Pavel (wants annotations)
   and the user feedback that landed in the prior commit (don't
   embed table-specific content; tables change). Catalog command
   stays the source of truth.
2026-05-05 18:11:59 +02:00
ZdenekSrotyr
2ae486bc5d feat(pull): parallel parquet downloads (AGNES_PULL_PARALLELISM=4 default)
The download loop in cli/lib/pull.py was strictly serial — N tables took
Σ stream_download(t_i). With the Caddy file_server change in this PR,
the server can now sustain many parallel sendfile transfers without
blocking app workers, so the client-side serialization became the new
bottleneck.

Switch to ThreadPoolExecutor capped by AGNES_PULL_PARALLELISM (default 4,
set 1 to restore pre-PR serial). 4 matches typical home-broadband
saturation without over-subscribing the analyst's NIC. Drops to serial
when len(to_download) <= 1 to avoid executor overhead in the common
single-table case.

Per-table error semantics preserved via (tid, entry, err) tuple — a
failure on one parquet doesn't abort the rest of the batch.

Verified end-to-end against a dev VM with the new Caddy file_server
deployed: 2-table pull through agnes CLI works under the new concurrency.
2026-05-05 16:42:55 +02:00
ZdenekSrotyr
8784f10a6b fix(devin-review): stale-token override + status sessions counter + lock comment
Three Devin Review findings on PR #173 addressed in one commit since
they're in adjacent code paths:

1. cli/commands/init.py:99 (\u{1F534}): `agnes init --token NEW` ran
   step 2 verify against the OLD on-disk token because `get_token()`
   read `~/.config/agnes/token.json` before the env var, and
   `_override_server_env` only set the env var. So `agnes init --force`
   on a machine with a stale token.json failed 401 with a confusing
   'token expired' even though the --token arg was valid.

   Fix: ContextVar-based override in `cli.config._token_override`
   checked by `get_token()` BEFORE the on-disk read.
   `_with_token_override` context manager scopes the override.
   `_override_server_env` now also sets the contextvar via
   `_with_token_override(token)`, so both env var and contextvar
   carry the override (env for back-compat with anything bypassing
   get_token; contextvar is the authoritative source).
   Async-safe (each task sees its own override) and leak-proof
   (resets on context exit).
   2 new tests: regression on stale-disk-token + scope leak guard.

2. cli/commands/status.py:43 (\u{1F7E1}): sessions_pending_upload only
   checked legacy `<workspace>/user/sessions/` and always reported 0
   in workspaces bootstrapped with `agnes init` (Claude Code writes
   to `~/.claude/projects/`, not the legacy path). Same bug we fixed
   for `agnes push` in 08e49591.

   Fix: route through `cli.lib.claude_sessions.list_session_files()`
   so status and push agree on what counts as a pending session.

3. connectors/bigquery/extractor.py:111 (\u{1F7E1}): docstring claimed
   "a live holder still wins the second flock attempt" — incorrect on
   Linux. After `unlink()` + `open()`, the new file is a new inode;
   fcntl.flock keys per-inode, so the old holder's lock does NOT block
   the new acquisition. In a genuine TTL-overrun scenario two writers
   CAN race the parquet.tmp.

   Fix: documentation only. Comment now honestly describes the
   inode-recreation behavior, names the threading.Lock as the actual
   in-process guard, and flags pid-gating as the next-iteration fix
   if real corruption surfaces. The 24h default TTL is well above
   typical COPY durations so the practical risk is low.

Tests: 17/17 across test_cli_init.py + test_lib_pull.py + the broader
regression set.
2026-05-04 21:26:30 +02:00
ZdenekSrotyr
976d0c7160 fix(pull): re-download parquet when file missing despite matching hash
Pre-fix `agnes pull` decided what to download from sync_state hash
equality alone:

    if server_hash != local_hash or tid not in local_tables or not server_hash:
        to_download.append(tid)

If the recorded local hash matched server but the actual parquet had
been deleted from disk, the download was skipped. The next DuckDB
view rebuild then fails on a missing file. Repro: `rm
server/parquet/X.parquet && agnes pull` → 'Updated 0 tables', X
still missing.

Failure modes that produce hash-equal-but-file-missing:
- manual `rm` of a single parquet
- operator-side cleanup of `server/parquet/`
- two workspaces sharing one user's
  `~/.config/agnes/sync_state.json` (TODO(workspace-scoped-sync-state)
  in pull.py): one workspace writes its parquets, the other reads
  sync_state and concludes 'I already have these'
- disk corruption / partial restore from backup

Fix: existence check runs alongside the hash compare. Missing file
forces a re-download regardless of hash equality. `parquet_dir` is
hoisted above the loop so the existence check is in scope when the
download set is built.

Tests: regression test for the hash-equal-but-missing-file case +
counterpart for the fast-path (hash-equal-and-file-present must
still skip).
2026-05-04 21:12:06 +02:00
ZdenekSrotyr
08e4959185 fix(push): read sessions from ~/.claude/projects/<encoded-cwd>/
Real bug: `agnes push` was reading `<workspace>/user/sessions/`, but
Claude Code writes session jsonls to `~/.claude/projects/<encoded-cwd>/`
and nothing on the analyst side ever copies them across. The SessionEnd
hook ran `agnes push` happily and uploaded zero sessions every time.

`cli/lib/claude_sessions.py` probes both Claude Code encoding variants
(older `/`→`-` keeping spaces+tildes; newer all-non-alphanumeric→`-`
with collapsed runs) and unions whichever exist. Users who upgraded
Claude Code mid-project end up with both encoded dirs side-by-side on
disk; the union ensures no session is left behind. Same-named jsonl in
both dirs → newest mtime wins. `<workspace>/user/sessions/` survives as
a fallback for any setup that explicitly mirrors sessions there.

Verified on real disk: helper returns 2 dirs + 8 unioned session files
for the Agnes-test workspace where the previous code returned 0.
2026-05-04 20:29:59 +02:00
ZdenekSrotyr
15004126de fix(cli-lib): I1+I2+I3 review fixes — token-precedence note, sync-state TODO, dry-run hermeticity test 2026-05-04 18:04:56 +02:00
ZdenekSrotyr
37da602060 feat(cli-lib): cli/lib/pull.py:run_pull primitive with lazy mkdir 2026-05-04 18:00:57 +02:00
ZdenekSrotyr
5aebeabf23 feat(cli-lib): cli/lib/hooks.py:install_claude_hooks 2026-05-04 17:53:20 +02:00