agnes-the-ai-analyst/cli/lib
ZdenekSrotyr 2ae486bc5d feat(pull): parallel parquet downloads (AGNES_PULL_PARALLELISM=4 default)
The download loop in cli/lib/pull.py was strictly serial — N tables took
Σ stream_download(t_i). With the Caddy file_server change in this PR,
the server can now sustain many parallel sendfile transfers without
blocking app workers, so the client-side serialization became the new
bottleneck.

Switch to ThreadPoolExecutor capped by AGNES_PULL_PARALLELISM (default 4,
set 1 to restore pre-PR serial). 4 matches typical home-broadband
saturation without over-subscribing the analyst's NIC. Drops to serial
when len(to_download) <= 1 to avoid executor overhead in the common
single-table case.

Per-table error semantics preserved via (tid, entry, err) tuple — a
failure on one parquet doesn't abort the rest of the batch.

Verified end-to-end against a dev VM with the new Caddy file_server
deployed: 2-table pull through agnes CLI works under the new concurrency.
2026-05-05 16:42:55 +02:00
..
__init__.py feat(cli-lib): cli/lib/hooks.py:install_claude_hooks 2026-05-04 17:53:20 +02:00
claude_sessions.py fix(push): read sessions from ~/.claude/projects/<encoded-cwd>/ 2026-05-04 20:29:59 +02:00
hooks.py feat(cli-lib): cli/lib/hooks.py:install_claude_hooks 2026-05-04 17:53:20 +02:00
pull.py feat(pull): parallel parquet downloads (AGNES_PULL_PARALLELISM=4 default) 2026-05-05 16:42:55 +02:00