The download loop in cli/lib/pull.py was strictly serial: pulling N tables took the sum of the individual stream_download(t_i) times. With the Caddy file_server change in this PR, the server can sustain many parallel sendfile transfers without blocking app workers, so client-side serialization became the new bottleneck. Switch to a ThreadPoolExecutor capped by AGNES_PULL_PARALLELISM (default 4; set it to 1 to restore the pre-PR serial behavior). The default of 4 is meant to saturate a typical home-broadband link without over-subscribing the analyst's NIC. The loop falls back to serial when len(to_download) <= 1, which avoids executor overhead in the common single-table case. Per-table error semantics are preserved via a (tid, entry, err) tuple, so a failure on one parquet file doesn't abort the rest of the batch. Verified end-to-end against a dev VM with the new Caddy file_server deployed: a 2-table pull through the agnes CLI works under the new concurrency.
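The strategy described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in cli/lib/pull.py: the helper names `pull_parallelism`, `_download_one`, and `download_all` are hypothetical, the `download` callable stands in for `stream_download`, and the fallback to the default on an unparsable environment value is an assumption rather than documented behavior.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def pull_parallelism(default=4):
    """Read the worker cap from AGNES_PULL_PARALLELISM, never below 1."""
    raw = os.environ.get("AGNES_PULL_PARALLELISM", "")
    try:
        return max(1, int(raw))
    except ValueError:
        # Assumption: unset or unparsable values fall back to the default.
        return default


def _download_one(tid, entry, download):
    """Return (tid, entry, err); err is None on success."""
    try:
        download(tid, entry)
        return tid, entry, None
    except Exception as exc:  # one failed table must not abort the batch
        return tid, entry, exc


def download_all(to_download, download, workers=None):
    """Run download(tid, entry) for every (tid, entry) pair."""
    workers = pull_parallelism() if workers is None else workers
    if len(to_download) <= 1 or workers <= 1:
        # Serial path: no executor setup for the single-table case.
        return [_download_one(tid, entry, download) for tid, entry in to_download]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(_download_one, tid, entry, download)
            for tid, entry in to_download
        ]
        # Results arrive in completion order, not submission order.
        return [future.result() for future in as_completed(futures)]
```

Because each task catches its own exception and returns it in the tuple, the caller can report failures per table after the whole batch has finished instead of stopping at the first error.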