The analyst flow becomes a closed loop with the server-curated table catalog:
- `da analyst setup` writes `<workspace>/.claude/settings.json` with two hooks:
SessionStart → `da sync --quiet || true` — pulls fresh RBAC-filtered parquets at session start
SessionEnd → `da sync --upload-only --quiet || true` — uploads session jsonl + CLAUDE.local.md
- `|| true` keeps Claude Code unblocked when the server is down.
- Workspace-level (not user-home) so the hooks fire only when Claude Code opens this analyst workspace.
- `da sync --quiet` rewrites the CLI output for hook consumption — 0 stdout on success, single-line error on failure.
- Existing settings.json is patched (deep-merged), not overwritten; malformed JSON is reported, not silently overwritten.
Tests cover: workspace bootstrap, hook insertion, malformed-json safety, quiet-mode output shape.
Session testing revealed 3 issues with remote queries:
1. CLAUDE.md template recommended `cat <<HEREDOC | ssh ...` but
claude_settings.json had `cat` in deny list, causing 2-3 failed
attempts per query. Replaced with file-based approach: Write tool
creates JSON file, then `ssh ... < file` avoids the cat deny.
2. ssh/scp commands were not in the allow list, requiring manual
approval for every remote query. Added both to allow list.
3. DuckDB fetch_arrow_table() emitted DeprecationWarning on every
parquet export. Replaced with .arrow().read_all().
Also added instruction for proactive hybrid analysis when remote
tables are available (agent was only using local data until asked).
Open-source AI data analyst platform extracted from internal repo.
Includes data sync engine, Keboola adapter, Flask web portal,
server deployment scripts, and configuration templates.