Agent was failing 3x on SSH commands due to backticks (BQ table names)
and single quotes (SQL string literals) getting mangled by nested shell
interpretation (local -> SSH -> bash -> Python).
New --stdin mode reads query spec as JSON from stdin via heredoc:
cat <<'QUERY' | ssh alias 'bash remote_query.sh --stdin'
{"register_bq": {"alias": "SELECT ... FROM \`table\` ..."}, "sql": "..."}
QUERY
Heredoc with <<'QUERY' (quoted) passes everything literally -- no
escaping needed for backticks, quotes, or parentheses.
Updated claude_md_template.txt to use --stdin as the primary method.
Analyst user (foundry_e_psimecek) couldn't access /opt/data-analyst/.
Added to data-ops group on server.
New scripts/remote_query.sh wrapper handles env setup (PYTHONPATH,
CONFIG_DIR, .env) so agents use simple:
ssh alias 'bash ~/server/scripts/remote_query.sh --sql "..." --format table'
Updated claude_md_template.txt to use wrapper instead of raw commands.
Add src/remote_query.py CLI module enabling the AI agent to run SQL
queries spanning local Parquet tables and remote BigQuery tables in a
single DuckDB session on the server. Two-phase protocol: BQ sub-queries
(--register-bq) fetch filtered/aggregated data, then DuckDB SQL (--sql)
joins everything.
Safety: COUNT(*) pre-check, memory estimation (2GB cap), row limits
(500K per BQ sub-query, 100K final result).
Changes:
- New src/remote_query.py with CLI, BQ registration, output formatting
- Add bq_entity_type field to TableConfig (view vs table routing)
- Extract create_local_views() from duckdb_manager.py for reuse
- Update claude_md_template.txt with remote query agent instructions
- Update example configs with remote_query section and docs
- 52 new tests (42 remote_query + 10 bq_entity_type), all passing
Open-source AI data analyst platform extracted from internal repo.
Includes data sync engine, Keboola adapter, Flask web portal,
server deployment scripts, and configuration templates.