agnes-the-ai-analyst/src
Vojtech Rysanek 32c8ea601a fix(bigquery): apply bq_query_timeout_ms on every BQ-extension attach + surface silent failures
The DuckDB BigQuery extension defaults bq_query_timeout_ms to 90 s,
which is too tight for analyst-scale queries against view-backed BQ
datasets. Agnes already has apply_bq_session_settings() that bumps it
to 600 s (configurable via data_source.bigquery.query_timeout_ms), but
two regressions let the 90 s default leak through to live queries:

1. apply_bq_session_settings() swallowed every Exception silently. If
   the BigQuery extension wasn't loaded on the connection yet, or the
   installed extension version didn't recognise the setting, the SET
   would fail and the function would return without surfacing the
   problem. Operators saw 90 s timeouts on 'agnes query --remote' with
   no log line explaining why.

2. The call sites in src/db.py:_reattach_remote_extensions and
   src/orchestrator.py:_remote_attach only invoked
   apply_bq_session_settings on the metadata-token branch (token_env
   empty, the BqAccess contract). The token-based and no-auth branches
   ran ATTACH against the BigQuery extension without ever applying the
   timeout setting, so any BQ source registered with an explicit
   token_env, or with no auth env at all, fell back to the 90 s default.
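
The second regression can be pictured with a simplified sketch. Everything in it is hypothetical: FakeConn, apply_timeout, attach, the branch labels, the ATTACH string, and the order of ATTACH versus SET are assumptions for illustration, not the code in src/db.py or src/orchestrator.py. It only shows how applying the setting on one branch leaves the other two on the extension default.

```python
class FakeConn:
    """Hypothetical stand-in for a DuckDB connection; it only records SQL."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)
        return self


def apply_timeout(conn, timeout_ms=600_000):
    # Placeholder for apply_bq_session_settings(); the real helper is not shown here.
    conn.execute(f"SET bq_query_timeout_ms = {timeout_ms}")


def attach(conn, alias, project, branch, fixed):
    """branch is one of 'metadata', 'token', 'no_auth' (illustrative labels)."""
    # Illustrative ATTACH string; the real call sites build their own.
    conn.execute(f"ATTACH 'project={project}' AS {alias} (TYPE bigquery, READ_ONLY)")
    if fixed or branch == "metadata":
        # Before the fix, only the metadata-token branch reached this line.
        apply_timeout(conn)


def timeout_applied(branch, fixed):
    """Return True if a SET for bq_query_timeout_ms was issued on this branch."""
    conn = FakeConn()
    attach(conn, "bq", "example-project", branch, fixed)
    return any(s.startswith("SET bq_query_timeout_ms") for s in conn.statements)
```

With fixed=False only the 'metadata' branch issues the SET; with fixed=True all three do.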

Fix:

- apply_bq_session_settings now logs WARNING on each failure path
  (instance_config import error, non-numeric value, SET execution
  failure, readback error). It also verifies the setting actually
  landed via SELECT current_setting('bq_query_timeout_ms') and logs
  WARNING when the readback disagrees with the requested value, which
  catches the silent-ignore case some extension versions exhibit.

- Both _reattach_remote_extensions (src/db.py) and _remote_attach
  (src/orchestrator.py) now call apply_bq_session_settings on every
  branch that ATTACHes a BigQuery alias, not only the metadata-token
  branch. Idempotent: calling it twice on the metadata-token path is a
  no-op SET.
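
A minimal sketch of the set-then-verify pattern described in the first bullet, under stated assumptions: the function name apply_timeout_with_readback, its boolean return value, the logger name, and the StubConn helper are all hypothetical, and the instance_config lookup is omitted. Only the setting name and the current_setting readback query come from the text above.

```python
import logging

logger = logging.getLogger("bq_settings_sketch")  # hypothetical logger name


def apply_timeout_with_readback(conn, timeout_ms):
    """Set bq_query_timeout_ms, read it back, and warn on every failure path."""
    try:
        requested = int(timeout_ms)
    except (TypeError, ValueError):
        logger.warning("bq_query_timeout_ms is not numeric: %r", timeout_ms)
        return False

    try:
        conn.execute(f"SET bq_query_timeout_ms = {requested}")
    except Exception as exc:  # e.g. extension not loaded, setting unknown
        logger.warning("SET bq_query_timeout_ms failed: %s", exc)
        return False

    try:
        row = conn.execute(
            "SELECT current_setting('bq_query_timeout_ms')"
        ).fetchone()
    except Exception as exc:
        logger.warning("readback of bq_query_timeout_ms failed: %s", exc)
        return False

    actual = row[0] if row else None
    if str(actual) != str(requested):
        # Catches the case where the SET is accepted but silently ignored.
        logger.warning(
            "bq_query_timeout_ms mismatch: requested %s, got %r", requested, actual
        )
        return False
    return True


class StubConn:
    """Hypothetical connection: can fail on SET or report a different value."""

    def __init__(self, fail_set=False, readback=None):
        self.fail_set = fail_set
        self.readback = readback
        self._last = None

    def execute(self, sql):
        if sql.startswith("SET"):
            if self.fail_set:
                raise RuntimeError("unrecognized configuration parameter")
            self._last = sql.rsplit("=", 1)[1].strip()
        return self

    def fetchone(self):
        return (self.readback if self.readback is not None else self._last,)
```

Returning a boolean is a design choice of this sketch so callers and tests can tell the outcomes apart; the text above only states that failures are logged at WARNING.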

Tests:

- Extended the _RecordingConn fixture to support .fetchone() so the
  readback assertion path works. Updated existing call-shape
  assertions to expect the SELECT current_setting readback alongside
  the SET. Added two new tests covering the WARNING surfaces for SET
  failure and readback mismatch, as regression guards for the
  silent-fallback bug this PR addresses.

- Full BQ-touching suite (398 tests) passes.
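
The test changes above could look roughly like the following sketch. Only the fixture name _RecordingConn and the idea of a .fetchone() readback come from the commit message; the constructor arguments, set_timeout (a stand-in for the function under test), _Capture, and run_case are hypothetical.

```python
import logging


class _RecordingConn:
    """Sketch of a recording fixture that also supports .fetchone()."""

    def __init__(self, readback="600000", fail_on=None):
        self.statements = []
        self._readback = readback
        self._fail_on = fail_on  # SQL prefix that should raise, e.g. "SET"

    def execute(self, sql):
        self.statements.append(sql)
        if self._fail_on and sql.startswith(self._fail_on):
            raise RuntimeError("simulated failure")
        return self

    def fetchone(self):
        return (self._readback,)


def set_timeout(conn, value, log):
    """Hypothetical stand-in for the function under test."""
    try:
        conn.execute(f"SET bq_query_timeout_ms = {value}")
        got = conn.execute(
            "SELECT current_setting('bq_query_timeout_ms')"
        ).fetchone()[0]
    except Exception as exc:
        log.warning("bq_query_timeout_ms not applied: %s", exc)
        return
    if str(got) != str(value):
        log.warning("bq_query_timeout_ms readback mismatch: %s != %s", got, value)


class _Capture(logging.Handler):
    """Collects (level, message) pairs so a test can assert on them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


def run_case(**conn_kwargs):
    """Run set_timeout against a fresh fixture; return (statements, log messages)."""
    conn = _RecordingConn(**conn_kwargs)
    log = logging.getLogger("bq_test_sketch")
    capture = _Capture()
    log.addHandler(capture)
    try:
        set_timeout(conn, 600000, log)
    finally:
        log.removeHandler(capture)
    return conn.statements, capture.messages
```

A call-shape assertion then checks that the recorded statements are the SET followed by the SELECT current_setting readback, and the two regression cases check for a WARNING when the SET raises or when the readback returns a different value.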
2026-05-06 11:24:14 +04:00
repositories Merge remote-tracking branch 'origin/main' into pr180-review 2026-05-05 12:05:50 +02:00
__init__.py Extract Keboola into connectors/keboola module 2026-03-09 12:22:16 +01:00
catalog_export.py feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
claude_md.py fix(claude_md): load default via importlib.resources — survives /app/config bind-mount 2026-05-04 06:53:47 +02:00
db.py fix(bigquery): apply bq_query_timeout_ms on every BQ-extension attach + surface silent failures 2026-05-06 11:24:14 +04:00
identifier_validation.py fix(security): #81 Group D — extractor-side identifier validation (squashed) (#97) 2026-04-27 21:46:17 +02:00
marketplace.py feat(rbac+marketplace): schema v14 FK + AGNES_ENABLE_TABLE_GRANTS + break-glass CLI 2026-04-28 14:25:13 +02:00
marketplace_filter.py feat(store): /store + /my-ai-stack — community marketplace + per-user composition 2026-05-05 02:53:49 +02:00
orchestrator.py fix(bigquery): apply bq_query_timeout_ms on every BQ-extension attach + surface silent failures 2026-05-06 11:24:14 +04:00
orchestrator_security.py fix(security): #81 Group A — orchestrator attach hardening (squashed) (#95) 2026-04-27 21:34:04 +02:00
profiler.py feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
rbac.py feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150) 2026-04-30 22:02:16 +02:00
remote_query.py fix(v2): #134 BigQuery cross-project errors return structured 502/400 + BqAccess facade (#138) 2026-04-30 10:11:20 +02:00
scheduler.py feat(scheduler): re-wire sync_schedule + script.schedule; tune via env; OpenMetadata TLS (#135) 2026-04-29 22:06:30 +02:00
sql_safe.py feat(v2): claude-driven fetch primitives + 0.14.0 (#102) 2026-04-29 01:07:19 +02:00
store_categories.py feat(store): /store + /my-ai-stack — community marketplace + per-user composition 2026-05-05 02:53:49 +02:00
store_naming.py fix(store): security + correctness blockers found in PR review (F1, F2, F4, F5) 2026-05-05 08:18:02 +02:00
welcome_template.py fix(setup): install list reflects opt-outs + Store bundle 2026-05-05 05:17:05 +02:00