agnes-the-ai-analyst/src
ZdenekSrotyr f0979f997a fix(admin-api): reject backtick BQ-native source_query at register; surface materialize errors per-row
E2E testing showed admin POSTs of materialized BQ rows whose source_query
uses BigQuery-native backtick identifiers (`prj.ds.t`) silently no-op'd at
the next sync tick — the materialize path runs the SQL through the DuckDB
BQ extension's COPY which uses DuckDB's parser; backticks aren't recognized
and the query either parse-errors or matches zero rows. No parquet lands at
the canonical path and no error reaches an operator-visible surface.

Two-part fix:

1. RegisterTableRequest's _check_mode_query_coherence model_validator now
   rejects any source_query containing a backtick with a 422 + actionable
   message pointing at the DuckDB equivalent (bq."dataset"."table"). Same
   check is applied in update_table on the merged record so PATCHes that
   flip a stored source_query to backtick form are also caught. Covers BQ
   AND Keboola materialized rows since both connectors funnel source_query
   through DuckDB's COPY.

2. _run_materialized_pass now persists per-row failures via the new
   SyncStateRepository.set_error / clear_error methods (existing
   sync_state.error / status columns — no schema migration). GET
   /api/admin/registry enriches each row with `last_sync_error` from a
   single batched SELECT against sync_state, so the admin UI / da admin
   status can show "this table failed last sync because: X" instead of
   operators having to trawl scheduler logs. Recovered rows have the
   error cleared automatically — update_sync's success path resets
   status='ok' / error=NULL on the upsert.

The materialized-path test fixture's _materialized_payload helper is
updated to use DuckDB-flavor SQL (the prior backtick example pre-dated the
fix). 6 new tests cover register/update rejection on BQ + Keboola, the
sync_state error persistence, and the registry response surface.
2026-05-01 22:51:02 +02:00
..
repositories fix(admin-api): reject backtick BQ-native source_query at register; surface materialize errors per-row 2026-05-01 22:51:02 +02:00
__init__.py Extract Keboola into connectors/keboola module 2026-03-09 12:22:16 +01:00
catalog_export.py feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
db.py feat(materialized): query_mode='materialized' for BigQuery + Keboola — admin SELECT → parquet → analyst 2026-05-01 20:25:56 +02:00
identifier_validation.py fix(security): #81 Group D — extractor-side identifier validation (squashed) (#97) 2026-04-27 21:46:17 +02:00
marketplace.py feat(rbac+marketplace): schema v14 FK + AGNES_ENABLE_TABLE_GRANTS + break-glass CLI 2026-04-28 14:25:13 +02:00
marketplace_filter.py fix(marketplace): use plugin.json name in synth marketplace.json (#133) 2026-04-29 19:25:57 +02:00
orchestrator.py feat(v2): claude-driven fetch primitives + 0.14.0 (#102) 2026-04-29 01:07:19 +02:00
orchestrator_security.py fix(security): #81 Group A — orchestrator attach hardening (squashed) (#95) 2026-04-27 21:34:04 +02:00
profiler.py feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
rbac.py feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150) 2026-04-30 22:02:16 +02:00
remote_query.py fix(v2): #134 BigQuery cross-project errors return structured 502/400 + BqAccess facade (#138) 2026-04-30 10:11:20 +02:00
scheduler.py feat(scheduler): re-wire sync_schedule + script.schedule; tune via env; OpenMetadata TLS (#135) 2026-04-29 22:06:30 +02:00
sql_safe.py feat(v2): claude-driven fetch primitives + 0.14.0 (#102) 2026-04-29 01:07:19 +02:00