Commit graph

3 commits

Author SHA1 Message Date
ZdenekSrotyr
0612c1e1a1 fix(schema-v24): raise on deferred migration so retry path actually runs (Devin Review on db.py:1757)
Pre-fix: when v24 migration found rows to migrate but
data_source.bigquery.project was empty, it logged a warning per row
and returned normally. Schema_version then bumped to 24 unconditionally
→ next start's 'if current < 24:' gate skipped _v23_to_v24_finalize
forever, leaving rows in DuckDB-flavor SQL that the new
_wrap_admin_sql_for_jobs_api wrapping path rejects.

Devin escalated this from advisory ("idempotent retry") to critical
on rescan after my reply. The reply was wrong — the LIKE filter inside
the function gives idempotency IF the function is called again, but
the schema-version gate prevents that call from happening.

Fix (Devin's recommended Approach 1): raise RuntimeError BEFORE the
schema-version bump when rows need migration but project_id is empty.
The schema_version stays at 23, so on next start the 'if current < 24:'
gate fires and the migration runs again — this time with project_id
configured.

Side effect: a BQ-using deployment that hasn't set the project blocks
startup until they do. That's the right call for a config error that
would otherwise silently break all materialized tables. The error
message points at the right knob (data_source.bigquery.project +
restart).

No-rows-no-block invariant preserved: the early 'if not rows: return'
at the top of _v23_to_v24_finalize means non-BQ deployments are
unaffected.

Tests:
- test_v24_raises_when_project_not_configured_and_rows_need_migration:
  asserts raise + schema_version stays at 23 (the load-bearing
  invariant for retry-on-next-start to work)
- test_v24_skips_clean_when_no_rows_match_even_without_project:
  asserts non-BQ deployments don't block startup
- Existing 3 tests still pass
2026-05-04 23:11:34 +02:00
ZdenekSrotyr
ce108d4c6d fix(schema): code-review follow-ups for fac10b29
- _v23_to_v24_finalize: wrap row-update loop in BEGIN/COMMIT/ROLLBACK
  to match the project's transactional-finalizer pattern (compare
  _v12_to_v13_finalize, _v17_to_v18_finalize, _v18_to_v19_finalize).
  Pre-fix a process crash mid-loop left the schema_version unchanged
  but partially-converted rows persisted across restart — idempotent
  overall but inconsistent with project convention.
- _v23_to_v24_finalize: re.sub replacement now uses a function-form
  (lambda) instead of an f-string, so any future project_id with a
  backslash sequence isn't misinterpreted as a group reference.
- tests: add a Keboola-source materialized row case asserting the
  SELECT's source_type filter prevents non-BQ rewrites.
2026-05-04 19:32:24 +02:00
ZdenekSrotyr
fac10b29e4 feat(schema): v24 — rewrite materialized BQ source_query to BQ-native
Materialize now wraps admin SQL into bigquery_query('<billing>', '<inner>')
which requires the inner SQL to be BigQuery-flavor (backticked
identifiers, native function syntax). v24 migrates existing rows from
DuckDB-flavor (bq."ds"."tbl") to (`<project>.ds.tbl`) using the
configured BQ project. Idempotent on already-converted rows; logs a
warning and skips when the project isn't configured (operator can
configure + restart for retry).
2026-05-04 19:15:54 +02:00