merge: pull #174 (BQ materialize view fix + concurrency, 0.33.0) into bootstrap branch

Brings in zs/materialize-sync-fix (PR #174): - BigQuery view materialize works (wrap admin SQL in bigquery_query()) - Per-table mutex + fcntl.flock for concurrent COPY corruption - Cost guardrail dry-run engages on materialized rows - Schema v23 -> v24 migration: rewrite source_query to BQ-native - Server-generated trivial source_query from bucket+source_table - Validator backtick relaxation for materialized rows - 0.33.0 release cut Conflict resolution: - CHANGELOG.md: keep our [Unreleased] (bootstrap rewrite content) ABOVE the new [0.33.0] section from #174. The bootstrap rewrite remains unreleased; it'll cut 0.34.0 (or later) when this PR merges to main. - tests/conftest.py: union — keep our analyst-bootstrap fixture re-export AND #174's bq_instance / stub_bq_extractor fixtures. - pyproject.toml auto-merged to 0.33.0 (matches the cut), correct. - src/db.py auto-merged: SCHEMA_VERSION = 24, _v23_to_v24_finalize added — no overlap with our work which left schema at v23. - CLAUDE.md auto-merged: schema-history paragraph extended with v24. Verified: 79/79 across CLI bootstrap suite + materialize suite + schema v24 migration tests pass locally on Python 3.13/macOS.
2026-05-04 20:53:00 +02:00 · 2026-05-04 20:53:00 +02:00 · e438170ade
commit e438170ade
parent ee83cebbda e6a2c4c51d
23 changed files with 1607 additions and 259 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -54,6 +54,89 @@ End-to-end clean-analyst-bootstrap rewrite. The web `/setup?role=analyst` page n
 - `tests/test_clean_install_integration.py` — end-to-end happy-path tests (minimal grants, zero grants, force preserves CLAUDE.local.md, readers in pre-init dir).
 - `docs/RELEASE_CHECKLIST.md` — manual clean-install protocol mandated for any PR touching the bootstrap path.

+## [0.33.0] — 2026-05-04
+
+Closes #162. Headline fix: `query_mode='materialized'` BigQuery rows now
+materialize correctly for views and materialized views, with per-table
+concurrency control preventing parquet corruption on overlapping scheduler
+ticks. Plus a source_query server-generation convenience, a
+`materialize.lock_ttl_seconds` config knob, and a schema v24 migration that
+converts existing DuckDB-flavor source_query values to BQ-native SQL.
+
+### Fixed
+
+- BigQuery materialize now works for views and materialized views. Pre-fix,
+  `materialize_query` ran admin's `source_query` as `COPY (sql) TO parquet`
+  through the DuckDB BigQuery extension session, which routed through the BQ
+  Storage Read API for `bq."<ds>"."<tbl>"` references. Storage Read API
+  rejects non-base entities (`Binder Error: Error while creating read session:
+  ... non-table entities cannot be read with the storage API`). Fixed by
+  always wrapping admin SQL into `bigquery_query('<billing-project>',
+  '<inner-sql>')` so COPY uses the BQ jobs API uniformly for tables, views,
+  and materialized views.
+- `materialize_query` no longer corrupts its parquet under concurrent
+  invocations for the same `table_id`. Pre-fix, two overlapping
+  `_run_materialized_pass` calls (e.g. a long-running COPY + the next
+  scheduler tick) both hit the unconditional `if tmp_path.exists():
+  tmp_path.unlink()` at function entry and started parallel COPYs against the
+  same path, interleaving bytes and producing a parquet file with no valid
+  footer. Now each call acquires a per-table_id `threading.Lock` plus an
+  advisory `fcntl.flock` on `<id>.parquet.lock`; the second caller raises
+  `MaterializeInFlightError` and the scheduler treats it as
+  `skipped, in_flight` — never as an error.
+- Cost guardrail dry-run now engages for materialized rows. Pre-fix, the
+  BigQuery Python client returned 400 (`Table-valued function not found:
+  bigquery_query`) on the wrapped SQL and the dry-run silently fail-opened.
+  The dry-run now operates on the inner BQ-native SQL (admin's `source_query`
+  directly), which the client parses cleanly.
+
+### Changed
+
+- **BREAKING** `query_mode='materialized'` rows MUST register `source_query`
+  as BigQuery-native SQL (backticks for dashed identifiers, native
+  joins/CTEs). DuckDB-flavor (`bq."<ds>"."<tbl>"`) is no longer accepted on
+  register/PUT. The schema v24 migration converts existing rows automatically;
+  operators with custom-written `source_query` should review the migrated form
+  on first deploy. The validator's prior backtick-rejection rule is now scoped
+  to `query_mode IN ('remote', 'local')` only.
+- `_run_materialized_pass` summary `skipped` field changes from `list[str]`
+  to `list[dict]` with shape
+  `{"table": str, "reason": Literal["due_check", "in_flight"]}`. Downstream
+  consumers that asserted the old string form must update.
+
+### Added
+
+- `POST /api/admin/register-table` for `query_mode='materialized'` rows with
+  `bucket`+`source_table` but no `source_query` now server-generates
+  `` SELECT * FROM `<project>.<bucket>.<source_table>` `` from the configured
+  BigQuery project. The same fallback fires on `PUT /api/admin/registry/{id}`
+  when flipping to materialized. Operators only need to know
+  `bigquery_query()` semantics for non-trivial queries.
+- New top-level `materialize` config section in `instance.yaml`. Single field
+  — `materialize.lock_ttl_seconds` (default `86400`, 24 h) — controls how
+  long a stale `<id>.parquet.lock` file lives before a sibling materialize
+  attempt reclaims it. Editable via `/admin/server-config` API and UI.
+
+### Internal
+
+- Schema v24 migration: rewrites `table_registry.source_query` for
+  materialized BigQuery rows from DuckDB-flavor (`bq."<ds>"."<tbl>"`) to
+  BQ-native (`` `<project>.<ds>.<tbl>` ``) using the configured BQ project.
+  Idempotent on already-converted rows; logs a warning and skips when the
+  project isn't configured (operator can configure + restart for retry).
+  Wrapped in `BEGIN TRANSACTION` / `COMMIT` to match the project's
+  transactional-finalizer pattern.
+- `connectors/bigquery/extractor.py` exports `MaterializeInFlightError` and
+  the `_get_table_lock` / `_get_lock_ttl_seconds` /
+  `_wrap_admin_sql_for_jobs_api` / `_escape_sql_string_literal` helpers as
+  test seams. Underscore-prefixed; not part of the public API.
+- `tests/conftest.py` lifts `bq_instance` and `stub_bq_extractor` fixtures
+  from `tests/test_api_admin_materialized.py` so subsequent test modules in
+  this PR can resolve them via pytest's auto-discovery.
+- `app/api/sync.py:is_table_due` hoisted to module-level import (was deferred
+  inside `_run_materialized_pass`) so monkeypatching `app.api.sync.is_table_due`
+  actually intercepts the call — the deferred form made test patches a no-op.
+
 ## [0.32.0] — 2026-05-04

 Closes #160. Headline fix: `da query --remote` now resolves
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -443,7 +443,7 @@ Module sets `lifecycle { ignore_changes = [metadata_startup_script] }` on `googl
 ## Key Implementation Details

 ### DuckDB Schema (src/db.py)
- Schema v23 with auto-migration v1→…→v23 (v5 adds `users.active`, v6 adds `personal_access_tokens`, v7 adds `personal_access_tokens.last_used_ip`, v8/v9 added the legacy internal_roles/role-grants tables, v10 added `view_ownership` for cross-connector view-name collision detection (issue #81 Group C), v11 added marketplace_registry + marketplace_plugins + user_groups + plugin_access, v12 added users.groups JSON + user_groups.is_system, **v13 replaces internal_roles/group_mappings/user_role_grants/plugin_access with user_group_members + resource_grants and drops users.groups JSON**, v14 adds FK constraints on user_group_members + resource_grants after orphan cleanup, v15 adds knowledge_items context-engineering columns + contradictions + session_extraction_state, v16 adds verification_evidence, v17 adds knowledge_item_relations, v18 drops stranded non-google memberships from google-managed groups, **v19 drops legacy `dataset_permissions`, `access_requests` tables and `users.role`, `table_registry.is_public` columns — table access is now exclusively per-group via `resource_grants(resource_type='table')`**, **v20 adds `source_query` TEXT to `table_registry` to back `query_mode='materialized'` (BigQuery scheduled-query parquet path)**, **v21 adds `welcome_template` singleton table backing the Agent Setup Prompt admin override (`/admin/agent-prompt`)**, **v22 reserves the `setup_banner` table — feature dropped mid-development; table retained for forward compatibility with already-migrated instances**, **v23 adds `claude_md_template` singleton table backing the Agent Workspace Prompt admin override (`/admin/workspace-prompt`)** — see CHANGELOG and docs/RBAC.md)
+- Schema v24 with auto-migration v1→…→v24 (v5 adds `users.active`, v6 adds `personal_access_tokens`, v7 adds `personal_access_tokens.last_used_ip`, v8/v9 added the legacy internal_roles/role-grants tables, v10 added `view_ownership` for cross-connector view-name collision detection (issue #81 Group C), v11 added marketplace_registry + marketplace_plugins + user_groups + plugin_access, v12 added users.groups JSON + user_groups.is_system, **v13 replaces internal_roles/group_mappings/user_role_grants/plugin_access with user_group_members + resource_grants and drops users.groups JSON**, v14 adds FK constraints on user_group_members + resource_grants after orphan cleanup, v15 adds knowledge_items context-engineering columns + contradictions + session_extraction_state, v16 adds verification_evidence, v17 adds knowledge_item_relations, v18 drops stranded non-google memberships from google-managed groups, **v19 drops legacy `dataset_permissions`, `access_requests` tables and `users.role`, `table_registry.is_public` columns — table access is now exclusively per-group via `resource_grants(resource_type='table')`**, **v20 adds `source_query` TEXT to `table_registry` to back `query_mode='materialized'` (BigQuery scheduled-query parquet path)**, **v21 adds `welcome_template` singleton table backing the Agent Setup Prompt admin override (`/admin/agent-prompt`)**, **v22 reserves the `setup_banner` table — feature dropped mid-development; table retained for forward compatibility with already-migrated instances**, **v23 adds `claude_md_template` singleton table backing the Agent Workspace Prompt admin override (`/admin/workspace-prompt`)**, **v24 rewrites materialized BQ `source_query` from DuckDB-flavor `bq."ds"."t"` to BQ-native `` `<project>.ds.t` `` so the new wrapping path accepts them; idempotent + warns when project unconfigured** — see CHANGELOG and docs/RBAC.md)
 - `table_registry`: id, name, source_type, bucket, source_table, query_mode, sync_schedule, etc.
 - `sync_state`, `sync_history`: track extraction progress
 - `users`, `audit_log`: account state + audit trail. RBAC lives in `user_groups` + `user_group_members` + `resource_grants`.
--- a/app/api/admin.py
+++ b/app/api/admin.py
@ -146,6 +146,38 @@ def _validate_urls_in_patch(sections: Dict[str, Dict[str, Any]]) -> None:
                _validate_url_not_private(value, field_name=".".join(path))


+_LOCK_TTL_MIN = 60
+_LOCK_TTL_MAX = 7 * 24 * 3600  # 604800 — one week
+
+
+def _validate_materialize_section(sections: Dict[str, Dict[str, Any]]) -> None:
+    """Validate the materialize section patch when present.
+
+    Checks field-level constraints that the Pydantic envelope can't enforce
+    (it only validates the outer shape, not nested leaf values).
+    """
+    mat = sections.get("materialize")
+    if not isinstance(mat, dict):
+        return
+    ttl = mat.get("lock_ttl_seconds")
+    if ttl is None:
+        return
+    if not isinstance(ttl, int) or isinstance(ttl, bool):
+        raise HTTPException(
+            status_code=422,
+            detail="materialize.lock_ttl_seconds must be an integer",
+        )
+    if ttl < _LOCK_TTL_MIN or ttl > _LOCK_TTL_MAX:
+        raise HTTPException(
+            status_code=422,
+            detail=(
+                f"materialize.lock_ttl_seconds must be between "
+                f"{_LOCK_TTL_MIN} and {_LOCK_TTL_MAX} "
+                f"(got {ttl})"
+            ),
+        )
+
+
 # --- Server-config (instance.yaml) editor -----------------------------------
 #
 # The /admin/server-config UI POSTs a partial dict here keyed by section
@ -175,6 +207,7 @@ _EDITABLE_SECTIONS: tuple[str, ...] = (
    "openmetadata",
    "desktop",
    "corporate_memory",
+    "materialize",
 )

 # "Danger-zone" sections — flipping these can lock operators out (auth.*) or
@ -585,6 +618,23 @@ _KNOWN_FIELDS: dict[str, dict[str, dict]] = {
            ),
        },
    },
+    # materialize — file-lock TTL for the concurrent-materialize safety net.
+    # A single field; more knobs may follow as the feature matures.
+    "materialize": {
+        "lock_ttl_seconds": {
+            "kind": "int",
+            "default": 86400,
+            "hint": (
+                "How long (seconds) before a stale materialize lock file is "
+                "reclaimed. The lock is a .parquet.lock sibling file; if the "
+                "holder process is hard-killed, the next attempt reclaims the "
+                "lock once the file's mtime is older than this TTL. "
+                "Default 86400 (24 h). Min 60, max 604800 (7 days). "
+                "Lower only if you know materializes never exceed the new value "
+                "and your host regularly hard-kills processes."
+            ),
+        },
+    },
 }

 # Keys whose values must be redacted from the audit diff. We match
@ -913,6 +963,9 @@ async def update_server_config(
    # the per-section patch (e.g. data_source.keboola.stack_url).
    _validate_urls_in_patch(request.sections)

+    # Field-level constraints for sections whose values have documented ranges.
+    _validate_materialize_section(request.sections)
+
    # Defense-in-depth: scrub redaction sentinels (`***` / `<empty>`) out of
    # secret-keyed leaves in the patch before they reach the deep-merge.
    # The client form does the same scrub, but an API caller round-tripping
@ -1169,27 +1222,28 @@ class RegisterTableRequest(BaseModel):
    @model_validator(mode="after")
    def _check_mode_query_coherence(self):
        """Enforce query_mode ↔ source_query invariants up front so an admin
-        can't persist a remote/local row carrying an orphan source_query, and
-        materialized rows can't be registered without a SQL body."""
+        can't persist a remote/local row carrying an orphan source_query.
+
+        For BigQuery materialized rows, an empty source_query is allowed here
+        because _validate_bigquery_register_payload generates it from
+        bucket+source_table after this validator runs. For all other source
+        types (e.g. Keboola), source_query is still required for materialized.
+        """
        sq = (self.source_query or "").strip() or None
-        if self.query_mode == "materialized" and not sq:
-            raise ValueError(
-                "query_mode='materialized' requires a non-empty source_query"
-            )
        if self.query_mode != "materialized" and sq:
            raise ValueError(
                "source_query is only valid when query_mode='materialized'"
            )
-        # The materialize path runs the SQL through DuckDB's parser (BigQuery
-        # extension's COPY pushes it through DuckDB first, and the Keboola
-        # path COPYs the raw SQL through a DuckDB session too). DuckDB does
-        # NOT understand BigQuery-native backtick identifiers — those parse-
-        # error or silently match no rows, leaving no parquet at the
-        # canonical path and no operator-visible failure. Reject at register
-        # time with an actionable message so the bad SQL never lands in
-        # `table_registry.source_query`. See `_run_materialized_pass` for
-        # the runtime path that would otherwise eat the error.
-        if sq and "`" in sq:
+        # Non-BQ materialized rows must supply source_query explicitly — there
+        # is no server-generate fallback for Keboola materialized.
+        if self.query_mode == "materialized" and not sq and self.source_type != "bigquery":
+            raise ValueError(
+                "query_mode='materialized' requires a non-empty source_query"
+            )
+        # Backtick guard stays for non-materialized rows (DuckDB-flavor SQL
+        # contract); materialized SQL is BigQuery-native and MUST allow
+        # backticks for dashed identifiers (e.g. `prj-org.dataset.table`).
+        if self.query_mode != "materialized" and sq and "`" in sq:
            raise ValueError(_BACKTICK_REJECTION_MESSAGE)
        # Normalise: stash the trimmed-or-None form so the persisted column
        # never carries surrounding whitespace or empty-string sentinels.
@ -1232,6 +1286,31 @@ class RegisterTableRequest(BaseModel):
        return v


+def _generate_materialized_source_query(
+    bucket: str, source_table: str, project_id: str,
+) -> str:
+    """Build the canonical full-table-dump source_query for a materialized
+    BQ row when admin only supplies dataset + table. The result is
+    BigQuery-native SQL — wrapped at materialize time into
+    bigquery_query(...) by connectors.bigquery.extractor.materialize_query."""
+    if not _is_safe_quoted_identifier(bucket):
+        raise HTTPException(
+            status_code=400,
+            detail=f"bigquery: dataset {bucket!r} is unsafe",
+        )
+    if not _is_safe_quoted_identifier(source_table):
+        raise HTTPException(
+            status_code=400,
+            detail=f"bigquery: source_table {source_table!r} is unsafe",
+        )
+    if not _is_safe_project_id(project_id):
+        raise HTTPException(
+            status_code=400,
+            detail=f"bigquery: data_source.bigquery.project {project_id!r} is malformed",
+        )
+    return f"SELECT * FROM `{project_id}.{bucket}.{source_table}`"
+
+
 def _validate_bigquery_register_payload(req: "RegisterTableRequest") -> None:
    """Enforce BQ-specific shape on a register/precheck request.

@ -1253,13 +1332,8 @@ def _validate_bigquery_register_payload(req: "RegisterTableRequest") -> None:
    """
    if req.query_mode == "materialized":
        # Materialized BQ rows: the SQL body replaces dataset+table refs.
-        # Pydantic model_validator already verified source_query is non-empty;
-        # all we still need is a valid project_id and a safe view name.
-        if not req.source_query or not req.source_query.strip():
-            raise HTTPException(
-                status_code=422,
-                detail="bigquery materialized: 'source_query' is required",
-            )
+        # source_query may be empty if admin supplied bucket+source_table —
+        # in that case the server generates a full-table-dump SQL below.
        raw_name = req.name or ""
        if raw_name.strip() != raw_name or not _is_safe_identifier(raw_name):
            raise HTTPException(
@ -1271,7 +1345,7 @@ def _validate_bigquery_register_payload(req: "RegisterTableRequest") -> None:
                ),
            )
        from app.instance_config import get_value
-        project_id = get_value("data_source", "bigquery", "project", default="")
+        project_id = get_value("data_source", "bigquery", "project", default="") or ""
        if not project_id:
            raise HTTPException(
                status_code=400,
@ -1290,6 +1364,24 @@ def _validate_bigquery_register_payload(req: "RegisterTableRequest") -> None:
                    "^[a-z][a-z0-9-]{4,28}[a-z0-9]$"
                ),
            )
+
+        if not (req.source_query and req.source_query.strip()):
+            # Server-generate from bucket+source_table. Trivial full-table
+            # dump path; admin only sets dataset+table and the server
+            # builds BQ-native SQL from instance.yaml's configured project.
+            if not (req.bucket and req.source_table):
+                raise HTTPException(
+                    status_code=422,
+                    detail=(
+                        "bigquery materialized requires either source_query "
+                        "(custom SQL) or bucket+source_table (server-generates "
+                        "the full-table-dump SQL)"
+                    ),
+                )
+            req.source_query = _generate_materialized_source_query(
+                req.bucket, req.source_table, project_id,
+            )
+
        # Phase C: profile_after_sync is now inert (Pydantic field marked
        # deprecated; not read by app/api/sync.py:410-438). The runtime
        # profiles every synced table unconditionally, so we no longer
@ -2283,35 +2375,32 @@ async def update_table(

        # Cross-source coherence: query_mode='materialized' requires a
        # non-empty source_query for ALL source types, not just BigQuery.
-        # Pre-fix, only the BQ-specific synthetic-RegisterTableRequest below
-        # caught this — Keboola materialized rows could be PUT without
-        # source_query and persisted with source_query=None, then crash at
-        # the next sync tick when kb_materialize_query received `sql=None`
-        # and DuckDB rejected `COPY (None) TO ...`. Devin finding 2026-05-01:
-        # BUG_pr-review-job-58ae3148_0001.
+        # BQ rows without source_query can be server-generated from
+        # bucket+source_table (handled by _validate_bigquery_register_payload
+        # via the synthetic RegisterTableRequest below). Non-BQ rows (e.g.
+        # Keboola) still require an explicit source_query at PUT time.
        if merged.get("query_mode") == "materialized":
            sq = merged.get("source_query")
            if not sq or not str(sq).strip():
-                raise HTTPException(
-                    status_code=422,
-                    detail=(
-                        "query_mode='materialized' requires a non-empty "
-                        "source_query. To revert to a non-materialized mode, "
-                        "PATCH query_mode='local' (Keboola) or 'remote' "
-                        "(BigQuery) and the stale source_query is cleared "
-                        "automatically."
-                    ),
-                )
-            # Backtick rejection on the merged record — see
-            # `_BACKTICK_REJECTION_MESSAGE` for the rationale. Catches PATCHes
-            # that flip `source_query` to a backtick form on an already-
-            # materialized row, which the synthetic-RegisterTableRequest below
-            # only re-validates for BQ rows. Apply uniformly so Keboola
-            # materialized rows can't carry one either.
-            if "`" in str(sq):
-                raise HTTPException(
-                    status_code=422, detail=_BACKTICK_REJECTION_MESSAGE,
-                )
+                # BQ rows: let _validate_bigquery_register_payload generate
+                # source_query from bucket+source_table (falls through below).
+                # Non-BQ rows: no server-generate fallback; raise 422.
+                if merged.get("source_type") != "bigquery":
+                    raise HTTPException(
+                        status_code=422,
+                        detail=(
+                            "query_mode='materialized' requires a non-empty "
+                            "source_query. To revert to a non-materialized mode, "
+                            "PATCH query_mode='local' (Keboola) or 'remote' "
+                            "(BigQuery) and the stale source_query is cleared "
+                            "automatically."
+                        ),
+                    )
+            # Backtick guard removed for materialized rows: the Task 2 wrapping
+            # path (connectors.bigquery.extractor.materialize_query) now runs
+            # admin SQL through the BQ jobs API using BQ-native syntax, which
+            # requires backticks for dashed project/dataset identifiers.
+            # Non-materialized rows still reject backticks in the model validator.

        if merged.get("source_type") == "bigquery":
            # Reuse the register-time validator. It mutates the request to
--- a/app/api/sync.py
+++ b/app/api/sync.py
@ -20,7 +20,7 @@ from src.repositories.sync_state import SyncStateRepository
 from src.repositories.sync_settings import SyncSettingsRepository
 from src.repositories.table_registry import TableRegistryRepository
 from src.rbac import can_access_table
-from src.scheduler import filter_due_tables
+from src.scheduler import filter_due_tables, is_table_due

 logger = logging.getLogger(__name__)
 router = APIRouter(prefix="/api/sync", tags=["sync"])
@ -74,9 +74,8 @@ def _run_materialized_pass(conn: duckdb.DuckDBPyConnection, bq) -> dict:
    its structured fields so operator alerting can pick out the cap-vs-actual
    bytes from the log line.
    """
-    from src.scheduler import is_table_due
    from app.instance_config import get_value
-    from connectors.bigquery.extractor import MaterializeBudgetError
+    from connectors.bigquery.extractor import MaterializeBudgetError, MaterializeInFlightError

    bq_output_dir = str(Path(_get_data_dir()) / "extracts" / "bigquery")
    kb_output_dir = Path(_get_data_dir()) / "extracts" / "keboola" / "data"
@ -125,7 +124,7 @@ def _run_materialized_pass(conn: duckdb.DuckDBPyConnection, bq) -> dict:
        last_iso = last.isoformat() if last else None
        schedule = row.get("sync_schedule") or "every 1h"
        if not is_table_due(schedule, last_iso):
-            summary["skipped"].append(ref_name)
+            summary["skipped"].append({"table": ref_name, "reason": "due_check"})
            continue

        source_type = row.get("source_type") or "bigquery"  # legacy default
@ -195,6 +194,13 @@ def _run_materialized_pass(conn: duckdb.DuckDBPyConnection, bq) -> dict:
                    ),
                })
                continue
+        except MaterializeInFlightError:
+            # In-flight on a sibling worker / scheduler tick — treat as
+            # 'skipped, in-flight'. Do NOT call state.set_error: that
+            # would flip status='error' on a healthy concurrent run and
+            # the registry UI would surface a false-positive failure.
+            summary["skipped"].append({"table": ref_name, "reason": "in_flight"})
+            continue
        except MaterializeBudgetError as e:
            logger.warning(
                "Materialize cap exceeded for %s: %s bytes > %s bytes",
@ -466,9 +472,13 @@ sys.exit(compute_exit_code(result, len(configs)))
                mat_summary = _run_materialized_pass(mat_conn, bq_access)
            finally:
                mat_conn.close()
+            skipped_count = len(mat_summary["skipped"])
+            in_flight_count = sum(
+                1 for s in mat_summary["skipped"] if s.get("reason") == "in_flight"
+            )
            print(
                f"[SYNC] Materialized SQL: {len(mat_summary['materialized'])} ok, "
-                f"{len(mat_summary['skipped'])} skipped, "
+                f"{skipped_count} skipped (in_flight={in_flight_count}), "
                f"{len(mat_summary['errors'])} errors",
                file=_sys.stderr, flush=True,
            )
--- a/app/web/templates/admin_server_config.html
+++ b/app/web/templates/admin_server_config.html
@ -218,6 +218,10 @@ const SECTION_META = {
    title: "Corporate Memory",
    help: "Optional governance for AI-extracted knowledge. When the section is unset, the system runs in legacy democratic-wiki mode with no admin review.",
  },
+  materialize: {
+    title: "Materialize",
+    help: "Concurrency safety net for the materialize path. Controls the file-lock TTL used to detect and reclaim stale locks from hard-killed processes.",
+  },
 };
 const DANGER_SECTIONS = new Set(["auth", "server"]);

--- a/config/instance.yaml.example
+++ b/config/instance.yaml.example
@ -403,3 +403,19 @@ catalog:
 #   schema_cache_ttl_seconds: 3600    # /api/v2/schema/{table_id} cache lifetime (default: 1 h)
 #   sample_cache_ttl_seconds: 3600    # /api/v2/sample/{table_id} cache lifetime (default: 1 h)
 #                                     # Admins can force-refresh via POST /api/v2/sample/{id}?refresh=true
+
+# --- Materialize concurrency safety (optional) ---
+# Concurrency safety net for the materialize path (BQ + Keboola). When
+# two materialize attempts race for the same table_id, the second one
+# raises MaterializeInFlightError and skips. The lock is held in a
+# .parquet.lock sibling file; if a holder process is hard-killed before
+# kernel-level flock release, the next attempt reclaims the lock once
+# the file's mtime is older than this TTL.
+#
+# Default 86400 (24h) is generous on purpose — anything shorter risks
+# a long-running COPY being interrupted by its own scheduler successor.
+# Lower it only if you know your materialize never exceeds the new
+# value AND your host has a habit of hard-killing processes.
+# Min 60 (1 minute), max 604800 (7 days). Configurable via /admin/server-config UI.
+materialize:
+  lock_ttl_seconds: 86400
--- a/connectors/bigquery/extractor.py
+++ b/connectors/bigquery/extractor.py
@ -3,16 +3,29 @@
 No data is downloaded. All queries go directly to BigQuery via DuckDB extension ATTACH.
 """

+import fcntl
+import hashlib
 import logging
 import os
+import re
 import shutil
 import threading
+import time
 from datetime import datetime, timezone
 from pathlib import Path
 from typing import List, Dict, Any, Optional

 import duckdb

+from connectors.bigquery.auth import get_metadata_token, BQMetadataAuthError
+from src.sql_safe import (
+    validate_identifier as _validate_identifier,
+    validate_project_id as _validate_project_id,
+)
+from src.identifier_validation import validate_identifier, validate_quoted_identifier
+
+logger = logging.getLogger(__name__)
+
 # Serializes the body of `init_extract` across threads so two concurrent
 # materialize calls (e.g. the synchronous timeout-fallback BackgroundTask
 # kicking in while the original daemon thread is still running) can't both
@ -21,15 +34,127 @@ import duckdb
 # not the per-source extract-file write, so we need a dedicated lock here.
 _INIT_EXTRACT_LOCK = threading.Lock()

-from connectors.bigquery.auth import get_metadata_token, BQMetadataAuthError
-from app.instance_config import get_value
-from src.sql_safe import (
-    validate_identifier as _validate_identifier,
-    validate_project_id as _validate_project_id,
-)
-from src.identifier_validation import validate_identifier, validate_quoted_identifier
+_LOCK_TTL_DEFAULT_SECONDS: int = 86400  # 24h — overridable via materialize.lock_ttl_seconds

-logger = logging.getLogger(__name__)
+
+class MaterializeInFlightError(Exception):
+    """Raised when a per-table_id materialize is already running.
+
+    Caller (`_run_materialized_pass`) should treat this as a 'skipped,
+    in-flight' outcome — the in-flight worker will finish and write
+    sync_state on its own. Critically, this is NOT an error condition;
+    `state.set_error` MUST NOT be called for this exception or the
+    registry would surface a false-positive failure to the operator
+    every overlap."""
+
+    def __init__(self, table_id: str, layer: str = "process"):
+        self.table_id = table_id
+        self.layer = layer
+        super().__init__(
+            f"materialize for {table_id!r} already in flight ({layer} lock held)"
+        )
+
+
+# Unbounded by design — each registered table_id gets one Lock for the
+# process lifetime. Per-Lock cost is ~56 bytes; a deployment with even
+# 10k registered tables holds <1 MB. No cleanup logic — clean would
+# need ref-counting and risks freeing a Lock currently held by a worker.
+_table_locks: dict[str, threading.Lock] = {}
+_table_locks_registry: threading.Lock = threading.Lock()
+
+
+def _get_table_lock(table_id: str) -> threading.Lock:
+    """Return the process-wide mutex for a given table_id, creating it
+    on first reference. The registry mutex serializes the dict mutation
+    only — once the per-id Lock is returned, contention between callers
+    happens on that lock alone."""
+    with _table_locks_registry:
+        lock = _table_locks.get(table_id)
+        if lock is None:
+            lock = threading.Lock()
+            _table_locks[table_id] = lock
+        return lock
+
+
+def _get_lock_ttl_seconds() -> int:
+    """Read the configured stale-lock TTL with fallback to the default.
+
+    Operator override lives at instance.yaml `materialize.lock_ttl_seconds`
+    (also editable via /admin/server-config). Default 86400 s = 24 h
+    matches the upper bound of any healthy BQ COPY in practice — anything
+    longer is a stuck process or a hung BQ session, both of which warrant
+    reclaim on next attempt."""
+    try:
+        # Deferred import: keeps the connectors module importable in
+        # contexts where the app layer isn't bootstrapped (e.g. unit tests
+        # that exercise extractor helpers without the FastAPI app).
+        from app.instance_config import get_value
+        v = get_value(
+            "materialize", "lock_ttl_seconds",
+            default=_LOCK_TTL_DEFAULT_SECONDS,
+        )
+        n = int(v) if v is not None else _LOCK_TTL_DEFAULT_SECONDS
+        return n if n > 0 else _LOCK_TTL_DEFAULT_SECONDS
+    except Exception:
+        return _LOCK_TTL_DEFAULT_SECONDS
+
+
+def _try_acquire_file_lock(lock_path: Path):
+    """Try to acquire an advisory exclusive flock on `lock_path`. Returns
+    the open file object on success (caller must close to release); None
+    on conflict.
+
+    Stale-lock reclaim: if the lock_path exists and its mtime is older
+    than the configured TTL, log a warning and unlink before retrying.
+    A live holder still wins the second flock attempt (kernel-level
+    flock isn't tied to mtime), so the reclaim doesn't break correctness
+    — it just unblocks the case where a holder process was hard-killed
+    before the kernel released the lock."""
+    lock_path.parent.mkdir(parents=True, exist_ok=True)
+
+    def _try_open_and_flock():
+        # Open in 'w' mode so the file's mtime updates on every successful
+        # acquisition — the mtime is the TTL signal for the next caller.
+        # Content is intentionally empty; the fd exists only to anchor flock.
+        f = open(lock_path, "w")
+        try:
+            fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
+            return f
+        except BlockingIOError:
+            # Another holder owns the lock — return None so the caller can
+            # decide between TTL-reclaim and propagating MaterializeInFlightError.
+            f.close()
+            return None
+        except OSError:
+            # Anything else (read-only fs, unsupported, fd exhaustion) is a
+            # platform / config error, not a contention signal. Close the fd
+            # and re-raise so the caller (and operator) sees the real failure
+            # instead of a silent leak.
+            f.close()
+            raise
+
+    holder = _try_open_and_flock()
+    if holder is not None:
+        return holder
+
+    # Conflict. If the file is older than TTL, reclaim and retry once.
+    try:
+        age = time.time() - lock_path.stat().st_mtime
+    except FileNotFoundError:
+        return _try_open_and_flock()
+
+    if age > _get_lock_ttl_seconds():
+        logger.warning(
+            "Reclaiming stale materialize lock at %s (age %.1fs > TTL)",
+            lock_path, age,
+        )
+        try:
+            lock_path.unlink()
+        except FileNotFoundError:
+            pass
+        return _try_open_and_flock()
+
+    return None


 def _detect_table_type(
@ -59,6 +184,56 @@ def _detect_table_type(
    return row[0] if row else None


+_BILLING_PROJECT_RE = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")
+
+
+def _escape_sql_string_literal(s: str) -> str:
+    """Double every single quote so the result is safe to embed inside a
+    single-quoted SQL string literal. DuckDB and BigQuery both honor the
+    SQL standard `''` escape inside `'...'`. Used to wrap admin
+    source_query into bigquery_query()'s second arg without breaking
+    the literal envelope."""
+    return s.replace("'", "''")
+
+
+def _wrap_admin_sql_for_jobs_api(billing_project: str, inner_sql: str) -> str:
+    """Build the COPY-source SQL that runs admin's `inner_sql` through
+    the BigQuery jobs API via the DuckDB BQ extension's
+    ``bigquery_query()`` table function.
+
+    Why: the default `bq."ds"."t"` reference path uses the BQ Storage
+    Read API which rejects non-base entities (views, materialized views).
+    Routing through `bigquery_query()` uses the jobs API which accepts
+    every entity type uniformly.
+
+    Args:
+        billing_project: GCP project ID that bills the BQ job. Must
+            match the GCP project_id grammar — anything else is rejected
+            as a defense-in-depth check (admin is trusted, but a typo
+            should fail closed not silently lose budget to the wrong
+            project).
+        inner_sql: BigQuery-flavor SQL the admin registered as
+            ``source_query``. Should be BigQuery-native; DuckDB-flavor
+            `bq."ds"."t"` references are not enforced here but will fail at
+            COPY time inside the BQ jobs API. Existing rows are converted by
+            the v24 schema migration; new rows are validated upstream at
+            register/PUT.
+
+    Returns:
+        A DuckDB-parseable SQL fragment suitable as the operand of
+        ``COPY (...) TO 'path' (FORMAT PARQUET)``.
+    """
+    if not _BILLING_PROJECT_RE.match(billing_project):
+        raise ValueError(
+            f"billing_project {billing_project!r} is not a valid GCP project_id "
+            "(grammar: ^[a-z][a-z0-9-]{4,28}[a-z0-9]$)"
+        )
+    return (
+        f"SELECT * FROM bigquery_query('{billing_project}', "
+        f"'{_escape_sql_string_literal(inner_sql)}')"
+    )
+
+
 def _create_meta_table(conn: duckdb.DuckDBPyConnection) -> None:
    """Create the _meta table required by the extract.duckdb contract."""
    conn.execute("DROP TABLE IF EXISTS _meta")
@ -321,33 +496,42 @@ def materialize_query(
    to `<output_dir>/data/<table_id>.parquet` atomically.

    Designed for `query_mode='materialized'` table_registry rows. The SQL
-    is admin-registered (validated upstream) and may reference DuckDB
-    three-part identifiers (`bq."dataset"."table"`) resolved by the
-    in-session ATTACH, OR native BQ identifiers via the `bigquery_query()`
-    table function — both work because the session has the bigquery
-    extension loaded with a SECRET token.
+    is admin-registered BQ-native SQL (DuckDB-flavor `bq."ds"."t"` refs are
+    validated upstream). The SQL is wrapped in `bigquery_query('<billing>',
+    '<inner>')` before the COPY so the BQ extension routes through the BQ
+    jobs API — the default Storage Read API path rejects non-base entities
+    (views, materialized views) with "non-table entities cannot be read with
+    the storage API". Routing through `bigquery_query()` works uniformly for
+    base tables and views alike.

    Cost guardrail: when `max_bytes` is a positive int, run a BQ dry-run
    via `bq.client()` first; raise `MaterializeBudgetError` if the
    estimate exceeds the cap. `max_bytes=None` or `max_bytes <= 0`
    disables the guardrail (config sentinel, see
-    `data_source.bigquery.max_bytes_per_materialize`).
+    `data_source.bigquery.max_bytes_per_materialize`). The dry-run operates
+    on the inner `sql` (BQ-native), not the wrapped form.

-    Dry-run is best-effort and fail-open: if the SQL uses DuckDB syntax
-    that the native BQ client can't parse (e.g. `bq."ds"."t"`), the
-    dry-run raises and we log a warning; the COPY still runs. This
-    matches the BqAccess facade's "client is for native BQ SQL only"
-    contract — operators who need the cap to engage write the registered
-    SQL using native BQ identifiers (`\\`project.ds.t\\``).
+    Dry-run is best-effort and fail-open: if the dry-run errors (transient
+    upstream failure, missing google lib), we log a warning and proceed
+    with the wrapped COPY.

    Atomic write: result lands in `<id>.parquet.tmp` first, then
    `os.replace` swaps it in. A failed COPY leaves no partial file behind.

+    Concurrency: per-``table_id`` in-process mutex + advisory file lock
+    on ``<table_id>.parquet.lock``. Overlapping calls for the same id
+    raise ``MaterializeInFlightError`` immediately so the caller can
+    skip cleanly without consuming the COPY budget twice. Stale file
+    locks (mtime > ``materialize.lock_ttl_seconds``, default 24 h) are
+    reclaimed automatically.
+
    Args:
        table_id: Logical id from table_registry; becomes the parquet
            filename. Must pass `validate_identifier()` so it can't
            inject path traversal.
-        sql: SELECT statement, no trailing semicolon.
+        sql: BQ-native SELECT statement, no trailing semicolon. Wrapped
+            in `bigquery_query()` before the COPY — must not itself
+            contain a `bigquery_query()` call.
        bq: A `BqAccess` instance — provides `duckdb_session()` for the
            COPY and `client()` for the dry-run.
        output_dir: Connector root, e.g. `/data/extracts/bigquery`.
@ -358,7 +542,10 @@ def materialize_query(
        {"rows": int, "size_bytes": int, "query_mode": "materialized"}

    Raises:
-        ValueError: if `table_id` is unsafe.
+        ValueError: if `table_id` is unsafe or `bq.projects.billing` fails
+            the GCP project_id grammar check.
+        MaterializeInFlightError: if a concurrent call for the same table_id
+            is already in progress (in-process or cross-process).
        MaterializeBudgetError: if `max_bytes > 0` and dry-run estimate exceeds it.
        BqAccessError: from `bq.duckdb_session()` (auth_failed / bq_lib_missing /
            not_configured) — caller catches and aggregates into the trigger
@ -374,99 +561,114 @@ def materialize_query(

    parquet_path = data_dir / f"{table_id}.parquet"
    tmp_path = data_dir / f"{table_id}.parquet.tmp"
-    if tmp_path.exists():
-        tmp_path.unlink()
+    lock_path = data_dir / f"{table_id}.parquet.lock"

-    # Cost guardrail (best-effort — fail-open if dry-run can't parse the SQL).
-    if max_bytes is not None and max_bytes > 0:
+    proc_lock = _get_table_lock(table_id)
+    if not proc_lock.acquire(blocking=False):
+        raise MaterializeInFlightError(table_id, layer="process")
+    try:
+        file_lock = _try_acquire_file_lock(lock_path)
+        if file_lock is None:
+            raise MaterializeInFlightError(table_id, layer="file")
        try:
-            from app.api.v2_scan import _bq_dry_run_bytes  # reuse main's impl
-            estimated = _bq_dry_run_bytes(bq, sql)
-        except Exception as e:
-            logger.warning(
-                "BQ dry-run failed for materialize cost guardrail (fail-open): %s. "
-                "If the SQL uses DuckDB three-part names like bq.\"ds\".\"t\", "
-                "rewrite to native BQ identifiers (`project.ds.t`) for the "
-                "guardrail to engage. Proceeding with COPY.",
-                e,
-            )
-            estimated = 0
-        if estimated > max_bytes:
-            raise MaterializeBudgetError(
-                f"dry-run estimate {estimated:,} bytes exceeds cap "
-                f"{max_bytes:,} for table {table_id!r}",
-                table_id=table_id,
-                current=estimated,
-                limit=max_bytes,
-            )
-
-    # COPY through a BqAccess-managed session.
-    with bq.duckdb_session() as conn:
-        # ATTACH the data project — but only when no `bq` catalog is
-        # already attached. Production sessions (real BqAccess) come with
-        # only `:memory:` and need the ATTACH; test sessions pre-populate
-        # `bq` as a fixture catalog and would error on a redundant ATTACH
-        # (alias already in use) AND on the bigquery extension load when
-        # the test runner has no cached extension. Detecting via
-        # `duckdb_databases()` keeps the ATTACH path idempotent without
-        # swallowing real errors (auth, cross-project permission,
-        # malformed project_id) — those still propagate from the actual
-        # ATTACH call.
-        attached = {
-            r[0] for r in conn.execute(
-                "SELECT database_name FROM duckdb_databases()"
-            ).fetchall()
-        }
-        if "bq" not in attached:
-            conn.execute(
-                f"ATTACH 'project={bq.projects.data}' AS bq (TYPE bigquery, READ_ONLY)"
-            )
-
-        try:
-            safe_path = str(tmp_path).replace("'", "''")
-            conn.execute(f"COPY ({sql}) TO '{safe_path}' (FORMAT PARQUET)")
-            rows = conn.execute(
-                f"SELECT count(*) FROM read_parquet('{safe_path}')"
-            ).fetchone()[0]
-        except Exception:
            if tmp_path.exists():
                tmp_path.unlink()
-            raise

-    # Compute the parquet hash inline before the atomic swap. The caller used
-    # to re-read the file in `_run_materialized_pass` to hash it via
-    # `_file_hash`, but that's a synchronous full-read on the FastAPI worker
-    # thread — a 10 GiB parquet means 50+ seconds of disk I/O blocking other
-    # requests. Hashing here keeps the open-file handle hot from the COPY
-    # round and removes the second read. Devil's-advocate review item.
-    import hashlib
-    h = hashlib.md5()
-    with open(tmp_path, "rb") as f:
-        for chunk in iter(lambda: f.read(8192), b""):
-            h.update(chunk)
-    parquet_hash = h.hexdigest()
+            # Build the wrapped SQL once — both the cost guardrail dry-run and
+            # the COPY operate on `sql` (the inner BQ SQL); only the COPY needs
+            # the DuckDB-side bigquery_query() envelope.
+            billing_project = bq.projects.billing
+            wrapped_sql = _wrap_admin_sql_for_jobs_api(billing_project, sql)

-    size_bytes = tmp_path.stat().st_size
-    os.replace(tmp_path, parquet_path)
+            if max_bytes is not None and max_bytes > 0:
+                try:
+                    from app.api.v2_scan import _bq_dry_run_bytes  # reuse main's impl
+                    estimated = _bq_dry_run_bytes(bq, sql)  # NB: pass inner SQL (BQ-native)
+                except Exception as e:
+                    logger.warning(
+                        "BQ dry-run failed for materialize cost guardrail (fail-open): %s. "
+                        "Proceeding with COPY against `bigquery_query()` wrapping.",
+                        e,
+                    )
+                    estimated = 0
+                if estimated > max_bytes:
+                    raise MaterializeBudgetError(
+                        f"dry-run estimate {estimated:,} bytes exceeds cap "
+                        f"{max_bytes:,} for table {table_id!r}",
+                        table_id=table_id,
+                        current=estimated,
+                        limit=max_bytes,
+                    )

-    rows = int(rows)
-    if rows == 0:
-        # 0 rows is indistinguishable from "the SQL is wrong and nobody
-        # noticed" — surface it loudly so operators see it in the scheduler
-        # log line and the per-row error aggregation. Caller decides whether
-        # to alert.
-        logger.warning(
-            "Materialized %s produced 0 rows — verify the SQL filter is "
-            "intentional. Parquet written: %s",
-            table_id, parquet_path,
-        )
+            # COPY through a BqAccess-managed session. The session has the BQ
+            # extension loaded with a SECRET token; bigquery_query() reuses that
+            # auth path against the billing_project for the jobs API call.
+            with bq.duckdb_session() as conn:
+                attached = {
+                    r[0] for r in conn.execute(
+                        "SELECT database_name FROM duckdb_databases()"
+                    ).fetchall()
+                }
+                if "bq" not in attached:
+                    conn.execute(
+                        f"ATTACH 'project={bq.projects.data}' AS bq (TYPE bigquery, READ_ONLY)"
+                    )

-    return {
-        "rows": rows,
-        "size_bytes": size_bytes,
-        "query_mode": "materialized",
-        "hash": parquet_hash,
-    }
+                try:
+                    safe_path = _escape_sql_string_literal(str(tmp_path))
+                    conn.execute(
+                        f"COPY ({wrapped_sql}) TO '{safe_path}' (FORMAT PARQUET)"
+                    )
+                    rows = conn.execute(
+                        f"SELECT count(*) FROM read_parquet('{safe_path}')"
+                    ).fetchone()[0]
+                except Exception:
+                    if tmp_path.exists():
+                        tmp_path.unlink()
+                    raise
+
+            # Compute the parquet hash inline before the atomic swap. The caller used
+            # to re-read the file in `_run_materialized_pass` to hash it via
+            # `_file_hash`, but that's a synchronous full-read on the FastAPI worker
+            # thread — a 10 GiB parquet means 50+ seconds of disk I/O blocking other
+            # requests. Hashing here keeps the open-file handle hot from the COPY
+            # round and removes the second read. Devil's-advocate review item.
+            h = hashlib.md5()
+            with open(tmp_path, "rb") as f:
+                for chunk in iter(lambda: f.read(8192), b""):
+                    h.update(chunk)
+            parquet_hash = h.hexdigest()
+
+            size_bytes = tmp_path.stat().st_size
+            os.replace(tmp_path, parquet_path)
+
+            rows = int(rows)
+            if rows == 0:
+                # 0 rows is indistinguishable from "the SQL is wrong and nobody
+                # noticed" — surface it loudly so operators see it in the scheduler
+                # log line and the per-row error aggregation. Caller decides whether
+                # to alert.
+                logger.warning(
+                    "Materialized %s produced 0 rows — verify the SQL filter is "
+                    "intentional. Parquet written: %s",
+                    table_id, parquet_path,
+                )
+
+            return {
+                "rows": rows,
+                "size_bytes": size_bytes,
+                "query_mode": "materialized",
+                "hash": parquet_hash,
+            }
+        finally:
+            try:
+                file_lock.close()  # releases flock
+            except Exception:
+                pass
+            # Don't unlink lock_path — its mtime is the TTL signal for
+            # the next reclaim. Leaving it in place is intentional.
+    finally:
+        proc_lock.release()


 def _resolve_bq_project_id() -> str:
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,6 +1,6 @@
 [project]
 name = "agnes-the-ai-analyst"
-version = "0.32.0"
+version = "0.33.0"
 description = "Agnes — AI Data Analyst platform for AI analytical systems"
 requires-python = ">=3.11,<3.14"
 license = "MIT"
--- a/src/db.py
+++ b/src/db.py
@ -39,7 +39,7 @@ def _maybe_instrument(con, db_tag: str):

 _SAFE_IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,63}$")

-SCHEMA_VERSION = 23
+SCHEMA_VERSION = 24

 _SYSTEM_SCHEMA = """
 CREATE TABLE IF NOT EXISTS schema_version (
@ -1682,6 +1682,81 @@ _V22_TO_V23_MIGRATIONS = [
 ]


+# v24: rewrite materialized BQ source_query from DuckDB-flavor
+# (bq."<dataset>"."<table>") to BigQuery-native (`<project>.<dataset>.<table>`)
+# so the new connectors.bigquery.extractor.materialize_query wrapping
+# path (which routes through bigquery_query() / BQ jobs API) accepts
+# them. Pre-v24, materialize used Storage Read API for the bq.<ds>.<tbl>
+# form, which fails for views — see PR for full motivation.
+#
+# This migration is implemented in Python (not pure SQL) because the
+# rewrite is a regex-and-replace per row: the project_id comes from
+# instance_config (file/env), not the DB. SQL alone can't pull the
+# project_id and substitute it. If the project isn't configured at
+# migration time, log a warning per affected row and leave them — the
+# operator must configure data_source.bigquery.project, restart, and
+# the migration will fire on next start (idempotent).
+def _replace_for_v24(project_id: str):
+    """Build a re.sub replacement function (not a string) so backslash
+    sequences in `project_id` aren't interpreted as group references.
+    GCP project IDs can't actually contain backslashes, but using a
+    function-form replacement is the defensive idiom — it makes the
+    intent explicit and removes the dependency on re.sub's replacement-
+    string escaping rules."""
+    def _repl(m):
+        return f"`{project_id}.{m.group(1)}.{m.group(2)}`"
+    return _repl
+
+
+def _v23_to_v24_finalize(conn: duckdb.DuckDBPyConnection) -> None:
+    import re as _re
+
+    try:
+        from app.instance_config import get_value
+        project_id = get_value("data_source", "bigquery", "project", default="") or ""
+    except Exception:
+        project_id = ""
+
+    pattern = _re.compile(r'bq\."([^"]+)"\."([^"]+)"')
+
+    rows = conn.execute(
+        "SELECT id, source_query FROM table_registry "
+        "WHERE query_mode = 'materialized' "
+        "AND source_query LIKE '%bq.\"%' "
+        "AND source_type = 'bigquery'"
+    ).fetchall()
+
+    if not rows:
+        return  # Nothing to migrate; skip the transaction.
+
+    conn.execute("BEGIN TRANSACTION")
+    try:
+        for row_id, sq in rows:
+            if sq is None:
+                continue
+            if not project_id:
+                logger.warning(
+                    "v24 migration: skipping rewrite of source_query for row %r — "
+                    "data_source.bigquery.project is not configured. Set it via "
+                    "/admin/server-config and restart the app to retry the "
+                    "migration.", row_id,
+                )
+                continue
+            new_sq = pattern.sub(_replace_for_v24(project_id), sq)
+            if new_sq != sq:
+                conn.execute(
+                    "UPDATE table_registry SET source_query = ? WHERE id = ?",
+                    [new_sq, row_id],
+                )
+                logger.info(
+                    "v24 migration: rewrote source_query for row %r", row_id,
+                )
+        conn.execute("COMMIT")
+    except Exception:
+        conn.execute("ROLLBACK")
+        raise
+
+
 def _ensure_schema(conn: duckdb.DuckDBPyConnection) -> None:
    """Create tables if they don't exist. Apply migrations if schema version changed.

@ -1837,6 +1912,8 @@ def _ensure_schema(conn: duckdb.DuckDBPyConnection) -> None:
            if current < 23:
                for sql in _V22_TO_V23_MIGRATIONS:
                    conn.execute(sql)
+            if current < 24:
+                _v23_to_v24_finalize(conn)
            conn.execute(
                "UPDATE schema_version SET version = ?, applied_at = current_timestamp",
                [SCHEMA_VERSION],
--- a/tests/conftest.py
+++ b/tests/conftest.py
@ -2,6 +2,7 @@

 import os
 from pathlib import Path
+from unittest.mock import MagicMock

 import duckdb
 import pytest
@ -319,3 +320,68 @@ from tests.fixtures.analyst_bootstrap import (  # noqa: E402,F401
    web_session,
    zero_grants_workspace,
 )
+
+
+@pytest.fixture
+def bq_instance(monkeypatch):
+    """Force instance.yaml to look like a BigQuery deployment for the
+    duration of one test. Patches the cached load_instance_config so
+    /admin/server-config reads / get_value('data_source.bigquery.project')
+    return what we want, without touching the on-disk instance.yaml.
+
+    Tests that need BigQuery-specific admin API behaviour (project_id
+    validation, materialized source_query checks, etc.) depend on this
+    fixture. Yields the fake config dict so callers can inspect it.
+
+    Note: several test files (test_admin_bq_register.py,
+    test_admin_tables_ui_materialized.py, …) define their own local
+    ``bq_instance`` fixture. Those local definitions shadow this one
+    inside those files — the conftest copy is the canonical provider for
+    any new test file that imports from this module."""
+    fake_cfg = {
+        "data_source": {
+            "type": "bigquery",
+            "bigquery": {"project": "my-test-project", "location": "us"},
+        },
+    }
+    monkeypatch.setattr(
+        "app.instance_config.load_instance_config",
+        lambda: fake_cfg,
+        raising=False,
+    )
+    from app.instance_config import reset_cache
+    reset_cache()
+    yield fake_cfg
+    reset_cache()
+
+
+@pytest.fixture
+def stub_bq_extractor(monkeypatch):
+    """Mirror tests/test_admin_bq_register.py — bypasses real-BQ traffic
+    in the post-register rebuild path so the test stays offline. Required
+    whenever the test seeds a remote-mode BQ row via the HTTP API.
+
+    Patches:
+    - ``connectors.bigquery.extractor.rebuild_from_registry`` — returns a
+      minimal success dict so the admin register endpoint's 200/201 path
+      completes without touching a real BQ project.
+    - ``src.orchestrator.SyncOrchestrator`` — replaced with a no-op mock so
+      the post-register orchestrator.rebuild() call doesn't scan the
+      (empty) extracts directory during tests.
+
+    Returns the ``rebuild_from_registry`` MagicMock directly so callers
+    that only need the side-effect patcher can ignore the return value,
+    and callers that want to assert call args can inspect it."""
+    rebuild_mock = MagicMock(return_value={
+        "project_id": "my-test-project",
+        "tables_registered": 1, "errors": [], "skipped": False,
+    })
+    monkeypatch.setattr(
+        "connectors.bigquery.extractor.rebuild_from_registry",
+        rebuild_mock,
+    )
+    monkeypatch.setattr(
+        "src.orchestrator.SyncOrchestrator",
+        lambda *a, **kw: MagicMock(),
+    )
+    return rebuild_mock
--- a/tests/test_admin_register_materialized_server_generated_sql.py
+++ b/tests/test_admin_register_materialized_server_generated_sql.py
@ -0,0 +1,90 @@
+"""When admin registers a materialized BQ row with bucket+source_table
+but NO source_query, the server generates the source_query from the
+configured BQ project + the supplied bucket/source_table. Admin never
+has to know about bigquery_query() syntax for the trivial full-table
+dump case.
+
+Fixtures `seeded_app`, `bq_instance`, `stub_bq_extractor` are auto-
+discovered from `tests/conftest.py` — DO NOT import. `seeded_app`
+is a dict: `{"client": TestClient, "admin_token": str, ...}`.
+"""
+from __future__ import annotations
+
+import pytest
+
+
+def _auth(token: str) -> dict:
+    """Mirror the project's local _auth helper used in every materialized
+    test file (e.g. test_api_admin_materialized.py)."""
+    return {"Authorization": f"Bearer {token}"}
+
+
+def test_register_materialized_with_bucket_only_generates_source_query(
+    seeded_app, bq_instance, stub_bq_extractor,
+):
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    payload = {
+        "name": "trivial_full_dump",
+        "source_type": "bigquery",
+        "query_mode": "materialized",
+        "bucket": "analytics",
+        "source_table": "orders_v2",
+    }
+    resp = client.post("/api/admin/register-table", json=payload, headers=headers)
+    assert resp.status_code in (200, 201, 202), resp.text
+
+    reg = client.get("/api/admin/registry", headers=headers).json()
+    row = next(t for t in reg["tables"] if t["id"] == "trivial_full_dump")
+    expected_project = bq_instance["data_source"]["bigquery"]["project"]
+    assert row["source_query"] == (
+        f"SELECT * FROM `{expected_project}.analytics.orders_v2`"
+    )
+
+
+def test_register_materialized_with_explicit_source_query_persists_verbatim(
+    seeded_app, bq_instance, stub_bq_extractor,
+):
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    custom = "SELECT col1, col2 FROM `analytics.orders_v2` WHERE col3 = 'x'"
+    payload = {
+        "name": "explicit_sql",
+        "source_type": "bigquery",
+        "query_mode": "materialized",
+        "source_query": custom,
+    }
+    resp = client.post("/api/admin/register-table", json=payload, headers=headers)
+    assert resp.status_code in (200, 201, 202), resp.text
+
+    reg = client.get("/api/admin/registry", headers=headers).json()
+    row = next(t for t in reg["tables"] if t["id"] == "explicit_sql")
+    assert row["source_query"] == custom
+
+
+def test_put_flip_to_materialized_with_bucket_generates_source_query(
+    seeded_app, bq_instance, stub_bq_extractor,
+):
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    # First register as remote.
+    client.post(
+        "/api/admin/register-table",
+        json={"name": "flip_t", "source_type": "bigquery",
+              "bucket": "analytics", "source_table": "orders_v2"},
+        headers=headers,
+    )
+    # PUT to flip to materialized without supplying source_query.
+    resp = client.put(
+        "/api/admin/registry/flip_t",
+        json={"query_mode": "materialized"},
+        headers=headers,
+    )
+    assert resp.status_code == 200, resp.text
+
+    reg = client.get("/api/admin/registry", headers=headers).json()
+    row = next(t for t in reg["tables"] if t["id"] == "flip_t")
+    expected_project = bq_instance["data_source"]["bigquery"]["project"]
+    assert row["source_query"] == (
+        f"SELECT * FROM `{expected_project}.analytics.orders_v2`"
+    )
--- a/tests/test_admin_server_config_materialize_section.py
+++ b/tests/test_admin_server_config_materialize_section.py
@ -0,0 +1,179 @@
+"""/api/admin/server-config exposes materialize.lock_ttl_seconds and
+accepts updates. Default is 86400 (24h).
+
+Fixture `seeded_app` is auto-discovered from `tests/conftest.py` —
+DO NOT import. It returns a dict: `{"client": TestClient,
+"admin_token": str, ...}`. Auth helper `_auth(token)` mirrors the
+project's local pattern (also used in test_api_admin_materialized.py).
+
+Behaviour contract:
+  - GET returns `materialize` section in `sections` (empty dict when no
+    override is set, since the endpoint surfaces every editable section).
+  - GET also exposes the known_fields registry entry for `materialize`
+    with `lock_ttl_seconds` spec (kind=int, default=86400).
+  - POST with a valid value persists it and GET returns the new value.
+  - POST with lock_ttl_seconds < 60 or > 604800 is rejected with 422.
+"""
+from __future__ import annotations
+
+import pytest
+import yaml
+
+
+def _auth(token: str) -> dict:
+    return {"Authorization": f"Bearer {token}"}
+
+
+# ---------------------------------------------------------------------------
+# GET — default state
+# ---------------------------------------------------------------------------
+
+
+def test_get_returns_materialize_in_editable_sections(seeded_app):
+    """materialize must appear in editable_sections."""
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    resp = client.get("/api/admin/server-config", headers=headers)
+    assert resp.status_code == 200
+    body = resp.json()
+    assert "materialize" in body["editable_sections"]
+
+
+def test_get_returns_materialize_section_key(seeded_app):
+    """materialize key appears in sections (empty dict when no override set)."""
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    resp = client.get("/api/admin/server-config", headers=headers)
+    assert resp.status_code == 200
+    body = resp.json()
+    # The endpoint surfaces every editable section so the UI can render it.
+    assert "materialize" in body["sections"]
+
+
+def test_get_returns_materialize_known_fields(seeded_app):
+    """known_fields must have a materialize.lock_ttl_seconds entry."""
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    resp = client.get("/api/admin/server-config", headers=headers)
+    assert resp.status_code == 200
+    body = resp.json()
+    mat_fields = body.get("known_fields", {}).get("materialize", {})
+    assert "lock_ttl_seconds" in mat_fields, body.get("known_fields", {})
+    spec = mat_fields["lock_ttl_seconds"]
+    assert spec["kind"] == "int"
+    assert spec["default"] == 86400
+
+
+# ---------------------------------------------------------------------------
+# POST — update and read back
+# ---------------------------------------------------------------------------
+
+
+def test_put_updates_materialize_lock_ttl(seeded_app, tmp_path, monkeypatch):
+    """POST with a valid value persists; GET reflects the new value."""
+    monkeypatch.setenv("DATA_DIR", str(tmp_path))
+    state = tmp_path / "state"
+    state.mkdir(parents=True, exist_ok=True)
+    import app.instance_config as ic
+    ic._instance_config = None
+    try:
+        client = seeded_app["client"]
+        headers = _auth(seeded_app["admin_token"])
+        resp = client.post(
+            "/api/admin/server-config",
+            json={"sections": {"materialize": {"lock_ttl_seconds": 3600}}},
+            headers=headers,
+        )
+        assert resp.status_code == 200, resp.text
+
+        # Verify on disk.
+        loaded = yaml.safe_load((state / "instance.yaml").read_text())
+        assert loaded["materialize"]["lock_ttl_seconds"] == 3600
+
+        # Verify GET reflects the new value.
+        ic._instance_config = None
+        resp2 = client.get("/api/admin/server-config", headers=headers)
+        assert resp2.json()["sections"]["materialize"]["lock_ttl_seconds"] == 3600
+    finally:
+        ic._instance_config = None
+
+
+# ---------------------------------------------------------------------------
+# POST — validation
+# ---------------------------------------------------------------------------
+
+
+def test_invalid_lock_ttl_below_min_rejected(seeded_app):
+    """lock_ttl_seconds < 60 is rejected with 422."""
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    resp = client.post(
+        "/api/admin/server-config",
+        json={"sections": {"materialize": {"lock_ttl_seconds": -5}}},
+        headers=headers,
+    )
+    assert resp.status_code == 422, resp.text
+
+
+def test_invalid_lock_ttl_zero_rejected(seeded_app):
+    """lock_ttl_seconds=0 is rejected with 422 (below the 60s floor)."""
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    resp = client.post(
+        "/api/admin/server-config",
+        json={"sections": {"materialize": {"lock_ttl_seconds": 0}}},
+        headers=headers,
+    )
+    assert resp.status_code == 422, resp.text
+
+
+def test_invalid_lock_ttl_above_max_rejected(seeded_app):
+    """lock_ttl_seconds > 604800 (1 week) is rejected with 422."""
+    client = seeded_app["client"]
+    headers = _auth(seeded_app["admin_token"])
+    resp = client.post(
+        "/api/admin/server-config",
+        json={"sections": {"materialize": {"lock_ttl_seconds": 604801}}},
+        headers=headers,
+    )
+    assert resp.status_code == 422, resp.text
+
+
+def test_valid_lock_ttl_boundary_min_accepted(seeded_app, tmp_path, monkeypatch):
+    """lock_ttl_seconds=60 (minimum) is accepted."""
+    monkeypatch.setenv("DATA_DIR", str(tmp_path))
+    state = tmp_path / "state"
+    state.mkdir(parents=True, exist_ok=True)
+    import app.instance_config as ic
+    ic._instance_config = None
+    try:
+        client = seeded_app["client"]
+        headers = _auth(seeded_app["admin_token"])
+        resp = client.post(
+            "/api/admin/server-config",
+            json={"sections": {"materialize": {"lock_ttl_seconds": 60}}},
+            headers=headers,
+        )
+        assert resp.status_code == 200, resp.text
+    finally:
+        ic._instance_config = None
+
+
+def test_valid_lock_ttl_boundary_max_accepted(seeded_app, tmp_path, monkeypatch):
+    """lock_ttl_seconds=604800 (maximum) is accepted."""
+    monkeypatch.setenv("DATA_DIR", str(tmp_path))
+    state = tmp_path / "state"
+    state.mkdir(parents=True, exist_ok=True)
+    import app.instance_config as ic
+    ic._instance_config = None
+    try:
+        client = seeded_app["client"]
+        headers = _auth(seeded_app["admin_token"])
+        resp = client.post(
+            "/api/admin/server-config",
+            json={"sections": {"materialize": {"lock_ttl_seconds": 604800}}},
+            headers=headers,
+        )
+        assert resp.status_code == 200, resp.text
+    finally:
+        ic._instance_config = None
--- a/tests/test_admin_validator_backtick_relaxed_for_materialized.py
+++ b/tests/test_admin_validator_backtick_relaxed_for_materialized.py
@ -0,0 +1,39 @@
+"""Backtick-quoted identifiers are required for materialized BQ source_query
+(when the dataset/table/project name contains a dash). The validator must
+allow them on materialized rows but still reject on remote/local."""
+from __future__ import annotations
+import pytest
+from pydantic import ValidationError
+
+from app.api.admin import RegisterTableRequest
+
+
+def test_materialized_accepts_backticks():
+    req = RegisterTableRequest(
+        name="b1",
+        source_type="bigquery",
+        query_mode="materialized",
+        source_query="SELECT * FROM `my-project.ds.tbl`",
+    )
+    assert req.source_query == "SELECT * FROM `my-project.ds.tbl`"
+
+
+def test_remote_rejects_backticks():
+    with pytest.raises(ValidationError):
+        RegisterTableRequest(
+            name="r1",
+            source_type="bigquery",
+            query_mode="remote",
+            bucket="ds", source_table="tbl",
+            source_query="SELECT * FROM `prj.ds.tbl`",
+        )
+
+
+def test_local_rejects_backticks():
+    with pytest.raises(ValidationError):
+        RegisterTableRequest(
+            name="l1",
+            source_type="keboola",
+            query_mode="local",
+            source_query="SELECT * FROM `kbc.ds.tbl`",
+        )
--- a/tests/test_api_admin_materialized.py
+++ b/tests/test_api_admin_materialized.py
@ -18,8 +18,6 @@ Covers PR #145 (re-implementation against 0.24.0 base):
 Shares the seeded_app + bq_instance fixtures from conftest /
 test_admin_bq_register.py for parity with the existing BQ test surface.
 """
-from unittest.mock import MagicMock
-
 import pytest


@ -27,59 +25,15 @@ def _auth(token):
    return {"Authorization": f"Bearer {token}"}


-@pytest.fixture
-def stub_bq_extractor(monkeypatch):
-    """Mirror tests/test_admin_bq_register.py — bypasses real-BQ traffic
-    in the post-register rebuild path so the test stays offline. Required
-    whenever the test seeds a remote-mode BQ row via the HTTP API."""
-    rebuild_mock = MagicMock(return_value={
-        "project_id": "my-test-project",
-        "tables_registered": 1, "errors": [], "skipped": False,
-    })
-    monkeypatch.setattr(
-        "connectors.bigquery.extractor.rebuild_from_registry",
-        rebuild_mock,
-    )
-    monkeypatch.setattr(
-        "src.orchestrator.SyncOrchestrator",
-        lambda *a, **kw: MagicMock(),
-    )
-    return rebuild_mock
-
-
-@pytest.fixture
-def bq_instance(monkeypatch):
-    """Force instance.yaml to look like a BigQuery deployment.
-
-    Mirrors tests/test_admin_bq_register.py::bq_instance so the
-    project_id read inside _validate_bigquery_register_payload succeeds.
-    """
-    fake_cfg = {
-        "data_source": {
-            "type": "bigquery",
-            "bigquery": {"project": "my-test-project", "location": "us"},
-        },
-    }
-    monkeypatch.setattr(
-        "app.instance_config.load_instance_config",
-        lambda: fake_cfg,
-        raising=False,
-    )
-    from app.instance_config import reset_cache
-    reset_cache()
-    yield fake_cfg
-    reset_cache()
-
-
 def _materialized_payload(**overrides):
    p = {
        "name": "orders_90d",
        "source_type": "bigquery",
        "query_mode": "materialized",
-        # DuckDB-flavor SQL (not BQ-native backticks) — the materialize path
-        # runs the SQL through the DuckDB BQ extension's COPY which uses
-        # double-quoted identifiers. Backticks are now rejected at register
-        # time. See `_BACKTICK_REJECTION_MESSAGE` in app/api/admin.py.
+        # BQ-native or DuckDB-flavor SQL — both accepted since Task 2 wraps
+        # materialized SQL in bigquery_query() (BQ jobs API path). Backtick
+        # identifiers are now allowed for materialized rows; remote/local rows
+        # still require DuckDB-flavor (double-quoted) identifiers.
        "source_query": 'SELECT date FROM bq."ds"."orders"',
        "sync_schedule": "every 6h",
    }
@ -326,36 +280,44 @@ def test_register_materialized_persists_source_query_in_registry(seeded_app, bq_
    assert "WHERE x = 1" in row["source_query"]


-# --- Backtick (BigQuery-native) source_query rejection -----------------------
+# --- Backtick (BigQuery-native) source_query handling ------------------------
 #
-# DuckDB BQ extension's COPY path interprets the SQL through DuckDB's parser,
-# which does NOT understand backtick-quoted identifiers (it uses double quotes
-# for quoted identifiers). A registered backtick-style source_query like
-# `SELECT * FROM \`prj.ds.t\`` either parse-errors or returns 0 rows at next
-# materialize tick — silently — and no parquet ends up at the canonical path.
-# Reject at registration time with an actionable message.
+# Task 2 (materialize-sync-fix) changed the BQ materialization path to run
+# admin SQL through the BQ jobs API (bigquery_query() wrapper) rather than
+# through DuckDB's BQ extension COPY path. BQ-native SQL requires backticks
+# for dashed project/dataset/table identifiers. The backtick guard has been
+# relaxed for ALL materialized rows: the validator now only rejects backticks
+# for remote/local rows (DuckDB-flavor SQL contract). Materialized rows must
+# be allowed to carry backticks so operators can reference dashed identifiers.
+# See test_admin_validator_backtick_relaxed_for_materialized.py for the
+# model-layer unit tests.


-def test_register_materialized_rejects_backtick_source_query(seeded_app, bq_instance):
+def test_register_materialized_accepts_backtick_source_query(seeded_app, bq_instance, stub_bq_extractor):
+    """BQ materialized rows now accept BQ-native backtick syntax; the
+    materialize path (Task 2) wraps them in bigquery_query() which uses
+    the BQ jobs API — not DuckDB's COPY — so backticks are valid."""
    c = seeded_app["client"]
    token = seeded_app["admin_token"]
    r = c.post(
        "/api/admin/register-table",
        json=_materialized_payload(
            name="bt_native",
-            source_query="SELECT * FROM `prj-grp.ds.product_inventory`",
+            source_query="SELECT * FROM `my-project.ds.product_inventory`",
        ),
        headers=_auth(token),
    )
-    assert r.status_code == 422, r.json()
-    detail = str(r.json().get("detail", "")).lower()
-    assert "backtick" in detail
-    assert 'bq."' in detail or "duckdb" in detail
+    assert r.status_code in (200, 201, 202), r.json()
+    reg = c.get("/api/admin/registry", headers=_auth(token)).json()
+    row = next(t for t in reg["tables"] if t["id"] == "bt_native")
+    assert row["source_query"] == "SELECT * FROM `my-project.ds.product_inventory`"


-def test_update_materialized_rejects_backtick_source_query(
+def test_update_materialized_accepts_backtick_source_query(
    seeded_app, bq_instance, stub_bq_extractor,
 ):
+    """PUT to a materialized BQ row may switch source_query to BQ-native
+    backtick form — accepted now that Task 2 wraps via jobs API."""
    c = seeded_app["client"]
    token = seeded_app["admin_token"]

@ -370,7 +332,7 @@ def test_update_materialized_rejects_backtick_source_query(
    assert r.status_code == 201, r.json()
    table_id = r.json()["id"]

-    # PATCH the source_query to a backtick form — must be rejected.
+    # PATCH the source_query to a BQ-native backtick form — now accepted.
    r2 = c.put(
        f"/api/admin/registry/{table_id}",
        json={
@ -379,14 +341,17 @@ def test_update_materialized_rejects_backtick_source_query(
        },
        headers=_auth(token),
    )
-    assert r2.status_code == 422, r2.json()
-    detail = str(r2.json().get("detail", "")).lower()
-    assert "backtick" in detail
+    assert r2.status_code == 200, r2.json()
+    reg = c.get("/api/admin/registry", headers=_auth(token)).json()
+    row = next(t for t in reg["tables"] if t["id"] == table_id)
+    assert row["source_query"] == "SELECT * FROM `prj.ds.t`"


-def test_register_materialized_keboola_rejects_backtick_source_query(seeded_app):
-    """The check is generic, not BQ-only — Keboola materialized rows that
-    include backticks would also be silently skipped at materialize time."""
+def test_register_materialized_keboola_accepts_backtick_source_query(seeded_app):
+    """Keboola materialized rows also accept backtick source_query at register
+    time — the backtick guard now only applies to remote/local rows. If the
+    SQL is invalid at runtime (DuckDB parse error), that surfaces as a sync
+    error, not a registration error."""
    c = seeded_app["client"]
    token = seeded_app["admin_token"]
    r = c.post(
@ -399,9 +364,7 @@ def test_register_materialized_keboola_rejects_backtick_source_query(seeded_app)
        },
        headers=_auth(token),
    )
-    assert r.status_code == 422, r.json()
-    detail = str(r.json().get("detail", "")).lower()
-    assert "backtick" in detail
+    assert r.status_code == 201, r.json()


 # --- Surface materialize errors per-row ---------------------------------------
--- a/tests/test_bq_cost_guardrail.py
+++ b/tests/test_bq_cost_guardrail.py
@ -18,7 +18,13 @@ from connectors.bigquery.extractor import materialize_query, MaterializeBudgetEr

 def _bq_with_seed(tables: dict[str, str] | None = None) -> BqAccess:
    """Stub BqAccess seeded with in-memory tables (same recipe as
-    test_bq_materialize)."""
+    test_bq_materialize).
+
+    A `bigquery_query(project, sql_text)` table macro is registered so the
+    wrapping added by `_wrap_admin_sql_for_jobs_api` (Task 2 — routes COPY
+    through the BQ jobs API for views) resolves against the in-memory tables
+    without needing the real BQ extension.
+    """
    tables = tables or {}

    @contextmanager
@ -30,6 +36,12 @@ def _bq_with_seed(tables: dict[str, str] | None = None) -> BqAccess:
                conn.execute(f"CREATE SCHEMA IF NOT EXISTS {s}")
            for ref, body in tables.items():
                conn.execute(f"CREATE OR REPLACE TABLE {ref} AS {body}")
+            # Stub bigquery_query() so materialize_query's wrapped COPY works
+            # against the in-memory bq catalog without the real BQ extension.
+            conn.execute(
+                "CREATE OR REPLACE MACRO bigquery_query(project, sql_text) "
+                "AS TABLE SELECT * FROM query(sql_text)"
+            )
            yield conn
        finally:
            conn.close()
@ -116,22 +128,26 @@ def test_zero_max_bytes_skips_dry_run(tmp_path):
    assert stats["rows"] == 1


-def test_dry_run_failure_is_fail_open(tmp_path):
+def test_dry_run_failure_is_fail_open(tmp_path, caplog):
    """If the dry-run errors (DuckDB syntax, missing google lib, transient
    upstream failure) we don't block — log + proceed with COPY. Operators
    who need hard-fail watch logs for the warning."""
+    import logging
+
    out = tmp_path / "extracts" / "bigquery"
    out.mkdir(parents=True)
    bq = _bq_with_seed({"bq.test.tiny": "SELECT 1 AS n"})

-    with patch(
-        "app.api.v2_scan._bq_dry_run_bytes", side_effect=RuntimeError("boom")
-    ):
-        stats = materialize_query(
-            table_id="t1",
-            sql="SELECT * FROM bq.test.tiny",
-            bq=bq,
-            output_dir=str(out),
-            max_bytes=10 * 2**30,
-        )
+    with caplog.at_level(logging.WARNING, logger="connectors.bigquery.extractor"):
+        with patch(
+            "app.api.v2_scan._bq_dry_run_bytes", side_effect=RuntimeError("boom")
+        ):
+            stats = materialize_query(
+                table_id="t1",
+                sql="SELECT * FROM bq.test.tiny",
+                bq=bq,
+                output_dir=str(out),
+                max_bytes=10 * 2**30,
+            )
    assert stats["rows"] == 1
+    assert "fail-open" in caplog.text
--- a/tests/test_bq_materialize.py
+++ b/tests/test_bq_materialize.py
@ -21,6 +21,11 @@ def _make_stub_bq(tables: dict[str, str] | None = None) -> BqAccess:
    with a pretend `bq` catalog containing test tables. `tables` maps
    DuckDB-three-part references like `'bq.test.orders'` to a SELECT
    expression to seed them with.
+
+    A `bigquery_query(project, sql_text)` table macro is registered so the
+    wrapping added by `_wrap_admin_sql_for_jobs_api` (Task 2 — routes COPY
+    through the BQ jobs API for views) resolves against the in-memory tables
+    without needing the real BQ extension.
    """
    tables = tables or {}

@ -34,6 +39,12 @@ def _make_stub_bq(tables: dict[str, str] | None = None) -> BqAccess:
                conn.execute(f"CREATE SCHEMA IF NOT EXISTS {s}")
            for ref, body in tables.items():
                conn.execute(f"CREATE OR REPLACE TABLE {ref} AS {body}")
+            # Stub bigquery_query() so materialize_query's wrapped COPY works
+            # against the in-memory bq catalog without the real BQ extension.
+            conn.execute(
+                "CREATE OR REPLACE MACRO bigquery_query(project, sql_text) "
+                "AS TABLE SELECT * FROM query(sql_text)"
+            )
            yield conn
        finally:
            conn.close()
--- a/tests/test_bq_materialize_concurrency.py
+++ b/tests/test_bq_materialize_concurrency.py
@ -0,0 +1,204 @@
+"""Per-table_id concurrency: in-process mutex + advisory file lock with
+TTL reclaim. Two overlapping materialize_query calls for the same id
+must NOT corrupt each other's parquet."""
+from __future__ import annotations
+import os
+import threading
+import time
+from pathlib import Path
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from connectors.bigquery.extractor import (
+    materialize_query,
+    MaterializeInFlightError,
+    _get_table_lock,
+    _LOCK_TTL_DEFAULT_SECONDS,
+)
+
+
+@pytest.fixture(autouse=True)
+def reset_locks(monkeypatch):
+    # Tests must not share lock state across runs.
+    import connectors.bigquery.extractor as mod
+    monkeypatch.setattr(mod, "_table_locks", {})
+    yield
+
+
+def _slow_bq(stall_seconds: float = 1.0):
+    """Build a fake BqAccess whose duckdb_session COPY blocks for
+    `stall_seconds` so we can race a second call against it."""
+    bq = MagicMock()
+    bq.projects.billing = "prj-billing"
+    bq.projects.data = "prj-data"
+
+    class _Session:
+        def __enter__(self):
+            return self
+        def __exit__(self, *a):
+            return False
+        def execute(self, sql):
+            if sql.startswith("SELECT database_name"):
+                class _R:
+                    def fetchall(self):
+                        return [("memory",)]
+                return _R()
+            if sql.startswith("ATTACH"):
+                return MagicMock()
+            if sql.startswith("COPY"):
+                # Simulate a long-running COPY by writing a stub parquet
+                # then sleeping so a second call can race us.
+                # Extract the path from the COPY statement.
+                import re
+                m = re.search(r"TO '([^']+)'", sql)
+                assert m
+                Path(m.group(1)).write_bytes(b"PARQUET_STUB_HEADER" + b"\x00" * 200)
+                time.sleep(stall_seconds)
+                return MagicMock()
+            if sql.startswith("SELECT count"):
+                class _R:
+                    def fetchone(self):
+                        return (42,)
+                return _R()
+            return MagicMock()
+
+    bq.duckdb_session.return_value = _Session()
+    return bq
+
+
+def test_concurrent_calls_for_same_id_raise_in_flight(tmp_path):
+    bq = _slow_bq(stall_seconds=2.0)
+
+    out_dir = str(tmp_path)
+    captured: list = []
+
+    def runner(tag):
+        try:
+            r = materialize_query(
+                table_id="t1", sql="SELECT 1",
+                bq=bq, output_dir=out_dir, max_bytes=None,
+            )
+            captured.append(("ok", tag, r))
+        except MaterializeInFlightError as e:
+            captured.append(("in_flight", tag, str(e)))
+        except Exception as e:
+            captured.append(("err", tag, str(e)))
+
+    t1 = threading.Thread(target=runner, args=("first",))
+    t2 = threading.Thread(target=runner, args=("second",))
+    t1.start()
+    time.sleep(0.2)  # let t1 acquire the lock
+    t2.start()
+    t1.join()
+    t2.join()
+
+    outcomes = [c[0] for c in captured]
+    assert outcomes.count("ok") == 1, f"expected exactly one success, got {captured}"
+    assert outcomes.count("in_flight") == 1
+
+
+def test_sequential_calls_for_same_id_both_succeed(tmp_path):
+    bq = _slow_bq(stall_seconds=0.05)
+
+    out_dir = str(tmp_path)
+    r1 = materialize_query(
+        table_id="t1", sql="SELECT 1",
+        bq=bq, output_dir=out_dir, max_bytes=None,
+    )
+    r2 = materialize_query(
+        table_id="t1", sql="SELECT 1",
+        bq=bq, output_dir=out_dir, max_bytes=None,
+    )
+    assert r1["rows"] == 42
+    assert r2["rows"] == 42
+
+
+def test_different_ids_run_in_parallel(tmp_path):
+    bq = _slow_bq(stall_seconds=1.0)
+    out_dir = str(tmp_path)
+    captured: list = []
+
+    def runner(tid):
+        try:
+            r = materialize_query(
+                table_id=tid, sql="SELECT 1",
+                bq=bq, output_dir=out_dir, max_bytes=None,
+            )
+            captured.append((tid, r["rows"]))
+        except Exception as e:
+            captured.append((tid, "ERROR"))
+
+    threads = [threading.Thread(target=runner, args=(f"tab_{i}",)) for i in range(3)]
+    start = time.time()
+    for t in threads: t.start()
+    for t in threads: t.join()
+    elapsed = time.time() - start
+    # If they were serialized, would take >= 3s. Parallel: ~1s.
+    assert elapsed < 2.0, f"expected parallel, elapsed={elapsed:.2f}s"
+    assert len(captured) == 3
+    assert all(c[1] == 42 for c in captured)
+
+
+def test_stale_file_lock_is_reclaimed_after_ttl(tmp_path, monkeypatch):
+    """Verify a stale, unheld .lock file (old mtime, no live flock holder) does NOT
+    cause `MaterializeInFlightError`. The reclaim branch in `_try_acquire_file_lock`
+    is technically not reached here (the first `_try_open_and_flock` succeeds because
+    nobody holds the lock), but exercising the in-flight-by-mtime-only mistake is what
+    this test guards against."""
+    bq = _slow_bq(stall_seconds=0.05)
+    lock_path = Path(tmp_path) / "data" / "t1.parquet.lock"
+    lock_path.parent.mkdir(parents=True, exist_ok=True)
+    lock_path.write_text("")
+
+    # Set mtime to 25h ago (> default 24h TTL).
+    old_ts = time.time() - 25 * 3600
+    os.utime(lock_path, (old_ts, old_ts))
+
+    r = materialize_query(
+        table_id="t1", sql="SELECT 1",
+        bq=bq, output_dir=str(tmp_path), max_bytes=None,
+    )
+    assert r["rows"] == 42
+
+
+def test_fresh_file_lock_blocks_with_in_flight_error(tmp_path, monkeypatch):
+    """Force a fresh .lock file (mtime within TTL) and verify a new
+    call raises rather than reclaims."""
+    bq = _slow_bq(stall_seconds=0.05)
+    lock_path = Path(tmp_path) / "data" / "t1.parquet.lock"
+    lock_path.parent.mkdir(parents=True, exist_ok=True)
+
+    # Open the lock file and HOLD a fcntl exclusive lock so the materialize
+    # call's flock(LOCK_NB) sees a real conflicting lock — relying on
+    # mtime-only would let the test pass even if flock acquisition was
+    # broken.
+    import fcntl
+    holder = open(lock_path, "w")
+    fcntl.flock(holder.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
+    try:
+        with pytest.raises(MaterializeInFlightError):
+            materialize_query(
+                table_id="t1", sql="SELECT 1",
+                bq=bq, output_dir=str(tmp_path), max_bytes=None,
+            )
+    finally:
+        fcntl.flock(holder.fileno(), fcntl.LOCK_UN)
+        holder.close()
+
+
+def test_lock_ttl_reads_from_instance_config(tmp_path, monkeypatch):
+    """When `materialize.lock_ttl_seconds` is set in instance.yaml, that
+    value overrides the default."""
+    # Patches `app.instance_config.get_value` directly. This works because
+    # `_get_lock_ttl_seconds` re-imports `get_value` on every call (see
+    # extractor.py for the deferred-import rationale). If a future change
+    # hoists the import to module-level, this patch must change to target
+    # `connectors.bigquery.extractor.get_value` instead.
+    monkeypatch.setattr(
+        "app.instance_config.get_value",
+        lambda *args, **kw: 60 if args == ("materialize", "lock_ttl_seconds") else kw.get("default"),
+    )
+
+    from connectors.bigquery.extractor import _get_lock_ttl_seconds
+    assert _get_lock_ttl_seconds() == 60
--- a/tests/test_bq_materialize_query_wrapping.py
+++ b/tests/test_bq_materialize_query_wrapping.py
@ -0,0 +1,54 @@
+"""materialize_query must always wrap admin source_query in
+bigquery_query('<billing>', '<admin>') so the COPY uses BQ jobs API,
+which works for base tables AND views — Storage Read API does not."""
+from __future__ import annotations
+from pathlib import Path
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from connectors.bigquery.extractor import (
+    _wrap_admin_sql_for_jobs_api,
+    _escape_sql_string_literal,
+)
+
+
+def test_wrap_simple_select():
+    out = _wrap_admin_sql_for_jobs_api(
+        billing_project="prj-billing",
+        inner_sql="SELECT * FROM `ds.tbl`",
+    )
+    assert out == (
+        "SELECT * FROM bigquery_query('prj-billing', "
+        "'SELECT * FROM `ds.tbl`')"
+    )
+
+
+def test_escape_single_quotes_in_inner_sql():
+    inner = "SELECT name FROM `ds.tbl` WHERE country = 'CZ'"
+    escaped = _escape_sql_string_literal(inner)
+    assert escaped == "SELECT name FROM `ds.tbl` WHERE country = ''CZ''"
+
+
+def test_wrap_with_inner_quotes_round_trips():
+    inner = "SELECT * FROM `ds.tbl` WHERE col = 'foo''bar'"
+    out = _wrap_admin_sql_for_jobs_api("myproject", inner)
+    # Outer string-literal envelope must double the inner single quotes
+    # so DuckDB's parser sees a balanced literal.
+    assert out.count("'") % 2 == 0
+    # Round-trip: stripping the wrapper gives back the original inner exactly.
+    prefix = "SELECT * FROM bigquery_query('myproject', '"
+    assert out.startswith(prefix)
+    assert out.endswith("')")
+    middle = out[len(prefix):-2]
+    # DuckDB string literal escape: '' → '. Reverse it.
+    decoded = middle.replace("''", "'")
+    assert decoded == inner
+
+
+def test_billing_project_validates_format():
+    with pytest.raises(ValueError, match="billing_project"):
+        _wrap_admin_sql_for_jobs_api(
+            billing_project="bad project'; DROP",
+            inner_sql="SELECT 1",
+        )
--- a/tests/test_db_schema_version.py
+++ b/tests/test_db_schema_version.py
@ -13,8 +13,9 @@ import duckdb
 from src.db import SCHEMA_VERSION, _ensure_schema, get_schema_version


-def test_schema_version_is_23():
-    assert SCHEMA_VERSION == 23
+def test_schema_version_is_24():
+    # bumped from 23→24 for the materialized BQ source_query rewrite migration
+    assert SCHEMA_VERSION == 24


 def test_v20_adds_source_query(tmp_path):
@ -29,7 +30,7 @@ def test_v20_adds_source_query(tmp_path):
        ).fetchall()
    }
    assert "source_query" in cols, f"source_query missing from {cols}"
-    assert get_schema_version(conn) == 23
+    assert get_schema_version(conn) == SCHEMA_VERSION
    conn.close()


@ -83,7 +84,7 @@ def test_v19_db_migrates_to_v20(tmp_path):

    _ensure_schema(conn)

-    assert get_schema_version(conn) == 23
+    assert get_schema_version(conn) == SCHEMA_VERSION  # bumped 23→24
    cols = {
        r[0] for r in conn.execute(
            "SELECT column_name FROM information_schema.columns "
--- a/tests/test_materialized_e2e.py
+++ b/tests/test_materialized_e2e.py
@ -75,7 +75,13 @@ def stub_bq_extractor(monkeypatch):
@pytest.fixture
 def stub_bq():
    """Real-shape BqAccess wired to in-memory DuckDB factories so the
-    materialize_query path can run end-to-end without GCP."""
+    materialize_query path can run end-to-end without GCP.
+
+    A `bigquery_query(project, sql_text)` table macro is registered so the
+    wrapping added by `_wrap_admin_sql_for_jobs_api` (Task 2 — routes COPY
+    through the BQ jobs API for views) resolves against the in-memory tables
+    without needing the real BQ extension.
+    """
    @contextmanager
    def _session(_p):
        conn = duckdb.connect(":memory:")
@ -87,6 +93,12 @@ def stub_bq():
                "SELECT 'EU' AS region, 100 AS revenue UNION ALL "
                "SELECT 'US' AS region, 250 AS revenue"
            )
+            # Stub bigquery_query() so materialize_query's wrapped COPY works
+            # against the in-memory bq catalog without the real BQ extension.
+            conn.execute(
+                "CREATE OR REPLACE MACRO bigquery_query(project, sql_text) "
+                "AS TABLE SELECT * FROM query(sql_text)"
+            )
            yield conn
        finally:
            conn.close()
@ -265,12 +277,18 @@ def test_materialized_zero_rows_logs_warning(stub_bq, tmp_path, caplog):
            conn.execute("CREATE SCHEMA bq.test")
            conn.execute("CREATE OR REPLACE TABLE bq.test.empty AS "
                         "SELECT 1 AS n WHERE FALSE")
+            # Stub bigquery_query() so materialize_query's wrapped COPY works
+            # against the in-memory bq catalog without the real BQ extension.
+            conn.execute(
+                "CREATE OR REPLACE MACRO bigquery_query(project, sql_text) "
+                "AS TABLE SELECT * FROM query(sql_text)"
+            )
            yield conn
        finally:
            conn.close()

    bq_empty = BqAccess(
-        BqProjects(billing="t", data="t"),
+        BqProjects(billing="test-project", data="test-project"),
        client_factory=lambda _p: MagicMock(),
        duckdb_session_factory=_session_empty,
    )
@ -323,7 +341,7 @@ def test_attach_real_error_propagates(stub_bq, tmp_path):
            conn.close()

    bq_bad = BqAccess(
-        BqProjects(billing="t", data="t"),
+        BqProjects(billing="test-project", data="test-project"),
        client_factory=lambda _p: MagicMock(),
        duckdb_session_factory=_session_attach_fails,
    )
--- a/tests/test_run_materialized_pass_in_flight_skip.py
+++ b/tests/test_run_materialized_pass_in_flight_skip.py
@ -0,0 +1,66 @@
+"""When materialize_query raises MaterializeInFlightError, _run_materialized_pass
+must record it as a 'skipped, in_flight' outcome and NOT call state.set_error
+(otherwise sync_state surfaces a false-positive 'failure' for a healthy
+in-progress run)."""
+from __future__ import annotations
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from app.api.sync import _run_materialized_pass
+from connectors.bigquery.extractor import MaterializeInFlightError
+
+
+@pytest.fixture
+def fake_registry_with_one_materialized(monkeypatch, tmp_path):
+    monkeypatch.setenv("DATA_DIR", str(tmp_path))
+    rows = [{
+        "id": "in_flight_t",
+        "name": "in_flight_t",
+        "query_mode": "materialized",
+        "source_type": "bigquery",
+        "source_query": "SELECT * FROM `ds.t`",
+        "sync_schedule": None,
+    }]
+
+    class _Repo:
+        def __init__(self, conn): pass
+        def list_all(self): return rows
+
+    class _State:
+        def __init__(self, conn):
+            self.set_error_calls = []
+            self.update_sync_calls = []
+        def get_last_sync(self, _id): return None
+        def set_error(self, table_id, msg): self.set_error_calls.append((table_id, msg))
+        def update_sync(self, **kw): self.update_sync_calls.append(kw)
+
+    state = _State(None)
+    monkeypatch.setattr("app.api.sync.TableRegistryRepository", _Repo)
+    monkeypatch.setattr("app.api.sync.SyncStateRepository", lambda c: state)
+    return state
+
+
+def test_in_flight_recorded_as_skipped_not_error(fake_registry_with_one_materialized):
+    state = fake_registry_with_one_materialized
+
+    with patch(
+        "app.api.sync._materialize_table",
+        side_effect=MaterializeInFlightError("in_flight_t", layer="process"),
+    ):
+        summary = _run_materialized_pass(MagicMock(), MagicMock())
+
+    assert summary["materialized"] == []
+    assert summary["errors"] == []
+    assert len(summary["skipped"]) == 1
+    skipped = summary["skipped"][0]
+    assert skipped == {"table": "in_flight_t", "reason": "in_flight"}
+    assert state.set_error_calls == []
+    assert state.update_sync_calls == []
+
+
+def test_due_check_skipped_uses_due_check_reason(fake_registry_with_one_materialized, monkeypatch):
+    monkeypatch.setattr("app.api.sync.is_table_due", lambda *a, **k: False)
+
+    summary = _run_materialized_pass(MagicMock(), MagicMock())
+    assert summary["skipped"] == [{"table": "in_flight_t", "reason": "due_check"}]
--- a/tests/test_schema_v24_source_query_rewrite.py
+++ b/tests/test_schema_v24_source_query_rewrite.py
@ -0,0 +1,159 @@
+"""v24: rewrites table_registry.source_query for materialized BQ rows
+from DuckDB-flavor (bq.\"ds\".\"tbl\") to BQ-native (`<project>.ds.tbl`).
+The wrapping path (connectors.bigquery.extractor.materialize_query) only
+accepts BQ-native; pre-v24 rows would fail at materialize time without
+this conversion."""
+from __future__ import annotations
+import os
+import tempfile
+from pathlib import Path
+
+import duckdb
+import pytest
+
+from src.db import _ensure_schema, get_schema_version, SCHEMA_VERSION
+
+
+def _seed_v23(conn, project_id: str = "prj-data"):
+    conn.execute(
+        "CREATE TABLE schema_version (version INTEGER, applied_at TIMESTAMP DEFAULT current_timestamp)"
+    )
+    conn.execute("INSERT INTO schema_version (version) VALUES (23)")
+    conn.execute(
+        "CREATE TABLE table_registry ("
+        "id VARCHAR PRIMARY KEY, name VARCHAR, source_type VARCHAR, "
+        "query_mode VARCHAR, bucket VARCHAR, source_table VARCHAR, source_query VARCHAR)"
+    )
+
+
+def test_v24_rewrites_duckdb_flavor_to_bq_native(monkeypatch):
+    with tempfile.TemporaryDirectory() as tmp:
+        monkeypatch.setenv("DATA_DIR", tmp)
+        monkeypatch.setattr(
+            "app.instance_config.get_value",
+            lambda *args, **kw: "prj-data" if args == ("data_source", "bigquery", "project") else kw.get("default"),
+        )
+        Path(tmp, "state").mkdir(parents=True, exist_ok=True)
+        db_path = Path(tmp, "state", "system.duckdb")
+        conn = duckdb.connect(str(db_path))
+        try:
+            _seed_v23(conn)
+            conn.execute(
+                'INSERT INTO table_registry VALUES (?, ?, ?, ?, ?, ?, ?)',
+                ["t1", "t1", "bigquery", "materialized", "ds", "tbl",
+                 'SELECT * FROM bq."ds"."tbl"'],
+            )
+            conn.execute(
+                'INSERT INTO table_registry VALUES (?, ?, ?, ?, ?, ?, ?)',
+                ["t2", "t2", "bigquery", "materialized", "analytics", "orders",
+                 'SELECT col1 FROM bq."analytics"."orders" WHERE col2 > 10'],
+            )
+            conn.execute(
+                'INSERT INTO table_registry VALUES (?, ?, ?, ?, ?, ?, ?)',
+                ["r1", "r1", "bigquery", "remote", "ds", "tbl", None],
+            )
+
+            _ensure_schema(conn)
+            assert get_schema_version(conn) == SCHEMA_VERSION
+            assert SCHEMA_VERSION >= 24
+
+            rows = {r[0]: r[1] for r in conn.execute(
+                "SELECT id, source_query FROM table_registry"
+            ).fetchall()}
+            assert rows["t1"] == "SELECT * FROM `prj-data.ds.tbl`"
+            assert rows["t2"] == (
+                "SELECT col1 FROM `prj-data.analytics.orders` WHERE col2 > 10"
+            )
+            assert rows["r1"] is None  # remote row untouched
+        finally:
+            conn.close()
+
+
+def test_v24_idempotent_when_already_bq_native(monkeypatch):
+    with tempfile.TemporaryDirectory() as tmp:
+        monkeypatch.setenv("DATA_DIR", tmp)
+        monkeypatch.setattr(
+            "app.instance_config.get_value",
+            lambda *args, **kw: "prj-data" if args == ("data_source", "bigquery", "project") else kw.get("default"),
+        )
+        Path(tmp, "state").mkdir(parents=True, exist_ok=True)
+        db_path = Path(tmp, "state", "system.duckdb")
+        conn = duckdb.connect(str(db_path))
+        try:
+            _seed_v23(conn)
+            conn.execute(
+                'INSERT INTO table_registry VALUES (?, ?, ?, ?, ?, ?, ?)',
+                ["t1", "t1", "bigquery", "materialized", "ds", "tbl",
+                 "SELECT * FROM `prj-data.ds.tbl`"],
+            )
+            _ensure_schema(conn)
+            row = conn.execute(
+                "SELECT source_query FROM table_registry WHERE id='t1'"
+            ).fetchone()
+            assert row[0] == "SELECT * FROM `prj-data.ds.tbl`"
+        finally:
+            conn.close()
+
+
+def test_v24_logs_warning_when_project_not_configured(monkeypatch, caplog):
+    with tempfile.TemporaryDirectory() as tmp:
+        monkeypatch.setenv("DATA_DIR", tmp)
+        monkeypatch.setattr(
+            "app.instance_config.get_value",
+            lambda *args, **kw: kw.get("default", ""),  # no project configured
+        )
+        Path(tmp, "state").mkdir(parents=True, exist_ok=True)
+        db_path = Path(tmp, "state", "system.duckdb")
+        conn = duckdb.connect(str(db_path))
+        try:
+            _seed_v23(conn)
+            conn.execute(
+                'INSERT INTO table_registry VALUES (?, ?, ?, ?, ?, ?, ?)',
+                ["t1", "t1", "bigquery", "materialized", "ds", "tbl",
+                 'SELECT * FROM bq."ds"."tbl"'],
+            )
+            with caplog.at_level("WARNING"):
+                _ensure_schema(conn)
+            row = conn.execute(
+                "SELECT source_query FROM table_registry WHERE id='t1'"
+            ).fetchone()
+            assert row[0] == 'SELECT * FROM bq."ds"."tbl"'
+            assert any(
+                "v24" in r.message.lower() or "project" in r.message.lower()
+                for r in caplog.records
+            )
+        finally:
+            conn.close()
+
+
+def test_v24_keboola_materialized_row_not_rewritten(monkeypatch):
+    """Materialized rows with source_type != 'bigquery' must not be touched
+    by v24. Keboola materialized has no notion of bq."ds"."tbl" syntax;
+    the SELECT's source_type filter pins this contract.
+    """
+    with tempfile.TemporaryDirectory() as tmp:
+        monkeypatch.setenv("DATA_DIR", tmp)
+        monkeypatch.setattr(
+            "app.instance_config.get_value",
+            lambda *args, **kw: "prj-data" if args == ("data_source", "bigquery", "project") else kw.get("default"),
+        )
+        Path(tmp, "state").mkdir(parents=True, exist_ok=True)
+        db_path = Path(tmp, "state", "system.duckdb")
+        conn = duckdb.connect(str(db_path))
+        try:
+            _seed_v23(conn)
+            # Keboola row that happens to contain `bq."..."` in its SQL
+            # (admin error or copy-paste from a BQ row). Migration must
+            # leave it alone — this is not the v24 contract.
+            conn.execute(
+                'INSERT INTO table_registry VALUES (?, ?, ?, ?, ?, ?, ?)',
+                ["kb1", "kb1", "keboola", "materialized", "ds", "tbl",
+                 'SELECT * FROM bq."ds"."tbl"'],
+            )
+            _ensure_schema(conn)
+            row = conn.execute(
+                "SELECT source_query FROM table_registry WHERE id='kb1'"
+            ).fetchone()
+            assert row[0] == 'SELECT * FROM bq."ds"."tbl"'
+        finally:
+            conn.close()
--- a/tests/test_sync_trigger_materialized.py
+++ b/tests/test_sync_trigger_materialized.py
@ -102,7 +102,8 @@ def test_materialized_pass_skips_undue_rows(system_db, stub_bq):
        summary = sync_mod._run_materialized_pass(system_db, stub_bq)

    mock_mat.assert_not_called()
-    assert "orders_daily" in summary["skipped"]
+    # summary["skipped"] is now list[dict] — see PR zs/materialize-sync-fix
+    assert {"table": "orders_daily", "reason": "due_check"} in summary["skipped"]


 def test_materialized_pass_skips_non_materialized_rows(system_db, stub_bq):