New top-level 'materialize' section, single field (lock_ttl_seconds).
Default 86400 (24h). Backs the file-lock TTL reclaim added in the
per-table-mutex change. Editable via PUT /api/admin/server-config and
the /admin/server-config UI.
When an admin registers a materialized BQ row with bucket+source_table but
no source_query, the server generates 'SELECT * FROM `<project>.<ds>.<tbl>`'
from instance.yaml's configured BQ project. Same fallback fires on PUT
when flipping to materialized. The backtick rejection guard, which was
appropriate for DuckDB-flavor source_query, is relaxed for materialized
rows since the new wrapping path (Task 2) runs admin SQL through the BQ
jobs API, which uses BQ-native syntax (backticks for dashed identifiers).
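
A minimal sketch of the fallback query generation described above. The
function name and the mapping of bucket to the BQ dataset are assumptions
made for illustration, not the actual server code; only the generated
'SELECT * FROM `<project>.<ds>.<tbl>`' shape comes from this change.

```python
def default_source_query(project: str, dataset: str, table: str) -> str:
    """Build the fallback query for a materialized row with no source_query.

    Hypothetical helper: the real server reads the project from
    instance.yaml; here it is passed in explicitly.
    """
    for part in (project, dataset, table):
        # Reject empty parts and backticks so the identifier cannot
        # escape its backtick quoting.
        if not part or "`" in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    # Backticks keep dashed project ids (e.g. my-project) valid in BQ SQL.
    return f"SELECT * FROM `{project}.{dataset}.{table}`"


print(default_source_query("my-billing-project", "finance", "orders"))
```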
_run_materialized_pass distinguishes due-check skips from in-flight
skips and never calls state.set_error for either. summary['skipped']
becomes a list of {table, reason} dicts; the end-of-pass log line
breaks out the in_flight subcount.
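
A sketch of how the new summary['skipped'] shape could feed the
end-of-pass log line. The 'in_flight' reason comes from this change; the
'not_due' reason string, the function name and the log wording are
assumptions.

```python
from collections import Counter


def format_skip_summary(skipped: list) -> str:
    """Render the skipped list with the in_flight subcount broken out."""
    # Each entry is a {table, reason} dict, as described above.
    reasons = Counter(item["reason"] for item in skipped)
    return f"skipped={len(skipped)} (in_flight={reasons.get('in_flight', 0)})"


skipped = [
    {"table": "orders", "reason": "not_due"},    # hypothetical reason name
    {"table": "events", "reason": "in_flight"},
]
print(format_skip_summary(skipped))
```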
Hoists is_table_due to module-level import so test monkeypatching of
the symbol intercepts the call (the previous local import made
patches a no-op).
- extractor._try_acquire_file_lock: close fd and re-raise on non-
BlockingIOError from fcntl.flock (read-only fs, unsupported flock,
fd exhaustion). Pre-fix the fd leaked silently and the underlying
OSError still propagated past the caller.
- extractor: reorder module-level layout so logger is bound before
the new lock-related helpers reference it. Deferred import of
app.instance_config inside _get_lock_ttl_seconds documented inline.
- extractor: comment _table_locks unbounded-by-design rationale.
- tests: docstring + monkeypatch-target rationale for the two
concurrency tests where the contract isn't obvious from the body.
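
The fd handling in the first bullet above can be sketched as follows.
This is an illustration under assumptions, not the real
_try_acquire_file_lock: the return convention (fd on success, None when
another holder has the lock) is hypothetical, and fcntl is Unix-only.

```python
import fcntl
import os


def try_acquire_file_lock(lock_path: str):
    """Return an open, locked fd, or None if the lock is already held."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Non-blocking exclusive lock; raises BlockingIOError if held.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    except OSError:
        # Read-only fs, unsupported flock, etc.: close the fd so it does
        # not leak, then let the original error propagate.
        os.close(fd)
        raise
    return fd
```

The caller owns the returned fd and is expected to close it when the
work is finished, which also releases the lock.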
- Verb renames (da X -> agnes X for surviving verbs; legacy verbs already
absent from this default template — admin overrides with legacy verbs are
caught by Task 2's _LEGACY_STRINGS scan + Task 5's admin banner).
- Path renames: data/parquet/ -> server/parquet/, data/duckdb/ ->
user/duckdb/, data/metadata/ removed entirely (no longer exists per spec).
- Drop user/artifacts/ from directory structure (spec workspace layout
drops it; surviving paths: server/parquet/, user/duckdb/, user/snapshots/,
user/sessions/).
- Add AGNES_WORKSPACE.md pointer near top-of-template so analysts know
where to find human-readable docs.
Cleans Task 0.5's missed sweep on this file (was not in cli/ tree but is
user-visible via /api/welcome).
81 claude_md/welcome_template tests pass.
Two Task 4 review fixes for app/web/templates/install.html:
1. JSON-escape `ROLE` JS const via `{{ role | tojson }}` (defense in
depth — removes the dependency on Jinja autoescape semantics for JS
contexts; FastAPI's Literal validator already constrains role values).
2. Verify the analyst tile's clipboard payload is the analyst layout.
The pre-existing role-aware plumbing (compute_default_agent_prompt
threading role into setup_instructions_lines, picked up by the JS
SETUP_INSTRUCTIONS_TEMPLATE array) was correct; adding regression tests
that pin to the JS clipboard block specifically so a future inversion
would fail loudly.
Tests: analyst clipboard contains `agnes init` + `agnes catalog` and
NOT `agnes auth import-token` / `agnes skills`; admin clipboard is the
inverse. Plus an explicit assertion that ROLE is rendered via tojson.
Two layers of concurrency control. Layer 1 is a per-table_id
threading.Lock keyed on table_id; Layer 2 is fcntl.flock on a sibling
<id>.parquet.lock file. Overlapping calls for the same id raise
MaterializeInFlightError, which the caller treats as 'skipped,
in_flight' instead of a hard error. Stale file locks (mtime older
than materialize.lock_ttl_seconds, default 86400) are reclaimed on
the next attempt — covers the rare case where a holder was hard-killed
before kernel-level flock release.
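
A rough sketch of Layer 1 and the stale-lock check, assuming a
non-blocking acquire and an mtime comparison. Apart from
MaterializeInFlightError and _table_locks, the names below are
hypothetical, and the real code also takes the fcntl.flock layer.

```python
import os
import threading
import time


class MaterializeInFlightError(RuntimeError):
    """Another materialize for the same table id is already running."""


_table_locks = {}                   # table_id -> threading.Lock
_registry_guard = threading.Lock()  # protects _table_locks itself


def _lock_for(table_id: str) -> threading.Lock:
    with _registry_guard:
        return _table_locks.setdefault(table_id, threading.Lock())


def is_stale(lock_path: str, ttl_seconds: int = 86400, now=None) -> bool:
    """True if the lock file's mtime is older than the TTL."""
    now = time.time() if now is None else now
    try:
        return now - os.path.getmtime(lock_path) > ttl_seconds
    except FileNotFoundError:
        return False


def run_exclusive(table_id: str, work):
    """Run work() unless the same table id is already in flight."""
    lock = _lock_for(table_id)
    if not lock.acquire(blocking=False):
        raise MaterializeInFlightError(table_id)
    try:
        return work()
    finally:
        lock.release()
```

A caller in the scheduler pass would catch MaterializeInFlightError and
record the table as skipped with reason in_flight rather than as an error.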
Pre-fix, when a materialize ran longer than the scheduler tick interval
(15 min), the next tick called materialize_query for the same id, hit
the unconditional tmp_path.unlink() at function entry, and started a
second COPY against the same path. Both writers interleaved bytes;
the original COPY's read_parquet validation then failed with
'No magic bytes found at end of file'.
- tests/test_bq_cost_guardrail.py: assert fail-open warning is logged
(test previously only proved fail-open doesn't crash; review note:
warning is the only operator-visible signal of the silent failure).
- extractor._wrap_admin_sql_for_jobs_api: docstring no longer claims
DuckDB-flavor SQL is rejected — the function performs no inner-SQL
validation; the v24 migration + register-time validator are the
real enforcement points.
- extractor.materialize_query: safe_path uses _escape_sql_string_literal
instead of inline replace, for one-place-to-update consistency.
- extractor: import hashlib hoisted to module-level imports.
Pre-fix, materialize ran the admin source_query as 'COPY (sql) TO parquet'
through the DuckDB BQ extension session. The extension defaults to the
BQ Storage Read API for bq.<ds>.<tbl> references, which rejects views
('non-table entities cannot be read with the storage API'). The fix
always wraps admin SQL into bigquery_query('<billing>', '<inner>') so
COPY uses the BQ jobs API uniformly for tables and views.
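
One way the wrapping could look, as a sketch. The function bodies and the
surrounding 'SELECT * FROM' are assumptions; only the
bigquery_query('<billing>', '<inner>') shape is taken from this change.
Single quotes are doubled so the inner SQL survives as a string literal.

```python
def escape_sql_string_literal(value: str) -> str:
    """Double single quotes so value is safe inside a '...' literal."""
    return value.replace("'", "''")


def wrap_admin_sql_for_jobs_api(billing_project: str, inner_sql: str) -> str:
    """Wrap BQ-native SQL so DuckDB runs it through the BQ jobs API."""
    return (
        "SELECT * FROM bigquery_query("
        f"'{escape_sql_string_literal(billing_project)}', "
        f"'{escape_sql_string_literal(inner_sql)}')"
    )


inner = "SELECT * FROM `my-project.finance.active_view` WHERE kind = 'a'"
print(wrap_admin_sql_for_jobs_api("my-billing-project", inner))
```

Under this sketch the cost guardrail would dry-run `inner` directly,
since the BQ client cannot parse the DuckDB-side wrapper.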
Cost guardrail dry-run now operates on the inner SQL (BQ-native), so
the BQ Python client parses it and the cap engages; pre-fix the dry-run
hit 'Table-valued function not found: bigquery_query' and failed open.
Pre-fix the fixtures lived inside tests/test_api_admin_materialized.py.
Upcoming test files in this branch need them too; conftest is the
canonical home so they resolve via pytest's auto-discovery.
CHANGELOG: rename [Unreleased] → [0.32.0] — 2026-05-04, prepend a new
empty [Unreleased] for next-PR landing zone.
pyproject.toml: version 0.31.0 → 0.32.0.
Per repo discipline (memory: feedback_release_cut_with_pr.md), the
release-cut commit lands as the FINAL commit of the PR that contained
the user-visible behavior change — it does not get a separate PR.
After merge: tag v0.32.0 on the merge commit + create a GitHub Release
(memory: feedback_github_release_per_tag.md — the tag alone isn't
enough; the Release prose is the operator-visible artifact).
Headline: closes #160. da query --remote now resolves query_mode='remote'
BQ rows whose entity is VIEW or MATERIALIZED_VIEW (the bug Pavel hit).
Plus 4 reinforcing fixes — server-side cost guardrail (bq_max_scan_bytes,
default 5 GiB), registry-gating of direct bq.* paths, bigquery_query()
function-call backdoor closed, structured CLI render of typed BQ errors —
and one operator-side admin convenience (BQ test-connection endpoint +
billing_project placeholder UI).
14 issues caught and addressed across 6 iterations of Devin Review.
E2E verified on agnes-zsrotyr.groupondev.com (commit 7f743d03):
- VIEW path resolves (count=23 from active_inventory_view)
- VIEW aggregate parity vs filtered BASE TABLE
- cost guardrail rejects with structured 400 detail
- bq_path_not_registered 403 (incl. quoted "bq" variant)
- bigquery_query() blocklist 400
- test-connection endpoint 200 with elapsed_ms
Devin Review iter #6 found 2 issues.
🟡 BUG: cli/error_render.py filtered out empty-string values via
`detail[key] not in (None, "")` and `value not in (None, "")` before
they could reach `_kv_line`. But `_kv_line` was specifically designed
to render empty strings as `(empty)` — the filter shadowed that
branch. The hidden field happens to be the most operator-actionable
one in `cross_project_forbidden`: `billing_project: ""` is the exact
diagnostic confirming WHY USER_PROJECT_DENIED fires.
Change filter to `is not None`. Empty strings now flow through
`_kv_line` and render as `billing_project: (empty)`.
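
A simplified sketch of the filter change. The helper names and the
two-space indent are assumptions; the real `_kv_line` and its callers in
cli/error_render.py may differ.

```python
def kv_line(key: str, value) -> str:
    """Render one key/value row; empty strings get an explicit marker."""
    shown = "(empty)" if value == "" else value
    return f"  {key}: {shown}"


def render_fields(detail: dict, keys: list) -> list:
    # Filter on `is not None` only, so "" still reaches kv_line.
    # The old `not in (None, "")` filter dropped it before rendering.
    return [kv_line(k, detail[k]) for k in keys if detail.get(k) is not None]


detail = {"billing_project": "", "data_project": None}
print("\n".join(render_fields(detail, ["billing_project", "data_project"])))
```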
📝 ANALYSIS: CHANGELOG wording for the test-connection endpoint said
"the saved data_source.bigquery config", which Devin flagged as
slightly misleading because `get_bq_access` is `@functools.cache`d —
"Test connection" tests the config in the running process, not the
just-saved YAML overlay. The save flow already returns
`restart_required: True` and the UI shows a banner, so the behavior
is documented; only the CHANGELOG wording was loose. Tightened to
"the **process-cached** BqAccess... Tests the config active in the
running process — after a save the response includes restart_required;
click Test AFTER restart to validate the freshly-saved values."
New test: test_renders_empty_string_as_empty_marker locks in the
empty-string-as-(empty) rendering for the cross_project_forbidden
case so a future filter change won't silently drop the diagnostic
again. 9 affected render tests pass.
Devin Review iter #5 flagged a pre-existing class of name/id mismatch
in app/api/query.py:131-136 — the SAME root cause as the bq.* RBAC
issue I fixed in iter #3 (line 332/362). Devin called it out as
"NOT introduced by this PR" / "might merit follow-up", but it's
exactly the same security-boundary pattern this PR is hardening, so
fixing here keeps the RBAC story consistent across the handler.
The `forbidden = all_views - set(allowed)` comparison mixed types:
- `all_views` carries DuckDB master view names (= registry display
`name` from the orchestrator's CREATE VIEW)
- `set(allowed)` carries registry IDs (resource_grants.resource_id)
When `id != name` (e.g. id="bq.finance.ue", name="ue"), authorized
users got spurious 403s — the view name landed in `forbidden` even
though the caller had a valid grant on the registry id.
Build a name->id map from the registry, then the forbidden check
compares apples to apples:
    allowed_view_names = {r["name"] for r in registry_rows
                          if r.get("name") and r.get("id") in allowed_ids}
    forbidden = all_views - allowed_view_names
107 affected tests pass; 487 pass in wider RBAC/query/access/admin
domain — no regressions.
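
A small worked example of the mismatch and the fix, using the
id="bq.finance.ue" / name="ue" case from above. The other rows and the
function name are made up for illustration.

```python
registry_rows = [
    {"id": "bq.finance.ue", "name": "ue"},
    {"id": "bq.finance.payroll", "name": "payroll"},  # hypothetical row
    {"id": "orders", "name": "orders"},               # hypothetical row
]
allowed_ids = {"bq.finance.ue", "orders"}   # registry ids from grants
all_views = {"ue", "payroll", "orders"}     # DuckDB view names


def forbidden_views(all_views: set, registry_rows: list, allowed_ids: set) -> set:
    """Compare view names against names, not against registry ids."""
    allowed_view_names = {
        r["name"] for r in registry_rows
        if r.get("name") and r.get("id") in allowed_ids
    }
    return all_views - allowed_view_names


# Old comparison: names vs ids, so "ue" is wrongly forbidden.
old_forbidden = all_views - allowed_ids
new_forbidden = forbidden_views(all_views, registry_rows, allowed_ids)
print(sorted(old_forbidden), sorted(new_forbidden))
```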
Devin Review iter #4 caught: `_format_dict` in cli/error_render.py
seeded `seen = {"kind", "reason"}` to keep both out of the kv block.
But the label line uses only ONE of them (`kind or reason or "error"`),
so the other was silently dropped.
Quota rejections at app/api/query.py:423 (daily-budget) and 488
(concurrent-slot) emit BOTH keys: `{reason: "daily_byte_cap_exceeded",
kind: "daily_bytes", ...}` and `{reason: "concurrent_slot_exceeded",
kind: "concurrent_scans", ...}`. Operator only saw `kind` in the label
and never the more specific `reason` value.
Fix: track which key actually went into the label and skip only that
one. The other appears in the kv section.
Verified output:
    Error: daily_bytes (HTTP 429)
      reason: daily_byte_cap_exceeded
      current: 99999
      ...
8 affected render tests pass.
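
The label-tracking fix can be sketched like this. It is a simplified
stand-in for `_format_dict`, with an assumed signature and layout; it
only shows the idea of skipping the key that actually became the label.

```python
def format_error(detail: dict, status: int) -> str:
    """Render a typed error dict as a label line plus key/value rows."""
    # Pick the first populated key for the label: kind, then reason.
    label_key = next((k for k in ("kind", "reason") if detail.get(k)), None)
    label = detail[label_key] if label_key else "error"
    lines = [f"Error: {label} (HTTP {status})"]
    for key, value in detail.items():
        # Skip only the key used in the label; the other one is kept.
        if key == label_key or value is None:
            continue
        lines.append(f"  {key}: {value}")
    return "\n".join(lines)


quota = {"reason": "daily_byte_cap_exceeded", "kind": "daily_bytes", "current": 99999}
print(format_error(quota, 429))
```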