Task 20: reusable pytest fixtures for the clean-bootstrap test suite.
Tasks 21 and 22 (reader smoke matrix + init smoke matrix) consume them.
- fastapi_test_server boots a real uvicorn subprocess against a tmp DATA_DIR,
pre-seeded with admin@example.com (Admin group), analyst@example.com
(Everyone group), and three tables (one per query_mode: local /
materialized / remote).
- web_session: cookie-authenticated httpx.Client for the admin user.
- test_pat: minted JWT for the analyst with table grants on local +
materialized.
- test_pat_no_grants: same shape, zero resource_grants.
- zero_grants_workspace: subprocess invocation of `agnes init` against the
no-grants PAT; returns the bootstrapped workspace path.
- NONEXISTENT_TABLE: module-level sentinel for the upcoming reader matrix.
Subprocess uvicorn (mirrors tests/test_e2e_corporate_memory.py) instead of
in-thread so DATA_DIR + module-level singletons in src.db don't bleed
across tests. agnes CLI invoked via `python -m cli.main` instead of the
.venv/bin/agnes shim, which depends on .pth file visibility that iCloud
Drive intermittently re-hides on macOS.
Two Task 4 review fixes for app/web/templates/install.html:
1. JSON-escape `ROLE` JS const via `{{ role | tojson }}` (defense in
depth — removes the dependency on Jinja autoescape semantics for JS
contexts; FastAPI's Literal validator already constrains role values).
2. Verify the analyst tile's clipboard payload is the analyst layout.
The pre-existing role-aware plumbing (compute_default_agent_prompt
threading role into setup_instructions_lines, picked up by the JS
SETUP_INSTRUCTIONS_TEMPLATE array) was correct; this commit adds regression
tests pinned specifically to the JS clipboard block so a future inversion
fails loudly.
Tests: analyst clipboard contains `agnes init` + `agnes catalog` and
NOT `agnes auth import-token` / `agnes skills`; admin clipboard is the
inverse. Plus an explicit assertion that ROLE is rendered via tojson.
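The escaping argument behind fix 1 can be illustrated without Jinja. The sketch below approximates what the `tojson` filter does (JSON-encode, then escape HTML-significant characters) using only the standard library. `js_literal` is a hypothetical helper for illustration, not code from install.html.

```python
import json


def js_literal(value):
    """Approximate Jinja's `tojson`: JSON-encode, then escape characters
    that could terminate a <script> block or an HTML attribute."""
    encoded = json.dumps(value)
    for char, escape in (("<", "\\u003c"), (">", "\\u003e"),
                         ("&", "\\u0026"), ("'", "\\u0027")):
        encoded = encoded.replace(char, escape)
    return encoded


# A normal role value renders as a quoted JS string literal.
print("const ROLE = " + js_literal("analyst") + ";")

# A hostile value cannot close the script tag or break out of the string.
print("const ROLE = " + js_literal('x"; alert(1); </script>') + ";")
```

The escaped form is still valid JSON, so the browser decodes it back to the original string.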
Devin Review iter #6 found 2 issues.
🟡 BUG: cli/error_render.py filtered out empty-string values via
`detail[key] not in (None, "")` and `value not in (None, "")` before
they could reach `_kv_line`. But `_kv_line` was specifically designed
to render empty strings as `(empty)` — the filter shadowed that
branch. The hidden field happens to be the most operator-actionable
one in `cross_project_forbidden`: `billing_project: ""` is the exact
diagnostic confirming WHY USER_PROJECT_DENIED fires.
Change filter to `is not None`. Empty strings now flow through
`_kv_line` and render as `billing_project: (empty)`.
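A minimal sketch of the filter change, assuming a simplified `_kv_line` and a hypothetical `render_fields` loop (the real cli/error_render.py is not reproduced here):

```python
def _kv_line(key, value):
    """Render one `key: value` line; empty strings become `(empty)`."""
    shown = "(empty)" if value == "" else value
    return f"  {key}: {shown}"


def render_fields(detail, keys):
    """Render the listed keys, skipping only those that are missing or None."""
    lines = []
    for key in keys:
        value = detail.get(key)
        # Before the fix the test was `value not in (None, "")`, which
        # dropped empty strings before _kv_line could mark them.
        if value is not None:
            lines.append(_kv_line(key, value))
    return lines


detail = {"kind": "cross_project_forbidden",
          "billing_project": "",
          "data_project": "prj-example-data-001"}
for line in render_fields(detail, ["billing_project", "data_project", "hint"]):
    print(line)
```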
📝 ANALYSIS: CHANGELOG wording for the test-connection endpoint said
"the saved data_source.bigquery config", which Devin flagged as
slightly misleading because `get_bq_access` is `@functools.cache`d —
"Test connection" tests the config in the running process, not the
just-saved YAML overlay. The save flow already returns
`restart_required: True` and the UI shows a banner, so the behavior
is documented; only the CHANGELOG wording was loose. Tightened to
"the **process-cached** BqAccess... Tests the config active in the
running process — after a save the response includes restart_required;
click Test AFTER restart to validate the freshly-saved values."
New test: test_renders_empty_string_as_empty_marker locks in the
empty-string-as-(empty) rendering for the cross_project_forbidden
case so a future filter change won't silently drop the diagnostic
again. 9 affected render tests pass.
Devin Review iter #3 found 3 new real bugs after iter #2's fixes landed.
🔴 RBAC check at app/api/query.py:362 used `row["name"]` against
`accessible_set`, but `accessible_set` is keyed by registry IDs
(`get_accessible_tables` returns `resource_grants.resource_id` —
table IDs, not display names). Confirmed by `_table_blocks` projection
at `app/resource_types.py:157-158`. When `id != name` (e.g.
`id="bq.finance.ue", name="ue"`), non-admin users with valid grants
got 403 `bq_path_access_denied`. Switch to `row["id"]`.
🚩 Bare-name pass at app/api/query.py:332 had the same name-vs-id
mismatch (different impact): legitimate accessible rows were skipped
from `dry_run_set`, so the cost guardrail under-counted scan bytes
for non-admin users. Could let an over-cap query through and
under-bill quota. Switch to `row_id` comparison.
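Both findings reduce to comparing the wrong field against an ID-keyed set. A small sketch with a hypothetical `filter_accessible` helper shows why the name comparison fails whenever `id != name`:

```python
def filter_accessible(rows, accessible_ids, key):
    """Return the rows whose `key` field appears in the grant set."""
    return [row for row in rows if row[key] in accessible_ids]


registry_rows = [{"id": "bq.finance.ue", "name": "ue"}]
# Grants are stored by registry ID, as resource_grants.resource_id.
accessible_ids = {"bq.finance.ue"}

by_name = filter_accessible(registry_rows, accessible_ids, key="name")
by_id = filter_accessible(registry_rows, accessible_ids, key="id")
print(len(by_name), len(by_id))
```

The name-keyed lookup returns nothing for a user who holds a valid grant, which is the 403 (and the skipped dry-run entry) described above.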
🟡 `placeholder_from` for billing_project was dead code.
`_BQ_OPTIONAL_FIELD_DEFAULTS["billing_project"] = ""` seeded an empty
string into every GET payload via `_ensure_bq_optional_fields`. JS
`isUnset = (value === undefined)` evaluated to false, so the
`(defaults to <project>)` placeholder NEVER rendered. Drop the seed —
field stays in `known_fields` (UI sees it) but routes through the
unset rendering path on GET, where placeholder_from fires.
Tests: test_get_surfaces_bq_fields_even_when_unset assertion flipped
from "billing_project IS present" to "billing_project NOT auto-seeded"
to lock in the new shape. 67 affected tests pass.
Devin Review iter #2 found 2 new issues (after iter #1's 5 fixes
landed). Both real, both addressed.
🔴 Quota user_id key mismatch defeated shared daily budget. /api/query
computed `user.get("id") or user.get("email")` while /api/v2/scan uses
`user.get("email") or "anon"` (app/api/v2_scan.py:327). Same user → two
different keys in the singleton QuotaTracker. BQ bytes consumed via
/api/query were tracked under UUID; via /api/v2/scan under email; the
`check_daily_budget` pre-flight on either endpoint never saw the
other's recorded bytes — per-user cap was effectively doubled. Match
v2/scan's email-first ordering.
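The doubling effect can be reproduced with a toy tracker. `ToyTracker`, `quota_key`, and `old_query_key` are illustrative stand-ins, not the real QuotaTracker API:

```python
from collections import defaultdict


def quota_key(user):
    """Email-first key, matching the ordering /api/v2/scan uses."""
    return user.get("email") or "anon"


def old_query_key(user):
    """The pre-fix /api/query ordering: id first, email as fallback."""
    return user.get("id") or user.get("email")


class ToyTracker:
    """Hypothetical stand-in for the shared QuotaTracker singleton."""

    def __init__(self):
        self.bytes_by_user = defaultdict(int)

    def record_bytes(self, key, n):
        self.bytes_by_user[key] += n


user = {"id": "8f3c-uuid", "email": "analyst@example.com"}

split = ToyTracker()
split.record_bytes(old_query_key(user), 100)   # /api/query before the fix
split.record_bytes(quota_key(user), 100)       # /api/v2/scan

shared = ToyTracker()
shared.record_bytes(quota_key(user), 100)      # /api/query after the fix
shared.record_bytes(quota_key(user), 100)      # /api/v2/scan

print(dict(split.bytes_by_user))
print(dict(shared.bytes_by_user))
```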
🟡 QuotaExceededError(KIND_CONCURRENT) → 400 instead of 429.
`quota.acquire(user_id)` raises this from __enter__ when the per-user
concurrent-scan slot is at cap. The exception propagated through the
@contextlib.contextmanager generator, the caller's `with guard:`
block, and was caught by execute_query's generic `except Exception`
handler → mapped to 400 with a flattened "Query error: concurrent_scans:
N/M" string, dropping the typed retry_after_seconds field. Wrap the
`with quota.acquire(...)` in a try/except QuotaExceededError that maps
to 429 with the same typed-detail shape used for the daily-budget
rejection — consistent with /api/v2/scan:392-402.
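A sketch of the control flow, with a hypothetical `HttpError` standing in for the framework exception and a simplified `acquire` that takes the slot counts directly. The point is that the typed handler sits ahead of the generic one, so the structured fields survive:

```python
import contextlib


class QuotaExceededError(Exception):
    def __init__(self, kind, retry_after_seconds):
        super().__init__(kind)
        self.kind = kind
        self.retry_after_seconds = retry_after_seconds


class HttpError(Exception):
    """Hypothetical stand-in for the framework's HTTP exception."""

    def __init__(self, status_code, detail):
        super().__init__(status_code)
        self.status_code = status_code
        self.detail = detail


@contextlib.contextmanager
def acquire(active, cap):
    """Raise on entry when the per-user concurrent slot is at cap."""
    if active >= cap:
        raise QuotaExceededError("concurrent_scans", retry_after_seconds=5)
    yield


def run_query(active, cap):
    try:
        with acquire(active, cap):
            return "rows"
    except QuotaExceededError as exc:
        # Typed mapping: keep the structured fields and answer 429.
        raise HttpError(429, {"kind": exc.kind,
                              "retry_after_seconds": exc.retry_after_seconds}) from exc
    except Exception as exc:
        # Generic handler: flattens everything else into a 400 string.
        raise HttpError(400, f"Query error: {exc}") from exc
```

An exception raised inside the first `except` clause is not intercepted by the sibling generic clause, so the 429 propagates unchanged.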
Tests: test_api_query_quota.py user_id strings updated to
"admin@test.com" (the seeded_app admin's email) to match the new
email-first ordering. 40 affected tests pass.
CI failures on PR #168 after rebasing onto main + PR #169/#170:
gw2 worker bucket reproducibly fails test_admin_can_list_registry +
test_three_sources_catalog_count with `assert "X" in set()` — the
register-table POST landed but list/catalog endpoints returned empty.
Root cause: pre-existing module-level cache leak across tests on the
same xdist worker process. `app.instance_config._instance_config`,
`connectors.bigquery.access.get_bq_access` (functools.cache), and
`app.api.v2_quota._quota_singleton` all survive across function-scoped
fixtures, so a prior test that read instance.yaml against an old
DATA_DIR poisons the next test's env even after `monkeypatch.setenv`
resets DATA_DIR.
Pre-existing on main — surfaced now because #160's new tests changed
the xdist test bucket distribution and dropped a different mix of
tests onto gw2 that hit the leak. Direct cause is unchanged; my T1a
fix in test_main_exits_when_project_missing addressed one symptom of
the same pollution but didn't generalize.
Add an autouse fixture in conftest.py that resets all three caches
before every test. Generic fix; helps any future test that reads
instance.yaml or BqAccess and would otherwise be order-dependent on
the worker.
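The leak and the reset can be reproduced in isolation. In this sketch `get_access`, `load_instance_config`, and `reset_caches` are hypothetical stand-ins for the three real caches; in conftest.py the body of `reset_caches` would sit inside a `@pytest.fixture(autouse=True)` function.

```python
import functools
import os

_instance_config = None  # module-level singleton, like app.instance_config


@functools.cache
def get_access():
    """Cached reader: captures DATA_DIR at first call."""
    return os.environ.get("DATA_DIR", "")


def load_instance_config():
    global _instance_config
    if _instance_config is None:
        _instance_config = {"data_dir": os.environ.get("DATA_DIR", "")}
    return _instance_config


def reset_caches():
    """What the autouse fixture body would do before each test."""
    global _instance_config
    _instance_config = None
    get_access.cache_clear()


os.environ["DATA_DIR"] = "/tmp/test-a"
first = (get_access(), load_instance_config()["data_dir"])

os.environ["DATA_DIR"] = "/tmp/test-b"
stale = (get_access(), load_instance_config()["data_dir"])  # still test-a

reset_caches()
fresh = (get_access(), load_instance_config()["data_dir"])  # now test-b
```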
Two new test files driving the next commit's admin UI work.
tests/test_admin_bigquery_test_connection.py — POST
/api/admin/bigquery/test-connection (admin-only health probe). 6 cases:
- success → 200 with ok=true + resolved billing_project / data_project
/ elapsed_ms
- not_configured → 400 with the typed BqAccessError detail surface
- cross_project_forbidden (USER_PROJECT_DENIED simulation) → 502
- 10s timeout → 504 with kind="timeout" (best-effort cancel_job)
- non-admin caller → 403
- unauthenticated → 401
The endpoint matters for the operator side of the reporter's loop —
admin saves data_source.bigquery in /admin/server-config, clicks
"Test connection", gets typed structured feedback BEFORE any analyst
hits a query failure.
tests/test_admin_server_config_placeholder.py — `billing_project`
field-spec must carry `placeholder_from: ["data_source", "bigquery",
"project"]` so the JS template can resolve and inject
"(defaults to <project>)" greyed under the input when the operator
hasn't set billing_project explicitly. This makes the existing
"billing falls back to data" rule (connectors/bigquery/access.py:
339-340) visible in the UI.
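The lookup the template performs is done in JS; the Python sketch below only illustrates the resolution rule. `resolve_placeholder` is a hypothetical name, and `None` models the JS `undefined` (unset) state.

```python
def resolve_placeholder(config, field_spec, value):
    """Return placeholder text for an unset field, else None."""
    path = field_spec.get("placeholder_from")
    if value is not None or not path:
        return None
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return f"(defaults to {node})"


config = {"data_source": {"bigquery": {"project": "prj-example-data-001"}}}
spec = {"placeholder_from": ["data_source", "bigquery", "project"]}

print(resolve_placeholder(config, spec, None))  # unset: placeholder shown
print(resolve_placeholder(config, spec, ""))    # empty string counts as set
```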
7 RED on the current branch (endpoint and placeholder_from key both
absent). GREEN landing in the next commit.
The reporter (#160) saw `USER_PROJECT_DENIED` raw in the CLI because
all three CLI error-rendering paths flatten typed BqAccessError /
guardrail / RBAC dicts to a truncated single-line string, hiding the
structured `hint` field that explains how to fix the misconfig.
Fix: shared `cli/error_render.py:render_error(status_code, body)` that
recognizes the canonical typed shapes and pretty-prints them. Falls
back to the truncated-and-flattened form for unrecognized bodies, so the
renderer never produces output worse than the status quo.
Recognized shapes:
- {detail: {kind: ..., hint?, billing_project?, data_project?}}
— typed BqAccessError responses from /api/v2/scan, /sample, /schema,
/api/query (when /api/query escalates a BQ failure)
- {detail: {reason: 'remote_scan_too_large', scan_bytes, limit_bytes,
tables, suggestion}} — new /api/query cost-guardrail rejection
- {detail: {reason: 'bq_path_not_registered'/'bq_path_access_denied',
path, hint?, registered_as?}} — new /api/query RBAC patch
- {detail: '...'} — string detail (legacy endpoints)
Wired through 3 CLI paths:
- cli/v2_client.py: V2ClientError.__str__ delegates to render_error;
pre-truncation removed from V2ClientError.message (was hiding hints
past 200 chars).
- cli/commands/query.py:_query_remote: parse JSON body, call renderer
on error.
- cli/commands/query.py:_query_hybrid: catch RemoteQueryError, build
synthetic `{detail: {kind: error_type, **details}}` payload, render.
tests/test_cli_query.py:test_remote_query_failure: assertion updated
from `"Query failed"` (no longer printed) to `HTTP 400` + `bad SQL`
(what the renderer surfaces for string detail).
Sample output for cross_project_forbidden:
Error: cross_project_forbidden (HTTP 502)
billing_project: (empty)
data_project: prj-example-data-001
message: USER_PROJECT_DENIED on bigquery.googleapis.com
hint: Set data_source.bigquery.billing_project in
/admin/server-config to a project where the SA has
serviceusage.services.use, or grant the SA that role on the
data project.
19 tests pass — 10 from T4a now GREEN + 3 prior cli_query tests still
green + 6 ancillary.
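A simplified sketch of the shape dispatch, assuming the body has already been parsed from JSON. The field ordering and exact line layout of the real renderer are not reproduced; `max_len` is an illustrative parameter.

```python
import json


def render_error(status_code, body, max_len=200):
    """Pretty-print typed error bodies; flatten anything unrecognized."""
    detail = body.get("detail") if isinstance(body, dict) else None
    if isinstance(detail, str):
        return f"Error (HTTP {status_code}): {detail}"
    if isinstance(detail, dict):
        label = detail.get("kind") or detail.get("reason")
        if label:
            lines = [f"Error: {label} (HTTP {status_code})"]
            for key, value in detail.items():
                if key in ("kind", "reason") or value is None:
                    continue
                shown = "(empty)" if value == "" else value
                lines.append(f"  {key}: {shown}")
            return "\n".join(lines)
    # Fallback: the pre-existing truncated single-line form.
    flat = json.dumps(body, default=str)
    return f"HTTP {status_code}: {flat[:max_len]}"


print(render_error(502, {"detail": {"kind": "cross_project_forbidden",
                                    "billing_project": "",
                                    "hint": "Set billing_project"}}))
```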
Phase 3 review identified an RBAC + cost-cap bypass: `SELECT * FROM
"bq"."ds"."tbl"` (catalog token quoted as a DuckDB identifier) was NOT
matched by the BQ_PATH regex, so direct quoted-form references skipped
both the registry check and the cost-cap dry-run. DuckDB resolves
`"bq"` to the same ATTACHed BQ catalog, so the bypass is real.
Widen the catalog-token alternation: `(?:"bq"|bq)` matches both forms.
Negative lookbehind `(?<![\w.])` still rejects look-alike prefixes
(`other_bq`, `my_bq`); the new "my_bq".ds.tbl negative test locks that
in alongside `other_bq.ds.tbl`.
Tests:
- 2 new positive cases in tests/test_query_bq_regex.py for the quoted
form (`"bq"."finance"."ue"` and uppercase `"BQ"."ds"."tbl"`).
- 1 new negative case rejecting `"my_bq".ds.tbl` so the quoted-form
widening doesn't open a different evasion.
- 1 new RBAC test in tests/test_api_query_rbac_bq_path.py: admin
hitting an unregistered quoted path returns the same
bq_path_not_registered 403 as the unquoted form.
All 33 Phase 3 tests pass after the fix.
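The widened alternation can be sketched as follows. `BQ_PATH_SKETCH` is a reduced approximation written for illustration and does not claim to cover the full 16-variant matrix that the real `BQ_PATH` handles.

```python
import re

# One identifier component: either double-quoted or a bare word.
_IDENT = r'(?:"[^"]+"|\w+)'

BQ_PATH_SKETCH = re.compile(
    r'(?<![\w.])'        # reject look-alike prefixes such as other_bq
    r'(?:"bq"|bq)'       # catalog token, quoted or bare
    r'\s*\.\s*(' + _IDENT + r')'   # dataset
    r'\s*\.\s*(' + _IDENT + r')',  # table
    re.IGNORECASE,
)


def find_bq_paths(sql):
    """Return (dataset, table) pairs, unquoted and lower-cased."""
    return [(ds.strip('"').lower(), tbl.strip('"').lower())
            for ds, tbl in BQ_PATH_SKETCH.findall(sql)]


print(find_bq_paths('SELECT * FROM "bq"."finance"."ue"'))
print(find_bq_paths('SELECT * FROM "my_bq".ds.tbl'))
```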
3 new test files that drive the upcoming cli/error_render.py module
and the V2ClientError refactor.
tests/test_cli_error_render.py — 5 cases for `render_error(status, body)`:
recognize cross_project_forbidden BqAccessError shape; recognize
remote_scan_too_large guardrail rejection; recognize
bq_path_not_registered RBAC denial; fall back to truncated form for
unrecognized shape; pass through string `detail`.
tests/test_cli_query_render.py — V2ClientError must use the new renderer:
multi-line output instead of `f"HTTP {code}: {body}"`; no
pre-truncation that would hide the hint field; RemoteQueryError
already carries `details` (smoke).
tests/test_remote_query_error_details.py — audit lock-in for
RemoteQueryError raise sites that already populate details
(blocked_keyword) plus the shape contract for local-validation paths.
Run: 5 errors (cli.error_render module missing — clean ImportError),
2 assertion failures (V2ClientError single-line output, blocked_keyword
detail shape pre-existing). 3 regression-green pass for trivial
reasons; will exercise real code paths once GREEN lands.
The headline implementation for issue #160. POST /api/query now gates
direct `bq."<dataset>"."<source_table>"` references behind the registry
and bounds the BQ scan cost behind a configurable cap. Wired through
the same singleton QuotaTracker as /api/v2/scan so daily-byte budgets
are shared across both BQ-touching paths.
Changes in app/api/query.py:
- Add module-level `BQ_PATH` regex matching the 16 syntax variants
verified empirically (fully-quoted, unquoted, mixed quoting,
case-insensitive, inside CTE bodies, multi-path, …).
- Add `bigquery_query` to the SQL keyword blocklist. Closes the
pre-existing function-call backdoor where a user could run an
arbitrary BQ jobs API call against any reachable dataset, bypassing
the registry and RBAC. Wrap views internal to the BQ extractor still
use bigquery_query() — but those run via DuckDB view resolution at
query time, not via user-submitted SQL, so the blocklist doesn't
break them.
- Add `_bq_guardrail_inputs` helper: walks user SQL twice — once for
bare-name matches against accessible registered remote-BQ names
(contributes to dry_run_set), once for direct `bq.X.Y` matches
(gated against `find_by_bq_path` lookups, returns 403 with
structured detail on miss or grant violation).
- Add `_enforce_remote_bq_quota_and_cap` helper: pre-flight
`check_daily_budget` (over-cap → 429), then `with quota.acquire(...)`
wraps a per-path BQ dry-run, sums bytes, raises 400
`remote_scan_too_large` when total > cap.
- Cap default 5 GiB; configurable via `api.query.bq_max_scan_bytes`
in /admin/server-config (next phase wires the UI).
- Post-flight `record_bytes` against the user's daily counter.
- Module-level imports of `_bq_dry_run_bytes`, `_build_quota_tracker`,
`get_bq_access` so tests can monkeypatch via `app.api.query.<name>`.
Tests:
- All 23 RED tests from the previous commit now pass (regex matrix,
blocklist with detail-string assertion, RBAC unregistered/admin-bypass,
guardrail dry-run-called/over-cap-rejected, quota pre-flight 429).
- mock_dry_run fixture stubs both `_bq_dry_run_bytes` and `get_bq_access`
so guardrail tests don't require a live BQ project.
- Quota test uses `admin1` (the seeded_app fixture's actual user id, not
`admin`).
Smoke: 887 passed across query/bq/admin/extractor/registry/quota
domains. No regressions.
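The cap check itself is simple arithmetic over per-path dry-run results. In this sketch `enforce_cap` and `ScanTooLarge` are hypothetical names, and the dry-run function is injected so no BigQuery project is needed; the real helper maps the rejection to HTTP 400.

```python
DEFAULT_CAP_BYTES = 5 * 1024 ** 3  # 5 GiB, the default named above


class ScanTooLarge(Exception):
    def __init__(self, detail):
        super().__init__(detail["reason"])
        self.detail = detail


def enforce_cap(paths, dry_run_bytes, cap_bytes=DEFAULT_CAP_BYTES):
    """Dry-run every path, sum the bytes, reject when the total exceeds the cap."""
    total = sum(dry_run_bytes(path) for path in paths)
    if total > cap_bytes:
        raise ScanTooLarge({
            "reason": "remote_scan_too_large",
            "scan_bytes": total,
            "limit_bytes": cap_bytes,
            "tables": list(paths),
        })
    return total


# Stubbed dry-run sizes, in bytes, for two hypothetical paths.
sizes = {"finance.ue": 3 * 1024 ** 3, "finance.orders": 3 * 1024 ** 3}
print(enforce_cap(["finance.ue"], sizes.__getitem__))
```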
5 new test files for the upcoming /api/query pre-flight block (next
commit). All failing for the right reason on the current codebase:
tests/test_query_bq_regex.py (8 + 1 + 7 + 1 = 17 cases)
Pure unit test of `BQ_PATH` regex constant (not yet imported from
app.api.query). Verifies the 16-case matrix from spec §4.3.1:
positive matches for fully-quoted / unquoted / mixed quoting / case
variants / inside CTE bodies / multiple paths in one statement;
negative for bare registered names / 2-part bq.col / prefix that
contains bq / middle-component bq / quoted bare names; documented
string-literal false-positive accepted.
tests/test_query_bigquery_query_blocked.py (3 cases)
POST /api/query with bigquery_query() function call must hit the
canonical blocklist rejection ("Only single SELECT queries are
allowed"). Today the blocklist passes all 3 — confirmed RED via
detail-string assertion.
tests/test_api_query_rbac_bq_path.py (4 cases)
Direct bq."<ds>"."<tbl>" references must be registry-gated:
unregistered → 403 bq_path_not_registered; registered + admin →
bypass per-name grant; case-insensitive lookup; string-literal
containing bq.X.Y → 403 (strict-deny).
tests/test_api_query_guardrail.py (3 cases)
Cost guardrail: SQL referencing a registered remote BQ row invokes
_bq_dry_run_bytes (verified via call-counter side effect); over-cap
dry-run returns 400 remote_scan_too_large with bytes/tables/suggestion
in detail; non-BQ queries skip the dry-run entirely.
tests/test_api_query_quota.py (3 cases)
Daily-byte quota check_daily_budget pre-flight (over-cap → 429
before dry-run); record_bytes post-flight on the shared singleton
v2_quota tracker; non-BQ queries leave the counter alone.
RED breakdown: 16 ImportError (BQ_PATH not yet defined) + 7 assertion
failures = 23 fully-RED. 6 tests pass for regression-green reasons
(use `if r.status_code == 403:` patterns where current code returns
400 for unrelated reasons). They serve as anti-regression guards once
the implementation lands and remain green throughout — documented per
spec §6 Phase 1 RED-discipline notes.
The upcoming /api/query RBAC patch (next phase) gates direct
`bq."<dataset>"."<source_table>"` references in user SQL — every such path
must point at a registered query_mode='remote' BigQuery row, otherwise the
caller has stepped around the registry and around RBAC.
Add `TableRegistryRepository.find_by_bq_path(bucket, source_table)` to
support that lookup. Returns None if no row matches, the row dict if
exactly one matches, or the oldest-by-`registered_at` row when 2+ match
(no UNIQUE constraint on `(source_type, bucket, source_table)` — admins
can in principle register a BQ table twice with different ids/names).
Match is case-insensitive on bucket+source_table so user SQL `SELECT * FROM
bq.Finance.UE` resolves to a `(finance, ue)` registry row. NULL values in
either column are excluded so a legacy NULL-bucket row never masks a
legitimate non-NULL lookup.
5 RED tests cover: empty registry, non-BQ source rejected, single match,
oldest-of-many tie-breaker, case-insensitive match, NULL-column exclusion.
All initially failed with AttributeError; pass after the ~30 LOC method
addition.
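The lookup rules can be sketched over an in-memory list instead of the repository's database. The function below takes `rows` as an explicit argument for that reason, and the `source_type` value `"bigquery"` is an assumption made for illustration.

```python
def find_by_bq_path(rows, bucket, source_table):
    """Return the oldest matching BigQuery row, or None when nothing matches."""
    wanted = (bucket.lower(), source_table.lower())
    matches = [
        row for row in rows
        if row.get("source_type") == "bigquery"
        and row.get("bucket") is not None        # NULL columns never match
        and row.get("source_table") is not None
        and (row["bucket"].lower(), row["source_table"].lower()) == wanted
    ]
    # Tie-breaker: oldest registered_at wins when 2+ rows match.
    return min(matches, key=lambda row: row["registered_at"], default=None)


rows = [
    {"id": "a", "source_type": "bigquery", "bucket": "finance",
     "source_table": "ue", "registered_at": "2024-02-01"},
    {"id": "b", "source_type": "bigquery", "bucket": "Finance",
     "source_table": "UE", "registered_at": "2024-01-01"},
    {"id": "c", "source_type": "bigquery", "bucket": None,
     "source_table": "ue", "registered_at": "2023-01-01"},
    {"id": "d", "source_type": "keboola", "bucket": "finance",
     "source_table": "ue", "registered_at": "2022-01-01"},
]
print(find_by_bq_path(rows, "Finance", "UE")["id"])
```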
Now that VIEW/MATERIALIZED_VIEW always wrap via bigquery_query() (the
prior `legacy_wrap_views=True` branch behavior, made unconditional in
the previous commit), the toggle has no semantic meaning and is removed
across the codebase.
Production code:
- app/api/admin.py: drop the field from _OPTIONAL_FIELDS["data_source"]
["bigquery"]["fields"] and from _BQ_OPTIONAL_FIELD_DEFAULTS, plus the
comment block above the defaults dict.
- config/instance.yaml.example: drop the example snippet.
- src/orchestrator.py: update the inner-objects skip-branch comment to
reflect the new BQ behavior (the skip itself stays — keboola
use_extension=False still inserts _meta rows without inner views).
- app/web/templates/admin_tables.html: rewrite operator copy in the
register and edit forms to reflect always-wrap.
Tests:
- tests/test_admin_server_config.py (TestServerConfigBigQueryFields):
flip assertions from "field IS present" to "field NOT present" on
legacy_wrap_views. Drop the test_post_persists_legacy_wrap_views test
since the field no longer exists.
- tests/test_admin_server_config_known_fields.py: same flip on the
known-fields registry assertion.
- tests/test_bigquery_extractor.py: drop the obsolete
test_view_entity_does_not_create_master_view_by_default (asserted the
bug we fixed) and test_legacy_wrap_views_toggle_restores_old_behavior
(toggle no longer meaningful). Update remaining test docstrings.
Operators with `legacy_wrap_views: true` set in their overlay get the
new (equivalent) behavior automatically — the unrecognized key is
silently ignored by the YAML loader. Operators with `false` get the
issue-#160 fix as a behavior change, not a regression.
Spec gate updated: production code grep gate
grep -rn 'legacy_wrap_views' connectors app src config cli
must return zero. tests/ excluded — historical "removed in #160"
breadcrumbs and `assert "X" not in fields` regression guards retained
as anti-regression signals.
Issue #160: da query --remote against query_mode='remote' BQ rows whose
underlying entity is a VIEW or MATERIALIZED_VIEW returned a DuckDB catalog
error because the extractor (with legacy_wrap_views=False default since
the v2 fetch primitives release) skipped master-view creation for those
entity types — but kept inserting the _meta row, leaving operators with a
registered name that resolves to nothing.
Always create a master view for entity types we have proven runtime support
for in this codebase:
BASE TABLE → bq."<dataset>"."<source_table>"
(Storage Read API path; predicate pushdown)
VIEW / MAT_VIEW → bigquery_query('<project>', 'SELECT * FROM `proj.ds.tbl`')
(jobs API path; no pushdown. The upcoming /api/query
cost guardrail bounds the scan. Same SQL form as the former
legacy_wrap_views=True branch, now always on)
For other entity types (EXTERNAL, SNAPSHOT, CLONE, future), log a warning
and SKIP both the master view AND the _meta row. The registry row remains
intact so /api/v2/scan still works for `da fetch`; we just don't expose a
stale _meta entry that the orchestrator would later strand.
The legacy_wrap_views config knob is still readable in this commit (read
returns the value, which is then ignored). Removal across the rest of
the codebase happens in the follow-up REFACTOR commit.
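The entity-type dispatch described above can be sketched as a pure function. `master_view_source` is a hypothetical name, and the sketch assumes the identifiers have already been validated, since it does no escaping.

```python
def master_view_source(entity_type, project, dataset, table):
    """Return the SQL source for the master view, or None to skip the entity."""
    if entity_type == "BASE TABLE":
        # Storage Read API path; predicate pushdown applies.
        return f'bq."{dataset}"."{table}"'
    if entity_type in ("VIEW", "MATERIALIZED_VIEW"):
        # Jobs API path; no pushdown.
        inner = f"SELECT * FROM `{project}.{dataset}.{table}`"
        return f"bigquery_query('{project}', '{inner}')"
    # EXTERNAL, SNAPSHOT, CLONE, and anything unknown: skip view and _meta row.
    return None


for kind in ("BASE TABLE", "VIEW", "SNAPSHOT"):
    print(kind, "->", master_view_source(kind, "proj", "ds", "tbl"))
```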
tests/test_bigquery_extractor.py:
- Add 3 RED tests covering the new always-wrap behavior:
test_view_creates_wrap_view_with_default_config,
test_materialized_view_creates_wrap_view_with_default_config,
test_unsupported_entity_type_skips_meta_and_view.
- Fix pre-existing flakiness in test_main_exits_when_project_missing
by resetting app.instance_config cache before the no-project mock —
the prior test populates the cache with a project, and removing the
legacy_wrap_views get_value() call surfaced this latent ordering bug.
- _list_tables now accepts a user param and delegates to
get_accessible_tables: admins see all, non-admins see only tables
covered by their resource_grants. Fixes silent leak of table names
to unauthorised analysts.
- today derived from now.date() (UTC) instead of date.today()
(server-local TZ), so today and now are always consistent.
- Updated test_render_override_tables_list to seed an admin user so
RBAC filtering doesn't hide the table; added three new tests covering
per-user table isolation, admin sees-all, and no-grants-empty.
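The date change in the second bullet guards against a skew that only appears near midnight. A small sketch, using a fixed instant and a hypothetical server offset of UTC+2 in place of `date.today()`:

```python
from datetime import datetime, timedelta, timezone

# A fixed instant late in the UTC day.
now = datetime(2024, 1, 1, 23, 30, tzinfo=timezone.utc)

# Deriving the date from the same UTC timestamp keeps the two consistent.
today_utc = now.date()

# What a server running at UTC+2 would report as its local calendar date.
server_tz = timezone(timedelta(hours=2))
today_local = now.astimezone(server_tz).date()

print(today_utc, today_local)
```

With the server-local date, `today` would already be one day ahead of `now` for the last two hours of every UTC day.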
Finding #1: _build_context now routes through render_agent_prompt_banner when
a DB connection is available, so both /setup and the /dashboard clipboard CTA
always reflect the admin override (or the live default when no override is set).
Previously _build_context unconditionally used resolve_lines(), ignoring the
welcome_template override for the dashboard JS array.
Finding #2: PUT /api/admin/welcome-template now performs a second render pass
with user=None (anonymous stub) after the authenticated-user pass. Templates
that reference user.* fields without an {% if user %} guard are rejected with
a clear 400 error explaining the anon-visitor breakage.
- Fix #1: _detect_existing_project now checks .claude/settings.json for
the "da sync" marker instead of the deleted CLAUDE.md; update tests accordingly.
- Fix #2: preview endpoint uses autoescape=False to match /setup rendering;
align render_agent_prompt_banner in welcome_template.py to the same.
- Fix #3: apply _sanitize_banner_html to override render path in setup_page
so all render paths sanitize consistently.
- Fix #4: move .setup-link-banner into the existing-user branch where
account_details.last_sync_display is reachable; remove dead copy from
new-user branch.
The /admin/agent-prompt editor now pre-fills with the full bash bootstrap
script from setup_instructions.resolve_lines() instead of being empty.
When an admin saves an override it replaces the default everywhere — the
/setup page display and the dashboard clipboard CTA — rather than adding a
banner above the auto-generated commands.
GET /api/admin/welcome-template now returns a `default` field with the live
computed script so the editor always shows meaningful starting content.
{server_url} and {token} single-brace placeholders survive Jinja2 rendering
and are substituted by JavaScript at clipboard-copy time as before.
Preview pane switches to textContent (not innerHTML) since content is bash.