agnes-the-ai-analyst/tests/test_diagnose_billing.py
ZdenekSrotyr b627de8344 feat(diagnose) + docs: warn on USER_PROJECT_DENIED footgun + document all newly-exposed knobs
Diagnostic + operator-facing documentation that closes the loop on the work in this PR.

`da diagnose` (via /api/health/detailed):
  - New _check_bq_billing_project() helper. When data_source.type='bigquery' and BqProjects.billing == BqProjects.data, surface a yellow warning: 'BigQuery billing project equals data project'. Hint includes the YAML field path + the /admin/server-config UI shortcut. Diagnose's overall status promotes warning → degraded so the CLI echoes it.
  - Non-BQ instances (Keboola-only, etc.) skip the check.
  - Implementation hooks into the existing /api/health/detailed surface — no new endpoint, no CLI changes.
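
The check described above could look roughly like the sketch below. It is an illustration, not the code in this commit: the `BqProjects` dataclass is a stand-in for the real resolved-projects object, the function signature, the `overall_status` helper, and the `"healthy"` label are all hypothetical, and only the warning text, the `warning`/`degraded` statuses, and the field names come from the notes above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BqProjects:
    """Stand-in for the resolved project pair; the real class may differ."""
    data: str
    billing: str


def check_bq_billing_project(
    source_type: str, projects: Optional[BqProjects]
) -> Optional[dict]:
    """Return a bq_config service entry, or None when the check does not apply."""
    # Non-BQ instances (Keboola-only, etc.) skip the check entirely.
    if source_type != "bigquery" or projects is None:
        return None
    if projects.billing == projects.data:
        return {
            "status": "warning",
            "detail": "BigQuery billing project equals data project",
            # Hint wording is assumed; it names the YAML path and the UI shortcut.
            "hint": (
                "Set data_source.bigquery.billing_project (or use "
                "/admin/server-config) to avoid 403 USER_PROJECT_DENIED."
            ),
        }
    return {"status": "ok"}


def overall_status(services: dict) -> str:
    """Promote any service-level warning to an overall 'degraded' status."""
    statuses = {entry.get("status") for entry in services.values()}
    # 'healthy' is an assumed label for the no-warning case.
    return "degraded" if "warning" in statuses else "healthy"
```

Keeping the check a pure function of the source type and the resolved projects makes the three cases in the tests (warn, clean, skip) easy to exercise without a running server.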

config/instance.yaml.example documentation:
  - data_source.bigquery.billing_project: USER_PROJECT_DENIED hint, /admin/server-config UI reference
  - data_source.bigquery.legacy_wrap_views: analyst-side discipline note (use `da fetch` / `da query --remote`), issue #101 history, view-heavy deployment guidance
  - data_source.bigquery.max_bytes_per_materialize: cost guardrail block (NEW — wasn't documented in .example before)
  - ai.base_url: provider list + UI hint
  - openmetadata + desktop: 'configurable via /admin/server-config UI' headers
  - corporate_memory: leading note that the schema is editable via UI

Other docs:
  - CHANGELOG.md: comprehensive Unreleased section
  - CLAUDE.md: schema chain → v20 + Materialized SQL connector mode + per-connector tab UI mention
  - README.md: mode-first source table summary
  - docs/architecture.md: per-connector tab UI mention
  - cli/skills/connectors.md: bootstrap rails (parallel to #154)
  - docs/superpowers/plans/2026-05-01-admin-tables-form-cleanup.md: implementation plan archive (2515 lines)
  - scripts/seed_dummy_tables.py: drop is_public after #150 RBAC migration (column gone)

Tests:
  - test_diagnose_billing.py — 3 cases (BQ with billing==data warns, BQ with billing!=data clean, non-BQ skips)
2026-05-01 20:27:24 +02:00


"""Phase K — `da diagnose` warning when BQ billing_project == project.

Surfaces via /api/health/detailed (which `da diagnose` already consumes):
when data_source.type == 'bigquery' and the resolved BqProjects.billing equals
BqProjects.data, the response includes a `services.bq_config` entry with
status='warning' and a hint about the 403 USER_PROJECT_DENIED footgun.
"""

import pytest


def _auth(token: str) -> dict:
    return {"Authorization": f"Bearer {token}"}


def _patch_instance_config(monkeypatch, cfg: dict) -> None:
    """Replace app.instance_config.load_instance_config + reset caches.

    Also clears connectors.bigquery.access.get_bq_access's @functools.cache
    so each test sees fresh BqProjects.
    """
    monkeypatch.setattr(
        "app.instance_config.load_instance_config",
        lambda: cfg,
        raising=False,
    )
    # DATA_SOURCE env var, if set in the user shell, would override
    # get_data_source_type — strip it for deterministic tests.
    monkeypatch.delenv("DATA_SOURCE", raising=False)
    monkeypatch.delenv("BIGQUERY_PROJECT", raising=False)
    from app.instance_config import reset_cache
    reset_cache()


@pytest.fixture(autouse=True)
def _reset_after(monkeypatch):
    yield
    # Always reset the cache after each test so the next test (or an
    # unrelated suite running afterwards) sees fresh config.
    try:
        from app.instance_config import reset_cache
        reset_cache()
    except Exception:
        pass


def test_diagnose_warns_when_billing_equals_project(seeded_app, monkeypatch):
    """BQ instance with billing_project missing (or equal to project) → warning."""
    _patch_instance_config(monkeypatch, {
        "data_source": {
            "type": "bigquery",
            "bigquery": {
                "project": "shared-data-prod",
                "billing_project": "shared-data-prod",
            },
        },
    })
    c = seeded_app["client"]
    token = seeded_app["admin_token"]
    r = c.get("/api/health/detailed", headers=_auth(token))
    assert r.status_code == 200, r.text
    body = r.json()
    bq_cfg = body.get("services", {}).get("bq_config")
    assert bq_cfg is not None, body
    assert bq_cfg.get("status") == "warning", bq_cfg
    # Hint mentions the YAML field path so operators know what to fix.
    blob = (str(bq_cfg.get("detail", "")) + " " + str(bq_cfg.get("hint", ""))).lower()
    assert "billing_project" in blob, bq_cfg


def test_diagnose_clean_when_billing_differs(seeded_app, monkeypatch):
    """Distinct billing_project → no warning surfaced."""
    _patch_instance_config(monkeypatch, {
        "data_source": {
            "type": "bigquery",
            "bigquery": {
                "project": "data-prod",
                "billing_project": "billing-dev",
            },
        },
    })
    c = seeded_app["client"]
    token = seeded_app["admin_token"]
    r = c.get("/api/health/detailed", headers=_auth(token))
    assert r.status_code == 200, r.text
    body = r.json()
    bq_cfg = body.get("services", {}).get("bq_config")
    # If present, it must be ok; absence is also fine (means no warning).
    if bq_cfg is not None:
        assert bq_cfg.get("status") == "ok", bq_cfg


def test_diagnose_no_warning_on_keboola_instance(seeded_app, monkeypatch):
    """Non-BQ instance: BQ billing check shouldn't surface at all."""
    _patch_instance_config(monkeypatch, {"data_source": {"type": "keboola"}})
    c = seeded_app["client"]
    token = seeded_app["admin_token"]
    r = c.get("/api/health/detailed", headers=_auth(token))
    assert r.status_code == 200, r.text
    body = r.json()
    # Either absent or explicitly status='ok' (n/a). Definitely not 'warning'.
    bq_cfg = body.get("services", {}).get("bq_config")
    if bq_cfg is not None:
        assert bq_cfg.get("status") != "warning", bq_cfg