* fix(security+ops): #82 #85 #87 — auth hardening, API validation, deploy posture Security and operational hardening across three issue groups: - M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun) - C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile) - M21: Docker resource limits (mem_limit, cpus) on app + scheduler - M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server) - M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING) - M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs - C2: table_id traversal validation on /api/data/{table_id}/download - M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename - C5: reset_token removed from POST /api/users/{id}/reset-password response - C8: Startup WARNING when no user has password_hash (bootstrap window visible) - M9: Audit log on failed web form login (mirrors /auth/token endpoint) - M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch) Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90) Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety Review fixes: - Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex — was missing, allowing requests to AWS/GCP/Azure metadata endpoints - Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex - M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with warning log — prevents unhandled exception if DB write fails after successful token consumption - Add test for 169.254.169.254 SSRF rejection Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak Address Devin Review findings on PR #104: 1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution + ipaddress module checks. The old regex patterns like `fe80:` only matched up to the first colon, missing real IPv6 addresses like `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the hostname via getaddrinfo and checks each resulting IP against ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast. 2. CLI commands broken: `da setup test-connection`, `da setup verify`, `da diagnose`, `da status` all called /api/health expecting the old format (status=="healthy", services dict). Now they call /api/health/detailed for service-level checks (with graceful fallback to the minimal endpoint when auth is not configured). 3. Temp file handle leak: _stream_to_temp returns an open NamedTemporaryFile; callers now close it before shutil.move() to prevent FD leaks until GC. Also adds IPv6 SSRF test cases (loopback, link-local, unique-local, multicast) with mocked DNS resolution for test environment independence. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * fix(review): download regex blocks hyphenated IDs; document health split Address Devin Review round-3 findings on PR #104: 1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download endpoint used the strict SQL-identifier regex which does not allow dots or hyphens, but Keboola table IDs like in.c-crm.orders contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots and hyphens while still blocking path-traversal chars (/, .., \) and quote/control characters. Added test for hyphenated/dotted IDs. 2. Documented health endpoint split in DEPLOYMENT.md: Added Health checks & external monitoring section explaining both endpoints (minimal unauth /api/health vs authenticated /api/health/detailed) and how to wire external monitoring tools to the detailed endpoint with a PAT. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com> * release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening * fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up) Devin review on the rebased PR flagged the asymmetry: magic-link verify got the atomic compare-and-swap pattern in the original M10 fix, but password reset confirm at /auth/password/reset/confirm was still using read-validate-clear. Two concurrent POSTs with the same valid reset token could both succeed in setting different new passwords (last-write- wins). Lower severity than the magic-link race because the attacker would need the reset token AND to race the legitimate user, but the asymmetry was a polish gap. Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then SELECT to verify our marker won, then proceed. Only the winner clears the marker and applies the password change. New regression test_concurrent_reset_only_one_wins in tests/test_password_flows.py::TestResetConfirm pins the contract: two ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same token; exactly one gets 302 (password applied), the other gets 200 with 'Invalid or expired'. Sanity-checked against the pre-CAS code — both POSTs got 302 (race confirmed). --------- Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
517 lines
19 KiB
Python
517 lines
19 KiB
Python
"""Tests for FastAPI endpoints."""
|
|
|
|
import os
|
|
import pytest
|
|
from fastapi.testclient import TestClient
|
|
|
|
|
|
def _auth(token):
|
|
return {"Authorization": f"Bearer {token}"}
|
|
|
|
|
|
@pytest.fixture
|
|
def app_client(tmp_path, monkeypatch):
|
|
monkeypatch.setenv("DATA_DIR", str(tmp_path))
|
|
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret")
|
|
from app.main import create_app
|
|
app = create_app()
|
|
return TestClient(app)
|
|
|
|
|
|
@pytest.fixture
|
|
def seeded_client(tmp_path, monkeypatch):
|
|
"""Client with a pre-created admin user and JWT token."""
|
|
monkeypatch.setenv("DATA_DIR", str(tmp_path))
|
|
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret")
|
|
from app.main import create_app
|
|
from src.db import get_system_db
|
|
from src.repositories.users import UserRepository
|
|
from app.auth.jwt import create_access_token
|
|
|
|
from argon2 import PasswordHasher
|
|
ph = PasswordHasher()
|
|
|
|
from tests.helpers.auth import grant_admin
|
|
|
|
conn = get_system_db()
|
|
repo = UserRepository(conn)
|
|
repo.create(id="admin1", email="admin@acme.com", name="Admin", role="admin",
|
|
password_hash=ph.hash("adminpass"))
|
|
repo.create(id="analyst1", email="analyst@acme.com", name="Analyst", role="analyst",
|
|
password_hash=ph.hash("analystpass"))
|
|
grant_admin(conn, "admin1")
|
|
conn.close()
|
|
|
|
app = create_app()
|
|
client = TestClient(app)
|
|
|
|
admin_token = create_access_token("admin1", "admin@acme.com", "admin")
|
|
analyst_token = create_access_token("analyst1", "analyst@acme.com", "analyst")
|
|
|
|
return client, admin_token, analyst_token
|
|
|
|
|
|
# ---- Health ----
|
|
|
|
class TestHealth:
|
|
def test_health_no_auth(self, app_client):
|
|
resp = app_client.get("/api/health")
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data["status"] == "ok"
|
|
|
|
def test_health_detailed_requires_auth(self, app_client):
|
|
resp = app_client.get("/api/health/detailed")
|
|
assert resp.status_code in (401, 403)
|
|
|
|
def test_health_detailed_has_duckdb_check(self, seeded_app):
|
|
c = seeded_app["client"]
|
|
resp = c.get("/api/health/detailed", headers=_auth(seeded_app["admin_token"]))
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data["status"] in ("healthy", "degraded", "unhealthy")
|
|
assert "services" in data
|
|
assert "duckdb_state" in data["services"]
|
|
assert data["services"]["duckdb_state"]["status"] == "ok"
|
|
|
|
|
|
# ---- Auth ----
|
|
|
|
class TestAuth:
|
|
def test_token_for_existing_user(self, seeded_client):
|
|
client, _, _ = seeded_client
|
|
resp = client.post("/auth/token", json={"email": "admin@acme.com", "password": "adminpass"})
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert "access_token" in data
|
|
assert data["role"] == "admin"
|
|
|
|
def test_token_for_unknown_user(self, seeded_client):
|
|
client, _, _ = seeded_client
|
|
resp = client.post("/auth/token", json={"email": "nobody@acme.com"})
|
|
assert resp.status_code == 401
|
|
|
|
def test_protected_endpoint_without_token(self, seeded_client):
|
|
client, _, _ = seeded_client
|
|
resp = client.get("/api/users")
|
|
assert resp.status_code == 401
|
|
|
|
def test_protected_endpoint_with_token(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.get("/api/users", headers={"Authorization": f"Bearer {admin_token}"})
|
|
assert resp.status_code == 200
|
|
|
|
|
|
# ---- RBAC ----
|
|
|
|
class TestRBAC:
|
|
def test_admin_can_list_users(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.get("/api/users", headers={"Authorization": f"Bearer {admin_token}"})
|
|
assert resp.status_code == 200
|
|
assert len(resp.json()) == 2
|
|
|
|
def test_analyst_cannot_list_users(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
resp = client.get("/api/users", headers={"Authorization": f"Bearer {analyst_token}"})
|
|
assert resp.status_code == 403
|
|
|
|
def test_analyst_cannot_trigger_sync(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
resp = client.post("/api/sync/trigger", headers={"Authorization": f"Bearer {analyst_token}"})
|
|
assert resp.status_code == 403
|
|
|
|
|
|
# ---- Sync Manifest ----
|
|
|
|
class TestSyncManifest:
|
|
def test_manifest_returns_tables(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
# Seed some sync state
|
|
from src.db import get_system_db
|
|
from src.repositories.sync_state import SyncStateRepository
|
|
conn = get_system_db()
|
|
repo = SyncStateRepository(conn)
|
|
repo.update_sync(table_id="orders", rows=1000, file_size_bytes=5000, hash="abc")
|
|
conn.close()
|
|
|
|
resp = client.get("/api/sync/manifest", headers={"Authorization": f"Bearer {admin_token}"})
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert "tables" in data
|
|
assert "orders" in data["tables"]
|
|
assert data["tables"]["orders"]["rows"] == 1000
|
|
|
|
|
|
# ---- Users CRUD ----
|
|
|
|
class TestUsersCRUD:
|
|
def test_create_user(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/users",
|
|
json={"email": "new@acme.com", "name": "New User"},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 201
|
|
assert resp.json()["email"] == "new@acme.com"
|
|
|
|
def test_create_duplicate_user(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/users",
|
|
json={"email": "admin@acme.com", "name": "Duplicate"},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 409
|
|
|
|
def test_delete_user(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.delete(
|
|
"/api/users/analyst1",
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 204
|
|
|
|
|
|
# ---- Knowledge / Memory ----
|
|
|
|
class TestMemory:
|
|
def test_create_and_list(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
headers = {"Authorization": f"Bearer {analyst_token}"}
|
|
|
|
# Create
|
|
resp = client.post("/api/memory", json={
|
|
"title": "MRR Definition",
|
|
"content": "Monthly recurring revenue",
|
|
"category": "metrics",
|
|
}, headers=headers)
|
|
assert resp.status_code == 201
|
|
item_id = resp.json()["id"]
|
|
|
|
# List
|
|
resp = client.get("/api/memory", headers=headers)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["count"] == 1
|
|
|
|
def test_vote(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
headers = {"Authorization": f"Bearer {analyst_token}"}
|
|
|
|
resp = client.post("/api/memory", json={
|
|
"title": "Test", "content": "test", "category": "test",
|
|
}, headers=headers)
|
|
item_id = resp.json()["id"]
|
|
|
|
resp = client.post(f"/api/memory/{item_id}/vote", json={"vote": 1}, headers=headers)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["upvotes"] == 1
|
|
|
|
def test_search(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
headers = {"Authorization": f"Bearer {analyst_token}"}
|
|
|
|
client.post("/api/memory", json={
|
|
"title": "Revenue report", "content": "MRR trends", "category": "finance",
|
|
}, headers=headers)
|
|
client.post("/api/memory", json={
|
|
"title": "Support SLA", "content": "Response times", "category": "support",
|
|
}, headers=headers)
|
|
|
|
resp = client.get("/api/memory?search=revenue", headers=headers)
|
|
assert resp.json()["count"] == 1
|
|
|
|
|
|
# ---- Upload ----
|
|
|
|
class TestUpload:
|
|
def test_upload_session(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
headers = {"Authorization": f"Bearer {analyst_token}"}
|
|
resp = client.post(
|
|
"/api/upload/sessions",
|
|
files={"file": ("session.jsonl", b'{"role":"user","content":"hello"}', "application/jsonl")},
|
|
headers=headers,
|
|
)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["size"] > 0
|
|
|
|
def test_upload_local_md(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
headers = {"Authorization": f"Bearer {analyst_token}"}
|
|
resp = client.post(
|
|
"/api/upload/local-md",
|
|
json={"content": "# My knowledge\n\nMRR = Monthly Recurring Revenue"},
|
|
headers=headers,
|
|
)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["user"] == "analyst@acme.com"
|
|
|
|
|
|
# ---- Metrics API ----
|
|
|
|
SAMPLE_METRIC = {
|
|
"id": "finance/mrr",
|
|
"name": "mrr",
|
|
"display_name": "Monthly Recurring Revenue",
|
|
"category": "finance",
|
|
"sql": "SELECT SUM(amount) FROM subscriptions WHERE active = true",
|
|
}
|
|
|
|
|
|
class TestMetricsAPI:
|
|
def test_list_metrics_empty(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
headers = {"Authorization": f"Bearer {admin_token}"}
|
|
resp = client.get("/api/metrics", headers=headers)
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data["count"] == 0
|
|
assert data["metrics"] == []
|
|
|
|
def test_create_and_list_metric(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
headers = {"Authorization": f"Bearer {admin_token}"}
|
|
|
|
resp = client.post("/api/admin/metrics", json=SAMPLE_METRIC, headers=headers)
|
|
assert resp.status_code == 201
|
|
|
|
resp = client.get("/api/metrics", headers=headers)
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data["count"] == 1
|
|
assert data["metrics"][0]["id"] == "finance/mrr"
|
|
|
|
def test_get_metric_detail(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
headers = {"Authorization": f"Bearer {admin_token}"}
|
|
|
|
client.post("/api/admin/metrics", json=SAMPLE_METRIC, headers=headers)
|
|
|
|
resp = client.get("/api/metrics/finance/mrr", headers=headers)
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data["id"] == "finance/mrr"
|
|
assert data["display_name"] == "Monthly Recurring Revenue"
|
|
assert data["category"] == "finance"
|
|
|
|
def test_get_metric_not_found(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
headers = {"Authorization": f"Bearer {admin_token}"}
|
|
|
|
resp = client.get("/api/metrics/nonexistent/metric", headers=headers)
|
|
assert resp.status_code == 404
|
|
|
|
def test_delete_metric(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
headers = {"Authorization": f"Bearer {admin_token}"}
|
|
|
|
client.post("/api/admin/metrics", json=SAMPLE_METRIC, headers=headers)
|
|
|
|
resp = client.delete("/api/admin/metrics/finance/mrr", headers=headers)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["status"] == "deleted"
|
|
|
|
resp = client.get("/api/metrics/finance/mrr", headers=headers)
|
|
assert resp.status_code == 404
|
|
|
|
def test_analyst_cannot_create_metric(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
headers = {"Authorization": f"Bearer {analyst_token}"}
|
|
|
|
resp = client.post("/api/admin/metrics", json=SAMPLE_METRIC, headers=headers)
|
|
assert resp.status_code == 403
|
|
|
|
def test_list_metrics_filter_by_category(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
headers = {"Authorization": f"Bearer {admin_token}"}
|
|
|
|
finance_metric = {**SAMPLE_METRIC}
|
|
support_metric = {
|
|
"id": "support/tickets",
|
|
"name": "tickets",
|
|
"display_name": "Open Tickets",
|
|
"category": "support",
|
|
"sql": "SELECT COUNT(*) FROM tickets WHERE status = 'open'",
|
|
}
|
|
client.post("/api/admin/metrics", json=finance_metric, headers=headers)
|
|
client.post("/api/admin/metrics", json=support_metric, headers=headers)
|
|
|
|
resp = client.get("/api/metrics?category=finance", headers=headers)
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data["count"] == 1
|
|
assert data["metrics"][0]["category"] == "finance"
|
|
|
|
def test_import_metrics_yaml(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
yaml_content = b"- name: test_metric\n display_name: Test\n category: test\n sql: SELECT 1\n"
|
|
resp = client.post(
|
|
"/api/admin/metrics/import",
|
|
files={"file": ("test.yml", yaml_content, "application/x-yaml")},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["count"] == 1
|
|
|
|
|
|
class TestMetadataAPI:
|
|
def test_get_metadata_empty(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.get("/api/admin/metadata/orders", headers={"Authorization": f"Bearer {admin_token}"})
|
|
assert resp.status_code == 200
|
|
assert resp.json()["columns"] == []
|
|
|
|
def test_save_and_get_metadata(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/admin/metadata/orders",
|
|
json={"columns": [
|
|
{"column_name": "id", "basetype": "STRING", "description": "Order ID"},
|
|
{"column_name": "total", "basetype": "NUMERIC", "description": "Total"},
|
|
]},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 200
|
|
assert resp.json()["status"] == "ok"
|
|
assert resp.json()["count"] == 2
|
|
|
|
resp = client.get("/api/admin/metadata/orders", headers={"Authorization": f"Bearer {admin_token}"})
|
|
assert resp.status_code == 200
|
|
assert len(resp.json()["columns"]) == 2
|
|
|
|
def test_analyst_cannot_save_metadata(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
resp = client.post(
|
|
"/api/admin/metadata/orders",
|
|
json={"columns": [{"column_name": "id", "basetype": "STRING"}]},
|
|
headers={"Authorization": f"Bearer {analyst_token}"},
|
|
)
|
|
assert resp.status_code == 403
|
|
|
|
def test_push_non_keboola_table_fails(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/admin/metadata/orders/push",
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
# 'orders' is not in table_registry — expect 404 or 400
|
|
assert resp.status_code in (400, 404)
|
|
|
|
def test_push_keboola_table(self, seeded_client, monkeypatch): # noqa: F811
|
|
client, admin_token, _ = seeded_client
|
|
|
|
# 1. Register a keboola table
|
|
from src.db import get_system_db
|
|
from src.repositories.table_registry import TableRegistryRepository
|
|
|
|
conn = get_system_db()
|
|
TableRegistryRepository(conn).register(
|
|
id="kbc_orders",
|
|
name="Orders",
|
|
source_type="keboola",
|
|
source_table="in.c-main.orders",
|
|
)
|
|
conn.close()
|
|
|
|
# 2. Save column metadata
|
|
client.post(
|
|
"/api/admin/metadata/kbc_orders",
|
|
json={"columns": [
|
|
{"column_name": "id", "basetype": "STRING", "description": "Order ID"},
|
|
]},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
|
|
# 3. Set required env vars
|
|
monkeypatch.setenv("KBC_STACK_URL", "https://connection.keboola.com")
|
|
monkeypatch.setenv("KBC_STORAGE_TOKEN", "test-token")
|
|
|
|
# 4. Mock httpx.AsyncClient so no real HTTP call is made
|
|
import httpx as _httpx
|
|
from unittest.mock import AsyncMock, MagicMock, patch
|
|
|
|
mock_response = _httpx.Response(200, json={"ok": True})
|
|
mock_post = AsyncMock(return_value=mock_response)
|
|
|
|
# AsyncClient is used as "async with httpx.AsyncClient() as client: await client.post(...)"
|
|
# We patch the class so the context-manager instance has our mock_post.
|
|
mock_async_client_instance = MagicMock()
|
|
mock_async_client_instance.post = mock_post
|
|
mock_async_client_instance.__aenter__ = AsyncMock(return_value=mock_async_client_instance)
|
|
mock_async_client_instance.__aexit__ = AsyncMock(return_value=False)
|
|
|
|
with patch("httpx.AsyncClient", return_value=mock_async_client_instance):
|
|
resp = client.post(
|
|
"/api/admin/metadata/kbc_orders/push",
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
|
|
# 5. Assertions
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert data.get("pushed") == 1
|
|
|
|
mock_post.assert_called_once()
|
|
call_args = mock_post.call_args
|
|
# Verify URL contains the source table name
|
|
called_url = call_args[0][0] if call_args[0] else call_args.kwargs.get("url", "")
|
|
assert "in.c-main.orders" in called_url
|
|
# Verify auth header
|
|
called_headers = call_args.kwargs.get("headers", {})
|
|
assert called_headers.get("X-StorageApi-Token") == "test-token"
|
|
# Verify payload structure
|
|
called_json = call_args.kwargs.get("json", {})
|
|
assert called_json.get("provider") == "ai-metadata-enrichment"
|
|
assert isinstance(called_json.get("metadata"), list)
|
|
|
|
|
|
# ---- Hybrid Query ----
|
|
|
|
class TestHybridQueryAPI:
|
|
def test_hybrid_query_requires_admin(self, seeded_client):
|
|
client, _, analyst_token = seeded_client
|
|
resp = client.post(
|
|
"/api/query/hybrid",
|
|
json={"sql": "SELECT 1 AS val", "register_bq": {}},
|
|
headers={"Authorization": f"Bearer {analyst_token}"},
|
|
)
|
|
assert resp.status_code == 403
|
|
|
|
def test_hybrid_query_local_only(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/query/hybrid",
|
|
json={"sql": "SELECT 1 AS val", "register_bq": {}},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 200
|
|
data = resp.json()
|
|
assert "columns" in data
|
|
assert "rows" in data
|
|
assert data["columns"] == ["val"]
|
|
assert data["rows"] == [[1]]
|
|
|
|
def test_hybrid_query_blocked_sql(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/query/hybrid",
|
|
json={"sql": "DROP TABLE users", "register_bq": {}},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 400
|
|
assert "query_error" in resp.json()["detail"]
|
|
|
|
def test_hybrid_query_blocked_bq_sql(self, seeded_client):
|
|
client, admin_token, _ = seeded_client
|
|
resp = client.post(
|
|
"/api/query/hybrid",
|
|
json={
|
|
"sql": "SELECT 1",
|
|
"register_bq": {"bad_alias": "DROP TABLE sensitive"},
|
|
},
|
|
headers={"Authorization": f"Bearer {admin_token}"},
|
|
)
|
|
assert resp.status_code == 400
|
|
assert "query_error" in resp.json()["detail"]
|