agnes-the-ai-analyst/tests/test_api_complete.py
minasarustamyan d4ac84dd46
feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150)
* feat(rbac): drop dataset_permissions + access_requests + users.role + is_public; v19 migration

BREAKING. Unifies the data RBAC layer into the per-group resource_grants model.
Before this PR, the legacy data RBAC layer (dataset_permissions + the is_public bypass)
was de facto inactive: is_public had no API/UI/CLI surface, and its default of true meant
that can_access_table always took the bypass. Now every non-admin access requires
an explicit resource_grants(group, "table", id) row.

Schema v18 → v19 (src/db.py:_v18_to_v19_finalize):
- DROP TABLE dataset_permissions, access_requests
- DROP COLUMN users.role (NULL artifact since v13)
- DROP COLUMN table_registry.is_public
- Drops go through the table-rebuild idiom (rename → create new → INSERT … SELECT
  → drop old) because of DuckDB ALTER DROP COLUMN limitations on tables
  with historic FK constraints. The INSERT picks the intersection of columns, so
  test fixtures with a minimal pre-v19 schema migrate cleanly.
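
A minimal sketch of the table-rebuild idiom described above. It uses the standard-library sqlite3 module as a stand-in for DuckDB, and the helper name rebuild_without_columns and the users__old suffix are hypothetical, not taken from src/db.py.

```python
import sqlite3


def rebuild_without_columns(conn, table, new_ddl):
    """Rebuild `table` from `new_ddl`, copying only columns both versions share."""
    old = f"{table}__old"
    conn.execute(f"ALTER TABLE {table} RENAME TO {old}")
    conn.execute(new_ddl)  # creates the new table under the original name
    old_cols = [row[1] for row in conn.execute(f"PRAGMA table_info({old})")]
    new_cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    # Column intersection: dropped columns are skipped, new columns keep defaults.
    shared = [col for col in new_cols if col in old_cols]
    col_list = ", ".join(shared)
    conn.execute(f"INSERT INTO {table} ({col_list}) SELECT {col_list} FROM {old}")
    conn.execute(f"DROP TABLE {old}")
    return shared


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, email TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('u1', 'a@test.com', NULL)")
copied = rebuild_without_columns(
    conn, "users", "CREATE TABLE users (id TEXT, email TEXT, name TEXT)"
)
rows = conn.execute("SELECT id, email, name FROM users").fetchall()
```

Because only shared columns are copied, a fixture that never had the dropped column (or lacks a newer one) still migrates without a column-mismatch error.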

Runtime:
- src/rbac.py:can_access_table → delegates to app.auth.access.can_access
- DatasetPermissionRepository, AccessRequestRepository deleted
- AGNES_ENABLE_TABLE_GRANTS env-gate in app/resource_types.py removed
  (TABLE is unconditionally enabled)
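
The delegation can be pictured roughly as follows. This is a sketch only: AccessStore is a hypothetical in-memory stand-in for the membership and resource_grants tables, the signatures are simplified (the real can_access takes a connection), and placing the admin shortcut inside can_access is an assumption.

```python
from dataclasses import dataclass, field

ADMIN_GROUP = "Admin"  # assumed name of the admin group


@dataclass
class AccessStore:
    """Hypothetical stand-in for group memberships and resource_grants rows."""
    memberships: dict = field(default_factory=dict)  # user_id -> set of group names
    grants: set = field(default_factory=set)  # (group, resource_type, resource_id)


def can_access(user_id, resource_type, resource_id, store):
    groups = store.memberships.get(user_id, set())
    if ADMIN_GROUP in groups:  # admin shortcut (placement is an assumption)
        return True
    # No implicit public access: some group of the user needs an explicit grant.
    return any((g, resource_type, resource_id) in store.grants for g in groups)


def can_access_table(user_id, table_id, store):
    # Thin wrapper: a table check is a generic check with resource_type "table".
    return can_access(user_id, "table", table_id, store)


store = AccessStore(
    memberships={"admin1": {"Admin"}, "analyst1": {"sales"}, "guest1": set()},
    grants={("sales", "table", "orders")},
)
```

With this shape, callers no longer need their own role checks; an admin passes through the same function as everyone else.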

API drop:
- app/api/permissions.py, app/api/access_requests.py: entire files
- /admin/permissions web route + admin_permissions.html
- "Request Access" modal in catalog.html + locked-row UI
- ~10 if user.get("role") != "admin" checks replaced (the admin shortcut
  lives inside can_access_table)
- /api/settings: drop permissions field from GET; PUT /api/settings/dataset
  gate switched to can_access(user_id, "table", dataset, conn)

Auth:
- app/auth/jwt.py:create_access_token: drop role parameter (the claim disappears
  from newly issued JWTs; old tokens remain valid, claim ignored)
- app/api/users.py: drop role from CreateUserRequest / UpdateUserRequest
  (admin promotion = explicit add to Admin group via memberships API)
- src/repositories/users.py: drop role from create() / update()
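
To illustrate the token change, here is a sketch of the claim handling only (no signing). The function names and the claim names sub, email, iat and exp are assumptions for illustration, not the actual contents of app/auth/jwt.py.

```python
import time


def build_claims(user_id, email, ttl_seconds=3600, now=None):
    """Claims for a newly issued token: there is no role claim any more."""
    issued = int(time.time() if now is None else now)
    return {"sub": user_id, "email": email, "iat": issued, "exp": issued + ttl_seconds}


def identity_from_claims(claims):
    """Extract the identity; a legacy "role" claim is tolerated but never read."""
    return {"user_id": claims["sub"], "email": claims["email"]}


new_claims = build_claims("admin1", "admin@test.com", now=1_000)
legacy_claims = {**new_claims, "role": "admin"}  # shape of a pre-v19 token
```

Both token shapes resolve to the same identity, so authorization has to come from group membership rather than from anything carried in the token.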

CLI:
- da admin set-role removed → hard-fails with the replacement command
- da admin add-user --role flag removed
- da auth import-token --role flag removed
- da auth whoami: drop "Role:" output
- cli/config.py:save_token: role parameter now optional, no longer written
  (back-compat with old token.json files preserved; the field is ignored)
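
The save_token back-compat behaviour might look roughly like this. The JSON keys access_token and email, the load_token helper, and the signatures are assumptions for illustration; only the "role is accepted but not written, and ignored on read" behaviour comes from the notes above.

```python
import json
import tempfile
from pathlib import Path


def save_token(path, token, email, role=None):
    # `role` is still accepted so old call sites keep working, but never written.
    Path(path).write_text(json.dumps({"access_token": token, "email": email}))


def load_token(path):
    data = json.loads(Path(path).read_text())
    data.pop("role", None)  # legacy field from older token.json files is ignored
    return data


with tempfile.TemporaryDirectory() as tmp:
    token_file = Path(tmp) / "token.json"
    # Simulate a token.json written by an older CLI version.
    token_file.write_text(
        json.dumps({"access_token": "abc", "email": "a@test.com", "role": "admin"})
    )
    loaded_legacy = load_token(token_file)
    save_token(token_file, "xyz", "a@test.com", role="admin")
    written = json.loads(token_file.read_text())
```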

Tests:
- DELETE: test_permissions.py, test_permissions_api.py, test_access_requests_api.py
- REWRITE: test_access_control.py (resource_grants flow), test_rbac.py
  (can_access_table over resource_grants), test_journey_rbac.py
  (drop access-request flow), test_resource_types.py (drop env-gate
  tests, drop is_public from helpers), test_v2_*.py (drop role-based
  user dicts in favor of id-based + Admin group membership),
  test_settings_api.py (no permissions field, can_access gate)
- TRIVIAL: ~30 files: drop role="admin" arg from UserRepository.create
  and the 3rd positional role from create_access_token
- NEW: test_v18_to_v19 migration test (test_db.py),
  test_can_access_table_no_implicit_public (test_rbac.py),
  test_admin_set_role_returns_hardfail (test_cli_admin.py)
- OpenAPI snapshot regenerated

Docs:
- CHANGELOG: BREAKING entry under [Unreleased]
- CLAUDE.md: schema v18 → v19
- docs/architecture.md: schema table + RBAC section rewritten
- docs/auth-google-oauth.md: admin promotion via da admin break-glass
- cli/skills/security.md: completely rewritten for the group-based model
- docs/TODO-rbac-data-enforcement.md: deleted (TODO done)

Test results: 2363 passed, 19 failed. The remaining failures are pre-existing
Windows-specific issues (fcntl, charset) unrelated to this PR;
verified with git stash pop.

Plan: ~/.claude/plans/floofy-coalescing-parnas.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): cut 0.27.0

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-04-30 22:02:16 +02:00

276 lines
11 KiB
Python

"""Tests for all new API endpoints — catalog, telegram, admin, governance, web UI."""
import os

import pytest
from fastapi.testclient import TestClient


@pytest.fixture
def client(tmp_path, monkeypatch):
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")

    from app.main import create_app
    from src.db import get_system_db
    from src.repositories.users import UserRepository
    from src.repositories.knowledge import KnowledgeRepository
    from app.auth.jwt import create_access_token
    from tests.helpers.auth import grant_admin

    conn = get_system_db()
    ur = UserRepository(conn)
    ur.create(id="admin1", email="admin@test.com", name="Admin")
    ur.create(id="analyst1", email="analyst@test.com", name="Analyst")
    ur.create(id="km1", email="km@test.com", name="KM Admin")
    # Memory governance endpoints (/api/memory/admin/...) are gated by
    # require_admin. Putting km1 in the Admin group keeps the existing
    # TestGovernance fixture pattern working; the tests only exercise
    # the admin path of the governance flow.
    grant_admin(conn, "admin1")
    grant_admin(conn, "km1")

    # Seed knowledge for governance tests
    kr = KnowledgeRepository(conn)
    kr.create(id="k1", title="MRR", content="Monthly revenue", category="metrics", status="pending")
    kr.create(id="k2", title="Churn", content="Customer churn", category="metrics", status="approved")
    conn.close()

    app = create_app()
    c = TestClient(app)
    return {
        "client": c,
        "admin": create_access_token("admin1", "admin@test.com"),
        "analyst": create_access_token("analyst1", "analyst@test.com"),
        "km": create_access_token("km1", "km@test.com"),
    }


def _h(token):
    return {"Authorization": f"Bearer {token}"}


# ---- Catalog ----
class TestCatalog:
    def test_catalog_tables(self, client):
        resp = client["client"].get("/api/catalog/tables", headers=_h(client["analyst"]))
        assert resp.status_code == 200

    def test_catalog_profile_not_found(self, client):
        # Admin can see 404 for truly missing tables (bypasses access control)
        resp = client["client"].get("/api/catalog/profile/nonexistent", headers=_h(client["admin"]))
        assert resp.status_code == 404

    def test_catalog_profile_access_denied_for_analyst(self, client):
        # A table with no resource grant for the analyst returns 403
        resp = client["client"].get("/api/catalog/profile/private_table", headers=_h(client["analyst"]))
        assert resp.status_code == 403

    def test_catalog_profile_refresh_access_denied_for_analyst(self, client):
        # Refresh endpoint also enforces access control
        resp = client["client"].post("/api/catalog/profile/private_table/refresh", headers=_h(client["analyst"]))
        assert resp.status_code == 403

    def test_catalog_profile_granted_table_accessible_to_analyst(self, client):
        """v19+: no implicit `is_public`. Analyst gets access via an explicit
        resource_grants(group, "table", id) row, then sees 404 (no profile yet)."""
        client["client"].post(
            "/api/admin/register-table",
            json={"name": "granted_table", "source_type": "keboola"},
            headers=_h(client["admin"]),
        )

        from src.db import get_system_db
        from src.repositories.user_groups import UserGroupsRepository
        from src.repositories.user_group_members import UserGroupMembersRepository
        from src.repositories.resource_grants import ResourceGrantsRepository

        conn = get_system_db()
        try:
            grp = UserGroupsRepository(conn).create(
                name="api-complete-grant", description="t", created_by="t",
            )
            UserGroupMembersRepository(conn).add_member(
                "analyst1", grp["id"], source="admin", added_by="t",
            )
            ResourceGrantsRepository(conn).create(
                group_id=grp["id"], resource_type="table", resource_id="granted_table",
                assigned_by="t",
            )
        finally:
            conn.close()

        resp = client["client"].get("/api/catalog/profile/granted_table", headers=_h(client["analyst"]))
        assert resp.status_code == 404  # access granted, but no profile data yet


# ---- Telegram ----
class TestTelegram:
    def test_telegram_status_not_linked(self, client):
        resp = client["client"].get("/api/telegram/status", headers=_h(client["analyst"]))
        assert resp.status_code == 200
        assert resp.json()["linked"] is False

    def test_telegram_verify_invalid_code(self, client):
        resp = client["client"].post(
            "/api/telegram/verify",
            json={"code": "INVALID"},
            headers=_h(client["analyst"]),
        )
        assert resp.status_code == 400

    def test_telegram_unlink(self, client):
        resp = client["client"].post("/api/telegram/unlink", headers=_h(client["analyst"]))
        assert resp.status_code == 200


# ---- Admin Tables ----
class TestAdminTables:
    def test_list_registry_empty(self, client):
        resp = client["client"].get("/api/admin/registry", headers=_h(client["admin"]))
        assert resp.status_code == 200
        assert resp.json()["count"] == 0

    def test_register_and_list(self, client):
        resp = client["client"].post(
            "/api/admin/register-table",
            json={"name": "Orders", "folder": "sales", "sync_strategy": "incremental"},
            headers=_h(client["admin"]),
        )
        assert resp.status_code == 201

        resp = client["client"].get("/api/admin/registry", headers=_h(client["admin"]))
        assert resp.json()["count"] == 1

    def test_register_duplicate(self, client):
        client["client"].post(
            "/api/admin/register-table",
            json={"name": "Test", "folder": "f"},
            headers=_h(client["admin"]),
        )
        resp = client["client"].post(
            "/api/admin/register-table",
            json={"name": "Test", "folder": "f"},
            headers=_h(client["admin"]),
        )
        assert resp.status_code == 409

    def test_unregister(self, client):
        client["client"].post(
            "/api/admin/register-table",
            json={"name": "Temp"},
            headers=_h(client["admin"]),
        )
        resp = client["client"].delete("/api/admin/registry/temp", headers=_h(client["admin"]))
        assert resp.status_code == 204

    def test_analyst_blocked(self, client):
        resp = client["client"].get("/api/admin/registry", headers=_h(client["analyst"]))
        assert resp.status_code == 403


# ---- Corporate Memory Governance ----
class TestGovernance:
    def test_approve(self, client):
        resp = client["client"].post(
            "/api/memory/admin/approve?item_id=k1",
            headers=_h(client["km"]),
        )
        assert resp.status_code == 200
        assert resp.json()["status"] == "approved"

    def test_reject(self, client):
        resp = client["client"].post(
            "/api/memory/admin/reject?item_id=k1",
            json={"reason": "not relevant"},
            headers=_h(client["km"]),
        )
        assert resp.status_code == 200
        assert resp.json()["status"] == "rejected"

    def test_mandate(self, client):
        resp = client["client"].post(
            "/api/memory/admin/mandate?item_id=k1",
            json={"reason": "critical", "audience": "all"},
            headers=_h(client["km"]),
        )
        assert resp.status_code == 200
        assert resp.json()["status"] == "mandatory"

    def test_batch_action(self, client):
        resp = client["client"].post(
            "/api/memory/admin/batch",
            json={"item_ids": ["k1", "k2"], "action": "approve"},
            headers=_h(client["km"]),
        )
        assert resp.status_code == 200
        assert len(resp.json()["success"]) == 2

    def test_pending_queue(self, client):
        resp = client["client"].get("/api/memory/admin/pending", headers=_h(client["km"]))
        assert resp.status_code == 200

    def test_audit_log(self, client):
        # Do an action first
        client["client"].post("/api/memory/admin/approve?item_id=k1", headers=_h(client["km"]))
        resp = client["client"].get("/api/memory/admin/audit", headers=_h(client["km"]))
        assert resp.status_code == 200

    def test_analyst_blocked_from_governance(self, client):
        resp = client["client"].post(
            "/api/memory/admin/approve?item_id=k1",
            headers=_h(client["analyst"]),
        )
        assert resp.status_code == 403

    def test_stats(self, client):
        resp = client["client"].get("/api/memory/stats", headers=_h(client["analyst"]))
        assert resp.status_code == 200
        assert resp.json()["total"] == 2

    def test_my_votes(self, client):
        # Vote first
        client["client"].post("/api/memory/k2/vote", json={"vote": 1}, headers=_h(client["analyst"]))
        resp = client["client"].get("/api/memory/my-votes", headers=_h(client["analyst"]))
        assert resp.status_code == 200


# ---- Sync Settings (new naming) ----
class TestSyncSettings:
    def test_get_sync_settings(self, client):
        resp = client["client"].get("/api/sync/settings", headers=_h(client["analyst"]))
        assert resp.status_code == 200

    def test_update_sync_settings(self, client):
        resp = client["client"].post(
            "/api/sync/settings",
            json={"datasets": {"sales": True}},
            headers=_h(client["analyst"]),
        )
        assert resp.status_code == 200
        assert "sales" in resp.json()["updated"]

    def test_table_subscriptions(self, client):
        resp = client["client"].get("/api/sync/table-subscriptions", headers=_h(client["analyst"]))
        assert resp.status_code == 200


# ---- Web UI ----
class TestWebUI:
    def test_login_page(self, client):
        resp = client["client"].get("/login")
        assert resp.status_code == 200

    def test_root_redirects(self, client):
        resp = client["client"].get("/", follow_redirects=False)
        assert resp.status_code == 302

    def test_health_no_auth(self, client):
        resp = client["client"].get("/api/health")
        assert resp.status_code == 200


# ---- Upload ----
class TestUpload:
    def test_upload_rejects_oversized_file(self, client):
        import io

        large_data = b"x" * (50 * 1024 * 1024 + 1)
        resp = client["client"].post(
            "/api/upload/artifacts",
            files={"file": ("big.csv", io.BytesIO(large_data), "text/csv")},
            headers=_h(client["admin"]),
        )
        assert resp.status_code == 413

    def test_upload_does_not_leak_absolute_path(self, client):
        """Upload response should not contain absolute filesystem paths."""
        import io

        resp = client["client"].post(
            "/api/upload/artifacts",
            files={"file": ("test.txt", io.BytesIO(b"hello"), "text/plain")},
            headers=_h(client["admin"]),
        )
        assert resp.status_code == 200
        data = resp.json()
        assert not data.get("path", "").startswith("/"), "Response should not leak absolute path"
        assert "filename" in data, "Response should contain filename"