* fix(security+ops): #82 #85 #87 - auth hardening, API validation, deploy posture

Security and operational hardening across three issue groups:

- M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun)
- C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile)
- M21: Docker resource limits (mem_limit, cpus) on app + scheduler
- M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server)
- M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING)
- M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs
- C2: table_id traversal validation on /api/data/{table_id}/download
- M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename
- C5: reset_token removed from POST /api/users/{id}/reset-password response
- C8: Startup WARNING when no user has password_hash (bootstrap window visible)
- M9: Audit log on failed web form login (mirrors /auth/token endpoint)
- M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch)

Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90)

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety

Review fixes:

- Add 169.254.0.0/16 (link-local, cloud metadata) to the SSRF regex. It was missing, which allowed requests to AWS/GCP/Azure metadata endpoints.
- Add ff[0-9a-f]{2}: (IPv6 multicast) to the SSRF regex.
- M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with a warning log. This prevents an unhandled exception if the DB write fails after successful token consumption.
- Add test for 169.254.169.254 SSRF rejection.

* fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak

Address Devin Review findings on PR #104:

1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution + ipaddress module checks. The old regex patterns like `fe80:` only matched up to the first colon, missing real IPv6 addresses like `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the hostname via getaddrinfo and checks each resulting IP against ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast.
2. CLI commands broken: `da setup test-connection`, `da setup verify`, `da diagnose`, `da status` all called /api/health expecting the old format (status=="healthy", services dict). Now they call /api/health/detailed for service-level checks (with graceful fallback to the minimal endpoint when auth is not configured).
3. Temp file handle leak: _stream_to_temp returns an open NamedTemporaryFile; callers now close it before shutil.move() to prevent FD leaks until GC.

Also adds IPv6 SSRF test cases (loopback, link-local, unique-local, multicast) with mocked DNS resolution for test environment independence.

* fix(review): download regex blocks hyphenated IDs; document health split

Address Devin Review round-3 findings on PR #104:

1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download endpoint used the strict SQL-identifier regex which does not allow dots or hyphens, but Keboola table IDs like in.c-crm.orders contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots and hyphens while still blocking path-traversal chars (/, .., \) and quote/control characters. Added test for hyphenated/dotted IDs.
2. Documented health endpoint split in DEPLOYMENT.md: Added Health checks & external monitoring section explaining both endpoints (minimal unauth /api/health vs authenticated /api/health/detailed) and how to wire external monitoring tools to the detailed endpoint with a PAT.

* release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening

* fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up)

Devin review on the rebased PR flagged the asymmetry: magic-link verify got the atomic compare-and-swap pattern in the original M10 fix, but password reset confirm at /auth/password/reset/confirm was still using read-validate-clear. Two concurrent POSTs with the same valid reset token could both succeed in setting different new passwords (last-write-wins). Lower severity than the magic-link race because the attacker would need the reset token AND to race the legitimate user, but the asymmetry was a polish gap.

Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then SELECT to verify our marker won, then proceed. Only the winner clears the marker and applies the password change.

New regression test_concurrent_reset_only_one_wins in tests/test_password_flows.py::TestResetConfirm pins the contract: two ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same token; exactly one gets 302 (password applied), the other gets 200 with 'Invalid or expired'. Sanity-checked against the pre-CAS code: both POSTs got 302 (race confirmed).

---------

Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
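
The DNS-resolution approach described in the SSRF IPv6 bypass fix can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's actual code: the names `is_blocked_ip`, `host_is_safe`, and the injectable `resolve` parameter are hypothetical, and unwrapping IPv4-mapped addresses is an extra precaution the commit message does not mention.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List


def is_blocked_ip(ip_text: str) -> bool:
    """Return True if the address falls in a range an SSRF guard should refuse.

    Hypothetical helper; mirrors the ipaddress checks listed in the commit.
    """
    try:
        # Drop an IPv6 zone suffix such as "%eth0" before parsing.
        ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    except ValueError:
        return True  # unparseable input is treated as unsafe

    # Assumption: treat ::ffff:a.b.c.d like the embedded IPv4 address.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped

    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
    )


def _dns_resolve(host: str) -> List[str]:
    """Resolve a hostname to address strings via getaddrinfo."""
    infos = socket.getaddrinfo(host, None)
    # sockaddr[0] holds the address for both AF_INET and AF_INET6 results.
    return [info[4][0] for info in infos]


def host_is_safe(
    host: str,
    resolve: Callable[[str], Iterable[str]] = _dns_resolve,
) -> bool:
    """Accept a host only if it resolves and every resolved address is allowed."""
    try:
        addresses = list(resolve(host))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return bool(addresses) and not any(is_blocked_ip(a) for a in addresses)
```

Passing the resolver in as a parameter is one way to get the "mocked DNS resolution" the commit mentions: tests can supply a fake such as `lambda host: ["fe80::1"]` and never touch the network.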
179 lines
6.6 KiB
Python
"""Tests for bootstrap endpoint — first admin user creation."""

import os

import pytest
from fastapi.testclient import TestClient


@pytest.fixture
def fresh_client(tmp_path, monkeypatch):
    """Client with an EMPTY database (no users)."""
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
    from app.main import create_app
    app = create_app()
    return TestClient(app)


@pytest.fixture
def seeded_client(tmp_path, monkeypatch):
    """Client with one existing seed user (no password_hash, like SEED_ADMIN_EMAIL seeding)."""
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
    from app.main import create_app
    from src.db import get_system_db
    from src.repositories.users import UserRepository
    conn = get_system_db()
    UserRepository(conn).create(id="existing", email="existing@test.com", name="E", role="admin")
    conn.close()
    return TestClient(create_app())


@pytest.fixture
def password_user_client(tmp_path, monkeypatch):
    """Client with a user who already has a password set, so bootstrap must be disabled."""
    from argon2 import PasswordHasher
    monkeypatch.setenv("DATA_DIR", str(tmp_path))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-32chars-minimum!!!!!")
    from app.main import create_app
    from src.db import get_system_db
    from src.repositories.users import UserRepository
    conn = get_system_db()
    UserRepository(conn).create(
        id="existing",
        email="existing@test.com",
        name="E",
        role="admin",
        password_hash=PasswordHasher().hash("pre-existing-pass"),
    )
    conn.close()
    return TestClient(create_app())


class TestBootstrap:
    def test_bootstrap_on_empty_db(self, fresh_client):
        """First call creates admin and returns token."""
        resp = fresh_client.post("/auth/bootstrap", json={
            "email": "admin@test.com",
            "name": "Admin",
        })
        assert resp.status_code == 200
        data = resp.json()
        assert data["email"] == "admin@test.com"
        assert data["role"] == "admin"
        assert "access_token" in data

    def test_bootstrap_with_password(self, fresh_client):
        """Bootstrap with password sets password hash."""
        resp = fresh_client.post("/auth/bootstrap", json={
            "email": "admin@test.com",
            "password": "securepass123",
        })
        assert resp.status_code == 200

        # Token works: the minimal /api/health is unauthenticated, so send the
        # token to the authenticated detailed endpoint to actually exercise it.
        token = resp.json()["access_token"]
        resp2 = fresh_client.get(
            "/api/health/detailed",
            headers={"Authorization": f"Bearer {token}"},
        )
        assert resp2.status_code == 200

    def test_bootstrap_activates_seed_user(self, seeded_client):
        """Bootstrap activates a password-less seed user (SEED_ADMIN_EMAIL scenario)."""
        resp = seeded_client.post("/auth/bootstrap", json={
            "email": "existing@test.com",
            "password": "newpass123",
        })
        assert resp.status_code == 200
        assert resp.json()["role"] == "admin"

        # Login now works
        login = seeded_client.post("/auth/password/login", json={
            "email": "existing@test.com",
            "password": "newpass123",
        })
        assert login.status_code == 200

    def test_bootstrap_disabled_when_password_user_exists(self, password_user_client):
        """Bootstrap fails with 403 when any user already has a password set."""
        resp = password_user_client.post("/auth/bootstrap", json={
            "email": "hacker@evil.com",
            "password": "should-not-work",
        })
        assert resp.status_code == 403
        assert "password already exists" in resp.json()["detail"]

    def test_bootstrap_then_login(self, fresh_client):
        """After bootstrap with a password, /auth/token login works."""
        # Bootstrap with a password
        fresh_client.post("/auth/bootstrap", json={
            "email": "admin@test.com",
            "password": "adminpass123",
        })

        # Normal password login succeeds
        resp = fresh_client.post("/auth/token", json={
            "email": "admin@test.com",
            "password": "adminpass123",
        })
        assert resp.status_code == 200
        assert resp.json()["role"] == "admin"

    def test_bootstrap_no_password_token_rejected(self, fresh_client):
        """After passwordless bootstrap, /auth/token must reject the user (OAuth-only flow)."""
        fresh_client.post("/auth/bootstrap", json={
            "email": "admin@test.com",
        })

        resp = fresh_client.post("/auth/token", json={
            "email": "admin@test.com",
        })
        assert resp.status_code == 401

    def test_bootstrap_second_call_fails_once_password_set(self, fresh_client):
        """Endpoint self-deactivates once any user has a password."""
        # First call WITH password locks bootstrap
        fresh_client.post("/auth/bootstrap", json={
            "email": "admin@test.com",
            "password": "realpass123",
        })

        # Any subsequent bootstrap attempt fails
        resp = fresh_client.post("/auth/bootstrap", json={
            "email": "second@test.com",
            "password": "other-pass",
        })
        assert resp.status_code == 403

    def test_full_agent_flow(self, fresh_client):
        """Simulate full AI agent deployment flow."""
        # 1. Health check (no auth, minimal endpoint)
        resp = fresh_client.get("/api/health")
        assert resp.status_code == 200
        assert resp.json()["status"] == "ok"

        # 2. Bootstrap admin
        resp = fresh_client.post("/auth/bootstrap", json={
            "email": "agent@company.com", "name": "AI Agent",
        })
        assert resp.status_code == 200
        token = resp.json()["access_token"]
        headers = {"Authorization": f"Bearer {token}"}

        # 3. Check manifest (empty, no data yet)
        resp = fresh_client.get("/api/sync/manifest", headers=headers)
        assert resp.status_code == 200
        assert len(resp.json()["tables"]) == 0

        # 4. List users
        resp = fresh_client.get("/api/users", headers=headers)
        assert resp.status_code == 200
        assert len(resp.json()) == 1

        # 5. Add analyst user
        resp = fresh_client.post("/api/users", json={
            "email": "analyst@company.com", "name": "Analyst",
        }, headers=headers)
        assert resp.status_code == 201

        # 6. Verify via detailed health (requires auth)
        resp = fresh_client.get("/api/health/detailed", headers=headers)
        assert resp.json()["services"]["users"]["count"] == 2