* fix(security+ops): #82 #85 #87 - auth hardening, API validation, deploy posture

Security and operational hardening across three issue groups:

- M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun)
- C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile)
- M21: Docker resource limits (mem_limit, cpus) on app + scheduler
- M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server)
- M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING)
- M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs
- C2: table_id traversal validation on /api/data/{table_id}/download
- M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename
- C5: reset_token removed from POST /api/users/{id}/reset-password response
- C8: Startup WARNING when no user has password_hash (bootstrap window visible)
- M9: Audit log on failed web form login (mirrors /auth/token endpoint)
- M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch)

Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety

Review fixes:

- Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex; it was missing, allowing requests to AWS/GCP/Azure metadata endpoints
- Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex
- M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with warning log; prevents unhandled exception if DB write fails after successful token consumption
- Add test for 169.254.169.254 SSRF rejection

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak

Address Devin Review findings on PR #104:

1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution + ipaddress module checks. The old regex patterns like `fe80:` only matched up to the first colon, missing real IPv6 addresses like `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the hostname via getaddrinfo and checks each resulting IP against ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast.

2. CLI commands broken: `da setup test-connection`, `da setup verify`, `da diagnose`, `da status` all called /api/health expecting the old format (status=="healthy", services dict). Now they call /api/health/detailed for service-level checks (with graceful fallback to the minimal endpoint when auth is not configured).

3. Temp file handle leak: _stream_to_temp returns an open NamedTemporaryFile; callers now close it before shutil.move() to prevent FD leaks until GC.

Also adds IPv6 SSRF test cases (loopback, link-local, unique-local, multicast) with mocked DNS resolution for test environment independence.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): download regex blocks hyphenated IDs; document health split

Address Devin Review round-3 findings on PR #104:

1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download endpoint used the strict SQL-identifier regex which does not allow dots or hyphens, but Keboola table IDs like in.c-crm.orders contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots and hyphens while still blocking path-traversal chars (/, .., \) and quote/control characters. Added test for hyphenated/dotted IDs.

2. Documented health endpoint split in DEPLOYMENT.md: Added Health checks & external monitoring section explaining both endpoints (minimal unauth /api/health vs authenticated /api/health/detailed) and how to wire external monitoring tools to the detailed endpoint with a PAT.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening

* fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up)

Devin review on the rebased PR flagged the asymmetry: magic-link verify got the atomic compare-and-swap pattern in the original M10 fix, but password reset confirm at /auth/password/reset/confirm was still using read-validate-clear. Two concurrent POSTs with the same valid reset token could both succeed in setting different new passwords (last-write-wins). Lower severity than the magic-link race because the attacker would need the reset token AND to race the legitimate user, but the asymmetry was a polish gap.

Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then SELECT to verify our marker won, then proceed. Only the winner clears the marker and applies the password change.

New regression test_concurrent_reset_only_one_wins in tests/test_password_flows.py::TestResetConfirm pins the contract: two ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same token; exactly one gets 302 (password applied), the other gets 200 with 'Invalid or expired'. Sanity-checked against the pre-CAS code: both POSTs got 302 (race confirmed).

---------

Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
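The compare-and-swap steps named in the last commit (write a unique CONSUMED marker with UPDATE...WHERE, SELECT to check that the marker won, then clear it) can be sketched roughly as follows. This is an illustration only, not the project's code: it uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the `users` table, the `reset_token` column, and the `consume_token` name are hypothetical. The DuckDB conflict catch mentioned under M10 is not modelled here.

```python
import secrets
import sqlite3


def consume_token(conn: sqlite3.Connection, token: str) -> bool:
    """Return True only for the single caller whose marker replaced the token.

    Hypothetical schema: users(id INTEGER PRIMARY KEY, reset_token TEXT).
    """
    marker = f"CONSUMED:{secrets.token_hex(16)}"

    # Step 1: compare-and-swap. The UPDATE matches only while the column
    # still holds the original token, so at most one caller can swap it.
    conn.execute(
        "UPDATE users SET reset_token = ? WHERE reset_token = ?",
        (marker, token),
    )
    conn.commit()

    # Step 2: verify that *our* marker is the one stored.
    row = conn.execute(
        "SELECT id FROM users WHERE reset_token = ?", (marker,)
    ).fetchone()
    if row is None:
        return False  # token was invalid or another caller won the race

    # Step 3: the winner clears the marker; the caller may now proceed
    # (for example, apply the password change).
    conn.execute("UPDATE users SET reset_token = NULL WHERE id = ?", (row[0],))
    conn.commit()
    return True
```

A second call with the same token finds nothing to swap and returns False, which is the behaviour the regression test in the commit message pins down for concurrent requests.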
311 lines
9.6 KiB
Python
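The SSRF fix described in the commit message (resolve the hostname with getaddrinfo, then check every resulting IP with the `ipaddress` module) might look something like the sketch below. The function names and the injectable `resolver` parameter are assumptions made for illustration and testing; the project's actual helper is not shown on this page.

```python
import ipaddress
import socket


def is_blocked_address(ip_text: str) -> bool:
    """True if the address falls in a range an SSRF guard should refuse."""
    # Drop an IPv6 zone id such as "%eth0" before parsing.
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
    )


def host_is_safe(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Resolve hostname and accept it only if every resolved IP is allowed.

    `resolver` defaults to socket.getaddrinfo; it is a parameter so tests can
    supply canned answers instead of performing real DNS lookups.
    """
    try:
        infos = resolver(hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP for both IPv4 and IPv6.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)
```

Checking parsed addresses rather than matching hostname text is what closes the `fe80::1` style bypass: the decision is made on the resolved IP, not on the string the caller supplied.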
"""Corporate memory endpoints - knowledge items, voting, governance admin."""

import uuid
from typing import Optional, List

from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
import duckdb

from app.auth.access import require_admin
from app.auth.dependencies import get_current_user, _get_db
from src.repositories.knowledge import KnowledgeRepository
from src.repositories.audit import AuditRepository

router = APIRouter(prefix="/api/memory", tags=["memory"])

VALID_STATUSES = ["pending", "approved", "mandatory", "rejected", "revoked", "expired"]


class CreateKnowledgeRequest(BaseModel):
    title: str
    content: str
    category: str
    tags: Optional[List[str]] = None


class VoteRequest(BaseModel):
    vote: int


class AdminActionRequest(BaseModel):
    reason: Optional[str] = None
    audience: Optional[str] = None


class EditRequest(BaseModel):
    title: Optional[str] = None
    content: Optional[str] = None


class BatchActionRequest(BaseModel):
    item_ids: List[str]
    action: str  # approve, reject, mandate, revoke
    reason: Optional[str] = None
    audience: Optional[str] = None


# ---- User endpoints ----

@router.get("")
async def list_knowledge(
    status_filter: Optional[str] = None,
    category: Optional[str] = None,
    search: Optional[str] = None,
    page: int = 1,
    per_page: int = 50,
    sort: str = "updated_at",
    user: dict = Depends(get_current_user),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """List knowledge items with filtering, pagination, search."""
    repo = KnowledgeRepository(conn)
    offset = (page - 1) * per_page
    if search:
        items = repo.search(search)
    else:
        statuses = [status_filter] if status_filter else None
        items = repo.list_items(statuses=statuses, category=category, limit=per_page, offset=offset)

    # Enrich with votes
    for item in items:
        votes = repo.get_votes(item["id"])
        item["upvotes"] = votes["upvotes"]
        item["downvotes"] = votes["downvotes"]
        item["score"] = votes["upvotes"] - votes["downvotes"]

    return {"items": items, "count": len(items), "page": page, "per_page": per_page}


@router.get("/stats")
async def get_stats(
    user: dict = Depends(get_current_user),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Get corporate memory statistics."""
    rows = conn.execute(
        "SELECT status, category, COUNT(*) as n FROM knowledge_items GROUP BY status, category"
    ).fetchall()

    status_counts: dict[str, int] = {}
    categories: set[str] = set()
    total = 0
    for status, category, n in rows:
        status_counts[status] = status_counts.get(status, 0) + n
        if category:
            categories.add(category)
        total += n
    return {"total": total, "by_status": status_counts, "categories": sorted(categories)}


@router.post("", status_code=201)
async def create_knowledge(
    request: CreateKnowledgeRequest,
    user: dict = Depends(get_current_user),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    repo = KnowledgeRepository(conn)
    item_id = str(uuid.uuid4())
    repo.create(
        id=item_id,
        title=request.title,
        content=request.content,
        category=request.category,
        source_user=user.get("email"),
        tags=request.tags,
    )
    return {"id": item_id, "status": "pending"}


@router.post("/{item_id}/vote")
async def vote_knowledge(
    item_id: str,
    request: VoteRequest,
    user: dict = Depends(get_current_user),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    if request.vote not in (1, -1):
        raise HTTPException(status_code=400, detail="Vote must be 1 or -1")
    repo = KnowledgeRepository(conn)
    if not repo.get_by_id(item_id):
        raise HTTPException(status_code=404, detail="Knowledge item not found")
    repo.vote(item_id, user["id"], request.vote)
    return repo.get_votes(item_id)


@router.get("/my-votes")
async def get_my_votes(
    user: dict = Depends(get_current_user),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Get current user's votes on all items."""
    results = conn.execute(
        "SELECT item_id, vote FROM knowledge_votes WHERE user_id = ?", [user["id"]]
    ).fetchall()
    return {row[0]: row[1] for row in results}


# ---- Admin governance endpoints ----

def _get_item_or_404(repo: KnowledgeRepository, item_id: str) -> dict:
    item = repo.get_by_id(item_id)
    if not item:
        raise HTTPException(status_code=404, detail="Knowledge item not found")
    return item


def _audit_action(conn, admin_email: str, action: str, item_id: str, details: dict = None):
    audit = AuditRepository(conn)
    audit.log(user_id=admin_email, action=f"km_{action}", resource=item_id, params=details)


@router.post("/admin/approve")
async def admin_approve(
    item_id: str,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    repo = KnowledgeRepository(conn)
    _get_item_or_404(repo, item_id)
    repo.update_status(item_id, "approved")
    _audit_action(conn, user["email"], "approve", item_id)
    return {"id": item_id, "status": "approved"}


@router.post("/admin/reject")
async def admin_reject(
    item_id: str,
    request: AdminActionRequest,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    repo = KnowledgeRepository(conn)
    _get_item_or_404(repo, item_id)
    repo.update_status(item_id, "rejected")
    _audit_action(conn, user["email"], "reject", item_id, {"reason": request.reason})
    return {"id": item_id, "status": "rejected"}


@router.post("/admin/mandate")
async def admin_mandate(
    item_id: str,
    request: AdminActionRequest,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    repo = KnowledgeRepository(conn)
    _get_item_or_404(repo, item_id)
    repo.update_status(item_id, "mandatory")
    _audit_action(conn, user["email"], "mandate", item_id, {
        "reason": request.reason, "audience": request.audience,
    })
    return {"id": item_id, "status": "mandatory"}


@router.post("/admin/revoke")
async def admin_revoke(
    item_id: str,
    request: AdminActionRequest,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    repo = KnowledgeRepository(conn)
    _get_item_or_404(repo, item_id)
    repo.update_status(item_id, "revoked")
    _audit_action(conn, user["email"], "revoke", item_id, {"reason": request.reason})
    return {"id": item_id, "status": "revoked"}


@router.post("/admin/edit")
async def admin_edit(
    item_id: str,
    request: EditRequest,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    repo = KnowledgeRepository(conn)
    _get_item_or_404(repo, item_id)
    updates = {}
    if request.title is not None:
        updates["title"] = request.title
    if request.content is not None:
        updates["content"] = request.content
    if updates:
        repo.update(item_id, **updates)
        _audit_action(conn, user["email"], "edit", item_id, updates)
    return {"id": item_id, "updated": list(updates.keys())}


@router.post("/admin/batch")
async def admin_batch(
    request: BatchActionRequest,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Batch governance action on multiple items."""
    repo = KnowledgeRepository(conn)
    action_map = {
        "approve": "approved",
        "reject": "rejected",
        "mandate": "mandatory",
        "revoke": "revoked",
    }
    if request.action not in action_map:
        raise HTTPException(status_code=400, detail=f"Invalid action: {request.action}")

    new_status = action_map[request.action]
    results = {"success": [], "not_found": []}
    for item_id in request.item_ids:
        item = repo.get_by_id(item_id)
        if not item:
            results["not_found"].append(item_id)
            continue
        repo.update_status(item_id, new_status)
        _audit_action(conn, user["email"], request.action, item_id, {
            "reason": request.reason, "audience": request.audience, "batch": True,
        })
        results["success"].append(item_id)

    return results


@router.get("/admin/pending")
async def admin_pending(
    category: Optional[str] = None,
    page: int = 1,
    per_page: int = 50,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Get pending items queue for admin review."""
    repo = KnowledgeRepository(conn)
    offset = (page - 1) * per_page
    items = repo.list_items(statuses=["pending"], category=category, limit=per_page, offset=offset)
    return {"items": items, "count": len(items)}


@router.get("/admin/audit")
async def admin_audit(
    page: int = 1,
    per_page: int = 50,
    action: Optional[str] = None,
    user: dict = Depends(require_admin),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Get governance audit log."""
    audit = AuditRepository(conn)
    # Filter km_ prefixed actions
    km_action = f"km_{action}" if action else None
    entries = audit.query(action=km_action, limit=per_page)
    if not km_action:
        # Get all km_ actions
        entries = conn.execute(
            "SELECT * FROM audit_log WHERE action LIKE 'km_%' ORDER BY timestamp DESC LIMIT ?",
            [per_page],
        ).fetchall()
        if entries:
            columns = [desc[0] for desc in conn.description]
            entries = [dict(zip(columns, row)) for row in entries]
        else:
            entries = []
    return {"entries": entries, "count": len(entries)}
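The fold that `get_stats` applies to the grouped `(status, category, n)` rows can be exercised without a database. The sketch below lifts that loop into a standalone function; the name `summarize_rows` and the sample rows are hypothetical and exist only to show the shape of the result.

```python
from typing import Iterable, Optional, Tuple

Row = Tuple[str, Optional[str], int]


def summarize_rows(rows: Iterable[Row]) -> dict:
    """Fold (status, category, count) rows into the /stats response shape."""
    status_counts: dict = {}
    categories: set = set()
    total = 0
    for status, category, n in rows:
        # One status can appear in several rows (once per category).
        status_counts[status] = status_counts.get(status, 0) + n
        if category:
            categories.add(category)  # rows with a NULL/empty category are skipped
        total += n
    return {"total": total, "by_status": status_counts, "categories": sorted(categories)}


# Hypothetical rows, shaped like the GROUP BY status, category result.
sample = [("pending", "ops", 2), ("approved", "ops", 3), ("pending", None, 1)]
summary = summarize_rows(sample)
```

Because the database already returns one row per (status, category) pair, the Python side only sums a handful of rows rather than loading every knowledge item.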