* fix(security+ops): #82 #85 #87 - auth hardening, API validation, deploy posture

Security and operational hardening across three issue groups:

- M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun)
- C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile)
- M21: Docker resource limits (mem_limit, cpus) on app + scheduler
- M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server)
- M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING)
- M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs
- C2: table_id traversal validation on /api/data/{table_id}/download
- M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename
- C5: reset_token removed from POST /api/users/{id}/reset-password response
- C8: Startup WARNING when no user has password_hash (bootstrap window visible)
- M9: Audit log on failed web form login (mirrors /auth/token endpoint)
- M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch)

Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety

Review fixes:

- Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex; it was missing, allowing requests to AWS/GCP/Azure metadata endpoints
- Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex
- M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with warning log; prevents an unhandled exception if the DB write fails after successful token consumption
- Add test for 169.254.169.254 SSRF rejection

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak

Address Devin Review findings on PR #104:

1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution + ipaddress module checks. The old regex patterns like `fe80:` only matched up to the first colon, missing real IPv6 addresses like `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the hostname via getaddrinfo and checks each resulting IP against ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast.
2. CLI commands broken: `da setup test-connection`, `da setup verify`, `da diagnose`, `da status` all called /api/health expecting the old format (status=="healthy", services dict). Now they call /api/health/detailed for service-level checks (with graceful fallback to the minimal endpoint when auth is not configured).
3. Temp file handle leak: _stream_to_temp returns an open NamedTemporaryFile; callers now close it before shutil.move() to prevent FD leaks until GC.

Also adds IPv6 SSRF test cases (loopback, link-local, unique-local, multicast) with mocked DNS resolution for test environment independence.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): download regex blocks hyphenated IDs; document health split

Address Devin Review round-3 findings on PR #104:

1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download endpoint used the strict SQL-identifier regex which does not allow dots or hyphens, but Keboola table IDs like in.c-crm.orders contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots and hyphens while still blocking path-traversal chars (/, .., \) and quote/control characters. Added test for hyphenated/dotted IDs.
2. Documented health endpoint split in DEPLOYMENT.md: Added Health checks & external monitoring section explaining both endpoints (minimal unauth /api/health vs authenticated /api/health/detailed) and how to wire external monitoring tools to the detailed endpoint with a PAT.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening

* fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up)

Devin review on the rebased PR flagged the asymmetry: magic-link verify got the atomic compare-and-swap pattern in the original M10 fix, but password reset confirm at /auth/password/reset/confirm was still using read-validate-clear. Two concurrent POSTs with the same valid reset token could both succeed in setting different new passwords (last-write-wins). Lower severity than the magic-link race because the attacker would need the reset token AND to race the legitimate user, but the asymmetry was a polish gap.

Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then SELECT to verify our marker won, then proceed. Only the winner clears the marker and applies the password change.

New regression test_concurrent_reset_only_one_wins in tests/test_password_flows.py::TestResetConfirm pins the contract: two ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same token; exactly one gets 302 (password applied), the other gets 200 with 'Invalid or expired'. Sanity-checked against the pre-CAS code: both POSTs got 302 (race confirmed).

---------

Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
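
The SSRF fix described above (resolve the hostname, then classify every resulting address with the `ipaddress` module) can be illustrated roughly as follows. This is a minimal sketch, not the project's actual code: the function names `resolve_ips`, `is_blocked_ip`, and `is_safe_host`, and the injectable `resolver` parameter, are hypothetical and exist only so the check can be exercised without real DNS.

```python
import ipaddress
import socket
from typing import Callable, Iterable


def resolve_ips(hostname: str) -> list[str]:
    """Resolve a hostname to IP strings with the system resolver."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return [info[4][0] for info in infos]


def is_blocked_ip(ip_text: str) -> bool:
    """True if the address falls in a range an outbound request should not reach."""
    # Drop an IPv6 zone ID such as "fe80::1%eth0" before parsing.
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
    )


def is_safe_host(
    hostname: str,
    resolver: Callable[[str], Iterable[str]] = resolve_ips,  # hypothetical hook for tests
) -> bool:
    """Accept a host only if it resolves and every resolved address is allowed."""
    try:
        ips = list(resolver(hostname))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return bool(ips) and not any(is_blocked_ip(ip) for ip in ips)
```

Checking parsed addresses instead of matching hostname text is what closes the `fe80::1`-style bypass: the classification no longer depends on how the address happens to be written. A host is rejected if any one of its resolved addresses is blocked.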
252 lines
9.7 KiB
Python
"""Email magic link auth provider for FastAPI."""

import logging
import os
import secrets
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

from fastapi import APIRouter, Depends, HTTPException
from fastapi.responses import RedirectResponse
from pydantic import BaseModel
import duckdb

from app.auth.jwt import create_access_token
from app.auth.access import is_user_admin
from app.auth.dependencies import _get_db, is_local_dev_mode
from src.repositories.users import UserRepository


def _role_label(user: dict, conn: duckdb.DuckDBPyConnection) -> str:
    if is_user_admin(user["id"], conn):
        return "admin"
    return user.get("role") or "user"


logger = logging.getLogger(__name__)
router = APIRouter(prefix="/auth/email", tags=["auth"])

MAGIC_LINK_EXPIRY = 3600  # seconds (1 hour)


class MagicLinkRequest(BaseModel):
    email: str


class MagicLinkVerify(BaseModel):
    email: str
    token: str


def is_available() -> bool:
    # In dev mode the link is rendered to logs + response, so the provider is "available"
    # even without SMTP/SendGrid. Keeps the login UI showing the magic-link option.
    if is_local_dev_mode():
        return True
    return bool(os.environ.get("SMTP_HOST") or os.environ.get("SENDGRID_API_KEY"))


def _has_email_transport() -> bool:
    return bool(os.environ.get("SMTP_HOST") or os.environ.get("SENDGRID_API_KEY"))


def _build_magic_link(email: str, token: str) -> str:
    # URL-encode email: a literal '+' in a query string decodes to space per
    # application/x-www-form-urlencoded, which would break addresses like
    # "user+tag@gmail.com" on the GET /verify side.
    server_url = os.environ.get("SERVER_URL", "http://localhost:8000")
    return f"{server_url}/auth/email/verify?email={quote(email, safe='')}&token={token}"


@router.post("/send-link")
async def send_magic_link(
    request: MagicLinkRequest,
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Send a magic link to the user's email.

    When LOCAL_DEV_MODE=1, the link is also logged at WARNING level and
    returned in the response body so a developer can click it without an
    email transport. Outside dev mode the link is never exposed in the
    response, even if no SMTP/SendGrid transport is configured.
    """
    repo = UserRepository(conn)
    user = repo.get_by_email(request.email)

    # Always return success to prevent email enumeration
    if not user:
        return {"message": "If this email is registered, you will receive a login link."}

    # Generate token
    token = secrets.token_urlsafe(32)
    repo.update(
        id=user["id"],
        reset_token=token,
        reset_token_created=datetime.now(timezone.utc),
    )

    link = _build_magic_link(request.email, token)
    send_error: str | None = None
    if _has_email_transport():
        try:
            _send_email(request.email, token)
        except Exception as e:
            send_error = str(e)
            logger.error("Failed to send magic link email to %s: %s", request.email, e)

    # Dev fallback: expose the link in logs + response so you can click it without SMTP.
    # Scoped strictly to LOCAL_DEV_MODE so test and production behavior are unchanged.
    if is_local_dev_mode():
        logger.warning("=" * 60)
        logger.warning("Magic link for %s (LOCAL_DEV_MODE fallback):", request.email)
        logger.warning(" %s", link)
        logger.warning("=" * 60)
        response: dict = {
            "message": "Magic link generated (LOCAL_DEV_MODE) — click dev_link to log in.",
            "dev_link": link,
        }
        if send_error:
            response["send_error"] = send_error
        return response

    return {"message": "If this email is registered, you will receive a login link."}


def _consume_token(conn: duckdb.DuckDBPyConnection, email: str, token: str) -> dict:
    """Validate & consume a magic-link token atomically. Returns the user dict or raises 401.

    Uses a "compare-and-swap" pattern: instead of setting reset_token to NULL
    directly, we first set it to a unique CONSUMED marker that identifies THIS
    consumption attempt, then verify that OUR marker was written. Two concurrent
    verifies will both try to write their marker, but only one will succeed
    (the WHERE clause checks the original token value); the loser's UPDATE is
    a no-op, and the loser sees the winner's marker and fails.

    DuckDB doesn't expose affected-row count, so the marker is the only way
    to distinguish "I won the race" from "someone else won."
    """
    # Compute the TTL cutoff in Python: DuckDB doesn't support
    # parameterized INTERVAL arithmetic (?, INTERVAL) in all builds.
    cutoff = datetime.now(timezone.utc) - timedelta(seconds=MAGIC_LINK_EXPIRY)

    # Unique marker for this consumption attempt; it lets us detect who won
    # the race without relying on DuckDB rowcount (which returns -1).
    consume_id = f"CONSUMED:{secrets.token_hex(16)}"

    # Step 1: Atomic compare-and-swap. Only succeeds if the token still
    # matches the original value and hasn't expired. On success, writes
    # OUR consume_id instead of NULL so we can verify ownership.
    # DuckDB raises TransactionContext Error on concurrent row conflicts;
    # catch and treat as "someone else won the race."
    try:
        conn.execute(
            "UPDATE users SET reset_token = ?, reset_token_created = NULL "
            "WHERE email = ? AND reset_token = ? AND reset_token_created IS NOT NULL "
            "AND reset_token_created >= ?",
            [consume_id, email, token, cutoff],
        )
    except Exception as exc:
        err = str(exc).lower()
        if "conflict" in err or "transaction" in err:
            raise HTTPException(status_code=401, detail="Invalid or expired link")
        raise

    # Step 2: Verify that OUR consume_id was written. If a concurrent
    # request won the race, we'll see THEIR consume_id (or NULL if they
    # already cleared it in step 3); either way, we fail.
    row = conn.execute(
        "SELECT reset_token FROM users WHERE email = ?",
        [email],
    ).fetchone()
    if not row or row[0] != consume_id:
        raise HTTPException(status_code=401, detail="Invalid or expired link")

    # Step 3: Clear the consumed marker. Safe to do unconditionally:
    # only the winner reaches here, and the marker is transient.
    # If this UPDATE fails (DB error), the marker persists but the user
    # can still request a new magic link, so this is not a lockout.
    try:
        conn.execute(
            "UPDATE users SET reset_token = NULL WHERE email = ? AND reset_token = ?",
            [email, consume_id],
        )
    except Exception:
        logger.warning("Failed to clear CONSUMED marker for %s — marker will persist", email)

    # Fetch the user (token is now cleared, but we need the rest of the fields).
    # CAS already validated token + expiry atomically, so no further checks
    # are needed; re-running them now would always fail because reset_token was
    # NULL'd in step 3.
    repo = UserRepository(conn)
    user = repo.get_by_email(email)
    if not user:
        raise HTTPException(status_code=401, detail="Invalid link")
    return user


@router.post("/verify")
async def verify_magic_link(
    request: MagicLinkVerify,
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Verify a magic link token and issue JWT (JSON API for programmatic clients)."""
    user = _consume_token(conn, request.email, request.token)
    role_label = _role_label(user, conn)
    jwt_token = create_access_token(user["id"], user["email"], role_label)
    return {"access_token": jwt_token, "token_type": "bearer", "email": user["email"], "role": role_label}


@router.get("/verify")
async def verify_magic_link_get(
    email: str,
    token: str,
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
    """Click-through variant: verifies token, sets cookie, redirects to /dashboard.

    This is the URL we embed in outgoing emails (and the dev-fallback link), so
    clicking it in a mail client logs the user in without a separate API call.
    """
    user = _consume_token(conn, email, token)
    jwt_token = create_access_token(user["id"], user["email"], _role_label(user, conn))
    # secure=False when DOMAIN is unset so the cookie is actually sent on plain HTTP (dev).
    use_secure = os.environ.get("DOMAIN", "") != ""
    response = RedirectResponse(url="/dashboard", status_code=302)
    response.set_cookie(
        key="access_token", value=jwt_token,
        httponly=True, max_age=86400, samesite="lax",
        secure=use_secure,
    )
    return response


def _send_email(email: str, token: str):
    """Send magic link email via SMTP or SendGrid."""
    link = _build_magic_link(email, token)
    sendgrid_key = os.environ.get("SENDGRID_API_KEY")
    if sendgrid_key:
        import sendgrid
        from sendgrid.helpers.mail import Mail
        sg = sendgrid.SendGridAPIClient(api_key=sendgrid_key)
        message = Mail(
            from_email=os.environ.get("EMAIL_FROM_ADDRESS", "noreply@example.com"),
            to_emails=email,
            subject="Login Link",
            html_content=f'<p>Click to login: <a href="{link}">Login</a></p>',
        )
        sg.send(message)
        return

    smtp_host = os.environ.get("SMTP_HOST")
    if smtp_host:
        import smtplib
        from email.mime.text import MIMEText
        msg = MIMEText(f"Login link: {link}")
        msg["Subject"] = "Login Link"
        msg["From"] = os.environ.get("SMTP_FROM", "noreply@example.com")
        msg["To"] = email
        with smtplib.SMTP(smtp_host, int(os.environ.get("SMTP_PORT", "587"))) as s:
            if os.environ.get("SMTP_USE_TLS", "true").lower() == "true":
                s.starttls()
            smtp_user = os.environ.get("SMTP_USER")
            if smtp_user:
                s.login(smtp_user, os.environ.get("SMTP_PASSWORD", ""))
            s.send_message(msg)
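
The three-step marker flow in `_consume_token` can be tried in isolation with a small standalone script. The sketch below is an illustration under stated assumptions, not the project's code: Python's built-in `sqlite3` stands in for DuckDB, the `users` table is reduced to two columns, the expiry check is omitted, and the function returns a boolean instead of raising `HTTPException`. It runs the two attempts sequentially, so it shows the single-use contract rather than a true concurrent race.

```python
import secrets
import sqlite3


def consume_token(conn: sqlite3.Connection, email: str, token: str) -> bool:
    """Return True only for the one caller that swaps the token for its marker."""
    marker = f"CONSUMED:{secrets.token_hex(16)}"

    # Step 1: compare-and-swap. A no-op if the stored token no longer matches.
    conn.execute(
        "UPDATE users SET reset_token = ? WHERE email = ? AND reset_token = ?",
        (marker, email, token),
    )

    # Step 2: we won only if the stored value is our own marker.
    row = conn.execute(
        "SELECT reset_token FROM users WHERE email = ?", (email,)
    ).fetchone()
    if row is None or row[0] != marker:
        return False

    # Step 3: clear the transient marker.
    conn.execute(
        "UPDATE users SET reset_token = NULL WHERE email = ? AND reset_token = ?",
        (email, marker),
    )
    return True


# Simplified stand-in schema (assumption: the real table has more columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, reset_token TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("a@example.com", "tok123"))

first = consume_token(conn, "a@example.com", "tok123")   # token matches: wins
second = consume_token(conn, "a@example.com", "tok123")  # token already consumed
stored = conn.execute(
    "SELECT reset_token FROM users WHERE email = ?", ("a@example.com",)
).fetchone()[0]
```

Because each attempt writes its own random marker, the ownership check in step 2 does not depend on the driver reporting an affected-row count, which is the property the original docstring relies on.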