* feat(auth): internal roles + external→internal group mapping (foundation)
Two-layer authorization model: external Cloud Identity groups (org-managed)
get mapped onto internal Agnes-defined capabilities (app-managed) via an
admin-curated many-to-many table. Per-request permission checks read off
the session — no DB hit. Refresh requires re-login.
Schema v8 — new tables:
- internal_roles (id, key UNIQUE, display_name, description, owner_module, …)
— app-defined capabilities like 'context_admin'. Modules self-register at
import; the startup hook syncs the registry into this table (idempotent).
- group_mappings (id, external_group_id, internal_role_id FK, …)
— admin-managed bindings, UNIQUE(external_group_id, internal_role_id).
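A minimal sketch of the two tables, under stated assumptions: SQLite stands in for the real system database, the integer id types are guesses, and only the columns named above are included (the elided ones are left out). The `make_db` and `bind` helpers are hypothetical and exist only to show the UNIQUE(external_group_id, internal_role_id) constraint rejecting a duplicate binding.

```python
import sqlite3

# Hypothetical DDL: column types are assumptions; elided columns are omitted.
SCHEMA = """
CREATE TABLE internal_roles (
    id INTEGER PRIMARY KEY,
    key TEXT NOT NULL UNIQUE,
    display_name TEXT,
    description TEXT,
    owner_module TEXT
);
CREATE TABLE group_mappings (
    id INTEGER PRIMARY KEY,
    external_group_id TEXT NOT NULL,
    internal_role_id INTEGER NOT NULL REFERENCES internal_roles(id),
    UNIQUE (external_group_id, internal_role_id)
);
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with both tables (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def bind(conn: sqlite3.Connection, external_group_id: str, role_key: str) -> bool:
    """Bind a group to an existing role; return False if already bound."""
    role_id = conn.execute(
        "SELECT id FROM internal_roles WHERE key = ?", (role_key,)
    ).fetchone()[0]
    try:
        conn.execute(
            "INSERT INTO group_mappings (external_group_id, internal_role_id) "
            "VALUES (?, ?)",
            (external_group_id, role_id),
        )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejects a repeated (group, role) pair.
        return False
```

Because the mapping is many-to-many, one group can carry several roles and one role can be granted through several groups; only an exact repeat of a pair is refused.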
app/auth/role_resolver.py — new module:
- register_internal_role(key, display_name, description, owner_module)
Module-author entry point. lower_snake_case key, immutable, validated.
Same key + same fields = no-op (re-import safe); same key + different
fields = ValueError so two modules can't silently overwrite each other.
- sync_registered_roles_to_db(conn) — startup reconciliation. Inserts new
keys, updates drifted metadata, never deletes (preserves mappings).
- resolve_internal_roles(external_groups, conn) — joins group_mappings.
Sorted, deduplicated role-key list. Plugged into google_callback +
dev-bypass branch in get_current_user.
- require_internal_role('key') — FastAPI dependency factory; reads
session.internal_roles; 403 with explicit message when missing.
Resolution runs at sign-in only (Google callback + LOCAL_DEV_GROUPS change
in dev-bypass) — same semantics as session.google_groups. No admin UI yet;
mappings created via repository directly until follow-up PR ships UI.
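The sign-in resolution step can be sketched as a single join over group_mappings. Assumptions: SQLite stands in for the real connection, external groups arrive as dicts with an "id" (the shape used for session.google_groups), and malformed items are skipped. The `demo_db` helper and its sample data are hypothetical.

```python
import sqlite3


def demo_db() -> sqlite3.Connection:
    """Hypothetical fixture: two roles and three group bindings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE internal_roles (id INTEGER PRIMARY KEY, key TEXT UNIQUE);
        CREATE TABLE group_mappings (
            id INTEGER PRIMARY KEY,
            external_group_id TEXT,
            internal_role_id INTEGER,
            UNIQUE (external_group_id, internal_role_id)
        );
        INSERT INTO internal_roles (id, key) VALUES (1, 'context_admin'), (2, 'data_viewer');
        INSERT INTO group_mappings (external_group_id, internal_role_id)
        VALUES ('g1', 1), ('g1', 2), ('g2', 1);
        """
    )
    return conn


def resolve_internal_roles(external_groups, conn) -> list:
    """Return the sorted, deduplicated role keys granted by the given groups."""
    ids = [
        g["id"]
        for g in (external_groups or [])
        if isinstance(g, dict) and g.get("id")  # skip malformed items
    ]
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT DISTINCT r.key FROM group_mappings m "
        "JOIN internal_roles r ON r.id = m.internal_role_id "
        f"WHERE m.external_group_id IN ({placeholders}) "
        "ORDER BY r.key",
        ids,
    ).fetchall()
    return [row[0] for row in rows]
```

The result would be stored in the session once at sign-in, which is why later mapping changes are not visible until the user signs in again.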
21 new tests in tests/test_role_resolver.py: register/list, idempotency,
collision detection, key-format validation; sync insert/update/no-delete;
resolve empty/single/many-to-many/malformed-input; e2e via
LOCAL_DEV_GROUPS — gated endpoint allowed/denied + direct session-cookie
inspection. Full sweep: 178/178 passed across auth + db + repo tests.
(Two pre-existing test_catalog_export.py failures verified unrelated.)
* fix(auth): polish review feedback — first-request dev populate + PAT doc
Two follow-ups from a code-reviewer pass on the foundation commit before
opening the PR:
- Dev-bypass populates session["internal_roles"] on the first request
after sign-in, not just when external groups change. The previous
guard only resolved when groups_changed=True, which left a hole for
the LOCAL_DEV_GROUPS=`""` (explicit empty) flow: target=[],
current=None, neither write branch fires, internal_roles stays
unset, and require_internal_role then 403s with no roles to check
against. The OAuth callback writes session["internal_roles"]
unconditionally on sign-in (even []); dev-bypass now matches those
semantics. Adds a single-pass populate gated on the key being
absent from the session, so subsequent same-state requests still
no-op (cheap session lookup, no resolver call).
- Document that internal roles are session-scoped and PAT/headless
clients will get 403 from any require_internal_role(...) endpoint.
Same constraint already applies to session.google_groups (PAT JWTs
deliberately don't snapshot group memberships — they could change
after issuance with no way to re-sign), but the doc didn't surface
this — an operator pointing a CLI at a role-gated endpoint would
see 403 with no clue why. New "PAT and headless requests" section
spells out the constraint, the rationale, and the three escape
valves (use users.role for the gate; route through OAuth; wait for
the planned `da admin grant-role` CLI helper).
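The first-request populate described in the first bullet can be sketched with a plain dict standing in for the session. The `sync_dev_session` name and the injected `resolve` callable are hypothetical; the branch structure follows the dev-bypass logic in get_current_user.

```python
def sync_dev_session(session: dict, target_groups: list, resolve) -> bool:
    """Update the session in place; return True if the resolver was called."""
    current = session.get("google_groups")
    groups_changed = False
    if target_groups and current != target_groups:
        session["google_groups"] = target_groups
        groups_changed = True
    elif not target_groups and current:
        # LOCAL_DEV_GROUPS was unset mid-session: clear the stale groups.
        session["google_groups"] = []
        groups_changed = True

    # Populate on a group change OR when the key is absent. The second
    # condition covers the explicit-empty case (target=[], current=None),
    # where neither branch above fires.
    if groups_changed or "internal_roles" not in session:
        try:
            session["internal_roles"] = resolve(target_groups)
        except Exception:
            session["internal_roles"] = []  # fall back to an empty role list
        return True
    return False
```

With an empty session and no target groups, the first call writes `internal_roles` (possibly as `[]`), and later calls in the same state skip the resolver.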
54 auth tests still pass locally (21 role-resolver + 33 existing
auth-provider).
* release(0.11.3): cut release for the internal-roles foundation
Bumps pyproject.toml 0.11.2 → 0.11.3 and renames CHANGELOG's
[Unreleased] section to [0.11.3] — 2026-04-26 (with a fresh
empty [Unreleased] skeleton appended). Adds the matching
[0.11.3] link reference at the bottom of CHANGELOG so the
section heading renders as a hyperlink to the GitHub release
page once the tag lands.
The bullet's substance is unchanged; the rephrasing of
"dev-bypass when external groups change" → "dev-bypass —
populates on first request and whenever external groups
change, mirroring the OAuth callback's always-write
semantics" reflects the polish committed in d590579. The bullet also
gains a PAT/headless caveat pointing at the doc
section that landed in the same polish pass.
* fix(auth): address review feedback from Pavel — PAT-specific 403, audit logs, hardening
Round-2 polish over the internal-roles foundation, addressing Pavel's review
on PR #71. No behavior change for the happy path; tightens the safety rails
and makes the failure modes self-explanatory.
User-visible:
- require_internal_role now distinguishes "no session" (Bearer/PAT caller)
from "signed in but missing role" and surfaces a PAT-specific 403 detail
in the first case ("This endpoint needs an interactive (OAuth) session
— Bearer/PAT tokens do not carry session-resolved roles by design").
- docs/internal-roles.md documents deactivate+reactivate as the supported
"force re-resolve now" lever for users that can't be made to log out.
Internal hardening:
- INFO-level audit log on every successful resolve (OAuth callback +
dev-bypass) so a wrong-role complaint is debuggable from the log alone.
- Startup warning when SESSION_SECRET is shorter than 32 chars, matching
the existing JWT_SECRET_KEY gate — both HMAC surfaces sign trust-laden
state (session.internal_roles, session.google_groups, JWTs).
- _clear_registry_for_tests() now refuses to run unless TESTING=1 so a
stray import path in production can't drop the registered capabilities.
Tests:
- 4 new tests in tests/test_role_resolver.py covering: stale-session
contract after a mid-session mapping revoke (pin the documented
limitation), PAT 403 detail wording, OAuth pipeline data flow from
external groups to internal_roles, and the dev-bypass empty-list
fallback when the resolver raises.
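The two 403 cases that the detail-wording test pins can be sketched without FastAPI. Assumptions: a missing `internal_roles` key is treated as "no interactive session", the `Forbidden` exception stands in for an HTTP 403, the PAT detail is shortened to its first clause, and the missing-role wording is invented for illustration.

```python
class Forbidden(Exception):
    """Framework-free stand-in for an HTTP 403 response."""

    def __init__(self, detail: str) -> None:
        super().__init__(detail)
        self.status_code = 403
        self.detail = detail


# Shortened form of the PAT-specific detail quoted in the commit message.
PAT_DETAIL = "This endpoint needs an interactive (OAuth) session"


def require_internal_role(key: str):
    """Build a checker for one role key (sketch of the dependency factory)."""

    def _check(session) -> bool:
        # Case 1: Bearer/PAT caller, so no session-resolved roles exist.
        if session is None or "internal_roles" not in session:
            raise Forbidden(PAT_DETAIL)
        # Case 2: signed in, but the role is not in the resolved list.
        if key not in session["internal_roles"]:
            raise Forbidden(f"Requires internal role {key!r}")
        return True

    return _check
```

Separating the two cases lets a CLI user see why the request failed, instead of a generic "missing role" message that suggests a mapping problem.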
CHANGELOG.md updated under [0.11.3] (### Changed + ### Internal).
CLAUDE.md schema doc bumped from v7 to v8.
---------
Co-authored-by: Claude <noreply@anthropic.com>
329 lines
13 KiB
Python
"""FastAPI auth dependencies — current user, role checking."""

import json
import logging
import os
from typing import Optional

import duckdb
from fastapi import Depends, HTTPException, Header, Request, status

from app.auth.jwt import verify_token
from src.db import get_system_db
from src.rbac import Role, ROLE_HIERARCHY
from src.repositories.users import UserRepository

logger = logging.getLogger(__name__)

# Default dev user used when LOCAL_DEV_MODE=1. Seeded at startup by app/main.py.
LOCAL_DEV_DEFAULT_EMAIL = "dev@localhost"

# Single-slot cache for the parsed LOCAL_DEV_GROUPS value, keyed by the raw env
# string. Avoids re-parsing JSON on every authenticated request without
# test-isolation surprises: when the env changes (typical in tests), the key
# changes and the cache transparently re-parses.
_LOCAL_DEV_GROUPS_CACHE: tuple[str, list[dict]] | None = None


def is_local_dev_mode() -> bool:
    """True when LOCAL_DEV_MODE is 1/true/yes (case-insensitive).

    Unsafe for production: bypasses auth.
    """
    return os.environ.get("LOCAL_DEV_MODE", "").lower() in ("1", "true", "yes")


def get_local_dev_email() -> str:
    """Email of the auto-logged-in dev user. Configurable via LOCAL_DEV_USER_EMAIL."""
    return os.environ.get("LOCAL_DEV_USER_EMAIL", LOCAL_DEV_DEFAULT_EMAIL)


def get_local_dev_groups() -> list[dict]:
    """Mock Google Workspace groups for the dev user when LOCAL_DEV_MODE is on.

    Reads ``LOCAL_DEV_GROUPS`` as a JSON array of objects matching the shape
    produced by ``_fetch_google_groups``: ``[{"id": "...", "name": "..."}]``.
    Items must have a non-empty ``id``; ``name`` defaults to ``id`` when
    omitted. Extra fields are preserved verbatim so future group attributes
    (roles, labels, …) can be mocked without touching this parser.

    Returns ``[]`` on missing/empty/malformed input; the dev mock must never
    break the dev flow. Malformed input is logged at WARNING.

    Cached single-slot: re-parses only when the raw env-var value changes.
    """
    global _LOCAL_DEV_GROUPS_CACHE
    raw = os.environ.get("LOCAL_DEV_GROUPS", "").strip()
    if _LOCAL_DEV_GROUPS_CACHE is not None and _LOCAL_DEV_GROUPS_CACHE[0] == raw:
        return _LOCAL_DEV_GROUPS_CACHE[1]
    result = _parse_local_dev_groups(raw)
    _LOCAL_DEV_GROUPS_CACHE = (raw, result)
    return result


def _parse_local_dev_groups(raw: str) -> list[dict]:
    """Parse the raw LOCAL_DEV_GROUPS string; return [] on any malformed input."""
    if not raw:
        return []
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as e:
        logger.warning("LOCAL_DEV_GROUPS is not valid JSON, ignoring: %s", e)
        return []
    if not isinstance(parsed, list):
        logger.warning(
            "LOCAL_DEV_GROUPS must be a JSON array, got %s, ignoring",
            type(parsed).__name__,
        )
        return []
    out: list[dict] = []
    for item in parsed:
        if not isinstance(item, dict) or not item.get("id"):
            logger.warning(
                "LOCAL_DEV_GROUPS item must be an object with 'id', skipping: %r",
                item,
            )
            continue
        # Build a new dict instead of mutating the parsed input; this keeps the
        # parser pure, so each rebuild caches a fresh list.
        out.append({**item, "name": item.get("name") or item["id"]})
    return out


def _get_db():
    """Yield a system DB connection and close it once the request is done."""
    conn = get_system_db()
    try:
        yield conn
    finally:
        conn.close()


def _client_ip(request: Optional[Request]) -> Optional[str]:
    """Return the request's client IP, preferring the first hop of X-Forwarded-For.

    Trust model: this deployment runs behind Caddy (see repo Caddyfile), which
    strips incoming X-Forwarded-For and sets its own. The leftmost hop is
    therefore trustworthy. If the app is ever exposed directly to the internet
    without a proxy, this value becomes client-settable and should only be
    relied on for audit/diagnostics, never access control. Value is stored in
    personal_access_tokens.last_used_ip and audit_log entries; it is
    informational only, never authorization.
    """
    if request is None:
        return None
    xff = request.headers.get("x-forwarded-for")
    if xff:
        return xff.split(",", 1)[0].strip() or None
    client = getattr(request, "client", None)
    return getattr(client, "host", None) if client else None


def _get_local_dev_user(conn: duckdb.DuckDBPyConnection) -> Optional[dict]:
    """Return the seeded dev user, or None (after logging an error) if missing.

    Does not check LOCAL_DEV_MODE itself; callers do that first.
    """
    repo = UserRepository(conn)
    user = repo.get_by_email(get_local_dev_email())
    if not user:
        logger.error(
            "LOCAL_DEV_MODE is on but dev user %s is not seeded; expected app startup to seed it",
            get_local_dev_email(),
        )
    return user


async def get_current_user(
    request: Request = None,
    authorization: Optional[str] = Header(None),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
) -> dict:
    """Extract and validate JWT from Authorization header or cookie. Returns user dict."""
    if is_local_dev_mode():
        user = _get_local_dev_user(conn)
        if user:
            # Mirror the Google OAuth callback (app/auth/providers/google.py:189-194),
            # which writes session.google_groups on every login (including [] on
            # failure) so group-aware code paths see authoritative state. We
            # match those semantics here while skipping the write when nothing
            # would change: same-value updates are a no-op, and the write on
            # PAT/CLI requests with no prior session + no target is also skipped
            # (target → [], existing → None/[], no transition to record).
            if request is not None and hasattr(request, "session"):
                target_groups = get_local_dev_groups()
                current = request.session.get("google_groups")
                groups_changed = False
                if target_groups and current != target_groups:
                    request.session["google_groups"] = target_groups
                    groups_changed = True
                elif not target_groups and current:
                    # Clear stale groups if the operator unsets LOCAL_DEV_GROUPS
                    # mid-session; matches production's "always-write" semantics.
                    request.session["google_groups"] = []
                    groups_changed = True
                # Populate internal_roles whenever it would otherwise be missing:
                # on the first request after sign-in or any time groups changed.
                # This mirrors the OAuth callback's unconditional write so a dev
                # request never reaches require_internal_role with the key
                # absent. Skipping when the role list is already cached and groups
                # didn't change keeps the per-request cost at a session lookup.
                if groups_changed or "internal_roles" not in request.session:
                    try:
                        from app.auth.role_resolver import resolve_internal_roles
                        resolved = resolve_internal_roles(target_groups, conn)
                        request.session["internal_roles"] = resolved
                        logger.info(
                            "dev-bypass resolved %d internal role(s) for %s: %s",
                            len(resolved),
                            user.get("email", "<unknown>"),
                            resolved or "<none>",
                        )
                    except Exception as e:
                        logger.warning(
                            "dev-bypass: resolve_internal_roles failed: %s", e,
                        )
                        request.session["internal_roles"] = []
            return user
        # Fall through to normal auth if the seed is missing; this surfaces the
        # bug instead of hiding it.

    token = None

    # Try Authorization header first
    if authorization and authorization.startswith("Bearer "):
        token = authorization.removeprefix("Bearer ")

    # Fallback to cookie (for web UI after OAuth redirect)
    if not token and request:
        token = request.cookies.get("access_token")

    if not token:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing or invalid Authorization header",
        )
    payload = verify_token(token)
    if not payload:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )

    repo = UserRepository(conn)
    user = repo.get_by_id(payload.get("sub", ""))
    if not user:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="User not found",
        )
    if not bool(user.get("active", True)):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Account deactivated",
        )

    # PAT validation: check it's not revoked / expired / unknown in DB.
    if payload.get("typ") == "pat":
        from datetime import datetime, timezone
        import hashlib
        from src.repositories.access_tokens import AccessTokenRepository

        def _fail(detail: str) -> None:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED, detail=detail
            )

        tokens_repo = AccessTokenRepository(conn)
        record = tokens_repo.get_by_id(payload.get("jti", ""))
        if not record:
            _fail("Token unknown")
        if record.get("revoked_at") is not None:
            _fail("Token revoked")
        exp_at = record.get("expires_at")
        if exp_at is not None:
            if isinstance(exp_at, str):
                exp_at = datetime.fromisoformat(exp_at)
            if exp_at.tzinfo is None:
                exp_at = exp_at.replace(tzinfo=timezone.utc)
            if datetime.now(timezone.utc) > exp_at:
                _fail("Token expired")
        # Defense-in-depth: stored token_hash must match sha256(bearer JWT).
        # Protects against a forged-but-unrevoked JWT using a stolen key.
        stored_hash = record.get("token_hash")
        if stored_hash:
            actual = hashlib.sha256(token.encode()).hexdigest()
            if actual != stored_hash:
                _fail("Token mismatch")

        # First-use-from-new-IP audit entry (#12 acceptance criterion).
        # Only emit when the IP changes on a *subsequent* use; the very
        # first use of a token is not surprising and doesn't need an entry.
        current_ip = _client_ip(request)
        previous_ip = record.get("last_used_ip")
        already_used = record.get("last_used_at") is not None
        if already_used and current_ip and current_ip != previous_ip:
            try:
                from src.repositories.audit import AuditRepository
                AuditRepository(conn).log(
                    user_id=user["id"],
                    action="token.first_use_new_ip",
                    resource=f"token:{payload['jti']}",
                    params={"ip": current_ip, "previous_ip": previous_ip},
                )
            except Exception:
                pass  # audit failure must not block auth

        # Record last_used_at / last_used_ip synchronously. The cost is
        # acceptable for now; writes can be batched later.
        try:
            tokens_repo.mark_used(payload["jti"], ip=current_ip)
        except Exception:
            pass  # usage bookkeeping must not block auth either

    return user


async def get_optional_user(
    request: Request = None,
    authorization: Optional[str] = Header(None),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
) -> Optional[dict]:
    """Like get_current_user but returns None instead of 401 if no token."""
    try:
        return await get_current_user(request=request, authorization=authorization, conn=conn)
    except HTTPException:
        return None


def require_role(minimum_role: Role):
    """Dependency factory: require user has at least the given role."""
    async def _check(user: dict = Depends(get_current_user)):
        user_role = Role(user.get("role", "viewer"))
        if ROLE_HIERARCHY.get(user_role, 0) < ROLE_HIERARCHY.get(minimum_role, 0):
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
                detail=f"Requires role {minimum_role.value} or higher",
            )
        return user
    return _check


async def require_admin(user: dict = Depends(get_current_user)) -> dict:
    """Dependency: require user is an admin. Raises 403 otherwise."""
    if user.get("role") != "admin":
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Admin access required",
        )
    return user


async def require_session_token(request: Request, user: dict = Depends(get_current_user)) -> dict:
    """Like get_current_user but rejects PATs.

    For endpoints that must not be callable via a long-lived CI token
    (e.g. creating new tokens, changing password).
    """
    auth = request.headers.get("authorization", "")
    token = None
    if auth.startswith("Bearer "):
        token = auth.removeprefix("Bearer ")
    if not token and request:
        token = request.cookies.get("access_token")
    if token:
        from app.auth.jwt import verify_token
        payload = verify_token(token) or {}
        if payload.get("typ") == "pat":
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
                detail="This endpoint requires an interactive session, not a PAT",
            )
    return user