agnes-the-ai-analyst/app/main.py
Petr Simecek 1c18cdf15f
release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70)
* feat(auth): mock session.google_groups in LOCAL_DEV_MODE via LOCAL_DEV_GROUPS

LOCAL_DEV_MODE auto-logged-in the dev user but left session.google_groups
empty, so group-aware UI/code paths can't be exercised on localhost without
a real Google OAuth round-trip. New LOCAL_DEV_GROUPS env var (JSON array
matching the production {id, name} shape) populates the session on every
dev-bypass request — same structure the OAuth callback writes, so mock and
prod stay in lockstep. Compare-then-write avoids spurious Set-Cookie noise
on PAT/CLI requests; malformed input falls back to [] with a WARNING so
the dev mock never breaks the dev flow.

* refactor(auth): fail-fast LOCAL_DEV_GROUPS at startup + cache + no-mutate

Three small follow-ups on the same dev-mock vector before merge:

- Validate LOCAL_DEV_GROUPS at app startup and report the parsed group IDs
  in the LOCAL_DEV_MODE banner. A malformed value now warns loudly at boot
  instead of silently logging on the first authenticated request, where
  it's easy to miss.
- Cache the parsed result single-slot, keyed by the raw env-string. Avoids
  re-parsing JSON on every authenticated request without test-isolation
  surprises — when the env value changes, the key changes and the cache
  transparently rebuilds.
- Stop mutating the parsed-input dicts (item.setdefault → spread-merge)
  so the cached list stays a fresh value on every rebuild.
- Replace the try/except guard around request.session with hasattr —
  SessionMiddleware is always registered, the silent except was paranoid.

Tests grow by a direct session-cookie inspection (decoupled from the
profile template) and three startup-banner log assertions.

* fix(auth): drop fragile session-decoder test + actually skip empty-target write

Two follow-ups on the LOCAL_DEV_GROUPS feature before merge:

- Drop test_session_holds_mocked_groups_directly. It manually decoded the
  signed session cookie via TimestampSigner + base64, hardcoding both the
  Starlette session-cookie format and the 14-day max_age. Starlette has
  changed its session encoding before (URLSafeTimedSerializer pre-0.20)
  and would do so again silently — the test would fail with a cryptic
  BadSignature, not a clear "mock is broken" signal. The remaining
  test_dev_user_sees_mocked_groups_on_profile already covers the same
  observable signal (mocked groups in /profile body) without coupling to
  Starlette internals.

- Actually skip the session write when target_groups is empty. The previous
  comment claimed compare-then-write avoided spurious Set-Cookie noise on
  PAT/CLI requests, but on those requests session.get("google_groups") is
  None and target is [], so None != [] always evaluates True and the write
  fired anyway, marking the session dirty and re-issuing Set-Cookie on
  every request. Adding `target_groups and ...` to the guard makes the
  comment honest: empty mock now genuinely no-ops, stable browser sessions
  still skip via value-equality, and the only remaining write is the one
  that actually changes state.

33 auth tests still pass locally.

* fix(auth): match production's always-write semantics for stale dev groups

Devin code-review finding on PR #70: my earlier `target_groups and ...`
short-circuit silently diverged from the production OAuth callback. In
app/auth/providers/google.py:189-194 the callback always writes
session.google_groups on each login — including [] on failure or empty
token — so the session always reflects authoritative current state. The
mock should match.

Failure mode the previous guard left open: a developer sets
LOCAL_DEV_GROUPS=[{...}] for a session, the groups land in the signed
cookie, then the developer unsets the env var and reloads. target → [],
session.get → [{...}], `if target_groups and ...` is False, no write,
stale groups stay in the browser session indefinitely. Mock now lies
about state until logout.

Fix splits the guard:
- target_groups truthy + value-changed → write the new mock (existing path)
- target_groups falsy + non-empty stored → write [] to clear stale state
- otherwise no-op (target [] + stored None/[]: no transition to record)

PAT/CLI requests with no prior session still take the no-op path
(target=[], session.get → None which is falsy), so the original goal of
suppressing spurious Set-Cookie noise on token traffic is preserved.

Tests already cover the populated and unset paths; the new clear-stale
branch is correct by construction (production has the same shape) and
the rare manual reset workflow.

* release(0.11.2): default mocked groups in make local-dev + docs/local-development.md

Cuts 0.11.2 around the LOCAL_DEV_GROUPS work plus a small dev-experience
follow-up: every `make local-dev` now boots with two sensible default
mocked groups (Local Dev Engineers + Local Dev Admins on example.com),
so /profile and group-aware code paths render something realistic
without the operator having to discover and set LOCAL_DEV_GROUPS.

Layered so the default lives in the workflow, not the contract:

- scripts/run-local-dev.sh seeds LOCAL_DEV_GROUPS via shell ":="
  syntax — only sets the var when the operator hasn't already.
  Override: LOCAL_DEV_GROUPS='[...]' make local-dev. Disable:
  LOCAL_DEV_GROUPS= make local-dev.
- docker-compose.local-dev.yml swaps the commented JSON example for
  a bare `- LOCAL_DEV_GROUPS` passthrough — the value comes from the
  shell, the compose file just propagates it. Operators running
  `docker compose up` directly without the wrapper script get an
  empty mock (correct: they didn't opt into the make-driven defaults).
- Makefile help line mentions the mocked groups so the behavior is
  visible without grepping.

New docs/local-development.md consolidates dev-onboarding instructions
that were previously scattered across docker-compose.local-dev.yml
inline comments, docs/auth-groups.md "Local-dev mock" section, the
Makefile help text, and CLAUDE.md "First-Time Setup". Single page now
covers TL;DR, what LOCAL_DEV_MODE actually bypasses, group mocking
controls + verification, what is *not* mocked (Cloud Identity, real
OAuth, admin Workspace permissions), and the safety rails that keep
the dev shortcuts off production.

Version bump 0.11.1 → 0.11.2 in pyproject.toml, CHANGELOG cuts
[Unreleased] → [0.11.2] — 2026-04-26 with a fresh empty [Unreleased]
skeleton.

* fix(local-dev): default LOCAL_DEV_GROUPS truncated by shell parameter expansion

Reported by an operator running `make local-dev` against the freshly
released 0.11.2 — the LOCAL_DEV_MODE banner showed:

    LOCAL_DEV_GROUPS is not valid JSON, ignoring:
    Expecting ',' delimiter: line 1 column 70 (char 69)
    LOCAL_DEV_GROUPS is set but produced no valid groups —
    check the WARNING above for the parse error.

Cause: the default value lived inside `${LOCAL_DEV_GROUPS:=…}` parameter
expansion. Bash matches `}` to close the expansion at the *first* `}`
encountered in the body, regardless of context — even one inside a
nested JSON object literal. The two-element JSON array was therefore
truncated to the first group's closing brace, leaving an unparseable
fragment:

    [{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers"

There is no escaping syntax for `}` inside parameter expansion (the
backslash escapes I had only escaped the quotes — `}` reaches bash
literally). Fix: hold the default in a single-quoted variable and
reference it through `${LOCAL_DEV_GROUPS:-$DEFAULT_LOCAL_DEV_GROUPS}`.
The variable's value is opaque to the expansion — no `}` matching
inside it — so the JSON survives intact. Verified with `python -m json`:

    parsed OK: 2 groups: ['local-dev-engineers@example.com',
                          'local-dev-admins@example.com']

Operators on a running 0.11.2 stack: `make local-dev-down && make
local-dev` to pick up the corrected default.

* fix(local-dev): respect LOCAL_DEV_GROUPS= disable path + add 0.11.2 changelog link

Two follow-ups from a Devin code-review pass on PR #70:

- run-local-dev.sh: switch ${LOCAL_DEV_GROUPS:-$DEFAULT} to
  ${LOCAL_DEV_GROUPS-$DEFAULT} (no leading colon). The :- form
  substitutes the default when the variable is unset OR set-but-empty,
  silently overwriting the documented disable knob. Three places
  promise this works — docs/local-development.md, the CHANGELOG entry,
  and the script's own comment — so the bug was an operator-facing
  lie, not just an implementation detail. The bare - form only
  substitutes on unset, so `LOCAL_DEV_GROUPS= make local-dev` now
  reaches the Python parser as "" and short-circuits to []. Verified
  with both empty and unset shells.

- CHANGELOG.md: add the [0.11.2] link reference at the bottom.
  Keep-a-Changelog convention is to mirror every version heading
  with a release-tag link in the footer; the 0.11.2 heading was
  missing its counterpart, breaking the Markdown link rendering on
  GitHub.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-26 16:48:55 +02:00

276 lines
12 KiB
Python

"""FastAPI main application — unified server for web UI + API."""
import logging
from contextlib import asynccontextmanager
from importlib.metadata import PackageNotFoundError
from importlib.metadata import version as _pkg_version
from pathlib import Path
from urllib.parse import quote
import os
def _app_version() -> str:
"""Product version for FastAPI title / OpenAPI schema.
Single source of truth is `pyproject.toml` `[project].version`; we read
it back via `importlib.metadata` at runtime so `/docs`, `/openapi.json`,
`/api/version`, `/cli/latest`, and `da --version` can never drift.
"""
try:
return _pkg_version("agnes-the-ai-analyst")
except PackageNotFoundError:
return "dev"
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import RedirectResponse
from fastapi.staticfiles import StaticFiles
from starlette.exceptions import HTTPException as StarletteHTTPException
from starlette.middleware.gzip import GZipMiddleware
from starlette.middleware.sessions import SessionMiddleware
from starlette.types import ASGIApp, Receive, Scope, Send
class _SelectiveGZipMiddleware:
"""GZipMiddleware wrapper that skips a set of path prefixes.
Parquet-serving endpoints send responses that are already columnar-
compressed (parquet's internal codec) and — for /api/data — can reach
hundreds of MB. Gzipping them on the way out costs CPU and latency with
no meaningful size reduction. Skip those paths; every other endpoint
(JSON manifests, HTML previews, install.sh) still gets compressed.
"""
def __init__(self, app: ASGIApp, minimum_size: int = 1024, skip_prefixes: tuple[str, ...] = ()) -> None:
self._raw = app
self._gzip = GZipMiddleware(app, minimum_size=minimum_size)
self._skip_prefixes = skip_prefixes
async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
if scope.get("type") == "http":
path = scope.get("path", "")
if any(path.startswith(p) for p in self._skip_prefixes):
await self._raw(scope, receive, send)
return
await self._gzip(scope, receive, send)
from app.auth.router import router as auth_router
from app.api.health import router as health_router
from app.api.sync import router as sync_router
from app.api.data import router as data_router
from app.api.query import router as query_router
from app.api.users import router as users_router
from app.api.memory import router as memory_router
from app.api.upload import router as upload_router
from app.api.scripts import router as scripts_router
from app.api.settings import router as settings_router
from app.api.catalog import router as catalog_router
from app.api.telegram import router as telegram_router
from app.api.admin import router as admin_router
from app.api.permissions import router as permissions_router
from app.api.access_requests import router as access_requests_router
from app.api.jira_webhooks import router as jira_webhooks_router
from app.api.metrics import router as metrics_router
from app.api.metadata import router as metadata_router
from app.api.query_hybrid import router as query_hybrid_router
from app.api.cli_artifacts import router as cli_artifacts_router
from app.api.tokens import router as tokens_router, admin_router as tokens_admin_router
from app.web.router import router as web_router
logger = logging.getLogger(__name__)
@asynccontextmanager
async def lifespan(app):
yield
from src.db import close_system_db
close_system_db()
def create_app() -> FastAPI:
app = FastAPI(
title="AI Data Analyst",
description="Data distribution platform for AI analytical systems",
version=_app_version(),
lifespan=lifespan,
)
# Compress JSON / HTML responses on the wire. Parquet downloads are
# excluded — they're already columnar-compressed and re-gzipping them
# just burns CPU with no size win. minimum_size=1024 keeps tiny
# responses uncompressed too (cheaper than the header overhead).
app.add_middleware(
_SelectiveGZipMiddleware,
minimum_size=1024,
skip_prefixes=("/api/data/", "/cli/wheel/", "/cli/download"),
)
# Session middleware (required for OAuth state)
from app.secrets import get_session_secret
session_secret = get_session_secret()
app.add_middleware(SessionMiddleware, secret_key=session_secret)
# CORS for CLI and external clients
cors_origins = os.environ.get("CORS_ORIGINS", "http://localhost:3000,http://localhost:8000").split(",")
app.add_middleware(
CORSMiddleware,
allow_origins=[o.strip() for o in cors_origins],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Load .env_overlay (persisted by /api/admin/configure)
_overlay = Path(os.environ.get("DATA_DIR", "./data")) / "state" / ".env_overlay"
if _overlay.exists():
for line in _overlay.read_text().splitlines():
if "=" in line and not line.startswith("#"):
k, v = line.split("=", 1)
os.environ.setdefault(k.strip(), v.strip())
# Load instance config on startup
try:
from app.instance_config import load_instance_config
load_instance_config()
logger.info("Instance config loaded")
except Exception as e:
logger.warning(f"Could not load instance config: {e}")
# Startup banner
from src.db import SCHEMA_VERSION
logger.info(
"Agnes %s | channel: %s | schema v%s",
os.environ.get("AGNES_VERSION", "dev"),
os.environ.get("RELEASE_CHANNEL", "dev"),
SCHEMA_VERSION,
)
# LOCAL_DEV_MODE: bypass authentication for local development. DO NOT enable in prod.
# When on, every protected route auto-logs in as a seeded admin user (default dev@localhost).
from app.auth.dependencies import (
is_local_dev_mode, get_local_dev_email, get_local_dev_groups,
)
if is_local_dev_mode():
logger.warning("=" * 60)
logger.warning("LOCAL_DEV_MODE is ON — authentication is bypassed.")
logger.warning("All requests auto-authenticate as: %s", get_local_dev_email())
# Validate + report LOCAL_DEV_GROUPS at startup so a malformed JSON
# value gets surfaced loudly here instead of silently warning on the
# first authenticated request. Empty when unset is fine — just say so.
raw_groups_env = os.environ.get("LOCAL_DEV_GROUPS", "").strip()
mocked_groups = get_local_dev_groups()
if raw_groups_env and not mocked_groups:
logger.warning(
"LOCAL_DEV_GROUPS is set but produced no valid groups — "
"check the WARNING above for the parse error.",
)
elif mocked_groups:
logger.warning(
"LOCAL_DEV_GROUPS: mocking %d group(s) into session: %s",
len(mocked_groups),
", ".join(g["id"] for g in mocked_groups),
)
else:
logger.warning("LOCAL_DEV_GROUPS is unset — session.google_groups will be empty.")
logger.warning("NEVER enable this in a deployment reachable from the internet.")
logger.warning("=" * 60)
# Seed admin user for testing/CI (when SEED_ADMIN_EMAIL is set) OR for local dev.
# Optional: SEED_ADMIN_PASSWORD sets password_hash on first seed so the user
# can log in immediately without bootstrap. Only applied if the user has no
# password_hash yet — never overwrites an existing password.
seed_email = os.environ.get("SEED_ADMIN_EMAIL") or (get_local_dev_email() if is_local_dev_mode() else None)
if seed_email:
try:
from src.db import get_system_db
from src.repositories.users import UserRepository
conn = get_system_db()
repo = UserRepository(conn)
seed_password = os.environ.get("SEED_ADMIN_PASSWORD") or None
password_hash = None
if seed_password:
from argon2 import PasswordHasher
password_hash = PasswordHasher().hash(seed_password)
existing = repo.get_by_email(seed_email)
if not existing:
import uuid
repo.create(
id=str(uuid.uuid4()),
email=seed_email,
name="Admin",
role="admin",
password_hash=password_hash,
)
logger.info("Seeded admin user: %s (password=%s)", seed_email, "yes" if password_hash else "no")
elif password_hash and not existing.get("password_hash"):
repo.update(id=existing["id"], password_hash=password_hash, role="admin")
logger.info("Set password on existing seed admin: %s", seed_email)
conn.close()
except Exception as e:
logger.warning(f"Could not seed admin: {e}")
# Static files
static_dir = Path(__file__).parent / "web" / "static"
if static_dir.exists():
app.mount("/static", StaticFiles(directory=str(static_dir)), name="static")
# Auth providers (conditional registration)
from app.auth.providers.google import router as google_auth_router, is_available as google_available
from app.auth.providers.password import router as password_auth_router
from app.auth.providers.email import router as email_auth_router, is_available as email_available
# API routers
app.include_router(auth_router)
app.include_router(google_auth_router)
app.include_router(password_auth_router)
app.include_router(email_auth_router) # Always register, check availability per-request
app.include_router(health_router)
app.include_router(sync_router)
app.include_router(data_router)
app.include_router(query_router)
app.include_router(users_router)
app.include_router(memory_router)
app.include_router(upload_router)
app.include_router(scripts_router)
app.include_router(settings_router)
app.include_router(catalog_router)
app.include_router(telegram_router)
app.include_router(admin_router)
app.include_router(permissions_router)
app.include_router(access_requests_router)
app.include_router(jira_webhooks_router)
app.include_router(metrics_router)
app.include_router(metadata_router)
app.include_router(query_hybrid_router)
app.include_router(cli_artifacts_router)
app.include_router(tokens_router)
app.include_router(tokens_admin_router)
# Web UI router (must be last — has catch-all routes)
app.include_router(web_router)
@app.exception_handler(StarletteHTTPException)
async def _html_auth_redirect_handler(request, exc: StarletteHTTPException):
"""Redirect unauthenticated HTML page loads (GET) to /login.
Only GET requests outside `/api/` and `/auth/` are redirected — that
targets browser navigations to HTML pages. POSTs, API prefixes, and
non-401 errors fall through to Starlette's default JSON response so
JSON clients (including `/auth/tokens` for PAT CRUD) keep their
existing contract.
"""
if (
exc.status_code == 401
and request.method == "GET"
and not request.url.path.startswith(("/api/", "/auth/"))
):
next_param = quote(request.url.path, safe="")
return RedirectResponse(url=f"/login?next={next_param}", status_code=302)
from fastapi.exception_handlers import http_exception_handler
return await http_exception_handler(request, exc)
return app
app = create_app()