agnes-the-ai-analyst/docs/local-development.md
ZdenekSrotyr 5f6bb7a4b2
2026-04-28 19:57:30 +02:00


Local Development

Single source of truth for working on Agnes against localhost. Covers the dev-mode auth bypass, mocked Google Workspace groups, what isn't mocked, and the safety rails that keep the dev shortcuts off production.

TL;DR

make local-dev

Then open http://localhost:8000. You land on /dashboard already logged in as dev@localhost (role admin) and your /profile shows two mocked Workspace groups. No login screen, no .env file, no SMTP, no GCP project — just code.

What make local-dev actually does:

  • Stacks three Compose files: docker-compose.yml (base) + docker-compose.dev.yml (hot-reload + source bind mount) + docker-compose.local-dev.yml (LOCAL_DEV_MODE overlay).
  • Seeds LOCAL_DEV_GROUPS with a sensible default (engineers + admins on example.com) so /profile is non-empty on first boot.
  • Touches an empty .env if missing — Compose validates env_file: paths even for services that never start, and the local-dev overlay drops the env-file requirement for the services that do.

make local-dev-down stops the stack; make local-dev-logs tails it.
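The steps listed above can be sketched in Python as follows. This is an illustration only: the real logic lives in the Makefile and scripts/run-local-dev.sh, and the function name prepare_local_dev and the bare "up" subcommand are assumptions, not taken from the repo.

```python
from pathlib import Path

# File names come from the list above; everything else is hypothetical.
COMPOSE_FILES = [
    "docker-compose.yml",            # base
    "docker-compose.dev.yml",        # hot-reload + source bind mount
    "docker-compose.local-dev.yml",  # LOCAL_DEV_MODE overlay
]


def prepare_local_dev(project_dir):
    """Create an empty .env if missing and return the Compose argv."""
    env_file = Path(project_dir) / ".env"
    if not env_file.exists():
        env_file.touch()  # an empty file satisfies env_file: path validation
    argv = ["docker", "compose"]
    for name in COMPOSE_FILES:
        argv += ["-f", name]  # later files override earlier ones
    argv.append("up")  # assumed subcommand; the real script may pass more flags
    return argv
```

Order matters when stacking Compose files: each later -f file overrides matching keys from the earlier ones, which is why the local-dev overlay comes last.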

What LOCAL_DEV_MODE=1 actually bypasses

The local-dev overlay sets LOCAL_DEV_MODE=1, which flips four switches:

  1. Auth bypass. app/auth/dependencies.py::get_current_user short-circuits to a seeded admin user (dev@localhost by default; override via LOCAL_DEV_USER_EMAIL) before any token check runs. Every protected route — REST and HTML — auto-authenticates.
  2. Magic-link emails skip SMTP. When the email-link auth provider is exercised in dev, the link is logged to stderr and returned in the response body instead of being sent over the wire. No mail server, no inbox.
  3. Secrets self-seed. JWT_SECRET_KEY and SESSION_SECRET auto-generate into /data/state/ on first boot if not provided. You don't need to manage them manually.
  4. No .env requirement. The overlay declares env_file: [] on the affected services, so the project-level .env doesn't need to exist. Everything dev-relevant is inline in docker-compose.local-dev.yml.
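To make switch 3 concrete, here is a minimal sketch of self-seeding a secret. The helper name load_or_create_secret, the on-disk file naming, and the 0600 permissions are assumptions for illustration; only the env var names and the /data/state/ location come from this document.

```python
import os
import secrets
from pathlib import Path


def load_or_create_secret(name, state_dir, environ=os.environ):
    """Return the secret from the environment, else from disk, else create it."""
    provided = environ.get(name)
    if provided:
        return provided  # an explicitly provided value always wins
    path = Path(state_dir) / name.lower()  # hypothetical file naming
    if path.exists():
        return path.read_text().strip()  # reuse the value from an earlier boot
    path.parent.mkdir(parents=True, exist_ok=True)
    value = secrets.token_urlsafe(32)
    path.write_text(value)
    path.chmod(0o600)  # keep the generated secret private to the owner
    return value
```

Persisting the generated value means sessions and tokens survive a container restart as long as the state directory is on a volume.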

A loud warning banner is logged at startup when LOCAL_DEV_MODE=1:

============================================================
LOCAL_DEV_MODE is ON — authentication is bypassed.
All requests auto-authenticate as: dev@localhost
LOCAL_DEV_GROUPS: mocking 2 group(s) into session: local-dev-engineers@example.com, local-dev-admins@example.com
NEVER enable this in a deployment reachable from the internet.
============================================================

If you don't see that banner at boot, dev mode isn't on — check LOCAL_DEV_MODE=1 made it into the container's env.
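If you would rather check for the banner programmatically than by eye, something like the sketch below works on captured log text. The function name parse_dev_banner is hypothetical, and the regexes assume the banner wording shown above stays stable.

```python
import re


def parse_dev_banner(log_text):
    """Return (email, group_ids) if the dev banner is present, else None."""
    if "LOCAL_DEV_MODE is ON" not in log_text:
        return None  # no banner: dev mode is not active in this container
    email_match = re.search(r"auto-authenticate as: (\S+)", log_text)
    groups_match = re.search(r"mocking \d+ group\(s\) into session: (.+)", log_text)
    email = email_match.group(1) if email_match else None
    groups = []
    if groups_match:
        groups = [g.strip() for g in groups_match.group(1).split(",")]
    return email, groups
```

Feed it the output of make local-dev-logs (for example, saved to a file and read back) to confirm which user and groups the running container picked up.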

Mocking Google Workspace groups

/profile and any future group-aware code path read session.google_groups. In production that field gets populated by the OAuth callback (app/auth/providers/google.py) from a Cloud Identity searchTransitiveGroups call. In dev there's no OAuth round-trip, so the field stays empty unless we mock it.

LOCAL_DEV_GROUPS is a JSON array of objects matching the production shape:

export LOCAL_DEV_GROUPS='[{"id":"engineers@example.com","name":"Engineering"},{"id":"admins@example.com","name":"Admins"}]'

The values flow into session.google_groups on every dev-bypass request, so group-aware code sees something realistic. Same {id, name} shape the OAuth callback writes.
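A minimal sketch of parsing that value into the {id, name} shape. The function name parse_local_dev_groups and the choice to fall back to an empty list on bad input are assumptions; the real code may handle malformed JSON differently.

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_local_dev_groups(raw):
    """Parse a LOCAL_DEV_GROUPS string into a list of {"id", "name"} dicts."""
    if not raw:
        return []  # unset or empty: exercise the no-groups path
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        logger.warning("LOCAL_DEV_GROUPS is not valid JSON: %s", exc)
        return []
    if not isinstance(data, list):
        logger.warning("LOCAL_DEV_GROUPS must be a JSON array")
        return []
    groups = []
    for item in data:
        if isinstance(item, dict) and "id" in item and "name" in item:
            groups.append({"id": item["id"], "name": item["name"]})
        else:
            logger.warning("Skipping malformed group entry: %r", item)
    return groups
```

Validating at parse time is what lets a typo surface in the boot log rather than on the first request that touches the session.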

How make local-dev seeds it

scripts/run-local-dev.sh sets a default if you haven't already (engineers + admins on example.com), so first-boot is non-empty. Three ways to control it:

make local-dev                                          # default mock — engineers + admins
LOCAL_DEV_GROUPS='[{"id":"qa@x.com","name":"QA"}]' make local-dev   # custom mock
LOCAL_DEV_GROUPS= make local-dev                         # empty — exercise the no-groups path
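The third form only works if the seeding logic distinguishes "unset" from "set but empty". A sketch of that distinction follows; resolve_groups_env is a hypothetical name, and the group names in DEFAULT_GROUPS are placeholders (only the IDs appear in the boot banner).

```python
# Placeholder default: IDs match the boot banner, names are assumed.
DEFAULT_GROUPS = (
    '[{"id":"local-dev-engineers@example.com","name":"Engineers"},'
    '{"id":"local-dev-admins@example.com","name":"Admins"}]'
)


def resolve_groups_env(environ):
    """Apply the default only when LOCAL_DEV_GROUPS is not set at all."""
    if "LOCAL_DEV_GROUPS" in environ:
        # May be "", which deliberately means "no groups".
        return environ["LOCAL_DEV_GROUPS"]
    return DEFAULT_GROUPS
```

In POSIX shell the same distinction is the difference between ${LOCAL_DEV_GROUPS-default} (default only when unset) and ${LOCAL_DEV_GROUPS:-default} (default when unset or empty); a script that used the second form would make the empty case impossible to reach.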

Verifying the mock

Two checks:

  1. Boot banner logs the parsed group IDs (or warns loudly if the JSON is malformed):

    LOCAL_DEV_GROUPS: mocking 2 group(s) into session: local-dev-engineers@example.com, local-dev-admins@example.com
    

    A typo (e.g. unbalanced bracket) shows up here — not silently on the first authenticated request.

  2. /profile renders the mocked groups in a list. If you set LOCAL_DEV_GROUPS= (empty), you'll see "No Google groups available".

Edge case: clearing stale groups mid-session

If you previously had LOCAL_DEV_GROUPS set, then unset it and make another request, the dev-bypass path writes [] into the session. This matches the production OAuth callback, which always rewrites session.google_groups on each login, so you won't get stuck looking at stale mocked groups after toggling the env var.
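The behavior comes down to assigning the session field unconditionally. A tiny sketch, using a plain dict as a stand-in for the session and a hypothetical helper name:

```python
def apply_dev_groups(session, groups):
    """Overwrite session["google_groups"] on every dev-bypass request."""
    # Assign even when groups is empty: skipping the write for an empty
    # list is exactly what would leave stale mock data behind.
    session["google_groups"] = list(groups)


session = {"google_groups": [{"id": "engineers@example.com", "name": "Engineering"}]}
apply_dev_groups(session, [])  # env var was unset between requests
```

After the second call the session holds an empty list, so /profile falls back to the no-groups message.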

What's NOT mocked

LOCAL_DEV_MODE is intentionally narrow. These still need real configuration if you exercise them:

  • Cloud Identity API. No real call ever fires in dev. LOCAL_DEV_GROUPS populates session.google_groups directly without going through _fetch_google_groups. To debug the actual API call, use scripts/debug/probe_google_groups.py against a real OAuth token.
  • Real OAuth round-trip. The Google login button is hidden or a no-op in dev mode. To test the full OAuth flow, follow docs/auth-google-oauth.md and unset LOCAL_DEV_MODE.
  • Admin Workspace permissions. The mocked groups are not authoritative — they live only in your browser session. They don't grant any real access to anything outside Agnes; they let you exercise group-aware code paths inside the app.
  • PAT (Personal Access Token) flow. PATs work normally in dev mode; the dev bypass only short-circuits cookie/session auth. Token-bearer requests still hit the JWT validation path.

Security model

LOCAL_DEV_MODE=1 is a footgun by design — every protected route auto-authenticates as admin without any check. The codebase has these rails to keep it from leaking into prod:

  • docker-compose.local-dev.yml is a separate overlay, never stacked into docker-compose.prod.yml. Production deployments never see it.
  • The startup banner is loud and unmissable: WARNING level, framed by a repeated 60-character separator. Anyone reading container logs at startup will spot it immediately.
  • is_local_dev_mode() reads os.environ fresh on every call — no startup-time cache that could be poisoned.
  • LOCAL_DEV_GROUPS is honored only inside the if is_local_dev_mode(): block in get_current_user. Setting it without LOCAL_DEV_MODE=1 does nothing.
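The last two rails can be sketched like this. is_local_dev_mode is named in this document, but its body here is a guess: treating only the literal "1" as on, and the helper dev_groups_for_request, are assumptions for illustration.

```python
import json
import os


def is_local_dev_mode():
    # Read the environment on every call; nothing is cached at import time.
    return os.environ.get("LOCAL_DEV_MODE") == "1"  # accepted value is assumed


def dev_groups_for_request():
    """Return mocked groups only when dev mode is actually on."""
    if not is_local_dev_mode():
        return []  # LOCAL_DEV_GROUPS on its own has no effect
    raw = os.environ.get("LOCAL_DEV_GROUPS", "")
    return json.loads(raw) if raw else []
```

Because the flag is read fresh each time, flipping LOCAL_DEV_MODE off in a running process takes effect on the next call, and a stray LOCAL_DEV_GROUPS in a production environment is inert.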

If you ever see the dev banner in a real deployment's logs, treat it as a P0 incident: the auth boundary is gone.

See also

  • docs/auth-groups.md — production Google Workspace groups: GCP setup checklist, the security label gotcha, debugging the real Cloud Identity call.
  • docs/auth-google-oauth.md — full Google OAuth setup for non-dev environments (client ID, scopes, redirect URIs).
  • docs/QUICKSTART.md — first-time setup for a real (non-dev) instance.
  • CLAUDE.md — repo-wide engineering conventions (changelog discipline, vendor-agnostic OSS rules, project structure).