Commit graph

12 commits

Author SHA1 Message Date
Petr Simecek
864a245acf
fix(deploy): pass CADDY_TLS through to caddy container (#55)
* fix(deploy): pass CADDY_TLS through to caddy container

PR #52 added the {$CADDY_TLS:default} substitution to the Caddyfile but
forgot to expose CADDY_TLS to the caddy service in docker-compose.yml.
Result: Caddyfile substitution falls back to the default
(`tls /certs/fullchain.pem /certs/privkey.pem`) regardless of what the
operator wrote into .env, and Caddy crash-loops with "open
/certs/fullchain.pem: no such file or directory" on any LE / internal
deployment.

Compose `- CADDY_TLS` (no `=value`) is the bare-form passthrough: Compose
reads the value from .env (or the host shell) at `up` time. It is a no-op when
CADDY_TLS is unset (the Caddyfile default kicks in), so behavior is unchanged
for cert-file deployments.
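
The fallback described above can be modeled in a few lines. This is a simplified Python sketch of `{$VAR:default}` expansion, not Caddy's actual parser; `expand_placeholders` is a hypothetical helper and the email address is a placeholder.

```python
import re

# Matches {$NAME} or {$NAME:default}; a simplified model of Caddyfile env placeholders.
_PLACEHOLDER = re.compile(r"\{\$([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def expand_placeholders(text: str, env: dict[str, str]) -> str:
    """Replace each placeholder with env[NAME], or with its default when NAME is absent."""

    def _sub(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_sub, text)


CADDYFILE_LINE = "{$CADDY_TLS:tls /certs/fullchain.pem /certs/privkey.pem}"

# Without the passthrough the caddy container never sees CADDY_TLS, so the default wins.
without_passthrough = expand_placeholders(CADDYFILE_LINE, {})

# With `- CADDY_TLS` in the service's environment list, the operator's value reaches Caddy.
with_passthrough = expand_placeholders(CADDYFILE_LINE, {"CADDY_TLS": "tls admin@example.com"})
```

The first call reproduces the crash-loop condition: the container environment lacks the variable, so the cert-file default is used regardless of what .env says on the host.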

Caught by Keboola's first agnes-dev recreate (kids-ai-data-analysis project,
agnes-dev.keboola.com) — VM came up with .env containing
CADDY_TLS="tls petr@keboola.com" but Caddy ignored it and tried to load
the corp PKI cert file.

* docs(changelog): document the CADDY_TLS passthrough fix per discipline rule
2026-04-26 01:46:42 +02:00
Vojtech
0bbbf3e40b
feat(tls): corporate-CA HTTPS with URL-driven rotation, on-VM CSR gen, self-signed fallback (#51)
Replaces the implicit Let's Encrypt flow with a general corporate-CA HTTPS path:

- Caddy switches to cert-file mode (`tls /certs/fullchain.pem /certs/privkey.pem`) with HSTS + TLS 1.2/1.3 floor
- New `docker-compose.tls.yml` overlay closes host `:8000` when Caddy fronts (no TLS bypass)
- New `scripts/tls-fetch.sh` — generic URL fetcher for `sm://`, `gs://`, `https://`, `file://` with redirect refusal + PEM validation
- New `scripts/grpn/agnes-tls-rotate.sh` — daily rotation, self-signed fallback against same key (zero key churn), on-VM RSA-2048 + CSR auto-gen, atomic swap, SIGUSR1 reload
- `scripts/grpn/agnes-auto-upgrade.sh` becomes cert-aware (auto-enables tls overlay when certs present)
- Compose profile `production` renamed to `tls` (aligns with DEPLOYMENT.md and infra startup)
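
The atomic swap mentioned in the rotation item can be illustrated as follows. This is a Python sketch of the general write-then-rename technique, not the contents of `agnes-tls-rotate.sh` (a shell script that is not shown in this log); `atomic_write` is a hypothetical helper.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes, mode: int = 0o600) -> None:
    """Write data to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes are on disk before the rename
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)  # atomic on POSIX when both paths share a filesystem
    except BaseException:
        # Leave the old file untouched and clean up the partial temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the rename either happens fully or not at all, a reader of the certificate file sees the old or the new contents, never a half-written file.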

Pairs with FoundryAI/agnes-the-ai-analyst-infra#27 (merged) which wires per-VM `local.vm_tls`, writes `TLS_*` env vars into `.env`, auto-creates Secret Manager containers for `sm://` privkey URLs, and installs `agnes-tls-rotate.{service,timer}` for daily polling.

Includes hardening + docs follow-ups from code review:
- `TLS_CSR_SUBJECT` env-var parametrisation applied to both CSR and self-signed cert paths
- curl `--max-redirs 0 --proto '=https'` + post-fetch PEM validation in `tls-fetch.sh`
- `ulimit -c 0` + array-form `COMPOSE_FILES` (safe against word splitting, bash 3.2 compatible; arrays are a bash feature, not POSIX sh)
- TLS section added to `config/.env.template`
- Historical-note headers in `docs/superpowers/{plans,specs}/2026-04-09-*.md` flagging the profile rename
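
The post-fetch PEM validation named in the hardening list could look roughly like this. It is a Python sketch of a structural check only, under the assumption that rejecting non-PEM responses (such as an HTML error page) is the goal; the real check in `tls-fetch.sh` is not shown in this log, and `pem_blocks` is a hypothetical name.

```python
import base64
import re

# One PEM block: matching BEGIN/END labels around a base64 body.
_PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n(.+?)\r?\n-----END \1-----",
    re.DOTALL,
)


def pem_blocks(text: str) -> list[tuple[str, bytes]]:
    """Return (label, decoded bytes) for each well-formed PEM block; raise ValueError otherwise."""
    blocks = []
    for label, body in _PEM_BLOCK.findall(text):
        compact = "".join(body.split())  # drop line breaks inside the base64 body
        try:
            blocks.append((label, base64.b64decode(compact, validate=True)))
        except ValueError as exc:  # binascii.Error is a ValueError subclass
            raise ValueError(f"PEM block {label!r} has an invalid base64 body") from exc
    if not blocks:
        raise ValueError("no PEM block found (an HTML error page, for example)")
    return blocks
```

This confirms only the envelope and the base64 encoding. It does not parse the certificate or check that it matches the private key.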
2026-04-25 19:51:25 +00:00
ZdenekSrotyr
1381770057
fix(auth): uvicorn --proxy-headers + Google OAuth doc + vendor-agnostic OSS rule in CLAUDE.md (#39)
* fix(compose): pass --proxy-headers to uvicorn so OAuth callbacks resolve to https

When the app runs behind a reverse proxy (Caddy, nginx, Cloudflare Tunnel),
uvicorn's default policy of trusting X-Forwarded-* only from 127.0.0.1 means
the request the container sees still looks like http://localhost:8000/...,
even when the user is on https://. The app's OAuth provider then sends Google a
callback URL that was never registered, and Google rejects it with
Error 400: redirect_uri_mismatch.

--proxy-headers + --forwarded-allow-ips '*' tell uvicorn to honor those
headers from any source. The container only ever sees its own docker network
anyway; trusting it everywhere is safe in this deployment shape.
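
The trust decision described above can be sketched as a small model. This is not uvicorn's code; `effective_scheme` is a hypothetical function and the Docker network address is an illustrative value.

```python
def effective_scheme(client_ip: str, headers: dict[str, str], trusted: set[str]) -> str:
    """Return the scheme the app should assume for a request received over plain HTTP.

    `trusted` holds proxy IPs whose X-Forwarded-Proto is honored; "*" trusts every peer.
    """
    forwarded = headers.get("x-forwarded-proto", "").strip().lower()
    peer_is_trusted = "*" in trusted or client_ip in trusted
    if peer_is_trusted and forwarded in {"http", "https"}:
        return forwarded
    return "http"  # the hop from proxy to container is not TLS


# The reverse proxy reaches the app from a Docker network address, not from 127.0.0.1.
request = ("172.18.0.3", {"x-forwarded-proto": "https"})

default_policy = effective_scheme(*request, trusted={"127.0.0.1"})  # header ignored
allow_all = effective_scheme(*request, trusted={"*"})  # header honored
```

With the loopback-only policy the header is dropped and the callback URL is built with http; with `'*'` the forwarded scheme is used.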

Adds docs/auth-google-oauth.md with the full operator gotcha list — env
vars that have to be set, instance.yaml fields that silently fall back to
defaults, and the DB workaround for ad-hoc role promotion when
SEED_ADMIN_EMAIL was missed on first boot.

* docs(claude): codify vendor-agnostic OSS rule for AI agents and humans

Adds a "Vendor-agnostic OSS" section to CLAUDE.md spelling out what cannot
land in this repo (specific deployments, internal hostnames/projects, cross-
references to private repos, customer-specific paths) and how to phrase
abstractions instead. Plus a pre-PR grep checklist in the existing "Git
Commits & Pull Requests" section.

This trips up agents and humans alike — the previous version of #39 had
private-deployment references in the body and a customer domain in a doc
example. Surfacing the rule once in the file every Claude/Cursor/Aider
session reads should prevent that on the next PR.

* docs(oauth): cover DOMAIN + SERVER_URL env vars introduced by PR #48

PR #48 (merged) added DOMAIN-gated Secure cookie in google.py and
documented SERVER_URL in .env.template, but this operator doc was
drafted before that merge and didn't reference either variable.
Adding both to the env table and extending the common-failure-modes
table with a sticky-cookie / redirect-URI-mismatch entry that
references SERVER_URL as the host-header-independent fix. Also
aligns the compose command snippet with the `='*'` syntax that
actually ships on main post-PR #48.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vojtech Rysanek <vrysanek@groupon.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 07:07:33 +00:00
ZdenekSrotyr
9e19fb5219
chore(deploy): trust proxy headers + document HTTPS env vars (#48)
* chore(deploy): trust proxy headers + document HTTPS env vars

- uvicorn: add --proxy-headers --forwarded-allow-ips='*' so the app honors
  X-Forwarded-Proto/Host from a TLS-terminating reverse proxy (Caddy,
  Cloudflare Tunnel, nginx, LB). Without this the app saw every request as
  plain HTTP and built redirect/OAuth URLs from the raw Host, which is
  fragile behind a proxy.
- .env.template: document DOMAIN (enables Secure cookie flag) and new
  SERVER_URL (deterministic base URL for OAuth callbacks and external
  links). Grouped under a dedicated HTTPS / REVERSE PROXY section.

* chore(deploy): add proxy header flags to Dockerfile CMD and Kamal config

Matches the docker-compose changes so non-compose deployments (docker run,
Kubernetes, ECS, Kamal) also trust X-Forwarded-Proto/X-Forwarded-For.

* fix(auth): align Google OAuth cookie Secure flag with password/email providers

Google OAuth set the access_token cookie Secure flag based on the TESTING env
var, while password and email providers use DOMAIN. This meant the DOMAIN
env var (now documented in config/.env.template) did not actually control
Secure for Google cookies. Align all three providers on DOMAIN so the
documented behavior holds consistently.
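
The aligned behavior can be sketched as follows. This is a hedged Python illustration, not the code in google.py: `access_token_cookie` is a hypothetical helper, and treating any non-empty DOMAIN as "HTTPS is in front" is an assumption.

```python
from http.cookies import SimpleCookie


def access_token_cookie(token: str, env: dict[str, str]) -> str:
    """Build a Set-Cookie value whose Secure flag depends only on DOMAIN being set."""
    cookie = SimpleCookie()
    cookie["access_token"] = token
    cookie["access_token"]["httponly"] = True
    cookie["access_token"]["path"] = "/"
    if env.get("DOMAIN"):  # assumption: any non-empty DOMAIN means HTTPS is in front
        cookie["access_token"]["secure"] = True
    return cookie["access_token"].OutputString()
```

TESTING no longer plays a role in this model: the same rule applies to every provider, so the documented env var is the single switch.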
2026-04-24 08:52:53 +02:00
ZdenekSrotyr
6c53082295
feat: multi-instance deployment, all 14 must-have items from spec
CalVer CI (release.yml) with stable/dev channels, health endpoint
with version/channel/schema_version, JWT secret auto-generation with
file persistence, smoke test script + Docker-in-CI, pre-migration
snapshot, /api/admin/configure for headless setup, /api/admin/
discover-and-register, /setup wizard, OpenAPI snapshot test, custom
connector mount support, CHANGELOG, migration safety tests, startup
banner.

663 tests pass (6 new migration safety + 3 OpenAPI snapshot + 1
updated JWT test).
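
JWT secret auto-generation with file persistence, as listed above, can be sketched like this. It is a minimal Python illustration of the general pattern, not the project's implementation; `load_or_create_secret` and the file location are hypothetical.

```python
import os
import secrets


def load_or_create_secret(path: str) -> str:
    """Return the secret stored at path, generating and persisting one on first use."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            existing = handle.read().strip()
        if existing:
            return existing
    except FileNotFoundError:
        pass

    secret = secrets.token_urlsafe(48)
    # Create the file owner-readable only; O_TRUNC also covers an existing empty file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(secret)
    return secret
```

Persisting the value means tokens issued before a restart stay valid afterwards, provided the file lives on a volume that survives container recreation.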
2026-04-10 11:57:42 +02:00
ZdenekSrotyr
d814eaa503
feat: add Caddy HTTPS reverse proxy and production compose override
2026-04-09 16:39:23 +02:00
ZdenekSrotyr
510e1a8178
fix: add restart policy and config mount to app, scheduler, extract services
2026-04-09 16:38:58 +02:00
ZdenekSrotyr
9e5066cf1d
feat: replace Docker healthcheck with curl
2026-04-09 07:03:12 +02:00
ZdenekSrotyr
92fbb88c15
chore: Docker prod config (Python 3.13, no reload), fix utcnow deprecation, update docs
2026-04-08 12:10:47 +02:00
ZdenekSrotyr
4bad893cb8
feat: Docker services (ws-gateway, corporate-memory, session-collector) + scheduler auto-auth
2026-04-08 07:04:26 +02:00
ZdenekSrotyr
9fef90a729
docs: rewrite CLAUDE.md for extract.duckdb architecture
Update project structure, architecture diagram, key implementation
details, development commands, and extensibility docs.
Add extract service to docker-compose.yml for one-shot extraction.
2026-03-31 07:52:44 +02:00
ZdenekSrotyr
3701130a11
feat: add Docker, CLI tool, scheduler, and agent skills
- Dockerfile (uv-based) + docker-compose.yml (3 services)
- CLI tool 'da' with commands: auth, sync, query, status, admin, diagnose, skills
- Scheduler sidecar service (replaces systemd timers)
- pyproject.toml for uv distribution
- Built-in skills (setup, troubleshoot) for AI agents
- 17 CLI tests, 75 total tests passing
2026-03-27 15:30:03 +01:00