agnes-the-ai-analyst/CHANGELOG.md
Petr Simecek 864a245acf
fix(deploy): pass CADDY_TLS through to caddy container (#55)
* fix(deploy): pass CADDY_TLS through to caddy container

PR #52 added the {$CADDY_TLS:default} substitution to the Caddyfile but
forgot to expose CADDY_TLS to the caddy service in docker-compose.yml.
Result: Caddyfile substitution falls back to the default
(`tls /certs/fullchain.pem /certs/privkey.pem`) regardless of what the
operator wrote into .env, and Caddy crash-loops with "open
/certs/fullchain.pem: no such file or directory" on any LE / internal
deployment.

Compose `- CADDY_TLS` (no `=value`) is the bare-form passthrough — Compose
reads the value from .env (or the host shell) at up time. No-op when
CADDY_TLS is unset (Caddyfile default kicks in), exact behavior preserved
for cert-file deployments.

Caught by Keboola's first agnes-dev recreate (kids-ai-data-analysis project,
agnes-dev.keboola.com) — VM came up with .env containing
CADDY_TLS="tls petr@keboola.com" but Caddy ignored it and tried to load
the corp PKI cert file.

* docs(changelog): document the CADDY_TLS passthrough fix per discipline rule
2026-04-26 01:46:42 +02:00


Changelog

All notable changes to Agnes AI Data Analyst.

Format: Keep a Changelog. Versions follow Semantic Versioning, pre-1.0 — public surface (CLI flags, REST endpoints, instance.yaml schema, extract.duckdb contract) may shift between minor versions; breaking changes called out under Changed or Removed with the BREAKING marker.

CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) are produced for every CI build; semver tags (v0.X.Y) are cut at release boundaries and reference the same commit as a stable-* tag from the same day.
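
The CalVer tag shape described above can be checked mechanically. The following is a minimal sketch, not code from the Agnes repository: `parse_image_tag` and `ImageTag` are hypothetical names, and the zero-padded month and integer `N` are assumptions, since the changelog only gives the pattern `stable-YYYY.MM.N` / `dev-YYYY.MM.N`.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: <channel>-YYYY.MM.N, e.g. "stable-2026.04.3".
# Zero-padded month and integer N are assumptions, not documented facts.
_CALVER_TAG = re.compile(r"^(stable|dev)-(\d{4})\.(0[1-9]|1[0-2])\.(\d+)$")


class ImageTag(NamedTuple):
    channel: str
    year: int
    month: int
    build: int


def parse_image_tag(tag: str) -> Optional[ImageTag]:
    """Return the parsed tag, or None if it does not match the CalVer shape."""
    m = _CALVER_TAG.match(tag)
    if m is None:
        return None
    channel, year, month, build = m.groups()
    return ImageTag(channel, int(year), int(month), int(build))
```

Semver tags such as `v0.11.0` are deliberately rejected by this pattern, so a caller can tell the two tag families apart.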


[Unreleased]

Fixed

  • docker-compose.yml caddy service now passes CADDY_TLS through to the container (- CADDY_TLS bare-form passthrough). Without it the Caddyfile {$CADDY_TLS:default} substitution always falls back to cert-file mode regardless of what the operator wrote into .env, and Caddy crash-loops on Let's Encrypt / internal-CA deployments. Should have shipped with #52; first attempt was #55, accidentally closed before merging.
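
To illustrate why the missing passthrough mattered, here is a small Python emulation of `{$NAME:default}` substitution. It is a sketch for illustration only, not Caddy's implementation: `substitute_env` is a hypothetical helper, the address `ops@example.com` is a made-up value, and the rule "the default applies only when the variable is unset" is an assumption about the substitution behaviour.

```python
import re
from typing import Mapping

# Matches {$NAME} or {$NAME:default}; an emulation for illustration only.
_PLACEHOLDER = re.compile(r"\{\$([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def substitute_env(text: str, env: Mapping[str, str]) -> str:
    """Replace {$NAME:default} with env[NAME], or the default when NAME is unset."""
    def repl(m: "re.Match[str]") -> str:
        name, default = m.group(1), m.group(2)
        if name in env:
            return env[name]
        return default if default is not None else ""
    return _PLACEHOLDER.sub(repl, text)


CADDYFILE_LINE = "{$CADDY_TLS:tls /certs/fullchain.pem /certs/privkey.pem}"

# Before the fix: .env sets CADDY_TLS, but the container never receives it.
container_env_before: dict = {}
# After the fix: the bare `- CADDY_TLS` entry forwards the operator's value.
container_env_after = {"CADDY_TLS": "tls ops@example.com"}

print(substitute_env(CADDYFILE_LINE, container_env_before))
print(substitute_env(CADDYFILE_LINE, container_env_after))
```

With the empty environment the line resolves to the cert-file default, which is the crash-loop case on hosts that have no `/certs/fullchain.pem`. With the variable forwarded, the operator's setting wins.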

Internal

  • CLAUDE.md — non-negotiable changelog discipline: every PR touching user-visible behavior must update CHANGELOG.md under ## [Unreleased] in the same PR.

0.11.0 — 2026-04-26

First tagged semver release. The version = "2.x" strings that appeared in earlier pyproject.toml snapshots were arbitrary placeholders from the initial scaffold and never reflected actual API maturity — resetting to pre-1.0 to signal that things may still shift.

Added — Auth

  • Google Workspace groups on /profile. OAuth callback fetches the signed-in user's group memberships via Cloud Identity (searchTransitiveGroups with the security label — see docs/auth-groups.md for the GCP setup checklist and the security-vs-discussion_forum gotcha). Profile link added to the user dropdown.
  • Password reset + invite flows for web and admin (/auth/password/reset, /admin/users/invite).
  • Personal access tokens (PAT) with separate :typ=pat JWT claim, per-token revoke, last-used IP tracking, "My tokens" + admin "All tokens" UI.
  • Email magic-link provider (itsdangerous-signed token).
  • Optional SEED_ADMIN_PASSWORD to pre-hash the seed admin (dev convenience).

Added — Deploy

  • keboola-deploy.yml workflow. Tag-triggered alternative to release.yml for shared dev VMs that want explicit "deploy when I tag" semantics. Publishes immutable :keboola-deploy-<tag> + floating :keboola-deploy-latest alias.
  • Caddy + Let's Encrypt + corporate-CA TLS. Caddyfile parametrized via $CADDY_TLS env var so a single file serves three regimes: cert-file (corp PKI), Let's Encrypt auto-issue, Caddy-internal-CA. URL-driven cert rotation with self-signed fallback (scripts/grpn/agnes-tls-rotate.sh). docker-compose.tls.yml overlay closes host :8000 when Caddy fronts.
  • dev_instances schema in customer-instance Terraform module gains optional tls_mode + domain (mirrors prod_instance). infra-v1.6.0 tag.
  • Optional Google OAuth credentials from Secret Manager. Module reads google-oauth-client-{id,secret} at boot if present; graceful fallback so non-Google deployments aren't affected.
  • LOCAL_DEV_MODE + make local-dev-up / local-dev-down for one-keystroke local stack with magic-link auth pre-wired.
  • Per-developer dev-<prefix>-latest GHCR alias for branches matching <prefix>/<branch> — push-to-deploy on personal dev VMs.
  • /setup web wizard for first-time instance setup, plus headless POST /api/admin/configure and POST /api/admin/discover-and-register.
  • Smoke-test job in CI (Docker-in-CI after every release) + scripts/smoke-test.sh for post-deploy verification.
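
The per-developer alias rule in the list above (`<prefix>/<branch>` maps to `dev-<prefix>-latest`) can be sketched as a small function. This is a hypothetical helper, not the CI workflow's actual logic. It assumes the prefix is everything before the first `/` and is already a valid image-tag fragment.

```python
from typing import Optional


def dev_alias_for_branch(branch: str) -> Optional[str]:
    """Return the dev-<prefix>-latest alias, or None for branches without a prefix."""
    prefix, sep, rest = branch.partition("/")
    # Require both a non-empty prefix and a non-empty branch part after the slash.
    if not sep or not prefix or not rest:
        return None
    return f"dev-{prefix}-latest"
```

Under these assumptions a branch such as `zs/fix-tls` (a made-up name) maps to `dev-zs-latest`, while `main` produces no alias.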

Added — CLI

  • Wheel distribution + auto-update check on startup.
  • --version flag, --dry-run + X/N progress on da sync, durable sync (atomic writes + manifest hash + retry on transient errors).
  • gzip on JSON/HTML responses (server-side).

Added — Data

  • Remote query engine. Two-phase BigQuery + DuckDB engine for tables too large to sync locally (--register-bq flag).
  • Business metrics. Standardized metric_definitions table in DuckDB with starter pack importer (da metrics import).
  • /api/health returns version, channel, commit_sha, image_tag, schema_version.
  • Custom connector mount support (connectors/custom/).
  • OpenAPI snapshot test for breaking-change detection.
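
A post-deploy check could verify that `/api/health` exposes the fields listed above. The sketch below only inspects a response body that has already been fetched. It assumes the endpoint returns a flat JSON object, and `missing_health_keys` and the sample values are hypothetical.

```python
import json

# Field names taken from the changelog entry; the flat JSON layout is an assumption.
EXPECTED_HEALTH_KEYS = {"version", "channel", "commit_sha", "image_tag", "schema_version"}


def missing_health_keys(body: str) -> set:
    """Return the expected keys that are absent from a /api/health JSON body."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        # Anything other than a JSON object is missing every expected key.
        return set(EXPECTED_HEALTH_KEYS)
    return EXPECTED_HEALTH_KEYS - set(payload)


# Made-up sample values, for illustration only.
sample = json.dumps({
    "version": "0.11.0",
    "channel": "stable",
    "commit_sha": "864a245acf",
    "image_tag": "stable-2026.04.1",
    "schema_version": 7,
})
```

An empty result means all five fields are present. A non-empty set names exactly what the deployed build is not reporting.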

Added — Docs / tooling

  • docs/auth-groups.md, docs/DEPLOYMENT.md, docs/HACKATHON.md, docs/ONBOARDING.md runbooks.
  • scripts/debug/probe_google_groups.py — stdlib-only probe for diagnosing Cloud Identity API issues without a deploy cycle.
  • Schema migration safety tests (idempotency, data preservation, snapshot).
  • Pre-migration snapshot of system.duckdb before schema upgrades.
  • Auto-generated JWT and session secrets with file persistence (/data/state/.jwt_secret).
  • Startup banner logging version, channel, and schema version.

Changed

  • BREAKING (deployment): Caddy compose profile renamed from production to tls. Existing docker compose --profile production up -d invocations need to switch to --profile tls.
  • BREAKING (deployment): Default Caddyfile mode is now cert-file (tls /certs/fullchain.pem /certs/privkey.pem); for the previous Let's Encrypt auto-issue behaviour set CADDY_TLS=tls <ops-email> in .env. See docs/auth-groups.md and Caddyfile inline docs.
  • Schema migration v5→v6→v7: adds users.active, personal_access_tokens table, personal_access_tokens.last_used_ip. Auto-applied at boot.
  • Image-level AGNES_VERSION now sourced from pyproject.toml at build time (no more drift between da --version and the package metadata).
  • Vendor-agnostic OSS rule codified in CLAUDE.md — customer-specific names, hostnames, project IDs belong in consumer infra repos, not in this OSS distribution.

Fixed — Security

  • Open-redirect guard for backslash in safe_next_path.
  • SessionMiddleware max_age=3600 + https_only (was browser-session forever, plain-HTTP-OK).
  • Timezone-aware datetimes in Keboola metadata cache.
  • Atomic magic-link token consumption (closes double-use race under concurrent clicks).
  • Bootstrap backdoor closed when passwordless seed admin exists.
  • urllib3 1.26→2.6.3 (resolves 4 Dependabot security alerts).
  • argon2-cffi adopted for password hashing.
  • See docs/padak-security.md for the full audit.
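
The backslash case in the open-redirect entry above matters because browsers treat `\` like `/`, so `/\evil.example` can behave like the protocol-relative `//evil.example`. The following is a hypothetical re-creation of such a guard. The real `safe_next_path` signature and logic are not shown in this changelog, so the parameters and the fallback to `/` are assumptions.

```python
from urllib.parse import urlsplit


def safe_next_path(candidate: str, default: str = "/") -> str:
    """Return candidate only if it is a same-site absolute path, else default."""
    if not candidate or not candidate.startswith("/"):
        return default
    # Reject protocol-relative URLs and any backslash, which browsers
    # normalise to a forward slash before resolving the redirect target.
    if candidate.startswith("//") or "\\" in candidate:
        return default
    parts = urlsplit(candidate)
    if parts.scheme or parts.netloc:
        return default
    return candidate
```

Rejecting every backslash is stricter than necessary for paths that legitimately contain one, but for a post-login redirect target that trade-off is usually acceptable.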

Fixed — Other

  • uvicorn --proxy-headers --forwarded-allow-ips='*' so OAuth callbacks resolve to https when behind a TLS terminator.
  • scripts/grpn/agnes-tls-rotate.sh hardened: --max-redirs 0 + --proto '=https' on cert fetch, post-fetch PEM validation (rejects HTML error pages from corp portals), ulimit -c 0 to suppress coredumps that could leak the unencrypted privkey, POSIX-safe ${arr[@]+"${arr[@]}"} array expansion.
  • scripts/tls-fetch.sh — generic URL fetcher (sm://, gs://, https://, file://) with redirect refusal + PEM validation.
  • kbcstorage moved to optional dep — unblocks urllib3 security updates; primary Keboola path now uses the DuckDB Keboola extension.
  • Dependencies consolidated into pyproject.toml (no more requirements.txt).
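
The post-fetch PEM validation mentioned above exists to catch corporate portals that answer with an HTML login or error page instead of a certificate. The scripts themselves are shell. The Python below is a stand-in that sketches the idea only, with a hypothetical `looks_like_pem` helper. It checks structure (matching BEGIN/END markers and a base64 body) and does not parse or verify the certificate.

```python
import base64
import binascii
import re

# One PEM block: matching BEGIN/END labels around a base64 body.
_PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s+(.*?)\s+-----END \1-----", re.DOTALL
)


def looks_like_pem(text: str) -> bool:
    """Structural check only: at least one block, and every body is valid base64."""
    blocks = _PEM_BLOCK.findall(text)
    if not blocks:
        return False
    for _label, body in blocks:
        try:
            # Strip line breaks before decoding; validate=True rejects stray characters.
            base64.b64decode("".join(body.split()), validate=True)
        except (binascii.Error, ValueError):
            return False
    return True
```

A fetched body that fails this check can be discarded before it replaces a working certificate on disk.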

Internal

  • Test suite expanded to 1357+ tests (4 layers — unit, integration, web smoke, journey).