agnes-the-ai-analyst/CHANGELOG.md
Petr Simecek 598f186eb1
release(0.11.0): reset to pre-1.0 semver + first changelog (#58)
The version = "2.x" strings in earlier pyproject.toml snapshots were
arbitrary placeholders from the initial scaffold (cookiecutter default),
not a reflection of API maturity. Resetting to 0.11.0 to signal pre-1.0
status: public surface (CLI flags, REST endpoints, instance.yaml schema,
extract.duckdb contract) may still shift between minor versions.

CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) continue from CI;
semver tags (v0.X.Y) are cut at release boundaries and reference the
same commit as a stable-* tag from the same day.

CHANGELOG.md replaces the old CalVer draft format with Keep a Changelog
+ semver. The 0.11.0 entry curates everything currently in main:
- Auth: Workspace groups, password reset, PAT, magic-link, seed admin pwd
- Deploy: keboola-deploy workflow, Caddy/LE/cert-file TLS, dev_instances
  TLS, optional Google OAuth from SM, LOCAL_DEV_MODE, /setup wizard
- CLI: wheel distribution, auto-update, --version, --dry-run, gzip
- Data: remote query (BQ+DuckDB), business metrics, OpenAPI snapshot test
- Security: padak-security.md audit batch + urllib3 + argon2-cffi
- Two BREAKING items called out (Caddy profile rename, Caddyfile default
  cert mode flipped to cert-file)
2026-04-26 01:05:55 +02:00


Changelog

All notable changes to Agnes AI Data Analyst.

Format: Keep a Changelog. Versions follow Semantic Versioning. The project is pre-1.0, so the public surface (CLI flags, REST endpoints, instance.yaml schema, extract.duckdb contract) may shift between minor versions; breaking changes are called out under Changed or Removed with the BREAKING marker.

CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) are produced for every CI build; semver tags (v0.X.Y) are cut at release boundaries and reference the same commit as a stable-* tag from the same day.
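
As an illustration of the two tag families above, the following sketch classifies a tag string. The digit widths (four-digit year, two-digit month) and the function name `classify_tag` are assumptions made for the example, not something the release tooling is known to enforce.

```python
import re
from typing import Optional

# Tag shapes taken from the paragraph above. The exact digit widths are an
# assumption: 4-digit year, 2-digit month, free-form build counter.
CALVER_RE = re.compile(r"(stable|dev)-(\d{4})\.(\d{2})\.(\d+)")
SEMVER_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def classify_tag(tag: str) -> Optional[str]:
    """Return 'calver-stable', 'calver-dev', 'semver', or None if unrecognised."""
    calver = CALVER_RE.fullmatch(tag)
    if calver:
        return f"calver-{calver.group(1)}"
    if SEMVER_RE.fullmatch(tag):
        return "semver"
    return None
```

For example, `classify_tag("stable-2026.04.3")` yields `"calver-stable"` and `classify_tag("v0.11.0")` yields `"semver"`.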


0.11.0 — 2026-04-26

First tagged semver release. The version = "2.x" strings that appeared in earlier pyproject.toml snapshots were arbitrary placeholders from the initial scaffold and never reflected actual API maturity. This release resets the version to pre-1.0 to signal that the public surface may still shift between minor versions.

Added — Auth

  • Google Workspace groups on /profile. OAuth callback fetches the signed-in user's group memberships via Cloud Identity (searchTransitiveGroups with the security label — see docs/auth-groups.md for the GCP setup checklist and the security-vs-discussion_forum gotcha). Profile link added to the user dropdown.
  • Password reset + invite flows for web and admin (/auth/password/reset, /admin/users/invite).
  • Personal access tokens (PAT) with separate :typ=pat JWT claim, per-token revoke, last-used IP tracking, "My tokens" + admin "All tokens" UI.
  • Email magic-link provider (itsdangerous-signed token).
  • Optional SEED_ADMIN_PASSWORD to pre-hash the seed admin (dev convenience).
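
To make the idea of a signed, expiring magic-link token concrete, here is a minimal sketch. It deliberately uses the standard library's `hmac` as a stand-in rather than itsdangerous, so it shows the general technique and not the project's code; the payload fields and the function names `sign_link_token` and `verify_link_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_link_token(email: str, secret: bytes, now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>' where the signature is HMAC-SHA256."""
    issued_at = int(time.time() if now is None else now)
    payload = json.dumps({"email": email, "iat": issued_at}).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64(payload)}.{_b64(signature)}"


def verify_link_token(
    token: str, secret: bytes, max_age: int, now: Optional[float] = None
) -> Optional[str]:
    """Return the email if the signature is valid and the token is fresh."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = _unb64(payload_b64)
        signature = _unb64(signature_b64)
    except ValueError:  # wrong number of parts or undecodable base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    data = json.loads(payload)
    age = (time.time() if now is None else now) - data["iat"]
    if age < 0 or age > max_age:
        return None
    return data["email"]
```

A signature check alone does not make a token single-use; that requires server-side state, which this sketch leaves out.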

Added — Deploy

  • keboola-deploy.yml workflow. Tag-triggered alternative to release.yml for shared dev VMs that want explicit "deploy when I tag" semantics. Publishes immutable :keboola-deploy-<tag> + floating :keboola-deploy-latest alias.
  • Caddy + Let's Encrypt + corporate-CA TLS. Caddyfile parametrized via $CADDY_TLS env var so a single file serves three regimes: cert-file (corp PKI), Let's Encrypt auto-issue, Caddy-internal-CA. URL-driven cert rotation with self-signed fallback (scripts/grpn/agnes-tls-rotate.sh). docker-compose.tls.yml overlay closes host :8000 when Caddy fronts.
  • dev_instances schema in customer-instance Terraform module gains optional tls_mode + domain (mirrors prod_instance). infra-v1.6.0 tag.
  • Optional Google OAuth credentials from Secret Manager. Module reads google-oauth-client-{id,secret} at boot if present; graceful fallback so non-Google deployments aren't affected.
  • LOCAL_DEV_MODE + make local-dev-up / local-dev-down for one-keystroke local stack with magic-link auth pre-wired.
  • Per-developer dev-<prefix>-latest GHCR alias for branches matching <prefix>/<branch> — push-to-deploy on personal dev VMs.
  • /setup web wizard for first-time instance setup, plus headless POST /api/admin/configure and POST /api/admin/discover-and-register.
  • Smoke-test job in CI (Docker-in-CI after every release) + scripts/smoke-test.sh for post-deploy verification.
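
The per-developer alias rule above (branches matching <prefix>/<branch> map to dev-<prefix>-latest) could be expressed roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and restricting the prefix to lowercase tag-safe characters is a guess at what the workflow accepts.

```python
import re
from typing import Optional

# Assumption: the prefix must already be usable inside an image tag, so only
# lowercase letters, digits, "_", "." and "-" are accepted here.
PREFIX_RE = re.compile(r"[a-z0-9][a-z0-9_.-]*")


def dev_alias_for_branch(branch: str) -> Optional[str]:
    """Map '<prefix>/<branch>' to 'dev-<prefix>-latest'; None if no prefix."""
    prefix, separator, rest = branch.partition("/")
    if not separator or not rest:
        return None  # e.g. "main" or "alice/" carries no usable prefix
    if not PREFIX_RE.fullmatch(prefix):
        return None
    return f"dev-{prefix}-latest"
```

With this rule, a push to `alice/fix-login` would move `dev-alice-latest`, while a push to `main` would produce no per-developer alias.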

Added — CLI

  • Wheel distribution + auto-update check on startup.
  • --version flag; --dry-run and X/N progress on da sync; durable sync (atomic writes, manifest hash, retry on transient errors).
  • gzip on JSON/HTML responses (server-side).
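
The three ingredients of durable sync named above (atomic writes, a content hash for the manifest, retry on transient errors) can be sketched with the standard library alone. The helper names, the choice of SHA-256, and the set of exceptions treated as transient are assumptions for illustration, not a description of how da sync is implemented.

```python
import hashlib
import os
import tempfile
import time
from pathlib import Path
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def atomic_write(path: Path, data: bytes) -> str:
    """Write data via a temp file plus rename; return its SHA-256 hex digest.

    Readers see either the previous file or the complete new one.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return hashlib.sha256(data).hexdigest()


def with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    transient: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call operation, retrying transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except transient:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

The digest returned by `atomic_write` is the kind of value a manifest could record per file, so a later run can detect files that changed or were truncated.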

Added — Data

  • Remote query engine. Two-phase BigQuery + DuckDB engine for tables too large to sync locally (--register-bq flag).
  • Business metrics. Standardized metric_definitions table in DuckDB with starter pack importer (da metrics import).
  • /api/health returns version, channel, commit_sha, image_tag, schema_version.
  • Custom connector mount support (connectors/custom/).
  • OpenAPI snapshot test for breaking-change detection.
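
One way an OpenAPI snapshot comparison can flag breaking changes is to list operations that exist in the stored snapshot but not in the current spec. The sketch below covers only that case (removed paths and methods); the function name is hypothetical and the project's actual test may compare more of the document.

```python
from typing import Any, Dict, List

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def removed_operations(old_spec: Dict[str, Any], new_spec: Dict[str, Any]) -> List[str]:
    """List 'METHOD /path' entries present in old_spec but missing in new_spec."""
    removed = []
    new_paths = new_spec.get("paths", {})
    for path, item in sorted(old_spec.get("paths", {}).items()):
        for method in sorted(item):
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            if method not in new_paths.get(path, {}):
                removed.append(f"{method.upper()} {path}")
    return removed
```

A snapshot test built on this would fail whenever the returned list is non-empty, forcing a deliberate snapshot update for any removal.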

Added — Docs / tooling

  • docs/auth-groups.md, docs/DEPLOYMENT.md, docs/HACKATHON.md, docs/ONBOARDING.md runbooks.
  • scripts/debug/probe_google_groups.py — stdlib-only probe for diagnosing Cloud Identity API issues without a deploy cycle.
  • Schema migration safety tests (idempotency, data preservation, snapshot).
  • Pre-migration snapshot of system.duckdb before schema upgrades.
  • Auto-generated JWT and session secrets with file persistence (/data/state/.jwt_secret).
  • Startup banner logging version, channel, and schema version.
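
The "auto-generated secret with file persistence" pattern above can be sketched as a load-or-create helper. The function name, the secret length, and the 0600 file mode are assumptions; only the general behaviour (generate once, reuse on later boots) comes from the entry above.

```python
import os
import secrets
from pathlib import Path


def load_or_create_secret(path: Path, nbytes: int = 32) -> str:
    """Return the secret stored at path, generating and saving one if absent."""
    try:
        existing = path.read_text(encoding="ascii").strip()
        if existing:
            return existing
    except FileNotFoundError:
        pass
    path.parent.mkdir(parents=True, exist_ok=True)
    secret = secrets.token_urlsafe(nbytes)
    # Mode 0o600 applies when the file is created: owner read/write only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="ascii") as handle:
        handle.write(secret)
    return secret
```

Because the value survives restarts, tokens signed before a restart stay valid afterwards, which would not hold for a secret regenerated at every boot.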

Changed

  • BREAKING (deployment): Caddy compose profile renamed from production to tls. Existing docker compose --profile production up -d invocations need to switch to --profile tls.
  • BREAKING (deployment): Default Caddyfile mode is now cert-file (tls /certs/fullchain.pem /certs/privkey.pem); to keep the previous Let's Encrypt auto-issue behaviour, set CADDY_TLS=tls <ops-email> in .env. See docs/auth-groups.md and Caddyfile inline docs.
  • Schema migration v5→v6→v7: adds users.active, personal_access_tokens table, personal_access_tokens.last_used_ip. Auto-applied at boot.
  • Image-level AGNES_VERSION now sourced from pyproject.toml at build time (no more drift between da --version and the package metadata).
  • Vendor-agnostic OSS rule codified in CLAUDE.md — customer-specific names, hostnames, project IDs belong in consumer infra repos, not in this OSS distribution.
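
A stepwise, boot-time migration such as v5→v6→v7 is commonly structured as a table of per-version steps applied until the target version is reached. The sketch below uses sqlite3 from the standard library as a stand-in for DuckDB; the schema_version table, the column types, and the split of changes between the two steps are assumptions.

```python
import sqlite3

# Each key N holds the statements that upgrade the schema from version N to
# N + 1. Which change belongs to which step is assumed for illustration.
MIGRATIONS = {
    5: [
        "ALTER TABLE users ADD COLUMN active INTEGER NOT NULL DEFAULT 1",
        "CREATE TABLE personal_access_tokens (id TEXT PRIMARY KEY, user_id TEXT NOT NULL)",
    ],
    6: [
        "ALTER TABLE personal_access_tokens ADD COLUMN last_used_ip TEXT",
    ],
}
TARGET_VERSION = 7


def current_version(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT version FROM schema_version").fetchone()[0]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply pending steps in order and return the resulting schema version."""
    version = current_version(conn)
    while version < TARGET_VERSION:
        for statement in MIGRATIONS[version]:
            conn.execute(statement)
        conn.execute("UPDATE schema_version SET version = ?", (version + 1,))
        conn.commit()  # record each completed step before starting the next
        version += 1
    return version
```

Calling `migrate` on a database that is already at the target version does nothing, which is the idempotency property the migration safety tests are described as checking.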

Fixed — Security

  • Open-redirect guard for backslash in safe_next_path.
  • SessionMiddleware max_age=3600 + https_only (was browser-session forever, plain-HTTP-OK).
  • Timezone-aware datetimes in Keboola metadata cache.
  • Atomic magic-link token consumption (closes double-use race under concurrent clicks).
  • Bootstrap backdoor closed when passwordless seed admin exists.
  • urllib3 1.26→2.6.3 (resolves 4 Dependabot security alerts).
  • argon2-cffi adopted for password hashing.
  • See docs/padak-security.md for the full audit.
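
The backslash case in the open-redirect guard matters because browsers treat "\" like "/", so a value such as `/\evil.example` can be followed as `//evil.example`, a protocol-relative URL to another host. A minimal sketch of such a guard follows; the signature and the extra control-character check are assumptions, and only the name safe_next_path comes from the entry above.

```python
from typing import Optional
from urllib.parse import urlsplit


def safe_next_path(candidate: Optional[str], default: str = "/") -> str:
    """Return candidate only if it is a same-site absolute path, else default."""
    if not candidate or not candidate.startswith("/"):
        return default
    # "//host" is protocol-relative; browsers also read "\" as "/".
    if candidate.startswith("//") or "\\" in candidate:
        return default
    # Control characters (tab, newline, ...) can be stripped by URL parsers.
    if any(ord(ch) < 0x20 for ch in candidate):
        return default
    parts = urlsplit(candidate)
    if parts.scheme or parts.netloc:
        return default
    return candidate
```

Here `safe_next_path("/dashboard?tab=1")` is returned unchanged, while `"//evil.example"`, `"/\\evil.example"`, and `"https://evil.example"` all fall back to the default.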

Fixed — Other

  • uvicorn --proxy-headers --forwarded-allow-ips='*' so OAuth callbacks resolve to https when behind a TLS terminator.
  • scripts/grpn/agnes-tls-rotate.sh hardened: --max-redirs 0 + --proto '=https' on cert fetch, post-fetch PEM validation (rejects HTML error pages from corp portals), ulimit -c 0 to suppress coredumps that could leak the unencrypted privkey, nounset-safe (set -u) ${arr[@]+"${arr[@]}"} expansion for possibly-empty arrays.
  • scripts/tls-fetch.sh — generic URL fetcher (sm://, gs://, https://, file://) with redirect refusal + PEM validation.
  • kbcstorage moved to optional dep — unblocks urllib3 security updates; primary Keboola path now uses the DuckDB Keboola extension.
  • Dependencies consolidated into pyproject.toml (no more requirements.txt).
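
The post-fetch PEM validation mentioned above exists to reject responses such as HTML error pages that arrive with a successful status. The scripts themselves are shell; the Python sketch below only illustrates the kind of structural check involved, with a hypothetical function name. It does not parse the certificate, and it would reject legacy encrypted keys that carry header lines inside the block.

```python
import base64
import binascii
import re

# One PEM block: matching BEGIN/END labels around a base64 body.
PEM_BLOCK_RE = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n(.+?)\r?\n-----END \1-----",
    re.DOTALL,
)


def looks_like_pem(text: str) -> bool:
    """Return True if text starts with a PEM block and every body is base64."""
    stripped = text.strip()
    if not stripped.startswith("-----BEGIN "):
        return False
    blocks = PEM_BLOCK_RE.findall(stripped)
    if not blocks:
        return False
    for _label, body in blocks:
        try:
            base64.b64decode("".join(body.split()), validate=True)
        except (binascii.Error, ValueError):
            return False
    return True
```

A fetched body such as `<html>502 Bad Gateway</html>` fails the first check, so it would not overwrite a working certificate.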

Internal

  • Test suite expanded to 1357+ tests (4 layers — unit, integration, web smoke, journey).