From 598f186eb15384195245b39789dd47e3420282e5 Mon Sep 17 00:00:00 2001
From: Petr Simecek
Date: Sun, 26 Apr 2026 01:05:55 +0200
Subject: [PATCH] release(0.11.0): reset to pre-1.0 semver + first changelog (#58)

The version = "2.x" strings in earlier pyproject.toml snapshots were
arbitrary placeholders from the initial scaffold (cookiecutter default),
not a reflection of API maturity. Resetting to 0.11.0 to signal pre-1.0
status: public surface (CLI flags, REST endpoints, instance.yaml schema,
extract.duckdb contract) may still shift between minor versions.

CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) continue from CI;
semver tags (v0.X.Y) are cut at release boundaries and reference the
same commit as a stable-* tag from the same day.

CHANGELOG.md replaces the old CalVer draft format with Keep a
Changelog + semver. The 0.11.0 entry curates everything currently in
main:

- Auth: Workspace groups, password reset, PAT, magic-link, seed admin pwd
- Deploy: keboola-deploy workflow, Caddy/LE/cert-file TLS, dev_instances
  TLS, optional Google OAuth from SM, LOCAL_DEV_MODE, /setup wizard
- CLI: wheel distribution, auto-update, --version, --dry-run, gzip
- Data: remote query (BQ+DuckDB), business metrics, OpenAPI snapshot test
- Security: padak-security.md audit batch + urllib3 + argon2-cffi
- Two BREAKING items called out (Caddy profile rename, Caddyfile default
  cert mode flipped to cert-file)
---
 CHANGELOG.md   | 101 ++++++++++++++++++++++++++++++++++++++-----------
 pyproject.toml |   2 +-
 2 files changed, 79 insertions(+), 24 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 180ecee..572751d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,33 +1,88 @@
 # Changelog
 
-All notable changes to Agnes AI Data Analyst are documented in this file.
+All notable changes to Agnes AI Data Analyst.
 
-Format: [CalVer](https://calver.org/) `YYYY.MM.N` with channels `stable` and `dev`.
+Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html), pre-1.0 — public surface (CLI flags, REST endpoints, `instance.yaml` schema, `extract.duckdb` contract) may shift between minor versions; breaking changes called out under **Changed** or **Removed** with the **BREAKING** marker.
+
+CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every CI build; semver tags (`v0.X.Y`) are cut at release boundaries and reference the same commit as a `stable-*` tag from the same day.
 
 ---
 
-## stable-2026.04.1 (unreleased)
+## [0.11.0] — 2026-04-26
 
-Multi-instance deployment and self-service setup.
+First tagged semver release. The `version = "2.x"` strings that appeared in earlier `pyproject.toml` snapshots were arbitrary placeholders from the initial scaffold and never reflected actual API maturity — resetting to pre-1.0 to signal that things may still shift.
 
-### Added
-- CalVer versioning with `stable` and `dev` release channels
-- `/api/health` now returns `version`, `channel`, and `schema_version`
-- Auto-generated JWT and session secrets with file persistence (`/data/state/.jwt_secret`)
-- Pre-migration snapshot of `system.duckdb` before schema upgrades
-- `POST /api/admin/configure` for headless data source configuration
-- `POST /api/admin/discover-and-register` combined table discovery and registration
-- `/setup` web wizard for first-time instance setup
-- `scripts/smoke-test.sh` for post-deploy verification
-- Smoke test job in CI (Docker-in-CI after every release)
-- OpenAPI snapshot test for breaking change detection
-- Custom connector mount support (`connectors/custom/`)
-- Startup banner logging version, channel, and schema version
-- Schema migration safety tests (idempotency, data preservation, snapshot)
-- `CHANGELOG.md` and release notes template
+### Added — Auth
 
-### Breaking Changes
-None.
+- **Google Workspace groups on `/profile`.** OAuth callback fetches the signed-in user's group memberships via Cloud Identity (`searchTransitiveGroups` with the `security` label — see `docs/auth-groups.md` for the GCP setup checklist and the `security`-vs-`discussion_forum` gotcha). Profile link added to the user dropdown.
+- **Password reset + invite flows** for web and admin (`/auth/password/reset`, `/admin/users/invite`).
+- **Personal access tokens (PAT)** with separate `:typ=pat` JWT claim, per-token revoke, last-used IP tracking, "My tokens" + admin "All tokens" UI.
+- **Email magic-link provider** (itsdangerous-signed token).
+- **Optional `SEED_ADMIN_PASSWORD`** to pre-hash the seed admin (dev convenience).
 
-### Migration Guide
-No action required. Existing instances upgrade seamlessly.
+### Added — Deploy
+
+- **`keboola-deploy.yml` workflow.** Tag-triggered alternative to `release.yml` for shared dev VMs that want explicit "deploy when I tag" semantics. Publishes immutable `:keboola-deploy-` + floating `:keboola-deploy-latest` alias.
+- **Caddy + Let's Encrypt + corporate-CA TLS.** `Caddyfile` parametrized via `$CADDY_TLS` env var so a single file serves three regimes: cert-file (corp PKI), Let's Encrypt auto-issue, Caddy-internal-CA. URL-driven cert rotation with self-signed fallback (`scripts/grpn/agnes-tls-rotate.sh`). `docker-compose.tls.yml` overlay closes host `:8000` when Caddy fronts.
+- **`dev_instances` schema in `customer-instance` Terraform module** gains optional `tls_mode` + `domain` (mirrors `prod_instance`). `infra-v1.6.0` tag.
+- **Optional Google OAuth credentials from Secret Manager.** Module reads `google-oauth-client-{id,secret}` at boot if present; graceful fallback so non-Google deployments aren't affected.
+- **`LOCAL_DEV_MODE` + `make local-dev-up` / `local-dev-down`** for one-keystroke local stack with magic-link auth pre-wired.
+- **Per-developer `dev--latest` GHCR alias** for branches matching `/` — push-to-deploy on personal dev VMs.
+- **`/setup` web wizard** for first-time instance setup, plus headless `POST /api/admin/configure` and `POST /api/admin/discover-and-register`.
+- **Smoke-test job in CI** (Docker-in-CI after every release) + `scripts/smoke-test.sh` for post-deploy verification.
+
+### Added — CLI
+
+- **Wheel distribution** + auto-update check on startup.
+- `--version` flag, `--dry-run` + `X/N` progress on `da sync`, durable sync (atomic writes + manifest hash + retry on transient errors).
+- gzip on JSON/HTML responses (server-side).
+
+### Added — Data
+
+- **Remote query engine.** Two-phase BigQuery + DuckDB engine for tables too large to sync locally (`--register-bq` flag).
+- **Business metrics.** Standardized `metric_definitions` table in DuckDB with starter pack importer (`da metrics import`).
+- **`/api/health`** returns `version`, `channel`, `commit_sha`, `image_tag`, `schema_version`.
+- **Custom connector mount support** (`connectors/custom/`).
+- **OpenAPI snapshot test** for breaking-change detection.
+
+### Added — Docs / tooling
+
+- `docs/auth-groups.md`, `docs/DEPLOYMENT.md`, `docs/HACKATHON.md`, `docs/ONBOARDING.md` runbooks.
+- `scripts/debug/probe_google_groups.py` — stdlib-only probe for diagnosing Cloud Identity API issues without a deploy cycle.
+- Schema migration safety tests (idempotency, data preservation, snapshot).
+- Pre-migration snapshot of `system.duckdb` before schema upgrades.
+- Auto-generated JWT and session secrets with file persistence (`/data/state/.jwt_secret`).
+- Startup banner logging version, channel, and schema version.
+
+### Changed
+
+- **BREAKING (deployment)** — Caddy compose profile renamed `production` → `tls`. Existing `docker compose --profile production up -d` invocations need to switch.
+- **BREAKING (deployment)** — Default `Caddyfile` mode is now cert-file (`tls /certs/fullchain.pem /certs/privkey.pem`); for the previous Let's Encrypt auto-issue behaviour set `CADDY_TLS=tls ` in `.env`. See `docs/auth-groups.md` and `Caddyfile` inline docs.
+- Schema migration v5→v6→v7: adds `users.active`, `personal_access_tokens` table, `personal_access_tokens.last_used_ip`. Auto-applied at boot.
+- Image-level `AGNES_VERSION` now sourced from `pyproject.toml` at build time (no more drift between `da --version` and the package metadata).
+- **Vendor-agnostic OSS rule** codified in `CLAUDE.md` — customer-specific names, hostnames, project IDs belong in consumer infra repos, not in this OSS distribution.
+
+### Fixed — Security
+
+- Open-redirect guard for backslash in `safe_next_path`.
+- `SessionMiddleware max_age=3600 + https_only` (was browser-session forever, plain-HTTP-OK).
+- Timezone-aware datetimes in Keboola metadata cache.
+- Atomic magic-link token consumption (closes double-use race under concurrent clicks).
+- Bootstrap backdoor closed when passwordless seed admin exists.
+- urllib3 1.26→2.6.3 (resolves 4 Dependabot security alerts).
+- argon2-cffi adopted for password hashing.
+- See [docs/padak-security.md](docs/padak-security.md) for the full audit.
+
+### Fixed — Other
+
+- `uvicorn --proxy-headers --forwarded-allow-ips='*'` so OAuth callbacks resolve to https when behind a TLS terminator.
+- `scripts/grpn/agnes-tls-rotate.sh` hardened: `--max-redirs 0` + `--proto '=https'` on cert fetch, post-fetch PEM validation (rejects HTML error pages from corp portals), `ulimit -c 0` to suppress coredumps that could leak the unencrypted privkey, POSIX-safe `${arr[@]+"${arr[@]}"}` array expansion.
+- `scripts/tls-fetch.sh` — generic URL fetcher (`sm://`, `gs://`, `https://`, `file://`) with redirect refusal + PEM validation.
+- `kbcstorage` moved to optional dep — unblocks urllib3 security updates; primary Keboola path now uses the DuckDB Keboola extension.
+- Dependencies consolidated into `pyproject.toml` (no more `requirements.txt`).
+
+### Internal
+
+- Test suite expanded to 1357+ tests (4 layers — unit, integration, web smoke, journey).
+
+[0.11.0]: https://github.com/keboola/agnes-the-ai-analyst/releases/tag/v0.11.0
diff --git a/pyproject.toml b/pyproject.toml
index e237627..f004e78 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "agnes-the-ai-analyst"
-version = "2.1.0"
+version = "0.11.0"
 description = "Agnes — AI Data Analyst platform for AI analytical systems"
 requires-python = ">=3.11,<3.14"
 license = "MIT"