release(0.11.0): reset to pre-1.0 semver + first changelog (#58)

The version = "2.x" strings in earlier pyproject.toml snapshots were
arbitrary placeholders from the initial scaffold (cookiecutter default),
not a reflection of API maturity. Resetting to 0.11.0 to signal pre-1.0
status: public surface (CLI flags, REST endpoints, instance.yaml schema,
extract.duckdb contract) may still shift between minor versions.

CalVer image tags (stable-YYYY.MM.N, dev-YYYY.MM.N) continue from CI;
semver tags (v0.X.Y) are cut at release boundaries and reference the
same commit as a stable-* tag from the same day.

CHANGELOG.md replaces the old CalVer draft format with Keep a Changelog
+ semver. The 0.11.0 entry curates everything currently in main:
- Auth: Workspace groups, password reset, PAT, magic-link, seed admin pwd
- Deploy: keboola-deploy workflow, Caddy/LE/cert-file TLS, dev_instances
  TLS, optional Google OAuth from SM, LOCAL_DEV_MODE, /setup wizard
- CLI: wheel distribution, auto-update, --version, --dry-run, gzip
- Data: remote query (BQ+DuckDB), business metrics, OpenAPI snapshot test
- Security: padak-security.md audit batch + urllib3 + argon2-cffi
- Two BREAKING items called out (Caddy profile rename, Caddyfile default
  cert mode flipped to cert-file)
This commit is contained in:
Petr Simecek 2026-04-26 01:05:55 +02:00 committed by GitHub
parent c25fd41bf7
commit 598f186eb1
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 79 additions and 24 deletions

View file

@ -1,33 +1,88 @@
# Changelog # Changelog
All notable changes to Agnes AI Data Analyst are documented in this file. All notable changes to Agnes AI Data Analyst.
Format: [CalVer](https://calver.org/) `YYYY.MM.N` with channels `stable` and `dev`. Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html), pre-1.0 — public surface (CLI flags, REST endpoints, `instance.yaml` schema, `extract.duckdb` contract) may shift between minor versions; breaking changes called out under **Changed** or **Removed** with the **BREAKING** marker.
CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every CI build; semver tags (`v0.X.Y`) are cut at release boundaries and reference the same commit as a `stable-*` tag from the same day.
--- ---
## stable-2026.04.1 (unreleased) ## [0.11.0] — 2026-04-26
Multi-instance deployment and self-service setup. First tagged semver release. The `version = "2.x"` strings that appeared in earlier `pyproject.toml` snapshots were arbitrary placeholders from the initial scaffold and never reflected actual API maturity — resetting to pre-1.0 to signal that things may still shift.
### Added ### Added — Auth
- CalVer versioning with `stable` and `dev` release channels
- `/api/health` now returns `version`, `channel`, and `schema_version`
- Auto-generated JWT and session secrets with file persistence (`/data/state/.jwt_secret`)
- Pre-migration snapshot of `system.duckdb` before schema upgrades
- `POST /api/admin/configure` for headless data source configuration
- `POST /api/admin/discover-and-register` combined table discovery and registration
- `/setup` web wizard for first-time instance setup
- `scripts/smoke-test.sh` for post-deploy verification
- Smoke test job in CI (Docker-in-CI after every release)
- OpenAPI snapshot test for breaking change detection
- Custom connector mount support (`connectors/custom/`)
- Startup banner logging version, channel, and schema version
- Schema migration safety tests (idempotency, data preservation, snapshot)
- `CHANGELOG.md` and release notes template
### Breaking Changes - **Google Workspace groups on `/profile`.** OAuth callback fetches the signed-in user's group memberships via Cloud Identity (`searchTransitiveGroups` with the `security` label — see `docs/auth-groups.md` for the GCP setup checklist and the `security`-vs-`discussion_forum` gotcha). Profile link added to the user dropdown.
None. - **Password reset + invite flows** for web and admin (`/auth/password/reset`, `/admin/users/invite`).
- **Personal access tokens (PAT)** with separate `:typ=pat` JWT claim, per-token revoke, last-used IP tracking, "My tokens" + admin "All tokens" UI.
- **Email magic-link provider** (itsdangerous-signed token).
- **Optional `SEED_ADMIN_PASSWORD`** to pre-hash the seed admin (dev convenience).
### Migration Guide ### Added — Deploy
No action required. Existing instances upgrade seamlessly.
- **`keboola-deploy.yml` workflow.** Tag-triggered alternative to `release.yml` for shared dev VMs that want explicit "deploy when I tag" semantics. Publishes immutable `:keboola-deploy-<tag>` + floating `:keboola-deploy-latest` alias.
- **Caddy + Let's Encrypt + corporate-CA TLS.** `Caddyfile` parametrized via `$CADDY_TLS` env var so a single file serves three regimes: cert-file (corp PKI), Let's Encrypt auto-issue, Caddy-internal-CA. URL-driven cert rotation with self-signed fallback (`scripts/grpn/agnes-tls-rotate.sh`). `docker-compose.tls.yml` overlay closes host `:8000` when Caddy fronts.
- **`dev_instances` schema in `customer-instance` Terraform module** gains optional `tls_mode` + `domain` (mirrors `prod_instance`). `infra-v1.6.0` tag.
- **Optional Google OAuth credentials from Secret Manager.** Module reads `google-oauth-client-{id,secret}` at boot if present; graceful fallback so non-Google deployments aren't affected.
- **`LOCAL_DEV_MODE` + `make local-dev-up` / `local-dev-down`** for one-keystroke local stack with magic-link auth pre-wired.
- **Per-developer `dev-<prefix>-latest` GHCR alias** for branches matching `<prefix>/<branch>` — push-to-deploy on personal dev VMs.
- **`/setup` web wizard** for first-time instance setup, plus headless `POST /api/admin/configure` and `POST /api/admin/discover-and-register`.
- **Smoke-test job in CI** (Docker-in-CI after every release) + `scripts/smoke-test.sh` for post-deploy verification.
### Added — CLI
- **Wheel distribution** + auto-update check on startup.
- `--version` flag, `--dry-run` + `X/N` progress on `da sync`, durable sync (atomic writes + manifest hash + retry on transient errors).
- gzip on JSON/HTML responses (server-side).
### Added — Data
- **Remote query engine.** Two-phase BigQuery + DuckDB engine for tables too large to sync locally (`--register-bq` flag).
- **Business metrics.** Standardized `metric_definitions` table in DuckDB with starter pack importer (`da metrics import`).
- **`/api/health`** returns `version`, `channel`, `commit_sha`, `image_tag`, `schema_version`.
- **Custom connector mount support** (`connectors/custom/`).
- **OpenAPI snapshot test** for breaking-change detection.
### Added — Docs / tooling
- `docs/auth-groups.md`, `docs/DEPLOYMENT.md`, `docs/HACKATHON.md`, `docs/ONBOARDING.md` runbooks.
- `scripts/debug/probe_google_groups.py` — stdlib-only probe for diagnosing Cloud Identity API issues without a deploy cycle.
- Schema migration safety tests (idempotency, data preservation, snapshot).
- Pre-migration snapshot of `system.duckdb` before schema upgrades.
- Auto-generated JWT and session secrets with file persistence (`/data/state/.jwt_secret`).
- Startup banner logging version, channel, and schema version.
### Changed
- **BREAKING (deployment)** — Caddy compose profile renamed `production``tls`. Existing `docker compose --profile production up -d` invocations need to switch.
- **BREAKING (deployment)** — Default `Caddyfile` mode is now cert-file (`tls /certs/fullchain.pem /certs/privkey.pem`); for the previous Let's Encrypt auto-issue behaviour set `CADDY_TLS=tls <ops-email>` in `.env`. See `docs/auth-groups.md` and `Caddyfile` inline docs.
- Schema migration v5→v6→v7: adds `users.active`, `personal_access_tokens` table, `personal_access_tokens.last_used_ip`. Auto-applied at boot.
- Image-level `AGNES_VERSION` now sourced from `pyproject.toml` at build time (no more drift between `da --version` and the package metadata).
- **Vendor-agnostic OSS rule** codified in `CLAUDE.md` — customer-specific names, hostnames, project IDs belong in consumer infra repos, not in this OSS distribution.
### Fixed — Security
- Open-redirect guard for backslash in `safe_next_path`.
- `SessionMiddleware max_age=3600 + https_only` (was browser-session forever, plain-HTTP-OK).
- Timezone-aware datetimes in Keboola metadata cache.
- Atomic magic-link token consumption (closes double-use race under concurrent clicks).
- Bootstrap backdoor closed when passwordless seed admin exists.
- urllib3 1.26→2.6.3 (resolves 4 Dependabot security alerts).
- argon2-cffi adopted for password hashing.
- See [docs/padak-security.md](docs/padak-security.md) for the full audit.
### Fixed — Other
- `uvicorn --proxy-headers --forwarded-allow-ips='*'` so OAuth callbacks resolve to https when behind a TLS terminator.
- `scripts/grpn/agnes-tls-rotate.sh` hardened: `--max-redirs 0` + `--proto '=https'` on cert fetch, post-fetch PEM validation (rejects HTML error pages from corp portals), `ulimit -c 0` to suppress coredumps that could leak the unencrypted privkey, POSIX-safe `${arr[@]+"${arr[@]}"}` array expansion.
- `scripts/tls-fetch.sh` — generic URL fetcher (`sm://`, `gs://`, `https://`, `file://`) with redirect refusal + PEM validation.
- `kbcstorage` moved to optional dep — unblocks urllib3 security updates; primary Keboola path now uses the DuckDB Keboola extension.
- Dependencies consolidated into `pyproject.toml` (no more `requirements.txt`).
### Internal
- Test suite expanded to 1357+ tests (4 layers — unit, integration, web smoke, journey).
[0.11.0]: https://github.com/keboola/agnes-the-ai-analyst/releases/tag/v0.11.0

View file

@ -1,6 +1,6 @@
[project] [project]
name = "agnes-the-ai-analyst" name = "agnes-the-ai-analyst"
version = "2.1.0" version = "0.11.0"
description = "Agnes — AI Data Analyst platform for AI analytical systems" description = "Agnes — AI Data Analyst platform for AI analytical systems"
requires-python = ">=3.11,<3.14" requires-python = ">=3.11,<3.14"
license = "MIT" license = "MIT"