* feat(auth): mock session.google_groups in LOCAL_DEV_MODE via LOCAL_DEV_GROUPS
LOCAL_DEV_MODE auto-logged-in the dev user but left session.google_groups
empty, so group-aware UI/code paths can't be exercised on localhost without
a real Google OAuth round-trip. New LOCAL_DEV_GROUPS env var (JSON array
matching the production {id, name} shape) populates the session on every
dev-bypass request — same structure the OAuth callback writes, so mock and
prod stay in lockstep. Compare-then-write avoids spurious Set-Cookie noise
on PAT/CLI requests; malformed input falls back to [] with a WARNING so
the dev mock never breaks the dev flow.
* refactor(auth): fail-fast LOCAL_DEV_GROUPS at startup + cache + no-mutate
Three small follow-ups on the same dev-mock vector before merge:
- Validate LOCAL_DEV_GROUPS at app startup and report the parsed group IDs
in the LOCAL_DEV_MODE banner. A malformed value now warns loudly at boot
instead of silently logging on the first authenticated request, where
it's easy to miss.
- Cache the parsed result single-slot, keyed by the raw env-string. Avoids
re-parsing JSON on every authenticated request without test-isolation
surprises — when the env value changes, the key changes and the cache
transparently rebuilds.
- Stop mutating the parsed-input dicts (item.setdefault → spread-merge)
so the cached list stays a fresh value on every rebuild.
- Replace the try/except guard around request.session with hasattr —
SessionMiddleware is always registered, the silent except was paranoid.
Tests grow by a direct session-cookie inspection (decoupled from the
profile template) and three startup-banner log assertions.
* fix(auth): drop fragile session-decoder test + actually skip empty-target write
Two follow-ups on the LOCAL_DEV_GROUPS feature before merge:
- Drop test_session_holds_mocked_groups_directly. It manually decoded the
signed session cookie via TimestampSigner + base64, hardcoding both the
Starlette session-cookie format and the 14-day max_age. Starlette has
changed its session encoding before (URLSafeTimedSerializer pre-0.20)
and would do so again silently — the test would fail with a cryptic
BadSignature, not a clear "mock is broken" signal. The remaining
test_dev_user_sees_mocked_groups_on_profile already covers the same
observable signal (mocked groups in /profile body) without coupling to
Starlette internals.
- Actually skip the session write when target_groups is empty. The previous
comment claimed compare-then-write avoided spurious Set-Cookie noise on
PAT/CLI requests, but on those requests session.get("google_groups") is
None and target is [], so None != [] always evaluates True and the write
fired anyway, marking the session dirty and re-issuing Set-Cookie on
every request. Adding `target_groups and ...` to the guard makes the
comment honest: empty mock now genuinely no-ops, stable browser sessions
still skip via value-equality, and the only remaining write is the one
that actually changes state.
33 auth tests still pass locally.
* fix(auth): match production's always-write semantics for stale dev groups
Devin code-review finding on PR #70: my earlier `target_groups and ...`
short-circuit silently diverged from the production OAuth callback. In
app/auth/providers/google.py:189-194 the callback always writes
session.google_groups on each login — including [] on failure or empty
token — so the session always reflects authoritative current state. The
mock should match.
Failure mode the previous guard left open: a developer sets
LOCAL_DEV_GROUPS=[{...}] for a session, the groups land in the signed
cookie, then the developer unsets the env var and reloads. target → [],
session.get → [{...}], `if target_groups and ...` is False, no write,
stale groups stay in the browser session indefinitely. Mock now lies
about state until logout.
Fix splits the guard:
- target_groups truthy + value-changed → write the new mock (existing path)
- target_groups falsy + non-empty stored → write [] to clear stale state
- otherwise no-op (target [] + stored None/[]: no transition to record)
PAT/CLI requests with no prior session still take the no-op path
(target=[], session.get → None which is falsy), so the original goal of
suppressing spurious Set-Cookie noise on token traffic is preserved.
Tests already cover the populated and unset paths; the new clear-stale
branch is correct by construction (production has the same shape) and
the rare manual reset workflow.
* release(0.11.2): default mocked groups in make local-dev + docs/local-development.md
Cuts 0.11.2 around the LOCAL_DEV_GROUPS work plus a small dev-experience
follow-up: every `make local-dev` now boots with two sensible default
mocked groups (Local Dev Engineers + Local Dev Admins on example.com),
so /profile and group-aware code paths render something realistic
without the operator having to discover and set LOCAL_DEV_GROUPS.
Layered so the default lives in the workflow, not the contract:
- scripts/run-local-dev.sh seeds LOCAL_DEV_GROUPS via shell ":="
syntax — only sets the var when the operator hasn't already.
Override: LOCAL_DEV_GROUPS='[...]' make local-dev. Disable:
LOCAL_DEV_GROUPS= make local-dev.
- docker-compose.local-dev.yml swaps the commented JSON example for
a bare `- LOCAL_DEV_GROUPS` passthrough — the value comes from the
shell, the compose file just propagates it. Operators running
`docker compose up` directly without the wrapper script get an
empty mock (correct: they didn't opt into the make-driven defaults).
- Makefile help line mentions the mocked groups so the behavior is
visible without grepping.
New docs/local-development.md consolidates dev-onboarding instructions
that were previously scattered across docker-compose.local-dev.yml
inline comments, docs/auth-groups.md "Local-dev mock" section, the
Makefile help text, and CLAUDE.md "First-Time Setup". Single page now
covers TL;DR, what LOCAL_DEV_MODE actually bypasses, group mocking
controls + verification, what is *not* mocked (Cloud Identity, real
OAuth, admin Workspace permissions), and the safety rails that keep
the dev shortcuts off production.
Version bump 0.11.1 → 0.11.2 in pyproject.toml, CHANGELOG cuts
[Unreleased] → [0.11.2] — 2026-04-26 with a fresh empty [Unreleased]
skeleton.
* fix(local-dev): default LOCAL_DEV_GROUPS truncated by shell parameter expansion
Reported by an operator running `make local-dev` against the freshly
released 0.11.2 — the LOCAL_DEV_MODE banner showed:
LOCAL_DEV_GROUPS is not valid JSON, ignoring:
Expecting ',' delimiter: line 1 column 70 (char 69)
LOCAL_DEV_GROUPS is set but produced no valid groups —
check the WARNING above for the parse error.
Cause: the default value lived inside `${LOCAL_DEV_GROUPS:=…}` parameter
expansion. Bash matches `}` to close the expansion at the *first* `}`
encountered in the body, regardless of context — even one inside a
nested JSON object literal. The two-element JSON array was therefore
truncated to the first group's closing brace, leaving an unparseable
fragment:
[{"id":"local-dev-engineers@example.com","name":"Local Dev Engineers"
There is no escaping syntax for `}` inside parameter expansion (the
backslash escapes I had only escaped the quotes — `}` reaches bash
literally). Fix: hold the default in a single-quoted variable and
reference it through `${LOCAL_DEV_GROUPS:-$DEFAULT_LOCAL_DEV_GROUPS}`.
The variable's value is opaque to the expansion — no `}` matching
inside it — so the JSON survives intact. Verified with `python -m json`:
parsed OK: 2 groups: ['local-dev-engineers@example.com',
'local-dev-admins@example.com']
Operators on a running 0.11.2 stack: `make local-dev-down && make
local-dev` to pick up the corrected default.
* fix(local-dev): respect LOCAL_DEV_GROUPS= disable path + add 0.11.2 changelog link
Two follow-ups from a Devin code-review pass on PR #70:
- run-local-dev.sh: switch ${LOCAL_DEV_GROUPS:-$DEFAULT} to
${LOCAL_DEV_GROUPS-$DEFAULT} (no leading colon). The :- form
substitutes the default when the variable is unset OR set-but-empty,
silently overwriting the documented disable knob. Three places
promise this works — docs/local-development.md, the CHANGELOG entry,
and the script's own comment — so the bug was an operator-facing
lie, not just an implementation detail. The bare - form only
substitutes on unset, so `LOCAL_DEV_GROUPS= make local-dev` now
reaches the Python parser as "" and short-circuits to []. Verified
with both empty and unset shells.
- CHANGELOG.md: add the [0.11.2] link reference at the bottom.
Keep-a-Changelog convention is to mirror every version heading
with a release-tag link in the footer; the 0.11.2 heading was
missing its counterpart, breaking the Markdown link rendering on
GitHub.
---------
Co-authored-by: Claude <noreply@anthropic.com>
121 lines
10 KiB
Markdown
121 lines
10 KiB
Markdown
# Changelog
|
|
|
|
All notable changes to Agnes AI Data Analyst.
|
|
|
|
Format: [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). Versions follow [Semantic Versioning](https://semver.org/spec/v2.0.0.html), pre-1.0 — public surface (CLI flags, REST endpoints, `instance.yaml` schema, `extract.duckdb` contract) may shift between minor versions; breaking changes called out under **Changed** or **Removed** with the **BREAKING** marker.
|
|
|
|
CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every CI build; semver tags (`v0.X.Y`) are cut at release boundaries and reference the same commit as a `stable-*` tag from the same day.
|
|
|
|
---
|
|
|
|
## [Unreleased]
|
|
|
|
<!-- Add bullets here. Group: Added / Changed / Fixed / Removed / Internal.
|
|
Mark breaking changes with **BREAKING** at the start of the bullet. -->
|
|
|
|
## [0.11.2] — 2026-04-26
|
|
|
|
Dev-experience patch release — make `LOCAL_DEV_MODE` realistic enough to actually exercise group-aware code paths on `localhost`, and consolidate scattered dev-onboarding instructions into a single `docs/local-development.md`.
|
|
|
|
### Added
|
|
|
|
- **`LOCAL_DEV_GROUPS` env var** mocks `session.google_groups` for the auto-logged-in dev user when `LOCAL_DEV_MODE=1`. JSON array matching the production shape (`[{"id":"…","name":"…"}]`) so group-aware UI and access-control code paths can be exercised on `localhost` without a Google OAuth round-trip. Honored only under `LOCAL_DEV_MODE=1`. The startup banner reports the parsed group IDs (or warns loudly when the value is set but malformed), so a typo gets surfaced at boot rather than silently on the first authenticated request. Session injection mirrors the production OAuth callback's "always-write" semantics — including clearing stale groups when the operator unsets `LOCAL_DEV_GROUPS` mid-session. See `docs/auth-groups.md` → *Local-dev mock*.
|
|
- **`make local-dev` now seeds two default mocked groups** (`Local Dev Engineers` + `Local Dev Admins` on `example.com`) via `scripts/run-local-dev.sh`, so first-boot `/profile` is non-empty out of the box. Override with `LOCAL_DEV_GROUPS='[…]' make local-dev`; disable with `LOCAL_DEV_GROUPS= make local-dev`.
|
|
- **`docs/local-development.md`** — single onboarding doc for working on Agnes locally: TL;DR, what `LOCAL_DEV_MODE` actually bypasses, group mocking, what isn't mocked, and the security-rails reminder that dev mode must never reach a production deploy.
|
|
|
|
### Internal
|
|
|
|
- Fix nightly `docker-e2e` CI failures: refresh two stale assertions that had drifted from the live API. `tests/test_docker_full.py::test_app_returns_html_on_root` now expects the auth-aware `302 → /login` (root has redirected since the auth middleware landed); `tests/test_e2e_docker.py::TestDockerHealth::test_health_has_duckdb` now reads `services["duckdb_state"]` (current health-payload shape, already validated by `tests/test_api.py`). No application behavior change — these only ran in the scheduled nightly job, so the drift went unnoticed for several PRs.
|
|
|
|
## [0.11.1] — 2026-04-26
|
|
|
|
Patch release — hotfix the missed Caddy env passthrough that should have shipped with 0.11.0, plus codify changelog discipline so this kind of drift gets caught at PR review time next time.
|
|
|
|
### Fixed
|
|
|
|
- `docker-compose.yml` caddy service now passes `CADDY_TLS` through to the container (`- CADDY_TLS` bare-form passthrough). Without it the `Caddyfile` `{$CADDY_TLS:default}` substitution always falls back to cert-file mode regardless of what the operator wrote into `.env`, and Caddy crash-loops on Let's Encrypt / internal-CA deployments. Should have shipped with #52; first attempt was #55, accidentally closed before merging.
|
|
|
|
### Internal
|
|
|
|
- `CLAUDE.md` — non-negotiable changelog discipline: every PR touching user-visible behavior must update `CHANGELOG.md` under `## [Unreleased]` in the same PR.
|
|
|
|
## [0.11.0] — 2026-04-26
|
|
|
|
First tagged semver release. The `version = "2.x"` strings that appeared in earlier `pyproject.toml` snapshots were arbitrary placeholders from the initial scaffold and never reflected actual API maturity — resetting to pre-1.0 to signal that things may still shift.
|
|
|
|
### Added — Auth
|
|
|
|
- **Google Workspace groups on `/profile`.** OAuth callback fetches the signed-in user's group memberships via Cloud Identity (`searchTransitiveGroups` with the `security` label — see `docs/auth-groups.md` for the GCP setup checklist and the `security`-vs-`discussion_forum` gotcha). Profile link added to the user dropdown.
|
|
- **Password reset + invite flows** for web and admin (`/auth/password/reset`, `/admin/users/invite`).
|
|
- **Personal access tokens (PAT)** with separate `:typ=pat` JWT claim, per-token revoke, last-used IP tracking, "My tokens" + admin "All tokens" UI.
|
|
- **Email magic-link provider** (itsdangerous-signed token).
|
|
- **Optional `SEED_ADMIN_PASSWORD`** to pre-hash the seed admin (dev convenience).
|
|
|
|
### Added — Deploy
|
|
|
|
- **`keboola-deploy.yml` workflow.** Tag-triggered alternative to `release.yml` for shared dev VMs that want explicit "deploy when I tag" semantics. Publishes immutable `:keboola-deploy-<tag>` + floating `:keboola-deploy-latest` alias.
|
|
- **Caddy + Let's Encrypt + corporate-CA TLS.** `Caddyfile` parametrized via `$CADDY_TLS` env var so a single file serves three regimes: cert-file (corp PKI), Let's Encrypt auto-issue, Caddy-internal-CA. URL-driven cert rotation with self-signed fallback (`scripts/grpn/agnes-tls-rotate.sh`). `docker-compose.tls.yml` overlay closes host `:8000` when Caddy fronts.
|
|
- **`dev_instances` schema in `customer-instance` Terraform module** gains optional `tls_mode` + `domain` (mirrors `prod_instance`). `infra-v1.6.0` tag.
|
|
- **Optional Google OAuth credentials from Secret Manager.** Module reads `google-oauth-client-{id,secret}` at boot if present; graceful fallback so non-Google deployments aren't affected.
|
|
- **`LOCAL_DEV_MODE` + `make local-dev-up` / `local-dev-down`** for one-keystroke local stack with magic-link auth pre-wired.
|
|
- **Per-developer `dev-<prefix>-latest` GHCR alias** for branches matching `<prefix>/<branch>` — push-to-deploy on personal dev VMs.
|
|
- **`/setup` web wizard** for first-time instance setup, plus headless `POST /api/admin/configure` and `POST /api/admin/discover-and-register`.
|
|
- **Smoke-test job in CI** (Docker-in-CI after every release) + `scripts/smoke-test.sh` for post-deploy verification.
|
|
|
|
### Added — CLI
|
|
|
|
- **Wheel distribution** + auto-update check on startup.
|
|
- `--version` flag, `--dry-run` + `X/N` progress on `da sync`, durable sync (atomic writes + manifest hash + retry on transient errors).
|
|
- gzip on JSON/HTML responses (server-side).
|
|
|
|
### Added — Data
|
|
|
|
- **Remote query engine.** Two-phase BigQuery + DuckDB engine for tables too large to sync locally (`--register-bq` flag).
|
|
- **Business metrics.** Standardized `metric_definitions` table in DuckDB with starter pack importer (`da metrics import`).
|
|
- **`/api/health`** returns `version`, `channel`, `commit_sha`, `image_tag`, `schema_version`.
|
|
- **Custom connector mount support** (`connectors/custom/`).
|
|
- **OpenAPI snapshot test** for breaking-change detection.
|
|
|
|
### Added — Docs / tooling
|
|
|
|
- `docs/auth-groups.md`, `docs/DEPLOYMENT.md`, `docs/HACKATHON.md`, `docs/ONBOARDING.md` runbooks.
|
|
- `scripts/debug/probe_google_groups.py` — stdlib-only probe for diagnosing Cloud Identity API issues without a deploy cycle.
|
|
- Schema migration safety tests (idempotency, data preservation, snapshot).
|
|
- Pre-migration snapshot of `system.duckdb` before schema upgrades.
|
|
- Auto-generated JWT and session secrets with file persistence (`/data/state/.jwt_secret`).
|
|
- Startup banner logging version, channel, and schema version.
|
|
|
|
### Changed
|
|
|
|
- **BREAKING (deployment)** — Caddy compose profile renamed `production` → `tls`. Existing `docker compose --profile production up -d` invocations need to switch.
|
|
- **BREAKING (deployment)** — Default `Caddyfile` mode is now cert-file (`tls /certs/fullchain.pem /certs/privkey.pem`); for the previous Let's Encrypt auto-issue behaviour set `CADDY_TLS=tls <ops-email>` in `.env`. See `docs/auth-groups.md` and `Caddyfile` inline docs.
|
|
- Schema migration v5→v6→v7: adds `users.active`, `personal_access_tokens` table, `personal_access_tokens.last_used_ip`. Auto-applied at boot.
|
|
- Image-level `AGNES_VERSION` now sourced from `pyproject.toml` at build time (no more drift between `da --version` and the package metadata).
|
|
- **Vendor-agnostic OSS rule** codified in `CLAUDE.md` — customer-specific names, hostnames, project IDs belong in consumer infra repos, not in this OSS distribution.
|
|
|
|
### Fixed — Security
|
|
|
|
- Open-redirect guard for backslash in `safe_next_path`.
|
|
- `SessionMiddleware max_age=3600 + https_only` (was browser-session forever, plain-HTTP-OK).
|
|
- Timezone-aware datetimes in Keboola metadata cache.
|
|
- Atomic magic-link token consumption (closes double-use race under concurrent clicks).
|
|
- Bootstrap backdoor closed when passwordless seed admin exists.
|
|
- urllib3 1.26→2.6.3 (resolves 4 Dependabot security alerts).
|
|
- argon2-cffi adopted for password hashing.
|
|
- See [docs/padak-security.md](docs/padak-security.md) for the full audit.
|
|
|
|
### Fixed — Other
|
|
|
|
- `uvicorn --proxy-headers --forwarded-allow-ips='*'` so OAuth callbacks resolve to https when behind a TLS terminator.
|
|
- `scripts/grpn/agnes-tls-rotate.sh` hardened: `--max-redirs 0` + `--proto '=https'` on cert fetch, post-fetch PEM validation (rejects HTML error pages from corp portals), `ulimit -c 0` to suppress coredumps that could leak the unencrypted privkey, POSIX-safe `${arr[@]+"${arr[@]}"}` array expansion.
|
|
- `scripts/tls-fetch.sh` — generic URL fetcher (`sm://`, `gs://`, `https://`, `file://`) with redirect refusal + PEM validation.
|
|
- `kbcstorage` moved to optional dep — unblocks urllib3 security updates; primary Keboola path now uses the DuckDB Keboola extension.
|
|
- Dependencies consolidated into `pyproject.toml` (no more `requirements.txt`).
|
|
|
|
### Internal
|
|
|
|
- Test suite expanded to 1357+ tests (4 layers — unit, integration, web smoke, journey).
|
|
|
|
[0.11.2]: https://github.com/keboola/agnes-the-ai-analyst/releases/tag/v0.11.2
|
|
[0.11.1]: https://github.com/keboola/agnes-the-ai-analyst/releases/tag/v0.11.1
|
|
[0.11.0]: https://github.com/keboola/agnes-the-ai-analyst/releases/tag/v0.11.0
|