* refactor(ops): bake all host artifacts into image, drop every curl-from-main
Replaces the curl-from-main pattern (originally introduced in 0.25.0 for
agnes-auto-upgrade.sh; older for the compose files + Caddyfile) with image-
bundled host artifacts. Same-tag delivery for everything the host runs,
version-pinned by AGNES_TAG, atomically rolled back by reverting the image.
## Motivation
The customer-instance startup template was curling 6 files from
raw.githubusercontent.com on every VM boot:
- docker-compose.yml
- docker-compose.prod.yml
- docker-compose.host-mount.yml
- docker-compose.tls.yml
- Caddyfile
- scripts/ops/agnes-auto-upgrade.sh (added in 0.25.0)
Every one of them already lives inside the image (`COPY . .` copies the
whole repo to /app/). Curling them from the public internet duplicates
content the image already carries and introduces three problems:
1. **Split-brain version pinning.** image_tag pins the docker image to an
immutable digest. The compose files + script bypassed that pinning by
tracking `main` (or the rarely-set compose_ref). A customer pinned to
stable-2026.04.516 could wake up tomorrow with their host artifacts
floating on whatever shipped to main overnight — even though they're
explicitly pinned for stability.
2. **No rollback knob.** Reverting a bad host artifact meant reverting
the upstream PR globally, which affects every customer that reboots
after the bad commit. There was no "rollback for me only" path, and
tag-pinning gave no protection.
3. **Public-internet dependency on every boot.** The image is already
pulled from a private registry on the same boot. Reusing that channel
is strictly cheaper than adding a second one. Customers with restricted
egress (no raw.githubusercontent.com reachability) silently broke on
every boot.
## Changes
### Dockerfile (+19 -8)
After `COPY . .` and before the wheel build, an explicit `cp` lifts every
host-side artifact into a stable contract path /opt/agnes-host/:
- agnes-auto-upgrade.sh (mode 0755, host cron driver)
- docker-compose{,.prod,.host-mount,.tls}.yml
- Caddyfile (mode 0644)
Why a copy instead of pointing at /app directly: /app is owned by uid 999
(USER agnes), whereas /opt/agnes-host is root-owned, mode 0755 across the
board, and a stable path that won't shift if the /app layout is refactored.
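
The staging step described above could look roughly like the shell below, which a Dockerfile `RUN` instruction might wrap. This is a sketch, not the actual RUN block: the function name `stage_host_artifacts` is hypothetical, and the 0644 mode on the compose files is an assumption (the description only states modes for the script and the Caddyfile).

```shell
# Hypothetical helper: copy host-side artifacts from the repo checkout
# into the contract directory.
#   $1 = repo root (/app inside the image)
#   $2 = contract path (/opt/agnes-host inside the image)
stage_host_artifacts() {
  local src="$1" dest="$2" f
  mkdir -p "$dest" || return 1
  chmod 0755 "$dest" || return 1

  # The host cron driver must be executable.
  cp "$src/scripts/ops/agnes-auto-upgrade.sh" "$dest/agnes-auto-upgrade.sh" || return 1
  chmod 0755 "$dest/agnes-auto-upgrade.sh" || return 1

  # Compose files and the Caddyfile are plain config.
  # ASSUMPTION: 0644 for the compose files; only the Caddyfile mode is stated.
  for f in docker-compose.yml docker-compose.prod.yml \
           docker-compose.host-mount.yml docker-compose.tls.yml Caddyfile; do
    cp "$src/$f" "$dest/$f" || return 1
    chmod 0644 "$dest/$f" || return 1
  done
}
```

Inside the image this would be invoked as `stage_host_artifacts /app /opt/agnes-host`; failing on the first missing file keeps a broken image from being built silently.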
### infra/modules/customer-instance/startup-script.sh.tpl (+22 -36)
Replaced six curls and the standalone agnes-auto-upgrade.sh extract block
(introduced earlier in this PR) with one extract sequence in section 3:
    docker pull "$${IMAGE_REPO}:$${IMAGE_TAG}"
    EXTRACT_CONTAINER=$(docker create "$${IMAGE_REPO}:$${IMAGE_TAG}")
    trap "docker rm '$EXTRACT_CONTAINER' >/dev/null 2>&1 || true" EXIT
    docker cp "$EXTRACT_CONTAINER:/opt/agnes-host/." "$APP_DIR/"
    docker cp "$EXTRACT_CONTAINER:/opt/agnes-host/agnes-auto-upgrade.sh" /usr/local/bin/agnes-auto-upgrade.sh
    chmod +x /usr/local/bin/agnes-auto-upgrade.sh
The auto-upgrade section (#6) is now a no-op — script is already in place.
### infra/modules/customer-instance/variables.tf (+1 -1)
`compose_ref` marked DEPRECATED in description. Default unchanged for
one release cycle to avoid breaking existing terraform plans. Will be
removed in a future major bump.
### CHANGELOG.md
`### Changed` entry under [Unreleased] — supersedes the narrower entry
this PR previously had (which only covered the script).
## Out of scope (filed as follow-ups)
1. **agnes-the-ai-analyst-infra/startup.sh (operator deploy)** still
curls the same artifacts from main. Symmetric fix needed there.
Will file as a separate PR against the infra repo.
2. **Self-update inside agnes-auto-upgrade.sh** after a successful
`docker compose pull` of a new digest. Otherwise the running cron
keeps using the OLD baked-in script for one tick after image upgrade.
~10 LOC. Deferred to keep this PR scoped.
3. **scripts/ops/agnes-tls-rotate.sh** has the same shape — host-side
bash currently sourced via the infra repo. Should follow the same
bake-into-image pattern.
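
For follow-up 2, the self-update could be sketched along the following lines. This is not part of this PR and only illustrates the idea: the function name `self_update_script` and the notion of a "candidate" copy (the script already extracted from the newly pulled image to a temporary path) are assumptions.

```shell
# Hypothetical helper: replace the installed script with a newer copy.
#   $1 = candidate script extracted from the freshly pulled image
#   $2 = installed script path (e.g. /usr/local/bin/agnes-auto-upgrade.sh)
self_update_script() {
  local candidate="$1" target="$2" tmp
  [ -f "$candidate" ] || return 1

  # Nothing to do when the installed copy is already byte-identical.
  cmp -s "$candidate" "$target" && return 0

  # Write next to the target, then rename. A rename within one directory
  # is atomic, and a shell already running the old file keeps its open copy.
  tmp="$target.new.$$"
  cp "$candidate" "$tmp" && chmod 0755 "$tmp" && mv -f "$tmp" "$target"
}
```

With something like this in place, the next cron tick would run the script that matches the new image rather than the previous one.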
## Tested
- Local: `docker build .` succeeds with the new RUN block.
- `docker create` + `docker cp /opt/agnes-host/.` round-trips all 6
artifacts; sha matches each source file.
- Not yet tested on a live VM bring-up — that requires a CI image with
this Dockerfile change. **Recommend reviewer trigger CI build, then
do a single VM-recreate against a dev VM (e.g. foundryai-development)
to confirm the extract path works end-to-end before merge.**
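
The checksum round-trip in the second bullet can be reproduced with a small helper like the one below. It is a sketch under assumptions: the function names are hypothetical, and the second argument is taken to be a directory already populated via `docker cp <container>:/opt/agnes-host/. <dir>/`.

```shell
# Print the SHA-256 of a file; falls back to shasum where sha256sum is absent.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Compare each artifact in the repo checkout against the extracted copy.
#   $1 = repo root, $2 = directory holding the extracted /opt/agnes-host
# Returns non-zero if any artifact is missing or differs.
verify_host_artifacts() {
  local repo="$1" extracted="$2" rc=0 rel name
  for rel in docker-compose.yml docker-compose.prod.yml \
             docker-compose.host-mount.yml docker-compose.tls.yml \
             Caddyfile scripts/ops/agnes-auto-upgrade.sh; do
    name=$(basename "$rel")
    if [ ! -f "$extracted/$name" ]; then
      echo "MISSING  $name" >&2
      rc=1
      continue
    fi
    if [ "$(sha256_of "$repo/$rel")" = "$(sha256_of "$extracted/$name")" ]; then
      echo "OK       $name"
    else
      echo "MISMATCH $name" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Run from the repo root as `verify_host_artifacts . /tmp/agnes-host-extract` (the extract path here is only an example).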
## Compatibility
- Existing VMs running 0.25.0 are unaffected — they have host artifacts
in place from `curl from main` already; this PR doesn't touch them.
They pick up the new pattern only on next VM recreate.
- VMs pinned to an image_tag *older* than this PR (no /opt/agnes-host
in the image) would fail at the `docker cp` step. The current diff fails
loudly, with no fallback. Operators should upgrade to a fresh-enough
image_tag alongside the template upgrade, the same coupling as any
compose-flag bump.
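
The fail-loud behaviour for older images could be made more self-explanatory with a wrapper like the sketch below. It is not what the template does today; the function name `extract_host_artifacts` and the error wording are assumptions, and the v0.26.0 minimum is the release this change is expected to ship in.

```shell
# Hypothetical wrapper around the extract step with an explanatory error.
#   $1 = image reference (repo:tag), $2 = destination directory on the host
extract_host_artifacts() {
  local image="$1" dest="$2" cid rc=0
  cid=$(docker create "$image") || return 1

  if ! docker cp "$cid:/opt/agnes-host/." "$dest/"; then
    # docker cp can fail for other reasons too, so the hint is phrased as a hint.
    echo "ERROR: could not copy /opt/agnes-host from $image." >&2
    echo "       Images older than v0.26.0 do not contain it; check image_tag." >&2
    rc=1
  fi

  # Always remove the throwaway container, whatever the outcome.
  docker rm "$cid" >/dev/null 2>&1 || true
  return "$rc"
}
```

The non-zero return still stops the boot; the only difference is that the operator sees the likely cause next to the raw `docker cp` error.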
* docs(infra): document image_tag >= v0.26.0 minimum on prod/dev_instances
The new startup script extracts host artifacts from /opt/agnes-host/
inside the image — a directory added in this PR (will ship as v0.26.0).
Pinning image_tag to an older tag would fail-loud at first boot with
'docker cp: No such file or directory'. Existing VMs are unaffected
because the module ignores metadata_startup_script changes.
Devin ANALYSIS_0004 on PR #149.
* fix(changelog): mark BREAKING + drop private-repo reference
Per CLAUDE.md, breaking changes start with **BREAKING** so operators
can grep before bumping the pin. The image_tag minimum constraint
introduced here qualifies — older tags fail-loud at first boot.
Also drop the explicit 'agnes-the-ai-analyst-infra' name from the
entry; the OSS distribution shouldn't reference operator-side
deploy templates by their private-repo names. Generic 'consumer-
side deploy templates' wording instead.
Devin BUG_0001 + WARN_0001 on PR #149.
* chore(release): cut 0.26.0
---------
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
100 lines
3.1 KiB
TOML
[project]
name = "agnes-the-ai-analyst"
version = "0.26.0"
description = "Agnes — AI Data Analyst platform for AI analytical systems"
requires-python = ">=3.11,<3.14"
license = "MIT"
readme = "README.md"

dependencies = [
# Core database
"duckdb>=0.9.0",
# Web framework (FastAPI)
"fastapi>=0.115.0",
"uvicorn[standard]>=0.32.0",
"python-multipart>=0.0.26",
"jinja2>=3.1.0",
"starlette>=0.41.0",
# Authentication
"PyJWT>=2.8.0",
"itsdangerous>=2.1.0",
"authlib>=1.6.11",
"argon2-cffi>=23.1.0",
# HTTP client
"httpx>=0.27.0",
# CLI
"typer>=0.12.0",
"rich>=13.0.0",
# Configuration
"python-dotenv>=1.0.0",
"pyyaml>=6.0",
# Data processing
"pandas>=2.0.0",
"pyarrow>=12.0.0",
"pytz>=2024.1",
# SQL parsing — server-side WHERE validator for /api/v2/scan (app/api/where_validator.py)
# Minimum 30.x — older versions had walk() yielding (node, parent, key)
# tuples instead of expression nodes, which would silently bypass the
# WHERE-validator structural checks (isinstance(tuple, exp.Subquery)
# is always False). 30.x yields nodes directly.
"sqlglot>=30.0.0",
# Data source connectors
"google-cloud-bigquery>=3.0.0",
"google-cloud-bigquery-storage>=2.0.0",
# Google Workspace Cloud Identity / Admin SDK (Workspace group membership sync)
"google-api-python-client>=2.0.0",
# Profiler visualizations
"matplotlib>=3.8.0",
"numpy>=1.24.0",
# Claude Code marketplace endpoint — pure-Python git server mounted in FastAPI
"dulwich>=0.22.0",
"a2wsgi>=1.10.0",
# In-process TTL cache for marketplace etag (transitively present via
# google-auth, declared explicitly here because we depend on it directly).
"cachetools>=5.3.0",
]

[project.optional-dependencies]
# keboola-legacy: install kbcstorage>=0.9.0 manually if you need the legacy
# Keboola client fallback (primary path uses DuckDB Keboola extension)
dev = [
"pytest>=9.0.0",
"pytest-timeout>=2.0.0",
"pytest-xdist>=3.0.0",
"faker>=24.0.0",
"anthropic>=0.30.0",
"openai>=1.30.0",
# jsonschema validates the corporate-memory extraction-tool golden fixtures
# under tests/test_corporate_memory_v1.py (extraction.json, correction.json,
# confidence_calibration.json). Production code does not depend on it.
"jsonschema>=4.0.0",
# FastAPI debug toolbar — gated behind DEBUG=1 env var in app/main.py.
# Provides per-request panels (headers, routes, timer, profiling, etc.)
# for local development. Never loaded in production (no DEBUG=1 there).
"fastapi-debug-toolbar>=0.6.3",
]

[project.scripts]
da = "cli.main:app"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["app", "src", "connectors", "cli", "services", "config"]

[tool.ruff]
line-length = 120
target-version = "py313"

[tool.uv]
dev-dependencies = [
"pytest>=9.0.0",
"pytest-timeout>=2.0.0",
"pytest-xdist>=3.0.0",
"faker>=24.0.0",
"anthropic>=0.30.0",
"openai>=1.30.0",
"fastapi-debug-toolbar>=0.6.3",
]