agnes-the-ai-analyst/docs/DEPLOYMENT.md
ZdenekSrotyr 0121354596 docs: refresh DEPLOYMENT.md and ONBOARDING.md for infra-v1.4.0
- docs/DEPLOYMENT.md: rewritten to pick between Terraform (managed) and
  Docker Compose (OSS self-host). Old manual SSH-key-and-git-clone flow
  replaced with compose-based instructions pointing at the persistent-disk
  overlay and bootstrap endpoint.
- docs/ONBOARDING.md: section 4 now documents the new v1.4.0 variables
  (runtime_secrets, firewall_ssh_source_ranges, notification_channel_ids,
  compose_ref). Section 6 explains the /auth/bootstrap seed-user fix and
  warns that destroy+apply reopens the bootstrap window until run again.
- README.md: Documentation list expanded — ONBOARDING.md first (recommended
  path), DEPLOYMENT.md as the branching point, plus links to CONFIGURATION,
  architecture, and QUICKSTART.
2026-04-21 20:07:43 +02:00


# Deployment Guide
Agnes supports two deployment paths. Pick the one that matches your use case.
## 1. Terraform — managed, multi-customer (recommended)
For Keboola-operated deployments and anyone running Agnes for multiple customers on GCP.
**Follow:** [`ONBOARDING.md`](ONBOARDING.md)
Highlights:
- Per-customer GCP project + private infra repo cloned from [`keboola/agnes-infra-template`](https://github.com/keboola/agnes-infra-template)
- Reusable Terraform module `infra/modules/customer-instance` (versioned — `infra-vX.Y.Z` tags)
- Prod + optional branch-aware dev VMs
- Persistent SSD data disk with daily snapshots
- Secret Manager for tokens (no plaintext in VM metadata)
- OS Login for SSH, dedicated VM service account with scoped `secretAccessor`
- Cron-based auto-upgrade (pulls `:stable` image digest every 5 min)
- Caddy + Let's Encrypt TLS (opt-in with domain)
- Uptime check + alert policy per VM (wire a notification channel to be paged)
- CI/CD in the private repo: PR → `terraform plan`, merge to main → `apply-dev` auto, `apply-prod` gated by reviewer
- First-boot bootstrap via `POST /auth/bootstrap`
Target onboarding time: **< 1 hour** per customer.
## 2. Docker Compose — OSS self-host
For running Agnes on your own VM / bare metal without Terraform. You're responsible for provisioning and maintenance.
### Prerequisites
- Ubuntu 24.04 (or any Linux with Docker)
- 2 vCPU, 2 GB RAM, 30 GB SSD minimum
- Docker Engine + Compose plugin
- Public IP with ports 80/443 (if using Caddy TLS) or 8000 (plain HTTP) open
- Data-source credentials (e.g., Keboola Storage token)
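The hardware and Docker prerequisites above can be checked with a short preflight script before you start. This is a minimal sketch, assuming a Linux host that provides `nproc`, `df`, and `/proc/meminfo`. The function names (`meets_min`, `preflight`) and the thresholds are illustrative assumptions, not part of Agnes. The RAM and disk thresholds sit slightly below 2 GB and 30 GB because a host reports less than its nominal size.

```shell
#!/bin/sh
# Preflight sketch for the prerequisites above.
# Function names and thresholds are illustrative assumptions, not part of Agnes.

# meets_min LABEL ACTUAL REQUIRED: print a verdict, return non-zero if short.
meets_min() {
  if [ "$2" -ge "$3" ]; then
    echo "ok: $1 = $2 (need >= $3)"
  else
    echo "FAIL: $1 = $2 (need >= $3)"
    return 1
  fi
}

preflight() {
  rc=0
  cpus=$(nproc)
  # MemTotal is reported in KiB; convert to MiB.
  mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
  # Total size of the root filesystem: 1 KiB blocks converted to GiB.
  disk_gb=$(df -Pk / | awk 'NR == 2 {print int($2 / 1024 / 1024)}')

  meets_min "vCPUs" "$cpus" 2 || rc=1
  meets_min "RAM (MiB)" "$mem_mb" 1800 || rc=1
  meets_min "root disk (GiB)" "$disk_gb" 28 || rc=1

  # `docker compose version` succeeds only when the Compose plugin is installed.
  if docker compose version >/dev/null 2>&1; then
    echo "ok: Docker Engine + Compose plugin"
  else
    echo "FAIL: Docker Engine or Compose plugin missing"
    rc=1
  fi
  return "$rc"
}
```

Run `preflight` on the target host; a non-zero exit status means at least one check failed. Open ports and data-source credentials are not covered by this sketch.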
### Steps
1. Clone the Agnes repository:
```bash
git clone https://github.com/keboola/agnes-the-ai-analyst.git /opt/agnes
cd /opt/agnes
```
2. Create `.env`:
```bash
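# The EOF delimiter is unquoted so that $(openssl ...) is expanded when the
# file is written. Escape any literal $ in the values you paste (\$).
cat > .env <<EOF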
JWT_SECRET_KEY=$(openssl rand -hex 32)
DATA_DIR=/data
DATA_SOURCE=keboola
KEBOOLA_STORAGE_TOKEN=<your-token>
KEBOOLA_STACK_URL=<your-stack-url>
SEED_ADMIN_EMAIL=<your-email>
LOG_LEVEL=info
AGNES_TAG=stable
EOF
chmod 600 .env
```
3. Start the stack. Mounting a persistent disk at `/data` is optional but recommended, because the data then survives a host rebuild. If you mounted one, add the host-mount overlay:
```bash
docker compose \
-f docker-compose.yml \
-f docker-compose.prod.yml \
-f docker-compose.host-mount.yml \
up -d
```
Without a persistent disk, the data lives in a Docker named volume on the boot disk:
```bash
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```
4. Bootstrap your admin password via `POST /auth/bootstrap`:
```bash
curl -X POST http://<host>:8000/auth/bootstrap \
-H "Content-Type: application/json" \
-d '{"email":"<your-email>","password":"<strong-password>"}'
```
5. Open `http://<host>:8000/login` and sign in.
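Steps 2 to 4 can be guarded with a few helper functions: one that catches placeholders or an unexpanded `$(...)` left in `.env`, one that builds the `-f` flags for step 3, and one that retries a command until the service answers. This is a hedged sketch. The function names (`env_value`, `check_env_file`, `compose_files`, `retry`) are hypothetical, the list of checked keys is an assumption drawn from step 2, and treating a successful response from `/login` as "ready" is also an assumption.

```shell
#!/bin/sh
# Helper sketch for steps 2-4. Function names are illustrative assumptions;
# only the file names, variables, and URLs come from the guide.

# env_value FILE KEY: print the value of KEY from a KEY=VALUE file.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# check_env_file FILE: fail if a checked key is empty, still a placeholder,
# or holds a command substitution that was never expanded.
check_env_file() {
  rc=0
  for key in JWT_SECRET_KEY DATA_DIR DATA_SOURCE SEED_ADMIN_EMAIL; do
    value=$(env_value "$1" "$key")
    case "$value" in
      "")      echo "FAIL: $key is missing or empty"; rc=1 ;;
      "<"*">") echo "FAIL: $key still holds a placeholder"; rc=1 ;;
      *'$('*)  echo "FAIL: $key contains an unexpanded \$(...)"; rc=1 ;;
    esac
  done
  return "$rc"
}

# compose_files HOST_MOUNT: print the -f flags for step 3.
# Pass "yes" when a persistent disk is mounted at /data.
compose_files() {
  files="-f docker-compose.yml -f docker-compose.prod.yml"
  if [ "$1" = "yes" ]; then
    files="$files -f docker-compose.host-mount.yml"
  fi
  echo "$files"
}

# retry ATTEMPTS DELAY_SECONDS COMMAND...: rerun COMMAND until it succeeds
# or ATTEMPTS runs have failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  until "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example usage (not executed here):
#   check_env_file .env
#   docker compose $(compose_files yes) up -d
#   retry 30 2 curl -fsS -o /dev/null "http://<host>:8000/login"
```

Once `retry` returns successfully, send the `POST /auth/bootstrap` request from step 4. The unquoted `$(compose_files yes)` relies on word splitting to turn the printed string into separate flags.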
### TLS (optional)
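Set `AGNES_DOMAIN` and `ACME_EMAIL` (in `.env` or inline, as in the command below), point a DNS A record at the host, then start with the `tls` profile. If you started the stack with `docker-compose.host-mount.yml`, keep that `-f` flag here too: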
```bash
AGNES_DOMAIN=agnes.example.com ACME_EMAIL=admin@example.com \
docker compose -f docker-compose.yml -f docker-compose.prod.yml --profile tls up -d
```
### Upgrades (manual)
```bash
cd /opt/agnes
git pull
docker compose -f docker-compose.yml -f docker-compose.prod.yml pull
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```
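Use the same `-f` flags as in your initial `up -d`; leaving out `docker-compose.host-mount.yml` would start the containers without the `/data` host mount. To automate upgrades, set up a cron job; see `infra/modules/customer-instance/startup-script.sh.tpl` for the reference implementation.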
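For a self-hosted cron upgrade, a wrapper along these lines may be enough. It is a sketch under stated assumptions, not the reference implementation from the Terraform module. The script path, the `agnes_upgrade` function, the `DOCKER` override, and the lock file are all hypothetical, and the 5-minute schedule simply mirrors the interval quoted for the Terraform path. It refreshes images only; changes to the compose files themselves still need a `git pull`.

```shell
#!/bin/sh
# Hypothetical /usr/local/bin/agnes-upgrade.sh: a sketch, not the reference
# implementation shipped with the Terraform module.

AGNES_DIR=${AGNES_DIR:-/opt/agnes}
# Keep this identical to the -f flags used for the initial `up -d`.
COMPOSE_FILES=${COMPOSE_FILES:-"-f docker-compose.yml -f docker-compose.prod.yml"}
# Overridable so the script can be dry-run with DOCKER=echo.
DOCKER=${DOCKER:-docker}

agnes_upgrade() {
  cd "$AGNES_DIR" || return 1
  # Word splitting of $COMPOSE_FILES is intentional: it holds several flags.
  "$DOCKER" compose $COMPOSE_FILES pull --quiet || return 1
  # `up -d` recreates only containers whose image or configuration changed.
  "$DOCKER" compose $COMPOSE_FILES up -d
}

# Run the upgrade only when invoked as `agnes-upgrade.sh run`.
if [ "${1:-}" = "run" ]; then
  agnes_upgrade
fi

# Example /etc/cron.d/agnes-upgrade entry (flock prevents overlapping runs):
#   */5 * * * * root flock -n /var/lock/agnes-upgrade.lock /usr/local/bin/agnes-upgrade.sh run
```

Setting `DOCKER=echo` prints the two compose commands instead of running them, which is a quick way to confirm the flag list before installing the cron entry.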
## Which path should I pick?
| | Terraform | Docker Compose |
|---|---|---|
| Setup time | ~45 min first customer, ~15 min each subsequent | ~30 min |
| Infra-as-Code | Full (all resources in git) | Partial (Compose files only) |
| Secret storage | GCP Secret Manager | `.env` file on host |
| Upgrades | Auto via cron, gated prod apply | Manual `docker compose pull` |
| Backups | Daily GCP snapshots, 30-day retention | You set up yourself |
| Monitoring / alerts | GCP Uptime Checks + alert policy | You set up yourself |
| TLS | Auto Caddy + LE | Auto Caddy + LE (same) |
| Best for | Multi-tenant SaaS, production | Single-instance self-host, learning |
## Related documentation
- [`ONBOARDING.md`](ONBOARDING.md) — end-to-end Terraform onboarding checklist
- [`CONFIGURATION.md`](CONFIGURATION.md) — `instance.yaml`, env vars, per-instance config
- [`architecture.md`](architecture.md) — internal architecture (orchestrator, extractors, DB layout)
- [`QUICKSTART.md`](QUICKSTART.md) — local development setup
- [`superpowers/specs/2026-04-21-multi-customer-deployment-spec.md`](superpowers/specs/2026-04-21-multi-customer-deployment-spec.md) — design rationale for the multi-customer model