feat(deploy): keboola-deploy tag-triggered workflow + Caddyfile LE/internal modes + dev_instances TLS support (#52)
* feat(deploy): keboola-deploy tag-triggered workflow + Caddyfile LE/internal modes + dev_instances TLS support
Three coordinated changes that together remove a foot-gun in Keboola's
internal Agnes deployment: the dev VM tracks `:dev`, i.e. the last push
from anyone in the upstream repo.
1. .github/workflows/keboola-deploy.yml — new workflow
Triggered ONLY on `keboola-deploy-*` git tag pushes (not on every branch
push like release.yml). Builds an image and publishes two GHCR tags:
    ghcr.io/keboola/agnes-the-ai-analyst:keboola-deploy-<git-tag-suffix>
    ghcr.io/keboola/agnes-the-ai-analyst:keboola-deploy-latest
The Keboola dev VM pins to `keboola-deploy-latest`; an operator deploys
by `git tag keboola-deploy-foo && git push origin keboola-deploy-foo`.
The audit trail lives in git tags (immutable, who-tagged-what-when), so no
PR cycle is needed for each deploy.
This doesn't touch the Vojta/Minas/David workflow: release.yml still builds
`:dev-<slug>` for every branch push as before.
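As a rough sketch of the tag-to-image mapping described above (not the workflow itself), the shell function below derives both GHCR references from a git tag name. The function name `keboola_image_refs` is hypothetical; the repository path is the one quoted in this commit message.

```shell
#!/bin/sh
# Hypothetical helper: map a keboola-deploy-* git tag to the two image
# references the workflow is described as publishing.
keboola_image_refs() {
    tag="$1"
    repo="ghcr.io/keboola/agnes-the-ai-analyst"
    # Same guard the workflow applies: refuse anything outside keboola-deploy-*.
    case "$tag" in
        keboola-deploy-*) ;;
        *) echo "tag $tag does not match keboola-deploy-*" >&2; return 1 ;;
    esac
    echo "$repo:$tag"                    # immutable, one per git tag
    echo "$repo:keboola-deploy-latest"   # floating alias the VM tracks
}

keboola_image_refs "keboola-deploy-foo"
```

Any tag outside the `keboola-deploy-*` pattern makes the function return non-zero, mirroring the workflow's refusal to build.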
2. Caddyfile — parametrize TLS directive via $CADDY_TLS env var
PR #51 hardcoded cert-file mode (`tls /certs/fullchain.pem ...`) for
Groupon's corporate CA flow. That broke the Let's Encrypt path the
module previously supported. Now:
    CADDY_TLS unset (default)   → cert-file mode (Groupon corp PKI)
    CADDY_TLS="tls user@x.com"  → Let's Encrypt auto-issue
    CADDY_TLS="tls internal"    → Caddy-managed self-signed (lab/dev)
Single Caddyfile, three regimes, no per-deployment fork. Validated with
`caddy validate` in all three modes.
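To make the three regimes concrete, here is a minimal sketch that imitates the `{$CADDY_TLS:default}` placeholder with shell parameter expansion. It is an analogy only: the function name `resolve_tls_directive` is hypothetical, and the shell form treats an empty value the same as an unset one, which may not match Caddy's own handling of an empty-but-set variable.

```shell
#!/bin/sh
# Hypothetical analogue of the Caddyfile placeholder
#   {$CADDY_TLS:tls /certs/fullchain.pem /certs/privkey.pem}
# showing which directive each regime resolves to.
resolve_tls_directive() {
    # ${1:-...} falls back to the cert-file default when $1 is unset or empty.
    echo "${1:-tls /certs/fullchain.pem /certs/privkey.pem}"
}

resolve_tls_directive ""                 # cert-file mode (default)
resolve_tls_directive "tls user@x.com"   # Let's Encrypt auto-issue
resolve_tls_directive "tls internal"     # Caddy-managed self-signed
```

Each mode can be checked against the real file with something like `CADDY_TLS="tls internal" caddy validate --config Caddyfile --adapter caddyfile`, assuming the Caddyfile sits in the current directory.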
3. customer-instance module — dev_instances TLS + auto-set CADDY_TLS
- variables.tf: dev_instances object schema gains optional tls_mode +
  domain (mirroring prod_instance). Defaults to "none" + "" so existing
  callers without those fields keep current behavior.
- startup-script.sh.tpl: when tls_mode="caddy" and DOMAIN is set, write
  CADDY_TLS=tls <ACME_EMAIL> (or "tls internal" when ACME_EMAIL empty)
  into /opt/agnes/.env. Caddy then picks it up and the Caddyfile
  substitution flips the cert source.
For an LE deploy: set tls_mode="caddy", domain="agnes-dev.example.com",
ensure DNS A-record points at the VM, and acme_email is set on the
module (or seed_admin_email is, since acme_email defaults to it).
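The decision the startup script makes can be sketched as a small pure function, which makes the three outcomes easy to see side by side. This is an illustration under stated assumptions: `caddy_tls_line` and its positional arguments are hypothetical, and the example domain and email are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the CADDY_TLS line for .env, or nothing when
# the Caddyfile default (cert-file mode) should apply.
caddy_tls_line() {
    tls_mode="$1"; domain="$2"; acme_email="$3"
    if [ "$tls_mode" = "caddy" ] && [ -n "$domain" ]; then
        if [ -n "$acme_email" ]; then
            echo "CADDY_TLS=tls $acme_email"   # Let's Encrypt auto-issue
        else
            echo "CADDY_TLS=tls internal"      # Caddy-managed self-signed
        fi
    fi
}

caddy_tls_line caddy agnes-dev.example.com ops@example.com
caddy_tls_line caddy agnes-dev.example.com ""
caddy_tls_line none "" ""
```

The last call prints nothing, which corresponds to leaving CADDY_TLS out of `.env` entirely.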
After this lands, tag it as infra-v1.6.0 so downstream infra repos can bump
their module ref to a fixed version instead of tracking upstream changes.
* feat(deploy): fetch optional Google OAuth credentials from Secret Manager
Mirrors the existing keboola-storage-token / agnes-<customer>-jwt-secret
pattern: VM SA reads google-oauth-client-{id,secret} secrets at boot
(if they exist + IAM is wired by caller via runtime_secrets) and writes
them into /opt/agnes/.env. Empty / missing / 403 → silent fallback
to "" so password and email auth keep working untouched.
Pairs with downstream change in agnes-infra-keboola which adds the two
secret names to runtime_secrets, granting the Keboola VM SA secretAccessor
on them. Operator pre-creates the SM containers via gcloud secrets create
google-oauth-client-{id,secret} (one-time, out of band) — values stay
in SM forever; rotation = `gcloud secrets versions add`.
This unblocks the Keboola agnes-dev deploy from PR #3 (infra) — without
GOOGLE_CLIENT_{ID,SECRET} in .env, app/auth/providers/google.is_available()
returns False and the Google sign-in button never even appears.
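The "optional secret" pattern described above can be sketched in isolation. The wrapper name `read_optional_secret` is hypothetical, and `printf` / `false` stand in for a successful and a failing `gcloud secrets versions access latest --secret=NAME` call so the sketch runs without cloud access.

```shell
#!/bin/sh
# Hypothetical wrapper: run the given command and print its stdout; on any
# failure (missing secret, 403, ...) print an empty string instead, so the
# caller can treat the secret as optional.
read_optional_secret() {
    "$@" 2>/dev/null || echo ""
}

# Stand-ins for the real gcloud calls:
GOOGLE_CLIENT_ID=$(read_optional_secret printf '%s' 'example-client-id')
GOOGLE_CLIENT_SECRET=$(read_optional_secret false)

echo "GOOGLE_CLIENT_ID=$GOOGLE_CLIENT_ID"
echo "GOOGLE_CLIENT_SECRET=$GOOGLE_CLIENT_SECRET"
```

In a real boot script the wrapper would be called as `read_optional_secret gcloud secrets versions access latest --secret=google-oauth-client-id`; an empty result leaves the Google provider disabled while other auth methods keep working.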
parent 0bbbf3e40b
commit 4799119c81
4 changed files with 146 additions and 4 deletions
.github/workflows/keboola-deploy.yml (98 lines changed, vendored, Normal file)
@@ -0,0 +1,98 @@
+name: Keboola Deploy
+
+# Tag-triggered build for Keboola's internal dev instance.
+#
+# Why a separate workflow: the default release.yml builds an image for *every* push
+# to *every* branch, which means Keboola's `agnes-dev` VM (pinned to `:dev` or
+# similar floating tag) sees whoever pushed last — Vojta, Minas, anyone. That
+# convenience for Groupon-side dev VMs (per-developer `dev-<prefix>-latest` aliases)
+# is a footgun for shared instances.
+#
+# This workflow runs ONLY when an operator explicitly creates a `keboola-deploy-*`
+# git tag. The image is published with two tags:
+#   - keboola-deploy-<git-tag-suffix>   (immutable, audit trail in git)
+#   - keboola-deploy-latest             (floating alias the VM tracks)
+#
+# Operator workflow:
+#   git checkout <commit>
+#   git tag keboola-deploy-2026-04-25-groups-test
+#   git push origin keboola-deploy-2026-04-25-groups-test
+#   # → image built, alias updated, agnes-dev cron picks it up within 5 min
+on:
+  push:
+    tags:
+      - "keboola-deploy-*"
+
+permissions:
+  contents: read
+  packages: write
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v5
+
+      - uses: actions/setup-python@v6
+        with:
+          python-version: "3.13"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7
+
+      - name: Install dependencies
+        run: uv pip install --system ".[dev]"
+
+      - name: Run tests
+        run: pytest tests/ -v --tb=short
+        env:
+          TESTING: "1"
+
+  build-and-push:
+    needs: test
+    runs-on: ubuntu-latest
+    outputs:
+      image_tag: ${{ steps.meta.outputs.tag }}
+    steps:
+      - uses: actions/checkout@v5
+
+      - name: Resolve tag + version
+        id: meta
+        run: |
+          TAG="${GITHUB_REF#refs/tags/}"
+          # Sanity: tag must start with keboola-deploy- (the `on:` filter already
+          # enforces this, but cheap belt-and-braces against future workflow edits).
+          case "$TAG" in
+            keboola-deploy-*) ;;
+            *) echo "::error::Tag $TAG does not match keboola-deploy-* — refusing to build"; exit 1 ;;
+          esac
+          # Package version: source of truth is pyproject.toml (same convention as
+          # release.yml). The git tag is the *deploy identifier*, package version
+          # is the *product identifier*.
+          PKG_VERSION=$(grep '^version' pyproject.toml | head -1 | sed -E 's/^version\s*=\s*"([^"]+)".*/\1/')
+          if [ -z "$PKG_VERSION" ]; then
+            echo "::error::Could not extract version from pyproject.toml"; exit 1
+          fi
+          echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
+          echo "pkg_version=${PKG_VERSION}" >> "$GITHUB_OUTPUT"
+          echo "Building image for git tag: ${TAG} (package version ${PKG_VERSION})"
+
+      - name: Log in to GHCR
+        uses: docker/login-action@v4
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Build and push
+        uses: docker/build-push-action@v7
+        with:
+          push: true
+          build-args: |
+            AGNES_VERSION=${{ steps.meta.outputs.pkg_version }}
+            RELEASE_CHANNEL=keboola-deploy
+            AGNES_COMMIT_SHA=${{ github.sha }}
+            AGNES_TAG=${{ steps.meta.outputs.tag }}
+          tags: |
+            ghcr.io/${{ github.repository }}:${{ steps.meta.outputs.tag }}
+            ghcr.io/${{ github.repository }}:keboola-deploy-latest
Caddyfile (15 lines changed)
@@ -1,7 +1,16 @@
 {$DOMAIN:localhost} {
-    # Cert-file mode (corporate CA path). For Let's Encrypt, drop the
-    # `tls` directive entirely so Caddy auto-issues. See docs/DEPLOYMENT.md.
-    tls /certs/fullchain.pem /certs/privkey.pem {
+    # Cert provisioning. Driven by env var CADDY_TLS:
+    #   - unset (default) → cert-file mode for corporate PKI (rotated by
+    #                       scripts/grpn/agnes-tls-rotate.sh into /data/state/certs/).
+    #   - "tls <email>"   → Let's Encrypt auto-issue, e.g. "tls ops@example.com"
+    #                       (used by public-internet deployments like Keboola dev).
+    #   - "tls internal"  → Caddy-managed self-signed cert (lab/dev only,
+    #                       browser warning on every visit).
+    #
+    # The {$VAR:default} substitution lets one Caddyfile serve all three
+    # regimes without per-deployment forks. Caddyfile parses the substituted
+    # string as a directive, so the value MUST start with `tls `.
+    {$CADDY_TLS:tls /certs/fullchain.pem /certs/privkey.pem} {
         # Modern TLS only. Caddy default already excludes 1.0/1.1 in
         # most builds, but pin explicitly so a future Caddy default
         # change can't silently weaken our posture.
startup-script.sh.tpl
@@ -69,12 +69,36 @@ if [ "$DATA_SOURCE" = "keboola" ]; then
 fi
 JWT_KEY=$(gcloud secrets versions access latest --secret=agnes-$${CUSTOMER_NAME}-jwt-secret)
 
+# Optional Google OAuth credentials. If the operator has created
+# google-oauth-client-{id,secret} secrets in the project's Secret Manager
+# AND wired them via runtime_secrets in the calling Terraform, the VM SA can
+# read them — write into .env so the Google sign-in flow works. Missing /
+# 403 / empty → silent fallback to "" so password + email auth keep working.
+GOOGLE_CLIENT_ID=$(gcloud secrets versions access latest --secret=google-oauth-client-id 2>/dev/null || echo "")
+GOOGLE_CLIENT_SECRET=$(gcloud secrets versions access latest --secret=google-oauth-client-secret 2>/dev/null || echo "")
+
 # AGNES_VERSION, RELEASE_CHANNEL, AGNES_COMMIT_SHA are baked into the image
 # itself as ENV (see Dockerfile ARG/ENV + release.yml build-args). We do NOT
 # set them here — doing so would override the image-level values with the
 # floating tag name ("stable"/"dev"), hiding the real CalVer / git SHA.
 # The app picks them up from the image's runtime environment.
 
+# CADDY_TLS controls Caddyfile cert provisioning (see Caddyfile inline docs).
+#   - tls_mode=caddy + ACME_EMAIL set → Let's Encrypt auto-issue (public domain)
+#   - tls_mode=caddy + no ACME_EMAIL  → Caddy-managed self-signed (lab use)
+#   - any other tls_mode              → leave CADDY_TLS unset, Caddyfile default
+#                                       (cert-file mode for corporate PKI) applies.
+# Operators wanting cert-file mode shouldn't set tls_mode at all on the dev
+# instance — leave it "none" and let the corp-PKI rotate scripts handle certs.
+CADDY_TLS_LINE=""
+if [ "$TLS_MODE" = "caddy" ] && [ -n "$DOMAIN" ]; then
+  if [ -n "$ACME_EMAIL" ]; then
+    CADDY_TLS_LINE="CADDY_TLS=tls $ACME_EMAIL"
+  else
+    CADDY_TLS_LINE="CADDY_TLS=tls internal"
+  fi
+fi
+
 cat > "$APP_DIR/.env" <<ENVEOF
 JWT_SECRET_KEY=$JWT_KEY
 DATA_DIR=$DATA_MNT
@@ -87,6 +111,9 @@ LOG_LEVEL=info
 DOMAIN=$DOMAIN
 AGNES_TAG=$IMAGE_TAG
 ACME_EMAIL=$ACME_EMAIL
+GOOGLE_CLIENT_ID=$GOOGLE_CLIENT_ID
+GOOGLE_CLIENT_SECRET=$GOOGLE_CLIENT_SECRET
+$CADDY_TLS_LINE
 ENVEOF
 chmod 600 "$APP_DIR/.env"
 
variables.tf
@@ -39,11 +39,19 @@ variable "prod_instance" {
 }
 
 variable "dev_instances" {
-  description = "List of dev VMs. Empty array = no dev VMs."
+  description = <<-EOT
+    List of dev VMs. Empty array = no dev VMs.
+
+    tls_mode + domain are optional and default to plain HTTP on :8000. Set
+    tls_mode = "caddy" + domain to enable Caddy + Let's Encrypt (or whatever
+    CADDY_TLS env var is configured to in the Caddyfile — see Caddyfile docs).
+  EOT
   type = list(object({
     name         = string
     machine_type = optional(string, "e2-small")
     image_tag    = optional(string, "dev")
+    tls_mode     = optional(string, "none")
+    domain       = optional(string, "")
   }))
   default = []
 }