agnes-the-ai-analyst/scripts/tls-fetch.sh
Vojtech 0bbbf3e40b
feat(tls): corporate-CA HTTPS with URL-driven rotation, on-VM CSR gen, self-signed fallback (#51)
Replaces the implicit Let's Encrypt flow with a general corporate-CA HTTPS path:

- Caddy switches to cert-file mode (`tls /certs/fullchain.pem /certs/privkey.pem`) with HSTS + TLS 1.2/1.3 floor
- New `docker-compose.tls.yml` overlay closes host `:8000` when Caddy fronts (no TLS bypass)
- New `scripts/tls-fetch.sh` — generic URL fetcher for `sm://`, `gs://`, `https://`, `file://` with redirect refusal + PEM validation
- New `scripts/grpn/agnes-tls-rotate.sh` — daily rotation, self-signed fallback against same key (zero key churn), on-VM RSA-2048 + CSR auto-gen, atomic swap, SIGUSR1 reload
- `scripts/grpn/agnes-auto-upgrade.sh` becomes cert-aware (auto-enables tls overlay when certs present)
- Compose profile `production` renamed to `tls` (aligns with DEPLOYMENT.md and infra startup)
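
The atomic swap mentioned for the rotation script can be sketched roughly as follows. This is a minimal illustration, not the contents of `agnes-tls-rotate.sh` (which is not shown here). The function name `atomic_install` and the `.tmp.$$` staging suffix are hypothetical, and the sketch assumes `install(1)` and `mv(1)` are available and that the staging file sits on the same filesystem as the destination, so the final `mv` is a rename.

```shell
# Hypothetical helper: copy SRC to DEST with MODE so that readers of DEST
# see either the old content or the new content, never a partial file.
atomic_install() {
  local src="$1" dest="$2" mode="${3:-644}"
  # Stage beside the destination so the final mv is a same-filesystem
  # rename rather than a cross-device copy.
  local stage="$dest.tmp.$$"
  if ! install -m "$mode" "$src" "$stage"; then
    rm -f "$stage"
    return 1
  fi
  if ! mv -f "$stage" "$dest"; then
    rm -f "$stage"
    return 1
  fi
}
```

Staging in the destination directory matters because a temp file under `/tmp` may live on a different filesystem, in which case `mv` falls back to copy-and-delete and the swap is no longer a single step.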

Pairs with FoundryAI/agnes-the-ai-analyst-infra#27 (merged) which wires per-VM `local.vm_tls`, writes `TLS_*` env vars into `.env`, auto-creates Secret Manager containers for `sm://` privkey URLs, and installs `agnes-tls-rotate.{service,timer}` for daily polling.

Includes hardening + docs follow-ups from code review:
- `TLS_CSR_SUBJECT` env-var parametrisation applied to both CSR and self-signed cert paths
- curl `--max-redirs 0 --proto '=https'` + post-fetch PEM validation in `tls-fetch.sh`
- `ulimit -c 0` + array-form `COMPOSE_FILES` (safe against word splitting, bash 3.2 compatible; arrays are a bash feature, not POSIX sh)
- TLS section added to `config/.env.template`
- Historical-note headers in `docs/superpowers/{plans,specs}/2026-04-09-*.md` flagging the profile rename
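
The cert-aware overlay selection and the array-form `COMPOSE_FILES` described above might look something like the sketch below. It is not taken from `agnes-auto-upgrade.sh`. The helper name `build_compose_files`, the base file name `docker-compose.yml`, and the idea of passing a host-side certs directory as the first argument are all assumptions made for illustration. Only `docker-compose.tls.yml`, `fullchain.pem`, and `privkey.pem` come from the commit message.

```shell
# Hypothetical helper: fill the global COMPOSE_FILES array with the -f
# arguments for docker compose, adding the TLS overlay only when both
# PEM files exist and are non-empty.
# Assumptions: the base file is docker-compose.yml; certs live in "$1".
build_compose_files() {
  local certs_dir="$1"
  COMPOSE_FILES=(-f docker-compose.yml)
  if [ -s "$certs_dir/fullchain.pem" ] && [ -s "$certs_dir/privkey.pem" ]; then
    # Array append works in bash 3.1 and later.
    COMPOSE_FILES+=(-f docker-compose.tls.yml)
  fi
}

# Usage (not executed here):
#   build_compose_files /path/to/certs
#   docker compose "${COMPOSE_FILES[@]}" up -d
```

Keeping the arguments in an array and expanding them as `"${COMPOSE_FILES[@]}"` avoids the word-splitting problems of building one long string.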
2026-04-25 19:51:25 +00:00

90 lines
2.9 KiB
Bash
Executable file

#!/bin/bash
# Fetch a TLS artifact (cert chain or private key) from a URL to a local
# path with the requested file mode. Supported URL schemes:
#
#   sm://<secret-name>    — Google Secret Manager, latest version
#   gs://<bucket>/<path>  — GCS object
#   https://<url>         — plain HTTPS download (no redirects, no
#                           scheme downgrade — see curl flags below)
#   file://<path>         — local file copy (dev/testing only)
#
# Usage: tls-fetch.sh <url> <dest> [mode] [kind]
#
#   kind: cert (default) | key — controls post-fetch PEM validation.
#         "cert" runs `openssl x509 -noout`, "key" runs `openssl pkey
#         -noout`. Any garbage (HTML error page from a corp portal,
#         truncated body, unrelated file) is rejected loudly here so
#         Caddy never tries to load an unparseable cert or key.
#
# Writes atomically via a temp file + install(1) so Caddy never sees a
# half-written cert. Exits non-zero on any failure — callers should not
# swallow errors (a silent TLS break is worse than a loud one).
#
# Exit codes:
# 2 — unsupported URL scheme or unsupported kind
# 3 — fetched file is empty
# 4 — fetched content is not a valid PEM of the requested kind
set -euo pipefail
URL="${1:?usage: tls-fetch.sh <url> <dest> [mode] [kind]}"
DEST="${2:?usage: tls-fetch.sh <url> <dest> [mode] [kind]}"
MODE="${3:-644}"
KIND="${4:-cert}"
TMP=$(mktemp)
trap 'rm -f "$TMP"' EXIT
case "$URL" in
  sm://*)
    SECRET="${URL#sm://}"
    gcloud secrets versions access latest --secret="$SECRET" > "$TMP"
    ;;
  gs://*)
    gsutil -q cp "$URL" "$TMP"
    ;;
  https://*)
    # -L --max-redirs 0: a redirect on a TLS-artifact URL is a smell
    #   (compromised DNS / hijacked endpoint can swap the cert/key for
    #   an attacker-controlled one). curl only enforces --max-redirs
    #   when -L is set; with both, any 3xx makes curl exit non-zero
    #   instead of saving the redirect body. Fail loud.
    # --proto '=https': refuse any scheme other than https.
    # --retry 2: tolerate transient blips; daily timer means
    #   extended outages are caught the next tick anyway.
    curl -fsSL --max-redirs 0 --proto '=https' --retry 2 "$URL" -o "$TMP"
    ;;
  file://*)
    cp "${URL#file://}" "$TMP"
    ;;
  *)
    echo "tls-fetch: unsupported URL scheme: $URL" >&2
    exit 2
    ;;
esac
if [ ! -s "$TMP" ]; then
  echo "tls-fetch: fetched empty file from $URL" >&2
  exit 3
fi
# PEM sanity check. Catches: HTML error pages with 200 OK, truncated
# downloads, and anything that's not a parseable PEM of the requested
# kind. Cheaper to fail here than to let Caddy crash on reload.
case "$KIND" in
  cert)
    if ! openssl x509 -in "$TMP" -noout 2>/dev/null; then
      echo "tls-fetch: $URL did not return a valid PEM certificate" >&2
      exit 4
    fi
    ;;
  key)
    if ! openssl pkey -in "$TMP" -noout 2>/dev/null; then
      echo "tls-fetch: $URL did not return a valid PEM private key" >&2
      exit 4
    fi
    ;;
  *)
    echo "tls-fetch: unsupported kind: $KIND (expected cert|key)" >&2
    exit 2
    ;;
esac
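# install(1) copies rather than renames, so a reader could catch $DEST
# half-written. Stage the copy beside $DEST, then rename it into place.
STAGE="$DEST.tmp.$$"
trap 'rm -f "$TMP" "$STAGE"' EXIT
install -m "$MODE" "$TMP" "$STAGE"
mv -f "$STAGE" "$DEST"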
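
Because the script asks callers not to swallow errors, a caller might translate its exit status into a log line and then pass the status on. The sketch below is one hedged way to do that. The names `describe_tls_fetch_status`, `fetch_or_report`, and the `FETCH` override variable are hypothetical and do not appear in the repository. Codes 2 to 4 are taken from the header comment, and any other non-zero status is assumed to be whatever the failing fetch tool returned under `set -e`.

```shell
# Hypothetical caller-side helper: translate a tls-fetch.sh exit status
# into a log-friendly description.
describe_tls_fetch_status() {
  case "$1" in
    0) echo "ok" ;;
    2) echo "unsupported URL scheme or kind" ;;
    3) echo "fetched file is empty" ;;
    4) echo "not a valid PEM of the requested kind" ;;
    *) echo "fetch tool failed (status $1)" ;;
  esac
}

# Hypothetical wrapper: run the fetch, log on failure, and return the
# original status so the failure is never hidden from the caller.
# FETCH defaults to the script path in this repo; override it for testing.
fetch_or_report() {
  local status=0
  "${FETCH:-scripts/tls-fetch.sh}" "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "tls-fetch failed: $(describe_tls_fetch_status "$status")" >&2
  fi
  return "$status"
}
```

A call such as `fetch_or_report sm://example-privkey /path/to/certs/privkey.pem 600 key` (secret name and path are placeholders) would then fetch a key with mode 600 and report any failure on stderr while preserving the exit status.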