fix(ci): smoke-test stale route + rollback ghcr auth + issues:write (#140)

Three CI fixes triggered by the failed PR #137 deploy:

1. scripts/smoke-test.sh: assertion 8 was hitting /api/admin/tables, which was renamed to /api/admin/registry long ago. The resulting 404 was treated as a deployment regression and triggered the auto-rollback. The same stale URL is also fixed in CLAUDE.md, README.md, and dev_docs/server.md.

2. .github/workflows/release.yml smoke-test job: added a 'Log in to GHCR' step. The auto-rollback's docker push :stable was failing with 'unauthenticated' because the smoke-test job had no GHCR login of its own, which left :stable pointing at the broken image.

3. The rollback step gained a GH_TOKEN env var, and the workflow's permissions block gained issues: write. Both are needed for gh issue create to actually create the alert issue; previously the failure was silently swallowed by the || echo fallback.

Manual cleanup outside this PR: :stable currently points at the broken PR #137 image — needs manual retag back to stable-2026.04.505.
ZdenekSrotyr 2026-04-30 09:42:27 +02:00 committed by GitHub
parent 4ec5ff44dd
commit b5178fe942
6 changed files with 40 additions and 8 deletions
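
The manual cleanup named in the commit message (retagging :stable back to stable-2026.04.505) could look roughly like the sketch below. This is an illustration under assumptions, not a documented procedure from this repository: the `IMAGE` path is a placeholder because the real GHCR image name does not appear in this commit, and `run`, `retag_stable`, and `DRY_RUN` are hypothetical names. It defaults to a dry run that only prints the three standard `docker pull` / `docker tag` / `docker push` commands.

```shell
#!/bin/sh
# Assumption: IMAGE is a placeholder; substitute the real GHCR image path.
IMAGE="${IMAGE:-ghcr.io/OWNER/REPO}"
GOOD_TAG="stable-2026.04.505"   # tag named in the commit message
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands only, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Point :stable back at the known-good image.
retag_stable() {
  run docker pull "$IMAGE:$GOOD_TAG" &&
  run docker tag "$IMAGE:$GOOD_TAG" "$IMAGE:stable" &&
  run docker push "$IMAGE:stable"
}

retag_stable
```

Running it with `DRY_RUN=0` would execute the commands for real and requires a prior `docker login ghcr.io` with push rights to the package.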


@ -24,6 +24,11 @@ on:
permissions:
contents: write
packages: write
# `issues: write` lets the smoke-test job's rollback step open a
# GitHub issue alerting operators when an auto-rollback fires. Without
# this, the `gh issue create` call hits 403 and the `|| echo` fallback
# silently swallows it — operators see :stable revert with no alert.
issues: write
# When a developer pushes a brand-new branch with code changes, GitHub fires
# both a `create` and a `push` event for the same commit. Without
@ -224,6 +229,19 @@ jobs:
fetch-depth: 0
fetch-tags: true
# Required for the rollback step's `docker push` to GHCR. The
# `build-and-push` job logs in for itself; this job needs its own
# login because a `docker login` performed in one job does not carry
# over to another. Without it, the rollback hits "unauthenticated:
# User cannot be authenticated with the token provided" and silently
# leaves :stable pointing at the broken image (real incident:
# PR #137 / 4ec5ff44).
- name: Log in to GHCR
uses: docker/login-action@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Start Agnes from built image
run: |
# Create empty .env (docker-compose.yml requires env_file: .env, gitignored)
@ -239,6 +257,12 @@ jobs:
- name: Automatic rollback on failure
if: failure()
env:
# Required for the `gh issue create` call below — without GH_TOKEN
# the gh CLI fails the auth check and the issue creation falls
# through the `|| echo` fallback, so an operator never sees the
# rollback alert.
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
IMAGE_TAG="${{ needs.build-and-push.outputs.image_tag }}"
VERSION="${{ needs.build-and-push.outputs.version }}"
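
The `|| echo` fallback described in the comments above is easy to reproduce in isolation. The sketch below is not the workflow's rollback script: `create_alert` is a hypothetical stand-in for the `gh issue create` call and is hard-coded to fail. It shows why the step stayed green, and one possible way to keep the rollback going while still recording that the alert was never sent.

```shell
#!/bin/sh
# Hypothetical stand-in for `gh issue create`; it always fails here.
create_alert() {
  echo "simulated failure" >&2
  return 1
}

# Pattern from the workflow: the failure is replaced by a log line,
# so the exit status of the whole list is that of `echo`, i.e. 0.
create_alert 2>/dev/null || echo "could not create issue"
swallowed_status=$?

# Alternative sketch: do not abort, but remember that alerting failed
# so a later step can surface it.
alert_failed=0
create_alert 2>/dev/null || alert_failed=1

echo "swallowed_status=$swallowed_status alert_failed=$alert_failed"
```

The first form reports status 0 even though no issue was created; the second leaves a flag (`alert_failed=1`) that the job can check or print as a warning.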


@ -10,6 +10,10 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
## [Unreleased]
### Internal
- `scripts/smoke-test.sh`: assertion 8 now hits `/api/admin/registry` (the current admin tables endpoint). The old `/api/admin/tables` URL was renamed long ago and the smoke test was returning 404 on every run — it only surfaced as a deploy failure when the full release pipeline first triggered the rollback path on the post-#137 deploy (run 25151878647). Same stale URL was also fixed in `CLAUDE.md`, `README.md`, and `dev_docs/server.md` — the routes now correctly point at `POST /api/admin/register-table` (create) and `PUT /api/admin/registry/{id}` (update).
- `.github/workflows/release.yml` smoke-test job: added `Log in to GHCR` step. The auto-rollback's `docker push :stable` was hitting `unauthenticated: User cannot be authenticated with the token provided` because the smoke-test job had no GHCR login of its own. Result: a failed deploy left `:stable` pointing at the broken image. The rollback step also got an explicit `GH_TOKEN` env, and the workflow's top-level `permissions` block gained `issues: write`, so its `gh issue create` call actually creates the alert issue (was silently swallowed by the `|| echo` fallback because of both the missing env var AND the missing scope).
## [0.21.0] — 2026-04-30
### Internal


@ -19,7 +19,7 @@ Ask the user for:
4. Create `.env` from `config/.env.template`
### Step 3: Register Tables
-1. Use the FastAPI admin API (`POST /api/admin/tables/{id}`) or webapp UI to register tables
+1. Use the FastAPI admin API (`POST /api/admin/register-table`, then `PUT /api/admin/registry/{id}` for updates) or webapp UI to register tables
2. Tables are stored in DuckDB `table_registry` with source_type, bucket, source_table, query_mode
3. For migration from old format: `python scripts/migrate_registry_to_duckdb.py`


@ -125,7 +125,7 @@ pytest tests/ -v
|------|---------|
| `config/instance.yaml` | Instance-specific settings: branding, data source type, auth provider, Google domain |
| `.env` | Secrets and environment variables — never committed |
-| `system.duckdb` `table_registry` table | Table definitions managed via `POST /api/admin/tables/{id}` or the web UI |
+| `system.duckdb` `table_registry` table | Table definitions managed via `POST /api/admin/register-table` (or `PUT /api/admin/registry/{id}` to update) or the web UI |
Copy the example to get started:


@ -218,7 +218,7 @@ The FastAPI app is available at `https://your-instance.example.com`.
- **Google OAuth**: restricted to `allowed_domain` set in `config/instance.yaml`
- **Email magic link**: available out of the box (no external service required)
-- **Admin API**: `POST /api/admin/tables/{id}` — register/update tables
+- **Admin API**: `POST /api/admin/register-table` (register), `PUT /api/admin/registry/{id}` (update), `GET /api/admin/registry` (list) — manage tables
- **Sync API**: `POST /api/sync/trigger` — trigger data extraction
### Google OAuth setup


@ -136,17 +136,21 @@ else
echo " SKIP catalog (no token)"
fi
-# 8. Admin tables endpoint (authenticated)
+# 8. Admin registry endpoint (authenticated)
+# NOTE: was /api/admin/tables until that endpoint was renamed to
+# /api/admin/registry; this assertion went stale and only surfaced when the
+# auto-rollback workflow first fired (smoke test was failing for many
+# releases without anyone noticing).
if [ -n "$TOKEN" ]; then
-TABLES_HTTP=$(curl -s -o /tmp/smoke_tables.json -w "%{http_code}" "$HOST/api/admin/tables" \
+TABLES_HTTP=$(curl -s -o /tmp/smoke_tables.json -w "%{http_code}" "$HOST/api/admin/registry" \
-H "Authorization: Bearer $TOKEN" 2>/dev/null || echo "000")
if [[ "$TABLES_HTTP" =~ ^(200|403)$ ]]; then
-check "admin tables endpoint (HTTP $TABLES_HTTP)" "true"
+check "admin registry endpoint (HTTP $TABLES_HTTP)" "true"
else
-check "admin tables endpoint (HTTP $TABLES_HTTP)" "false"
+check "admin registry endpoint (HTTP $TABLES_HTTP)" "false"
fi
else
-echo " SKIP admin tables (no token)"
+echo " SKIP admin registry (no token)"
fi
fi
# 9. Marketplace.zip endpoint (with PAT auth if available)
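
The pass/fail rule used by assertion 8 above (accept 200 or 403, fail on anything else) can be isolated and exercised without a running server. The snippet below is a minimal sketch of that rule only: `status_ok` is a hypothetical helper that does not exist in `scripts/smoke-test.sh`, and the reading that 403 means "route exists but the token is not an admin" is an assumption, since the script itself does not say why 403 is accepted.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for the status codes assertion 8 accepts.
status_ok() {
  case "$1" in
    200|403) return 0 ;;   # route exists (403 presumably: caller lacks admin rights)
    *)       return 1 ;;   # e.g. 404 for a renamed route, 000 for a curl failure
  esac
}

# Exercise the rule against a few representative codes.
for code in 200 403 404 000; do
  if status_ok "$code"; then
    echo "$code pass"
  else
    echo "$code fail"
  fi
done
```

Under this rule a stale URL such as the old `/api/admin/tables` fails with 404, which is exactly the signal that triggered the rollback.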