Four review iterations resolved:

- PATH-shadow-safe smoke test (`uv tool dir --bin` + `~/.local/bin` fallback)
- Recursion sentinel for in-flight self-upgrade
- `sys.executable` + `--no-deps` pip fallback (NOT system `python3`, NOT `--user`)
- Smoke + rollback with rc capture and bootstrap recovery
- Single chained SessionStart entry (shell `;` for ordering, no Claude Code semantics dependency)
- `AGNES_NO_UPDATE_CHECK` bypass for explicit self-upgrade
- `_get_shared_client()` left unhooked (mid-stream `sys.exit` unsafe; Caddy proxies parquets anyway)

Targets release 0.40.0.

# CLI Auto-Upgrade Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Make the `agnes` CLI auto-upgrade from the server it talks to. Two layers: (A) `agnes self-upgrade`, invoked from a SessionStart hook for proactive upgrade; (B) an `X-Agnes-Min-Version` response header for a hard stop on incompatible drift.

**Architecture:** The server already serves `/cli/latest` (wheel metadata) and `/cli/wheel/<name>` (wheel bytes). The CLI already polls `/cli/latest` from `cli/update_check.py` and warns on drift. This plan adds:

- a server-side `MIN_COMPAT_CLI_VERSION` constant plus middleware that stamps `X-Agnes-Latest-Version` / `X-Agnes-Min-Version` on every `/api/*` response;
- a CLI `agnes self-upgrade` command that reuses `update_check.check()` and shells out to `uv tool install --force` (pip fallback);
- response-header inspection in `cli/client.py:get_client()` that hard-stops with `sys.exit(2)` on `local < min`;
- a single chained `SessionStart` hook entry that runs `agnes self-upgrade --quiet` ahead of `agnes pull`.

**Tech Stack:** Python 3.12 / FastAPI / httpx / typer / uv / pytest. No new dependencies.

**Spec:** `docs/superpowers/specs/2026-05-06-cli-auto-upgrade-spec.md` (read this first if context is unclear).

---

## File Structure

**New files:**
- `app/version.py`: `APP_VERSION` (deduped from `app/main.py:_app_version`) + `MIN_COMPAT_CLI_VERSION` constants. Single source of truth.
- `cli/commands/self_upgrade.py`: `agnes self-upgrade` typer command, including smoke test (deterministic install path, not PATH-resolved), last-known-good record, rollback with rc capture, recursion sentinel, and explicit `--force` offline error.
- `tests/test_version_headers_middleware.py`: server middleware integration test.
- `tests/test_client_version_check.py`: header-inspection hard-stop test, including the `AGNES_SELF_UPGRADE_IN_PROGRESS` sentinel barrier.
- `tests/test_self_upgrade.py`: command behavior, subprocess shape, smoke-test rollback (with rc capture), `--force` offline failure, `AGNES_NO_UPDATE_CHECK` bypass for explicit upgrades, sentinel propagation.

**Modified files:**
- `app/main.py`: delete `_app_version()`, import `APP_VERSION` from `app/version.py`, register version-headers middleware.
- `app/api/cli_artifacts.py`: drive-by docstring fix (`da` → `agnes`).
- `cli/client.py`: `get_client()` adds `event_hooks` for response inspection + `User-Agent` header. `_check_version_headers` short-circuits on `AGNES_SELF_UPGRADE_IN_PROGRESS=1`.
- `cli/main.py`: register `self_upgrade_app` typer.
- `cli/update_check.py`: drive-by docstring fix (`da` → `agnes`); add `bypass_disabled=False` keyword-only kwarg to `check()` so explicit `agnes self-upgrade` invocations can override `AGNES_NO_UPDATE_CHECK`; ensure `_version_lt` and `_installed_version` are importable from `cli/client.py` and `cli/commands/self_upgrade.py`.
- `cli/lib/hooks.py`: single chained SessionStart entry (`agnes self-upgrade ... || true; agnes pull ... || true`); extend `_OUR_COMMAND_MARKERS` with `agnes self-upgrade`.
- `tests/test_lib_hooks.py`: assert chained command + ordering + idempotency.
- `tests/test_app_version.py`: rewrite to target `app.version` (since `app.main._app_version` is deleted).
- `CHANGELOG.md`: `### Added` entry under `## [Unreleased]`.
- `pyproject.toml`: bump `[project].version` from `0.39.0` to `0.40.0` in the release-cut commit (Task 7).

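The chained SessionStart entry listed for `cli/lib/hooks.py` depends only on plain shell semantics: `|| true` swallows a failing step, and `;` runs the next step regardless. A minimal sketch of that behavior, assuming a POSIX `sh` is available; `false` and `echo pulled` are placeholders standing in for the real `agnes self-upgrade` and `agnes pull` invocations, whose flags are elided in this plan:

```python
import subprocess

# Placeholders (assumptions, not the real hook line): `false` stands in for a
# failing `agnes self-upgrade`, `echo pulled` for the `agnes pull` step.
CHAINED = "false || true; echo pulled || true"


def run_chained(command: str) -> tuple[int, str]:
    """Run a shell command line and return (exit code, stripped stdout)."""
    proc = subprocess.run(["sh", "-c", command], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    # The failing first step neither aborts the chain nor leaks a non-zero rc.
    print(run_chained(CHAINED))
```

Because ordering comes from the shell's `;`, the hook does not need the host tool to guarantee any ordering between separate hook entries.
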
**Files this plan does NOT touch (by design):**
- `~/.config/agnes/last_known_good.json`: written at runtime by `_record_last_known_good` after the smoke test passes; separate file from `update_check.json`. (Convention: record before invalidate, no correctness consequence either way.)
- `docs/CLI_COMPAT.md`, `.github/pull_request_template.md`: an earlier draft proposed these as enforcement scaffolding. They were dropped because a doc plus a checkbox catches nothing real (an engineer can check the box without bumping the constant). Layer B's mechanism stays opt-in for the day someone needs it, under the same review discipline as every other behavior change.

---

## Task 1: Server-side version constants + middleware

**Files:**
- Create: `app/version.py`
- Modify: `app/main.py` (top-level import + middleware registration; delete `_app_version()` and pass `app.version.APP_VERSION` instead)
- Modify: `tests/test_app_version.py` (retarget to `app.version`)
- Create: `tests/test_version_headers_middleware.py`

- [ ] **Step 1.1: Write the failing middleware test**

Create `tests/test_version_headers_middleware.py`:

```python
"""Verify /api/* responses carry X-Agnes-Latest-Version + X-Agnes-Min-Version."""

from fastapi.testclient import TestClient


def test_api_response_carries_version_headers():
    from app.main import app
    from app.version import APP_VERSION, MIN_COMPAT_CLI_VERSION
    client = TestClient(app)
    # /api/version is unauthenticated and cheap.
    resp = client.get("/api/version")
    assert resp.status_code == 200
    # Headers must equal the constants in app.version, not just be parseable.
    # When MIN_COMPAT_CLI_VERSION is deliberately bumped in a future PR, this
    # test is updated in the same PR (the review-discipline guardrail).
    assert resp.headers["X-Agnes-Latest-Version"] == APP_VERSION
    assert resp.headers["X-Agnes-Min-Version"] == MIN_COMPAT_CLI_VERSION
    # Day-one floor pin: drop or update this assertion when the floor moves.
    assert resp.headers["X-Agnes-Min-Version"] == "0.0.0"


def test_non_api_response_does_not_carry_version_headers():
    from app.main import app
    client = TestClient(app)
    # /cli/latest is under /cli, not /api, so it should NOT carry the headers.
    resp = client.get("/cli/latest")
    assert resp.status_code == 200
    assert "X-Agnes-Latest-Version" not in resp.headers
    assert "X-Agnes-Min-Version" not in resp.headers
```

- [ ] **Step 1.2: Run test, verify it fails**

```bash
pytest tests/test_version_headers_middleware.py -v
```
Expected: FAIL (`X-Agnes-Latest-Version` not in headers).

- [ ] **Step 1.3: Create `app/version.py`**

```python
"""Single source of truth for app + CLI compat versions.

`APP_VERSION` is read from package metadata so it tracks `pyproject.toml`
without a manual literal to keep in sync.

`MIN_COMPAT_CLI_VERSION` is the oldest CLI version the server still accepts
on `/api/*`. Bumped manually when shipping a wire-protocol break. Day-one
value of "0.0.0" means no enforcement: set the floor the first time a
deliberate break ships.
"""

from importlib.metadata import PackageNotFoundError
from importlib.metadata import version as _pkg_version


def _read_app_version() -> str:
    try:
        return _pkg_version("agnes-the-ai-analyst")
    except PackageNotFoundError:
        # Source checkout without installed metadata.
        return "0.0.0+dev"


APP_VERSION = _read_app_version()
MIN_COMPAT_CLI_VERSION = "0.0.0"
```

- [ ] **Step 1.4: Replace `_app_version()` with `APP_VERSION` import + register middleware**

Two changes in `app/main.py`, plus a matching test update:

(a) **Dedupe.** Both `_app_version()` (line 40) and `app/version.py:APP_VERSION` read from `importlib.metadata.version("agnes-the-ai-analyst")`, and keeping both invites drift. Delete the `_app_version()` helper and import `APP_VERSION` at module top:

```python
# At module top, alongside other app.* imports:
from app.version import APP_VERSION, MIN_COMPAT_CLI_VERSION

# Delete the entire `_app_version()` function (line 40 onwards).

# On line 186, replace
#     version=_app_version(),
# with
#     version=APP_VERSION,
```

(b) **Middleware.** After the `app = FastAPI(...)` instantiation block, add:

```python
@app.middleware("http")
async def _add_version_headers(request, call_next):
    response = await call_next(request)
    # Only API responses are stamped; /cli/* and static routes stay untouched.
    if request.url.path.startswith("/api/"):
        response.headers["X-Agnes-Latest-Version"] = APP_VERSION
        response.headers["X-Agnes-Min-Version"] = MIN_COMPAT_CLI_VERSION
    return response
```

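The prefix check is what keeps `/cli/latest` free of version headers, which the second middleware test relies on. As an illustration only, the same gating can be written as a pure function over a path and a header dict; `stamp_version_headers` is a hypothetical helper for this sketch, not something the plan adds to the codebase:

```python
def stamp_version_headers(path: str, headers: dict, *, latest: str, minimum: str) -> dict:
    """Return a copy of `headers`, stamped only when `path` is under /api/."""
    stamped = dict(headers)
    if path.startswith("/api/"):
        stamped["X-Agnes-Latest-Version"] = latest
        stamped["X-Agnes-Min-Version"] = minimum
    return stamped


if __name__ == "__main__":
    for path in ("/api/version", "/cli/latest", "/api", "/apiary"):
        print(path, stamp_version_headers(path, {}, latest="0.40.0", minimum="0.0.0"))
```

Note that the trailing slash in the prefix matters: a bare `/api` or a sibling route such as `/apiary` is not stamped.
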
(c) **Update `tests/test_app_version.py`.** The existing tests patch `app.main._pkg_version` and `app.main._app_version`, both of which no longer exist. Rewrite them to target `app.version`, and keep the end-to-end pin that the FastAPI app object surfaces the constant. Patch `importlib.metadata.version` itself rather than `app.version._pkg_version`: `importlib.reload()` re-executes the module's `from importlib.metadata import version as _pkg_version` line, which rebinds the name and discards a patch placed on the module attribute.

```python
"""Pin that APP_VERSION reads from package metadata, not a hardcoded literal,
and that the FastAPI app's `version=` field surfaces it end-to-end."""

import importlib
import sys
from importlib.metadata import PackageNotFoundError
from unittest.mock import patch

import pytest

import app.version


@pytest.fixture(autouse=True)
def _restore_real_modules():
    """Reload with real metadata after each test so patched values don't leak."""
    yield
    importlib.reload(app.version)
    if "app.main" in sys.modules:
        importlib.reload(sys.modules["app.main"])


def test_app_version_reads_package_metadata():
    # Patch the source function: reload() re-runs the `from ... import` line,
    # so a patch on app.version._pkg_version would be overwritten.
    with patch("importlib.metadata.version", return_value="9.9.9") as mock_pkg_ver:
        importlib.reload(app.version)
        assert app.version.APP_VERSION == "9.9.9"
        mock_pkg_ver.assert_called_once_with("agnes-the-ai-analyst")


def test_app_version_falls_back_when_package_missing():
    with patch("importlib.metadata.version", side_effect=PackageNotFoundError):
        importlib.reload(app.version)
        assert app.version.APP_VERSION == "0.0.0+dev"


def test_fastapi_app_version_matches_app_version_constant():
    """End-to-end: FastAPI's app.version (consumed by /openapi.json and
    /docs) must equal app.version.APP_VERSION. Guards the wiring at
    `app/main.py:186 version=APP_VERSION` against accidental literal."""
    import app.main

    # Reload both so we read post-patch values consistently.
    with patch("importlib.metadata.version", return_value="7.7.7"):
        importlib.reload(app.version)
        importlib.reload(app.main)
        assert app.main.app.version == "7.7.7"
        assert app.main.app.version == app.version.APP_VERSION
```

The reload trick: `APP_VERSION` is set once at module import time, so reloading `app.version` under a patch reruns `_read_app_version()`. The patch has to sit on `importlib.metadata.version` because the reload also re-executes the module's own import line. The third test reloads `app.main` after `app.version` so that the `from app.version import APP_VERSION` line picks up the new constant value. The autouse fixture reloads both modules again after each test so later tests see the real version.

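One subtlety when combining `unittest.mock.patch` with `importlib.reload`: a reload re-executes every top-level statement of the module, including `from X import y as _y`, so a patch applied to the module attribute `_y` is overwritten before the constant is recomputed, while a patch applied to `X.y` survives. The sketch below demonstrates this with a throwaway module written to a temp directory. `reload_demo_mod` is hypothetical, and `platform.python_implementation` is only a stand-in for `importlib.metadata.version`:

```python
import importlib
import sys
import tempfile
from pathlib import Path
from unittest.mock import patch

# Hypothetical stand-in for app/version.py: a constant computed at import time
# from a function pulled in with `from ... import ... as ...`.
DEMO_SOURCE = (
    "from platform import python_implementation as _impl\n"
    "VALUE = _impl()\n"
)


def run_demo() -> dict:
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "reload_demo_mod.py").write_text(DEMO_SOURCE)
        sys.path.insert(0, tmp)
        try:
            mod = importlib.import_module("reload_demo_mod")
            # Patch on the module attribute: reload() re-runs the import line
            # and rebinds `_impl`, so the patch never reaches VALUE.
            with patch("reload_demo_mod._impl", return_value="patched"):
                importlib.reload(mod)
                results["attribute_patch"] = mod.VALUE
            # Patch at the source: the re-executed import line binds the mock.
            with patch("platform.python_implementation", return_value="patched"):
                importlib.reload(mod)
                results["source_patch"] = mod.VALUE
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("reload_demo_mod", None)
    return results


if __name__ == "__main__":
    print(run_demo())
```

Only the `source_patch` entry ends up as `"patched"`; the `attribute_patch` entry still holds the real interpreter name.
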
- [ ] **Step 1.5: Run test, verify it passes**

```bash
pytest tests/test_version_headers_middleware.py -v
```
Expected: PASS (both tests).

- [ ] **Step 1.6: Run the app-version and middleware tests together to catch regressions**

```bash
pytest tests/test_app_version.py tests/test_version_headers_middleware.py -v
```
Expected: PASS (the rewritten `tests/test_app_version.py` and both middleware tests are green).

- [ ] **Step 1.7: Commit**

```bash
git add app/version.py app/main.py tests/test_version_headers_middleware.py tests/test_app_version.py
git commit -m "feat(server): expose APP_VERSION + MIN_COMPAT_CLI_VERSION on /api/* response headers

Adds X-Agnes-Latest-Version and X-Agnes-Min-Version headers to every
/api/* response. CLI consumes these to hard-stop on incompatible drift.
MIN_COMPAT_CLI_VERSION ships at 0.0.0: no enforcement until a deliberate
wire-protocol break bumps it.

Also dedupes app version logic: app/main.py:_app_version() helper deleted,
replaced by app/version.py:APP_VERSION as the single source of truth.
test_app_version.py rewritten to target app.version."
```

---

## Task 2: CLI response-header version check

**Files:**
- Modify: `cli/update_check.py` (export helpers: `_version_lt` and `_installed_version` must be reusable; rename to public if needed, or just import the underscore-prefixed names)
- Modify: `cli/client.py:get_client()` (add `event_hooks={"response": [_check_version_headers]}` and `User-Agent`)
- Create: `tests/test_client_version_check.py`

- [ ] **Step 2.1: Write the failing hard-stop test**

Create `tests/test_client_version_check.py`:

```python
"""Verify cli/client.py:get_client() hard-stops on min_version mismatch."""

from unittest.mock import patch

import httpx
import pytest


def _fake_response(headers: dict) -> httpx.Response:
    return httpx.Response(
        status_code=200,
        headers=headers,
        content=b"{}",
        request=httpx.Request("GET", "http://x/"),
    )


def test_local_below_min_exits_with_code_2():
    from cli.client import _check_version_headers
    with patch("cli.client._installed_version", return_value="0.30.0"):
        resp = _fake_response({
            "X-Agnes-Latest-Version": "0.40.0",
            "X-Agnes-Min-Version": "0.35.0",
        })
        with pytest.raises(SystemExit) as exc:
            _check_version_headers(resp)
    assert exc.value.code == 2


def test_local_at_or_above_min_does_not_exit():
    from cli.client import _check_version_headers
    with patch("cli.client._installed_version", return_value="0.40.0"):
        resp = _fake_response({
            "X-Agnes-Latest-Version": "0.40.0",
            "X-Agnes-Min-Version": "0.35.0",
        })
        _check_version_headers(resp)  # must not raise


def test_missing_headers_no_enforcement():
    """Older server without middleware → no headers → no-op."""
    from cli.client import _check_version_headers
    with patch("cli.client._installed_version", return_value="0.10.0"):
        resp = _fake_response({})  # empty headers
        _check_version_headers(resp)  # must not raise


def test_unknown_local_version_no_enforcement():
    """Source-checkout / editable install → never block."""
    from cli.client import _check_version_headers
    with patch("cli.client._installed_version", return_value="unknown"):
        resp = _fake_response({
            "X-Agnes-Latest-Version": "0.40.0",
            "X-Agnes-Min-Version": "0.35.0",
        })
        _check_version_headers(resp)  # must not raise


def test_self_upgrade_in_progress_disables_enforcement(monkeypatch):
    """Recursion barrier: while self-upgrade runs, no /api/* call may
    block on min-version drift. Otherwise an in-flight upgrade could
    sys.exit(2) with 'Run: agnes self-upgrade' from inside itself."""
    from cli.client import _check_version_headers
    monkeypatch.setenv("AGNES_SELF_UPGRADE_IN_PROGRESS", "1")
    with patch("cli.client._installed_version", return_value="0.10.0"):
        resp = _fake_response({
            "X-Agnes-Latest-Version": "0.40.0",
            "X-Agnes-Min-Version": "0.35.0",
        })
        _check_version_headers(resp)  # must not raise
```

- [ ] **Step 2.2: Run test, verify it fails**

```bash
pytest tests/test_client_version_check.py -v
```
Expected: FAIL (`cli.client._check_version_headers` does not exist).

- [ ] **Step 2.3: Implement `_check_version_headers` in `cli/client.py`**

At the top of `cli/client.py`, near other imports, add:

```python
import os
import sys

from cli.update_check import _installed_version, _version_lt
```

Then before `get_client()`, define:

```python
def _check_version_headers(response: "httpx.Response") -> None:
    """Hard-stop the CLI when the server reports we're below min_version.

    Drift warnings (`local < latest`) are already printed by the
    update_check root callback in cli/main.py, so there is no need to nag
    again on every API call. This hook only enforces the hard floor.
    """
    # Recursion barrier: `agnes self-upgrade` sets this for the duration
    # of the upgrade. Without it, a /api/* call inside the install flow
    # could exit 2 with "Run: agnes self-upgrade" from inside agnes
    # self-upgrade. The sentinel is process-local and propagates to
    # subprocesses via the explicit env= passed to the smoke test.
    if os.environ.get("AGNES_SELF_UPGRADE_IN_PROGRESS") == "1":
        return
    latest = response.headers.get("X-Agnes-Latest-Version")
    minv = response.headers.get("X-Agnes-Min-Version")
    if not latest or not minv:
        # Older server without the middleware: nothing to enforce.
        return
    local = _installed_version()
    if local == "unknown":
        # Source checkout / editable install: never block.
        return
    if _version_lt(local, minv):
        sys.stderr.write(
            f"error: agnes {local} is incompatible with server {latest} "
            f"(min required: {minv}). Run: agnes self-upgrade\n"
        )
        sys.exit(2)
```

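The hard stop hinges on `_version_lt`, which lives in `cli/update_check.py` and is not reproduced in this plan. For readers who want a feel for what such a comparison involves, here is a minimal sketch under stated assumptions: plain dotted numeric versions with an optional `+local` suffix. The names `parse_version` and `version_lt` are hypothetical and the real helper may behave differently (for instance by delegating to a packaging library):

```python
from itertools import takewhile


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '0.35.0' into (0, 35, 0); stops at the first non-numeric piece."""
    core = version.split("+", 1)[0]  # drop a local suffix such as "+dev"
    parts = []
    for piece in core.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def version_lt(a: str, b: str) -> bool:
    """True when version `a` sorts strictly below version `b`."""
    pa, pb = parse_version(a), parse_version(b)
    width = max(len(pa), len(pb))
    # Pad with zeros so "0.35" and "0.35.0" compare as equal.
    pa += (0,) * (width - len(pa))
    pb += (0,) * (width - len(pb))
    return pa < pb


if __name__ == "__main__":
    print(version_lt("0.30.0", "0.35.0"))  # below the floor
    print(version_lt("0.9.0", "0.10.0"))   # numeric, not lexical, ordering
```

Comparing integer tuples rather than strings is the important part: a lexical comparison would rank `0.9.0` above `0.10.0` and let an old CLI slip past the floor.
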
**Patch only `get_client()`; leave `_get_shared_client()` alone.** Post-rebase, `cli/client.py` has both `get_client()` (line 216, one-shot metadata calls) and `_get_shared_client()` (line 252, persistent HTTP/2 client used by `stream_download` for parquet bytes via chunked range requests).

The hook is wired ONLY on `get_client()`:

- httpx fires response event hooks **as soon as headers arrive**, before `iter_bytes()` consumes the body. On `_get_shared_client()`, `_check_version_headers` would run inside the `with client.stream(...) as response:` context of `_download_chunk` (`cli/client.py:452`) and `_download_single_stream` (`cli/client.py:595`). A `sys.exit(2)` from the hook would then fire mid-stream: a `ThreadPoolExecutor` with N parallel chunk-writer threads, open `<target>.<pid>.partN` file handles, and no `.tmp → final` rename. Half-written part files would be left on disk (the existing PID-reaper cleans those eventually, but the abrupt exit is ungraceful).
- In production, parquet downloads typically go through a Caddy `file_server` (PR #182) anyway, so the FastAPI middleware does not stamp headers on the streaming responses. Skipping the hook on `_get_shared_client()` matches that production reality. In dev / non-Caddy deployments, parquet streaming bypasses the hard stop; this is an accepted gap. The next metadata call (which runs through `get_client()`) catches the drift.
- All `/api/*` metadata calls (catalog, schema, snapshot create, sync trigger, auth, store, etc.) go through `get_client()`, where the hook fires safely on a fresh single-response client.

Modify `get_client()` to wire the hook and a User-Agent. Locate the `httpx.Client(...)` constructor call and pass:

```python
# At module top:
import platform

# Inside get_client(), at the existing constructor call:
return httpx.Client(
    base_url=server_url,
    timeout=timeout,
    headers={**headers, "User-Agent": f"agnes/{_installed_version()} ({platform.system().lower()})"},
    event_hooks={"response": [_check_version_headers]},
)
```

`headers` already contains `Authorization` from the existing implementation; we merge in `User-Agent`. **Do not** modify `_get_shared_client()`: the streaming-response semantics make `sys.exit(2)` from a response event hook unsafe (see the rationale above).

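To make the resulting header value concrete, the User-Agent expression can be pulled out into a small function. This is only an illustration of the f-string used in the constructor call; `build_user_agent` is a hypothetical name and the plan itself keeps the expression inline:

```python
import platform


def build_user_agent(cli_version: str) -> str:
    """Format 'agnes/<version> (<lowercased OS name>)'."""
    return f"agnes/{cli_version} ({platform.system().lower()})"


if __name__ == "__main__":
    # On a Linux host this prints: agnes/0.40.0 (linux)
    print(build_user_agent("0.40.0"))
```

With an editable install, where `_installed_version()` returns `"unknown"` per the tests above, the same expression would yield a version segment of `unknown`.
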
- [ ] **Step 2.4: Run test, verify it passes**

```bash
pytest tests/test_client_version_check.py -v
```
Expected: PASS (all five tests).

- [ ] **Step 2.5: Run the existing CLI test suite to catch regressions**

```bash
pytest tests/test_cli_update_check.py tests/test_client_version_check.py -v
```
Expected: PASS (no regressions in update_check).

- [ ] **Step 2.6: Commit**

```bash
git add cli/client.py tests/test_client_version_check.py
git commit -m "feat(cli): hard-stop on incompatible-version response header

Every API response is inspected via httpx event_hooks. When the server
reports X-Agnes-Min-Version > local, CLI prints a remediation message
and exits 2. Latest-version drift continues to be handled by the
update_check warning loop, so there is no double-warning on every API call."
```

---

## Task 3: `agnes self-upgrade` command

**Files:**
- Modify: `cli/update_check.py` (add `bypass_disabled` kwarg to `check()`)
- Create: `cli/commands/self_upgrade.py`
- Modify: `cli/main.py` (register the command)
- Create: `tests/test_self_upgrade.py`

- [ ] **Step 3.0: Extend `check()` with `bypass_disabled` kwarg**

`AGNES_NO_UPDATE_CHECK=1` was designed to silence the implicit warning loop that runs in the root callback. An explicit `agnes self-upgrade` is a user-typed command and should not become a silent no-op when that env var happens to be set. Thread a keyword-only kwarg through.

In `cli/update_check.py`, modify the signature and the disabled-check:

```python
def check(server_url: Optional[str], *, bypass_disabled: bool = False) -> Optional[UpdateInfo]:
    """..."""
    if not bypass_disabled and is_disabled():
        return None
    if not server_url:
        return None
    # ... rest unchanged
```

Existing callers (the root callback at `cli/main.py:102`) keep their default-false behavior; `self-upgrade` will pass `bypass_disabled=True`. Add a test in `tests/test_cli_update_check.py`:

```python
def test_check_bypass_disabled_overrides_env(monkeypatch):
    monkeypatch.setenv("AGNES_NO_UPDATE_CHECK", "1")
    with patch("cli.update_check._fetch_latest", return_value={
        "version": "9.9.9", "wheel_filename": "x.whl",
        "download_url_path": "/cli/wheel/x.whl",
    }):
        # Default: env var wins, returns None.
        assert check("http://server.test") is None
        # Bypass: env var ignored.
        info = check("http://server.test", bypass_disabled=True)
        assert info is not None and info.latest == "9.9.9"
```

Run the existing tests to catch regressions:

```bash
pytest tests/test_cli_update_check.py -v
```
Expected: PASS (old tests still green, new test passes).

Commit at end of task; the kwarg is shipped together with `self-upgrade`.

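The shape of the `bypass_disabled` guard can be tried in isolation. The sketch below is a self-contained toy, not the real `cli/update_check.py`: `is_disabled()` is assumed here to mean "the env var equals `1`" (the actual rule is not shown in this plan), and the return value is a placeholder string where the real `check()` returns an `UpdateInfo`:

```python
import os
from typing import Optional


def is_disabled() -> bool:
    # Assumption for this sketch: only the literal value "1" disables checks.
    return os.environ.get("AGNES_NO_UPDATE_CHECK") == "1"


def check(server_url: Optional[str], *, bypass_disabled: bool = False) -> Optional[str]:
    """Toy stand-in for update_check.check() showing only the guard order."""
    if not bypass_disabled and is_disabled():
        return None  # implicit callers stay silent
    if not server_url:
        return None  # nothing to probe
    return f"would probe {server_url}/cli/latest"  # placeholder result


if __name__ == "__main__":
    os.environ["AGNES_NO_UPDATE_CHECK"] = "1"
    print(check("http://server.test"))                        # suppressed
    print(check("http://server.test", bypass_disabled=True))  # explicit upgrade
```

Making the parameter keyword-only means a call such as `check(url, True)` raises `TypeError` instead of silently bypassing the opt-out, so every bypass is visible at the call site.
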
- [ ] **Step 3.1: Write the failing tests**

Create `tests/test_self_upgrade.py`:

```python
"""Tests for `agnes self-upgrade`: install path, smoke test, rollback
(with rc capture), recursion barrier, --force offline failure, AGNES_NO_UPDATE_CHECK
bypass for explicit upgrades, --quiet stderr behavior, version-mismatch
smoke detection."""

import os
import sys
from unittest.mock import patch, MagicMock

import pytest
from typer.testing import CliRunner

from cli.main import app
from cli.update_check import UpdateInfo

runner = CliRunner()


@pytest.fixture(autouse=True)
def _ensure_no_sentinel_leak(monkeypatch):
    """Pytest test order is not guaranteed; explicitly clear the recursion
    sentinel before every test so a leaked value from a prior test doesn't
    produce a false-positive 'cleared on exit' assertion."""
    monkeypatch.delenv("AGNES_SELF_UPGRADE_IN_PROGRESS", raising=False)
    yield


_OUTDATED_URL = "http://server.test/cli/wheel/agnes-0.40.0-py3-none-any.whl"
_PRIOR_URL = "http://server.test/cli/wheel/agnes-0.35.0-py3-none-any.whl"


def _outdated_info():
    return UpdateInfo(installed="0.30.0", latest="0.40.0", download_url=_OUTDATED_URL)


def _current_info():
    return UpdateInfo(installed="0.40.0", latest="0.40.0", download_url=None)


def _smoke_pass():
    return (True, "agnes 0.40.0")


def _smoke_fail():
    return (False, "exit 1: ImportError: cannot import name 'foo'")


def test_check_only_when_outdated_exits_1():
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()):
        result = runner.invoke(app, ["self-upgrade", "--check-only"])
    assert result.exit_code == 1
    assert "out of date" in result.output


def test_check_only_when_current_exits_0():
    with patch("cli.commands.self_upgrade.check", return_value=_current_info()):
        result = runner.invoke(app, ["self-upgrade", "--check-only"])
    assert result.exit_code == 0


def test_when_current_short_circuits_no_install():
    with patch("cli.commands.self_upgrade.check", return_value=_current_info()), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run:
        result = runner.invoke(app, ["self-upgrade"])
    assert result.exit_code == 0
    mock_run.assert_not_called()


def test_uv_path_when_uv_available():
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_pass()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
         patch("cli.commands.self_upgrade._record_last_known_good"), \
         patch("cli.commands.self_upgrade._invalidate_update_cache"):
        mock_run.return_value = MagicMock(returncode=0)
        result = runner.invoke(app, ["self-upgrade"])
    assert result.exit_code == 0
    args = mock_run.call_args_list[0].args[0]
    assert args[:3] == ["uv", "tool", "install"]
    assert "--force" in args
    assert _OUTDATED_URL in args


def test_pip_fallback_uses_sys_executable_not_user():
    """pip path must target the running interpreter's venv, never --user."""
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value=None), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_pass()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
         patch("cli.commands.self_upgrade._record_last_known_good"), \
         patch("cli.commands.self_upgrade._invalidate_update_cache"):
        mock_run.return_value = MagicMock(returncode=0)
        result = runner.invoke(app, ["self-upgrade"])
    assert result.exit_code == 0
    cmds = [c.args[0] for c in mock_run.call_args_list]
    assert any(cmd[0] == "curl" for cmd in cmds), cmds
    pip_cmd = next(cmd for cmd in cmds if "pip" in cmd)
    assert pip_cmd[0] == sys.executable, pip_cmd
    assert "--force-reinstall" in pip_cmd
    assert "--user" not in pip_cmd  # would land outside the venv


def test_force_invalidates_cache_before_check():
    """--force must drop the cached download_url before probing /cli/latest,
    so we get the SERVER's current wheel, not whatever was cached 24h ago."""
    fresh_current_with_url = UpdateInfo(installed="0.40.0", latest="0.40.0",
                                        download_url=_OUTDATED_URL)
    with patch("cli.commands.self_upgrade._invalidate_update_cache") as mock_invalidate, \
         patch("cli.commands.self_upgrade.check", return_value=fresh_current_with_url) as mock_check, \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_pass()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
         patch("cli.commands.self_upgrade._record_last_known_good"):
        mock_run.return_value = MagicMock(returncode=0)
        result = runner.invoke(app, ["self-upgrade", "--force"])
    assert result.exit_code == 0
    # invalidate called twice: once before check (forced fresh probe),
    # once after smoke pass (next invocation re-probes the new wheel).
    assert mock_invalidate.call_count == 2
    mock_check.assert_called_once()


def test_force_offline_exits_1_with_stderr():
    """--force + server unreachable: exit 1 with explicit stderr.
    Without --force, an offline check is silent; with --force it is not."""
    with patch("cli.commands.self_upgrade.check", return_value=None), \
         patch("cli.commands.self_upgrade.get_server_url",
               return_value="http://server.test"), \
         patch("cli.commands.self_upgrade._invalidate_update_cache"):
        result = runner.invoke(app, ["self-upgrade", "--force"], mix_stderr=False)
    assert result.exit_code == 1
    assert "cannot reach" in result.stderr
    assert "server.test" in result.stderr


def test_offline_without_force_is_silent():
    """No --force, server unreachable: exit 0 silently. Implicit warning
    loop already covered by update_check."""
    with patch("cli.commands.self_upgrade.check", return_value=None), \
         patch("cli.commands.self_upgrade._invalidate_update_cache"):
        result = runner.invoke(app, ["self-upgrade"], mix_stderr=False)
    assert result.exit_code == 0
    assert result.stderr == ""


def test_self_upgrade_passes_bypass_disabled_to_check():
    """AGNES_NO_UPDATE_CHECK silences the implicit warning loop, but
    explicit `agnes self-upgrade` must NOT be a silent no-op when set.
    Verify the callback passes bypass_disabled=True to check()."""
    with patch("cli.commands.self_upgrade.check", return_value=_current_info()) as mock_check:
        result = runner.invoke(app, ["self-upgrade", "--check-only"])
    assert result.exit_code == 0
    # bypass_disabled is keyword-only, so it must show up in call_args.kwargs.
    kwargs = mock_check.call_args.kwargs
    assert kwargs.get("bypass_disabled") is True


def test_quiet_does_not_suppress_install_failure_stderr():
    """--quiet suppresses progress but install/smoke failures always surface."""
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None):
        mock_run.return_value = MagicMock(returncode=42)
        result = runner.invoke(app, ["self-upgrade", "--quiet"], mix_stderr=False)
    assert result.exit_code == 1
    assert "install failed" in result.stderr


def test_smoke_fail_triggers_rollback_when_prior_url_known():
    """Broken new wheel: smoke fails, rollback to last-known-good URL, exit 1."""
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_fail()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=_PRIOR_URL), \
         patch("cli.commands.self_upgrade._record_last_known_good") as mock_record:
        mock_run.return_value = MagicMock(returncode=0)
        result = runner.invoke(app, ["self-upgrade"], mix_stderr=False)
    assert result.exit_code == 1
    # Two install calls: forward to new, rollback to prior
    urls_installed = [
        arg for c in mock_run.call_args_list
        for arg in c.args[0] if isinstance(arg, str) and arg.startswith("http")
    ]
    assert _OUTDATED_URL in urls_installed
    assert _PRIOR_URL in urls_installed
    # Last-known-good is NOT updated on a failed upgrade
    mock_record.assert_not_called()
    assert "smoke test" in result.stderr


def test_smoke_fail_with_rollback_failure_surfaces_rc():
    """Forward install ok, smoke fail, rollback ALSO fails:
    stderr must surface the rollback rc + bootstrap recovery command."""
    # First call: forward install (rc=0). Second call: rollback (rc=99).
    install_results = [MagicMock(returncode=0), MagicMock(returncode=99)]
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run", side_effect=install_results), \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_fail()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=_PRIOR_URL), \
         patch("cli.commands.self_upgrade.get_server_url",
               return_value="http://server.test"):
        result = runner.invoke(app, ["self-upgrade"], mix_stderr=False)
    assert result.exit_code == 1
    assert "rollback ALSO failed" in result.stderr
    assert "rc=99" in result.stderr
    assert "/cli/install.sh" in result.stderr  # bootstrap recovery


def test_smoke_fail_no_prior_url_prints_install_sh_recovery():
    """First-ever upgrade with no rollback target: stderr points at the
    canonical bootstrap path with a fully-formed curl command."""
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_fail()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
         patch("cli.commands.self_upgrade.get_server_url",
               return_value="http://server.test"):
        mock_run.return_value = MagicMock(returncode=0)
        result = runner.invoke(app, ["self-upgrade"], mix_stderr=False)
    assert result.exit_code == 1
    assert "/cli/install.sh" in result.stderr
    assert "server.test" in result.stderr  # actual server URL, not <placeholder>


def test_smoke_pass_records_last_known_good_then_invalidates_cache():
    """Convention: record before invalidate. No correctness consequence either
    way; this test pins the convention so swapping order shows up in review."""
    call_order = []
    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run") as mock_run, \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", return_value=_smoke_pass()), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
         patch("cli.commands.self_upgrade._record_last_known_good",
               side_effect=lambda url: call_order.append(("record", url))), \
         patch("cli.commands.self_upgrade._invalidate_update_cache",
               side_effect=lambda: call_order.append(("invalidate", None))):
        mock_run.return_value = MagicMock(returncode=0)
        result = runner.invoke(app, ["self-upgrade"])
    assert result.exit_code == 0
    record_idx = next(i for i, c in enumerate(call_order) if c[0] == "record")
    invalidate_idx = next(i for i, c in enumerate(call_order) if c[0] == "invalidate")
    assert record_idx < invalidate_idx, call_order
    assert call_order[record_idx] == ("record", _OUTDATED_URL)


def test_self_upgrade_propagates_sentinel_to_smoke_subprocess():
    """During the upgrade, AGNES_SELF_UPGRADE_IN_PROGRESS=1 must be in
    os.environ. The real smoke helper builds env={**os.environ, ...}, so the
    subprocess inherits it. Cleared in finally on callback exit. The test
    fakes _smoke_test_new_binary to capture the env inherited from os.environ,
    asserting both the sentinel propagation and the cleanup."""
    captured_envs = []

    def _fake_smoke(method, expected_version):
        # Do NOT hard-code the sentinel here: it must arrive via os.environ,
        # otherwise the assertion below passes even if the callback never
        # set it.
        env = {**os.environ, "AGNES_NO_UPDATE_CHECK": "1"}
        captured_envs.append(env)
        return _smoke_pass()

    with patch("cli.commands.self_upgrade.check", return_value=_outdated_info()), \
         patch("cli.commands.self_upgrade.shutil.which", return_value="/usr/local/bin/uv"), \
         patch("cli.commands.self_upgrade.subprocess.run",
               return_value=MagicMock(returncode=0)), \
         patch("cli.commands.self_upgrade._smoke_test_new_binary", side_effect=_fake_smoke), \
         patch("cli.commands.self_upgrade._read_last_known_good", return_value=None), \
         patch("cli.commands.self_upgrade._record_last_known_good"), \
         patch("cli.commands.self_upgrade._invalidate_update_cache"):
        result = runner.invoke(app, ["self-upgrade"])
    assert result.exit_code == 0
    assert captured_envs and captured_envs[0].get("AGNES_SELF_UPGRADE_IN_PROGRESS") == "1"
    # Cleared in finally
    assert os.environ.get("AGNES_SELF_UPGRADE_IN_PROGRESS") is None


@pytest.mark.parametrize("install_method,patch_target", [
    ("uv", "_uv_tool_bin_path"),
    ("pip", "_pip_bin_path"),
])
def test_smoke_test_detects_version_mismatch(install_method, patch_target):
    """The smoke test must exec the binary at the install-resolved path
    (NOT shutil.which) and compare its --version output via
    packaging.version.Version equality. A stale PATH-shadow returning the
    old version must FAIL the smoke. Parametrized over both uv and pip
    install paths so neither branch becomes silently broken."""
    from pathlib import Path
    from cli.commands import self_upgrade as su

    fake_bin = f"/fake/{install_method}/bin/agnes"
    with patch.object(su, patch_target, return_value=Path(fake_bin)), \
         patch.object(su.subprocess, "run") as mock_run:
        mock_run.return_value = MagicMock(returncode=0, stdout="agnes 0.30.0\n", stderr="")
        ok, detail = su._smoke_test_new_binary(install_method, expected_version="0.40.0")
    assert ok is False
    assert "version mismatch" in detail
    assert "0.40.0" in detail and "0.30.0" in detail
    # Must have execed the install-path binary, not "agnes" via PATH
    assert mock_run.call_args.args[0][0] == fake_bin


def test_smoke_test_passes_with_pep440_local_version():
    """Version comparison must be exact PEP 440 equality, not a substring
    match: a binary reporting the expected version passes, while a version
    that merely contains the expected string ('0.40.10' vs '0.40.1') fails.
    Note: a local segment such as '0.40.0+local.dev' compares unequal to
    '0.40.0' under Version equality, so it is not exercised here."""
    from pathlib import Path
    from cli.commands import self_upgrade as su

    with patch.object(su, "_uv_tool_bin_path", return_value=Path("/fake/agnes")), \
         patch.object(su.subprocess, "run") as mock_run:
        # Exact match: the binary reports the version the server expects.
        mock_run.return_value = MagicMock(returncode=0, stdout="agnes 0.40.0\n", stderr="")
        ok, _ = su._smoke_test_new_binary("uv", expected_version="0.40.0")
        assert ok is True
        # Reverse: "0.40.1" is a substring of "0.40.10" but must NOT pass.
        mock_run.return_value = MagicMock(returncode=0, stdout="agnes 0.40.10\n", stderr="")
        ok, detail = su._smoke_test_new_binary("uv", expected_version="0.40.1")
        assert ok is False
        assert "version mismatch" in detail
```
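
The substring pitfall that the last test guards against can be seen in isolation. The sketch below is illustrative only: `parse_release` and `reported_version` are hypothetical helpers, and `parse_release` is a deliberately simplified stand-in for `packaging.version.Version` that handles plain dotted release numbers and nothing else.

```python
def parse_release(text: str) -> tuple[int, ...]:
    """Simplified stand-in for packaging.version.Version (release segment only)."""
    return tuple(int(part) for part in text.split("."))


def reported_version(stdout: str) -> str:
    """Last whitespace-separated token of `agnes --version` style output."""
    tokens = stdout.strip().split()
    return tokens[-1] if tokens else ""


expected = "0.40.1"
stale_output = "agnes 0.40.10\n"

# Substring check: "0.40.1" occurs inside "0.40.10", so this wrongly passes.
substring_says_ok = expected in stale_output

# Parsed comparison: (0, 40, 10) != (0, 40, 1), so the mismatch is caught.
parsed_says_ok = parse_release(reported_version(stale_output)) == parse_release(expected)

print(substring_says_ok, parsed_says_ok)
```

The real helper should keep using `packaging.version.Version`, which also understands pre-release, post-release, and local segments that this stand-in ignores.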

- [ ] **Step 3.2: Run tests, verify they fail**

```bash
pytest tests/test_self_upgrade.py -v
```
Expected: FAIL, because the `cli.commands.self_upgrade` module does not exist yet.

- [ ] **Step 3.3: Create `cli/commands/self_upgrade.py`**

```python
"""`agnes self-upgrade`: pull the wheel from the server, reinstall, smoke-test,
roll back on failure.

Flow:
  1. Set AGNES_SELF_UPGRADE_IN_PROGRESS=1 (recursion barrier; see cli/client.py).
  2. If --force, invalidate update_check cache so we get fresh /cli/latest.
  3. Probe via update_check.check(..., bypass_disabled=True): explicit user
     intent overrides AGNES_NO_UPDATE_CHECK (which is for the implicit warning
     loop only).
  4. --force + offline ⇒ exit 1 with "cannot reach <server>". Without --force,
     offline is silent.
  5. If nothing to do (current, no download_url) → exit 0.
  6. Snapshot _read_last_known_good(): URL of the last verified-good install.
  7. Install via uv (preferred) or pip (sys.executable, no --user, --no-deps).
  8. Smoke-test the binary at the deterministic install path (NOT shutil.which,
     which can resolve a stale PATH shadow). Verify the --version output
     equals info.latest under PEP 440 Version comparison (not a substring
     match). Failure → rollback (capturing rc) → exit 1.
  9. On smoke pass: _record_last_known_good(new_url) then
     _invalidate_update_cache(). Convention; no correctness consequence either way.
 10. Sentinel cleared in finally.
"""

from __future__ import annotations

import json
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional, Union

import typer

from cli.config import _config_dir, get_server_url
from cli.update_check import UpdateInfo, check, format_outdated_notice

self_upgrade_app = typer.Typer(
    name="self-upgrade",
    help="Reinstall the CLI from the server's currently-shipped wheel.",
    invoke_without_command=True,
)

_SENTINEL_ENV = "AGNES_SELF_UPGRADE_IN_PROGRESS"


class _Unreachable:
    """Sentinel returned by _resolve_info when --force was specified but the
    server probe failed. Distinguishes 'explicitly requested an upgrade and
    we couldn't reach the server' (exit 1, stderr) from 'no upgrade needed'
    (exit 0, silent)."""


_UNREACHABLE = _Unreachable()


def _invalidate_update_cache() -> None:
    """Drop update_check.json so the next CLI invocation re-probes /cli/latest."""
    (_config_dir() / "update_check.json").unlink(missing_ok=True)


def _last_known_good_path() -> Path:
    return _config_dir() / "last_known_good.json"


def _read_last_known_good() -> Optional[str]:
    """URL of the last wheel that passed the smoke test on this machine.
    None on first ever upgrade: a first-run failure falls back to the bootstrap
    install.sh recovery message rather than a rollback."""
    p = _last_known_good_path()
    if not p.exists():
        return None
    try:
        return json.loads(p.read_text(encoding="utf-8")).get("download_url")
    except (OSError, json.JSONDecodeError):
        return None


def _record_last_known_good(download_url: str) -> None:
    p = _last_known_good_path()
    try:
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(json.dumps({"download_url": download_url}), encoding="utf-8")
    except OSError:
        pass  # best-effort: failure to record must not break the flow


def _uv_tool_bin_path() -> Optional[Path]:
    """Locate the agnes shim uv installed.

    Tries `uv tool dir --bin` first (uv >= 0.5 prints the entrypoint shim
    directory directly). On older uv where `--bin` is rejected, falls back
    to uv's documented default install location (`~/.local/bin/` on POSIX,
    `%APPDATA%\\uv\\tools\\bin\\` on Windows). Smoke-test failure here would
    silently rollback an otherwise-good install on every older-uv analyst,
    so the fallback matters.
    """
    bin_dir: Optional[Path] = None
    try:
        out = subprocess.run(
            ["uv", "tool", "dir", "--bin"], capture_output=True, text=True, timeout=5,
        )
        if out.returncode == 0:
            bin_dir = Path(out.stdout.strip())
    except (OSError, subprocess.TimeoutExpired):
        bin_dir = None

    if bin_dir is None:
        # Fallback: uv's documented default install location.
        if sys.platform == "win32":
            appdata = os.environ.get("APPDATA")
            if appdata:
                bin_dir = Path(appdata) / "uv" / "tools" / "bin"
        else:
            bin_dir = Path.home() / ".local" / "bin"

    if bin_dir is None or not bin_dir.exists():
        return None

    # uv emits `agnes.exe` on Windows and `agnes` on POSIX; check both.
    for name in ("agnes.exe", "agnes"):
        candidate = bin_dir / name
        if candidate.exists():
            return candidate
    return None


def _pip_bin_path() -> Optional[Path]:
    """`<venv>/bin/agnes` (POSIX) or `<venv>\\Scripts\\agnes.exe` (Windows)."""
    parent = Path(sys.executable).parent
    name = "agnes.exe" if sys.platform == "win32" else "agnes"
    candidate = parent / name
    return candidate if candidate.exists() else None


def _install_with_uv(download_url: str, *, quiet: bool) -> int:
    out = subprocess.DEVNULL if quiet else None
    return subprocess.run(
        ["uv", "tool", "install", "--force", download_url], stdout=out
    ).returncode


def _install_with_pip(download_url: str, *, quiet: bool) -> int:
    """Install into the SAME interpreter that's running this command.

    sys.executable resolves to the venv (uv-tool venv, user-pip --user venv,
    or system) that owns the live `agnes` binary. Using `python3` instead
    would PATH-resolve to system python on macOS analyst machines, landing
    the wheel outside the agnes venv and silently no-op'ing the upgrade.
    --user is wrong here: inside a uv-tool venv it targets ~/.local outside
    the venv. Drop it.
    """
    out = subprocess.DEVNULL if quiet else None
    with tempfile.TemporaryDirectory(prefix="agnes_cli.") as td:
        wheel_path = Path(td) / "agnes.whl"
        rc = subprocess.run(
            ["curl", "-fsSL", "-o", str(wheel_path), download_url], stdout=out
        ).returncode
        if rc != 0:
            return rc
        return subprocess.run(
            [sys.executable, "-m", "pip", "install",
             "--force-reinstall", "--no-deps", str(wheel_path)],
            stdout=out,
        ).returncode


def _smoke_test_new_binary(install_method: str, expected_version: str) -> tuple[bool, str]:
    """Exec `<install-path>/agnes --version` from a fresh subprocess, confirm
    it boots AND reports the expected version.

    Resolves the binary at the install-method-specific path (uv tool dir /
    sys.executable parent) rather than via PATH, which defends against a stale
    shadow ahead of the freshly-installed binary in $PATH. Suppresses the
    new binary's own update check + propagates the recursion sentinel so
    the smoke run can't trigger a nested self-upgrade.
    """
    binary = _uv_tool_bin_path() if install_method == "uv" else _pip_bin_path()
    if binary is None:
        return False, f"agnes binary not found at expected {install_method} install path"
    try:
        env = {**os.environ, "AGNES_NO_UPDATE_CHECK": "1", _SENTINEL_ENV: "1"}
        out = subprocess.run(
            [str(binary), "--version"],
            capture_output=True, text=True, timeout=10, env=env,
        )
        if out.returncode != 0:
            return False, f"exit {out.returncode}: {out.stderr.strip()[:200]}"
        # `agnes --version` prints `agnes <version>`. Extract and compare
        # via packaging.version.Version (PEP 440-aware) to avoid substring
        # false-positives like "0.40.1" matching inside "0.40.10".
        from packaging.version import InvalidVersion, Version
        tokens = out.stdout.strip().split()
        actual_str = tokens[-1] if tokens else ""
        try:
            if Version(actual_str) != Version(expected_version):
                return False, (
                    f"version mismatch: expected {expected_version}, "
                    f"got {actual_str}"
                )
        except InvalidVersion:
            return False, f"unparseable version output: {out.stdout.strip()[:80]}"
        return True, out.stdout.strip()
    except (subprocess.TimeoutExpired, OSError) as e:
        return False, f"{type(e).__name__}: {e}"


def _resolve_info(force: bool) -> Union[UpdateInfo, _Unreachable, None]:
    """Returns:
      UpdateInfo    install this wheel
      _UNREACHABLE  --force specified, server probe failed
      None          nothing to do (current, or offline without --force)
    """
    if force:
        _invalidate_update_cache()
    # bypass_disabled=True so an explicit `agnes self-upgrade` is not silenced
    # by AGNES_NO_UPDATE_CHECK (which exists for the implicit warning loop).
    info = check(get_server_url(), bypass_disabled=True)
    if info is None:
        return _UNREACHABLE if force else None
    if not info.download_url:
        return None
    if not force and not info.is_outdated():
        return None
    return info


def _do_install_with_smoke_and_rollback(
    info: UpdateInfo, *, quiet: bool
) -> int:
    """Returns the exit code typer should use (0 success, 1 failure)."""
    prior_url = _read_last_known_good()  # may be None on first upgrade

    if shutil.which("uv"):
        rc = _install_with_uv(info.download_url, quiet=quiet)
        method = "uv"
    else:
        rc = _install_with_pip(info.download_url, quiet=quiet)
        method = "pip"

    if rc != 0:
        sys.stderr.write(f"agnes self-upgrade: install failed with exit {rc}\n")
        return 1

    ok, detail = _smoke_test_new_binary(method, expected_version=info.latest)
    if not ok:
        sys.stderr.write(
            f"agnes self-upgrade: new binary failed smoke test ({detail}).\n"
        )
        server = get_server_url().rstrip("/")
        bootstrap_recovery = f" Manual recovery: curl -fsSL {server}/cli/install.sh | bash\n"
        if prior_url and prior_url != info.download_url:
            sys.stderr.write(f" rolling back to {prior_url}\n")
            rb_rc = (
                _install_with_uv(prior_url, quiet=True)
                if method == "uv"
                else _install_with_pip(prior_url, quiet=True)
            )
            if rb_rc != 0:
                sys.stderr.write(
                    f" rollback ALSO failed (rc={rb_rc}); CLI is in a broken state.\n"
                )
                sys.stderr.write(bootstrap_recovery)
        else:
            sys.stderr.write(
                " no prior wheel URL on record; rollback skipped.\n"
            )
            sys.stderr.write(bootstrap_recovery)
        return 1

    # Convention: record then invalidate. No correctness consequence either way.
    _record_last_known_good(info.download_url)
    _invalidate_update_cache()
    if not quiet:
        typer.echo(f"agnes self-upgrade: installed {info.latest}", err=True)
    return 0


@self_upgrade_app.callback()
def self_upgrade(
    quiet: bool = typer.Option(False, "--quiet", help="Suppress progress output. Failures still surface on stderr."),
    check_only: bool = typer.Option(False, "--check-only", help="Print status, don't install. Exit 1 if outdated."),
    force: bool = typer.Option(False, "--force", help="Reinstall the server's current wheel even when already on the latest version."),
) -> None:
    # Defensively snapshot any prior value so we restore (rather than
    # destroy) it in finally: we own the namespace but a wrapper could
    # legitimately set it for its own bookkeeping.
    prior_sentinel = os.environ.get(_SENTINEL_ENV)
    os.environ[_SENTINEL_ENV] = "1"
    try:
        info = _resolve_info(force)

        # --check-only is read-only intent: never exit non-zero on
        # transport errors. If unreachable, treat as "can't tell, current"
        # and exit 0 silently. (Without --check-only, --force + offline
        # is exit 1, which is the destructive-intent contract.)
        if check_only:
            if isinstance(info, _Unreachable) or info is None or not info.is_outdated():
                raise typer.Exit(0)
            typer.echo(format_outdated_notice(info), err=True)
            raise typer.Exit(1)

        if isinstance(info, _Unreachable):
            sys.stderr.write(
                f"agnes self-upgrade: cannot reach {get_server_url()}/cli/latest\n"
            )
            raise typer.Exit(1)

        if info is None:
            raise typer.Exit(0)  # nothing to do, silent

        rc = _do_install_with_smoke_and_rollback(info, quiet=quiet)
        raise typer.Exit(rc)
    finally:
        if prior_sentinel is None:
            os.environ.pop(_SENTINEL_ENV, None)
        else:
            os.environ[_SENTINEL_ENV] = prior_sentinel
```
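
The snapshot-and-restore handling of the sentinel in the callback can also be expressed as a context manager. This is a minimal sketch, not part of the plan's module: `env_sentinel` and the variable name `EXAMPLE_UPGRADE_IN_PROGRESS` are hypothetical, chosen so the example does not touch the real `AGNES_SELF_UPGRADE_IN_PROGRESS`.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def env_sentinel(name: str, value: str = "1") -> Iterator[None]:
    """Set an env var for the duration of the block, then restore the prior state."""
    prior = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if prior is None:
            os.environ.pop(name, None)   # was unset before: unset again
        else:
            os.environ[name] = prior     # was set by someone else: put it back


SENTINEL = "EXAMPLE_UPGRADE_IN_PROGRESS"  # hypothetical name for illustration

# Case 1: variable not set beforehand, so it is removed afterwards.
os.environ.pop(SENTINEL, None)
with env_sentinel(SENTINEL):
    inside = os.environ.get(SENTINEL)
after_unset = os.environ.get(SENTINEL)

# Case 2: a wrapper already set it, so the wrapper's value survives.
os.environ[SENTINEL] = "wrapper"
with env_sentinel(SENTINEL):
    pass
after_preset = os.environ.get(SENTINEL)
os.environ.pop(SENTINEL, None)
```

Restoring rather than unconditionally deleting matters only in the second case, which is the wrapper scenario the callback's comment describes.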

- [ ] **Step 3.4: Register in `cli/main.py`**

After the existing `from cli.commands.X import Y_app` block, add:

```python
from cli.commands.self_upgrade import self_upgrade_app
```

In the `app.add_typer(...)` block (around lines 109-127), add:

```python
app.add_typer(self_upgrade_app, name="self-upgrade")
```

Place it near `app.add_typer(setup_app, name="setup")` for grouping.

- [ ] **Step 3.5: Run tests, verify they pass**

```bash
pytest tests/test_self_upgrade.py -v
```
Expected: PASS for every test in `tests/test_self_upgrade.py`.

- [ ] **Step 3.6: Smoke-test the command shape locally**

```bash
agnes self-upgrade --help
```
Expected: typer help text with `--quiet`, `--check-only`, `--force` flags.

- [ ] **Step 3.7: Commit**

|
```bash
|
|
git add cli/update_check.py cli/commands/self_upgrade.py cli/main.py \
|
|
tests/test_self_upgrade.py tests/test_cli_update_check.py
|
|
git commit -m "feat(cli): add agnes self-upgrade with smoke test + rollback
|
|
|
|
Reuses cli.update_check.check() for the version probe — extended with
|
|
bypass_disabled=True so explicit user-typed self-upgrade is not silenced
|
|
by AGNES_NO_UPDATE_CHECK (which is for the implicit warning loop).
|
|
|
|
Install path: uv tool install --force when uv is on PATH; otherwise
|
|
curl + pip via sys.executable (NOT system python3, NOT --user — both
|
|
would land outside the agnes venv and silently no-op the upgrade).
|
|
|
|
Smoke test execs the binary at the install-resolved path (uv tool dir
|
|
joined with agnes-the-ai-analyst/bin/agnes, or sys.executable's sibling
|
|
agnes for pip) — never via shutil.which, which can resolve a stale shadow
|
|
on PATH and produce a false-positive smoke pass on the OLD version. Smoke
|
|
also asserts --version output contains info.latest.
|
|
|
|
On smoke fail: rollback to last_known_good.json (written only after a
|
|
previous run's smoke passed). Rollback rc is captured and surfaced on
|
|
stderr if it also fails. First-ever upgrade or unrecoverable rollback
|
|
prints the canonical bootstrap recovery: curl -fsSL <your-agnes-server>/cli/install.sh | bash.
|
|
|
|
AGNES_SELF_UPGRADE_IN_PROGRESS=1 is set for the duration of the run
|
|
and propagated to the smoke-test subprocess. Layer B's _check_version_headers
|
|
honors the sentinel and skips the < min hard-stop, so an in-flight
|
|
upgrade can never sys.exit(2) itself.
|
|
|
|
--force invalidates the update_check cache BEFORE probing. --force +
|
|
offline = exit 1 with explicit stderr (without --force, offline is silent).
|
|
--quiet suppresses progress output but never gags failure stderr."
|
|
```
|
|
|
|
---
|
|
|
|
## Task 4: SessionStart hook (single chained entry)
|
|
|
|
**Why one entry, not two:** Claude Code's hook execution semantics for multiple SessionStart entries (parallel? sequential? bounded?) are not documented in this repo and are not relied upon. Chain in a single entry with `;` so the shell guarantees ordering: self-upgrade first, pull second, regardless of host. Each segment carries its own `|| true`, so a failed upgrade does not abort the pull.
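
The ordering and failure-isolation claims above can be checked without Claude Code at all. The sketch below is illustrative and assumes a POSIX `sh` on PATH; it substitutes `false` for a failing `agnes self-upgrade` and `echo pull-ran` for `agnes pull`, so neither real command is invoked.

```python
import subprocess

# `false` stands in for a failing self-upgrade; `echo` stands in for the pull.
chained = "false 2>/dev/null || true; echo pull-ran"
result = subprocess.run(["sh", "-c", chained], capture_output=True, text=True)

# For contrast: chaining with && lets the first failure skip the second command.
and_chained = "false && echo pull-ran"
contrast = subprocess.run(["sh", "-c", and_chained], capture_output=True, text=True)

print(result.returncode, repr(result.stdout))
print(contrast.returncode, repr(contrast.stdout))
```

With `;` plus `|| true` the second segment runs and the line exits 0. With `&&` the second segment never runs and the line exits non-zero, which is the behaviour the chained entry is designed to avoid.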

**Files:**
- Modify: `cli/lib/hooks.py`
- Modify: `tests/test_lib_hooks.py`

- [ ] **Step 4.1: Write the failing hook-installer test**

Append to `tests/test_lib_hooks.py`:

```python
def test_install_chains_self_upgrade_then_pull_in_one_entry(tmp_path):
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    session_start = cfg["hooks"]["SessionStart"]
    assert len(session_start) == 1, session_start
    cmd = session_start[0]["hooks"][0]["command"]
    assert "agnes self-upgrade --quiet" in cmd
    assert "agnes pull --quiet" in cmd
    # Order is encoded in the shell: self-upgrade must appear first
    assert cmd.index("agnes self-upgrade") < cmd.index("agnes pull")
    # Both segments carry || true so neither failure aborts the line
    assert cmd.count("|| true") >= 2


def test_install_idempotent_chained_entry(tmp_path):
    install_claude_hooks(tmp_path)
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    assert len(cfg["hooks"]["SessionStart"]) == 1
    assert len(cfg["hooks"]["SessionEnd"]) == 1
```

The existing `test_install_creates_settings_file` (around line 14) currently asserts `[0]` is the lone pull entry. Update it to assert the chained command:

```python
def test_install_creates_settings_file(tmp_path):
    install_claude_hooks(tmp_path)
    cfg = _read_settings(tmp_path)
    cmd = cfg["hooks"]["SessionStart"][0]["hooks"][0]["command"]
    assert "agnes self-upgrade --quiet" in cmd
    assert "agnes pull --quiet" in cmd
    assert "agnes push --quiet" in cfg["hooks"]["SessionEnd"][0]["hooks"][0]["command"]
```

The existing `test_install_idempotent` already asserts `len(SessionStart) == 1`. Leave it as-is; that is still correct under the chained-entry design.

- [ ] **Step 4.2: Run tests, verify they fail**

```bash
pytest tests/test_lib_hooks.py -v
```
Expected: FAIL. The chained-entry tests fail because the lone pull command does not contain `self-upgrade`.

- [ ] **Step 4.3: Modify `cli/lib/hooks.py`**

Update `_OUR_COMMAND_MARKERS` (line 27) to include `self-upgrade` so the substring match still recognises our line for idempotent replacement:

```python
_OUR_COMMAND_MARKERS = ("agnes self-upgrade", "agnes pull", "agnes push", "da sync")
```

Replace the SessionStart registration (around line 63) with a single chained command:

```python
_replace_or_add(
    "SessionStart",
    "agnes self-upgrade --quiet 2>/dev/null || true; "
    "agnes pull --quiet 2>/dev/null || true",
)
_replace_or_add("SessionEnd", "agnes push --quiet 2>/dev/null || true")
```

The `;` runs the second command unconditionally; each `|| true` prevents either failure from aborting the line. Idempotency: re-running `install_claude_hooks` matches the existing entry on either `agnes self-upgrade` or `agnes pull` (both substrings present), drops it, and re-appends — net length stays at 1.
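
The drop-and-re-append idempotency described above can be sketched in isolation. This is not the code in `cli/lib/hooks.py`: `replace_or_add` here is a hypothetical standalone function, and the entry shape (`{"hooks": [{"type": "command", "command": ...}]}`) is an assumption inferred from how the tests index into `settings.json`.

```python
MARKERS = ("agnes self-upgrade", "agnes pull", "agnes push", "da sync")

CHAINED = (
    "agnes self-upgrade --quiet 2>/dev/null || true; "
    "agnes pull --quiet 2>/dev/null || true"
)


def is_ours(entry: dict) -> bool:
    """True when any hook command in the entry contains one of our markers."""
    return any(
        marker in hook.get("command", "")
        for hook in entry.get("hooks", [])
        for marker in MARKERS
    )


def replace_or_add(entries: list[dict], command: str) -> list[dict]:
    """Drop every entry we own, keep foreign ones, append the new command."""
    kept = [entry for entry in entries if not is_ours(entry)]
    kept.append({"hooks": [{"type": "command", "command": command}]})
    return kept


# A legacy lone-pull entry plus an unrelated user hook.
session_start = [
    {"hooks": [{"type": "command", "command": "agnes pull --quiet 2>/dev/null || true"}]},
    {"hooks": [{"type": "command", "command": "echo hello"}]},
]

once = replace_or_add(session_start, CHAINED)
twice = replace_or_add(once, CHAINED)
ours = [entry for entry in twice if is_ours(entry)]
```

Running the function twice leaves exactly one entry of ours and never touches the user's `echo hello` hook, which is the property `test_install_idempotent_chained_entry` pins.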

- [ ] **Step 4.4: Run tests, verify they pass**

```bash
pytest tests/test_lib_hooks.py -v
```
Expected: PASS for all hook tests, including the new chained-entry assertions and idempotency.

- [ ] **Step 4.5: Commit**

```bash
git add cli/lib/hooks.py tests/test_lib_hooks.py
git commit -m "feat(cli): install SessionStart hook chaining self-upgrade then pull

Single hook entry: 'agnes self-upgrade --quiet ... || true; agnes pull
--quiet ... || true'. Shell semicolon guarantees ordering across every
Claude Code version (no reliance on undocumented multi-hook execution
semantics); each segment's || true preserves the original property
that an upgrade failure does not abort the pull."
```

---

## Task 5: Drive-by `da` → `agnes` cleanup + CHANGELOG

**Files:**
- Modify: `app/api/cli_artifacts.py`
- Modify: `cli/update_check.py`
- Modify: `CHANGELOG.md`

- [ ] **Step 5.1: Fix `da` references**

In `app/api/cli_artifacts.py:47`, replace:

```
Consumed by `da` CLI's auto-update check so it can warn when a newer
```

with:

```
Consumed by `agnes` CLI's auto-update check so it can warn when a newer
```

In `cli/update_check.py:1-9`, replace the four `da` occurrences in the docstring with `agnes`:

```python
"""Auto-check for a newer CLI version on the configured server.

Runs in the root typer callback before subcommand dispatch. Failure is
silent — we never block a working `agnes` command on a best-effort version
probe. Result is cached in `$AGNES_CONFIG_DIR/update_check.json` for 24h so
we don't hammer the server on every invocation.

Disable with `AGNES_NO_UPDATE_CHECK=1`.
"""
```

Also fix the `da` reference in the negative-cache comment around line 26:

```python
_NEGATIVE_CACHE_TTL_SECONDS = 5 * 60  # 5min on a failed probe, to avoid
# re-probing 3s of silence (drop-packet networks: corporate firewall, VPN)
# on every `agnes` invocation.
```

- [ ] **Step 5.2: Add CHANGELOG entry**

Open `CHANGELOG.md`. After rebasing on `origin/main`, the file's structure at the top is:

```
line 11: ## [Unreleased]
line 12: (blank)
line 13: ## [0.39.0] — 2026-05-06
line 15: ### Performance
...
```

The `## [Unreleased]` block is empty. Insert `### Added` and the three bullets directly between line 11 and line 13:

```markdown
## [Unreleased]

### Added

- CLI auto-upgrade: ...
- Server: ...
- CLI: ...

## [0.39.0] — 2026-05-06
```

The full text of the three bullets:

```markdown
- CLI auto-upgrade: `agnes self-upgrade` reinstalls the CLI from the server's currently-shipped wheel via `uv tool install --force`, falling back to `pip install --force-reinstall --no-deps` via `sys.executable` when uv is not on PATH. After install, the new binary is smoke-tested at the install-resolved path (`uv tool dir --bin` for uv, `<sys.executable parent>/agnes` for pip), never via PATH lookup, to avoid stale-shadow false positives. Smoke failure triggers automatic rollback to the previously verified-good wheel (recorded in `~/.config/agnes/last_known_good.json`); rollback's exit code is captured and surfaced on stderr if it also fails. First-ever upgrade or unrecoverable rollback prints the canonical bootstrap recovery: `curl -fsSL <your-agnes-server>/cli/install.sh | bash`. The new command is wired into the SessionStart hook installed by `agnes init` as a chained shell entry (`agnes self-upgrade … || true; agnes pull … || true`) so an upgrade failure does not block the pull.
- Server: `/api/*` responses now carry `X-Agnes-Latest-Version` and `X-Agnes-Min-Version` headers. CLIs older than `X-Agnes-Min-Version` exit with **code 2** and a remediation message instead of failing on a wire-protocol mismatch. Day-one floor is `0.0.0` (no enforcement); bump `MIN_COMPAT_CLI_VERSION` in `app/version.py` in the same PR that ships a deliberate wire break.
- CLI: `cli/update_check.py:check()` accepts a keyword-only `bypass_disabled=True` so explicit `agnes self-upgrade` invocations probe `/cli/latest` even when `AGNES_NO_UPDATE_CHECK=1` is set (which silences the implicit warning loop only).
```

- [ ] **Step 5.3: Run the full affected test surface**

```bash
pytest tests/test_app_version.py tests/test_version_headers_middleware.py \
       tests/test_cli_update_check.py tests/test_client_version_check.py \
       tests/test_self_upgrade.py tests/test_lib_hooks.py \
       tests/test_cli_init.py -v
```
Expected: PASS, full green.

- [ ] **Step 5.4: Commit**

```bash
git add app/api/cli_artifacts.py cli/update_check.py CHANGELOG.md
git commit -m "chore: rename stale 'da' references to 'agnes' + CHANGELOG

Drive-by docstring/comment cleanup in cli_artifacts.py and update_check.py.
CHANGELOG entry for the auto-upgrade feature shipped in this branch."
```

---

## Task 6: Manual verification
|
|
|
|
- [ ] **Step 6.1: Local smoke test — version mismatch hard-stop**
|
|
|
|
Start the server locally:
|
|
|
|
```bash
|
|
cd /path/to/agnes
|
|
uvicorn app.main:app --reload &
|
|
SERVER_PID=$!
|
|
```
|
|
|
|
Force a min-version mismatch by patching `app/version.py`:
|
|
|
|
```bash
|
|
sed -i.bak 's/MIN_COMPAT_CLI_VERSION = "0.0.0"/MIN_COMPAT_CLI_VERSION = "99.99.99"/' app/version.py
|
|
```
|
|
|
|
Wait for the reload, then hit any `/api/*` endpoint with the CLI:
|
|
|
|
```bash
|
|
agnes status
|
|
```
|
|
|
|
Expected: stderr `error: agnes <local> is incompatible with server <ver> (min required: 99.99.99). Run: agnes self-upgrade`, exit code 2.
|
|
|
|
Restore:
|
|
|
|
```bash
|
|
mv app/version.py.bak app/version.py
|
|
kill $SERVER_PID
|
|
```
|
|
|
|
- [ ] **Step 6.2: Local smoke test — `agnes self-upgrade --check-only`**
|
|
|
|
```bash
|
|
agnes self-upgrade --check-only
|
|
```
|
|
|
|
Expected: exit 0 (current) or exit 1 with `[update] agnes ... out of date ...` on stderr (depends on what version is on disk vs. served).
|
|
|
|
- [ ] **Step 6.3: Verify hook installation**

In a clean tmp workspace:

```bash
mkdir /tmp/agnes-hook-smoke && cd /tmp/agnes-hook-smoke
agnes init
jq '.hooks.SessionStart' .claude/settings.json
```

Expected: two entries — `agnes self-upgrade --quiet ...` and `agnes pull --quiet ...` in that order.

Re-run:

```bash
agnes init
jq '.hooks.SessionStart | length' .claude/settings.json
```

Expected: `2` (not `4`) — idempotent.

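The two expectations in this step (correct order, no duplicates after a re-run) can also be checked programmatically. The sketch below assumes the SessionStart entries have already been flattened into a list of command strings. The real layout of `.claude/settings.json` is not shown in this section, so that flattening and the helper name `hooks_look_right` are hypothetical.

```python
def hooks_look_right(commands: list[str]) -> bool:
    """Check the Step 6.3 expectations on a flat list of hook commands.

    Assumption: each SessionStart entry has been reduced to one command
    string; how to extract those strings depends on the settings layout.
    """
    upgrade = [i for i, c in enumerate(commands) if "agnes self-upgrade" in c]
    pull = [i for i, c in enumerate(commands) if "agnes pull" in c]
    # Exactly one of each (idempotent), and self-upgrade must run first.
    return len(upgrade) == 1 and len(pull) == 1 and upgrade[0] < pull[0]
```

A list with the upgrade command first and the pull command second passes. A reversed list fails, and so does a list where a second `agnes init` appended both commands again.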
- [ ] **Step 6.4: Open the PR**

```bash
git push -u origin zs/cli-auto-upgrade-spec
gh pr create --title "feat: server-pinned CLI auto-upgrade" --body "$(cat <<'EOF'
## Summary
- `agnes self-upgrade` reinstalls the CLI from `/cli/wheel/<name>` (uv tool install --force; the fallback is pip run via sys.executable with --no-deps, not --user). Reuses cli.update_check.check() — single polling path, single cache.
- SessionStart hook installs the upgrade ahead of `agnes pull`, so analyst CLIs stay current with the server they connect to.
- /api/* responses carry X-Agnes-Latest-Version / X-Agnes-Min-Version headers. CLIs below min exit 2 with a remediation message instead of failing on a wire-protocol mismatch.
- Drive-by: stale `da` references renamed to `agnes` in cli_artifacts.py and update_check.py docstrings.

## Spec / plan
- Spec: `docs/superpowers/specs/2026-05-06-cli-auto-upgrade-spec.md`
- Plan: `docs/superpowers/plans/2026-05-06-cli-auto-upgrade.md`

## Test plan
- [x] `pytest tests/test_version_headers_middleware.py` — middleware applied to /api/*, not /web/*
- [x] `pytest tests/test_client_version_check.py` — hard-stop on min mismatch
- [x] `pytest tests/test_self_upgrade.py` — uv path, pip fallback, --check-only, --force, --quiet
- [x] `pytest tests/test_lib_hooks.py` — new entry + idempotency
- [ ] Manual: spoof `MIN_COMPAT_CLI_VERSION="99.99.99"` server-side, verify CLI exits 2
- [ ] Manual: fresh `agnes init` workspace shows two SessionStart entries in correct order
EOF
)"
```

---

## Task 7: Release-cut (last commits on this PR)

**Why now:** per CLAUDE.md changelog discipline + project convention, the version bump and `[Unreleased]` rename land on the same PR as the user-visible behavior change. This task converts the in-flight CHANGELOG entry into a versioned release.

**Files:**
- Modify: `CHANGELOG.md` — rename topmost `## [Unreleased]` to `## [0.40.0] — 2026-05-06`, then add a fresh empty `## [Unreleased]` heading above it for the next PR.
- Modify: `pyproject.toml` — bump `[project].version` from `0.39.0` to `0.40.0` (additive feature → minor bump).

- [ ] **Step 7.1: Rename `## [Unreleased]` → `## [0.40.0] — 2026-05-06`**

In `CHANGELOG.md`, locate the topmost `## [Unreleased]` heading. Rename it to `## [0.40.0] — 2026-05-06`. Above it, insert a new empty `## [Unreleased]` block so the next PR has somewhere to land:

```markdown
## [Unreleased]

## [0.40.0] — 2026-05-06

### Added
- CLI auto-upgrade: ... (existing entries from Task 5)
- Server: `/api/*` responses now carry ... (existing entries from Task 5)
```

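The rename-and-reinsert edit is mechanical enough to script if hand-editing feels error-prone. The following is a minimal sketch, not part of the plan's required steps. The function name `cut_release` is hypothetical, and it assumes the heading appears on its own line exactly as `## [Unreleased]`.

```python
def cut_release(changelog: str, version: str, date: str) -> str:
    """Turn the topmost '## [Unreleased]' into a versioned heading.

    A fresh, empty '## [Unreleased]' heading is placed above the new
    versioned heading so the next PR has somewhere to land.
    """
    lines = changelog.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "## [Unreleased]":
            # Replace the single heading line with three lines.
            lines[i : i + 1] = [
                "## [Unreleased]",
                "",
                f"## [{version}] — {date}",
            ]
            return "\n".join(lines) + "\n"
    raise ValueError("no '## [Unreleased]' heading found")
```

Calling `cut_release(text, "0.40.0", "2026-05-06")` on the current file contents leaves every existing entry in place under the new versioned heading. Only the first match is rewritten, so older sections are untouched.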
- [ ] **Step 7.2: Bump `pyproject.toml` version**

```bash
sed -i.bak 's/^version = "0.39.0"/version = "0.40.0"/' pyproject.toml && rm pyproject.toml.bak
```

Verify:

```bash
grep '^version = ' pyproject.toml
```

Expected output: `version = "0.40.0"`

- [ ] **Step 7.3: Commit**

```bash
git add CHANGELOG.md pyproject.toml
git commit -m "release: 0.40.0 — server-pinned CLI auto-upgrade

See CHANGELOG.md for the full entry."
```

- [ ] **Step 7.4: Tag + GitHub Release (after PR merge)**

After the PR merges to `main`, tag the merge commit by its SHA rather than tagging whatever `main` currently points at. If an unrelated PR merges before the operator runs the tag commands, tagging the branch tip would put `v0.40.0` on the wrong commit:

```bash
PR_NUM=<this-PR-number>  # placeholder: substitute the merged PR's number
MERGE_SHA=$(gh pr view "$PR_NUM" --json mergeCommit -q .mergeCommit.oid)
git fetch origin
git tag v0.40.0 "$MERGE_SHA"
git push origin v0.40.0
```

Then create a GitHub Release for `v0.40.0`. Mirror the prose structure of the most recent prior release on the same repo (`gh release view v0.39.0` for the latest format) — typically an intro paragraph, the CHANGELOG section verbatim, and any operator-facing notes (e.g. *"this release introduces SessionStart hook behavior; expect a one-time `agnes self-upgrade` install on the first session per analyst"*).

```bash
gh release create v0.40.0 --target "$MERGE_SHA" --title "v0.40.0 — server-pinned CLI auto-upgrade" --notes "$(...)"
```

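Since the release notes include the CHANGELOG section verbatim, it may help to extract that section with a script instead of copying it by hand. This is a sketch under stated assumptions: the function name `release_section` is hypothetical, and it assumes each release starts with a level-2 heading of the form `## [<version>]`.

```python
def release_section(changelog: str, version: str) -> str:
    """Return the body of the '## [<version>]' section, without its heading."""
    lines = changelog.splitlines()
    prefix = f"## [{version}]"
    start = next(
        (i for i, line in enumerate(lines) if line.startswith(prefix)), None
    )
    if start is None:
        raise ValueError(f"no section for version {version}")
    # The section ends at the next level-2 heading, or at end of file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    return "\n".join(lines[start + 1 : end]).strip()
```

The returned text can be combined with the intro paragraph and operator notes, written to a file, and passed to `gh release create` with `--notes-file` in place of the inline `--notes` argument.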
(Per user memory: a git tag without a GitHub Release is incomplete.)

---

## Self-Review Checklist (run before declaring complete)

- [ ] Spec coverage: every section of the spec maps to a task above.
- [ ] Placeholder scan: no "TBD" / "fill in later" / "similar to Task N" without inline code.
- [ ] Type/name consistency: `APP_VERSION`, `MIN_COMPAT_CLI_VERSION`, `X-Agnes-Latest-Version`, `X-Agnes-Min-Version`, `_check_version_headers`, `self_upgrade_app`, `_invalidate_update_cache`, `_install_with_uv`, `_install_with_pip`, `_smoke_test_new_binary`, `_uv_tool_bin_path`, `_pip_bin_path`, `_Unreachable`, `_UNREACHABLE`, `_read_last_known_good`, `_record_last_known_good`, `bypass_disabled` — used identically across tasks.
- [ ] CHANGELOG entry exists under `## [Unreleased]` (Task 5), then renamed to `## [0.40.0] — 2026-05-06` (Task 7).
- [ ] CLAUDE.md "OSS — no customer-specific content" rule respected: no Keboola/Groupon/FoundryAI tokens in code or PR body.
- [ ] Each task ends with a real commit. No squash-everything-at-end.
- [ ] Layer B is shipped at `MIN_COMPAT_CLI_VERSION = "0.0.0"` — no enforcement on day one. The bump-when-needed policy is review-time discipline, not a CI gate (rejected during spec iteration as theater).