Drop legacy sslVerify=false fallback from install setup prompt (#238)

The marketplace step (step 5) emitted `git config --global
http.<host>/.sslVerify false` on AGNES_DEBUG_AUTH=1 instances when no
ca_pem was readable from AGNES_TLS_FULLCHAIN_PATH. Two problems:

1. Claude Code auto-mode classifiers ("do not disable TLS verification"
   rule) block the line, breaking hands-free setup.
2. It silently masked operator misconfiguration — a debug-auth instance
   without a fullchain on disk fell through to a TLS-disabled clone
   instead of surfacing the missing cert.

After the cross-platform trust block (#137), self-signed and private-CA
servers are fully covered by step 0 reading the fullchain via
_read_agnes_ca_pem; publicly-trusted certs need no bootstrap at all.
The legacy fallback no longer covers a real scenario — verified by
running step 5 without sslVerify=false against a self-signed instance.

BREAKING: drops `self_signed_tls` parameter from
app.web.setup_instructions.resolve_lines and render_setup_instructions
(only consumed by the deleted block). The AGNES_DEBUG_AUTH env var
itself is unchanged — still gates /api/me_debug and the dropdown link.

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
This commit is contained in:
minasarustamyan 2026-05-09 20:10:01 +02:00 committed by GitHub
parent 6fe67d5279
commit d269c69359
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 64 additions and 133 deletions

View file

@ -10,6 +10,26 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
## [Unreleased] ## [Unreleased]
### Removed
- **BREAKING: legacy `git config --global http.<host>.sslVerify=false`
downgrade in the install setup prompt.** The marketplace step (step 5)
used to emit this line on `AGNES_DEBUG_AUTH=1` instances when no
`ca_pem` was readable from `AGNES_TLS_FULLCHAIN_PATH` (default
`/data/state/certs/fullchain.pem`). It tripped Claude Code auto-mode
classifiers ("do not disable TLS verification" rule) and silently
masked operator misconfigurations — a debug-auth instance without a
fullchain on disk would fall through to a TLS-disabled clone instead
of surfacing the missing cert. With this change there is exactly one
trust-bootstrap path: the cross-platform step 0 trust block (gated
on `_read_agnes_ca_pem` returning a PEM). Operators serving a
self-signed or private-CA cert MUST place the fullchain at the
configured path so step 0 picks it up; publicly-trusted certs need
no trust block at all. The `self_signed_tls` parameter on
`app.web.setup_instructions.resolve_lines` and
`render_setup_instructions` is also dropped (was only consumed by
the deleted block).
### Fixed ### Fixed
- **`v34→v35` migration is now idempotent under partial-rebuild recovery.** The original list-form `_V34_TO_V35_MIGRATIONS` ran four ALTER statements in sequence: `ADD _vis_v35``UPDATE _vis_v35 = visibility_status``DROP visibility_status``RENAME _vis_v35 TO visibility_status`. If the RENAME failed for any reason after the DROP succeeded (DuckDB lock contention at startup, scheduler-vs-app race opening `system.duckdb`, container kill mid-migration, …), the DB was stranded with `_vis_v35` populated and `visibility_status` missing — and `schema_version` never bumped because the UPDATE at the bottom of the migration ladder only runs when *every* step succeeds. Subsequent restarts then hit `DROP visibility_status` again with no `IF EXISTS` guard and looped on the same error; the only recovery was hand-editing the DB. The migration is rewritten as a Python function `_v34_to_v35_migrate` that inspects the table's columns up front and dispatches into one of three paths: clean v34 (run the full rebuild), partial v35 with `_vis_v35` only (finish the RENAME alone), or both columns present (drop the temp). The audit columns (`archived_at`, `archived_by`) ship first behind `IF NOT EXISTS` so they're safe in all states. Operators stranded by the original bug recover automatically on next startup. Tests cover the three direct paths plus an end-to-end scenario where `_ensure_schema` walks a `schema_version=32` DB with the half-applied state up through to v36. - **`v34→v35` migration is now idempotent under partial-rebuild recovery.** The original list-form `_V34_TO_V35_MIGRATIONS` ran four ALTER statements in sequence: `ADD _vis_v35``UPDATE _vis_v35 = visibility_status``DROP visibility_status``RENAME _vis_v35 TO visibility_status`. If the RENAME failed for any reason after the DROP succeeded (DuckDB lock contention at startup, scheduler-vs-app race opening `system.duckdb`, container kill mid-migration, …), the DB was stranded with `_vis_v35` populated and `visibility_status` missing — and `schema_version` never bumped because the UPDATE at the bottom of the migration ladder only runs when *every* step succeeds. Subsequent restarts then hit `DROP visibility_status` again with no `IF EXISTS` guard and looped on the same error; the only recovery was hand-editing the DB. The migration is rewritten as a Python function `_v34_to_v35_migrate` that inspects the table's columns up front and dispatches into one of three paths: clean v34 (run the full rebuild), partial v35 with `_vis_v35` only (finish the RENAME alone), or both columns present (drop the temp). The audit columns (`archived_at`, `archived_by`) ship first behind `IF NOT EXISTS` so they're safe in all states. Operators stranded by the original bug recover automatically on next startup. Tests cover the three direct paths plus an end-to-end scenario where `_ensure_schema` walks a `schema_version=32` DB with the half-applied state up through to v36.

View file

@ -400,16 +400,12 @@ def _build_context(
_wheel = _find_wheel() _wheel = _find_wheel()
_wheel_filename = _wheel.name if _wheel else "agnes.whl" _wheel_filename = _wheel.name if _wheel else "agnes.whl"
self_signed_tls = os.environ.get("AGNES_DEBUG_AUTH", "").strip().lower() in (
"1", "true", "yes",
)
server_host = request.url.netloc server_host = request.url.netloc
ca_pem = _read_agnes_ca_pem() ca_pem = _read_agnes_ca_pem()
setup_instructions_lines = resolve_lines( setup_instructions_lines = resolve_lines(
_wheel_filename, _wheel_filename,
plugin_install_names=[], plugin_install_names=[],
self_signed_tls=self_signed_tls,
server_host=server_host, server_host=server_host,
ca_pem=ca_pem, ca_pem=ca_pem,
) )

View file

@ -490,8 +490,6 @@ def _preflight_block(step_num: str) -> list[str]:
def _marketplace_block( def _marketplace_block(
plugin_install_names: list[str], plugin_install_names: list[str],
self_signed_tls: bool,
has_ca: bool,
step_num: str, step_num: str,
) -> list[str]: ) -> list[str]:
"""Build the marketplace + plugin-install block. """Build the marketplace + plugin-install block.
@ -536,32 +534,23 @@ def _marketplace_block(
plugin contents in place. Broken end-to-end on every Claude Code plugin contents in place. Broken end-to-end on every Claude Code
distribution; cloning is the only reliable install path. distribution; cloning is the only reliable install path.
With ``has_ca=False`` and ``self_signed_tls=True`` (legacy path, TLS handling for the in-binary ``git clone`` is fully covered by the
AGNES_DEBUG_AUTH instances): we emit the host-scoped ``git config cross-platform trust block (step 0) when the server's cert needs
sslVerify=false`` downgrade so system git's clone (which agnes bootstrapping (`ca_pem` non-empty), and by the OS trust store when
invokes via subprocess) accepts the un-trusted endpoint. With a the cert is publicly-trusted. There used to be a legacy fallback
publicly-trusted cert (Let's Encrypt etc.) or a CA bundle in step 0, here that emitted a host-scoped ``git config http.<host>.sslVerify
no extra config needed system git already trusts the chain. false`` line for the ``AGNES_DEBUG_AUTH`` path; that's gone — it
masked operator misconfigurations (a ``self_signed_tls=True``
instance without ``/data/state/certs/fullchain.pem`` on disk) and
its ``sslVerify=false`` shell command tripped Claude Code auto-mode
classifiers. Operators serving a self-signed or private-CA cert
must place the fullchain at ``AGNES_TLS_FULLCHAIN_PATH`` (default
``/data/state/certs/fullchain.pem``) so step 0 can read it via
``_read_agnes_ca_pem``.
""" """
lines: list[str] = [ return [
"", "",
f"{step_num}) Register the Agnes Claude Code marketplace and install plugins:", f"{step_num}) Register the Agnes Claude Code marketplace and install plugins:",
]
# The legacy AGNES_DEBUG_AUTH path needs sslVerify=false so system git
# accepts the self-signed cert during the bootstrap clone. has_ca path
# has GIT_SSL_CAINFO already set by step 0(d), so no extra config
# needed there.
if not has_ca and self_signed_tls:
lines.extend([
" # Self-signed TLS cert on this Agnes instance — host-scoped",
" # `sslVerify=false` so the marketplace `git clone` accepts it.",
" # Without a CA bundle we can't do better than this; flip your",
" # AGNES_DEBUG_AUTH instance to a real fullchain.pem to drop this line.",
" git config --global http.\"{server_url}/\".sslVerify false",
])
lines.extend([
" # `agnes refresh-marketplace --bootstrap` does:", " # `agnes refresh-marketplace --bootstrap` does:",
" # 1. clone the per-user marketplace bare repo to ~/.agnes/marketplace", " # 1. clone the per-user marketplace bare repo to ~/.agnes/marketplace",
" # 2. strip the PAT from the cloned origin URL (refreshes use a", " # 2. strip the PAT from the cloned origin URL (refreshes use a",
@ -580,8 +569,7 @@ def _marketplace_block(
" and run `claude` again so the new plugins load. From then on, the", " and run `claude` again so the new plugins load. From then on, the",
" SessionStart hook keeps the marketplace clone in sync via", " SessionStart hook keeps the marketplace clone in sync via",
" `agnes refresh-marketplace --quiet` on every Claude Code session.", " `agnes refresh-marketplace --quiet` on every Claude Code session.",
]) ]
return lines
def _preamble_lines(*, has_ca: bool) -> list[str]: def _preamble_lines(*, has_ca: bool) -> list[str]:
@ -656,7 +644,6 @@ def resolve_lines(
wheel_filename: str, wheel_filename: str,
*, *,
plugin_install_names: list[str] | None = None, plugin_install_names: list[str] | None = None,
self_signed_tls: bool = False,
server_host: str = "", server_host: str = "",
ca_pem: str | None = None, ca_pem: str | None = None,
) -> list[str]: ) -> list[str]:
@ -678,13 +665,6 @@ def resolve_lines(
needs the bootstrap (typically: skip for publicly-trusted certs like needs the bootstrap (typically: skip for publicly-trusted certs like
Let's Encrypt, emit for self-signed or private corp CA). Let's Encrypt, emit for self-signed or private corp CA).
`self_signed_tls=True` is the legacy fallback when no `ca_pem` is
available it prepends a host-scoped
`git config http."<host>/".sslVerify false` inside the marketplace
block (TLS *downgrade*, not bootstrap). When `ca_pem` is set, this
flag is ignored because the trust block subsumes it. No-op when the
marketplace block isn't rendered (no plugins).
Fallback: callers pass `"agnes.whl"` when no wheel is present on disk. Fallback: callers pass `"agnes.whl"` when no wheel is present on disk.
The resulting URL (`/cli/wheel/agnes.whl`) will 404 at download time, but The resulting URL (`/cli/wheel/agnes.whl`) will 404 at download time, but
the instruction text still renders so operators can see the snippet shape the instruction text still renders so operators can see the snippet shape
@ -693,10 +673,6 @@ def resolve_lines(
names = list(plugin_install_names or []) names = list(plugin_install_names or [])
has_marketplace = bool(names) has_marketplace = bool(names)
has_ca = bool(ca_pem and ca_pem.strip()) has_ca = bool(ca_pem and ca_pem.strip())
# Trust block subsumes the legacy sslVerify-off downgrade. Don't emit
# both: with `~/.agnes/ca-bundle.pem` wired into GIT_SSL_CAINFO, git already
# trusts the host without disabling verification.
effective_self_signed = self_signed_tls and not has_ca
# Step layout. Marketplace (when emitted) goes BEFORE diagnose/skills, # Step layout. Marketplace (when emitted) goes BEFORE diagnose/skills,
# so the human-loop skills question is the last step before Confirm. # so the human-loop skills question is the last step before Confirm.
@ -713,9 +689,7 @@ def resolve_lines(
lines.extend(_init_lines()) # 2, 3 lines.extend(_init_lines()) # 2, 3
if has_marketplace: if has_marketplace:
lines.extend(_preflight_block(steps["preflight"])) # 4 lines.extend(_preflight_block(steps["preflight"])) # 4
lines.extend(_marketplace_block( # 5 lines.extend(_marketplace_block(names, step_num=steps["marketplace"])) # 5
names, effective_self_signed, has_ca=has_ca, step_num=steps["marketplace"],
))
# Diagnose + skills come AFTER the marketplace block (or right after # Diagnose + skills come AFTER the marketplace block (or right after
# the catalog smoke verify if there's no marketplace step at all). # the catalog smoke verify if there's no marketplace step at all).
lines.extend(_diagnose_skills_lines( lines.extend(_diagnose_skills_lines(
@ -740,7 +714,6 @@ def render_setup_instructions(
wheel_filename: str = "agnes.whl", wheel_filename: str = "agnes.whl",
*, *,
plugin_install_names: list[str] | None = None, plugin_install_names: list[str] | None = None,
self_signed_tls: bool = False,
server_host: str = "", server_host: str = "",
ca_pem: str | None = None, ca_pem: str | None = None,
) -> str: ) -> str:
@ -749,12 +722,11 @@ def render_setup_instructions(
Used server-side for tests and any non-JS rendering path. The browser Used server-side for tests and any non-JS rendering path. The browser
clipboard flow uses the JS renderer embedded in the Jinja partial; both clipboard flow uses the JS renderer embedded in the Jinja partial; both
must produce byte-identical output for a given (server_url, token, must produce byte-identical output for a given (server_url, token,
wheel, plugins, flag, host, ca_pem) tuple. wheel, plugins, host, ca_pem) tuple.
""" """
lines = resolve_lines( lines = resolve_lines(
wheel_filename, wheel_filename,
plugin_install_names=plugin_install_names, plugin_install_names=plugin_install_names,
self_signed_tls=self_signed_tls,
server_host=server_host, server_host=server_host,
ca_pem=ca_pem, ca_pem=ca_pem,
) )

View file

@ -20,7 +20,6 @@ See also: surfaced as the "Agent Setup Prompt" admin editor at /admin/agent-prom
from __future__ import annotations from __future__ import annotations
import logging import logging
import os
import re import re
from datetime import date, datetime, timezone from datetime import date, datetime, timezone
from pathlib import Path from pathlib import Path
@ -183,9 +182,6 @@ def compute_default_agent_prompt(
except Exception: except Exception:
logger.exception("compute_default_agent_prompt: marketplace plugin resolution failed") logger.exception("compute_default_agent_prompt: marketplace plugin resolution failed")
self_signed_tls = os.environ.get("AGNES_DEBUG_AUTH", "").strip().lower() in (
"1", "true", "yes",
)
from urllib.parse import urlparse as _urlparse from urllib.parse import urlparse as _urlparse
parsed = _urlparse(server_url) parsed = _urlparse(server_url)
server_host = parsed.netloc or parsed.hostname or "" server_host = parsed.netloc or parsed.hostname or ""
@ -200,7 +196,6 @@ def compute_default_agent_prompt(
lines = resolve_lines( lines = resolve_lines(
_wheel_filename, _wheel_filename,
plugin_install_names=plugin_install_names, plugin_install_names=plugin_install_names,
self_signed_tls=self_signed_tls,
server_host=server_host, server_host=server_host,
ca_pem=ca_pem, ca_pem=ca_pem,
) )

View file

@ -297,7 +297,7 @@ def test_resolve_lines_with_plugins_uses_install_first_diagnose_last_layout():
skills_idx = joined.index("7) Skills") skills_idx = joined.index("7) Skills")
confirm_idx = joined.index("8) Confirm:") confirm_idx = joined.index("8) Confirm:")
assert install_idx < init_idx < catalog_idx < git_idx < market_idx < diag_idx < skills_idx < confirm_idx assert install_idx < init_idx < catalog_idx < git_idx < market_idx < diag_idx < skills_idx < confirm_idx
# No git-config sslVerify=false line unless self_signed_tls is set. # Legacy `git config sslVerify=false` downgrade is gone — see CHANGELOG.
assert "git config --global" not in joined assert "git config --global" not in joined
# server_host is server-side substituted; the placeholder must be gone. # server_host is server-side substituted; the placeholder must be gone.
assert "{server_host}" not in joined assert "{server_host}" not in joined
@ -341,43 +341,6 @@ def test_preflight_checks_both_git_and_claude():
assert claude_check_idx < market_idx assert claude_check_idx < market_idx
def test_resolve_lines_self_signed_legacy_path_adds_git_config_line():
"""Legacy fallback (no ca_pem on disk + self_signed_tls=True): the host-scoped
`git config sslVerify=false` downgrade is still emitted so existing
AGNES_DEBUG_AUTH instances keep working until they roll a fullchain.pem."""
from app.web.setup_instructions import resolve_lines
joined = "\n".join(
resolve_lines(
"agnes.whl",
plugin_install_names=["foo"],
self_signed_tls=True,
server_host="agnes.example.com",
)
)
assert 'git config --global http."{server_url}/".sslVerify false' in joined
# The git-config line must come BEFORE the marketplace add inside the
# marketplace step (regardless of which step number it lands on).
git_idx = joined.index('git config --global')
add_idx = joined.index('claude plugin marketplace add')
assert git_idx < add_idx
def test_resolve_lines_self_signed_no_op_without_plugins():
"""`self_signed_tls=True` is a no-op when there are no plugins (no marketplace step to attach to)."""
from app.web.setup_instructions import resolve_lines
joined = "\n".join(
resolve_lines("agnes.whl", plugin_install_names=[], self_signed_tls=True)
)
# Legacy downgrade line not present.
assert "git config --global" not in joined
assert "claude plugin" not in joined
# No pre-flight either when there's no marketplace step.
assert "Make sure git and claude are installed" not in joined
assert "6) Confirm:" in joined # original layout intact
def test_render_setup_instructions_with_plugins_substitutes_all_placeholders(): def test_render_setup_instructions_with_plugins_substitutes_all_placeholders():
from app.web.setup_instructions import render_setup_instructions from app.web.setup_instructions import render_setup_instructions
@ -386,7 +349,6 @@ def test_render_setup_instructions_with_plugins_substitutes_all_placeholders():
token="T-XYZ", token="T-XYZ",
wheel_filename="agnes-1.0-py3-none-any.whl", wheel_filename="agnes-1.0-py3-none-any.whl",
plugin_install_names=["foo", "bar"], plugin_install_names=["foo", "bar"],
self_signed_tls=True,
server_host="agnes.example.com", server_host="agnes.example.com",
) )
# No raw placeholders remain in the final string. # No raw placeholders remain in the final string.
@ -399,8 +361,12 @@ def test_render_setup_instructions_with_plugins_substitutes_all_placeholders():
# token from the agnes config that step 2 just wrote, so no token # token from the agnes config that step 2 just wrote, so no token
# in any URL inside step 5. # in any URL inside step 5.
assert "T-XYZ" in out assert "T-XYZ" in out
# Self-signed TLS line is host-scoped to server_url. # The legacy `git config --global ... sslVerify false` downgrade is gone
assert 'git config --global http."https://agnes.example.com/".sslVerify false' in out # (see CHANGELOG: it tripped Claude Code auto-mode classifiers and was
# only ever a safety net for AGNES_DEBUG_AUTH instances without a
# fullchain.pem on disk). Self-signed and private-CA cases are now
# exclusively handled by the step 0 trust block (gated on `ca_pem`).
assert "git config --global" not in out
# Marketplace step is the one-liner; no per-plugin install lines. # Marketplace step is the one-liner; no per-plugin install lines.
assert "agnes refresh-marketplace --bootstrap" in out assert "agnes refresh-marketplace --bootstrap" in out
assert "claude plugin install foo@agnes" not in out assert "claude plugin install foo@agnes" not in out
@ -615,46 +581,28 @@ def test_diagnose_step_documents_non_admin_role_state():
assert "non-admin" in joined.lower() or "analyst" in joined.lower() assert "non-admin" in joined.lower() or "analyst" in joined.lower()
def test_resolve_lines_with_ca_pem_suppresses_legacy_sslverify_line(): def test_resolve_lines_no_sslverify_downgrade_anywhere():
"""When ca_pem is supplied, the legacy `git config sslVerify=false` """The legacy `git config sslVerify=false` downgrade is gone in every
downgrade must NOT appear the trust block subsumes it (full TLS rendering combination. Self-signed and private-CA servers must place
validation re-enabled, just against the inlined cert).""" the fullchain at AGNES_TLS_FULLCHAIN_PATH (default
/data/state/certs/fullchain.pem) so step 0 picks it up via
_read_agnes_ca_pem; publicly-trusted certs need no trust block at
all. There is no third path."""
from app.web.setup_instructions import resolve_lines from app.web.setup_instructions import resolve_lines
joined = "\n".join( for kwargs in (
resolve_lines( {"plugin_install_names": ["foo"], "server_host": "agnes.example.com"},
"agnes.whl", {"plugin_install_names": ["foo"], "server_host": "agnes.example.com",
plugin_install_names=["foo"], "ca_pem": _FAKE_CA_PEM},
self_signed_tls=True, # legacy flag — should be ignored when ca_pem set {"plugin_install_names": [], "server_host": "agnes.example.com"},
server_host="agnes.example.com", ):
ca_pem=_FAKE_CA_PEM, joined = "\n".join(resolve_lines("agnes.whl", **kwargs))
assert "git config --global" not in joined, (
f"sslVerify downgrade leaked through with kwargs={kwargs!r}"
) )
) assert "sslVerify false" not in joined, (
# Legacy git-config sslVerify=false downgrade is suppressed when ca_pem is set. f"sslVerify downgrade leaked through with kwargs={kwargs!r}"
assert "git config --global" not in joined
# But the marketplace step itself still renders (as the one-liner).
assert "agnes refresh-marketplace --bootstrap" in joined
# And the trust block is present.
assert "0) Trust the Agnes TLS certificate" in joined
def test_resolve_lines_without_ca_pem_keeps_legacy_self_signed_path():
"""Legacy fallback: no ca_pem + self_signed_tls=True still emits the
sslVerify=false line (so existing AGNES_DEBUG_AUTH instances keep
working until they roll a fullchain.pem onto disk)."""
from app.web.setup_instructions import resolve_lines
joined = "\n".join(
resolve_lines(
"agnes.whl",
plugin_install_names=["foo"],
self_signed_tls=True,
server_host="agnes.example.com",
# no ca_pem
) )
)
assert "0) Trust the Agnes TLS certificate" not in joined
assert 'sslVerify false' in joined
def test_resolve_lines_ca_pem_empty_string_is_treated_as_absent(): def test_resolve_lines_ca_pem_empty_string_is_treated_as_absent():
@ -690,13 +638,13 @@ def test_render_setup_instructions_propagates_ca_pem():
token="T-CA", token="T-CA",
wheel_filename="agnes-1.0-py3-none-any.whl", wheel_filename="agnes-1.0-py3-none-any.whl",
plugin_install_names=["foo"], plugin_install_names=["foo"],
self_signed_tls=True,
server_host="agnes.example.com", server_host="agnes.example.com",
ca_pem=_FAKE_CA_PEM, ca_pem=_FAKE_CA_PEM,
) )
assert "0) Trust the Agnes TLS certificate" in out assert "0) Trust the Agnes TLS certificate" in out
assert "-----BEGIN CERTIFICATE-----" in out assert "-----BEGIN CERTIFICATE-----" in out
# ca_pem masks legacy sslVerify=false. # The legacy `git config sslVerify=false` downgrade was deleted; the
# ca_pem trust block is the sole TLS-bootstrap path now.
assert "git config --global" not in out assert "git config --global" not in out
# Other placeholders still substituted. # Other placeholders still substituted.
assert "{server_url}" not in out assert "{server_url}" not in out