fix(cli): Windows console crash on cs-CZ codepage (port + broaden #172)

Ports Minas's PR #172 (against pre-rename `da` CLI on main) and applies
the principle to the post-rename `agnes` CLI. Two distinct failure modes
on Windows consoles whose default codepage is cp1250 (cs-CZ) / cp1252
(en-US):

1. `agnes pull` and other Rich-progress codepaths
   UnicodeEncodeError on Braille spinner glyphs. Fix: `cli/main.py`
   reconfigures stdout/stderr to UTF-8 with errors='replace' at import
   time on `sys.platform == 'win32'` so Rich's legacy-Windows render
   path emits decodable bytes. Wrapped in try/except so pytest's
   captured streams (which aren't TextIOWrapper) don't break.

2. `agnes skills list` and `agnes skills show`
   UnicodeDecodeError when reading skill markdown containing em-dashes /
   accented chars. Default `Path.read_text()` uses
   locale.getpreferredencoding(False), which is the broken codepage on
   Windows. Fix: every call site passes encoding='utf-8' explicitly.

Broader scope than #172 because:
- The bootstrap rewrite renamed/removed several files Minas's PR
  patched (`cli/commands/analyst.py` -> rolled into init.py;
  `cli/commands/sync.py` -> split into pull/push). Those targets no
  longer exist; the equivalent code lives in init.py.
- Other call sites Minas didn't touch (still bare in his branch) are
  patched here too — config.py / update_check.py / snapshot_meta.py /
  setup.py / skills.py — so the codebase has zero locale-default text
  I/O in cli/.

Side cleanup: stale `Run `da`` reference in snapshot_meta.py:88 fixed
to `agnes` while touching the file.
This commit is contained in:
ZdenekSrotyr 2026-05-04 20:45:29 +02:00
parent e323ab76cc
commit ee83cebbda
8 changed files with 34 additions and 17 deletions

View file

@ -38,6 +38,7 @@ End-to-end clean-analyst-bootstrap rewrite. The web `/setup?role=analyst` page n
- `agnes snapshot create` (formerly `da fetch`) no longer materializes an empty `user/duckdb/analytics.duckdb` when run before any `agnes pull`. Friendly hint redirects to `agnes pull`.
- Workspace `agnes status` reads from the canonical `server/parquet/` and `user/duckdb/analytics.duckdb` paths (was reading legacy `data/parquet/`, `data/metadata/last_sync.json`).
- `agnes init` and `agnes pull` errors now use the `cli/error_render.py` typed-error renderer (added in 0.32.0), so analyst-facing error UX matches the structured shape `agnes query --remote` already produces.
- **Windows: `agnes` CLI no longer crashes on cs-CZ / non-UTF-8 consoles.** Two failure modes addressed (originally reported in #172 against the pre-rename `da` CLI; ported and broadened here): (1) `agnes pull` and any other Rich-progress-bar codepath crashed with `UnicodeEncodeError` because cp1250 / cp1252 cannot encode Rich's Braille spinner glyphs — `cli/main.py` now reconfigures `sys.stdout` / `sys.stderr` to UTF-8 with `errors="replace"` at import time when `sys.platform == "win32"`. (2) `agnes skills list` and `agnes skills show` crashed with `UnicodeDecodeError` reading skill markdown that contains em-dashes / accents — every `Path.read_text()` / `Path.write_text()` / `open()` call site in `cli/` (including ones not touched by #172, since several files were renamed in the bootstrap rewrite) now passes `encoding="utf-8"` explicitly. Defensive: also covers JSON / YAML config files that were ASCII-only in practice but were one non-ASCII value away from the same failure mode.
- `agnes snapshot create … --estimate` in a pre-init directory no longer leaks an httpx `ConnectError` traceback to stderr. The estimate-guard fix (3d587681) let `--estimate` reach `api_post_json`, but the existing `except V2ClientError` clause didn't catch transport-layer errors when no server was configured (defaulted to `http://localhost:8000`). Now also catches `httpx.HTTPError` and renders the friendly hint `Run \`agnes init …\` first`.
- `agnes push` now reads Claude Code session jsonls from `~/.claude/projects/<encoded-cwd>/` (where Claude Code actually writes them), instead of `<workspace>/user/sessions/` (which the SessionEnd hook never populated — the previous code uploaded an empty list every time). Encoding logic in `cli/lib/claude_sessions.py` probes both Claude Code variants — older `/`→`-` and newer all-non-alphanumeric→`-` — and unions the result, so users who have upgraded Claude Code mid-project see sessions from both encoded dirs. Falls back to `<workspace>/user/sessions/` for back-compat.

View file

@ -155,7 +155,7 @@ def init(
settings_path.write_text(json.dumps(
{"model": "sonnet", "permissions": {"allow": ["Read", "Bash", "Grep", "Glob"]}},
indent=2,
))
), encoding="utf-8")
install_claude_hooks(workspace)
# ------------------------------------------------------------------

View file

@ -23,7 +23,7 @@ def setup_init(
import yaml
config = {"server": server}
config_file.write_text(yaml.dump(config))
config_file.write_text(yaml.dump(config), encoding="utf-8")
typer.echo(f"Config saved to {config_file}")
os.environ["AGNES_SERVER"] = server
typer.echo("\nNext: agnes setup bootstrap --email admin@company.com")

View file

@ -18,7 +18,7 @@ def list_skills():
for f in sorted(SKILLS_DIR.glob("*.md")):
name = f.stem
# Read first line as description
first_line = f.read_text().split("\n")[0].strip("# ").strip()
first_line = f.read_text(encoding="utf-8").split("\n")[0].strip("# ").strip()
typer.echo(f" {name:25s} {first_line}")
@ -29,4 +29,4 @@ def show_skill(name: str = typer.Argument(..., help="Skill name to display")):
if not skill_file.exists():
typer.echo(f"Skill '{name}' not found. Run: agnes skills list", err=True)
raise typer.Exit(1)
typer.echo(skill_file.read_text())
typer.echo(skill_file.read_text(encoding="utf-8"))

View file

@ -20,7 +20,7 @@ def get_server_url() -> str:
def get_token() -> Optional[str]:
token_file = _config_dir() / "token.json"
if token_file.exists():
data = json.loads(token_file.read_text())
data = json.loads(token_file.read_text(encoding="utf-8"))
return data.get("access_token")
return os.environ.get("AGNES_TOKEN")
@ -37,7 +37,7 @@ def save_token(token: str, email: str, role: Optional[str] = None):
token_file.write_text(json.dumps({
"access_token": token,
"email": email,
}, indent=2))
}, indent=2), encoding="utf-8")
def clear_token():
@ -50,20 +50,20 @@ def load_config() -> dict:
config_file = _config_dir() / "config.yaml"
if config_file.exists():
import yaml
return yaml.safe_load(config_file.read_text()) or {}
return yaml.safe_load(config_file.read_text(encoding="utf-8")) or {}
return {}
def get_sync_state() -> dict:
state_file = _config_dir() / "sync_state.json"
if state_file.exists():
return json.loads(state_file.read_text())
return json.loads(state_file.read_text(encoding="utf-8"))
return {}
def save_sync_state(state: dict):
state_file = _config_dir() / "sync_state.json"
state_file.write_text(json.dumps(state, indent=2))
state_file.write_text(json.dumps(state, indent=2), encoding="utf-8")
def save_config(data: dict):
@ -73,6 +73,6 @@ def save_config(data: dict):
config_file = _config_dir() / "config.yaml"
existing = {}
if config_file.exists():
existing = yaml.safe_load(config_file.read_text()) or {}
existing = yaml.safe_load(config_file.read_text(encoding="utf-8")) or {}
existing.update(data)
config_file.write_text(yaml.dump(existing, default_flow_style=False))
config_file.write_text(yaml.dump(existing, default_flow_style=False), encoding="utf-8")

View file

@ -3,11 +3,27 @@
Primary interface for AI agents. Install: uv tool install agnes-the-ai-analyst
"""
import sys
from importlib.metadata import PackageNotFoundError
from importlib.metadata import version as _pkg_version
import typer
# Force UTF-8 on Windows stdout/stderr at import time. The default Windows
# console codepage (cp1250 on cs-CZ, cp1252 on en-US, …) cannot encode the
# Braille spinner glyphs Rich uses for `agnes pull` progress, nor the
# em-dash / accented chars that show up in skill markdown via
# `agnes skills list`. Both crash with UnicodeEncodeError /
# UnicodeDecodeError before any command-level code runs. `reconfigure` is
# a no-op on non-TextIOWrapper streams (pytest capture, pipes wrapped by
# other tooling) — swallow the AttributeError there.
if sys.platform == "win32":
try:
sys.stdout.reconfigure(encoding="utf-8", errors="replace")
sys.stderr.reconfigure(encoding="utf-8", errors="replace")
except (AttributeError, OSError):
pass
from cli.commands.auth import auth_app
from cli.commands.init import init_app
from cli.commands.pull import pull_app

View file

@ -39,7 +39,7 @@ def _meta_path(snap_dir: Path, name: str) -> Path:
def write_meta(snap_dir: Path, meta: SnapshotMeta) -> None:
snap_dir.mkdir(parents=True, exist_ok=True)
with _meta_path(snap_dir, meta.name).open("w") as f:
with _meta_path(snap_dir, meta.name).open("w", encoding="utf-8") as f:
json.dump(asdict(meta), f, indent=2)
@ -47,7 +47,7 @@ def read_meta(snap_dir: Path, name: str) -> Optional[SnapshotMeta]:
p = _meta_path(snap_dir, name)
if not p.exists():
return None
data = json.loads(p.read_text())
data = json.loads(p.read_text(encoding="utf-8"))
return SnapshotMeta(**data)
@ -57,7 +57,7 @@ def list_snapshots(snap_dir: Path) -> list[SnapshotMeta]:
out = []
for meta_file in snap_dir.glob("*.meta.json"):
try:
data = json.loads(meta_file.read_text())
data = json.loads(meta_file.read_text(encoding="utf-8"))
out.append(SnapshotMeta(**data))
except (json.JSONDecodeError, TypeError):
continue
@ -85,7 +85,7 @@ def snapshot_lock(snap_dir: Path):
if _fcntl is None:
raise RuntimeError(
"snapshot_lock requires POSIX fcntl — Windows is not supported. "
"Run `da` from a Mac or Linux machine, or use a WSL shell."
"Run `agnes` from a Mac or Linux machine, or use a WSL shell."
)
snap_dir.mkdir(parents=True, exist_ok=True)
lock_file = snap_dir / ".lock"

View file

@ -88,7 +88,7 @@ def _read_cache() -> Optional[dict]:
if not p.exists():
return None
try:
return json.loads(p.read_text())
return json.loads(p.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
return None
@ -97,7 +97,7 @@ def _write_cache(entry: dict) -> None:
p = _cache_path()
try:
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(json.dumps(entry))
p.write_text(json.dumps(entry), encoding="utf-8")
except OSError:
pass # best-effort — cache failure must not break the flow