agnes-the-ai-analyst/app/secrets.py
minasarustamyan 69a1e22cf5
feat(initial-workspace): per-instance agnes init override (#292)
* feat(initial-workspace): per-instance agnes init override

Adds Initial Workspace Template — an admin-configurable per-instance
override for the agnes init analyst workspace. When configured, agnes
init downloads a server-rendered zip from a Git repo the admin registered
and extracts it into the analyst's workspace, fully bypassing Agnes-default
CLAUDE.md / settings.json / hooks / slash commands / AGNES_WORKSPACE.md.

Repo layout convention: only the contents of a top-level `workspace/`
subdirectory ship to analysts; admin docs (README, CI configs) at the
repo root stay in the repo and never reach an analyst. Sync rejects
repos without `workspace/` at root.

Server side:
- src/initial_workspace.py — clone (or fetch+reset), validate, build zip
  with strict path checks and reserved-path rejection
  (workspace/.claude/init-complete reserved by Agnes)
- app/api/initial_workspace.py — admin CRUD + sync endpoint + analyst-
  facing status/zip/applied endpoints; config persists to instance.yaml
  overlay, PAT to .env_overlay
- app/secrets.py — refactor: persist_overlay_token shared helper with
  threading.Lock for .env_overlay writes (closes pre-existing race
  between concurrent marketplaces saves)
- app/web/templates/admin_server_config.html — new "Initial Workspace
  Template" section + modal + Sync/Edit/Delete/Download buttons (matches
  existing cfg-section visual language)
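
The `workspace/` convention and the reserved-path rejection described above can be sketched as follows. This is a minimal illustration, not the code in src/initial_workspace.py: the function name `build_workspace_zip`, the `RESERVED` set, and the symlink check are assumptions made for the example.

```python
import io
import zipfile
from pathlib import Path

# Path reserved by Agnes, relative to workspace/ (per the notes above).
# The constant name is hypothetical.
RESERVED = {".claude/init-complete"}


def build_workspace_zip(repo_root: Path) -> bytes:
    """Zip only the contents of repo_root/workspace, with paths relative to it."""
    workspace = repo_root / "workspace"
    if not workspace.is_dir():
        # Mirrors "Sync rejects repos without workspace/ at root".
        raise ValueError("repo has no workspace/ directory at its root")

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(workspace.rglob("*")):
            if path.is_symlink():
                # Assumption: the sketch refuses symlinks rather than following them.
                raise ValueError(f"symlink not allowed: {path}")
            if not path.is_file():
                continue  # directories are implied by the file paths
            rel = path.relative_to(workspace).as_posix()
            if rel in RESERVED:
                raise ValueError(f"reserved path in template: {rel}")
            zf.write(path, arcname=rel)
    return buf.getvalue()
```

Because archive names are computed relative to `workspace/`, files at the repo root (README, CI configs) never enter the zip.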

CLI side:
- cli/lib/override.py — single source of truth for is_override_workspace
  sentinel detection
- cli/lib/initial_workspace.py — probe status, safe zip extraction that
  rejects `..`, absolute, and symlink entries, typed-YES force confirmation
- cli/commands/init.py — override branch (skips Agnes-default workspace
  writes); extended sentinel with override:true, template_source,
  template_sha so future agnes self-upgrade does not auto-refresh hooks
- cli/lib/hooks.py + cli/lib/commands.py — short-circuit on override
  workspaces (install_claude_hooks, install_claude_commands,
  maybe_refresh_claude_hooks)
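
The safe-extraction checks named in the CLI list can be sketched as follows. This is an illustration under stated assumptions, not the code in cli/lib/initial_workspace.py: the names `check_member` and `safe_extract` are hypothetical, and the sketch validates every entry before extracting anything so a rejected archive leaves the destination untouched.

```python
import io
import stat
import zipfile
from pathlib import Path, PurePosixPath, PureWindowsPath


def check_member(info: zipfile.ZipInfo) -> None:
    """Raise ValueError for entries that could escape the destination."""
    name = info.filename
    if PurePosixPath(name).is_absolute() or PureWindowsPath(name).is_absolute():
        raise ValueError(f"absolute path in archive: {name!r}")
    if ".." in PurePosixPath(name).parts or ".." in PureWindowsPath(name).parts:
        raise ValueError(f"parent-directory reference in archive: {name!r}")
    # Unix archivers store the file mode in the high 16 bits of external_attr.
    if stat.S_ISLNK(info.external_attr >> 16):
        raise ValueError(f"symlink entry in archive: {name!r}")


def safe_extract(data: bytes, dest: Path) -> list[str]:
    """Validate all entries first, then extract; return the extracted names."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        members = zf.infolist()
        for info in members:
            check_member(info)
        zf.extractall(dest)
    return [info.filename for info in members]
```

`zipfile` already sanitizes member names during extraction, so the explicit checks serve mainly to fail loudly instead of silently rewriting a suspicious path.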

Audit-event strategy: server writes initial_workspace.fetch_started
inside GET /api/initial-workspace.zip (cannot be spoofed by PAT-holder);
CLI POST /applied writes initial_workspace.applied as best-effort
confirmation. Admin mutations log via the existing _audit pattern.

Tests: 27 server (clone/validate/zip + workspace-subdir convention +
concurrent persist_overlay_token + endpoint shapes + audit rows) + 29
CLI (override sentinel parse + probe fall-through + safe extraction +
YES strictness + hook guards + e2e mocked init).

Risk acceptance — documented in docs/initial-workspace-override.md +
CHANGELOG Internal section so AI reviewers understand the deviations
from defaults are intentional:
- maybe_refresh_claude_hooks deliberately no-ops on override workspaces
- --force on override does NOT back up CLAUDE.md (admin's repo is the
  source of truth)
- .claude/CLAUDE.local.md IS overwritten by override extraction when
  admin's repo ships one

* test+vendor-agnostic: drop Groupon tokens from #292 fixtures + extend admin-gate coverage

Two fixes from the takeover review on #292:

1. **Vendor-agnostic OSS rule**: Replace `Groupon` / `groupon/template`
   tokens in test fixtures with `Acme` / `acme/template` (8 sites in
   test_cli_init_override.py + 1 in test_initial_workspace_api.py).
   Per CLAUDE.md "Vendor-agnostic OSS — no customer-specific content"
   rule: customer-specific tokens don't belong in shipped artifacts,
   even in test fixtures. The pre-existing FoundryAI mentions in
   test_instance_config.py + test_setup_instructions.py are out of
   scope for this PR (it did not introduce them).

2. **Admin-gate coverage gap**: `test_admin_endpoints_require_admin`
   only covered GET /api/admin/initial-workspace + POST .../sync. The
   register-write (POST .../initial-workspace) and delete (DELETE
   .../initial-workspace) endpoints used the same `Depends(require_admin)`
   wiring but had no regression test. Loop now covers all 4 verbs so
   a future refactor that drops the dependency from one endpoint
   fails here instead of silently exposing the write/delete paths to
   any analyst with a PAT.

* release: 0.54.9 — Initial Workspace Template (per-instance agnes init override)

Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.8 →
0.54.9) for Mina's Initial Workspace Template feature.

No DB migration (config lives in instance.yaml overlay). No
mandatory operator action — empty default keeps OSS-default
agnes init behavior. Operators wanting full template control link a
Git repo on /admin/server-config → "Initial Workspace Template".
See docs/initial-workspace-override.md for the full
responsibility-transfer contract.

---------

Co-authored-by: Minas Arustamyan <arustamyan.minas@gmail.com>
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-05-13 20:35:01 +00:00


"""Auto-generate and persist secrets that survive container restarts."""
import logging
import os
import secrets
import threading
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def _state_dir() -> Path:
    """Return path to writable state directory.

    STATE_DIR env var takes precedence; otherwise defaults to
    ${DATA_DIR}/state for backward compatibility with deployments
    that nest state under the data disk. See docs/state-dir.md.
    """
    state = os.environ.get("STATE_DIR", "")
    if state:
        return Path(state)
    return Path(os.environ.get("DATA_DIR", "./data")) / "state"


# Module-level lock guarding read-modify-write of `.env_overlay`. Without it,
# two admins clicking "Save" on /admin/marketplaces (or /admin/server-config
# Initial Workspace section) in the same second can race on the same file:
# both read [X, Y], one writes [X, Y, A], the other writes [X, Y, B] and
# silently clobbers A. The lock is process-local; we rely on the app being
# the sole writer to `${STATE_DIR}/.env_overlay` (no out-of-process tools
# touch it).
_overlay_lock = threading.Lock()


def persist_overlay_token(env_name: str, value: Optional[str]) -> None:
    """Atomically update a key in ``${STATE_DIR}/.env_overlay`` and ``os.environ``.

    Single shared helper for every code path that writes a secret to the
    overlay file (today: marketplaces PATs + initial-workspace template
    PAT). The whole read-merge-write is serialized by ``_overlay_lock``.

    ``value=None`` or ``value=""`` removes the key from the overlay and the
    process env. A non-empty value writes/replaces the key.

    Path resolution matches ``app/main.py``'s startup-time read; without
    this alignment, PATs persisted under the flat-mount layout
    (``STATE_DIR=/data-state``) would land at ``/data/state/.env_overlay``
    while the app reads from ``/data-state/.env_overlay``, silently
    dropping the token on the next restart.
    """
    overlay_path = _state_dir() / ".env_overlay"
    with _overlay_lock:
        overlay_path.parent.mkdir(parents=True, exist_ok=True)
        existing: dict[str, str] = {}
        if overlay_path.exists():
            for line in overlay_path.read_text().splitlines():
                if "=" in line and not line.startswith("#"):
                    k, v = line.split("=", 1)
                    existing[k.strip()] = v.strip()
        if value:
            existing[env_name] = value
            os.environ[env_name] = value
        else:
            existing.pop(env_name, None)
            os.environ.pop(env_name, None)
        overlay_path.write_text(
            "\n".join(f"{k}={v}" for k, v in existing.items())
            + ("\n" if existing else "")
        )
        try:
            overlay_path.chmod(0o600)
        except OSError:
            pass


def _load_or_generate(env_var: str, file_name: str) -> str:
    """Load secret from env var, or from file, or generate and persist."""
    val = os.environ.get(env_var, "")
    if val:
        return val

    secret_path = _state_dir() / file_name
    if secret_path.exists():
        val = secret_path.read_text().strip()
        if val:
            return val
        logger.warning("Secret file %s is empty, regenerating", secret_path)

    secret_path.parent.mkdir(parents=True, exist_ok=True)
    val = secrets.token_hex(32)
    secret_path.write_text(val)
    try:
        secret_path.chmod(0o600)
    except OSError:
        pass  # chmod not supported on all platforms (e.g., Windows)
    logger.info(
        "Auto-generated %s -> %s (set %s in .env to use a fixed value)",
        file_name, secret_path, env_var,
    )
    return val


def get_jwt_secret() -> str:
    """Get JWT secret key from env, file, or auto-generate."""
    return _load_or_generate("JWT_SECRET_KEY", ".jwt_secret")


def get_session_secret() -> str:
    """Get session secret from env, file, or auto-generate."""
    return _load_or_generate("SESSION_SECRET", ".session_secret")
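
The race that `_overlay_lock` closes can be exercised with a small self-contained sketch. It re-implements the read-merge-write step in simplified form, so `merge_key` and `write_concurrently` are hypothetical names for illustration and not part of this module or its test suite.

```python
import threading
from pathlib import Path

_lock = threading.Lock()


def merge_key(path: Path, key: str, value: str) -> None:
    """Read the KEY=VALUE file, set one key, and write it back under the lock."""
    with _lock:
        existing: dict[str, str] = {}
        if path.exists():
            for line in path.read_text().splitlines():
                if "=" in line and not line.startswith("#"):
                    k, v = line.split("=", 1)
                    existing[k.strip()] = v.strip()
        existing[key] = value
        path.write_text("".join(f"{k}={v}\n" for k, v in existing.items()))


def write_concurrently(path: Path, count: int) -> set[str]:
    """Start `count` writer threads, each adding its own key; return the keys kept."""
    threads = [
        threading.Thread(target=merge_key, args=(path, f"PAT_{i}", f"v{i}"))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {line.split("=", 1)[0] for line in path.read_text().splitlines()}
```

With the lock held across the whole read-merge-write, every writer sees the previous writer's result, so all keys survive. Without the `with _lock:` line, two threads could read the same snapshot and the later write would drop the earlier key.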