Four knowledge skills auto-load into the main agent's context when their description matches the work; invokable explicitly via Skill(<name>): - agnes-orchestrator — extract.duckdb ATTACH flow, query_mode semantics, _remote_attach, rebuild lock - agnes-rbac — require_admin vs require_resource_access, ResourceType registration - agnes-connectors — _meta contract, three connector shapes, new-connector checklist - agnes-release-process — CHANGELOG discipline, release-cut, version bump, post-merge auto-rollback Three reviewer subagents fire in parallel at end of PR work; one releaser subagent handles pre-merge release-cut + post-merge tag / GitHub Release: - agnes-reviewer-rules — CHANGELOG bullet, vendor-agnostic scan, AI attribution, commit hygiene (always fires) - agnes-reviewer-rbac — endpoint gates, ResourceType registration (fires on app/api/, app/auth/ diffs) - agnes-reviewer-architecture — extract.duckdb invariants, schema migrations, rebuild lock (fires on src/, connectors/ diffs) - agnes-releaser — Phase 1 pre-merge release-cut commit; Phase 2 post-merge tag + GitHub Release .gitignore un-ignores .claude/agents/ and .claude/skills/ while keeping the rest of .claude/ local-only. CLAUDE.md gets a new 'Specialized agents and skills' section pointing at the two directories. Source of truth for the rules these encode remains CLAUDE.md + docs/RELEASING.md — skills explicitly defer to the master docs on conflict. Design rationale: docs/superpowers/specs/2026-05-15-agnes-agents-design.md Implementation plan: docs/superpowers/plans/2026-05-15-agnes-agents.md
313 lines
14 KiB
Markdown
313 lines
14 KiB
Markdown
# Agnes specialized agents — design
|
||
|
||
**Status:** approved (brainstorm), not yet implemented
|
||
**Date:** 2026-05-15
|
||
**Author:** zsrotyr
|
||
|
||
## Problem
|
||
|
||
Working on Agnes through Claude Code, three classes of friction recur:
|
||
|
||
1. **Mental model drift.** Claude (and humans) forget how Agnes hangs together — the `extract.duckdb` contract, RBAC layering (`require_admin` vs `require_resource_access`), `query_mode` semantics (local / remote / materialized), the orchestrator's `rebuild()` flow. Edits to one part silently break invariants of another, caught only at code review.
|
||
2. **Convention enforcement at review.** Several CLAUDE.md rules (CHANGELOG bullet, vendor-agnostic OSS, no AI attribution, issue economy, RBAC gates on new endpoints) are easy to forget. Manual review catches most but not all.
|
||
3. **Release-cut workflow.** The non-negotiable rules from `docs/RELEASING.md` and `CLAUDE.md § Release process` — release-cut belongs in the PR that earned it, last commit on the PR, post-merge tag + GitHub Release — evolve and are repetitive enough that doing them by hand each time is error-prone.
|
||
|
||
The brainstorm explored variants A ("one big architect"), B ("three specialist subagents covering all three concerns"), and C ("knowledge as skills, enforcement/workflow as subagents"). C was selected because the mental model is *not delegatable* — when the main agent writes code it needs the context in its own window, not in a subagent's.
|
||
|
||
## Approach
|
||
|
||
Four layers, each with a distinct mechanism and lifecycle:
|
||
|
||
```
|
||
Layer 1: KNOWLEDGE SKILLS (.claude/skills/agnes-*.md)
|
||
- Auto-trigger by description + explicit invocation by main agent
|
||
- Load into MAIN agent's context
|
||
- Purpose: main agent knows how Agnes works while writing code
|
||
|
||
Layer 2: REVIEWER SUBAGENTS (.claude/agents/agnes-reviewer-*.md)
|
||
- Spawned via Agent tool at end of PR work
|
||
- Own context window; return a punch list
|
||
- Three specialists, fired in parallel by main agent
|
||
|
||
Layer 3: RELEASER SUBAGENT (.claude/agents/agnes-releaser.md)
|
||
- Spawned via Agent tool pre-merge (phase 1) and post-merge (phase 2)
|
||
- Own context window; produces release-cut commit + tag + GitHub Release
|
||
|
||
Layer 4: PERSONAL (~/.claude/agents/<customer>-deploy.md)
|
||
- Outside this repo (operator's home dir, never committed)
|
||
- Customer-specific context (customer VMs, gcloud / cloud accounts,
|
||
private infra repos, deploy targets)
|
||
```
|
||
|
||
Skills (Layer 1) share context with the main agent — when Claude edits
|
||
`src/orchestrator.py`, the orchestrator skill is loaded so rules stay "in head"
|
||
across delegation. Subagents (Layers 2 and 3) isolate context — review/release
|
||
read many files but only a punch list returns to the main conversation.
|
||
|
||
Cross-layer sharing happens through skills. The `agnes-release-process` skill is
|
||
read by the main agent (planning), by `agnes-reviewer-rules` (checking), and by
|
||
`agnes-releaser` (executing). Single source of truth.
|
||
|
||
## Components
|
||
|
||
### Layer 1 — Knowledge skills (`.claude/skills/`)
|
||
|
||
Four skills. Each file is ~80–120 lines. Files are kept focused; if one grows
|
||
past ~150 lines that is a signal to split. Skills do not duplicate `CLAUDE.md`
|
||
content verbatim — they reference (`see CLAUDE.md § Access control`) so master
|
||
rules have one location.
|
||
|
||
#### `agnes-orchestrator`
|
||
|
||
- **Description (triggers auto-spawn):** Use when editing `src/orchestrator.py`,
|
||
`src/db.py`, or anything that produces `extract.duckdb` in `connectors/*/`.
|
||
Rules for ATTACH flow, `query_mode` semantics, and when `rebuild()` is
|
||
required.
|
||
- **Body:** master view lifecycle in `analytics.duckdb`; thread-safety via
|
||
`_rebuild_lock`; `rebuild_source(name)` vs full `rebuild()` decision;
|
||
`_remote_attach` reattach flow at query time (extension install + token
|
||
resolution via `token_env` or extension-specific auth path).
|
||
|
||
#### `agnes-rbac`
|
||
|
||
- **Description:** Use when adding or changing an endpoint in `app/api/`,
|
||
touching `app/auth/`, or introducing a new resource type. Enforces gate
|
||
pattern (`require_admin` vs `require_resource_access`) and `ResourceType`
|
||
registration.
|
||
- **Body:** decision tree for picking a gate (app-level mutation vs
|
||
entity-scoped); how to add a new `ResourceType` (StrEnum value +
|
||
`ResourceTypeSpec` with `list_blocks` delegate in `app/resource_types.py`, no
|
||
DB migration); when a grant is needed even for reads; god-mode short-circuit
|
||
for the `Admin` group.
|
||
|
||
#### `agnes-connectors`
|
||
|
||
- **Description:** Use when adding a new data source or modifying an existing
|
||
extractor in `connectors/`. Enforces the `extract.duckdb` contract — `_meta`
|
||
table, `query_mode` column, parquet layout.
|
||
- **Body:** required `_meta` columns (`table_name`, `description`, `rows`,
|
||
`size_bytes`, `extracted_at`, `query_mode`); when a connector is batch-pull
|
||
vs remote-attach vs real-time push; how to expose `_remote_attach` for remote
|
||
mode; where the extractor writes (`/data/extracts/{source}/`).
|
||
|
||
#### `agnes-release-process`
|
||
|
||
- **Description:** Use before opening a PR, before merge, or when handling a
|
||
release-cut. Rules for CHANGELOG bullet, when the release-cut commit belongs
|
||
in the PR, version bump decision (patch is default; ask before minor).
|
||
- **Body:** CHANGELOG discipline (Added / Changed / Fixed / Removed / Internal
|
||
grouping, `**BREAKING**` prefix); release-cut decision tree (when it is the
|
||
last commit on the PR); post-merge sequence (tag `vX.Y.Z` on merge commit +
|
||
`gh release create`); patch / minor / major guidance.
|
||
|
||
### Layer 2 — Reviewer subagents (`.claude/agents/`)
|
||
|
||
Each subagent has a standard frontmatter (`name`, `description`, `tools`,
|
||
`model`). All are read-only (no `Edit` / `Write`) — they return punch lists, not
|
||
code changes.
|
||
|
||
#### `agnes-reviewer-rules`
|
||
|
||
- **When fired:** every PR, at end of work, before opening the PR.
|
||
- **Tools:** `Read`, `Bash` (restricted to `git diff`, `git log`, `grep`).
|
||
- **Model:** Haiku — fast, mostly text work.
|
||
- **Input from main agent:** PR branch name (or current HEAD), optionally PR
|
||
draft body.
|
||
- **Checks:**
|
||
- CHANGELOG.md has a new bullet under `[Unreleased]` *iff* the PR changes
|
||
user-visible behavior. Smart, not blind — doc-only PRs typically do not
|
||
need a bullet; judgment applied based on the diff.
|
||
- No customer-specific tokens in the diff or PR body (deployment names,
|
||
internal hostnames, cloud project IDs, internal SA emails). The
|
||
`agnes-reviewer-rules` subagent loads the operator's token list from
|
||
`CLAUDE.local.md` (`vendor_tokens:` entry) rather than inlining
|
||
customer names in the public OSS repo.
|
||
- Commits do not contain `Co-Authored-By: Claude` or any AI attribution; PR
|
||
body is the same.
|
||
- Issue-economy red flags — filing follow-up issues rather than fix-it-now or
|
||
close-it-as-moot.
|
||
- Commit messages are clean and concise per project convention.
|
||
- Consults `agnes-release-process` skill for the release-cut implication of
|
||
this PR.
|
||
- **Output:** Done / Missing / Warning punch list.
|
||
|
||
#### `agnes-reviewer-rbac`
|
||
|
||
- **When fired:** when the diff touches `app/api/`, `app/auth/`, or
|
||
`app/resource_types.py`.
|
||
- **Tools:** `Read`, `Grep`, `Bash` (read-only).
|
||
- **Model:** Sonnet — needs to understand the auth flow.
|
||
- **Checks:**
|
||
- New `@router.get/post/...` handlers have `Depends(require_admin)` or
|
||
`Depends(require_resource_access(ResourceType.X, "..."))`.
|
||
- New `ResourceType` values have a `ResourceTypeSpec` registration in
|
||
`app/resource_types.py`.
|
||
- Consults `agnes-rbac` skill for the gate decision rules.
|
||
- **Output:** per-endpoint flag — gated correctly / missing gate / ambiguous.
|
||
|
||
#### `agnes-reviewer-architecture`
|
||
|
||
- **When fired:** when the diff touches `src/orchestrator.py`, `src/db.py`,
|
||
`connectors/*/extractor.py`, or adds a schema migration.
|
||
- **Tools:** `Read`, `Grep`, `Bash` (read-only).
|
||
- **Model:** Sonnet.
|
||
- **Checks:**
|
||
- Extractor changes preserve `_meta` table contract.
|
||
- Remote-attach changes preserve `_remote_attach` columns (`alias`,
|
||
`extension`, `url`, `token_env`).
|
||
- Schema bumps in `src/db.py` include the `vN-1 → vN` migration step, a
|
||
CHANGELOG note, and documentation references that reflect the new version.
|
||
- Changes to `rebuild()` / `rebuild_source()` hold `_rebuild_lock` on all
|
||
write paths.
|
||
- Consults `agnes-orchestrator` and `agnes-connectors` skills.
|
||
- **Output:** per-invariant punch list — holds / broken / unclear.
|
||
|
||
### Layer 3 — Releaser subagent (`.claude/agents/agnes-releaser.md`)
|
||
|
||
- **Tools:** `Read`, `Edit`, `Bash` (including `gh`, `git`).
|
||
- **Model:** Sonnet.
|
||
|
||
Two phases, each invoked explicitly by the user (never auto-fired).
|
||
|
||
**Phase 1 — pre-merge.** User says "ready to merge".
|
||
|
||
1. Consults `agnes-release-process` skill.
|
||
2. Runs `git log` since the last tag and inspects scope.
|
||
3. Decision tree: patch (default) vs minor (asks user) vs major (requires
|
||
explicit confirmation).
|
||
4. If this PR lands the only `[Unreleased]` content since the last release,
|
||
prepares the last commit on the PR: bump `pyproject.toml`, rename
|
||
`[Unreleased]` to `[X.Y.Z] - YYYY-MM-DD`, add a new empty `[Unreleased]`.
|
||
5. Pushes the prepared commit. **Does not merge** — the user merges via
|
||
`gh pr merge` themselves.
|
||
|
||
**Phase 2 — post-merge.** User says "tag it".
|
||
|
||
1. Verifies the merge commit contains the release-cut diff.
|
||
2. `git tag vX.Y.Z <merge-sha>` and `git push origin vX.Y.Z`.
|
||
3. `gh release create vX.Y.Z` with body extracted from the `[X.Y.Z]` section of
|
||
CHANGELOG.
|
||
4. Returns the GitHub Release URL.
|
||
|
||
**Never does:** merges the PR (high-blast-radius); force-pushes; amends
|
||
published commits.
|
||
|
||
### Layer 4 — Personal (`~/.claude/agents/<customer>-deploy.md`)
|
||
|
||
Outside this repo. Customer-specific content lives here so the OSS repo stays
|
||
vendor-agnostic per `CLAUDE.md § Vendor-agnostic OSS`. Operators write their
|
||
own Layer-4 agent against their deployment.
|
||
|
||
- **Tools:** `Read`, `Bash` (including the operator's cloud CLI — `gcloud`,
|
||
`aws`, etc. — plus `gh`, `git push`).
|
||
- **Model:** Sonnet.
|
||
- **Knows** (each item is customer-specific; concrete tokens stay in the
|
||
operator's home dir / `CLAUDE.local.md`, never committed):
|
||
- VM ↔ project ↔ zone ↔ cloud-account mapping (one row per deployment).
|
||
- Per-environment deploy ritual (e.g., force-push to `<branch>` to deploy
|
||
to `<host>`).
|
||
- Cross-references to private infra repos and pinned module tag.
|
||
- Default cloud account selection rules; explicit `--account=` overrides.
|
||
- Any pre-existing legacy systems still running alongside the deployment.
|
||
- **Does not** touch the OSS repo, push to public branches, or participate in
|
||
PR review.
|
||
|
||
## End-to-end PR flow
|
||
|
||
```
|
||
1. User asks for feature X.
|
||
|
||
2. Main agent creates a worktree (per CLAUDE.local.md), invokes relevant
|
||
knowledge skills based on what the feature touches:
|
||
- src/orchestrator.py → Skill(agnes-orchestrator)
|
||
- app/api/ + app/auth/ → Skill(agnes-rbac)
|
||
- connectors/ → Skill(agnes-connectors)
|
||
|
||
3. Implementation + tests + CHANGELOG bullet (the agnes-release-process skill
|
||
reminded the main agent it is needed).
|
||
|
||
4. Before opening the PR, main agent fires reviewers in parallel — one message,
|
||
multiple Agent tool calls:
|
||
- Agent(agnes-reviewer-rules) — always
|
||
- Agent(agnes-reviewer-rbac) — only if diff touched app/api or app/auth
|
||
- Agent(agnes-reviewer-architecture) — only if diff touched src/ or connectors/
|
||
|
||
5. Main agent aggregates the punch lists, fixes findings, opens the PR.
|
||
|
||
6. User says "ready to merge" → Agent(agnes-releaser, phase 1) prepares the
|
||
release-cut decision and the last commit on the PR.
|
||
|
||
7. User confirms the version → releaser pushes the prepared commit. User
|
||
merges via `gh pr merge` manually.
|
||
|
||
8. User says "tag it" → Agent(agnes-releaser, phase 2) creates the tag and the
|
||
GitHub Release.
|
||
```
|
||
|
||
A separate flow handles personal dev-VM deploys (outside the OSS PR cycle):
|
||
|
||
```
|
||
User says "push to my VM" → main agent invokes Agent(<customer>-deploy) →
|
||
that personal-layer agent knows what "push to my VM" means in the operator's
|
||
environment (typically a force-push to a deploy-target branch), confirms with
|
||
the user (force-push is destructive), pushes.
|
||
```
|
||
|
||
## What lands in the repo
|
||
|
||
```
|
||
.claude/
|
||
├── agents/
|
||
│ ├── agnes-reviewer-rules.md
|
||
│ ├── agnes-reviewer-rbac.md
|
||
│ ├── agnes-reviewer-architecture.md
|
||
│ └── agnes-releaser.md
|
||
└── skills/
|
||
├── agnes-orchestrator.md
|
||
├── agnes-rbac.md
|
||
├── agnes-connectors.md
|
||
└── agnes-release-process.md
|
||
|
||
docs/superpowers/specs/
|
||
└── 2026-05-15-agnes-agents-design.md (this document)
|
||
```
|
||
|
||
`CLAUDE.md` gets a short paragraph under "Project conventions" pointing at
|
||
`.claude/agents/` and `.claude/skills/`, noting that subagents and skills exist
|
||
and how to invoke them.
|
||
|
||
The personal layer (`~/.claude/agents/<customer>-deploy.md`) is written
|
||
separately by each operator and is not part of the in-repo change set.
|
||
|
||
## Non-goals
|
||
|
||
- **Not a Claude Code "team".** The architecture, review, and release agents
|
||
work sequentially and do not message each other; a team would add the
|
||
experimental flag, more tokens, file-conflict risk, and no functional gain.
|
||
If `agnes-reviewer-*` ever needs to coordinate across four+ parallel reviews,
|
||
re-evaluate.
|
||
- **Not a pre-commit hook replacement.** Mechanical checks (CHANGELOG bullet
|
||
present, AI-attribution scan) could be a `pre-commit` hook in addition; the
|
||
reviewer agent provides judgment-level checks the hook cannot.
|
||
- **No auto-merge.** `agnes-releaser` never runs `gh pr merge`. Merge is a
|
||
visible, user-controlled action.
|
||
|
||
## Open questions
|
||
|
||
- **Auto-trigger reliability for knowledge skills.** Claude Code skill
|
||
auto-spawn is description-driven and not always reliable in large skill
|
||
catalogs. Mitigation: a one-line pointer at the top of `CLAUDE.md`
|
||
("when touching X, invoke skill Y") and explicit invocation in the main
|
||
agent's planning step. If reliability is still low after a few weeks, fall
|
||
back to a `SessionStart` hook that lists Agnes-specific skills as a reminder.
|
||
- **Schema migration as a separate skill?** `agnes-orchestrator` covers
|
||
`src/db.py` migration patterns. If migration gotchas grow (versioning beyond
|
||
`vN`, multi-step migrations, data backfills), split into a fifth skill
|
||
`agnes-schema-migration`.
|
||
|
||
## Implementation plan
|
||
|
||
The follow-up step is to invoke the `writing-plans` skill to turn this spec
|
||
into a sequenced implementation plan covering: skills first (lowest risk,
|
||
testable in isolation), then reviewers, then the releaser, then a CLAUDE.md
|
||
pointer. Personal layer lands separately, outside the repo.
|