Four knowledge skills auto-load into the main agent's context when their description matches the work; invokable explicitly via Skill(<name>):
- agnes-orchestrator: extract.duckdb ATTACH flow, query_mode semantics, _remote_attach, rebuild lock
- agnes-rbac: require_admin vs require_resource_access, ResourceType registration
- agnes-connectors: _meta contract, three connector shapes, new-connector checklist
- agnes-release-process: CHANGELOG discipline, release-cut, version bump, post-merge auto-rollback

Three reviewer subagents fire in parallel at end of PR work; one releaser subagent handles pre-merge release-cut + post-merge tag / GitHub Release:
- agnes-reviewer-rules: CHANGELOG bullet, vendor-agnostic scan, AI attribution, commit hygiene (always fires)
- agnes-reviewer-rbac: endpoint gates, ResourceType registration (fires on app/api/, app/auth/ diffs)
- agnes-reviewer-architecture: extract.duckdb invariants, schema migrations, rebuild lock (fires on src/, connectors/ diffs)
- agnes-releaser: Phase 1 pre-merge release-cut commit; Phase 2 post-merge tag + GitHub Release

.gitignore un-ignores .claude/agents/ and .claude/skills/ while keeping the rest of .claude/ local-only. CLAUDE.md gets a new 'Specialized agents and skills' section pointing at the two directories.

Source of truth for the rules these encode remains CLAUDE.md + docs/RELEASING.md; skills explicitly defer to the master docs on conflict.

Design rationale: docs/superpowers/specs/2026-05-15-agnes-agents-design.md
Implementation plan: docs/superpowers/plans/2026-05-15-agnes-agents.md
Agnes specialized agents — design
Status: approved (brainstorm), not yet implemented
Date: 2026-05-15
Author: zsrotyr
Problem
Working on Agnes through Claude Code, three classes of friction recur:
- Mental model drift. Claude (and humans) forget how Agnes hangs together: the extract.duckdb contract, RBAC layering (require_admin vs require_resource_access), query_mode semantics (local / remote / materialized), the orchestrator's rebuild() flow. Edits to one part silently break invariants of another and are caught only at code review.
- Convention enforcement at review. Several CLAUDE.md rules (CHANGELOG bullet, vendor-agnostic OSS, no AI attribution, issue economy, RBAC gates on new endpoints) are easy to forget. Manual review catches most but not all.
- Release-cut workflow. The non-negotiable rules from docs/RELEASING.md and CLAUDE.md § Release process (release-cut belongs in the PR that earned it, is the last commit on the PR, and is followed by a post-merge tag + GitHub Release) change over time and are repetitive enough that applying them by hand each time is error-prone.
The brainstorm explored variants A ("one big architect"), B ("three specialist subagents covering all three concerns"), and C ("knowledge as skills, enforcement/workflow as subagents"). C was selected because the mental model is not delegatable — when the main agent writes code it needs the context in its own window, not in a subagent's.
Approach
Four layers, each with a distinct mechanism and lifecycle:
Layer 1: KNOWLEDGE SKILLS (.claude/skills/agnes-*.md)
- Auto-trigger by description + explicit invocation by main agent
- Load into MAIN agent's context
- Purpose: main agent knows how Agnes works while writing code
Layer 2: REVIEWER SUBAGENTS (.claude/agents/agnes-reviewer-*.md)
- Spawned via Agent tool at end of PR work
- Own context window; return a punch list
- Three specialists, fired in parallel by main agent
Layer 3: RELEASER SUBAGENT (.claude/agents/agnes-releaser.md)
- Spawned via Agent tool pre-merge (phase 1) and post-merge (phase 2)
- Own context window; produces release-cut commit + tag + GitHub Release
Layer 4: PERSONAL (~/.claude/agents/<customer>-deploy.md)
- Outside this repo (operator's home dir, never committed)
- Customer-specific context (customer VMs, gcloud / cloud accounts,
private infra repos, deploy targets)
Skills (Layer 1) share context with the main agent — when Claude edits
src/orchestrator.py, the orchestrator skill is loaded so rules stay "in head"
across delegation. Subagents (Layers 2 and 3) isolate context — review/release
read many files but only a punch list returns to the main conversation.
Cross-layer sharing happens through skills. The agnes-release-process skill is
read by the main agent (planning), by agnes-reviewer-rules (checking), and by
agnes-releaser (executing). Single source of truth.
Components
Layer 1 — Knowledge skills (.claude/skills/)
Four skills. Each file is ~80–120 lines. Files are kept focused; if one grows
past ~150 lines that is a signal to split. Skills do not duplicate CLAUDE.md
content verbatim — they reference (see CLAUDE.md § Access control) so master
rules have one location.
agnes-orchestrator
- Description (triggers auto-spawn): Use when editing src/orchestrator.py, src/db.py, or anything that produces extract.duckdb in connectors/*/. Rules for ATTACH flow, query_mode semantics, and when rebuild() is required.
- Body: master view lifecycle in analytics.duckdb; thread-safety via _rebuild_lock; rebuild_source(name) vs full rebuild() decision; _remote_attach reattach flow at query time (extension install + token resolution via token_env or extension-specific auth path).
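As a rough illustration of the locking rule, the sketch below guards both rebuild paths with one lock. Only the names _rebuild_lock, rebuild_source, and rebuild come from this spec; the Orchestrator class, its _views counter, the source names, and the choice of a re-entrant lock are assumptions made for the example and may not match src/orchestrator.py.

```python
import threading


class Orchestrator:
    """Hypothetical stand-in for the real orchestrator; only the locking shape matters."""

    def __init__(self, sources):
        self._sources = list(sources)
        self._views = {}  # source name -> how many times its views were rebuilt
        # Re-entrant lock (an assumption) so rebuild() can call rebuild_source()
        # while it already holds the lock.
        self._rebuild_lock = threading.RLock()

    def rebuild_source(self, name):
        """Refresh the views of one source under the rebuild lock."""
        if name not in self._sources:
            raise KeyError(name)
        with self._rebuild_lock:
            self._views[name] = self._views.get(name, 0) + 1

    def rebuild(self):
        """Refresh every source; the lock is held for the whole pass."""
        with self._rebuild_lock:
            for name in self._sources:
                self.rebuild_source(name)


orch = Orchestrator(["crm", "billing"])  # hypothetical source names
orch.rebuild()
orch.rebuild_source("crm")
print(orch._views)  # {'crm': 2, 'billing': 1}
```

Every write path takes the same lock, which is the invariant the architecture reviewer is asked to check.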
agnes-rbac
- Description: Use when adding or changing an endpoint in app/api/, touching app/auth/, or introducing a new resource type. Enforces gate pattern (require_admin vs require_resource_access) and ResourceType registration.
- Body: decision tree for picking a gate (app-level mutation vs entity-scoped); how to add a new ResourceType (StrEnum value + ResourceTypeSpec with list_blocks delegate in app/resource_types.py, no DB migration); when a grant is needed even for reads; god-mode short-circuit for the Admin group.
agnes-connectors
- Description: Use when adding a new data source or modifying an existing extractor in connectors/. Enforces the extract.duckdb contract: _meta table, query_mode column, parquet layout.
- Body: required _meta columns (table_name, description, rows, size_bytes, extracted_at, query_mode); when a connector is batch-pull vs remote-attach vs real-time push; how to expose _remote_attach for remote mode; where the extractor writes (/data/extracts/{source}/).
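The _meta contract can be made concrete with a small validator. This is a minimal sketch: the column names and the three query_mode values are taken from this spec, while check_meta_row, the dict-per-row representation, and the sample row are hypothetical and not part of the Agnes codebase.

```python
REQUIRED_META_COLUMNS = (
    "table_name", "description", "rows", "size_bytes", "extracted_at", "query_mode",
)
# Values listed under "query_mode semantics" in the Problem section.
QUERY_MODES = {"local", "remote", "materialized"}


def check_meta_row(row):
    """Return a list of problems for one _meta row given as a dict."""
    problems = [f"missing column: {c}" for c in REQUIRED_META_COLUMNS if c not in row]
    mode = row.get("query_mode")
    if mode is not None and mode not in QUERY_MODES:
        problems.append(f"unknown query_mode: {mode}")
    return problems


# Hypothetical row that forgot the query_mode column.
row = {
    "table_name": "orders",
    "description": "Order lines",
    "rows": 120,
    "size_bytes": 4096,
    "extracted_at": "2026-05-15T00:00:00Z",
}
print(check_meta_row(row))  # ['missing column: query_mode']
```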
agnes-release-process
- Description: Use before opening a PR, before merge, or when handling a release-cut. Rules for CHANGELOG bullet, when the release-cut commit belongs in the PR, version bump decision (patch is default; ask before minor).
- Body: CHANGELOG discipline (Added / Changed / Fixed / Removed / Internal grouping, **BREAKING** prefix); release-cut decision tree (when it is the last commit on the PR); post-merge sequence (tag vX.Y.Z on merge commit + gh release create); patch / minor / major guidance.
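The version bump rule (patch by default, anything larger only after the user agrees) could look roughly like the following. The function name bump_version and the confirmed flag are illustrative assumptions; the sketch also assumes plain X.Y.Z version strings.

```python
def bump_version(version, level="patch", confirmed=False):
    """Return the next X.Y.Z version; minor and major bumps need explicit confirmation."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if not confirmed:
        # Mirrors "ask before minor": refuse rather than guess.
        raise PermissionError(f"{level} bump requires user confirmation")
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "major":
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown bump level: {level}")


print(bump_version("1.4.2"))                           # 1.4.3
print(bump_version("1.4.2", "minor", confirmed=True))  # 1.5.0
```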
Layer 2 — Reviewer subagents (.claude/agents/)
Each subagent has a standard frontmatter (name, description, tools,
model). All are read-only (no Edit / Write) — they return punch lists, not
code changes.
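A reviewer file's frontmatter and its read-only constraint can be checked mechanically. The sketch below assumes the frontmatter is a block of simple key: value lines between two --- markers with tools given as a comma-separated list; the parser, the check_reviewer helper, and the example file content are hypothetical.

```python
REQUIRED_KEYS = ("name", "description", "tools", "model")
WRITE_TOOLS = {"Edit", "Write"}


def parse_frontmatter(text):
    """Parse simple `key: value` lines between the first two '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_reviewer(text):
    """Return problems: missing frontmatter keys or write-capable tools."""
    fields = parse_frontmatter(text)
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in fields]
    tools = {t.strip() for t in fields.get("tools", "").split(",") if t.strip()}
    for tool in sorted(tools & WRITE_TOOLS):
        problems.append(f"reviewer must be read-only, found tool: {tool}")
    return problems


# Hypothetical reviewer file content.
EXAMPLE = """---
name: agnes-reviewer-rbac
description: Checks endpoint gates on app/api/ and app/auth/ diffs
tools: Read, Grep, Bash
model: sonnet
---
Body of the subagent prompt.
"""
print(check_reviewer(EXAMPLE))  # []
```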
agnes-reviewer-rules
- When fired: every PR, at end of work, before opening the PR.
- Tools: Read, Bash (restricted to git diff, git log, grep).
- Model: Haiku (fast, mostly text work).
- Input from main agent: PR branch name (or current HEAD), optionally PR draft body.
- Checks:
  - CHANGELOG.md has a new bullet under [Unreleased] iff the PR changes user-visible behavior. Smart, not blind: doc-only PRs typically do not need a bullet; judgment is applied based on the diff.
  - No customer-specific tokens in the diff or PR body (deployment names, internal hostnames, cloud project IDs, internal SA emails). The agnes-reviewer-rules subagent loads the operator's token list from CLAUDE.local.md (vendor_tokens: entry) rather than inlining customer names in the public OSS repo.
  - Commits do not contain Co-Authored-By: Claude or any AI attribution; the same applies to the PR body.
  - Issue-economy red flags: filing follow-up issues rather than fix-it-now or close-it-as-moot.
  - Commit messages are clean and concise per project convention.
- Consults agnes-release-process skill for the release-cut implication of this PR.
- Output: Done / Missing / Warning punch list.
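The mechanical part of these checks can be sketched as a function that returns a Done / Missing / Warning punch list. Everything here is illustrative: review_rules, its parameters, and the sample token are hypothetical, the regex only covers the Co-Authored-By: Claude trailer rather than every form of AI attribution, and the judgment-level checks (issue economy, commit message quality) are left to the subagent.

```python
import re

AI_ATTRIBUTION = re.compile(r"co-authored-by:\s*claude", re.IGNORECASE)


def review_rules(commit_messages, diff_text, changelog_touched, vendor_tokens):
    """Return a punch list of (status, message) tuples."""
    punch = []
    if changelog_touched:
        punch.append(("Done", "CHANGELOG.md has a new bullet"))
    else:
        # Doc-only PRs may not need a bullet, so flag this for judgment.
        punch.append(("Warning", "no CHANGELOG.md bullet; confirm the PR is not user-visible"))

    if any(AI_ATTRIBUTION.search(message) for message in commit_messages):
        punch.append(("Missing", "remove AI attribution from commit messages"))
    else:
        punch.append(("Done", "no AI attribution in commit messages"))

    lowered = diff_text.lower()
    leaked = sorted(token for token in vendor_tokens if token.lower() in lowered)
    if leaked:
        punch.append(("Missing", "customer-specific tokens in diff: " + ", ".join(leaked)))
    else:
        punch.append(("Done", "no customer-specific tokens in diff"))
    return punch


# "acme-prod" is a made-up token standing in for an operator's vendor_tokens entry.
punch = review_rules(
    commit_messages=["Fix rebuild lock\n\nCo-Authored-By: Claude"],
    diff_text="+ host = acme-prod.internal",
    changelog_touched=True,
    vendor_tokens=["acme-prod"],
)
for status, message in punch:
    print(f"{status}: {message}")
```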
agnes-reviewer-rbac
- When fired: when the diff touches app/api/, app/auth/, or app/resource_types.py.
- Tools: Read, Grep, Bash (read-only).
- Model: Sonnet (needs to understand the auth flow).
- Checks:
  - New @router.get/post/... handlers have Depends(require_admin) or Depends(require_resource_access(ResourceType.X, "...")).
  - New ResourceType values have a ResourceTypeSpec registration in app/resource_types.py.
- Consults agnes-rbac skill for the gate decision rules.
- Output: per-endpoint flag: gated correctly / missing gate / ambiguous.
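A first-pass version of the endpoint-gate check can be written as a text scan. This is a heuristic sketch only: it looks for a gate in the decorator and the def signature, so a gate applied elsewhere (for example on the router itself) would be reported as missing. The function find_ungated, the sample handlers, and ResourceType.TABLE are hypothetical.

```python
import re

ROUTE = re.compile(r"^\s*@router\.(get|post|put|patch|delete)\(")
GATE = re.compile(r"Depends\(\s*(require_admin\b|require_resource_access\()")


def find_ungated(source):
    """Return names of @router handlers whose decorator + signature show no gate."""
    lines = source.splitlines()
    ungated = []
    i = 0
    while i < len(lines):
        if not ROUTE.match(lines[i]):
            i += 1
            continue
        chunk = []
        # Collect the decorator and the def signature up to the line ending in ':'.
        while i < len(lines):
            chunk.append(lines[i])
            done = "def " in "\n".join(chunk) and lines[i].rstrip().endswith(":")
            i += 1
            if done:
                break
        text = "\n".join(chunk)
        name = re.search(r"def\s+(\w+)", text)
        if name and not GATE.search(text):
            ungated.append(name.group(1))
    return ungated


SAMPLE = '''
@router.get("/reports")
async def list_reports(user=Depends(require_admin)):
    return []

@router.post("/reports")
async def create_report(payload: dict):
    return payload

@router.get("/tables/{table_id}")
async def read_table(
    table_id: str,
    user=Depends(require_resource_access(ResourceType.TABLE, "table_id")),
):
    return table_id
'''
print(find_ungated(SAMPLE))  # ['create_report']
```

Anything the scan reports still needs the subagent's judgment, which is why the output includes an "ambiguous" flag.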
agnes-reviewer-architecture
- When fired: when the diff touches src/orchestrator.py, src/db.py, connectors/*/extractor.py, or adds a schema migration.
- Tools: Read, Grep, Bash (read-only).
- Model: Sonnet.
- Checks:
  - Extractor changes preserve _meta table contract.
  - Remote-attach changes preserve _remote_attach columns (alias, extension, url, token_env).
  - Schema bumps in src/db.py include the vN-1 → vN migration step, a CHANGELOG note, and documentation references that reflect the new version.
  - Changes to rebuild() / rebuild_source() hold _rebuild_lock on all write paths.
- Consults agnes-orchestrator and agnes-connectors skills.
- Output: per-invariant punch list: holds / broken / unclear.
Layer 3 — Releaser subagent (.claude/agents/agnes-releaser.md)
- Tools: Read, Edit, Bash (including gh, git).
- Model: Sonnet.
Two phases, each invoked explicitly by the user (never auto-fired).
Phase 1 — pre-merge. User says "ready to merge".
- Consults agnes-release-process skill.
- Runs git log since the last tag and inspects scope.
- Decision tree: patch (default) vs minor (asks user) vs major (requires explicit confirmation).
- If this PR lands the only [Unreleased] content since the last release, prepares the last commit on the PR: bump pyproject.toml, rename [Unreleased] to [X.Y.Z] - YYYY-MM-DD, add a new empty [Unreleased].
- Pushes the prepared commit. Does not merge; the user merges via gh pr merge themselves.
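The CHANGELOG half of the release-cut commit is a small text transformation. The sketch below assumes release headings are written as level-2 headings of the form "## [Unreleased]"; the actual heading level in CHANGELOG.md is not stated in this spec, and cut_release and the sample changelog are hypothetical. The pyproject.toml bump is not shown.

```python
import re

UNRELEASED = re.compile(r"^## \[Unreleased\][ \t]*$", re.MULTILINE)


def cut_release(changelog, version, date):
    """Rename [Unreleased] to [version] - date and add a new empty [Unreleased] above it."""
    if not UNRELEASED.search(changelog):
        raise ValueError("no [Unreleased] heading found")
    replacement = f"## [Unreleased]\n\n## [{version}] - {date}"
    # A callable replacement avoids any backslash processing of the new text.
    return UNRELEASED.sub(lambda match: replacement, changelog, count=1)


before = (
    "# Changelog\n\n"
    "## [Unreleased]\n\n"
    "### Fixed\n"
    "- Rebuild lock held on all write paths.\n\n"
    "## [1.4.2] - 2026-05-01\n"
)
after = cut_release(before, "1.4.3", "2026-05-15")
print(after)
```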
Phase 2 — post-merge. User says "tag it".
- Verifies the merge commit contains the release-cut diff.
- git tag vX.Y.Z <merge-sha> and git push origin vX.Y.Z.
- gh release create vX.Y.Z with body extracted from the [X.Y.Z] section of CHANGELOG.
- Returns the GitHub Release URL.
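Extracting the release body from the CHANGELOG could be done along these lines. The sketch again assumes "## [X.Y.Z] - date" headings; release_notes and the sample text are hypothetical, and the tagging and gh steps themselves are left out.

```python
def release_notes(changelog, version):
    """Return the body of the [version] section, without its heading."""
    body, inside, found = [], False, False
    for line in changelog.splitlines():
        if line.startswith("## ["):
            if inside:
                break  # the next release heading ends the section
            inside = found = line.startswith(f"## [{version}]")
            continue
        if inside:
            body.append(line)
    if not found:
        raise ValueError(f"no section for version {version}")
    return "\n".join(body).strip()


changelog = (
    "# Changelog\n\n"
    "## [Unreleased]\n\n"
    "## [1.4.3] - 2026-05-15\n\n"
    "### Fixed\n"
    "- Rebuild lock held on all write paths.\n\n"
    "## [1.4.2] - 2026-05-01\n\n"
    "### Added\n"
    "- Earlier entry.\n"
)
print(release_notes(changelog, "1.4.3"))
```

The returned text is what would be supplied as the release body for vX.Y.Z.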
Never does: merges the PR (high-blast-radius); force-pushes; amends published commits.
Layer 4 — Personal (~/.claude/agents/<customer>-deploy.md)
Outside this repo. Customer-specific content lives here so the OSS repo stays
vendor-agnostic per CLAUDE.md § Vendor-agnostic OSS. Operators write their
own Layer-4 agent against their deployment.
- Tools: Read, Bash (including the operator's cloud CLI, such as gcloud or aws, plus gh and git push).
- Model: Sonnet.
- Knows (each item is customer-specific; concrete tokens stay in the operator's home dir / CLAUDE.local.md, never committed):
  - VM ↔ project ↔ zone ↔ cloud-account mapping (one row per deployment).
  - Per-environment deploy ritual (e.g., force-push to <branch> to deploy to <host>).
  - Cross-references to private infra repos and pinned module tag.
  - Default cloud account selection rules; explicit --account= overrides.
  - Any pre-existing legacy systems still running alongside the deployment.
- Does not touch the OSS repo, push to public branches, or participate in PR review.
End-to-end PR flow
1. User asks for feature X.
2. Main agent creates a worktree (per CLAUDE.local.md), invokes relevant
knowledge skills based on what the feature touches:
- src/orchestrator.py → Skill(agnes-orchestrator)
- app/api/ + app/auth/ → Skill(agnes-rbac)
- connectors/ → Skill(agnes-connectors)
3. Implementation + tests + CHANGELOG bullet (the agnes-release-process skill
reminded the main agent it is needed).
4. Before opening the PR, main agent fires reviewers in parallel — one message,
multiple Agent tool calls:
- Agent(agnes-reviewer-rules) — always
- Agent(agnes-reviewer-rbac) — only if diff touched app/api or app/auth
- Agent(agnes-reviewer-architecture) — only if diff touched src/ or connectors/
5. Main agent aggregates the punch lists, fixes findings, opens the PR.
6. User says "ready to merge" → Agent(agnes-releaser, phase 1) prepares the
release-cut decision and the last commit on the PR.
7. User confirms the version → releaser pushes the prepared commit. User
merges via `gh pr merge` manually.
8. User says "tag it" → Agent(agnes-releaser, phase 2) creates the tag and the
GitHub Release.
A separate flow handles personal dev-VM deploys (outside the OSS PR cycle):
User says "push to my VM" → main agent invokes Agent(<customer>-deploy) →
that personal-layer agent knows what "push to my VM" means in the operator's
environment (typically a force-push to a deploy-target branch), confirms with
the user (force-push is destructive), pushes.
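The reviewer selection in step 4 reduces to a path-prefix check. This sketch uses the coarse triggers from the flow above (app/api or app/auth, src/ or connectors/); the per-reviewer sections list finer-grained triggers. The function name and the example paths are hypothetical.

```python
def select_reviewers(changed_paths):
    """Return the reviewer subagents to fire for a list of changed file paths."""
    reviewers = ["agnes-reviewer-rules"]  # always fires
    if any(path.startswith(("app/api/", "app/auth/")) for path in changed_paths):
        reviewers.append("agnes-reviewer-rbac")
    if any(path.startswith(("src/", "connectors/")) for path in changed_paths):
        reviewers.append("agnes-reviewer-architecture")
    return reviewers


print(select_reviewers(["app/api/users.py", "docs/README.md"]))
# ['agnes-reviewer-rules', 'agnes-reviewer-rbac']
```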
What lands in the repo
.claude/
├── agents/
│ ├── agnes-reviewer-rules.md
│ ├── agnes-reviewer-rbac.md
│ ├── agnes-reviewer-architecture.md
│ └── agnes-releaser.md
└── skills/
├── agnes-orchestrator.md
├── agnes-rbac.md
├── agnes-connectors.md
└── agnes-release-process.md
docs/superpowers/specs/
└── 2026-05-15-agnes-agents-design.md (this document)
CLAUDE.md gets a short paragraph under "Project conventions" pointing at
.claude/agents/ and .claude/skills/, noting that subagents and skills exist
and how to invoke them.
The personal layer (~/.claude/agents/<customer>-deploy.md) is written
separately by each operator and is not part of the in-repo change set.
Non-goals
- Not a Claude Code "team". The architecture, review, and release agents work sequentially and do not message each other; a team would add the experimental flag, more tokens, file-conflict risk, and no functional gain. If agnes-reviewer-* ever needs to coordinate across four+ parallel reviews, re-evaluate.
- Not a pre-commit hook replacement. Mechanical checks (CHANGELOG bullet present, AI-attribution scan) could be a pre-commit hook in addition; the reviewer agent provides judgment-level checks the hook cannot.
- No auto-merge. agnes-releaser never runs gh pr merge. Merge is a visible, user-controlled action.
Open questions
- Auto-trigger reliability for knowledge skills. Claude Code skill auto-spawn is description-driven and not always reliable in large skill catalogs. Mitigation: a one-line pointer at the top of CLAUDE.md ("when touching X, invoke skill Y") and explicit invocation in the main agent's planning step. If reliability is still low after a few weeks, fall back to a SessionStart hook that lists Agnes-specific skills as a reminder.
- Schema migration as a separate skill? agnes-orchestrator covers src/db.py migration patterns. If migration gotchas grow (versioning beyond vN, multi-step migrations, data backfills), split into a fifth skill agnes-schema-migration.
Implementation plan
The follow-up step is to invoke the writing-plans skill to turn this spec
into a sequenced implementation plan covering: skills first (lowest risk,
testable in isolation), then reviewers, then the releaser, then a CLAUDE.md
pointer. Personal layer lands separately, outside the repo.