Four knowledge skills auto-load into the main agent's context when their description matches the work; invokable explicitly via Skill(<name>):
- agnes-orchestrator: extract.duckdb ATTACH flow, query_mode semantics, _remote_attach, rebuild lock
- agnes-rbac: require_admin vs require_resource_access, ResourceType registration
- agnes-connectors: _meta contract, three connector shapes, new-connector checklist
- agnes-release-process: CHANGELOG discipline, release-cut, version bump, post-merge auto-rollback

Three reviewer subagents fire in parallel at end of PR work; one releaser subagent handles pre-merge release-cut + post-merge tag / GitHub Release:
- agnes-reviewer-rules: CHANGELOG bullet, vendor-agnostic scan, AI attribution, commit hygiene (always fires)
- agnes-reviewer-rbac: endpoint gates, ResourceType registration (fires on app/api/, app/auth/ diffs)
- agnes-reviewer-architecture: extract.duckdb invariants, schema migrations, rebuild lock (fires on src/, connectors/ diffs)
- agnes-releaser: Phase 1 pre-merge release-cut commit; Phase 2 post-merge tag + GitHub Release

.gitignore un-ignores .claude/agents/ and .claude/skills/ while keeping the rest of .claude/ local-only. CLAUDE.md gets a new 'Specialized agents and skills' section pointing at the two directories.

Source of truth for the rules these encode remains CLAUDE.md + docs/RELEASING.md; skills explicitly defer to the master docs on conflict.

Design rationale: docs/superpowers/specs/2026-05-15-agnes-agents-design.md
Implementation plan: docs/superpowers/plans/2026-05-15-agnes-agents.md
| name | description |
|---|---|
| agnes-connectors | Rules for the extract.duckdb contract every data source must produce — the _meta table, the _remote_attach mechanism for remote-mode tables, parquet layout, and the pattern for adding a new connector. Use when adding a new data source or modifying an existing extractor in connectors/. |
Agnes connectors — the extract.duckdb contract
Every data source produces the same output:
/data/extracts/{source_name}/
├── extract.duckdb ← _meta table + views
└── data/ ← parquet files (local sources only)
See CLAUDE.md § Architecture: extract.duckdb Contract and
docs/architecture.md.
Required _meta table
Every extract.duckdb MUST contain a _meta table with these columns:
| column | type | meaning |
|---|---|---|
| table_name | VARCHAR | name used in views |
| description | VARCHAR | human-readable description |
| rows | BIGINT | row count at extraction time |
| size_bytes | BIGINT | parquet size for local mode, 0 for remote |
| extracted_at | TIMESTAMP | extraction time |
| query_mode | VARCHAR | one of local, remote, materialized |
If _meta is missing or malformed, SyncOrchestrator.rebuild() logs an error
and skips the source. Tests for new connectors MUST assert that _meta is
well-formed.
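A well-formedness check of this kind might look like the sketch below. It assumes the test has already read the _meta schema as (column name, column type) pairs and the rows as dictionaries, for example from a DESCRIBE and a SELECT against extract.duckdb. The helper name check_meta and its message strings are hypothetical and not part of Agnes.

```python
# Hypothetical helper for connector tests; names are illustrative, not Agnes APIs.
REQUIRED_META_COLUMNS = {
    "table_name": "VARCHAR",
    "description": "VARCHAR",
    "rows": "BIGINT",
    "size_bytes": "BIGINT",
    "extracted_at": "TIMESTAMP",
    "query_mode": "VARCHAR",
}
VALID_QUERY_MODES = {"local", "remote", "materialized"}


def check_meta(schema, rows):
    """Return a list of problems found in a _meta schema and its rows.

    schema: iterable of (column_name, column_type) pairs.
    rows:   iterable of dicts keyed by column name.
    An empty list means _meta looks well-formed.
    """
    problems = []
    actual = {name: col_type.upper() for name, col_type in schema}
    for name, expected in REQUIRED_META_COLUMNS.items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name] != expected:
            problems.append(f"{name}: expected {expected}, got {actual[name]}")
    for row in rows:
        mode = row.get("query_mode")
        if mode not in VALID_QUERY_MODES:
            problems.append(f"{row.get('table_name')}: bad query_mode {mode!r}")
        # The table above documents size_bytes as 0 for remote tables.
        if mode == "remote" and row.get("size_bytes") != 0:
            problems.append(f"{row.get('table_name')}: remote size_bytes must be 0")
    return problems
```

A connector test could then assert that check_meta(schema, rows) returns an empty list, which also makes a failure message list every problem at once.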
Four connector shapes
- Batch pull (Keboola, query_mode='local'): DuckDB extension downloads data to parquet, scheduled. Extractor in connectors/<name>/extractor.py.
- Remote attach (BigQuery, query_mode='remote'): DuckDB BQ extension, no download. Queries hit the upstream at query time. Requires _remote_attach.
- Materialized SQL (query_mode='materialized'): scheduler runs admin-registered SQL through DuckDB and writes the result to a parquet under /data/extracts/<source>/data/. Distributed via the same manifest + agnes pull flow as local. BigQuery cost guardrail: data_source.bigquery.max_bytes_per_materialize (default 10 GiB; 0 disables).
- Real-time push (Jira): webhooks update parquets incrementally; the webhook handler triggers rebuild_source('jira').
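The three query_mode values and the materialize guardrail can be summarised in a short sketch. Everything named here (QueryMode, needs_remote_attach, writes_parquet, within_materialize_budget) is hypothetical and only mirrors the list above. Treating an estimate exactly equal to the limit as allowed is an assumption, and the real-time push shape is left out because its query_mode is not stated here.

```python
from enum import Enum

GIB = 1024 ** 3
DEFAULT_MAX_BYTES_PER_MATERIALIZE = 10 * GIB  # documented default: 10 GiB


class QueryMode(str, Enum):
    LOCAL = "local"
    REMOTE = "remote"
    MATERIALIZED = "materialized"


def needs_remote_attach(mode: QueryMode) -> bool:
    """Only remote-mode tables require a _remote_attach row."""
    return mode is QueryMode.REMOTE


def writes_parquet(mode: QueryMode) -> bool:
    """Local and materialized tables land as parquet under data/."""
    return mode in (QueryMode.LOCAL, QueryMode.MATERIALIZED)


def within_materialize_budget(
    estimated_bytes: int,
    max_bytes: int = DEFAULT_MAX_BYTES_PER_MATERIALIZE,
) -> bool:
    """Apply the cost guardrail; max_bytes == 0 disables the check."""
    if max_bytes == 0:
        return True
    # Assumption: an estimate equal to the limit is still allowed.
    return estimated_bytes <= max_bytes
```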
_remote_attach table (remote mode only)
For each remote-mode table in _meta, the extractor writes a row in
_remote_attach with alias, extension, url, token_env. See the
agnes-orchestrator skill for how the orchestrator consumes it.
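As a rough illustration of the row shape, the sketch below models one _remote_attach row and resolves its token. It assumes token_env holds the name of an environment variable rather than the token itself; that reading, the RemoteAttachRow class, and the resolve_token helper are all hypothetical. How the orchestrator actually consumes these rows is covered by the agnes-orchestrator skill.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteAttachRow:
    """One row of _remote_attach (illustrative model, not an Agnes class)."""
    alias: str
    extension: str
    url: str
    token_env: str  # assumed: NAME of the env var that holds the token


def resolve_token(row, environ=None):
    """Look up the token named by token_env; raise if it is unset or empty."""
    environ = os.environ if environ is None else environ
    token = environ.get(row.token_env)
    if not token:
        raise KeyError(f"{row.token_env} is not set for alias {row.alias!r}")
    return token
```

Under that assumption, only the variable name is written into extract.duckdb, so the secret itself never lands in the extract.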
Adding a new connector — checklist
- Create connectors/<name>/extractor.py that emits extract.duckdb (+ data/*.parquet if local) into /data/extracts/<name>/.
- Populate _meta with one row per table.
- If any table is query_mode='remote', populate _remote_attach.
- Register the connector type in the catalog (search for existing source_type values to follow the pattern).
- Add a fixture-based test that runs the extractor against a fixture upstream and asserts _meta is complete.
- CHANGELOG bullet under Added per agnes-release-process.
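A fixture-based test could start by checking the on-disk layout before it inspects _meta. The sketch below is a minimal, standard-library-only version of that first step. The function name check_extract_layout and its arguments are hypothetical, and it checks only file presence, not file contents.

```python
from pathlib import Path


def check_extract_layout(extracts_root, source_name, has_local_tables):
    """Return a list of layout problems for one source's extract directory.

    extracts_root:    directory standing in for /data/extracts in a test.
    has_local_tables: True if the connector is expected to write parquet.
    """
    source_dir = Path(extracts_root) / source_name
    problems = []
    if not (source_dir / "extract.duckdb").is_file():
        problems.append("extract.duckdb is missing")
    if has_local_tables and not list((source_dir / "data").glob("*.parquet")):
        problems.append("no parquet files under data/")
    return problems
```

Pointing extracts_root at a temporary directory keeps the test away from the real /data/extracts tree.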
Stable infrastructure — do NOT modify
connectors/jira/file_lock.py. (connectors/jira/transform.py was
previously off-limits but as of 0.54.19 is no longer; it remains
sensitive — touch only with end-to-end understanding of the
JSON-overlay / parquet-rewrite pipeline.)