Sweep operator runbooks (docs/QUICKSTART, docs/HEADLESS_USAGE,
docs/architecture, docs/sample-data, docs/agent-workspace-prompt,
docs/metrics/metrics.yml, dev_docs/server, dev_docs/disaster-recovery),
the corporate-memory service README, the jira connector README + backfill
scripts, the deploy skill, and test docstrings. Replaces `da sync` →
`agnes pull`, `da analyst setup` → `agnes init`, `da metrics ...` →
`agnes catalog --metrics` / `agnes admin metrics ...`, `da fetch` →
`agnes snapshot create`, plus the matching docker-compose admin
invocations.
Vendor-specific `/opt/data-analyst/` install paths in jira backfill /
consistency scripts and operator docs are replaced with the
placeholder `<install-dir>` and a new `AGNES_ENV_FILE` env-var override
that lets a deployment inject its actual install path without a code
change. Aligns with the OSS vendor-agnostic policy in CLAUDE.md.
CHANGELOG `### Internal` entry summarizes the audit and reaffirms the
intentional stale-marker tuples (`_LEGACY_STRINGS`, `_OUR_COMMAND_MARKERS`)
that must keep referencing `da sync` / `da fetch` / etc. for hook upgrade
and override-detection logic.
Caught by my own broader test scope after Devin fixes — three test files
asserted on user-visible strings that were renamed by the bootstrap PR
but the assertions weren't updated:
- tests/test_api_query_guardrail.py:110 — asserted `da fetch in suggestion`
on /api/query 400 response. Renamed to `agnes snapshot create`.
- tests/test_query_materialized_error_message.py:56 — asserted `da sync`
in materialized-not-yet error detail. Renamed to `agnes pull`.
- tests/test_cli_error_render.py:71 — fixture data + assertion both
carried `da fetch`. Updated to `agnes snapshot create`.
Plus an actual content miss: docs/setup/claude_settings.json (a template
shipped to operators) still installed `da sync` / `da sync --upload-only`
hooks. The companion test file (tests/test_setup_hooks_template.py) was
asserting that legacy state. Updated both:
- Template hooks: `agnes pull --quiet` / `agnes push --quiet`
- Test assertions + function name match the new commands
E2E sub-agent finding: `da query --remote "SELECT * FROM <id>"` against a
materialized table that hasn't yet been rebuilt in the server's
analytics.duckdb returns a confusing DuckDB "Table does not exist"
message even though the table is in the registry. Materialized rows
produce parquets at `${DATA_DIR}/extracts/<source>/data/<id>.parquet`,
but the orchestrator's master-view creation is `_meta`-driven — fresh
instances or pre-tick states have the registry row without a
corresponding view, so analysts hit the bare "does not exist" with no
path forward.
Improve the error rendering in `app/api/query.py:execute_query`. When
DuckDB raises a "table does not exist" error, scan the registry for any
`query_mode='materialized'` row whose id or name appears in the failed
SQL. On a hit, return a 400 whose detail names the table, explains the
materialize state, and offers two concrete next steps:
1. Run `da sync` (or wait for the scheduler tick / hit
POST /api/sync/trigger) to materialize the parquet, OR
2. Query the source directly via the catalog alias when the registry row
carries bucket+source_table (e.g. `bq."dataset"."table"` for BigQuery,
`kbc."bucket"."table"` for Keboola).
Detection is bounded — the registry round-trip only fires when DuckDB's
error mentions a missing table, so happy-path queries pay no cost.
Non-materialized unknowns fall through to DuckDB's raw error.
2 new tests: materialized id surfaces the hint with the bucket+source_table
payload; unknown table falls back to the generic error path with no false
positive on the new hint.