Commit graph

7 commits

Author SHA1 Message Date
ZdenekSrotyr
7d868159f2 address Devin Review on PR #264
BUG_0001 (red): config/claude_md_template.txt is the Jinja2 source for
every analyst workspace's CLAUDE.md served via /api/welcome
(src/claude_md.py). It still instructed the agent to use the removed
--register-bq flag in 6 places — defeating the point of the PR for
anyone who ran agnes init after merge. Rewritten:
- ASCII routing diagram: "join with a local table" now points to
  "agnes snapshot create the remote side, then join locally"
- "Three patterns" table → "Two patterns" (snapshot create + --remote)
- "Hybrid query example" rewritten as snapshot-create + local join,
  with --remote called out as the escape hatch when the remote side
  is too big to snapshot
- "When the table isn't in agnes catalog" — drop the ad-hoc
  --register-bq path; admins register, no analyst-side workaround
- Footer cross-ref drops "hybrid-query examples"

BUG_0002 (yellow): cli/error_render.py docstring line 7 said "All
three previously flattened..." after I had already reduced "Three CLI
paths" → "Two CLI paths" on line 3. "All three" → "Both".
2026-05-12 18:18:13 +02:00
ZdenekSrotyr
79b55b6ff3 remove agnes query --register-bq from client CLI
The flag ran RemoteQueryEngine in-process on the caller's machine and
required local BigQuery credentials (BIGQUERY_PROJECT + ADC). Analysts
don't have those, so calling --register-bq from an analyst workspace
surfaced as a confusing not_configured error chain ("Could not load
static instance.yaml" + "BigQuery project not configured"). An agent
following CLAUDE.md's hybrid-queries guidance would land in exactly
that trap.

The underlying engine was originally designed server-side (commit
d180b201, "Step 28: Remote query architecture"); the CLI port (commit
d605e7d9) silently assumed parity with the server. Server-side hybrid
already exists as an admin-only POST /api/query/hybrid endpoint
(app/api/query_hybrid.py) and is untouched here.

Analysts combining local + remote data now have two documented paths:
agnes snapshot create a filtered slice and join locally, or run the
join server-side via agnes query --remote. CLAUDE.md, the agent skill,
docs/DATA_SOURCES.md, and connectors.md updated accordingly.
2026-05-12 18:18:13 +02:00
ZdenekSrotyr
103efb69f0 chore(cli-rename): replace stale da verbs in active code paths
Bring admin UI, audit-log messages, code comments, and analyst-facing
skill docs in line with the post-bootstrap CLI surface (`agnes pull`,
`agnes push`, `agnes init`, `agnes snapshot create`). The legacy
`_LEGACY_STRINGS` detection tuple in `app/api/claude_md.py` and the hook
upgrade markers in `cli/lib/hooks.py` are intentionally left as-is —
they exist precisely to flag pre-rewrite content for re-authoring.

Strip "(folded from `da metrics list`)" / "(lifted from `da metrics
show`)" / "Replaces the old `da analyst status`" docstring noise — the
rename history is in CHANGELOG.md, not in module docstrings.
2026-05-04 21:10:43 +02:00
ZdenekSrotyr
1563b05f2e refactor(cli): hard-cutover env vars + config dir to AGNES_*
Task 0.5 of clean-analyst-bootstrap. Greenfield rewrite — no fallback,
no aliases. Existing dev environments lose their cached PAT and must
re-authenticate.

Env var renames (hard cutover):
- DA_CONFIG_DIR    -> AGNES_CONFIG_DIR
- DA_SERVER        -> AGNES_SERVER
- DA_SERVER_URL    -> AGNES_SERVER_URL  (test-only stale ref, not in spec)
- DA_NO_UPDATE_CHECK -> AGNES_NO_UPDATE_CHECK
- DA_LOCAL_DIR     -> AGNES_LOCAL_DIR
- DA_TOKEN         -> AGNES_TOKEN
- DA_STREAM_RETRIES -> AGNES_STREAM_RETRIES

Config dir rename: ~/.config/da/ -> ~/.config/agnes/ (across code,
comments, docstrings, error messages, install templates, dev scripts).

Stale `da X` references in CLI source (and adjacent app/, tests/):
swept docstrings, comments, help text, and error messages where the
verb survives the rewrite (init, pull, push, catalog, status, diagnose,
auth, admin, skills, query, schema, describe, explore, disk-info,
snapshot, login, logout, whoami, server, setup) and replaced `da X`
with `agnes X`. Intentionally kept `da sync`, `da fetch`, `da analyst`,
`da metrics` — those verbs are removed in later tasks; the legacy
strings will be detected by `_LEGACY_STRINGS` (added in Task 2).

Test fixes:
- TestCLIVersion now asserts output starts with `agnes ` (was `da `).

Test results: 2675 passed, 25 skipped (full pytest run, excluding 9
pre-existing test_db.py / test_user_management.py / test_e2e_extract.py
/ test_cli_binary_rename.py failures unrelated to this rename).
2026-05-04 16:35:44 +02:00
ZdenekSrotyr
7f743d0392 fix(cli): #168 review iter 6 — render empty-string diagnostics
Devin Review iter #6 found 2 issues.

🟡 BUG: cli/error_render.py filtered out empty-string values via
`detail[key] not in (None, "")` and `value not in (None, "")` before
they could reach `_kv_line`. But `_kv_line` was specifically designed
to render empty strings as `(empty)` — the filter shadowed that
branch. The hidden field happens to be the most operator-actionable
one in `cross_project_forbidden`: `billing_project: ""` is the exact
diagnostic confirming WHY USER_PROJECT_DENIED fires.

Change filter to `is not None`. Empty strings now flow through
`_kv_line` and render as `billing_project: (empty)`.

📝 ANALYSIS: CHANGELOG wording for the test-connection endpoint said
"the saved data_source.bigquery config", which Devin flagged as
slightly misleading because `get_bq_access` is `@functools.cache`d —
"Test connection" tests the config in the running process, not the
just-saved YAML overlay. The save flow already returns
`restart_required: True` and the UI shows a banner, so the behavior
is documented; only the CHANGELOG wording was loose. Tightened to
"the **process-cached** BqAccess... Tests the config active in the
running process — after a save the response includes restart_required;
click Test AFTER restart to validate the freshly-saved values."

New test: test_renders_empty_string_as_empty_marker locks in the
empty-string-as-(empty) rendering for the cross_project_forbidden
case so a future filter change won't silently drop the diagnostic
again. 9 affected render tests pass.
2026-05-04 14:30:43 +02:00
ZdenekSrotyr
a43533ca44 fix(cli): #168 review iter 4 — render reason when both kind+reason set
Devin Review iter #4 caught: `_format_dict` in cli/error_render.py
seeded `seen = {"kind", "reason"}` to keep both out of the kv block.
But the label line uses only ONE of them (`kind or reason or "error"`),
so the other was silently dropped.

Quota rejections at app/api/query.py:423 (daily-budget) and 488
(concurrent-slot) emit BOTH keys: `{reason: "daily_byte_cap_exceeded",
kind: "daily_bytes", ...}` and `{reason: "concurrent_slot_exceeded",
kind: "concurrent_scans", ...}`. Operator only saw `kind` in the label
and never the more specific `reason` value.

Fix: track which key actually went into the label and skip only that
one. The other appears in the kv section.

Verified output:
  Error: daily_bytes (HTTP 429)
    reason: daily_byte_cap_exceeded
    current: 99999
    ...

8 affected render tests pass.
2026-05-04 14:04:16 +02:00
ZdenekSrotyr
57482be263 feat(cli): #160 shared structured error renderer for BQ-typed responses
The reporter (#160) saw `USER_PROJECT_DENIED` raw in the CLI because
all three CLI error-rendering paths flatten typed BqAccessError /
guardrail / RBAC dicts to a truncated single-line string, hiding the
structured `hint` field that explains how to fix the misconfig.

Fix: shared `cli/error_render.py:render_error(status_code, body)` that
recognizes the canonical typed shapes and pretty-prints them. Falls
back to truncated-and-flattened form for unrecognized bodies, so the
renderer never makes worse-than-status-quo output.

Recognized shapes:
- {detail: {kind: ..., hint?, billing_project?, data_project?}}
  — typed BqAccessError responses from /api/v2/scan, /sample, /schema,
  /api/query (when /api/query escalates a BQ failure)
- {detail: {reason: 'remote_scan_too_large', scan_bytes, limit_bytes,
  tables, suggestion}} — new /api/query cost-guardrail rejection
- {detail: {reason: 'bq_path_not_registered'/'bq_path_access_denied',
  path, hint?, registered_as?}} — new /api/query RBAC patch
- {detail: '...'} — string detail (legacy endpoints)

Wired through 3 CLI paths:
- cli/v2_client.py: V2ClientError.__str__ delegates to render_error;
  pre-truncation removed from V2ClientError.message (was hiding hints
  past 200 chars).
- cli/commands/query.py:_query_remote: parse JSON body, call renderer
  on error.
- cli/commands/query.py:_query_hybrid: catch RemoteQueryError, build
  synthetic `{detail: {kind: error_type, **details}}` payload, render.

tests/test_cli_query.py:test_remote_query_failure: assertion updated
from `"Query failed"` (no longer printed) to `HTTP 400` + `bad SQL`
(what the renderer surfaces for string detail).

Sample output for cross_project_forbidden:

  Error: cross_project_forbidden (HTTP 502)
    billing_project: (empty)
    data_project: prj-example-data-001
    message: USER_PROJECT_DENIED on bigquery.googleapis.com
    hint: Set data_source.bigquery.billing_project in
        /admin/server-config to a project where the SA has
        serviceusage.services.use, or grant the SA that role on the
        data project.

19 tests pass — 10 from T4a now GREEN + 3 prior cli_query tests still
green + 6 ancillary.
2026-05-04 10:31:35 +02:00