agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	f0d091f721	fix(store): scratch dir leak on ZIP validation failure (Devin Review) create_entity + update_entity created the `scratch` temp dir inside one try/finally but cleaned it up in a separate one. Validation HTTPExceptions raised by _safe_zip_extract (zip_unsafe_path, zip_too_large_uncompressed) or the BadZipFile→422 conversion exited the first scope, and the second finally was never entered → temp dir leaked on every failed upload. Devin flagged this on the F2 commit. The leak pre-existed (zip_unsafe_path was the original vector); F2 added zip_too_large_uncompressed to the same broken cleanup path. Fixed by collapsing scratch creation + cleanup into one outer try/finally that covers both extraction AND metadata/bake; the inner try/except/finally still handles BadZipFile→422 + tmp file cleanup. Same restructure in update_entity. Regression test `test_scratch_dir_cleaned_up_after_failed_extraction` triggers a zip_unsafe_path 422 and asserts tmp/agnes_store_* contains no leaked dirs.	2026-05-05 08:52:15 +02:00
ZdenekSrotyr	78cad8b235	release: 0.35.0 — /store + /my-ai-stack + security fixes + CLI	2026-05-05 08:18:16 +02:00
ZdenekSrotyr	16373d6b0b	feat(cli): agnes store + agnes my-stack commands Adds CLI coverage for the new REST surface introduced in this PR: agnes store list / show / install / uninstall / upload / delete agnes my-stack show / toggle Covers 11 of the 15 new endpoints — listing, detail, install/uninstall, upload (multipart), delete, my-stack get + curated toggle. Photo / docs download endpoints intentionally skipped; analyst-side automation rarely needs raw bytes back, and the web UI already covers them. cli/v2_client.py: api_post_multipart + api_put_multipart helpers (httpx files= passthrough). api_delete + api_put_json fillers were already needed for non-multipart writes; added together. Tests: tests/test_cli_store.py — help-text smoke tests + happy-path mocked tests for list, install, upload, my-stack show, my-stack toggle. 12 new tests, all green.	2026-05-05 08:18:12 +02:00
ZdenekSrotyr	fd3c76d21b	fix(store): security + correctness blockers found in PR review (F1, F2, F4, F5) Three independent reviews of PR #180 surfaced four real defects in the new Store / my-ai-stack surface. CHANGELOG entries detail each; one-liners: - F1 video_url XSS: any authenticated user could upload a Store entity with `video_url=javascript:...` and pop XSS in any viewer's session via the `<a href=...>` "Watch video" link in store_detail.html. Jinja2 autoescape doesn't block URI schemes inside attribute values. Fixed by scheme-validating to http(s) only on create + update; 400 invalid_video_url. - F2 ZIP decompression bomb: _safe_zip_extract checked path-traversal but not declared file_size totals — a 50 MB compressed upload at 1:1000 ratio decompresses to 50 GB and can DoS the host disk. Fixed by summing zinfo.file_size across infolist() and refusing > 200 MB before extractall touches disk. 413 zip_too_large_uncompressed. - F4 admin authz parity: PUT /api/store/entities/{id} was owner-only while DELETE allowed owner OR admin; the store-detail page hid Edit/Delete buttons from admin even though DELETE was permitted. Fixed by allowing admin on PUT and passing is_admin to the template; gate is now is_owner OR is_admin everywhere. - F5 cross-owner suffix collision: sanitize_username is many-to-one (alice.smith / alice_smith both → alice-smith). Two such users uploading entities with the same display name produced identical `<name>-by-<username>` suffixes, silently colliding in the served agnes-store-bundle on-disk paths AND the manifest catalog (Claude Code dedupes by plugin.json `name`). Fixed by enforcing global uniqueness on the suffixed value at create_entity; 409 conflict_global_suffix. F3 (ZIP symlink members) was investigated and confirmed to be a false-positive — Python's stdlib ZipFile.extractall does not honor symlink mode bits, so no exploit exists. 9 new regression tests in tests/test_store_api.py::TestStoreSecurityFixes covering all four. Test run locally: 60/60 store-related tests pass.	2026-05-05 08:18:02 +02:00
ZdenekSrotyr	e86dd5edc5	fix(anthropic): strict json_schema (additionalProperties=false) + add /admin/scheduler-runs UI E2E test on a real BQ deploy showed every verification-extraction call fails with HTTP 400 invalid_request_error: "output_config.format.schema: For 'object' type, 'additionalProperties' must be explicitly set to false". The Anthropic structured-output API now requires the field on every object node in the json_schema. Fix: connectors/llm/anthropic_provider.py wraps the caller-supplied schema through a recursive _strict_json_schema() walker that adds the field where missing (preserving any explicit override), then passes the strict variant to the API. Six unit tests in TestStrictJsonSchema pin the recursion across nested objects, array items, and the no-mutation invariant. Adds /admin/scheduler-runs — a read-only admin page that surfaces the last 200 audit-log entries from scheduler-driven actions. New AuditRepository.query_actions(actions, limit) helper, new admin nav entry. Failed scheduler ticks (HTTP 401, network errors) don't reach the audit_log; the page calls that out with a hint to set SCHEDULER_API_TOKEN if no rows show up.	2026-05-05 08:00:57 +02:00
ZdenekSrotyr	9f9aabd72b	fix(corporate-memory): CLI catches fail-fast ValueError, exits 1 with clean message (Devin Review on #179) The PR's #176 fail-fast change made collect_all() raise ValueError when neither an ai: block nor ANTHROPIC_API_KEY/LLM_API_KEY was available. verification_detector's CLI was updated to handle it; corporate_memory's CLI was missed and crashed with an unhandled traceback. services/corporate_memory/collector.py:main() now wraps the collect_all call in try/except ValueError, prints a one-line actionable message to stderr, and returns rc=1. Regression test: test_llm_connector.py::TestCorporateMemoryCollector::test_main_returns_1_on_no_ai_config_instead_of_traceback.	2026-05-05 06:45:10 +02:00
ZdenekSrotyr	e68c2d3f0f	fix(session-collector): argv-free run() helper, drop SystemExit footgun (Devin Review on #179) run_session_collector called collector.main() which did argparse.parse_args() on uvicorn's sys.argv (['app.main:app', '--host', ...]) → sys.exit(2) → SystemExit(2), which inherits from BaseException, escapes FastAPI handlers, and propagates through the thread pool. Every scheduler tick that fired the endpoint either 500-ed or risked killing the uvicorn worker. services/session_collector/collector.py now exposes run(dry_run, verbose) that returns (rc, stats); main() is a thin CLI shim that parses argv and delegates. The admin endpoint calls run() directly and audit-logs the per-run stats (users_processed, files_copied, files_skipped) instead of just the rc. Three regression tests in TestRunHelper. Closes Devin Review finding on app/api/admin.py:2819 (#179).	2026-05-05 06:31:55 +02:00
ZdenekSrotyr	046d8705ee	docs(changelog): correct "two paths" claim + document new env vars The 0.35.0 entry's 'two paths to a working LLM pipeline' wording was defensible only after the #179 review fixes — on the initial cut, the seeded-overlay path was dead code (consumers imported the static-only loader; even when they didn't, env refs in the overlay weren't resolved). Updated Defect 5's bullet to spell out what was broken and what shipped, and added a new bullet for the scheduler-cadence env-var fix. Added the two new test modules under Internal.	2026-05-05 06:05:27 +02:00
ZdenekSrotyr	fa3a76a528	fix(scheduler): single env var drives cadence + grace (#179 review) Devin NOTABLE: SCHEDULER_VERIFICATION_DETECTOR_INTERVAL was already read by app/api/health.py to compute the staleness grace window, but the actual scheduler cadence was hardcoded to 'every 15m'. The env var name implied it controlled the cadence — it didn't. An operator throttling the detector via the env was silently ignored by the scheduler while the health grace silently widened. Wired the env var into both ends. Same pattern applied to the other two LLM-pipeline jobs: - SCHEDULER_SESSION_COLLECTOR_INTERVAL (default 600s = 10m) - SCHEDULER_VERIFICATION_DETECTOR_INTERVAL (default 900s = 15m) - SCHEDULER_CORPORATE_MEMORY_INTERVAL (default 1020s = 17m) Defaults preserve the existing 10m / 15m / 17m coprime offset so the three jobs don't fire on the same tick. build_jobs() now reads all three through _read_positive_int (matching the existing pattern for data-refresh / health-check / script-runner) and feeds them to _seconds_to_schedule. The smallest-interval check includes the new variables so an operator can't accidentally set a tick larger than any LLM cadence. New tests in tests/test_scheduler.py: - TestLLMPipelineCadenceEnvVars: env override changes the schedule string at scheduler-init time, with parametrized invalid-value rejection. - TestVerificationDetectorGraceFollowsCadence: pinning the single-source-of-truth contract — same env var moves both the scheduler cadence and the health-check grace.	2026-05-05 05:59:18 +02:00
ZdenekSrotyr	9f33e24bf9	fix(config): overlay-aware LLM consumers + env-ref resolution (#179 review) Devin BUG: /api/admin/configure seeds an ai: block to the writable overlay at DATA_DIR/state/instance.yaml, but the three LLM consumers imported from config.loader.load_instance_config — which reads the static config dir only. Even if they had read the overlay, the loader ran yaml.safe_load directly without passing through _resolve_env_refs, so '${ANTHROPIC_API_KEY}' would have stayed a literal placeholder. The pipeline appeared to work because the factory falls back to the env var directly, but the overlay path itself was dead code. Two fixes, both required: 1. Switched the three LLM consumers to app.instance_config.load_instance_config: - services/corporate_memory/collector.py:collect_all - services/verification_detector/__main__.py:main - app/api/admin.py:run_verification_detector 2. app/instance_config.py runs the loaded overlay through config.loader._resolve_env_refs before the deep-merge, so '${ANTHROPIC_API_KEY}' resolves at config-load time. New regression suite tests/test_instance_config_overlay.py pins: - env-ref resolution against the overlay (resolved when env set, empty when env missing — never the literal placeholder) - deep-merge still preserves static-only sections - the three consumers reach app.instance_config (inspected via inspect.getsource so a future refactor that reverts the import fails the test) - end-to-end: a seeded overlay + ANTHROPIC_API_KEY env reaches the factory with a resolved api_key	2026-05-05 05:57:22 +02:00
ZdenekSrotyr	98a8aba3be	fix(tests): align test_llm_connector with new factory + fail-fast (#179 review) The PR rewrote collect_all() to call the new create_extractor_from_env_or_config() helper, but the existing tests still mocked the old direct create_extractor() symbol and the old silent-skip-on-missing-config behavior. Five tests in TestCorporateMemoryCollector and one in TestCollectorExtractorIntegration were red on the PR branch. Changes: - Tests now mock connectors.llm.create_extractor_from_env_or_config (the symbol the collector imports lazily). - Renamed test_collect_all_no_ai_config_skips -> test_collect_all_no_ai_config_or_env_raises and test_collector_handles_invalid_config -> test_collector_raises_on_invalid_config. Both assert pytest.raises(ValueError) — the explicit fail-fast semantics defect 5 of #176 was supposed to enforce. - collect_all() no longer swallows the factory's ValueError into stats["errors"]; it propagates so the scheduler / admin endpoint surface the actionable misconfiguration message instead of pretending the run was a no-op. - /api/admin/run-corporate-memory translates the propagated ValueError into a 500 with the factory's message, matching /api/admin/run-verification-detector.	2026-05-05 05:55:01 +02:00
Minas Arustamyan	537ea7662b	chore(store): genericize email examples in docstring + test Per CLAUDE.md vendor-agnostic OSS guidance — replace the real groupon.com email used as a sanitize_username() example with a placeholder (alice_smith@example.com).	2026-05-05 05:48:32 +02:00
Minas Arustamyan	5372d65b26	fix(setup): install list reflects opt-outs + Store bundle `compute_default_agent_prompt` (which renders the install commands in the setup prompt's marketplace block) was calling `resolve_allowed_plugins` — the admin-only feed that predates the v25 Store/opt-out layer. Result: a user with 2 opted-out curated plugins + 2 Store skills saw the original 4 admin grants in the install list (including the opted-out ones, with cross-marketplace duplicates), and no `agnes-store-bundle` install line for the skills. Now we call `resolve_user_marketplace` — the same resolver that `/marketplace.zip` + `/marketplace.git/` serve from. The install commands now match the served catalog exactly: admin grants minus the user's opt-outs, plus the `agnes-store-bundle` synth plugin (which wraps every installed Store skill + agent into one plugin entry) and any standalone Store plugin uploads. Dedup by `manifest_name` because two upstream marketplaces shipping a plugin with the same name collide in the synth marketplace.json by design (CLAUDE.md "Same-named plugins ... collide in the catalog by design"). A duplicate `claude plugin install <name>@agnes` would be a no-op anyway, so it's just visual noise to keep emitting both.	2026-05-05 05:17:05 +02:00
Minas Arustamyan	af72c5d259	fix(setup): walk TLS chain for trust-store match — Let's Encrypt cleanup `_read_agnes_ca_pem()` decides whether the served fullchain.pem needs trust-bootstrapping in the rendered setup prompt. Pre-fix it only checked the leaf's immediate issuer against `certifi`'s trust store. For Let's Encrypt that's the intermediate (R13), which `certifi` does not ship — only roots are in trust stores. So a publicly-trusted LE chain still tripped the "needs bootstrap" path and the setup prompt emitted a step-0 TLS trust block + clone-fallback marketplace block that no client actually needs (Bun-compiled `claude.exe`, system git, Python via certifi all validate the chain through the bundled ISRG Root X1). Now we walk every cert in the fullchain (leaf + intermediates) and return None the first time any cert's issuer is in the certifi trust store — that captures the standard "leaf signed by intermediate signed by publicly-trusted root" shape. Trusted subjects are read once into a set for O(1) lookup. Self-signed (leaf.issuer == leaf.subject) and private-CA chains (no chain link's issuer in certifi) keep their previous "return PEM" behavior, so deployments that genuinely need the bootstrap still get it. Validated end-to-end against the live VM at agnes-marustamyan.groupondev.com (LE R13 → ISRG Root X1): - Let's Encrypt fullchain → has_ca=False (was True) - Self-signed cert → has_ca=True - Corporate-CA chain (private root) → has_ca=True - Missing fullchain.pem → has_ca=False	2026-05-05 04:55:06 +02:00
Minas Arustamyan	9d53efc6e1	fix(schema-v25): drop FK refs from store tables Past migration finalize steps RENAME / DROP COLUMN / ALTER on the `users` table (e.g. _v12_to_v13_finalize, _v13_to_v14_finalize, _v17_to_v18_finalize, the v5 backfill). DuckDB rejects an ALTER on a table that any other table references via FOREIGN KEY, so the new store_entities / user_store_installs / user_plugin_optouts entries — which the self-heal pass writes to _SYSTEM_SCHEMA before the migration ladder runs — broke 6 legacy-migration tests with: Cannot alter entry "users" because there are entries that depend on it Pre-existing convention (see personal_access_tokens at v6) is to omit FK constraints to `users` and validate user existence at the app layer. Sync the three v25 tables with that convention. Same edit in both _SYSTEM_SCHEMA and _V24_TO_V25_MIGRATIONS so fresh installs and upgraded installs land in the same shape. App-level cascade behavior is unchanged: store entity DELETE explicitly deletes user_store_installs rows in app/api/store.py, and the admin grant-deletion hook explicitly deletes user_plugin_optouts rows for the plugin. The dropped FK constraints were defense-in-depth, not the only guard.	2026-05-05 03:15:09 +02:00
Minas Arustamyan	d5a7c9ad79	feat(store): /store + /my-ai-stack — community marketplace + per-user composition Adds a community-driven Store where any authenticated user uploads skills/agents/plugins as ZIPs, plus /my-ai-stack as the per-user composition view. The served Claude Code marketplace is now: (admin_granted ∖ opt_outs) ∪ store_installs Skill + agent installs are merged into a single `agnes-store-bundle` plugin in the served marketplace; type=plugin uploads stay standalone. Names are suffixed with `-by-<owner-username>` at upload time so two owners can use the same display name without colliding in Claude Code's flat skill/agent namespace. Schema v23 → v24 adds three tables: - store_entities — community-uploaded skills/agents/plugins - user_store_installs — what each user has chosen to install - user_plugin_optouts — opt-out overlay on top of admin grants Admin grant-delete drops every user's opt-out for that plugin so re-grant resets cleanly to enabled (no sticky personal preference). UI: - /store — e-commerce-style listing with type/category/owner filters, search, pagination, owner-aware [Install] buttons, clickable cards - /store/new — 2-step upload wizard with drag & drop, preview validation (POST /api/store/entities/preview), docs multi-upload, photo + video URL - /store/{id} — detail page with hero, file list, docs, owner actions (Edit/Delete) for the uploader - /my-ai-stack — Granted plugins (toggle opt-out) + From the Store (uninstall) sections - Admin nav: Marketplaces moved into Admin dropdown, renamed to "Curated Marketplaces" Validation hardening: type-mismatch guards reject skill ZIP uploaded as agent (or vice versa), and plugin ZIPs masquerading as skills/agents. Human-readable error messages mapped client-side from machine codes. Cross-source naming: Store entity-id-prefixed dirs (`plugins/store-<id>/`) plus the bundle (`plugins/store-bundle/`) avoid collisions with admin marketplaces (whose `store` slug is reserved by `is_valid_slug`). Bundle composition is content-hashed at serve time — install/uninstall or owner re-upload bumps the bundle's plugin.json `version`, so Claude Code's auto-update toggle picks up changes. Tests: 50+ new tests across naming, repositories, filter (admin ∪ store ∪ bundle), API (upload/install/uninstall/delete/preview/docs), end-to-end marketplace.zip with bundle merging.	2026-05-05 02:53:49 +02:00
ZdenekSrotyr	567385d046	release: 0.35.0 — session pipeline fix (BREAKING) (#176) Five compounding defects on default `docker compose up` deploys made the session pipeline silently broken: sessions uploaded by analysts via `agnes push` landed on /data/user_sessions/<user>/.jsonl but nothing ever processed them. Fix is one PR: promote anthropic + openai to core deps, wire all three LLM-pipeline jobs into scheduler-v2 with offset cadences (10m/15m/17m), drop the side-car services from compose, seed a default ai: block on first-time setup with an env-var fallback in code, surface the pending review queue to admins, and expose a health check that warns when uploaded jsonls aren't being processed. BREAKING for operators on COMPOSE_PROFILES=full or with custom Compose overrides referencing the corporate-memory or session-collector service stanzas — drop them. The scheduler is now the sole driver.	2026-05-05 00:46:27 +02:00
ZdenekSrotyr	a621a415cc	fix(health): session-pipeline staleness check (#176) GET /api/health/detailed now returns a session_pipeline service entry. Heuristic: max(mtime of /data/user_sessions/*/.jsonl) <= max(processed_at in session_extraction_state) + grace_seconds grace_seconds = 2 × verification-detector cadence (default 30 min; configurable via SCHEDULER_VERIFICATION_DETECTOR_INTERVAL). When the assert fails, status='warning' (never 'error') with an actionable detail pointing at the verification-detector scheduler job. A warning bubbles up to the existing overall='degraded' aggregation — operators querying /api/health/detailed (or /agnes diagnose system) get a clear breadcrumb instead of a silently-broken pipeline. Cold-start case (no session files, or files newer than the grace window with empty state table) is handled explicitly to avoid noise on a fresh deploy. Tests: tests/test_health_session_pipeline.py.	2026-05-05 00:04:28 +02:00
ZdenekSrotyr	c53c1e1572	fix(ui): admin pending-review banner on /corporate-memory (#176) The /corporate-memory page filters status IN ('approved','mandatory') and showed no hint that pending items exist. With approval_mode set to 'review_queue' (the default in instance.yaml.example), every collection run would silently funnel new items into the pending bucket where no operator ever saw them. For admins (is_km_admin), the page now renders a banner above the stats bar: N pending items awaiting review — review them at /corporate-memory/admin Non-admins see no change (the route zeroes the count server-side before passing to the template, so the hint is never leaked). Tests: tests/test_corporate_memory_page.py.	2026-05-05 00:01:22 +02:00
ZdenekSrotyr	c3df03beb3	fix(compose): drop corporate-memory + session-collector services (#176) BREAKING for operators using `COMPOSE_PROFILES=full` or custom Compose overrides that referenced these stanzas — they're gone in docker-compose.yml and docker-compose.prod.yml. The scheduler-v2 model (previous commit) is now the sole driver: every cadence is a job in services/scheduler/__main__.py:JOBS hitting an admin HTTP endpoint. Why drop instead of keep behind `profiles: [full]`: - The previous stanzas were tight `restart: unless-stopped` boot loops. When the scheduled run ended (every cycle), Docker re-spawned the container, defeating any cadence the service intended. - The whole point of #176 is that there's now exactly one driver. Two drivers (scheduler HTTP + standalone container loop) would race on the same /data/user_sessions and knowledge_items writes. - Removing the stanzas is a louder signal than commenting them out — operators upgrading get a clean failure mode (no stale containers), not a silently double-driven pipeline. The Python entry points (services/{corporate_memory, session_collector, verification_detector}/__main__.py) stay — they're still callable from the CLI for manual one-shot runs and from the new admin endpoints. docs/architecture.md updated to reflect the new schedule table. tests/test_docker_compose.py pins the contract: the two services must not reappear under either Compose file.	2026-05-04 23:59:44 +02:00
ZdenekSrotyr	45de71e8ab	fix(scheduler): wire LLM pipeline into scheduler-v2 (#176) The session-collector, verification-detector, and corporate-memory services now run on the same scheduler-v2 model that already drives data-refresh, health-check, script-runner, and marketplaces: - New admin endpoints in app/api/admin.py: POST /api/admin/run-session-collector POST /api/admin/run-verification-detector POST /api/admin/run-corporate-memory All admin-gated, sync-def (FastAPI thread pool), with one audit row per invocation. Same single-writer-of-system.duckdb pattern as the existing /api/marketplaces/sync-all job. - services/scheduler/__main__.py JOBS gains three entries with offset cadences (10m / 15m / 17m, all coprime modulo the 30s tick) so the three LLM-backed jobs don't fire on the same tick and stack their API + DB load. - The verification-detector endpoint surfaces the LLM factory's fail-fast ValueError as HTTP 500 with the actionable message, preserving the no-silent-skip contract from the previous commit. Tests: - tests/test_admin_run_endpoints.py covers admin gating + scheduler registration + endpoint contract. - tests/test_scheduler_sidecar.py existing tests continue to pass.	2026-05-04 23:57:43 +02:00
ZdenekSrotyr	bbb04ac041	fix(setup): seed default ai: block + env-var fallback (#176) POST /api/admin/configure now writes a default ai: block into the instance.yaml overlay when the request leaves it untouched and either ANTHROPIC_API_KEY or LLM_API_KEY is set in the environment. The block references the env var via ${VAR} syntax — secrets never land in YAML. connectors.llm.factory grows create_extractor_from_env_or_config which falls back to ANTHROPIC_API_KEY / LLM_API_KEY when ai_config is empty and raises a clear ValueError when neither is available. Both services/corporate_memory and services/verification_detector switch to the new helper, replacing the old 'silently skip when ai: missing' path that was the silent-failure root cause. Tests: - tests/test_setup_ai_block.py — overlay seeding contract. - tests/test_llm_provider_env_fallback.py — fallback + fail-fast.	2026-05-04 23:55:19 +02:00
ZdenekSrotyr	d2104555c6	fix(deps): promote anthropic + openai to core dependencies (#176) LLM provider SDKs are imported by services/corporate_memory and services/verification_detector — both production code paths. Listing them only in [project.optional-dependencies].dev caused the scheduler container to boot-loop with ModuleNotFoundError on default `docker compose up` deploys, because the Dockerfile installs core deps only (`uv pip install --system --no-cache .`). Adds tests/test_packaging.py to lock the contract: anthropic + openai must live in [project].dependencies, not in dev extras.	2026-05-04 23:52:30 +02:00
ZdenekSrotyr	c2b1ca076d	Merge pull request #173 from keboola/zs/clean-analyst-bootstrap-spec Clean analyst bootstrap: web → paste → done UX (BREAKING)	2026-05-04 23:40:19 +02:00
ZdenekSrotyr	4c7ce9ce32	Update cli/commands/init.py Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-04 23:25:06 +02:00
ZdenekSrotyr	0430c0de00	release: 0.34.0 — clean analyst bootstrap (BREAKING) + bundled fixes Headlines: - Clean analyst bootstrap rewrite: web /setup → paste prompt → Claude Code in empty folder = working analyst workspace. CLI binary renamed da → agnes. See CHANGELOG ## [0.34.0] for the full breaking-change matrix. - Unified /setup flow: collapsed the admin/analyst tile split (the ?role= query parameter introduced mid-cycle is gone). Every signed-in user sees the same flow; marketplace + plugins block emitted iff caller has plugin grants. PAT scope uniform (general 90 d). - Bundled fixes: supersedes #172 (Windows console encoding), merges #174 (BigQuery materialize view fix + concurrency, schema v24 migration), closes #171 (--remote query pre-check no longer over-rejects narrow queries on partitioned tables, ~30,000x over-estimate fix). - Devin Review findings addressed throughout the cycle: query.py:464 (rewriter cross-contamination), extractor.py:166 (TTL reclaim dead code), db.py:1757 (v24 migration retry path), init.py:99 (stale on-disk token override), and more. - Operator UX: register-table now requires --bucket for materialized rows + emits first-sync and grant hints on success. agnes status sessions counter reads from ~/.claude/projects/<encoded-cwd>/. agnes init --token now wins over stale ~/.config/agnes/token.json. Open follow-ups (separate issues): - #175 sync architecture redesign (full-extract Keboola, full-file downloads, user-global sync_state) - #177 admin CLI: missing unregister-table / update-table commands - #178 agnes diagnose: introduce "info" severity tier	2026-05-04 23:13:23 +02:00
ZdenekSrotyr	0612c1e1a1	fix(schema-v24): raise on deferred migration so retry path actually runs (Devin Review on db.py:1757) Pre-fix: when v24 migration found rows to migrate but data_source.bigquery.project was empty, it logged a warning per row and returned normally. Schema_version then bumped to 24 unconditionally → next start's 'if current < 24:' gate skipped _v23_to_v24_finalize forever, leaving rows in DuckDB-flavor SQL that the new _wrap_admin_sql_for_jobs_api wrapping path rejects. Devin escalated this from advisory ("idempotent retry") to critical on rescan after my reply. The reply was wrong — the LIKE filter inside the function gives idempotency IF the function is called again, but the schema-version gate prevents that call from happening. Fix (Devin's recommended Approach 1): raise RuntimeError BEFORE the schema-version bump when rows need migration but project_id is empty. The schema_version stays at 23, so on next start the 'if current < 24:' gate fires and the migration runs again — this time with project_id configured. Side effect: a BQ-using deployment that hasn't set the project blocks startup until they do. That's the right call for a config error that would otherwise silently break all materialized tables. The error message points at the right knob (data_source.bigquery.project + restart). No-rows-no-block invariant preserved: the early 'if not rows: return' at the top of _v23_to_v24_finalize means non-BQ deployments are unaffected. Tests: - test_v24_raises_when_project_not_configured_and_rows_need_migration: asserts raise + schema_version stays at 23 (the load-bearing invariant for retry-on-next-start to work) - test_v24_skips_clean_when_no_rows_match_even_without_project: asserts non-BQ deployments don't block startup - Existing 3 tests still pass	2026-05-04 23:11:34 +02:00
ZdenekSrotyr	36012e0833	fix(admin): register-table real-world UX gaps for materialized BQ Three items from operator feedback after running the actual flow: (1) Help docstring lied: "--bucket / --source-table ignored" for materialized rows. Reality: --bucket is load-bearing because `agnes schema <name>` builds the BQ identifier as `bq.<bucket>.<source_table>`. An empty bucket registered the row but broke schema/describe with HTTP 400 "unsafe BQ identifier in registry". Fix: docstring rewritten to reflect reality, plus client-side validation rejects materialized + empty bucket with a clear error pointing at the right knob. (2) Post-register UX cliff: `agnes pull` after register-table reports "Updated 0 tables (1 total)" because registration adds a registry row but does NOT trigger a parquet build. Operators routinely assume something's broken when they need to run `agnes setup first-sync` to kick off the materialization. Hint emitted on success now points at first-sync. (3) RBAC gotcha: `agnes catalog` is RBAC-filtered via `resource_grants`, so non-admin users don't see freshly-registered rows until a grant is created. Hint emitted on success now points at `agnes admin grant create <group> table <name>`. Tests: 8/8 in test_cli_admin_materialized.py, including two new regression tests for the validation + the hint output.	2026-05-04 23:06:17 +02:00
ZdenekSrotyr	5915f92eaa	fix(query-guardrail): single-pass alternation regex (Devin Review on query.py:464) The iterative bare-name rewriter (one re.sub per name, longest-first) was vulnerable to cross-contamination when the GCP project ID contained a registered table name as a hyphen-delimited word. Concrete repro: project = 'my-ue-project' registered = ['orders', 'ue'] user SQL = 'SELECT * FROM orders JOIN ue ON ...' iter 1 (orders): produces 'FROM `my-ue-project.fin.orders` JOIN ue ...' iter 2 (ue): '\bue\b' matches 'ue' INSIDE 'my-ue-project' (hyphen creates word boundary on both sides) — corrupts the iter-1 path Fallback at query.py:576 caught the resulting BQ parse error and fell back to per-table SELECT * estimate, so impact was over-estimation, not fail-open — but the #171 partition-pruning fix silently degraded to pre-fix behavior whenever a project name shared a hyphen-segment with a registered table. Fix: single re.sub call with an alternation regex sorted longest-first. Single-pass means each source position is processed exactly once, so freshly-inserted backticked text from one match isn't re-scanned by later names in the alternation. Regression test test_rewrite_helper_does_not_corrupt_when_project_id_contains_registered_name covers the exact Devin repro.	2026-05-04 22:51:33 +02:00
ZdenekSrotyr	c432e90f62	fix(bq-materialize): TTL reclaim was dead code (Devin Review on extractor.py:166) `_try_acquire_file_lock` opened the lock file with `open(mode='w')` BEFORE the mtime check, which truncated the file and refreshed mtime to now. The subsequent age check always saw ~0, so the TTL reclaim branch was never reachable and `materialize.lock_ttl_seconds` was a silently no-op config knob. Repro: before open(w): mtime age = 100000s after open(w): mtime age = 0s Fix: stat the lock path BEFORE any open(). If pre-probe mtime is older than TTL, unlink (forcing a fresh inode for the open + flock that follows). Order is now stat-then-decide-then-probe, not probe-then-stat-then-decide. Two regression tests added in tests/test_bq_materialize_concurrency.py: - test_stale_held_lock_is_reclaimed_despite_live_holder — exercises the full reclaim path with a still-living fcntl holder. Pre-fix this returned None (in_flight forever); post-fix returns a holder fd on a new inode. - test_failed_probe_does_not_self_refresh_lock_mtime — sister test pins that a failed acquisition's mode='w' truncate doesn't pathologically loop. Residual cross-process risk (genuinely overrunning materialize past TTL races a fresh attempt — both write to the same parquet.tmp, inode-level flock independence means new acquisition succeeds while old holder is still alive) stays documented in the helper docstring. In-process threading.Lock keyed on table_id blocks the single-process race; cross-process protection relies on TTL being well above longest plausible COPY (24h default).	2026-05-04 22:36:56 +02:00
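The stat-then-decide-then-probe order from c432e90f62 can be sketched as below. This is an illustration under assumptions, not the code in extractor.py: the function name and signature are made up, and it is POSIX-only because it relies on `fcntl`.

```python
import fcntl
import os
import time


def try_acquire_file_lock(lock_path, ttl_seconds):
    """Return an open file holding the lock, or None if it is held."""
    # 1. stat BEFORE any open(): open(mode="w") truncates the file and
    #    refreshes mtime, which would hide how old the lock really is.
    try:
        age = time.time() - os.stat(lock_path).st_mtime
    except FileNotFoundError:
        age = None

    # 2. decide: a lock older than the TTL is unlinked, so the open()
    #    below creates a fresh inode.
    if age is not None and age > ttl_seconds:
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass

    # 3. probe: non-blocking exclusive flock on whatever inode exists now.
    handle = open(lock_path, "w")
    try:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle
```

Because flock is per-inode, a reclaim succeeds even while the old holder is alive, which is the residual cross-process race the commit message leaves documented.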
ZdenekSrotyr	bc9dd5c5f0	test(setup-instructions): pin no-legacy-da-verbs invariant Adds `test_unified_flow_uses_only_agnes_verbs` that asserts no `da ` substring (with trailing space, to dodge false positives on `Darwin` / `database` / `adapter`) appears in any of the four `resolve_lines()` shapes: - bare (no plugins, no ca) - plugins only - ca only - plugins + ca Also pins the `agnes init --server-url … --token …` shape — commit 8784f10a's stale-on-disk-token fix relies on `init` receiving an explicit `--token` argument; if a future refactor drops the flag from the emitted command the test fails loudly instead of silently regressing to 401-on-stale-token in production. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 10.	2026-05-04 22:20:40 +02:00
ZdenekSrotyr	ed969f5e37	docs(changelog): unified /setup flow under Unreleased Replace the analyst-vs-admin `?role=` design summary with the unified flow we're shipping: single tile, single PAT-mint shape (general / 90 d), `agnes init` mandatory for everyone, marketplace block gated by `resource_grants`, pre-flight check now validates both git and claude. The intro paragraph references the 10-task unification follow-up and the `?role=` introduction-and-removal cycle so a future operator reading the diff doesn't think they missed a release. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 9.	2026-05-04 22:19:57 +02:00
ZdenekSrotyr	424ec9b0f4	refactor(install.html): single tile, single PAT-mint body shape Drops the `<nav class="role-tiles">` block (Analyst / Admin tiles), the `_show_admin_tile` flag, the `const ROLE = {{ role | tojson }};` JS line, and the role-aware PAT-mint ternary. The setupNewClaude button now mints a uniform PAT for everyone: { name: defaultTokenName(), expires_in_days: 90 } …against the existing `POST /auth/tokens` endpoint. No new endpoint, no role-locked TTL clamp. The `bootstrap-analyst` 1-hour scope is no longer used from /setup (it broke the install flow anyway — saved PATs expired before the user opened Claude Code; tracked as a separate cleanup issue). Also removes the now-unused `.role-tiles` / `.role-tile` CSS rules so the stylesheet doesn't carry dead selectors. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 6.	2026-05-04 22:18:00 +02:00
ZdenekSrotyr	2ee529533f	refactor(setup-page): drop role query param The `/setup` route no longer accepts `?role=analyst|admin`. The route signature drops the `Literal[...] = Query(...)` parameter and the silent admin-downgrade block (`if role == "admin" and not is_admin: role = "analyst"`). The `role` ctx variable threaded into install.html also goes away — Task 6 cleans up the template's role-tile UI and the JS PAT-mint ternary. `?role=` is silently ignored by FastAPI for unknown query params, so existing bookmarks (none in production — the param was added in this PR and never shipped) just degrade to the unified layout. No RedirectResponse shim needed. Tests: drop the entire `tests/test_setup_page_roles.py` file (eight role-branching tests that no longer apply) and add `tests/test_setup_page_unified.py` with three tests: - `test_setup_page_renders_unified_layout` - `test_setup_page_ignores_role_query_param` - `test_setup_page_renders_marketplace_for_user_with_grants` - `test_install_legacy_path_redirects_to_setup` Also replace the role-aware `test_install_preview_*` tests in test_web_ui.py with unified-layout assertions. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 5.	2026-05-04 22:16:59 +02:00
ZdenekSrotyr	291079b1d2	refactor(welcome-template): drop role param; resolve plugins per-user unconditionally Removes the `role: Literal["analyst", "admin"] = "admin"` parameter from `compute_default_agent_prompt`. The same RBAC pass (`marketplace_filter.resolve_allowed_plugins`) now runs for every user — admin or not. Users with no `resource_grants` rows get the no-marketplace layout; users with grants get the marketplace block inserted. Admin-vs-analyst is no longer a layout branch. `render_agent_prompt_banner` no longer derives a `role` from `user.is_admin`; it just delegates to `compute_default_agent_prompt`. Two `compute_default_agent_prompt(...role=role)` call sites in `app/web/router.py::setup_page` are updated to drop the keyword so the route keeps rendering — Task 5 will remove the `?role=` query parameter and the silent admin-downgrade block from the route signature itself. Tests: drop role-aware assertions from test_welcome_template_renderer and test_welcome_template_api. Both files now assert the unified default contains `agnes init` + `uv tool install` and bans the legacy `agnes auth import-token` / `agnes auth whoami` verbs. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 4.	2026-05-04 22:13:46 +02:00
ZdenekSrotyr	74b7f6e254	feat(setup-instructions): preflight checks both git and claude Renames `_git_check_block` to `_preflight_block` and adds a `claude --version` check beside `git --version`. Both binaries are required by the marketplace step — git for the clone fallback, claude for `claude plugin marketplace add` / `claude plugin install` — so checking them together gives one clear failure instead of two confusing downstream errors. Install hints: `npm i -g @anthropic-ai/claude-code` for Linux / WSL plus a doc URL (https://docs.claude.com/claude-code) for the native macOS / Windows installers. We don't try to one-line a native installer; the canonical instructions live upstream. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 3.	2026-05-04 22:11:38 +02:00
ZdenekSrotyr	e16698c3cc	refactor(setup-instructions): unified layout with mandatory agnes init Adds `_step_numbers(*, has_marketplace, has_skills)` so step numbering lives in one place instead of being split across three branches in `resolve_lines`. Pins the unified layout in the tests: No plugins: 1 install, 2 init, 3 catalog, 4 diagnose, 5 skills, 6 confirm With plugins: 1, 2, 3, 4 preflight, 5 marketplace, 6 diagnose, 7 skills, 8 confirm `agnes auth import-token` / `agnes auth whoami` are now banned from the rendered prompt — `agnes init` subsumes them. The renamed `test_resolve_lines_no_plugins_unified_six_step_layout` asserts those strings are absent and that the new step headers (`Bootstrap your Agnes workspace`, `Verify the data is queryable`) are present. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 2.	2026-05-04 22:10:05 +02:00
ZdenekSrotyr	9334beed15	refactor(setup-instructions): drop role param; collapse analyst/admin into one layout Removes the `role: Literal["analyst", "admin"]` parameter from `resolve_lines` / `render_setup_instructions` and deletes the `_resolve_analyst_lines`, `_analyst_init_lines`, `_analyst_finale_lines` helpers. The unified flow now always emits `agnes init` (the workspace-rails delivery mechanism) in place of the legacy `agnes auth import-token` + `agnes auth whoami` pair, and uses `agnes catalog` as the smoke-verify step. `agnes init` already verifies the PAT internally, and `agnes catalog` doubles as a data-plane smoke check, so dropping `agnes auth whoami` costs no signal. Drops the now-redundant `tests/test_setup_instructions_analyst.py` and patches the one ordering test in `tests/test_setup_instructions.py` that referenced the old "Log in" / "Verify the login" headers. Also strips the `role=role` kwarg from `compute_default_agent_prompt`'s call into `resolve_lines` so the welcome-template render path keeps working; welcome_template.py's own role param is removed in a follow-up task. Plan: docs/superpowers/plans/2026-05-04-unified-setup-prompt.md task 1.	2026-05-04 22:08:48 +02:00
ZdenekSrotyr	8784f10a6b	fix(devin-review): stale-token override + status sessions counter + lock comment Three Devin Review findings on PR #173 addressed in one commit since they're in adjacent code paths: 1. cli/commands/init.py:99 (🔴): `agnes init --token NEW` ran step 2 verify against the OLD on-disk token because `get_token()` read `~/.config/agnes/token.json` before the env var, and `_override_server_env` only set the env var. So `agnes init --force` on a machine with a stale token.json failed 401 with a confusing 'token expired' even though the --token arg was valid. Fix: ContextVar-based override in `cli.config._token_override` checked by `get_token()` BEFORE the on-disk read. `_with_token_override` context manager scopes the override. `_override_server_env` now also sets the contextvar via `_with_token_override(token)`, so both env var and contextvar carry the override (env for back-compat with anything bypassing get_token; contextvar is the authoritative source). Async-safe (each task sees its own override) and leak-proof (resets on context exit). 2 new tests: regression on stale-disk-token + scope leak guard. 2. cli/commands/status.py:43 (🟡): sessions_pending_upload only checked legacy `<workspace>/user/sessions/` and always reported 0 in workspaces bootstrapped with `agnes init` (Claude Code writes to `~/.claude/projects/`, not the legacy path). Same bug we fixed for `agnes push` in `08e49591`. Fix: route through `cli.lib.claude_sessions.list_session_files()` so status and push agree on what counts as a pending session. 3. connectors/bigquery/extractor.py:111 (🟡): docstring claimed "a live holder still wins the second flock attempt" — incorrect on Linux. After `unlink()` + `open()`, the new file is a new inode; fcntl.flock keys per-inode, so the old holder's lock does NOT block the new acquisition. In a genuine TTL-overrun scenario two writers CAN race the parquet.tmp. Fix: documentation only. Comment now honestly describes the inode-recreation behavior, names the threading.Lock as the actual in-process guard, and flags pid-gating as the next-iteration fix if real corruption surfaces. The 24h default TTL is well above typical COPY durations so the practical risk is low. Tests: 17/17 across test_cli_init.py + test_lib_pull.py + the broader regression set.	2026-05-04 21:26:30 +02:00
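Finding 1 of 8784f10a6b (the ContextVar override consulted before the on-disk token) could look roughly like this. It is a sketch under assumptions: the names mirror the commit message but the bodies are illustrative, and the `{"token": ...}` file layout is a guess, not the real token.json schema.

```python
import contextlib
import contextvars
import json
from pathlib import Path

# Hypothetical stand-in for cli.config._token_override.
_token_override = contextvars.ContextVar("token_override", default=None)


@contextlib.contextmanager
def with_token_override(token):
    """Scope an explicit token; the previous value returns on exit."""
    reset_handle = _token_override.set(token)
    try:
        yield
    finally:
        _token_override.reset(reset_handle)


def get_token(token_file):
    """Prefer the scoped override, then fall back to the file on disk."""
    override = _token_override.get()
    if override is not None:
        return override
    try:
        # Assumed layout: {"token": "..."}.
        data = json.loads(Path(token_file).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return None
    return data.get("token")
```

Inside `with with_token_override("NEW"):` a stale file on disk is ignored; once the block exits, the on-disk value is visible again, so the override cannot leak into later calls.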
ZdenekSrotyr	8233c3e3f9	chore(docs): replace stale `da` verbs and vendor-specific install paths Sweep operator runbooks (docs/QUICKSTART, docs/HEADLESS_USAGE, docs/architecture, docs/sample-data, docs/agent-workspace-prompt, docs/metrics/metrics.yml, dev_docs/server, dev_docs/disaster-recovery), the corporate-memory service README, the jira connector README + backfill scripts, the deploy skill, and test docstrings. Replaces `da sync` → `agnes pull`, `da analyst setup` → `agnes init`, `da metrics ...` → `agnes catalog --metrics` / `agnes admin metrics ...`, `da fetch` → `agnes snapshot create`, plus the matching docker-compose admin invocations. Vendor-specific `/opt/data-analyst/` install paths in jira backfill / consistency scripts and operator docs are replaced with the placeholder `<install-dir>` and a new `AGNES_ENV_FILE` env-var override that lets a deployment inject its actual install path without a code change. Aligns with the OSS vendor-agnostic policy in CLAUDE.md. CHANGELOG `### Internal` entry summarizes the audit and reaffirms the intentional stale-marker tuples (`_LEGACY_STRINGS`, `_OUR_COMMAND_MARKERS`) that must keep referencing `da sync` / `da fetch` / etc. for hook upgrade and override-detection logic.	2026-05-04 21:22:19 +02:00
ZdenekSrotyr	976d0c7160	fix(pull): re-download parquet when file missing despite matching hash Pre-fix `agnes pull` decided what to download from sync_state hash equality alone: if server_hash != local_hash or tid not in local_tables or not server_hash: to_download.append(tid) If the recorded local hash matched server but the actual parquet had been deleted from disk, the download was skipped. The next DuckDB view rebuild then fails on a missing file. Repro: `rm server/parquet/X.parquet && agnes pull` → 'Updated 0 tables', X still missing. Failure modes that produce hash-equal-but-file-missing: - manual `rm` of a single parquet - operator-side cleanup of `server/parquet/` - two workspaces sharing one user's `~/.config/agnes/sync_state.json` (TODO(workspace-scoped-sync-state) in pull.py): one workspace writes its parquets, the other reads sync_state and concludes 'I already have these' - disk corruption / partial restore from backup Fix: existence check runs alongside the hash compare. Missing file forces a re-download regardless of hash equality. `parquet_dir` is hoisted above the loop so the existence check is in scope when the download set is built. Tests: regression test for the hash-equal-but-missing-file case + counterpart for the fast-path (hash-equal-and-file-present must still skip).	2026-05-04 21:12:06 +02:00
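The download-set rule from 976d0c7160 (hash compare plus an on-disk existence check) can be sketched as follows. The function name, the dict-based inputs, and the `<table_id>.parquet` file naming are assumptions for illustration, not the code in pull.py.

```python
from pathlib import Path


def tables_to_download(server_hashes, local_hashes, parquet_dir):
    """Return table ids whose parquet must be (re-)downloaded."""
    parquet_dir = Path(parquet_dir)  # resolved once, before the loop
    to_download = []
    for tid, server_hash in server_hashes.items():
        # Original rule: no server hash, unknown table, or hash mismatch.
        hash_changed = (not server_hash) or local_hashes.get(tid) != server_hash
        # Added rule: a matching hash is not enough if the file is gone.
        file_missing = not (parquet_dir / f"{tid}.parquet").exists()
        if hash_changed or file_missing:
            to_download.append(tid)
    return to_download
```

With this shape, `rm server/parquet/X.parquet` followed by a pull puts `X` back into the download set even though the recorded hashes still agree, while a table with a matching hash and a present file is still skipped.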
ZdenekSrotyr	103efb69f0	chore(cli-rename): replace stale `da` verbs in active code paths Bring admin UI, audit-log messages, code comments, and analyst-facing skill docs in line with the post-bootstrap CLI surface (`agnes pull`, `agnes push`, `agnes init`, `agnes snapshot create`). The legacy `_LEGACY_STRINGS` detection tuple in `app/api/claude_md.py` and the hook upgrade markers in `cli/lib/hooks.py` are intentionally left as-is — they exist precisely to flag pre-rewrite content for re-authoring. Strip "(folded from `da metrics list`)" / "(lifted from `da metrics show`)" / "Replaces the old `da analyst status`" docstring noise — the rename history is in CHANGELOG.md, not in module docstrings.	2026-05-04 21:10:43 +02:00
ZdenekSrotyr	500db8cd3c	fix(query-guardrail): dry-run user SQL not synthetic SELECT * (#171) Closes #171. The /api/query cost guardrail used to dry-run a synthetic `SELECT * FROM <table>` for each registered remote-BQ row referenced by the user SQL — which made BigQuery estimate a full table scan, with column projection, predicate pushdown, and partition pruning all disabled. Narrow queries on big partitioned/clustered tables (the documented happy path for `agnes query --remote`) hit ~30,000× over-estimates and got rejected with 400 `remote_scan_too_large` even when BQ's own dry-run reported single-digit MB. Pavel's report on #171 traced the root cause and proposed the fix: rewrite the user SQL to BQ-native syntax and dry-run it as a single job, exactly the way `bq query --dry_run` works. Implementation: - New helper _rewrite_user_sql_for_bq_dry_run rewrites bare registered names (word-boundary, case-insensitive, longest-first to avoid prefix collisions) + bq."<ds>"."<tbl>" forms to backticked `<project>.<ds>.<tbl>` paths. - _bq_quota_and_cap_guard runs ONE dry-run on the rewritten SQL. Cap check uses the real estimate. - Fallback path: if BQ rejects with bq_bad_request (e.g. DuckDB-only syntax like ::INT casts), the guard falls back to the pre-fix per-table SELECT * approach so non-portable queries still get a (loose) cap estimate instead of fail-opening. Non-parse BQ errors (forbidden, upstream) still propagate as 502. - _bq_guardrail_inputs now also returns name_lookups so the rewriter has the (registered_name, bucket, source_table) mapping it needs. - Per-table breakdown is unavailable from a composite dry-run; total bytes are pinned to dry_run_set[0] for the post-flight record_bytes(sum(...)) call to keep returning the right total. Tests (7 new, 3 existing still pass): - dry-run receives rewritten user SQL with WHERE clause intact (the load-bearing assertion for #171) - single dry-run per request even with multiple registered tables (JOIN, UNION) referenced - fallback to per-table SELECT * on bq_bad_request - non-parse BQ errors (forbidden) still 502 - rewriter unit tests: bare + bq.path in same SQL, longest-name-wins on prefix collision, case-insensitive bare-name match	2026-05-04 21:08:21 +02:00
ZdenekSrotyr	bd462187e8	test(welcome-template): tighten default-rendered assertions to new agnes verbs The renderer no longer emits the legacy "da analyst setup" verb (the analyst flow uses `agnes init`, the admin flow uses `agnes auth import-token`). The disjunction assertions ("da analyst setup" OR "agnes auth" OR "curl") were permissive and would have silently kept passing even if the renderer regressed. Replace them with role-aware assertions that match the actual emitted markers and explicitly check that no legacy verb survives.	2026-05-04 21:07:51 +02:00
ZdenekSrotyr	8890b6f09b	fix(post-merge): clean up stale `da` verbs introduced via #174 merge Four call sites where #174 (branched from main before the agnes rename fully landed in some files) emitted or referenced `da fetch`. None are operator-visible runtime crashes — but `extractor.py` logs a stale verb to the operator log and `DATA_SOURCES.md` is current docs: - connectors/bigquery/extractor.py:431,434 (operator-facing log line on unverified BQ entity_type — was suggesting `da fetch`). - docs/DATA_SOURCES.md:77,85 (current public docs, two refs to `da fetch` in the workflow + the BQ scope description). - tests/test_cli_query_render.py:7 (module docstring listed `da fetch / agnes schema / etc.` — now `agnes snapshot create / agnes schema / etc.`). - tests/test_cli_snapshot_create.py:1 (docstring referenced `(folded from `da fetch`)` — historical, removed; no value once the rename landed). Pre-existing stale `da` references elsewhere in the branch (templates, operator runbooks, internal comments) are not touched by this commit — they live outside the merge surface and are a separate cleanup task. Verified: 10/10 across the affected test files pass.	2026-05-04 20:57:36 +02:00
ZdenekSrotyr	e438170ade	merge: pull #174 (BQ materialize view fix + concurrency, 0.33.0) into bootstrap branch Brings in zs/materialize-sync-fix (PR #174): - BigQuery view materialize works (wrap admin SQL in bigquery_query()) - Per-table mutex + fcntl.flock for concurrent COPY corruption - Cost guardrail dry-run engages on materialized rows - Schema v23 -> v24 migration: rewrite source_query to BQ-native - Server-generated trivial source_query from bucket+source_table - Validator backtick relaxation for materialized rows - 0.33.0 release cut Conflict resolution: - CHANGELOG.md: keep our [Unreleased] (bootstrap rewrite content) ABOVE the new [0.33.0] section from #174. The bootstrap rewrite remains unreleased; it'll cut 0.34.0 (or later) when this PR merges to main. - tests/conftest.py: union — keep our analyst-bootstrap fixture re-export AND #174's bq_instance / stub_bq_extractor fixtures. - pyproject.toml auto-merged to 0.33.0 (matches the cut), correct. - src/db.py auto-merged: SCHEMA_VERSION = 24, _v23_to_v24_finalize added — no overlap with our work which left schema at v23. - CLAUDE.md auto-merged: schema-history paragraph extended with v24. Verified: 79/79 across CLI bootstrap suite + materialize suite + schema v24 migration tests pass locally on Python 3.13/macOS.	2026-05-04 20:53:00 +02:00
ZdenekSrotyr	ee83cebbda	fix(cli): Windows console crash on cs-CZ codepage (port + broaden #172) Ports Minas's PR #172 (against pre-rename `da` CLI on main) and applies the principle to the post-rename `agnes` CLI. Two distinct failure modes on Windows consoles whose default codepage is cp1250 (cs-CZ) / cp1252 (en-US): 1. `agnes pull` and other Rich-progress codepaths UnicodeEncodeError on Braille spinner glyphs. Fix: `cli/main.py` reconfigures stdout/stderr to UTF-8 with errors='replace' at import time on `sys.platform == 'win32'` so Rich's legacy-Windows render path emits decodable bytes. Wrapped in try/except so pytest's captured streams (which aren't TextIOWrapper) don't break. 2. `agnes skills list` and `agnes skills show` UnicodeDecodeError when reading skill markdown containing em-dashes / accented chars. Default `Path.read_text()` uses locale.getpreferredencoding(False), which is the broken codepage on Windows. Fix: every call site passes encoding='utf-8' explicitly. Broader scope than #172 because: - The bootstrap rewrite renamed/removed several files Minas's PR patched (`cli/commands/analyst.py` -> rolled into init.py; `cli/commands/sync.py` -> split into pull/push). Those targets no longer exist; the equivalent code lives in init.py. - Other call sites Minas didn't touch (still bare in his branch) are patched here too — config.py / update_check.py / snapshot_meta.py / setup.py / skills.py — so the codebase has zero locale-default text I/O in cli/. Side cleanup: stale `Run `da`` reference in snapshot_meta.py:88 fixed to `agnes` while touching the file.	2026-05-04 20:45:29 +02:00
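Both fixes in ee83cebbda use standard-library calls only, so they can be illustrated with a small sketch. The helper names and the injectable `platform` argument are hypothetical (added here so the behavior can be exercised off Windows); the real change lives in cli/main.py and the individual call sites.

```python
import sys
from pathlib import Path


def force_utf8_streams(streams, platform=sys.platform):
    """On Windows, switch text streams to UTF-8 with errors='replace'."""
    if platform != "win32":
        return
    for stream in streams:
        try:
            # TextIOWrapper.reconfigure exists since Python 3.7.
            stream.reconfigure(encoding="utf-8", errors="replace")
        except (AttributeError, ValueError, OSError):
            # Captured or replaced streams may not support reconfigure.
            pass


def read_text_utf8(path):
    """Read text without depending on the locale's preferred encoding."""
    return Path(path).read_text(encoding="utf-8")


if __name__ == "__main__":
    force_utf8_streams([sys.stdout, sys.stderr])
    print("spinner glyph: \u280b")
```

Passing `encoding="utf-8"` explicitly is what removes the dependency on the console codepage for file reads; the stream reconfiguration only affects what gets written to the terminal.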
ZdenekSrotyr	e6a2c4c51d	tests: rename 'prj-grp' placeholder to 'my-project' for vendor-agnostic OSS The dashed identifier is what the test exercises (backticks required for dashed BQ project IDs); the literal string can be any synthetic value. 'prj-grp' is too close to a real customer-prefix pattern that the OSS vendor-scrub regex flags. 'my-project' matches placeholders used elsewhere in the project.	2026-05-04 20:38:47 +02:00
ZdenekSrotyr	e323ab76cc	fix(snapshot): catch httpx transport errors in --estimate path CI failure: test_readers_in_pre_init_dir asserted no Traceback in stderr when running `agnes snapshot create x --as y --estimate` in a folder that never saw `agnes init`. The estimate-guard fix in `3d587681` let `--estimate` skip the local_db check and reach `api_post_json`, but the existing `except V2ClientError` doesn't cover transport-layer failures. With no server configured the URL defaults to http://localhost:8000; httpx raises ConnectError → ConnectError isn't a V2ClientError → the exception bubbles up through Typer/rich as a full traceback. Add `except httpx.HTTPError` next to V2ClientError so connection / DNS / TLS / timeout failures all render the friendly hint `Run `agnes init …` first` instead of leaking transport noise.	2026-05-04 20:36:30 +02:00
ZdenekSrotyr	cd3293b994	release: 0.33.0 — BQ materialize view fix + concurrency control	2026-05-04 20:30:50 +02:00

723 commits