agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	cbf335cb5e	Merge pull request #210 from keboola/ma/marketplace-clone-and-auto-refresh feat(marketplace): clone-based plugin setup + SessionStart auto-refresh	2026-05-07 07:11:59 +02:00
ZdenekSrotyr	d3e8d29cfb	test(hooks): pin v0.43.0 chained-entry → v0.44.0 two-entry upgrade path	2026-05-07 07:00:00 +02:00
ZdenekSrotyr	bb36a69b1e	release: 0.44.0	2026-05-07 07:00:00 +02:00
Minas Arustamyan	cd10aefdbd	fix(refresh-marketplace): align manual-mode hint with hook JSON Hook JSON path uses /reload-plugins (no restart needed); manual-mode echo path was still telling the operator to /exit + restart. Both now say /reload-plugins. Tests renamed to _reload_hint_ to match the new wording.	2026-05-07 06:59:13 +02:00
Minas Arustamyan	3aeb0f2fbd	fix(refresh-marketplace): use /reload-plugins instead of /exit + restart Claude Code's `/reload-plugins` slash command picks up newly installed plugins into the running session without forcing the user to /exit and restart Claude Code. The hook JSON `systemMessage` and `additionalContext` both now point at it. Tests updated to pin the new hint shape.	2026-05-07 06:59:13 +02:00
Minas Arustamyan	166c1c0752	fix(refresh-marketplace): pass --scope project to `claude plugin update` Without `--scope project`, `claude plugin update <name>@agnes` operated at user scope (the default) instead of updating the project-scoped install -- so version bumps in the served manifest never propagated to the workspace, even though `claude plugin install` correctly used `--scope project` for the missing-plugin path. Mirrors the install line in the same function. Any change refresh-marketplace makes to a plugin must now stay in project scope -- consistent with the SessionStart hook firing per-workspace.	2026-05-07 06:59:13 +02:00
Minas Arustamyan	50e0463501	feat(marketplace): clone-based plugin setup + auto-refresh SessionStart hook Adds end-to-end flow for installing and keeping the per-user filtered Claude Code marketplace in sync with the user's Agnes stack (admin RBAC grants \ MyAIStack opt-outs U /store installs). Setup (one-liner in install prompt step 5): `agnes refresh-marketplace --bootstrap` clones the per-user marketplace bare repo to ~/.agnes/marketplace, strips PAT from the cloned origin URL, registers the local path with Claude Code, and installs every plugin in the served manifest at --scope project. Replaces a 15-line inline shell sequence that tripped Claude Code's agent-driven `rm -rf` permission gate. Auto-refresh (SessionStart hook installed by `agnes init`): `agnes refresh-marketplace --quiet` runs every Claude Code session, fetches+resets the clone (server rebuilds as orphan commits, so pull --ff-only is impossible), and version-aware reconciles: - missing in workspace -> claude plugin install <name>@agnes --scope project - version differs -> claude plugin update <name>@agnes - matches -> skip Don't auto-uninstall plugins that disappeared from the manifest -- a transient empty manifest from the server would wipe the stack. Hook output: when --quiet AND something actually changed, emits Claude Code hook JSON on stdout -- `systemMessage` (transient toast) and `hookSpecificOutput.additionalContext` (model-side system reminder), both carrying the change summary plus a "/exit + restart Claude Code" instruction (Claude only scans plugins at session start). Windows hook compatibility: the refresh-marketplace hook command is wrapped in `bash -c "..."` because Claude Code on Windows runs hook commands directly without invoking a shell, so `2>/dev/null || true` would otherwise be passed as literal argv tokens. Cross-cutting: - cli/lib/marketplace.py: shared CLONE_DIR + MARKETPLACE_NAME constants. - cli/lib/hooks.py: SessionStart now has two independent entries (pull + refresh-marketplace) so a failure in one doesn't suppress the other; legacy `da sync` and prior single-pull layouts upgrade cleanly on re-init. - PAT injection on every git fetch via per-invocation credential helper (token in $AGNES_TOKEN env, never in argv or .git/config). - Pre-snapshot of installed plugins captured BEFORE `claude plugin marketplace update` so silent auto-applied version bumps still fire notifications. - scripts/dev/agnes-client-reset.sh: cleans ~/.claude/plugins/marketplaces/agnes, ~/.claude/plugins/cache/agnes, drops uv build cache, documents workspace-scoped residue that can't be enumerated from the script. - app/web/setup_instructions.py: legacy AGNES_DEBUG_AUTH path also uses clone (direct HTTPS marketplace add is broken end-to-end on every Claude Code distribution -- stores response as single file, plugin source paths then 404). 28 new tests (test_cli_refresh_marketplace.py) + extended hook + setup template tests cover bootstrap, fetch+reset ordering, version-aware reconcile, project-path filtering, hook JSON shape, and the bash-c Windows wrapper invariant.	2026-05-07 06:59:13 +02:00
ZdenekSrotyr	f52cfd1119	infra(customer-instance): allow stopping VMs for in-place updates (#211) Add allow_stopping_for_update=true on google_compute_instance.vm. Without it, a TF change to machine_type triggers ForceNew (destroy + recreate); with it, the provider stops + mutates + restarts the VM in place, which is what an operator resizing a running deployment expects. Tag as infra-v1.7.0; consumers opt in by bumping the module ref.	2026-05-07 06:58:10 +02:00
ZdenekSrotyr	d3113e7a31	Merge pull request #209 from keboola/zs/cli-auto-upgrade-spec feat: server-pinned CLI auto-upgrade (0.43.0)	2026-05-06 23:46:47 +02:00
ZdenekSrotyr	e1ac7d41f1	release: 0.43.0 — server-pinned CLI auto-upgrade See CHANGELOG.md for the full entry. (Bumped from 0.42.0 to 0.43.0 since 0.42.0 was taken by PR #208's backtick-rewriter fix during this branch's review cycle.)	2026-05-06 23:24:44 +02:00
ZdenekSrotyr	df896816d8	chore: rename stale 'da' references to 'agnes' + CHANGELOG Drive-by docstring/comment cleanup in cli_artifacts.py and update_check.py. CHANGELOG entry for the auto-upgrade feature shipped in this branch.	2026-05-06 23:23:59 +02:00
ZdenekSrotyr	73d2896fa6	docs(hooks): update install_claude_hooks docstring for chained SessionStart	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	be62ce61b8	feat(cli): install SessionStart hook chaining self-upgrade then pull Single hook entry: 'agnes self-upgrade --quiet ... || true; agnes pull --quiet ... || true'. Shell semicolon guarantees ordering across every Claude Code version (no reliance on undocumented multi-hook execution semantics); each segment's || true preserves the original property that an upgrade failure does not abort the pull.	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	630e224578	feat(cli): add agnes self-upgrade with smoke test + rollback Reuses cli.update_check.check() for the version probe — extended with bypass_disabled=True so explicit user-typed self-upgrade is not silenced by AGNES_NO_UPDATE_CHECK (which is for the implicit warning loop). Install path: uv tool install --force when uv is on PATH; otherwise curl + pip via sys.executable (NOT system python3, NOT --user — both would land outside the agnes venv and silently no-op the upgrade). Smoke test execs the binary at the install-resolved path (uv tool dir joined with agnes-the-ai-analyst/bin/agnes, or sys.executable's sibling agnes for pip) — never via shutil.which, which can resolve a stale shadow on PATH and produce a false-positive smoke pass on the OLD version. Smoke also asserts --version output contains info.latest via PEP 440 Version() equality (so 0.40.0 does not falsely match 0.40.10). On smoke fail: rollback to last_known_good.json (written only after a previous run's smoke passed). Rollback rc is captured and surfaced on stderr if it also fails. First-ever upgrade or unrecoverable rollback prints the canonical bootstrap recovery: curl -fsSL <server>/cli/install.sh \| bash. AGNES_SELF_UPGRADE_IN_PROGRESS=1 is set for the duration of the run and propagated to the smoke-test subprocess. Layer B's _check_version_headers honors the sentinel and skips the < min hard-stop, so an in-flight upgrade can never sys.exit(2) itself. --force invalidates the update_check cache BEFORE probing. --force + offline = exit 1 with explicit stderr (without --force, offline is silent). --quiet suppresses progress output but never gags failure stderr.	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	d93eda7de3	perf+test(cli): cache User-Agent at module scope; pin local==min boundary	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	2680a6724b	feat(cli): hard-stop on incompatible-version response header Every API response is inspected via httpx event_hooks. When the server reports X-Agnes-Min-Version > local, CLI prints a remediation message and exits 2. Latest-version drift continues to be handled by the update_check warning loop — no double-warning on every API call.	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	af2b866961	docs(version): clarify APP_VERSION scope + middleware /api prefix rationale	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	57170bc556	feat(server): expose APP_VERSION + MIN_COMPAT_CLI_VERSION on /api/* response headers Adds X-Agnes-Latest-Version and X-Agnes-Min-Version headers to every /api/* response. CLI consumes these to hard-stop on incompatible drift. MIN_COMPAT_CLI_VERSION ships at 0.0.0 — no enforcement until a deliberate wire-protocol break bumps it. Also dedupes app version logic: app/main.py:_app_version() helper deleted, replaced by app/version.py:APP_VERSION as the single source of truth. test_app_version.py rewritten to target app.version.	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	56483989cf	docs(plan): server-pinned CLI auto-upgrade — spec + implementation plan Four review iterations resolved: - PATH-shadow-safe smoke test (uv tool dir --bin + ~/.local/bin fallback) - Recursion sentinel for in-flight self-upgrade - sys.executable + --no-deps pip fallback (NOT system python3, NOT --user) - Smoke + rollback with rc capture and bootstrap recovery - Single chained SessionStart entry (shell ; for ordering, no Claude Code semantics dependency) - AGNES_NO_UPDATE_CHECK bypass for explicit self-upgrade - _get_shared_client() left unhooked (mid-stream sys.exit unsafe; Caddy proxies parquets anyway) Targets release 0.40.0.	2026-05-06 23:23:23 +02:00
ZdenekSrotyr	e3494607bf	Merge pull request #208 from keboola/zs/issue-201-rewriter-backtick fix(query): rewriter respects backtick paths; tighten cap-guard fallback (#201)	2026-05-06 23:09:43 +02:00
ZdenekSrotyr	bc55af6e88	chore: trigger Devin re-review	2026-05-06 22:07:49 +02:00
ZdenekSrotyr	f4bc04958d	fix: Devin Review #1 — apply backtick mask to wrapping rewriter `_rewrite_user_sql_for_bigquery_query` does its own bare-name detection (mirroring the non-RBAC parts of `_bq_guardrail_inputs`). The backtick masking from #201 was applied to `_bq_guardrail_inputs` and the forbidden-table loop, but missed this third site — so a registered local-mode table name appearing as the table segment of a user-supplied full backtick path (e.g. ``\`prj.ds.orders\`` matching registered local ``orders``) tripped the cross-source guard and forced every backtick-path query into the 50-100× slower ATTACH-catalog fallback. Mask once at the top of the function, route both the BQ-name detection (line ~830) and the cross-source check (line ~867) through the masked copy. New regression test `test_local_name_inside_backtick_path_does_not_trip_cross_source` proves the wrapper now wraps when it should.	2026-05-06 21:06:21 +02:00
ZdenekSrotyr	09958c9d87	release: 0.42.0	2026-05-06 18:04:39 +02:00
ZdenekSrotyr	824e3cb636	feat(query): registry-gate full backtick BigQuery paths (#201 ) Adds Pass 3 to `_bq_guardrail_inputs` that scans user SQL for full backtick paths `<project>.<dataset>.<table>` and gates them identically to the `bq."<dataset>"."<table>"` pass: - Project must match the configured BigQuery data project (`get_bq_access().projects.data`). Mismatch → HTTP 403 `bq_path_cross_project`. - Path must point at a registered row. Unregistered → HTTP 403 `bq_path_not_registered`. - Non-admin caller must hold a grant on the registered row's id. Missing grant → HTTP 403 `bq_path_access_denied`. Pre-fix, full backtick paths bypassed Agnes RBAC entirely — only the service account scope limited reach. Post-fix the boundary matches what `agnes catalog`-driven flows already enforce. Admin still bypasses the per-id grant check but cannot bypass registration or project match. Pass 3 also seeds `dry_run_set` for resolved registered paths so the cost-cap dry-run runs against the same physical table the user named — composing cleanly with the Layer 2 fail-fast fallback.	2026-05-06 18:02:53 +02:00
ZdenekSrotyr	c32be3fe96	fix(query): cap-guard fallback retries original SQL, fails fast (#201 ) When BQ rejects the rewritten dry-run SQL with `bq_bad_request`, the cap-guard now retries with the user's ORIGINAL SQL instead of building a synthetic `SELECT * FROM <table>` per registered table. The synthetic path threw away user filters / projections / partition predicates and routinely ballooned the estimate to "full table size", falsely tripping `remote_scan_too_large` on legitimate narrow queries (typical issue #201 trace: rewriter corrupts a backtick path → BQ parse error → synthetic over-estimate → 400). Behaviour: - Rewritten SQL succeeds: same as before (issue #171 single-dry-run). - Rewritten SQL parse-errors, original SQL succeeds: use original estimate. Common case for users submitting BQ-native input. - Both fail with `bq_bad_request`: HTTP 400 `remote_estimate_failed` with a hint pointing at `agnes catalog` / BQ-native syntax. No silent over-estimate. - Non-parse BQ error (forbidden, upstream): still 502 as before. This is a behaviour change for clients matching error kinds — failure to estimate scan size now surfaces as `remote_estimate_failed` instead of being masked behind `remote_scan_too_large` from the synthetic path. Replaces the existing `test_guardrail_falls_back_to_per_table_estimate_on_bq_parse_error` (which pinned the old contract) with `test_fallback_tries_original_sql_first` and `test_fallback_fails_fast_on_pure_duckdb_syntax`.	2026-05-06 18:02:53 +02:00
ZdenekSrotyr	720a2180c0	fix(query): rewriter respects backtick segments (#201 ) `agnes query --remote` corrupted user SQL when the request contained a full BigQuery backtick path (`<project>.<dataset>.<table>`) whose table segment matched a registered bare-name alias. The bare-name rewriter used `\b` word-boundary matching against the lower-cased SQL; both `.` and `` ` `` are non-word characters, so the regex fired INSIDE the user's backtick path and produced malformed nested-backtick SQL that BigQuery rejected at parse time. Fix: - Add `_mask_backticks(sql)` helper: replace each `…` segment with spaces of equal length, preserving offsets so word-boundary searches find positions only outside backticks. - `_bq_guardrail_inputs` (bare-name pass + forbidden-table pass) searches against the masked SQL. - `_rewrite_bq_table_refs_to_native` Pass 1 splits the SQL on `(\`[^\`]*\`)` and rewrites only the outside-backtick chunks. Pass 2 (`bq."ds"."tbl"` → backtick form) is unchanged — its prefix can't appear inside backticks. Adds three regressions covering the rewrite + guardrail paths.	2026-05-06 18:02:53 +02:00
ZdenekSrotyr	1b49de1568	Merge pull request #202 from keboola/zs/perf-followup-0.41.0 fix(0.41.0): orchestrator filesystem fallback for materialized parquets	2026-05-06 17:16:38 +02:00
ZdenekSrotyr	7781c3f331	fix(0.41.0): orphan parquet skip in filesystem fallback (CI regression) Pre-existing test_orchestrator_skips_orphan_parquet_in_extracts caught the regression: my filesystem fallback created master views for ANY parquet on disk, including orphans where DELETE /api/admin/registry removed the registry row but the parquet wasn't fully cleaned up. Fix: load the set of registered materialized table_ids for THIS source from table_registry before the scan, and skip any parquet whose stem isn't in that set. If the registry read fails (test fixture, transient DB error), skip the fallback entirely — orphan exposure is worse than missing master view recovery. Pre-existing test now passes. New regression test pins the orphan-skip contract specifically for the filesystem-fallback path.	2026-05-06 17:06:20 +02:00
ZdenekSrotyr	dfb7f25e76	release: 0.41.0 -- orchestrator filesystem fallback for missing _meta materialized rows 0.40.0 added _persist_materialized_inner_view in materialize_query, which tried to open extract.duckdb from a fresh DuckDB handle to write the _meta row + inner view. In production this conflicts with the same uvicorn process's existing read-only ATTACH (orchestrator's analytics conn holds extract.duckdb ATTACHed as <source_name> alias), and DuckDB single-process file-handle uniqueness rejects with: Binder Error: Unique file handle conflict: Cannot attach "extract" -- already attached by database "<source>" The helper logs WARNING fail-soft, parquet stays canonical, but the master view never appears via the meta path. Fix: at the end of _attach_and_create_views, scan <extract_dir>/data/*.parquet and CREATE OR REPLACE VIEW <id> AS SELECT * FROM read_parquet('<path>') for any parquet whose <id> is not already in the per-source tables list (= meta path didn't pick it up). Decoupled from materialize_query open-handle race. Honors the same view_ownership cross-connector collision rules as the meta path (first-come-first-served via view_repo.claim). Tests: - filesystem-fallback fires when _meta row missing - skipped when meta path already created the view (no shadow) - skips invalid identifiers (e.g. parquet stem starting with a digit) - doesn't crash when source has no data/ subdir	2026-05-06 16:58:18 +02:00
ZdenekSrotyr	0fd73faa8d	Merge pull request #200 from keboola/zs/perf-followup-0.40.0 fix(0.40.0): materialize_query writes _meta + inner view (master view recovery)	2026-05-06 16:18:41 +02:00
ZdenekSrotyr	b5b16e98a0	release: 0.40.0 — materialize_query writes _meta + inner view so master views appear Pre-fix flow: 1. extractor subprocess writes _meta with N remote rows + creates N inner views in extract.duckdb (rebuild_from_registry skips materialized rows per design — explicit `continue` at line 389) 2. _run_materialized_pass calls materialize_query, which writes parquet atomically + returns stats — but never updates _meta 3. orchestrator.rebuild scans _meta, finds only the N remote rows, creates master views only for them. Materialized parquet is on disk but invisible to /api/query → 400 'not yet materialized' Symptom appears after every container recreate (the previous run's _meta state is wiped because docker compose down nukes the named volume that backs extract.duckdb on some compose layouts; even on volumes that persist, the next extractor pass calls _create_meta_table which DROPs + CREATEs _meta cleanly). Fix: after os.replace(tmp_path, parquet_path) in materialize_query, open extract.duckdb (read-write), DELETE existing _meta row for table_id, INSERT new one with query_mode='materialized', and CREATE OR REPLACE VIEW <table_id> AS SELECT * FROM read_parquet(<path>). All inside a single transaction so concurrent reads see either old or new state, not torn rows. Fail-soft on lock contention or schema drift — parquet remains canonical, next sync pass recovers. Tests: 3 new in test_bq_materialize.py covering: - meta + inner view registered after materialize, alongside existing remote rows - re-run replaces (not duplicates) the meta row - skips inner-view registration when extract.duckdb doesn't exist yet (fresh BQ-only deployment edge case)	2026-05-06 16:04:58 +02:00
ZdenekSrotyr	6de7084c9f	Merge pull request #199 from keboola/zs/perf-bundle-0.39.0 perf(0.39.0): bundle — BQ query rewrite + session pool + chunked download + HTTP/2	2026-05-06 14:37:48 +02:00
ZdenekSrotyr	f03fa67b2e	chore: trigger Devin re-review All Devin findings from initial review on `8e56d45c` addressed: - Devin #1 (BQ billing project) → fixed in `81d065b1` - Devin #2 (try/except scope) → fixed in `aee585fa` (was already in flight at initial review time) Plus three rounds of devil's advocate review (`e5645fd2`, `aee585fa`, `77d88014`) addressing 9 additional findings. 76/76 perf tests pass; CI green.	2026-05-06 14:32:36 +02:00
ZdenekSrotyr	81d065b1ea	fix: Devin Review #1 — bigquery_query() first arg uses billing project, not data In cross-project BQ setups (where billing != data), the SA typically has serviceusage.services.use on the billing project but not on the data project. The rewriter passed bq.projects.data as the first arg to bigquery_query(), which BQ uses as the execution + billing project → 403 USER_PROJECT_DENIED. Match the convention used everywhere else in the codebase (app/api/v2_scan.py, app/api/v2_sample.py, app/api/v2_schema.py, connectors/bigquery/extractor.py): backtick paths inside the inner SQL use the data project (resolves the actual table location), the bigquery_query() first arg uses the billing project (decides who pays + which project the job runs under). For single-project deploys the two are identical so the fix is a no-op there. Test pins the cross-project case: data-prj for backticks, billing-prj for the bigquery_query() first arg.	2026-05-06 14:07:38 +02:00
ZdenekSrotyr	77d88014df	fix: devil's advocate R3 -- reap PID-suffixed leftovers from dead processes R3 final pass surfaced one issue, addressed: R2#2 introduced PID-suffixed <target>.{pid}.tmp / .{pid}.partN to prevent concurrent agnes pull invocations from yanking each other's in-progress writes. The pre-clean inside _download_chunked / _download_single_stream only deletes leftovers from the CURRENT process's PID -- files from a SIGKILL'd or crashed prior pull (any other PID) are never touched and accumulate on disk forever. Add _reap_dead_pid_leftovers(target_path) called at the start of both download paths. Globs <target>.*.tmp / <target>.*.partN, extracts the embedded PID, calls os.kill(pid, 0) to test liveness (POSIX standard no-op probe), and unlinks files whose process no longer exists. Permission-denied = process is alive but owned by another user → keep the file (conservative). Windows users get the conservative 'keep' default. Two new tests pin the behavior -- live-PID file preserved, dead-PID .tmp + .partN reaped, bare-name (legacy) untouched, garbage filenames skipped without raise.	2026-05-06 14:04:47 +02:00
ZdenekSrotyr	aee585fac6	fix: devil's advocate R2 — narrow shared-client try, PID tmp suffix, Syntax error anchor R2 adversarial review surfaced 3 issues, all addressed: #1 cli/client.py:572-577 outer try/except wrapped both _get_shared_client() AND the actual download. A 401/403/404/5xx from the server triggered a full second download attempt with a fresh client — wasted bandwidth on hard failures, no fail-fast on revoked PAT. Narrowed the try to only the shared-client construction; the download itself is no longer retried under the fallback except. #2 concurrent agnes pull invocations (e.g. SessionStart hook + manual run) collided on bare <target>.tmp / <target>.partN paths — one process's in-progress write got yanked by the other's cleanup, manifest hash check then failed spuriously. Per-process suffix (<target>.{pid}.tmp, <target>.{pid}.partN) makes intermediate files disjoint; the final os.replace to the bare target is atomic so last-writer-wins. #3 _looks_like_bq_rewrite_parse_error patterns 'Syntax error' could false-positive on a query like WHERE log_msg = 'Syntax error in foo' that fails for an unrelated reason (quota, network) and has the literal substring echoed in the error text. Anchored to 'Syntax error: ' (with trailing colon) — BQ always emits the colon in this error format, user SQL string literals normally don't.	2026-05-06 13:57:29 +02:00
ZdenekSrotyr	e5645fd280	fix: devil's advocate R1 -- chunked probe, parse-error heuristic narrow, pool settings refresh, content-length sanity, multi-project skip R1 adversarial review surfaced 5 issues, all addressed: #1 chunked download silently disabled in non-Caddy deployments (HEAD on GET-only FastAPI route returns 405). _probe_range_support now falls back to GET with Range: bytes=0-0 when HEAD fails -- works against both Caddy file_server (HEAD-friendly) and dev FastAPI direct (GET-only). #2 parse-error fallback heuristic too broad -- matched on Unrecognized name / Function not found / No matching signature / Invalid cast, which BQ surfaces for ordinary user-column typos. That triggered slow ATTACH-catalog retry on every typo (2× latency tax). Narrowed to just 'Syntax error' / 'syntax error' which are the genuine DuckDB-vs-BQ dialect mismatch markers. #3 apply_bq_session_settings was only run on fresh-built pool entries, not on reuse. An operator's /admin/server-config change to bq_query_timeout_ms wouldn't propagate to long-lived pooled sessions until restart. Fixed: re-apply on every pool acquire (idempotent + fail-soft). #4 content-length sanity bound -- a misconfigured proxy returning a wildly inflated Content-Length would cause overlapping chunked Range requests against the actual file → corrupt assembled output (caught by manifest hash check, but only after wasted bandwidth). Cap at 100 GiB; above that, drop to single-stream. #5 rewriter assumed every BQ row resolves under the single bq.projects.data project. Bucket containing '.' suggests a project-qualified bucket (multi-project deployment); rewriter would silently target the wrong project. Conservative skip with regression test.	2026-05-06 13:50:46 +02:00
ZdenekSrotyr	8e56d45c68	fix(query): code-review fixes — outer LIMIT wrap, dollar-quoting, parse-error fallback Address code-reviewer findings on the bigquery_query() rewrite path: 1. Outer LIMIT wrap — bigquery_query() materialises BQ result into DuckDB before fetchmany sees it (vs ATTACH-catalog Storage Read API streaming). A user 'SELECT *' against a billion-row remote table would buffer the entire result before request.limit applied. Wrap rewritten SQL in an outer 'LIMIT N+1' so the cap pushes into the BQ job itself. 2. Dollar-quoted inner SQL — naive replace("'", "''") doubling missed DuckDB backslash-escape sequences (\\, \\n, \\t, …). A predicate like 'WHERE name = ''O\\'Brien''' was unsafe under the doubling path. DuckDB $bqq_inner$ … $bqq_inner$ form takes the inner SQL verbatim with no escapes whatsoever. Falls back to legacy doubling if user SQL improbably contains the literal tag. 3. Parse-error fallback — when the rewritten path fails with a BQ-side parse / validation error (DuckDB-only syntax like ::INT cast that survives identifier rewrite but BQ refuses), retry the user's original SQL via the legacy ATTACH-catalog path so the request still succeeds. Mirrors the existing dry-run fallback contract. 4. CHANGELOG — delete duplicate CLI bullets that landed under already-released [0.38.1] (file corruption from merge — entries are correctly under [0.39.0]).	2026-05-06 13:29:45 +02:00
ZdenekSrotyr	3b9f6b447d	release: 0.39.0 — perf bundle (BQ query rewrite + session pool + chunked download + HTTP/2)	2026-05-06 13:18:19 +02:00
ZdenekSrotyr	830d1a38f6	merge: CLI perf (chunked DL + HTTP/2 + persistent client + progress) # Conflicts: # CHANGELOG.md	2026-05-06 13:16:31 +02:00
ZdenekSrotyr	c96ea3ad49	merge: server-side perf (BQ rewrite + session pool + error mapping)	2026-05-06 13:13:48 +02:00
ZdenekSrotyr	e72ff259f9	feat(pull): aggregated progress + non-TTY textual fallback Two improvements to `agnes pull` progress reporting: 1. Aggregated per-file progress across chunked downloads: the existing Rich progress bar already used one task per file, but the chunked-download contract (one file = N parallel chunk callbacks summing to file size) meant we needed to verify that all chunk threads advance the same task. They do -- the per-file callback is constructed once per tid and routes every chunk's byte delta to the same task / textual entry, so the bar shows one aggregated bytes-downloaded total rather than N separate sub-bars. 2. Textual fallback for non-TTY stderr: when stderr is not a terminal (SessionStart hook, CI runner, Docker log capture), Rich either suppresses output (silent multi-minute pull on a 5 GB parquet) or emits raw control sequences. The new `_TextualProgress` helper instead emits one plain-text line per file at most every 10%-of-total-bytes or 30 s, plus a final `100% done` line per file. Format: `[N/T files] <tid>: 25% (16 MB / 66 MB) at 1.5 MB/s`. The TTY path is unchanged. Detection uses `sys.stderr.isatty()` -- `show_progress=True` flips into the textual fallback when that returns False. `show_progress=False` (the SessionStart hook) still emits no progress text in either mode.	2026-05-06 13:09:37 +02:00
ZdenekSrotyr	14db85f506	fix(bq): map 'Response too large' to its own error class instead of generic bad_request translate_bq_error previously mapped BQ's responseTooLarge failure mode to bq_bad_request (HTTP 400 with the raw upstream message). The user-facing implication ('your SQL has a syntax error') is wrong -- the root cause is query shape (BQ refused to return the result inline because it exceeded the response size limit), and the actionable remediation is 'narrow the WHERE clause, aggregate further, or use a materialized table'. Add bq_response_too_large as a first-class BqAccessError kind (also 400) with a canonical hint message; original BQ message preserved in details for operator debugging. Detection is substring-based on 'response too large' and fires before the generic BadRequest path so the dedicated mapping always wins. Affects every BQ-touching path since they all share translate_bq_error -- /api/query, /api/v2/{scan,sample,schema}, materialize.	2026-05-06 13:09:31 +02:00
ZdenekSrotyr	bd1b5ad444	perf(cli): persistent HTTP/2 client across pull invocation Pool the httpx.Client used by `stream_download` so N parquet downloads share a single TLS handshake instead of one handshake each. With the optional `h2` package installed, HTTP/2 multiplexing further lets all chunk Range requests share a single TCP connection — synergizes with the range-chunked download path added in the previous commit. The shared client is created lazily on first stream-download call, kept alive for the duration of the process via a module-level slot, and closed at exit via `atexit.register`. Construction wraps in a try/except: when `h2` is unavailable (slim install), httpx raises ImportError on `http2=True` and we transparently fall back to an HTTP/1.1 client — pooling alone still amortizes TLS handshakes. `agnes pull` must never crash on a missing optional package, so the fallback path is non-negotiable. `h2>=4.1.0` is added to the core dependency set; downstream slim installs that drop it lose the HTTP/2 benefit but keep correctness.	2026-05-06 13:06:36 +02:00
ZdenekSrotyr	83209f32b0	perf(bq): pool DuckDB BQ extension sessions to amortize INSTALL/LOAD/ATTACH cost Each BqAccess.duckdb_session() acquire previously created a fresh in-memory DuckDB conn and ran INSTALL bigquery; LOAD bigquery; CREATE SECRET; ATTACH on it -- costing ~0.5 s per request even before any BQ work. Add a process-local pool (deque + lock) of pre-warmed sessions; acquire reuses a warm entry when available, refreshing the auth SECRET so a long-lived pool entry doesn't keep a stale GCE metadata token past its TTL. Liveness probe (cheap SELECT 1) drops broken entries before handing them to callers. On exception inside the with-block the conn is closed instead of returned to pool (session may carry dirty state). Pool size is data_source.bigquery.session_pool_size (default 4; sentinel 0 disables pooling). Process-cached, not fork-safe (single uvicorn worker is the supported deployment shape per CLAUDE.md). All call sites get faster automatically: /api/query, /api/v2/{scan, sample,schema}, materialize, the orchestrator's remote-attach, and the BQ dry-run cap-guard.	2026-05-06 13:06:25 +02:00
ZdenekSrotyr	dee33fe25b	feat(pull): range-chunked parallel download for single large files When the server advertises `accept-ranges: bytes` and a parquet exceeds `AGNES_PULL_CHUNK_THRESHOLD_BYTES` (default 50 MB), `stream_download` now splits the file into N parallel HTTP Range requests (`AGNES_PULL_CHUNK_PARALLELISM`, default 4, capped 1..16) and assembles the parts into the destination atomically. Targets the per-flow-shaped network (corp VPN with per-TCP-connection rate-limiting) where single-stream throughput is throttled but N parallel streams over the same connection scale roughly linearly. Manifests with 1 large materialized parquet + N remote tables previously left the existing across-files `AGNES_PULL_PARALLELISM=4` pool with 1 active worker = single-stream throughput; this fixes that. Falls back to single-stream when: - HEAD doesn't advertise `accept-ranges: bytes` - Server returns 200 instead of 206 to a Range probe - File size below the threshold Cleanup discipline: every part file removed before return (success or failure); destination written via `<target>.tmp` and renamed atomically. Per-chunk retry on transient network blips (bounded by AGNES_STREAM_RETRIES).	2026-05-06 13:04:53 +02:00
ZdenekSrotyr	b2c1ff143c	fix(query): rewrite BQ-backed user SQL via bigquery_query() to enable predicate pushdown User SQL hitting query_mode='remote' BigQuery rows was 50-100x slower than the equivalent direct bigquery_query() call because DuckDB's master view (CREATE VIEW … AS SELECT * FROM bigquery.<ds>.<tbl>) does not push WHERE/SELECT/LIMIT into BQ in ATTACH-catalog mode. The BQ extension opens a Storage Read API session over the entire upstream table; on >100M-row sources this was 70-150s and frequently failed with 'Response too large to return'. Extract the existing dry-run rewriter's core (table-name → BQ-native backtick path) into a shared helper. Add an execution-path rewriter that wraps the whole user SQL in bigquery_query('<project>', '<inner>') so the BQ planner sees the full query and engages partition pruning + projection pushdown server-side. Conservative fall-through: cross-source JOINs (BQ ↔ Keboola/Jira local), queries already containing bigquery_query(, and unconfigured BQ project all skip the rewrite and run the original SQL via ATTACH-catalog so behavior degrades gracefully.	2026-05-06 13:02:34 +02:00
ZdenekSrotyr	9649f42b99	Merge pull request #198 from keboola/zs/admin-tables-description-clamp fix(admin/tables): keep row Actions reachable + sanitize description escapes	2026-05-06 11:50:13 +02:00
ZdenekSrotyr	226eb71592	Merge remote-tracking branch 'origin/main' into pr198-review # Conflicts: # CHANGELOG.md	2026-05-06 11:35:45 +02:00
ZdenekSrotyr	6bc8739010	feat(admin/tables): show source, schedule, folder, registered, and sync-error in row	2026-05-06 11:09:02 +02:00
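
Note on commit 720a2180c0: the backtick masking it describes (replace each backtick-quoted segment with spaces of equal length so word-boundary searches only match outside backticks) can be sketched as below. This is an illustrative reconstruction, not the repository's code; `mask_backticks` and `find_bare_name` are hypothetical names, and the regex is an assumption based on the commit text.

```python
import re

# Assumption: a backtick segment is a pair of backticks with no backtick in between.
_BACKTICK_SEGMENT = re.compile(r"`[^`]*`")


def mask_backticks(sql: str) -> str:
    """Blank out every backtick-quoted segment, keeping string length and offsets."""
    return _BACKTICK_SEGMENT.sub(lambda m: " " * len(m.group(0)), sql)


def find_bare_name(sql: str, name: str) -> list[int]:
    """Return offsets where `name` appears as a bare word outside backticks."""
    masked = mask_backticks(sql).lower()
    pattern = re.compile(r"\b" + re.escape(name.lower()) + r"\b")
    return [m.start() for m in pattern.finditer(masked)]


sql = "SELECT * FROM `prj.ds.orders` o JOIN orders x ON o.id = x.id"
masked = mask_backticks(sql)
positions = find_bare_name(sql, "orders")
```

Because the masked copy has the same length as the input, every offset found in it can be applied directly to the original SQL; only the bare `orders` after JOIN is reported, not the one inside the backtick path.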


871 commits
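
Note on commit dee33fe25b: the range-chunked download it describes splits one large file into N parallel HTTP Range requests. A minimal sketch of the splitting arithmetic follows, assuming inclusive byte ranges and a 50 * 1024 * 1024 byte threshold (the commit says only "50 MB"). `split_ranges` and `should_chunk` are hypothetical helpers, not functions from the repository.

```python
# Assumption: "50 MB" is read as 50 MiB; the commit does not say which.
DEFAULT_THRESHOLD_BYTES = 50 * 1024 * 1024


def should_chunk(size: int, accept_ranges: bool,
                 threshold: int = DEFAULT_THRESHOLD_BYTES) -> bool:
    """Chunk only when the server supports ranges and the file exceeds the threshold."""
    return accept_ranges and size > threshold


def split_ranges(size: int, parallelism: int = 4) -> list[tuple[int, int]]:
    """Split [0, size) into contiguous inclusive (start, end) byte ranges."""
    parallelism = max(1, min(16, parallelism))  # clamp to 1..16 as the commit states
    parts = min(parallelism, size)              # never create empty ranges
    ranges = []
    start = 0
    for i in range(parts):
        base, extra = divmod(size, parts)
        length = base + (1 if i < extra else 0)  # spread the remainder over early parts
        ranges.append((start, start + length - 1))
        start += length
    return ranges


ranges = split_ranges(10, 4)
headers = [f"bytes={s}-{e}" for s, e in ranges]
```

Each tuple maps to one `Range: bytes=start-end` request header; the ranges are disjoint and cover the file exactly, so the parts can be concatenated in order to rebuild the original.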