Commit graph

333 commits

Author SHA1 Message Date
ZdenekSrotyr
e53de59a42 docs: multi-customer deployment spec + implementation plan
- Spec: pure self-deploy model with per-customer GCP project
- Public upstream repo with TF module; private template + per-customer repos
- Branch-aware dev VMs via dev_instances list
- Caddy TLS, Secret Manager for tokens, SA JSON key for CI (WIF follow-up)
- 6-phase implementation plan with bite-sized tasks
2026-04-21 15:25:17 +02:00
ZdenekSrotyr
bd6921c4d5 docs,tests: anonymize customer references
Replace identifying customer names and infrastructure URLs in
documentation and test fixtures with generic placeholders.
Test semantics preserved.
2026-04-21 11:56:19 +02:00
ZdenekSrotyr
c74a1fab53 Merge pull request #4 from keboola/feature/v2-fastapi-duckdb-docker-cli
test: comprehensive test suite — 1169 tests, 4 layers
2026-04-13 16:14:11 +02:00
ZdenekSrotyr
5bbd82bacd fix: address Devin review — docker-e2e .env, jira webhook test isolation
- Create empty .env before docker compose up in CI (env_file: .env is required)
- Mock get_jira_service in webhook HMAC test to isolate signature check
  from Jira API availability — strict assert 200 instead of permissive 500
2026-04-13 14:36:31 +02:00
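The signature check that this webhook test isolates can be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper names `sign_payload` and `verify_signature`, the `sha256=` prefix, and the secret value are all assumptions.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature for the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the received signature with the expected one in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, header_value)


# Hypothetical payload; the bytes that are signed must be the bytes that are sent.
body = json.dumps({"webhookEvent": "jira:issue_updated"}, sort_keys=True).encode("utf-8")
good = sign_payload("hypothetical-secret", body)
```

Mocking the downstream Jira service, as the commit describes, lets a test assert on the outcome of this check alone rather than on whether the Jira API happens to be reachable.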
ZdenekSrotyr
863453b2e2 fix: address code review findings — duplicate fixture, JWT key length, async deprecation
- Remove duplicate mock_extract_factory fixture in conftest.py
- Use 32+ char JWT_SECRET_KEY everywhere (was 15 chars, triggered warnings)
- Replace deprecated asyncio.get_event_loop() with asyncio.run()
- Unify WebhookEventFactory sign methods (consistent json.dumps)
2026-04-13 13:47:51 +02:00
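The `asyncio.get_event_loop()` replacement mentioned above might look like the sketch below. The coroutine `fetch_status` and the wrapper `run_sync` are hypothetical stand-ins for whatever async helper the tests call.

```python
import asyncio


async def fetch_status() -> str:
    """Hypothetical coroutine standing in for any async helper under test."""
    await asyncio.sleep(0)
    return "ok"


def run_sync() -> str:
    # asyncio.run() creates a fresh event loop, runs the coroutine to
    # completion, and closes the loop afterwards.
    return asyncio.run(fetch_status())
```

Because each call gets its own loop, repeated calls do not depend on a loop left over from an earlier test.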
ZdenekSrotyr
12480b8c35 fix: graceful skip for telegram bot tests when log dir unavailable in CI 2026-04-13 13:31:51 +02:00
ZdenekSrotyr
98af8e2df3 fix: make bot.py FileHandler resilient to missing log directory 2026-04-13 13:28:59 +02:00
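One way to make a file log handler tolerate a missing or unusable log directory is sketched below. It is an assumption about the general approach, not the actual change in bot.py; `build_log_handler` is a hypothetical name.

```python
import logging
from pathlib import Path


def build_log_handler(log_file: str) -> logging.Handler:
    """Return a FileHandler, or a stderr StreamHandler if the file is unusable."""
    path = Path(log_file)
    try:
        # Create the directory if it is missing, then open the log file.
        path.parent.mkdir(parents=True, exist_ok=True)
        return logging.FileHandler(path)
    except OSError:
        # Read-only or otherwise unusable location: keep logging alive on stderr.
        return logging.StreamHandler()
```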
ZdenekSrotyr
0045f5d324 fix: ensure DATA_DIR and notifications dir exist before bot.py import in CI 2026-04-13 13:26:18 +02:00
ZdenekSrotyr
1a68decd4e fix: patch BOT_LOG_FILE at import time for CI/xdist compatibility 2026-04-13 13:21:04 +02:00
ZdenekSrotyr
9a144f8291 fix: unify JWT_SECRET_KEY across all test modules for xdist stability 2026-04-12 14:28:17 +02:00
ZdenekSrotyr
ed58075419 Merge branch 'worktree-agent-a417e289' into feature/v2-fastapi-duckdb-docker-cli 2026-04-12 14:24:39 +02:00
ZdenekSrotyr
325f785ef4 fix: get_instance_name reads nested instance.name from YAML 2026-04-12 14:23:54 +02:00
ZdenekSrotyr
209643becb fix: return filename instead of absolute path in upload responses 2026-04-12 14:23:51 +02:00
ZdenekSrotyr
31e210c7e3 fix: require admin/km_admin role for web admin pages 2026-04-12 14:23:47 +02:00
ZdenekSrotyr
01b5f80ef9 fix: restrict script deploy/execute to analyst role, undeploy to admin 2026-04-12 14:23:44 +02:00
ZdenekSrotyr
5bfff6616c ci: add parallel test execution and nightly Docker E2E job 2026-04-12 14:15:46 +02:00
ZdenekSrotyr
2ec50b4e4f test: add telegram API endpoint tests (verify, unlink, status) 2026-04-12 14:12:28 +02:00
ZdenekSrotyr
e25a7aba7d fix: resolve JWT secret key test isolation issue
Replace module-level SECRET_KEY cache with lazy _get_cached_secret_key()
that re-reads env vars in test mode. This fixes 20 test failures caused
by JWT secret mismatch when test modules load in different orders.
2026-04-12 14:05:41 +02:00
ZdenekSrotyr
833de96cd7 merge: resolve Block E conflicts in pytest.ini and conftest.py 2026-04-12 11:17:26 +02:00
ZdenekSrotyr
d70d645902 Merge branch 'worktree-agent-afb2461f' into feature/v2-fastapi-duckdb-docker-cli 2026-04-12 11:15:35 +02:00
ZdenekSrotyr
8e22eed669 Merge branch 'worktree-agent-aaa8db4c' into feature/v2-fastapi-duckdb-docker-cli 2026-04-12 11:15:34 +02:00
ZdenekSrotyr
44317a86c6 merge: resolve factories.py conflict — keep Faker factories + add Block D convenience methods 2026-04-12 11:15:15 +02:00
ZdenekSrotyr
7967279181 test: add E2E journey tests (J1-J8) covering full user flows
40 tests across 8 files covering bootstrap/auth, sync+query, hybrid
queries, RBAC+access-requests, Jira webhooks, corporate memory,
analyst uploads, and multi-source orchestration. Adds mock_extract_factory
and admin_user fixtures to conftest, and journey marker to pytest.ini.
2026-04-12 11:13:51 +02:00
ZdenekSrotyr
9c2bd3ff25 test: add 132 API gap tests across 8 endpoint modules
Covers upload (sessions, artifacts, local-md), scripts (deploy/run/delete),
settings (get/dataset), memory (CRUD, voting, admin governance),
access-requests (create, approve, deny), permissions (grant/revoke/list),
metadata (get/save/push), and admin configure+registry endpoints.

Each file tests happy path, auth required (401), role enforcement (403),
and input validation (422) independently using the seeded_app fixture.
2026-04-12 11:13:24 +02:00
ZdenekSrotyr
cef1310b8f test: add CLI gap tests for all 9 command groups
81 tests covering auth login/logout/whoami, admin user/table/metadata
CRUD, sync download/upload/skip-unchanged, query local/remote/formats,
analyst setup/status freshness, server subprocess delegation, diagnose
health checks, explore local/remote, and metrics list/show.
2026-04-12 11:13:15 +02:00
ZdenekSrotyr
3c653b6dc2 test: add connector test suite (Block D) — 5 files, 58 tests
Tests cover Keboola extractor (extension + legacy fallback, _remote_attach),
BigQuery extractor (remote views, contract validation), Jira service
(webhook processing, HMAC verification, HTTP mocking), Jira incremental
transform (upsert/delete, monthly parquet partitioning), and LLM providers
(factory, AnthropicExtractor retry/auth, OpenAICompatExtractor strategy
cascade, JSON extraction helpers). Also adds tests/helpers/factories.py
with WebhookEventFactory.
2026-04-12 11:12:50 +02:00
ZdenekSrotyr
b6ace1e09a Merge branch 'worktree-agent-af11156d' into feature/v2-fastapi-duckdb-docker-cli 2026-04-12 11:12:14 +02:00
ZdenekSrotyr
5a651ca59c test: add Block C services tests (68 tests across 6 files)
Cover ws_gateway JWT auth, telegram storage user linking and verification
codes, telegram bot handlers, scheduler pure functions, corporate memory
collector hash detection and governance, and session file collection.
2026-04-12 11:11:48 +02:00
ZdenekSrotyr
ba61eb5f44 Merge branch 'worktree-agent-ab9a9016' into feature/v2-fastapi-duckdb-docker-cli 2026-04-12 11:10:27 +02:00
ZdenekSrotyr
4d8de9c3b7 test: add Docker E2E and live connector test files
Adds test_docker_full.py (4 docker-marked tests against a running stack),
test_live_keboola.py, test_live_bigquery.py, and test_live_jira.py (live-marked,
read-only, skipped when credentials are absent).
2026-04-12 11:10:06 +02:00
ZdenekSrotyr
510608813c test: add shared test infrastructure (fixtures, factories, assertions, mocks)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 11:05:35 +02:00
ZdenekSrotyr
51f60bbf91 docs: add comprehensive test suite implementation plan (8 tasks, 6 parallel blocks)
Covers shared infrastructure, API gaps, CLI gaps, services, connectors,
E2E journeys, Docker and live tests. Tasks 2-7 are independent for
parallel sub-agent dispatch.
2026-04-12 10:44:08 +02:00
ZdenekSrotyr
55d11920ef docs: add comprehensive test strategy spec (6 parallel blocks, 4 layers)
Covers gap analysis, 8 critical E2E journeys, shared test infrastructure,
Docker E2E and live test design for full project coverage.
2026-04-12 10:33:26 +02:00
ZdenekSrotyr
dab5c84860 Merge pull request #3 from keboola/feature/v2-fastapi-duckdb-docker-cli
feat: remote query — extension re-attach + two-phase BQ+DuckDB engine
2026-04-12 10:19:13 +02:00
ZdenekSrotyr
e351c38368 test: add correctness test for _reattach_remote_extensions
Verifies that _remote_attach table is actually found via table_catalog
and contains expected extension data (not just resilience).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:40:12 +02:00
ZdenekSrotyr
35df940e5c fix: BQ COUNT subquery alias, wrap ImportError in RemoteQueryError
- Add AS _cnt alias to COUNT(*) subquery (BQ Standard SQL requires it)
- Catch ImportError in _get_bq_client() and raise RemoteQueryError
  so API endpoint returns proper 400 instead of 500

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 20:29:03 +02:00
ZdenekSrotyr
618385e7e4 fix: table_catalog in re-attach query, --limit in hybrid CLI
- _reattach_remote_extensions: query table_catalog instead of table_schema
  (DuckDB ATTACHed databases use table_catalog for the alias)
- _query_hybrid: forward --limit flag to RemoteQueryEngine.max_result_rows

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 20:13:35 +02:00
ZdenekSrotyr
77d369e311 fix: CLI help test handles ANSI escape codes in Typer output
Rich/Typer may insert ANSI codes within option names like --register-bq,
breaking exact string matching in CI. Check parts separately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 19:58:01 +02:00
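The failure mode and the two usual remedies can be illustrated with the sketch below. The helper names and the sample `styled` string are assumptions; real Rich/Typer output may place the escape codes differently.

```python
import re

# Matches CSI sequences such as "\x1b[1m" or "\x1b[0;36m".
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def strip_ansi(text: str) -> str:
    """Remove ANSI colour/style sequences from terminal output."""
    return ANSI_CSI.sub("", text)


def mentions_option(help_text: str, option: str) -> bool:
    """Check each dash-separated part of the option on its own."""
    parts = [p for p in option.split("-") if p]
    return all(part in help_text for part in parts)


# Hypothetical help output with style codes inserted inside the option name.
styled = "\x1b[1m--register\x1b[0m\x1b[1m-bq\x1b[0m  Register a BigQuery result"
```

Checking parts separately, as the commit does, is the looser test. Stripping the codes first keeps the exact-match assertion at the cost of one extra helper.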
ZdenekSrotyr
2ad8828f8c fix: stdin register_bq parsing, separate BQ SQL validation
- cli/commands/query.py: --stdin mode now reads register_bq from the
  JSON payload and merges it into the register_bq option list, matching
  the documented {"register_bq": {...}, "sql": "..."} contract.
- src/remote_query.py: add _validate_bq_sql() with a narrower blocklist
  (writes only); register_bq() now calls _validate_bq_sql() so legitimate
  BQ operations like INFORMATION_SCHEMA, CALL, IMPORT are not blocked.
  The final DuckDB execute() path still uses the full _validate_sql().
- tests/test_remote_query.py: add TestValidateBqSql covering allowed
  INFORMATION_SCHEMA queries and blocked write operations.
2026-04-11 19:31:39 +02:00
ZdenekSrotyr
f4129dc87d fix: alias validation, url escaping, read-only CLI, blocklist comment
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 11:28:27 +02:00
ZdenekSrotyr
872b06ffae docs: add hybrid query usage instructions to CLAUDE.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:11:10 +02:00
ZdenekSrotyr
ed43feb4e6 feat: add POST /api/query/hybrid endpoint for two-phase BQ+DuckDB queries 2026-04-11 11:09:42 +02:00
ZdenekSrotyr
d605e7d95f feat: add --register-bq and --stdin to da query for hybrid BQ+local queries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 11:09:11 +02:00
ZdenekSrotyr
86bbb8fce4 feat: add RemoteQueryEngine with BQ registration and safety limits
Two-phase query engine: Phase 1 registers BQ query results as DuckDB
Arrow views (with COUNT pre-check, row/memory limits, Storage API
fallback); Phase 2 executes validated SQL against DuckDB with result
serialization and truncation. 25 tests covering all branches.
2026-04-11 11:07:08 +02:00
ZdenekSrotyr
0a69814fca fix: re-attach remote extensions in get_analytics_db_readonly()
Add _reattach_remote_extensions() helper that reads _remote_attach
tables from attached extract.duckdb files and LOADs the corresponding
DuckDB extensions, so BigQuery and other remote views resolve correctly
in read-only analytics connections.
2026-04-11 11:04:04 +02:00
ZdenekSrotyr
816168f96b docs: add remote query implementation plan (5 tasks)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:02:04 +02:00
ZdenekSrotyr
eb68e6292d docs: fix remote query spec after code review
- Address read-only LOAD uncertainty with verification step + workaround
- Clarify register_bq wraps BQ logic (not delegates to register_bq_table)
- Use existing max_bq_registration_rows config key name
- Apply SQL blocklist to both register_bq and final sql
- Define connection lifecycle (caller owns, try/finally)
- Fix CLI argument handling (optional positional + --sql flag)
- Document concurrency safety (Unix inode semantics)
- Handle missing google-cloud-bigquery gracefully

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:58:25 +02:00
ZdenekSrotyr
017cf07674 docs: add design spec for remote query (extension re-attach + two-phase BQ)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:52:39 +02:00
ZdenekSrotyr
c24205a1bf Merge pull request #2 from keboola/feature/v2-fastapi-duckdb-docker-cli
feat: business metrics, analyst bootstrap, metadata writer
2026-04-11 10:29:38 +02:00
ZdenekSrotyr
fbad3f5538 fix: address Devin review — partial download cleanup, category validation, path escaping, docs
- cli/commands/analyst.py: delete partial parquet file on download failure to unblock re-download
- cli/commands/analyst.py: escape single quotes in parquet path to prevent SQL injection
- app/api/metrics.py: replace tempfile-based import with inline YAML parse + direct repo.create(); validates name+category upfront and returns 400 if missing; removes os/tempfile imports
- CLAUDE.md: update schema version text to v4 with full migration chain

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 09:41:29 +02:00
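The single-quote escaping mentioned for the parquet path can be sketched as follows. The helper names are hypothetical; doubling the quote is the standard SQL way to embed one in a string literal.

```python
def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def read_parquet_sql(path: str) -> str:
    # read_parquet() is DuckDB's table function for parquet files.
    return f"SELECT * FROM read_parquet({sql_string_literal(path)})"
```

Without the doubling, a path containing a quote would end the literal early and let the remainder of the path be parsed as SQL.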