agnes-the-ai-analyst/tests
ZdenekSrotyr 4e4d2a39e6
chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94)
* chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88)

Vendor-neutralization step before public release. The directory mixed
two concerns: (1) generic ops scripts referenced from mainline OSS
infrastructure (TLS rotation, auto-upgrade cron) and (2) one operator's
hackathon manual-deploy helper with hardcoded GCP project IDs, VM names,
and admin emails. Splitting them per concern.

Moved (still in OSS, just under a vendor-neutral name):
- scripts/grpn/agnes-tls-rotate.sh   → scripts/ops/agnes-tls-rotate.sh
- scripts/grpn/agnes-auto-upgrade.sh → scripts/ops/agnes-auto-upgrade.sh

Removed (belongs in private consumer infra repos, not upstream OSS):
- scripts/grpn/Makefile (hardcoded prj-grp-foundryai-dev-7c37, foundryai-development VM name, e_zsrotyr@groupon.com bootstrap email)
- scripts/grpn/README.md (GRPN hackathon deploy walkthrough)
- docs/superpowers/plans/2026-04-22-grpn-deploy-learnings.md (org-specific deploy log)

Cross-refs updated in README.md, CLAUDE.md, docs/DEPLOYMENT.md,
docker-compose.yml. CHANGELOG entry flags BREAKING (ops) for any
consumer infra repo that installs these scripts via path-based systemd
timers.

This is the first wave of #88 — the remaining leaks (test data with
prj-grp-dataview-prod-1ff9, AIAgent.FoundryAI tags in OpenMetadata test
fixtures, docstrings in connectors/openmetadata/enricher.py) will be a
separate, smaller PR.

Refs #88.

* chore(oss): comprehensive vendor-neutralization (#88 wave 2 + review fixes)

PR #94 review found that the original wave-1 grep was scoped wrong and
many leaks survived. This commit closes wave 1 properly AND folds in all
wave-2 anonymization in a single pass — easier to review than two PRs.

Wave-1 review-fix corrections:
- Caddyfile: scripts/grpn/agnes-tls-rotate.sh → scripts/ops/ (the original
  wave-1 grep filter excluded extensionless files like Caddyfile).
- CHANGELOG bullet rewritten — original wording implied an in-repo migration
  for infra/modules/customer-instance/, which is wrong (the TF module embeds
  the script inline via heredoc, never sourced from scripts/grpn/). Now
  flags downstream consumer infra repos only.
- infra/modules/customer-instance/variables.tf: Czech docstring with `grpn`
  example → English description with `acme, example` placeholders.

Wave-2 anonymization:
- Code docstrings (connectors/openmetadata/{client,transformer,enricher}.py,
  src/catalog_export.py, scripts/duckdb_manager.py): prj-grp-… →
  my-bq-project / prj-example-1234, AIAgent.FoundryAI → AIAgent.MyAgent,
  FoundryAIDataModel → AnalyticsDataModel.
- Test fixtures (4 files): same set of replacements — 157 tests still pass.
- .github/workflows/keboola-deploy.yml: "Groupon-side dev VMs" comment →
  generic "per-developer dev VMs".
- docs/auth-groups.md + scripts/debug/probe_google_groups.py:
  kids-ai-data-analysis project name → acme-internal-prod placeholder.
- 5 planning/spec docs under docs/superpowers/{plans,specs}/2026-04-21-*:
  hardcoded IPs (34.77.94.14, 34.77.102.61) → <dev-vm-ip>/<prod-vm-ip>;
  GRPN/Groupon → Acme/another-customer; prj-grp-… → prj-example-….
- scripts/switch-dev-vm.sh deleted — hackathon-era helper hardcoded to a
  specific shared dev VM. Per-developer dev VMs are the supported pattern.

Final grep `groupon|grpn|foundryai|prj-grp|groupondev|34\.77\.(94|102)\.…|kids-ai-data`
returns zero hits (excluding CHANGELOG.md historical entries).

CHANGELOG entry expanded to document both waves under one bullet, with
the BREAKING (ops) clarification about the TF module being unaffected.

Refs review of #94, closes #88.

* fix(oss): close remaining #94 review-2 findings (Czech, padak refs, CHANGELOG)

Reviewer of PR #94 round 2 caught 5 remaining items the wave-2 pass missed:

1. infra/modules/customer-instance/variables.tf had Czech descriptions on
   8 more variables. Previous review only flagged line 19; this round
   audited the rest. Translated lines 2, 28, 42-46 (heredoc), 60, 65, 71,
   78, 84 to English. Same review concern: a Terraform module that is
   the customer-facing API surface in Czech is unfit for OSS distribution.

2. infra/modules/customer-instance/outputs.tf had Czech descriptions on
   four outputs. Same fix.

3. docs/padak-security.md referenced a private repo (padak/keboola_agent_cli#206)
   in two places. Replaced with generic 'tracked upstream in the auth-CLI repo'
   per CLAUDE.md vendor-agnostic rule (no cross-refs to private repos).

4. scripts/fetch-env-from-secrets.sh:41 had a Czech comment.
   Translated.

5. CHANGELOG cosmetic: bullet said 'AIAgent.FoundryAI -> AIAgent.MyAgent'
   but the actual code uses both MyAgent (in docstrings) and Example
   (in test fixtures). Reworded to mention both targets.

Final grep across all shipping file types (.md, .py, .yml, .yaml, .sh,
Makefile, .json, .tf, .tpl, Caddyfile, .toml) for groupon|grpn|foundryai|
prj-grp|groupondev|34.77.94.14|34.77.102.61|kids-ai-data|padak/keboola_agent_cli
returns ZERO hits (excluding CHANGELOG.md). Czech-diacritic grep across
.tf/.toml/Caddyfile/Makefile/.yml returns ZERO hits.
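
The Czech-diacritic check described above could be approximated as follows. This is a minimal sketch, not the command that was actually run: the function name `lines_with_czech_diacritics` is hypothetical, the character class assumes precomposed (NFC) text, and several of the letters (á, é, í, ó, ú, ý) also occur in other languages, so a hit only means "worth a human look".

```python
import re

# Letters with diacritics used in Czech orthography, lower and upper case.
# Assumption: the text is NFC-normalised, so each letter is one code point.
CZECH_DIACRITICS = re.compile(r"[áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ]")


def lines_with_czech_diacritics(text):
    """Return (line number, line) pairs for lines containing a Czech diacritic."""
    return [
        (lineno, line)
        for lineno, line in enumerate(text.splitlines(), start=1)
        if CZECH_DIACRITICS.search(line)
    ]
```

Running it over the contents of a `.tf` file and expecting an empty list gives the same pass/fail signal as the grep.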

157/157 OpenMetadata + DuckDB tests still pass.

* fix(oss): close #94 round-3 leaks (env.template, instance.yaml.example, padak typo)

Round-3 reviewer caught one typo and two MUST-FIX leaks the round-2 grep missed
(grep was scoped to extensions that did not include .template / .example
suffixes — the audit was right, the previous grep was not paranoid enough):

1. config/instance.yaml.example:114 — '(optional - Groupon-specific)' brand
   leak in a shipping config example. Replaced with '(optional)'.

2. config/.env.template:68 — stale path 'scripts/grpn/agnes-tls-rotate.sh'
   in operator-facing env-template comment. The script lives at
   scripts/ops/ now (commit 16a85cc); this comment had been pointing
   operators at a non-existent path.

3. docs/padak-security.md:188 — phrase duplication 'tracked in tracked
   upstream' from a sloppy substitution in round-2. Trivial wording fix.

Final paranoid grep across .md/.py/.yml/.yaml/.sh/Makefile/.json/.tf/.tpl/
Caddyfile/.toml/.template/.example/.env* with the full token set
(groupon|grpn|foundryai|prj-grp|groupondev|34\.77\.94\.14|34\.77\.102\.61|
kids-ai-data|padak/keboola_agent_cli) returns ZERO hits, excluding
CHANGELOG.md historical entries.
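
Because each earlier round missed leaks by filtering on file extension, one way to make the check harder to under-scope is to scan every readable file and exclude only by name. The sketch below illustrates that idea under stated assumptions: `find_leaks` and `EXCLUDED_NAMES` are hypothetical names, case-insensitive matching is an assumption (the grep flags are not recorded here), and the token set is copied from the paragraph above.

```python
import re
from pathlib import Path

# Token set copied from the commit message; IGNORECASE is an assumption.
LEAK_PATTERN = re.compile(
    r"groupon|grpn|foundryai|prj-grp|groupondev|34\.77\.94\.14|34\.77\.102\.61|"
    r"kids-ai-data|padak/keboola_agent_cli",
    re.IGNORECASE,
)

# Hypothetical allow-list: historical entries in this file are excluded.
EXCLUDED_NAMES = {"CHANGELOG.md"}


def find_leaks(root):
    """Return (relative path, line number, stripped line) for every matching line."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.name in EXCLUDED_NAMES:
            continue
        if ".git" in path.relative_to(root).parts:
            continue  # skip version-control internals
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if LEAK_PATTERN.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

Scanning by content rather than by extension means files such as `Caddyfile`, `.env.template`, and `instance.yaml.example` are covered without having to be listed.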

* fix(oss): #94 round-4 — QUICKSTART.md + rename padak-security.md

Devin Review caught two findings on the latest round-3 commit:

1. docs/QUICKSTART.md:67 still pointed users at the deleted
   scripts/switch-dev-vm.sh. A Quickstart user following step-by-step
   would hit a missing-file error at the final step. Replaced with the
   inline gcloud-ssh equivalent that the Removed bullet documents.

2. docs/padak-security.md filename retains the personal identifier
   'padak'. The PR fixed the body content (replaced
   padak/keboola_agent_cli#206 references with generic wording) but
   missed the filename. Renamed to docs/security-audit-2026-04.md
   (date-anchored, vendor-neutral). Updated the historical CHANGELOG
   link to point at the new path with an inline note about the rename.

* fix(oss): redact remaining hardcoded IPs from planning docs + remove default email

Devin Review caught two more leaks:
1. scripts/fetch-env-from-secrets.sh line 16 had a hardcoded
   personal-email default (zdenek.srotyr@keboola.com). Replaced with
   ':?' bash error so SEED_ADMIN_EMAIL must be explicitly set —
   safer than carrying any specific identity.
2. Planning docs still had 35.195.96.98 and 34.62.223.189 (legacy
   prod/dev IPs) that the round-1 IP-replace pattern missed (it only
   targeted 34.77.x.x). Generic regex redaction across all five
   planning docs replaces every public IP with <redacted-ip>,
   preserving private/loopback/IAP ranges.
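
A generic redaction of this kind can be sketched with the standard-library `ipaddress` module. This is an illustration, not the script that was used: `redact_public_ips` and `PRESERVED_NETWORKS` are hypothetical names, and treating "IAP ranges" as Google Cloud's IAP TCP-forwarding source range 35.235.240.0/20 is an assumption.

```python
import ipaddress
import re

# Dotted-quad candidates that are not part of a longer dotted number
# (so "1.2.3.4.5" is ignored) but may be followed by a sentence-ending period.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")

# Assumption: the only public range worth preserving is the IAP source range.
PRESERVED_NETWORKS = [ipaddress.ip_network("35.235.240.0/20")]


def redact_public_ips(text, placeholder="<redacted-ip>"):
    """Replace globally routable IPv4 addresses, keeping private and loopback ones."""

    def replace(match):
        candidate = match.group(0)
        try:
            addr = ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            return candidate  # not a valid address, e.g. an octet above 255
        if not addr.is_global:
            return candidate  # private, loopback, link-local, and similar
        if any(addr in net for net in PRESERVED_NETWORKS):
            return candidate
        return placeholder

    return IPV4_CANDIDATE.sub(replace, text)
```

Classifying each match with `ipaddress` instead of matching a fixed prefix avoids the gap described above, where a pattern aimed at 34.77.x.x left other public addresses in place.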
2026-04-27 20:24:34 +02:00
helpers fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
snapshots feat: multi-instance deployment — all 14 must-have items from spec 2026-04-10 11:57:42 +02:00
__init__.py
conftest.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_access_control.py refactor: remove legacy webapp + add missing tests + housekeeping 2026-03-31 13:44:06 +02:00
test_access_requests_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_admin_configure_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_admin_role_mapping_ui.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_admin_tokens_ui.py feat(auth): Google Workspace groups on /profile + tag-triggered Keboola deploy workflow (#56) 2026-04-26 00:56:44 +02:00
test_admin_user_capabilities_ui.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_analyst_bootstrap.py feat: add da analyst setup command with bootstrap flow 2026-04-10 19:43:36 +02:00
test_api.py feat: add POST /api/query/hybrid endpoint for two-phase BQ+DuckDB queries 2026-04-11 11:09:42 +02:00
test_api_complete.py fix: return filename instead of absolute path in upload responses 2026-04-12 14:23:51 +02:00
test_api_role_management.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_api_scripts.py fix: restrict script execution endpoints to analyst/admin roles 2026-04-09 16:31:42 +02:00
test_app_version.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_auth_providers.py release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70) 2026-04-26 16:48:55 +02:00
test_auto_profiling.py Add self-service data onboarding system 2026-03-09 14:25:37 +01:00
test_bigquery_extractor.py feat: add graceful shutdown handler 2026-04-09 07:03:45 +02:00
test_bigquery_extractor_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_bootstrap.py fix(auth): /auth/bootstrap activates seed users, disabled only by real password 2026-04-21 20:01:20 +02:00
test_catalog_export.py chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) 2026-04-27 20:24:34 +02:00
test_cli.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_cli_admin.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_cli_admin_role.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_cli_analyst.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_artifacts.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_cli_auth.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_cli_diagnose.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_explore.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_metrics.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_query.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_server.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_sync.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_cli_update_check.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_column_metadata.py feat: add ColumnMetadataRepository with CRUD and proposal import 2026-04-10 19:41:53 +02:00
test_connector_kit_poc.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_corporate_memory_collector.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_db.py test: add correctness test for _reattach_remote_extensions 2026-04-12 08:40:12 +02:00
test_docker_full.py fix(tests): refresh nightly docker-e2e asserts after auth + health refactors (#69) 2026-04-26 16:12:20 +02:00
test_duckdb_manager.py chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) 2026-04-27 20:24:34 +02:00
test_e2e_api.py refactor: remove legacy webapp + add missing tests + housekeeping 2026-03-31 13:44:06 +02:00
test_e2e_docker.py fix(tests): refresh nightly docker-e2e asserts after auth + health refactors (#69) 2026-04-26 16:12:20 +02:00
test_e2e_extract.py fix: use SCHEMA_VERSION constant in e2e migration test 2026-04-10 19:39:19 +02:00
test_generate_sample_data.py Add --format parquet using project's ParquetManager 2026-03-10 21:46:20 +01:00
test_instance_config.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_jira_incremental.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_jira_service.py test: add missing coverage for web UI, Jira extract, instance config, and concurrent rebuild 2026-04-09 07:15:14 +02:00
test_jira_service_full.py fix(security): close Jira webhook fail-open + path traversal (#83) (#93) 2026-04-27 19:53:55 +02:00
test_jira_validation.py fix(security): close Jira webhook fail-open + path traversal (#83) (#93) 2026-04-27 19:53:55 +02:00
test_jira_webhooks.py fix(security): close Jira webhook fail-open + path traversal (#83) (#93) 2026-04-27 19:53:55 +02:00
test_journey_analyst.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_bootstrap_auth.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_hybrid.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_jira.py fix(security): close Jira webhook fail-open + path traversal (#83) (#93) 2026-04-27 19:53:55 +02:00
test_journey_memory.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_multisource.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_rbac.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_sync_query.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_keboola_extractor.py feat: add graceful shutdown handler 2026-04-09 07:03:45 +02:00
test_keboola_extractor_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_live_bigquery.py test: add Docker E2E and live connector test files 2026-04-12 11:10:06 +02:00
test_live_jira.py test: add Docker E2E and live connector test files 2026-04-12 11:10:06 +02:00
test_live_keboola.py test: add Docker E2E and live connector test files 2026-04-12 11:10:06 +02:00
test_llm_connector.py Add modular LLM connector for Corporate Memory 2026-03-23 12:08:33 +01:00
test_llm_providers_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_memory_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_metadata_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_metrics.py fix: address code review — path injection, multi-table search, metrics import API, error handling 2026-04-10 19:56:00 +02:00
test_migration.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_openapi_snapshot.py feat: multi-instance deployment — all 14 must-have items from spec 2026-04-10 11:57:42 +02:00
test_openmetadata_client.py Implement OpenMetadata catalog integration (Phase 1) 2026-03-12 14:07:13 +01:00
test_openmetadata_enricher.py chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) 2026-04-27 20:24:34 +02:00
test_openmetadata_transformer.py chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) 2026-04-27 20:24:34 +02:00
test_orchestrator.py test: add missing coverage for web UI, Jira extract, instance config, and concurrent rebuild 2026-04-09 07:15:14 +02:00
test_password_flows.py feat(auth): password reset & invite flows for web + admin (#34) (#37) 2026-04-22 17:43:57 +02:00
test_pat.py feat(auth): Google Workspace groups on /profile + tag-triggered Keboola deploy workflow (#56) 2026-04-26 00:56:44 +02:00
test_permissions.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_permissions_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_profiler.py
test_rbac.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_remote_query.py fix: BQ COUNT subquery alias, wrap ImportError in RemoteQueryError 2026-04-11 20:29:03 +02:00
test_repositories.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_role_resolver.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_scheduler.py Support multiple daily sync times (e.g., "daily 07:00,13:00,18:00") 2026-03-16 23:09:48 +01:00
test_scheduler_full.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_schema_v9_migration.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_scripts_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_security.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_selective_gzip.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_session_collector.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_settings_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_setup_instructions.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_telegram_api.py test: add telegram API endpoint tests (verify, unlink, status) 2026-04-12 14:12:28 +02:00
test_telegram_bot.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_telegram_storage.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_upload_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_user_management.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_web_ui.py feat(auth): unified role management — UI + REST API + CLI + schema v9 (v0.11.4) (#73) 2026-04-27 02:23:01 +02:00
test_ws_gateway.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00