- admin_welcome.html: update subtitle, description, placeholder cheatsheet
(drop tables/metrics/marketplaces/sync_interval; add user-null note and
security note). Textarea initial value is now empty (no default template
to show). Preview pane uses innerHTML (HTML output). refreshStatus sets
editor to empty when no override. Preview pane styled as light surface.
Reset modal copy updated (no banner shown, not "OSS-shipped template").
- config/claude_md_template.txt: deleted (markdown template is gone;
default is now no banner).
- docs/agent-setup-prompt.md: rewritten for variant C — describes the
/setup banner, smaller placeholder table, security/sanitization notes,
anonymous-user guard, example HTML snippet.
Diagnostic + operator-facing documentation that closes the loop on the work in this PR.
`da diagnose` (via /api/health/detailed):
- New _check_bq_billing_project() helper. When data_source.type='bigquery' and BqProjects.billing == BqProjects.data, surface a yellow warning: 'BigQuery billing project equals data project'. Hint includes the YAML field path + the /admin/server-config UI shortcut. Diagnose's overall status promotes warning → degraded so the CLI echoes it.
- Non-BQ instances (Keboola-only, etc.) skip the check.
- Implementation hooks into the existing /api/health/detailed surface — no new endpoint, no CLI changes.
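A minimal sketch of the billing-project check described above. The `BqProjects` field names and the warning text come from this message; the function signatures, the result dictionary shape, the `"ok"` status name, and `overall_status` are assumptions for illustration, not the actual helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BqProjects:
    billing: str
    data: str


def check_bq_billing_project(source_type: str, projects: Optional[BqProjects]) -> Optional[dict]:
    """Return a warning entry when billing == data, else None (hypothetical shape)."""
    if source_type != "bigquery" or projects is None:
        return None  # non-BQ instances (Keboola-only, etc.) skip the check
    if projects.billing == projects.data:
        return {
            "status": "warning",
            "message": "BigQuery billing project equals data project",
            "hint": "set data_source.bigquery.billing_project or use /admin/server-config",
        }
    return None


def overall_status(checks: list) -> str:
    """Promote any warning to 'degraded' so a CLI caller can echo it."""
    if any(c is not None and c["status"] == "warning" for c in checks):
        return "degraded"
    return "ok"
```

The three cases map onto the test file listed under Tests: equal projects warn, distinct projects are clean, and a non-BigQuery source returns `None`.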
config/instance.yaml.example documentation:
- data_source.bigquery.billing_project: USER_PROJECT_DENIED hint, /admin/server-config UI reference
- data_source.bigquery.legacy_wrap_views: analyst-side discipline note (use `da fetch` / `da query --remote`), issue #101 history, view-heavy deployment guidance
- data_source.bigquery.max_bytes_per_materialize: cost guardrail block (NEW — wasn't documented in .example before)
- ai.base_url: provider list + UI hint
- openmetadata + desktop: 'configurable via /admin/server-config UI' headers
- corporate_memory: leading note that the schema is editable via UI
Other docs:
- CHANGELOG.md: comprehensive Unreleased section
- CLAUDE.md: schema chain → v20 + Materialized SQL connector mode + per-connector tab UI mention
- README.md: mode-first source table summary
- docs/architecture.md: per-connector tab UI mention
- cli/skills/connectors.md: bootstrap rails (parallel to #154)
- docs/superpowers/plans/2026-05-01-admin-tables-form-cleanup.md: implementation plan archive (2515 lines)
- scripts/seed_dummy_tables.py: drop is_public after #150 RBAC migration (column gone)
Tests:
- test_diagnose_billing.py — 3 cases (BQ with billing==data warns, BQ with billing!=data clean, non-BQ skips)
* fix(analyst): document BigQuery remote-query capability in bootstrap CLAUDE.md template
Closes #153.
The CLAUDE.md template generated by `da analyst bootstrap` (config/claude_md_template.txt)
covered metrics, sync, corporate memory, and directory layout — but had ZERO
mention of query_mode: "remote", da fetch, da query --remote, or --register-bq.
Result: the AI analyst running in a freshly-bootstrapped workspace had no
idea BigQuery-backed tables existed, no path to fetch unsynced data, and no
fallback for tables not in the catalog.
Validated against /Users/<user>/foundry-ai/foundryai-data-analyst/CLAUDE.md
on 2026-05-01: section confirmed missing. Workspace-level (parent-dir)
CLAUDE.md carried legacy SSH-heredoc instructions but the analyst-level
file (which Claude reads as primary project context) had nothing.
## Changes
### config/claude_md_template.txt (+83)
Added a `## Remote Queries (BigQuery)` section covering:
- Discovery first — `da catalog --json | jq '...'` to see all tables
with their query_mode, then `da schema` and `da describe` for shape.
- Three query patterns:
- `da fetch` (preferred) — materialize a filtered subset locally,
query the snapshot, drop when done.
- `da query --remote` — one-shot server-side execution (cheap probes).
- `da query --register-bq` — hybrid joins between local + ad-hoc BQ.
- `da fetch` estimate-first discipline — rules of thumb on
--select / --where / --estimate / snapshot reuse.
- BigQuery SQL flavor cheat sheet for `--where` (DATE literal,
DATE_SUB, REGEXP_CONTAINS, CAST AS INT64).
- Unknown-table fallback: when a table isn't in `da catalog` at all,
use ad-hoc `--register-bq` if the agnes server SA has BQ access, or
ask admin to register with `query_mode: "remote"` for ongoing use.
- Pointer to `da skills show agnes-data-querying` for deeper guidance.
### docs/setup/claude_md_template.txt (deleted)
Stale 359-line template that documented the deprecated SSH-heredoc
remote_query.sh protocol. No code references it (verified via grep
across .py / .sh / .yml / .md). Removing eliminates two failure
modes:
1. A future refactor accidentally pulling it into a workspace and
shipping deprecated guidance to analyst Claude sessions.
2. Reviewer confusion over which template is canonical.
### CHANGELOG.md
`### Fixed` and `### Removed` entries under [Unreleased].
## Tested
- Manually walked the diff against `da skills show agnes-data-querying`
output on a live VM (foundryai-development) — patterns + flags
match the modern CLI exactly.
- Re-bootstrap test deferred: requires network round-trip; pattern
is identical to existing template substitution path so render is
not at risk.
## Out of scope
- The companion gap that data_description.md often only enumerates
query_mode: "local" tables (no signal that other modes exist) —
separate concern, fix likely belongs in the metadata generator
on the server side, not in the analyst template.
- Encouraging admins to register frequently-queried BQ tables as
`query_mode: "remote"` in the registry — workflow improvement, not
a code bug.
* chore(release): cut 0.28.0
---------
Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
Replaces the BigQuery wrap-view pattern with a discovery + scoped-fetch toolkit driven by the analyst's Claude session. Adds /api/v2/{catalog,schema,sample,scan,scan/estimate}, da catalog/schema/describe/fetch/snapshot/disk-info CLI commands, sqlglot-backed WHERE validator, process-local quota tracker, agent rails skill (cli/skills/agnes-data-querying.md). BREAKING: BQ wrap views off by default — set data_source.bigquery.legacy_wrap_views=true for one cycle. Backward-compat field_validator on primary_key. Catalog cache now matches documented 300s TTL with RBAC fresh per request. Cuts release v0.14.0.
* chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88)
Vendor-neutralization step before public release. The directory mixed
two concerns: (1) generic ops scripts referenced from mainline OSS
infrastructure (TLS rotation, auto-upgrade cron) and (2) one operator's
hackathon manual-deploy helper with hardcoded GCP project IDs, VM names,
and admin emails. Splitting them per concern.
Moved (still in OSS, just under a vendor-neutral name):
- scripts/grpn/agnes-tls-rotate.sh → scripts/ops/agnes-tls-rotate.sh
- scripts/grpn/agnes-auto-upgrade.sh → scripts/ops/agnes-auto-upgrade.sh
Removed (belongs in private consumer infra repos, not upstream OSS):
- scripts/grpn/Makefile (hardcoded prj-grp-foundryai-dev-7c37, foundryai-development VM name, e_zsrotyr@groupon.com bootstrap email)
- scripts/grpn/README.md (GRPN hackathon deploy walkthrough)
- docs/superpowers/plans/2026-04-22-grpn-deploy-learnings.md (org-specific deploy log)
Cross-refs updated in README.md, CLAUDE.md, docs/DEPLOYMENT.md,
docker-compose.yml. CHANGELOG entry flags BREAKING (ops) for any
consumer infra repo that installs these scripts via path-based systemd
timers.
This is the first wave of #88 — the remaining leaks (test data with
prj-grp-dataview-prod-1ff9, AIAgent.FoundryAI tags in OpenMetadata test
fixtures, docstrings in connectors/openmetadata/enricher.py) will be a
separate, smaller PR.
Refs #88.
* chore(oss): comprehensive vendor-neutralization (#88 wave 2 + review fixes)
PR #94 review found that the original wave-1 grep was scoped wrong and
many leaks survived. This commit closes wave 1 properly AND folds in all
wave-2 anonymization in a single pass — easier to review than two PRs.
Wave-1 review-fix corrections:
- Caddyfile: scripts/grpn/agnes-tls-rotate.sh → scripts/ops/ (the original
wave-1 grep filter excluded extensionless files like Caddyfile).
- CHANGELOG bullet rewritten — original wording implied an in-repo migration
for infra/modules/customer-instance/, which is wrong (the TF module embeds
the script inline via heredoc, never sourced from scripts/grpn/). Now
flags downstream consumer infra repos only.
- infra/modules/customer-instance/variables.tf: Czech docstring with `grpn`
example → English description with `acme, example` placeholders.
Wave-2 anonymization:
- Code docstrings (connectors/openmetadata/{client,transformer,enricher}.py,
src/catalog_export.py, scripts/duckdb_manager.py): prj-grp-… →
my-bq-project / prj-example-1234, AIAgent.FoundryAI → AIAgent.MyAgent,
FoundryAIDataModel → AnalyticsDataModel.
- Test fixtures (4 files): same set of replacements — 157 tests still pass.
- .github/workflows/keboola-deploy.yml: "Groupon-side dev VMs" comment →
generic "per-developer dev VMs".
- docs/auth-groups.md + scripts/debug/probe_google_groups.py:
kids-ai-data-analysis project name → acme-internal-prod placeholder.
- 5 planning/spec docs under docs/superpowers/{plans,specs}/2026-04-21-*:
hardcoded IPs (34.77.94.14, 34.77.102.61) → <dev-vm-ip>/<prod-vm-ip>;
GRPN/Groupon → Acme/another-customer; prj-grp-… → prj-example-….
- scripts/switch-dev-vm.sh deleted — hackathon-era helper hardcoded to a
specific shared dev VM. Per-developer dev VMs are the supported pattern.
Final grep `groupon|grpn|foundryai|prj-grp|groupondev|34\.77\.(94|102)\.…|kids-ai-data`
returns zero hits (excluding CHANGELOG.md historical entries).
CHANGELOG entry expanded to document both waves under one bullet, with
the BREAKING (ops) clarification about the TF module being unaffected.
Refs review of #94, closes #88.
* fix(oss): close remaining #94 review-2 findings (Czech, padak refs, CHANGELOG)
Reviewer of PR #94 round 2 caught 5 remaining items the wave-2 pass missed:
1. infra/modules/customer-instance/variables.tf had Czech descriptions on
8 more variables. Previous review only flagged line 19; this round
audited the rest. Translated lines 2, 28, 42-46 (heredoc), 60, 65, 71,
78, 84 to English. Same review concern: a Terraform module that is
the customer-facing API surface in Czech is unfit for OSS distribution.
2. infra/modules/customer-instance/outputs.tf had Czech descriptions on
four outputs. Same fix.
3. docs/padak-security.md referenced a private repo (padak/keboola_agent_cli#206)
in two places. Replaced with generic 'tracked upstream in the auth-CLI repo'
per CLAUDE.md vendor-agnostic rule (no cross-refs to private repos).
4. scripts/fetch-env-from-secrets.sh:41 had a Czech comment.
Translated.
5. CHANGELOG cosmetic: bullet said 'AIAgent.FoundryAI -> AIAgent.MyAgent'
but the actual code uses both MyAgent (in docstrings) and Example
(in test fixtures). Reworded to mention both targets.
Final grep across all shipping file types (.md, .py, .yml, .yaml, .sh,
Makefile, .json, .tf, .tpl, Caddyfile, .toml) for groupon|grpn|foundryai|
prj-grp|groupondev|34.77.94.14|34.77.102.61|kids-ai-data|padak/keboola_agent_cli
returns ZERO hits (excluding CHANGELOG.md). Czech-diacritic grep across
.tf/.toml/Caddyfile/Makefile/.yml returns ZERO hits.
157/157 OpenMetadata + DuckDB tests still pass.
* fix(oss): close #94 round-3 leaks (env.template, instance.yaml.example, padak typo)
Round-3 reviewer caught two MUST-FIX leaks the round-2 grep missed
(grep was scoped to extensions that did not include .template / .example
suffixes — the audit was right, the previous grep was not paranoid enough):
1. config/instance.yaml.example:114 — '(optional - Groupon-specific)' brand
leak in a shipping config example. Replaced with '(optional)'.
2. config/.env.template:68 — stale path 'scripts/grpn/agnes-tls-rotate.sh'
in operator-facing env-template comment. The script lives at
scripts/ops/ now (commit 16a85cc); this comment had been pointing
operators at a non-existent path.
3. docs/padak-security.md:188 — phrase duplication 'tracked in tracked
upstream' from a sloppy substitution in round-2. Trivial wording fix.
Final paranoid grep across .md/.py/.yml/.yaml/.sh/Makefile/.json/.tf/.tpl/
Caddyfile/.toml/.template/.example/.env* with the full token set
(groupon|grpn|foundryai|prj-grp|groupondev|34\.77\.94\.14|34\.77\.102\.61|
kids-ai-data|padak/keboola_agent_cli) returns ZERO hits, excluding
CHANGELOG.md historical entries.
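The scoping mistake described above (extension filters that miss extensionless files and `.template` / `.example` suffixes) can be sketched as follows. The token set and file types are taken from this message; the function names, the case-sensitive match, and the exclusion handling are assumptions, not the actual audit command.

```python
import re
from pathlib import Path

# Token set quoted in the commit message above.
LEAK_PATTERN = re.compile(
    r"groupon|grpn|foundryai|prj-grp|groupondev|34\.77\.94\.14|34\.77\.102\.61|"
    r"kids-ai-data|padak/keboola_agent_cli"
)
SUFFIXES = {".md", ".py", ".yml", ".yaml", ".sh", ".json", ".tf", ".tpl",
            ".toml", ".template", ".example"}
EXACT_NAMES = {"Makefile", "Caddyfile"}  # extensionless files need a name match


def is_in_scope(path: Path) -> bool:
    """True when the file belongs to the audited set."""
    name = path.name
    return name in EXACT_NAMES or name.startswith(".env") or path.suffix in SUFFIXES


def find_leaks(root: Path, exclude=("CHANGELOG.md",)) -> list:
    """Return (relative_path, line_number) for every line matching a token."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.name in exclude or not is_in_scope(path):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if LEAK_PATTERN.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

Matching on `path.name` as well as `path.suffix` is the design point: `Path("Caddyfile").suffix` is empty, so a suffix-only filter silently skips it.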
* fix(oss): #94 round-4 — QUICKSTART.md + rename padak-security.md
Devin Review caught two findings on the latest round-3 commit:
1. docs/QUICKSTART.md:67 still pointed users at the deleted
scripts/switch-dev-vm.sh. A Quickstart user following step-by-step
would hit a missing-file error at the final step. Replaced with the
inline gcloud-ssh equivalent that the Removed bullet documents.
2. docs/padak-security.md filename retains the personal identifier
'padak'. The PR fixed the body content (replaced
padak/keboola_agent_cli#206 references with generic wording) but
missed the filename. Renamed to docs/security-audit-2026-04.md
(date-anchored, vendor-neutral). Updated the historical CHANGELOG
link to point at the new path with an inline note about the rename.
* fix(oss): redact remaining hardcoded IPs from planning docs + remove default email
Devin Review caught two more leaks:
1. scripts/fetch-env-from-secrets.sh line 16 had a hardcoded
personal-email default (zdenek.srotyr@keboola.com). Replaced with a
bash ':?' parameter expansion, so the script aborts unless
SEED_ADMIN_EMAIL is explicitly set. That is safer than carrying any
specific identity.
2. Planning docs still had 35.195.96.98 and 34.62.223.189 (legacy
prod/dev IPs) that the round-1 IP-replace pattern missed (it only
targeted 34.77.x.x). Generic regex redaction across all five
planning docs replaces every public IP with <redacted-ip>,
preserving private/loopback/IAP ranges.
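A sketch of the generic IP redaction described in item 2, using only the standard library. The placeholder string is from this message; the candidate regex, the function name, and the IAP range constant (35.235.240.0/20, Google's IAP TCP forwarding range) are assumptions about how the preserved ranges were defined.

```python
import ipaddress
import re

IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
IAP_RANGE = ipaddress.ip_network("35.235.240.0/20")  # assumed definition of "IAP ranges"


def redact_public_ips(text: str, placeholder: str = "<redacted-ip>") -> str:
    """Replace public IPv4 addresses, keeping private, loopback, and IAP ones."""

    def _sub(match: re.Match) -> str:
        raw = match.group(0)
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            return raw  # looks like an IP but is not one (e.g. 999.1.1.1)
        if addr.is_private or addr.is_loopback or addr in IAP_RANGE:
            return raw
        return placeholder

    return IPV4_CANDIDATE.sub(_sub, text)
```

Classifying each match with `ipaddress` rather than listing prefixes avoids the round-1 failure mode, where a pattern targeting only 34.77.x.x missed addresses in other blocks.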
* chore(deploy): trust proxy headers + document HTTPS env vars
- uvicorn: add --proxy-headers --forwarded-allow-ips='*' so the app honors
X-Forwarded-Proto/Host from a TLS-terminating reverse proxy (Caddy,
Cloudflare Tunnel, nginx, LB). Without this the app saw every request as
plain HTTP and built redirect/OAuth URLs from the raw Host, which is
fragile behind a proxy.
- .env.template: document DOMAIN (enables Secure cookie flag) and new
SERVER_URL (deterministic base URL for OAuth callbacks and external
links). Grouped under a dedicated HTTPS / REVERSE PROXY section.
* chore(deploy): add proxy header flags to Dockerfile CMD and Kamal config
Matches the docker-compose changes so non-compose deployments (docker run,
Kubernetes, ECS, Kamal) also trust X-Forwarded-Proto/X-Forwarded-For.
* fix(auth): align Google OAuth cookie Secure flag with password/email providers
Google OAuth set the access_token cookie Secure flag based on the TESTING env
var, while password and email providers use DOMAIN. This meant the DOMAIN
env var (now documented in config/.env.template) did not actually control
Secure for Google cookies. Align all three providers on DOMAIN so the
documented behavior holds consistently.
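The alignment can be pictured as one shared helper that every provider calls. This is a sketch under the assumption that a non-empty DOMAIN means HTTPS is in use; the helper names and cookie attributes other than `secure` are hypothetical.

```python
import os
from typing import Mapping, Optional


def cookie_secure(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Secure flag is driven by DOMAIN only; TESTING is deliberately ignored."""
    env = os.environ if environ is None else environ
    return bool(env.get("DOMAIN", "").strip())


def access_token_cookie_kwargs(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Keyword arguments shared by the google, password, and email providers."""
    return {"key": "access_token", "httponly": True, "secure": cookie_secure(environ)}
```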
* dryrun: intentional failing test (will be reverted)
* feat(auth): optional SEED_ADMIN_PASSWORD to pre-hash seed admin (dev helper)
Terraform gains enable_seed_password + seed_admin_password (sensitive) vars
on the customer-instance module; when enabled the password is piped via
startup-script into /opt/agnes/.env as SEED_ADMIN_PASSWORD. On first boot
app/main.py argon2-hashes it onto the seed user so the admin can log in
immediately without going through /auth/bootstrap. Never overwrites an
existing password_hash — safe against accidental reset on terraform apply.
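The never-overwrite guard is the part worth illustrating. In this sketch the user record is a plain dict and the hasher is injected, so the argon2 dependency stays out of the example; both choices, and the function name, are assumptions rather than the code in app/main.py.

```python
from typing import Callable, Optional


def apply_seed_password(user: dict, seed_password: Optional[str],
                        hasher: Callable[[str], str]) -> bool:
    """Set password_hash from the seed password once; return True if it was set."""
    if not seed_password:
        return False  # SEED_ADMIN_PASSWORD not provided
    if user.get("password_hash"):
        return False  # never overwrite an existing hash (safe on re-apply)
    user["password_hash"] = hasher(seed_password)
    return True
```

Calling it a second time with a different seed value leaves the stored hash untouched, which is the property that makes a repeated terraform apply harmless.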
* ci(release): build :dev-<slug> on any branch, not just feature/**
Before: only 'feature/**' branches triggered release.yml, so pushing
'zs/my-edit' or 'fix/bug' did not publish an image. dev_instances entry
pinning image_tag = 'dev-zs-my-edit' then crashed VM startup with
'image not found'.
Now: any branch push (except main, which produces :stable) publishes
:dev-<slug>. Slug strips a leading 'feature/' and replaces non-[a-z0-9-]
with '-', keeping existing feature/** behavior identical.
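The slug rule stated above, written out as a small function. The real logic lives in the release workflow; the function name is hypothetical, and whether the workflow lowercases the branch name first is not stated here, so this sketch applies the character class literally.

```python
import re


def image_tag_for_branch(branch: str) -> str:
    """Map a branch name to the image tag it publishes."""
    if branch == "main":
        return "stable"
    slug = branch[len("feature/"):] if branch.startswith("feature/") else branch
    slug = re.sub(r"[^a-z0-9-]", "-", slug)  # anything outside [a-z0-9-] becomes '-'
    return f"dev-{slug}"
```

With this rule `zs/my-edit` yields `dev-zs-my-edit`, the tag the dev_instances example above pins.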
* Revert "dryrun: intentional failing test (will be reverted)"
This reverts commit cf9cc06a7884bb401ff29fc5cb6d8baf84dc3daa.
- SyncSettingsRepository + DatasetPermissionRepository with RBAC
- Script deploy/run/undeploy API with import sandboxing
- User sync settings API with permission checks
- 4 CLI skills (connectors, security, notifications, corporate-memory)
- Kamal production + staging configs
- GitHub Actions CI + deploy workflows
- 91 total tests passing
Add admin curation layer between AI extraction and knowledge distribution.
Admins (km_admin flag in instance.yaml) can approve, reject, mandate, and
revoke knowledge items. Mandatory items distribute to all targeted users
automatically.
Three governance modes (configurable per instance):
- mandatory_only: admin controls everything, no user voting
- admin_curated: admin controls, users vote as feedback signal
- hybrid: mandatory from admin + optional from user voting
Three approval workflows:
- review_queue: nothing published without admin approval
- auto_publish: items go live immediately, admin intervenes retroactively
- threshold: confidence-based auto-publish (Phase 5)
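One way to read the three approval workflows is as a rule for an item's initial status. The workflow names are from the list above and "approved" appears later in this message; the "pending" status, the default threshold value, and the function itself are assumptions (the threshold mode is marked Phase 5 and may work differently).

```python
from enum import Enum


class ApprovalWorkflow(str, Enum):
    REVIEW_QUEUE = "review_queue"
    AUTO_PUBLISH = "auto_publish"
    THRESHOLD = "threshold"


def initial_status(workflow: ApprovalWorkflow, confidence: float = 0.0,
                   threshold: float = 0.8) -> str:
    """Status a freshly extracted knowledge item starts with."""
    if workflow is ApprovalWorkflow.AUTO_PUBLISH:
        return "approved"  # live immediately, admin intervenes retroactively
    if workflow is ApprovalWorkflow.THRESHOLD and confidence >= threshold:
        return "approved"  # confidence-based auto-publish
    return "pending"  # waits for admin approval
```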
Includes:
- 9 admin action functions (approve/reject/mandate/revoke/edit/batch/...)
- 11 new admin API endpoints under /api/corporate-memory/admin/
- Immutable audit log (audit.jsonl)
- Audience targeting via groups
- Automatic migration of existing items to "approved" status
- km_admin_required auth decorator
- 69 tests covering all governance logic
- Backward compatible: no config = legacy wiki behavior
Replace hardwired Anthropic API calls with a pluggable provider system.
Each deployment configures its AI provider in instance.yaml — switching
between Anthropic, LiteLLM, OpenRouter, or any OpenAI-compatible proxy
is a config change, not a code change.
New connectors/llm/ module:
- StructuredExtractor Protocol with extract_json() interface
- AnthropicExtractor: direct Anthropic SDK with retry + backoff
- OpenAICompatExtractor: any OpenAI-compatible proxy with three-layer
structured output fallback (json_schema -> json_object -> prompt)
- Configurable structured_output policy (strict/json/auto)
- Custom exception hierarchy (auth/rate_limit/timeout/format/refusal)
- Zero secrets in logs: no API keys, prompts, or responses logged
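A sketch of the three-layer structured-output fallback (json_schema, then json_object, then prompt). Only the `StructuredExtractor` name, `extract_json()`, and the layer order come from this message; the method signature, the `call` transport, and `FormatError` are hypothetical stand-ins for the module's real types.

```python
import json
from typing import Callable, Protocol, Sequence


class StructuredExtractor(Protocol):
    def extract_json(self, prompt: str, schema: dict) -> dict: ...


class FormatError(Exception):
    """Raised when no layer yields a JSON object."""


def extract_with_fallback(call: Callable[[str, str, dict], str], prompt: str, schema: dict,
                          modes: Sequence[str] = ("json_schema", "json_object", "prompt")) -> dict:
    """Try each output mode in order and return the first parsed JSON object."""
    last_error = None
    for mode in modes:
        try:
            parsed = json.loads(call(mode, prompt, schema))
        except (FormatError, json.JSONDecodeError) as exc:
            last_error = exc  # fall through to the next, looser mode
            continue
        if isinstance(parsed, dict):
            return parsed
        last_error = FormatError(f"mode {mode!r} returned non-object JSON")
    raise FormatError("all structured-output modes failed") from last_error
```

Note that the sketch logs nothing, in keeping with the zero-secrets rule: neither the prompt nor the raw response leaves the function except as the parsed result.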
Reviewed by: Google Gemini, Claude Sonnet, OpenAI GPT-5.4.
Security audit passed with all critical findings resolved.
Add src/remote_query.py CLI module enabling the AI agent to run SQL
queries spanning local Parquet tables and remote BigQuery tables in a
single DuckDB session on the server. Two-phase protocol: BQ sub-queries
(--register-bq) fetch filtered/aggregated data, then DuckDB SQL (--sql)
joins everything.
Safety: COUNT(*) pre-check, memory estimation (2GB cap), row limits
(500K per BQ sub-query, 100K final result).
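The safety limits can be sketched as a guard that runs after the COUNT(*) pre-check and before any rows are fetched. The limit values are from this message; treating 2GB as 2 GiB, estimating memory as rows times an average row size, and all names are assumptions about src/remote_query.py.

```python
MAX_BQ_ROWS = 500_000          # per BQ sub-query
MAX_RESULT_ROWS = 100_000      # final DuckDB result
MAX_MEMORY_BYTES = 2 * 1024**3  # 2GB cap, assumed to mean 2 GiB


class LimitExceeded(Exception):
    pass


def check_bq_subquery(row_count: int, avg_row_bytes: int) -> int:
    """Validate a sub-query using its COUNT(*) result; return the memory estimate."""
    if row_count > MAX_BQ_ROWS:
        raise LimitExceeded(f"{row_count} rows exceeds {MAX_BQ_ROWS}")
    estimate = row_count * avg_row_bytes
    if estimate > MAX_MEMORY_BYTES:
        raise LimitExceeded(f"estimated {estimate} bytes exceeds {MAX_MEMORY_BYTES}")
    return estimate


def check_final_result(row_count: int) -> None:
    if row_count > MAX_RESULT_ROWS:
        raise LimitExceeded(f"{row_count} rows exceeds {MAX_RESULT_ROWS}")
```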
Changes:
- New src/remote_query.py with CLI, BQ registration, output formatting
- Add bq_entity_type field to TableConfig (view vs table routing)
- Extract create_local_views() from duckdb_manager.py for reuse
- Update claude_md_template.txt with remote query agent instructions
- Update example configs with remote_query section and docs
- 52 new tests (42 remote_query + 10 bq_entity_type), all passing
- New sync_schedule and profile_after_sync fields in TableConfig
(formats: "every 15m", "every 1h", "daily 05:00")
- New src/scheduler.py with schedule evaluation logic (is_table_due)
- New --scheduled mode in data_sync.py: only syncs tables that are due,
respects profile_after_sync flag, auto-restarts webapp after profiling
- Systemd timer+service for data-refresh (every 15 min)
- Systemd timer+service for catalog-refresh (every 15 min)
- deploy.sh enables new timers automatically
- Complete table config reference in data_description.md.example
- 58 new scheduler tests
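A sketch of how the three schedule formats might be evaluated. The format strings and the `is_table_due` name are from this message; the signature, the treatment of a never-synced table as due, and the "most recent daily slot" rule are assumptions about src/scheduler.py.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

_EVERY = re.compile(r"^every (\d+)([mh])$")
_DAILY = re.compile(r"^daily (\d{2}):(\d{2})$")


def is_table_due(schedule: str, last_sync: Optional[datetime], now: datetime) -> bool:
    """Decide whether a table with the given sync_schedule should sync now."""
    if last_sync is None:
        return True  # never synced
    m = _EVERY.match(schedule)
    if m:
        amount = int(m.group(1))
        interval = timedelta(minutes=amount) if m.group(2) == "m" else timedelta(hours=amount)
        return now - last_sync >= interval
    m = _DAILY.match(schedule)
    if m:
        # Most recent scheduled slot at or before `now`.
        slot = now.replace(hour=int(m.group(1)), minute=int(m.group(2)),
                           second=0, microsecond=0)
        if slot > now:
            slot -= timedelta(days=1)
        return last_sync < slot
    raise ValueError(f"unsupported sync_schedule: {schedule!r}")
```

Because the timer fires every 15 minutes, a daily table becomes due on the first tick after its slot and stays not-due once that sync has been recorded.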
Instead of hardcoded Python constants, load profiler settings from config:
- instance.yaml: profiler section with all parameters
- Defaults: fallback to sensible defaults if config not found
- Centralized: all profiler tuning in one place, no code changes needed
Add OpenMetadata REST API connector and enricher to merge table/column metadata
from OpenMetadata catalog at sync and query time.
Changes:
- connectors/openmetadata/client.py: HTTP client for OM API
- connectors/openmetadata/enricher.py: Data enrichment with TTL cache
- tests/test_openmetadata_*: Unit tests for client and enricher
- src/config.py: Add catalog_fqn field to TableConfig
- src/data_sync.py: Use enricher in _generate_schema_yaml (catalog > BQ API > data_description.md)
- webapp/app.py: Initialize enricher, enrich catalog data with tags/tier/owners/url
- config/instance.yaml.example: Document openmetadata section
Features:
- FQN auto-derivation: bigquery.{table.id}
- TTL cache (default 1h) to avoid repeated API calls
- Graceful degradation: disabled if token missing, silent on HTTP errors
- Column description priority: catalog > BQ API > (none)
- Table description priority: catalog > data_description.md
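The feature list above combines three small mechanisms: FQN derivation, a TTL cache, and first-non-empty priority. A minimal sketch, assuming an injectable clock for testing; the class and function names are hypothetical and not the enricher's actual API.

```python
import time
from typing import Any, Callable, Optional


def derive_fqn(table_id: str) -> str:
    """FQN auto-derivation: bigquery.{table.id}."""
    return f"bigquery.{table_id}"


class TTLCache:
    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None or entry[0] <= self._clock():
            self._items.pop(key, None)  # expired or missing
            return None
        return entry[1]

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


def pick_description(*candidates: Optional[str]) -> Optional[str]:
    """Return the first non-empty candidate, e.g. (catalog, bq_api)."""
    for candidate in candidates:
        if candidate:
            return candidate
    return None
```

Graceful degradation falls out of the priority helper: when the catalog lookup fails and yields `None`, the next source in the argument list is used.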
BigQuery connector that syncs BQ tables to local Parquet files via PyArrow
(no CSV intermediate step). Supports full refresh, timestamp-based
incremental (via incremental_column), and partition-based sync strategies.
- connectors/bigquery/client.py: BQ API wrapper with ADC auth, parameterized
queries, metadata cache, cross-project support (job project != data project)
- connectors/bigquery/adapter.py: DataSource implementation with merge/dedup
- src/config.py: Add incremental_column field to TableConfig
- 72 unit tests (mocked, no GCP SDK required)
Move hardcoded Keboola SVG logo from 4 templates into config.
Templates now use {{ config.LOGO_SVG | safe }}.
Default falls back to Keboola logo when not configured.
- Support comma-separated domains in auth.allowed_domain config
- Use full email as system username (user@domain.com -> user_domain_com)
to avoid collisions with reserved names and across domains
- Update both auth providers (google, email) for multi-domain display
- Add tests for username generation and update email auth tests
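A sketch of the two rules in this change. The `user@domain.com -> user_domain_com` mapping and the comma-separated `auth.allowed_domain` value are from this message; lowercasing, mapping every other non-alphanumeric character to `_`, and the function names are assumptions.

```python
import re


def parse_allowed_domains(raw: str) -> list:
    """Split a comma-separated auth.allowed_domain value into clean domains."""
    return [d.strip().lower() for d in raw.split(",") if d.strip()]


def is_allowed(email: str, raw_allowed: str) -> bool:
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in parse_allowed_domains(raw_allowed)


def system_username(email: str) -> str:
    """Derive a system username from the full email address."""
    return re.sub(r"[^a-z0-9]", "_", email.lower())
```

Keeping the domain in the username means `user@a.com` and `user@b.com` no longer collide, and a local part such as `root` can no longer shadow a reserved name.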
New pluggable auth provider that sends passwordless sign-in links.
Works with domain restriction (same as Google OAuth). Falls back to
showing the link in browser when SMTP is not configured (dev mode).
H1 - Sanitize dev_docs/ for public release:
- Replace all real employee names with generic placeholders
(padak->admin1, matejkys->admin2, dasa->admin3, petr->john, etc.)
- Replace GCP project ID (kids-ai-data-analysis -> your-gcp-project)
- Replace server hostname (data-broker-for-claude -> your-server)
- Replace real IP address (34.88.8.46 -> YOUR_SERVER_IP)
- Replace internal FQDN with placeholder
- Covers: security.md, server.md, disaster-recovery.md, desktop-app.md,
session_explore.md, plan-rsync-fix.md, draft/*.md
H3 - webapp-setup.sh: validate sudoers syntax BEFORE copying to /etc/sudoers.d
- Prevents broken sudo if syntax is invalid
- Uses install -m 440 for atomic copy with correct permissions
M1 - setup.sh: deploy user created with /usr/sbin/nologin instead of /bin/bash
- CI/CD service account does not need interactive shell
M2 - config/loader.py: warn on missing env vars, validate webapp_secret_key
- _resolve_env_refs now logs warnings for unset ${ENV_VAR} references
- _validate_config checks auth.webapp_secret_key is non-empty
- Prevents Flask signing sessions with empty secret key
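A sketch of the M2 behavior: resolve `${ENV_VAR}` references, warn on unset ones, then reject an empty secret key. The function names mirror the private helpers mentioned above without the leading underscore; substituting an empty string for an unset variable and raising `ValueError` are assumptions about config/loader.py.

```python
import logging
import os
import re
from typing import Any, Mapping, Optional

logger = logging.getLogger("config.loader")
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_refs(value: Any, environ: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${NAME} in strings; warn when NAME is unset."""
    env = os.environ if environ is None else environ
    if isinstance(value, dict):
        return {k: resolve_env_refs(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(v, env) for v in value]
    if isinstance(value, str):
        def _sub(match: re.Match) -> str:
            name = match.group(1)
            if name not in env:
                logger.warning("environment variable %s is not set", name)
                return ""  # assumed fallback for unset references
            return env[name]
        return _ENV_REF.sub(_sub, value)
    return value


def validate_config(config: dict) -> None:
    """Fail fast instead of signing sessions with an empty secret key."""
    if not config.get("auth", {}).get("webapp_secret_key"):
        raise ValueError("auth.webapp_secret_key must be non-empty")
```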
All 118 tests pass.
Phase 1 - Internal reference cleanup:
- Delete dev_docs/meetings/ (internal meeting notes/transcripts)
- Replace hardcoded usernames (padak/matejkys/dasa) with deploy/generic
- Replace "Internal AI Data Analyst" with "AI Data Analyst"
- Replace keboola/internal_ai_data_analyst URLs with your-org/ai-data-analyst
- Replace /tmp/keboola_load/ with /tmp/data_analyst_staging/ in dev_docs
Phase 2 - Deployment hardening:
- Tighten sudoers wildcards to explicit paths (visudo, sudoers cp)
- setup.sh creates all groups (data-ops, dataread, data-private) and deploy user
- webapp-setup.sh copies sudoers-webapp from repo instead of inline definition
- deploy.sh conditional copy for data_description.md (not in git for OSS)
- deploy.sh ownership changed to deploy:data-ops for /data/{scripts,docs,examples}
Phase 3 - Config and misc:
- Add ${ENV_VAR} interpolation to config/loader.py
- Expand config/instance.yaml.example with all sections (admins, deployment, auth, etc.)
- Create config/.env.template for secret values
- Add MIT LICENSE
- Fix .gitignore: add .venv/, docs/data_description.md
- Fix README.md: CSV status Planned, remove metrics/, update license text
- Translate Czech comments in requirements.txt to English
- Fix test_account_service.py: mock username mapping instead of relying on instance config
All 118 tests pass.
Open-source AI data analyst platform extracted from internal repo.
Includes data sync engine, Keboola adapter, Flask web portal,
server deployment scripts, and configuration templates.