Commit graph

22 commits

Author SHA1 Message Date
Petr
eb5264b903 Make header logo configurable via instance.yaml logo_svg
Move hardcoded Keboola SVG logo from 4 templates into config.
Templates now use {{ config.LOGO_SVG | safe }}.
Default falls back to Keboola logo when not configured.
2026-03-11 13:08:26 +01:00
Petr
d3c7f7feea Fix activity-center 500 error: provide default data structure
The activity_center view was passing an empty dict but the template
expected nested keys (executive_summary, maturity_roadmap, etc).
Added _build_activity_data() that returns properly structured defaults.
2026-03-11 12:59:40 +01:00
Petr
91a05a2c2b Add auth.disabled_providers config to skip auth providers
Reads disabled_providers list from instance.yaml auth section.
Listed providers are skipped during auto-discovery.
2026-03-11 12:54:23 +01:00
Petr
954aa0f17e Add theme color support via instance.yaml
Allow instances to override primary CSS color variables
through theme section in instance.yaml config.
2026-03-11 00:42:10 +01:00
Petr
ad3b94c168 Add Business Metrics card to dashboard 2026-03-10 22:52:48 +01:00
Petr
34fde746e7 Remove hardcoded Jira and Telemetry cards from dashboard 2026-03-10 22:50:22 +01:00
Petr
49559fba1b Remove hardcoded Jira and Telemetry cards from catalog
These Keboola-specific data source cards don't belong in the OSS repo.
The catalog now shows only dynamic content: Core Business Data (from
data_description.md) and Business Metrics (from docs/metrics/*.yml).

Also update auto-install.md with Business Metrics documentation,
pipeline diagram, and expanded checklist.
2026-03-10 22:48:07 +01:00
Petr
5a84473213 Add dynamic Business Metrics with sample e-commerce definitions
Replace hardcoded Keboola-specific metrics card in Data Catalog with
dynamic Jinja template that renders whatever metric YAMLs exist in
docs/metrics/. Add 10 sample e-commerce metric definitions across
4 categories (revenue, customers, marketing, support) that align
with the sample data generator tables.

Key changes:
- MetricParser: new category colors + dynamic sql_* field discovery
- _load_metrics_data(): scans docs/metrics/*/*.yml with prod fallback
- catalog.html: 240 lines hardcoded HTML -> 35 lines Jinja loop
- metric_modal.js: regex-based category class removal, new categories
- 21 tests validating YAML schema, parser, and loader
2026-03-10 22:38:44 +01:00
Petr
28543d98b1 Fix profiler file_size and catalog stats fallback
- Profiler computes file_size_mb from actual parquet files when
  sync_state.json is absent (sample data / no-sync deployments)
- Catalog header falls back to profiles.json for aggregate stats
  (tables count, total rows) when sync_state.json is missing
2026-03-10 22:12:46 +01:00
Petr
1ac868d787 Improve setup instructions for robustness
- Check for existing SSH config entry before overwriting
- Use --no-perms --no-group in rsync (fixes macOS permission errors)
- Explicit mkdir instead of brace expansion (Claude Code compatibility)
- Gracefully handle missing server directories (empty server is OK)
- Conditional steps for setup_views.sh and CLAUDE.md template
2026-03-10 11:29:31 +01:00
Petr
fde1d6fc01 Move Claude Code setup to dashboard, remove step 5 from onboarding
- Get Started page now has 4 steps (folder, SSH key, pubkey, register)
- After account creation, dashboard shows prominent "Set up your local
  environment" CTA with claude command and Copy Setup Instructions
- CTA only visible when user hasn't synced yet (last_sync is empty)
- Bottom banner demoted to subtle secondary style for returning users
2026-03-10 11:18:56 +01:00
Petr
45454ab86a Redesign onboarding: compact single-screen layout with terminal block
- Merge steps 1-3 into a single dark terminal block with copy buttons
- Inline registration form with single-row layout for step 4
- Compact step 5 with Claude Code command and copy button on one line
- Full-width layout (960px) instead of narrow 640px column
- Everything fits on one screen without scrolling
2026-03-10 11:10:19 +01:00
Petr
9c4208bb89 Unify onboarding into single-column stepper with inline registration
Merge the two-column layout (setup steps + registration form) into one
unified flow. Step 4 now contains the registration form inline, creating
a natural top-to-bottom progression through the setup process.
2026-03-10 11:06:11 +01:00
Petr
21af1abb6e Fix setup instructions: add SSH key steps, fix clipboard on HTTP
- Add steps 2-4 (SSH key generation, copy pubkey, create account)
- Fix clipboard copy using textarea fallback for non-HTTPS contexts
- Generate simple plain-text Claude Code prompt instead of full YAML
- Show what Claude will do (SSH, rsync, DuckDB, CLAUDE.md)
2026-03-10 11:00:48 +01:00
Petr
f635195c80 Add multi-domain support and full-email username generation
- Support comma-separated domains in auth.allowed_domain config
- Use full email as system username (user@domain.com -> user_domain_com)
  to avoid collisions with reserved names and across domains
- Update both auth providers (google, email) for multi-domain display
- Add tests for username generation and update email auth tests
2026-03-10 10:50:01 +01:00
Petr
e2ab219171 Add email magic link authentication provider
New pluggable auth provider that sends passwordless sign-in links.
Works with domain restriction (same as Google OAuth). Falls back to
showing the link in browser when SMTP is not configured (dev mode).
2026-03-10 10:39:19 +01:00
Petr
b99ec576ca Add self-service data onboarding system
Table Registry as central source of truth (JSON) with atomic writes,
optimistic locking, audit logging, and data_description.md generation.
Existing readers (config.py, profiler.py) need zero changes.

Phase 1 - Discovery API:
  - discover_tables() on DataSource ABC + Keboola implementation
  - admin_required decorator with server-side recomputation
  - GET /api/admin/discover-tables endpoint

Phase 2 - Table Registry:
  - src/table_registry.py with CRUD, validation, migration from MD
  - Admin API: register/update/unregister with version locking
  - DELETE cascade cleans up per-user subscriptions

Phase 3 - Auto-Profiling:
  - profile_changed_tables() for incremental profiling
  - Non-fatal hook in sync_all() after successful sync

Phase 4 - Per-Table Subscriptions:
  - table_mode (all/explicit) with per-table toggles
  - GET/POST /api/table-subscriptions endpoints
  - Subscription status in catalog and dashboard views

Phase 5 - Smart Sync:
  - Python-generated rsync filter files (not shell YAML parsing)
  - sync_data.sh uses --filter="merge ..." for explicit mode

Phase 6 - Admin UI:
  - /admin/tables with discovery, registration modal, registry mgmt
  - Vanilla JS, matching existing design system
2026-03-09 14:25:37 +01:00
Petr
c6a711aa27 Extract pluggable auth provider system into auth/ package
Replace hardcoded Google OAuth + password auth registration with
auto-discovered auth providers. Each provider in auth/<name>/provider.py
implements AuthProvider ABC and is automatically registered at startup.

- auth/__init__.py: AuthProvider ABC + discover_providers() scanner
- auth/google/: Google OAuth provider (extracted from webapp/auth.py)
- auth/password/: Email/password provider (delegates to webapp/password_auth)
- auth/desktop/: Desktop JWT auth (API-only, not visible on login page)
- webapp/auth.py: stripped to core infra (login_required, /login, /logout)
- webapp/app.py: auto-discovery loop replaces manual blueprint registration
- login.html: dynamic provider buttons via Jinja loop
2026-03-09 13:02:08 +01:00
Petr
f2d3d156e3 Move standalone services from server/ to services/
Extract 4 self-contained services into services/ module:
- server/telegram_bot/ -> services/telegram_bot/
- server/ws_gateway/ -> services/ws_gateway/
- server/corporate_memory/ -> services/corporate_memory/
- server/session_collector.py -> services/session_collector/

Each service now has its own systemd/ directory with .service and .timer files.
deploy.sh updated to auto-discover service units from services/*/systemd/*.

server/ now contains only deployment infrastructure (deploy.sh, setup scripts,
bin/ management tools, sudoers, nginx config).

All imports updated: webapp/app.py, server/bin/ scripts, systemd ExecStart paths.
2026-03-09 12:54:30 +01:00
Petr
86edd27655 Extract Jira into connectors/jira module
Move all Jira-specific code into a self-contained connector module:
- 22 files moved via git mv (transform, service, webhook, scripts,
  systemd units, tests, docs, bin helper)
- All imports updated to use connectors.jira.* paths
- Jira is now conditional: auto-detected via JIRA_DOMAIN env var
- Webapp registers Jira blueprint only when available
- Health service monitors Jira timers only when enabled
- Profiler loads Jira tables dynamically from filesystem
- Sync settings uses config-driven dependency validation
- Renamed keboola_platform_url -> custom_url in transform
- Updated deploy.sh, sudoers-deploy, backfill_gap.sh paths
- Fixed pytest.ini to skip live tests by default
2026-03-09 11:17:50 +01:00
Petr
26c4e0934d OSS cleanup: remove internal references, harden deployment, add config env interpolation
Phase 1 - Internal reference cleanup:
- Delete dev_docs/meetings/ (internal meeting notes/transcripts)
- Replace hardcoded usernames (padak/matejkys/dasa) with deploy/generic
- Replace "Internal AI Data Analyst" with "AI Data Analyst"
- Replace keboola/internal_ai_data_analyst URLs with your-org/ai-data-analyst
- Replace /tmp/keboola_load/ with /tmp/data_analyst_staging/ in dev_docs

Phase 2 - Deployment hardening:
- Tighten sudoers wildcards to explicit paths (visudo, sudoers cp)
- setup.sh creates all groups (data-ops, dataread, data-private) and deploy user
- webapp-setup.sh copies sudoers-webapp from repo instead of inline definition
- deploy.sh conditional copy for data_description.md (not in git for OSS)
- deploy.sh ownership changed to deploy:data-ops for /data/{scripts,docs,examples}

Phase 3 - Config and misc:
- Add ${ENV_VAR} interpolation to config/loader.py
- Expand config/instance.yaml.example with all sections (admins, deployment, auth, etc.)
- Create config/.env.template for secret values
- Add MIT LICENSE
- Fix .gitignore: add .venv/, docs/data_description.md
- Fix README.md: CSV status Planned, remove metrics/, update license text
- Translate Czech comments in requirements.txt to English
- Fix test_account_service.py: mock username mapping instead of relying on instance config

All 118 tests pass.
2026-03-09 07:59:57 +01:00
Petr
c56905d34f Initial commit: OSS data distribution platform
Open-source AI data analyst platform extracted from internal repo.
Includes data sync engine, Keboola adapter, Flask web portal,
server deployment scripts, and configuration templates.
2026-03-08 23:31:28 +01:00