agnes-the-ai-analyst/src
Petr b99ec576ca Add self-service data onboarding system
Table Registry as central source of truth (JSON) with atomic writes,
optimistic locking, audit logging, and data_description.md generation.
Existing readers (config.py, profiler.py) need zero changes.
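
A minimal sketch of the atomic-write idea, assuming the registry is a single JSON file that is replaced as a whole. The helper name atomic_write_json and the payload layout are hypothetical and not taken from src/table_registry.py:

```python
import json
import os
import tempfile


def atomic_write_json(path, payload):
    """Write payload as JSON so readers never observe a half-written file.

    Hypothetical helper: the real table_registry.py may do this differently.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers that open the file at any point see either the old or the new content, which fits the claim that existing readers need no changes.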

Phase 1 - Discovery API:
  - discover_tables() on DataSource ABC + Keboola implementation
  - admin_required decorator with server-side recomputation
  - GET /api/admin/discover-tables endpoint
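
The discovery contract could look roughly like the sketch below. Only DataSource and discover_tables() are named in this commit; the return shape, InMemorySource, and discover_unregistered are assumptions made for illustration:

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Abstract base for connectors (sketch; the real ABC has more methods)."""

    @abstractmethod
    def discover_tables(self):
        """Return a list of dicts describing tables available upstream."""


class InMemorySource(DataSource):
    # Hypothetical stand-in for the Keboola implementation.
    def __init__(self, tables):
        self._tables = tables  # mapping of table id -> display name

    def discover_tables(self):
        return [
            {"id": table_id, "name": name}
            for table_id, name in sorted(self._tables.items())
        ]


def discover_unregistered(source, registered_ids):
    """Assumed helper: list upstream tables that are not yet in the registry."""
    return [t for t in source.discover_tables() if t["id"] not in registered_ids]
```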

Phase 2 - Table Registry:
  - src/table_registry.py with CRUD, validation, migration from MD
  - Admin API: register/update/unregister with version locking
  - DELETE cascade cleans up per-user subscriptions
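
Version locking and the delete cascade can be sketched in memory as follows. The class, method names, and the single global version counter are assumptions; the commit only states that writes are version-locked and that unregistering removes per-user subscriptions:

```python
class VersionConflict(Exception):
    """Raised when a caller writes against a stale registry version."""


class RegistrySketch:
    def __init__(self):
        self.version = 0
        self.tables = {}
        self.subscriptions = {}  # user -> set of subscribed table ids

    def _check(self, expected_version):
        # Optimistic locking: reject the write if someone else changed the
        # registry since the caller last read it.
        if expected_version != self.version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}"
            )

    def register(self, table_id, meta, expected_version):
        self._check(expected_version)
        self.tables[table_id] = dict(meta)
        self.version += 1
        return self.version

    def unregister(self, table_id, expected_version):
        self._check(expected_version)
        del self.tables[table_id]
        # Cascade: drop the table from every user's subscription set.
        for subscribed in self.subscriptions.values():
            subscribed.discard(table_id)
        self.version += 1
        return self.version
```

A client that receives VersionConflict would re-read the registry and retry with the fresh version.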

Phase 3 - Auto-Profiling:
  - profile_changed_tables() for incremental profiling
  - Non-fatal hook in sync_all() after successful sync
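
"Non-fatal" here presumably means that a profiling failure must not fail the sync. A sketch under that assumption, with both steps passed in as callables; the real sync_all() signature is not shown in this commit:

```python
import logging

logger = logging.getLogger("data_sync_sketch")


def sync_all_sketch(sync_tables, profile_changed_tables):
    """Run the sync, then profile what changed without failing the sync.

    sync_tables: callable returning the list of table ids that changed.
    profile_changed_tables: callable taking that list.
    Both parameters are assumptions made for this example.
    """
    changed = sync_tables()  # errors here still propagate to the caller
    try:
        profile_changed_tables(changed)
    except Exception:
        # Profiling is best-effort: log the traceback and keep the sync result.
        logger.exception("Auto-profiling failed; sync result is kept")
    return changed
```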

Phase 4 - Per-Table Subscriptions:
  - table_mode (all/explicit) with per-table toggles
  - GET/POST /api/table-subscriptions endpoints
  - Subscription status in catalog and dashboard views
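
One way to read the two modes, assuming "all" ignores the per-table toggles and "explicit" includes only tables switched on. The function name and the toggle format are hypothetical:

```python
def effective_tables(registered, table_mode, toggles):
    """Return the sorted table ids a user should receive.

    registered: iterable of registered table ids.
    table_mode: "all" or "explicit".
    toggles: mapping of table id -> bool, consulted only in explicit mode.
    """
    if table_mode == "all":
        return sorted(registered)
    if table_mode == "explicit":
        return sorted(t for t in registered if toggles.get(t, False))
    raise ValueError(f"unknown table_mode: {table_mode!r}")
```

Filtering against the registered set means a stale toggle for an unregistered table has no effect.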

Phase 5 - Smart Sync:
  - Python-generated rsync filter files (not shell YAML parsing)
  - sync_data.sh uses --filter="merge ..." for explicit mode
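
A sketch of generating such a filter file in Python, assuming one <table>.parquet file per table at the root of the transfer. The function name and file layout are assumptions; the "+ " and "- " prefixes are standard rsync include and exclude filter rules:

```python
def build_rsync_filter(table_names):
    """Build rsync merge-file content that includes only the given tables."""
    lines = []
    for name in sorted(set(table_names)):
        # Refuse names that rsync would treat as patterns or path segments.
        if not name or any(ch in name for ch in "*?[]/\n"):
            raise ValueError(f"unsafe table name: {name!r}")
        lines.append(f"+ /{name}.parquet")
    # Exclude everything that was not explicitly included above.
    lines.append("- *")
    return "\n".join(lines) + "\n"
```

Written to a file, the result is the kind of input that --filter="merge ..." expects, and the shell script no longer has to parse any configuration itself.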

Phase 6 - Admin UI:
  - /admin/tables with discovery, registration modal, registry management
  - Vanilla JS, matching existing design system
2026-03-09 14:25:37 +01:00
__init__.py         Extract Keboola into connectors/keboola module   2026-03-09 12:22:16 +01:00
config.py           Extract Keboola into connectors/keboola module   2026-03-09 12:22:16 +01:00
data_sync.py        Add self-service data onboarding system          2026-03-09 14:25:37 +01:00
parquet_manager.py  Initial commit: OSS data distribution platform   2026-03-08 23:31:28 +01:00
profiler.py         Add self-service data onboarding system          2026-03-09 14:25:37 +01:00
table_registry.py   Add self-service data onboarding system          2026-03-09 14:25:37 +01:00