Commit graph

15 commits

Author SHA1 Message Date
ZdenekSrotyr
0320845e2c docs: add metrics workflow instructions to CLAUDE.md 2026-04-10 19:35:33 +02:00
ZdenekSrotyr
40cca627be fix: address Devin review round 4 — bash arithmetic, CalVer max, docs
- smoke-test.sh: replace ((PASS++)) with PASS=$((PASS + 1)) to avoid
  set -e abort when counter is 0 (post-increment evaluates to the old
  value 0, and bash returns exit 1 for ((0)))
- CalVer: use max(N) from existing tags instead of count, safe when
  tags are deleted (e.g. deprecated version cleanup)
- CLAUDE.md: update schema version from v2 to v3

663 tests pass.
2026-04-10 14:39:16 +02:00
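
The CalVer change in the commit above (max(N) instead of a tag count) can be illustrated with a minimal sketch. The tag format `vYYYY.MM.N` and the function name `next_calver_tag` are assumptions made for this example; the log does not show the project's actual scheme or code.

```python
import re

# Hypothetical tag format: vYYYY.MM.N (the real scheme is not shown in the log).
TAG_RE = re.compile(r"^v(\d{4})\.(\d{2})\.(\d+)$")


def next_calver_tag(existing_tags, year, month):
    """Return the next tag for the given month using max(N) + 1."""
    used = []
    for tag in existing_tags:
        m = TAG_RE.match(tag)
        if m and int(m.group(1)) == year and int(m.group(2)) == month:
            used.append(int(m.group(3)))
    # max() stays correct when a tag in the middle has been deleted;
    # len(used) + 1 would hand out a number that is already taken.
    next_n = max(used, default=0) + 1
    return f"v{year:04d}.{month:02d}.{next_n}"


# v2026.04.2 was deleted: counting the two remaining April tags would yield N=3 again.
tags = ["v2026.04.1", "v2026.04.3", "v2026.03.7"]
print(next_calver_tag(tags, 2026, 4))  # v2026.04.4
```

With a count-based scheme the same input would produce `v2026.04.3`, which collides with an existing tag.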
ZdenekSrotyr
0279cc06fa refactor: consolidate deps into pyproject.toml, remove requirements.txt
- All dependencies now in pyproject.toml [project.dependencies]
- Dev/test deps in [project.optional-dependencies] dev and [tool.uv]
- Dockerfile uses uv pip install . from pyproject.toml
- CI uses uv pip install ".[dev]"
- Deleted requirements.txt and requirements-dev.txt
- Updated README, CLAUDE.md install instructions
- Enhanced .dockerignore (exclude tests, docs, infra from image)
2026-04-09 13:17:59 +02:00
ZdenekSrotyr
06e1cf0a8d feat: generic _remote_attach contract for remote DuckDB extension views
Extractors with remote tables now write a _remote_attach table into
extract.duckdb so the orchestrator can re-ATTACH external extensions
at query time. The mechanism is source-agnostic — any connector can use it.

- Keboola extractor writes _remote_attach + creates views on kbc.*
- Orchestrator reads _remote_attach, installs extension, reads token from env
- Graceful degradation: missing token → warning, local tables still work
2026-04-08 18:10:12 +02:00
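
The graceful-degradation rule described in the commit above (missing token leads to a warning while local tables keep working) might look roughly like the following. This is only a sketch: the `RemoteAttach` columns, the `plan_attaches` name, and the `KBC_TOKEN` variable are hypothetical, and the real `_remote_attach` schema is not shown in the log. DuckDB itself is left out so the example runs with the standard library alone.

```python
import os
from typing import NamedTuple


class RemoteAttach(NamedTuple):
    # Hypothetical columns; the real _remote_attach schema is not shown here.
    alias: str
    extension: str
    url: str
    token_env: str


def plan_attaches(rows, env=None):
    """Split rows into attachable sources and warnings for missing tokens."""
    env = os.environ if env is None else env
    ready, warnings = [], []
    for row in rows:
        token = env.get(row.token_env)
        if not token:
            # Degrade gracefully: skip this remote source, keep everything else.
            warnings.append(f"{row.alias}: {row.token_env} not set, skipping remote views")
            continue
        ready.append((row, token))
    return ready, warnings


rows = [RemoteAttach("kbc", "keboola", "https://example.invalid", "KBC_TOKEN")]
ready, warnings = plan_attaches(rows, env={})
print(ready, warnings)
```

An orchestrator following this shape would install and attach only the entries in `ready`, and log each warning without failing the query on local tables.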
ZdenekSrotyr
92fbb88c15 chore: Docker prod config (Python 3.13, no reload), fix utcnow deprecation, update docs 2026-04-08 12:10:47 +02:00
ZdenekSrotyr
5ee12d78e7 refactor: final cleanup — delete legacy auth, clean deps, fix hash, migrate to uv
- Delete root auth/ directory (legacy Flask providers, orphaned)
- Clean requirements.txt: remove Flask, gunicorn, authlib, sendgrid,
  anthropic, openai, argon2-cffi (9 unused deps)
- Fix hash computation in orchestrator: MD5 of parquet mtime+size
  (CLI sync now skips unchanged tables correctly)
- Migrate pip → uv in CLAUDE.md, scripts/init.sh, pyproject.toml
- Sync pyproject.toml dependencies with requirements.txt

578 tests passing.
2026-03-31 19:18:30 +02:00
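
The hash fix in the commit above (MD5 of parquet mtime+size) can be sketched as follows. The function name `parquet_fingerprint` and the exact way mtime and size are combined into one string are assumptions; the log only states which two inputs are hashed.

```python
import hashlib
import os
import tempfile


def parquet_fingerprint(path):
    """MD5 over the file's mtime and size; the file contents are not read."""
    st = os.stat(path)
    # The exact string layout is an assumption; only "mtime+size" is given.
    payload = f"{st.st_mtime_ns}:{st.st_size}".encode()
    return hashlib.md5(payload).hexdigest()


# Demo on a throwaway file: appending data changes the size, so the hash changes.
with tempfile.NamedTemporaryFile(suffix=".parquet", delete=False) as f:
    f.write(b"abc")
    demo_path = f.name
before = parquet_fingerprint(demo_path)
with open(demo_path, "ab") as f:
    f.write(b"def")
after = parquet_fingerprint(demo_path)
os.remove(demo_path)
print(before != after)
```

A fingerprint of this kind is cheap because it never opens the file, which suits a sync step that only needs to skip unchanged tables. It would miss an edit that preserves both size and mtime.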
ZdenekSrotyr
4d1acd014a refactor: remove legacy webapp + add missing tests + housekeeping
Phase A: Close fixed issues (#7, #8, #9), add server/ and user/ to
.gitignore, increase extractor timeout to 30 min.

Phase B: Add 10 new tests — access request lifecycle (4), CLI admin
commands (5), sync subprocess trigger (1). 578 tests passing.

Phase C: Delete entire webapp/ directory (24,800 lines) — legacy Flask
app fully replaced by FastAPI app/. Fix auth providers to use
app.instance_config instead of webapp.config. Update CLAUDE.md.

Delete 6 webapp-only test files. Fix Jira service config imports.
2026-03-31 13:44:06 +02:00
ZdenekSrotyr
9fef90a729 docs: rewrite CLAUDE.md for extract.duckdb architecture
Update project structure, architecture diagram, key implementation
details, development commands, and extensibility docs.
Add extract service to docker-compose.yml for one-shot extraction.
2026-03-31 07:52:44 +02:00
Petr
e35e602c59 Update CLAUDE.md with metrics, table registry, password auth
Add docs/metrics/ to project structure, Business Metrics and Table
Registry patterns to implementation details, password auth provider
to extensibility section, fix sync command for returning users.
2026-03-10 23:05:03 +01:00
Petr
44bf43535b Add sample data generator with 9 e-commerce tables
Synthetic data generator for demo/testing without real data adapter:
- 9 tables: customers, products, campaigns, web_sessions, web_leads,
  orders, order_items, payments, support_tickets
- 4 size presets: xs (1MB), s (15MB), m (150MB), l (1.5GB)
- Realistic patterns: seasonality, Pareto customer distribution,
  segment-based behavior, referential integrity
- Deterministic output via --seed parameter

Also: docs/sample-data.md, updated auto-install.md with Step 6,
updated CLAUDE.md (email auth provider, dual-repo architecture)
2026-03-10 12:31:14 +01:00
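
Two properties listed in the commit above, deterministic output via a seed and a Pareto customer distribution, can be combined in a short sketch. The function name, the shape parameter 1.16, and the use of `random.choices` are assumptions made for illustration; the generator's actual code is not part of this log.

```python
import random


def generate_order_customer_ids(n_orders, n_customers, seed):
    """Assign each order to a customer with a heavy-tailed (Pareto) skew."""
    rng = random.Random(seed)  # private RNG, so the global random state is untouched
    # One Pareto-distributed weight per customer; alpha=1.16 is an assumed shape.
    weights = [rng.paretovariate(1.16) for _ in range(n_customers)]
    customer_ids = list(range(1, n_customers + 1))
    # Every order references an existing customer id (referential integrity).
    return rng.choices(customer_ids, weights=weights, k=n_orders)


run_a = generate_order_customer_ids(1000, 50, seed=42)
run_b = generate_order_customer_ids(1000, 50, seed=42)
print(run_a == run_b)  # True: same seed, same output
```

Using a dedicated `random.Random(seed)` instance rather than `random.seed()` keeps the output reproducible even if other code draws from the global generator.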
Petr
7c9007a8f9 Update docs for modular architecture (auth/, services/, scripts/)
Add auth providers, standalone services, and service patterns
to project structure in README, ARCHITECTURE, and CLAUDE.md.
Reflects the completed extraction of auth, telegram bot,
ws gateway, corporate memory, and session collector.
2026-03-09 13:11:40 +01:00
Petr
2d3f127e58 Update paths in docs and sudoers after services/ extraction
All references to server/telegram_bot/, server/ws_gateway/,
server/corporate_memory/, server/session_collector* updated
to their new locations under services/.
2026-03-09 13:02:13 +01:00
Petr
38b86127ed Branding cleanup: remove Keboola-specific references from docs and config
- server/deploy.sh: KEBOOLA_ENV_FILE -> SYNC_ENV_FILE
- server/ws-gateway.service, notify-bot.service: remove Keboola from descriptions
- .gitignore: generic comment for data directory
- CLAUDE.md, README.md, ARCHITECTURE.md: update paths from src/adapters to connectors/
- docs/DATA_SOURCES.md: update custom connector guide to connectors/ pattern
- connectors/jira/README.md: keboola-analyst -> data-analyst in config paths
- dev_docs/desktop-app.md: KeboolaAnalyst -> DataAnalyst branding
2026-03-09 12:22:27 +01:00
Petr
86edd27655 Extract Jira into connectors/jira module
Move all Jira-specific code into a self-contained connector module:
- 22 files moved via git mv (transform, service, webhook, scripts,
  systemd units, tests, docs, bin helper)
- All imports updated to use connectors.jira.* paths
- Jira is now conditional: auto-detected via JIRA_DOMAIN env var
- Webapp registers Jira blueprint only when available
- Health service monitors Jira timers only when enabled
- Profiler loads Jira tables dynamically from filesystem
- Sync settings uses config-driven dependency validation
- Renamed keboola_platform_url -> custom_url in transform
- Updated deploy.sh, sudoers-deploy, backfill_gap.sh paths
- Fixed pytest.ini to skip live tests by default
2026-03-09 11:17:50 +01:00
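
The conditional behavior in the commit above (Jira auto-detected via the JIRA_DOMAIN env var) might be sketched like this. Only the variable name `JIRA_DOMAIN` comes from the commit message; the helper names, the rule that an empty value counts as disabled, and the always-on `keboola` entry are assumptions.

```python
import os


def jira_enabled(env=None):
    """Treat Jira as enabled only when JIRA_DOMAIN is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("JIRA_DOMAIN", "").strip())


def enabled_connectors(env=None):
    # "keboola" being always on is an assumption for the sake of the example.
    connectors = ["keboola"]
    if jira_enabled(env):
        connectors.append("jira")
    return connectors


print(enabled_connectors({"JIRA_DOMAIN": "example.atlassian.net"}))  # ['keboola', 'jira']
print(enabled_connectors({}))  # ['keboola']
```

Passing the environment as an argument keeps the check easy to test without modifying `os.environ`.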
Petr
c56905d34f Initial commit: OSS data distribution platform
Open-source AI data analyst platform extracted from internal repo.
Includes data sync engine, Keboola adapter, Flask web portal,
server deployment scripts, and configuration templates.
2026-03-08 23:31:28 +01:00