- Legacy extractor now uses read_csv(all_varchar=true) to avoid type
inference errors (e.g. seniority column inferred as DOUBLE despite containing string values)
- DEPLOYMENT.md rewritten based on actual dev VM deployment experience:
deploy key setup, DuckDB write locking, env reload gotchas, bootstrap flow
Extractors with remote tables now write a _remote_attach table into
extract.duckdb so the orchestrator can re-ATTACH the external sources
(via their extensions) at query time. The mechanism is source-agnostic: any connector can use it.
- Keboola extractor writes _remote_attach + creates views on kbc.*
- Orchestrator reads _remote_attach, installs extension, reads token from env
- Graceful degradation: missing token → warning, local tables still work
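A rough sketch of the orchestrator side of this flow, under stated assumptions: the _remote_attach column layout (alias, extension, url, token_env), the extension name "keboola", the KBC_TOKEN variable, and the ATTACH option names are all hypothetical and chosen only for illustration. The function just plans the statements; it does not talk to DuckDB.

```python
import os


def plan_remote_attach(rows, env=os.environ):
    """Turn _remote_attach rows into SQL statements plus warnings.

    Each row is (alias, extension, url, token_env). This layout is an
    assumption for the sketch, not the actual table schema.
    """
    statements, warnings = [], []
    for alias, extension, url, token_env in rows:
        token = env.get(token_env)
        if not token:
            # Graceful degradation: warn and skip, local tables are unaffected.
            warnings.append(
                f"{token_env} is not set; skipping remote source '{alias}'"
            )
            continue
        statements.append(f"INSTALL {extension}")
        statements.append(f"LOAD {extension}")
        # ATTACH options are extension-specific; TOKEN here is a placeholder.
        statements.append(
            f"ATTACH '{url}' AS {alias} (TYPE {extension}, TOKEN '{token}')"
        )
    return statements, warnings


# Hypothetical row, as an extractor might have written it.
rows = [("kbc", "keboola", "https://connection.example.com", "KBC_TOKEN")]

ok_statements, ok_warnings = plan_remote_attach(rows, {"KBC_TOKEN": "secret"})
missing_statements, missing_warnings = plan_remote_attach(rows, {})
```

Keeping the plan separate from execution makes the missing-token path easy to test without a live connection.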
Three-pronged fix for DuckDB lock conflicts:
1. WAL mode on system.duckdb — enables concurrent readers + writer
2. Sync trigger runs extractor as subprocess (not background task) —
separate process = separate DuckDB connections, no lock conflict
3. Both extractor and orchestrator write to .tmp then atomic rename —
avoids lock conflict with API reads on extract.duckdb/analytics.duckdb
Fixes #9 permanently.
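Points 2 and 3 can be sketched roughly as follows. This is an illustration, not the project's code: run_extractor and publish_atomically are hypothetical helper names, the "extractor" is a one-line stand-in command, and a plain text file stands in for the DuckDB database.

```python
import os
import subprocess
import sys
import tempfile


def run_extractor(cmd, timeout=600):
    """Run the extractor in a child process so it opens its own connections."""
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)


def publish_atomically(final_path, build):
    """Build the new file at <final_path>.tmp, then swap it into place."""
    tmp_path = final_path + ".tmp"
    if os.path.exists(tmp_path):
        os.remove(tmp_path)  # leftover from an interrupted run
    build(tmp_path)
    # os.replace is atomic when both paths are on the same filesystem, so
    # readers see either the old file or the new one, never a partial write.
    os.replace(tmp_path, final_path)


work_dir = tempfile.mkdtemp()
target = os.path.join(work_dir, "extract.duckdb")

# Stand-in extractor: a child Python process that prints its own PID.
result = run_extractor([sys.executable, "-c", "import os; print(os.getpid())"])
child_pid = int(result.stdout.strip())


def build(path):
    # Stand-in for writing a fresh database file.
    with open(path, "w") as f:
        f.write(f"built by pid {child_pid}")


publish_atomically(target, build)
```

The .tmp file must live in the same directory (or at least the same filesystem) as the final file, otherwise the rename is no longer a single atomic step.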