agnes-the-ai-analyst/scripts
ZdenekSrotyr 117b6784ea
fix(sync+ops): defer-probe race, AGNES_TEMP_DIR chown, default-schedule env knob (#283)
* fix(sync+ops): defer-probe race, AGNES_TEMP_DIR chown, default-schedule env knob

Three sync-ops fixes surfaced during agnes-dev steady-state operation
after the v0.46→v0.54 cutover settled. None of them depend on each
other; bundled because they all live in the sync trigger / agnes-auto-
upgrade flow and are diagnosed from the same observation window.

1. (fix) /api/sync/status race window. The trigger handler returned 200
   BEFORE the background task acquired _sync_lock. In that few-hundred-ms
   gap, an honest /api/sync/status call returned locked=false, so a
   host-side agnes-auto-upgrade.sh defer probe that fired in that window
   proceeded with 'docker compose up -d' and SIGKILLed the just-spawning
   extractor / materialized worker.

   Observed on agnes-dev: 3 mid-sync container kills in 30 min, each
   followed by a few-min outage and a partial sync. The WAL replay
   auto-recovery (PR #217) kept the system DB consistent through each
   kill, but the actual sync work was lost.

   Fix: handler stamps _recent_trigger_at; status endpoint returns
   locked=true for _TRIGGER_HOLD_SEC (=30s) after the most recent
   trigger, even if the background task hasn't yet acquired the lock.
   30s covers the schedule → spawn latency with margin; short enough
   not to indefinitely block auto-upgrade after a one-off trigger.
   Defense in depth: the real lock still gates the extractor subprocess.

2. (fix) scripts/ops/agnes-auto-upgrade.sh: the post-upgrade chown loop
   now runs mkdir -p on /data/tmp before chown'ing it, and includes it in
   the list of dirs that get the runtime UID:GID. /data/tmp is the default
   AGNES_TEMP_DIR set in docker-compose.yml; Snowflake-UNLOAD slice
   staging and CSV intermediates land here. Before the fix, the runtime
   user (uid 999) couldn't create /data/tmp under a root-owned data-disk
   root, so tempfiles silently fell back to /tmp on the boot disk's
   overlayfs, defeating the whole point of routing slice staging onto the
   dedicated data volume.

3. (feat) AGNES_DEFAULT_SYNC_SCHEDULE env var sets the platform-wide
   fallback sync_schedule. Lets a deployment dial cadence down to
   'daily 03:00' (a once-per-day data freshness budget) without having
   to PUT every registry row. Per-table sync_schedule still wins; the
   literal 'every 1h' is the last-resort fallback if neither is set, so
   the OSS-historical default is unchanged.

Tests:
- test_sync_status_trigger_hold_window_reports_locked_after_trigger
- test_sync_status_trigger_hold_window_expires
- test_default_schedule_falls_through_env_then_every_1h (3 branches)
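
The three branches that the last test name refers to (per-table value, env
var, built-in 'every 1h') might be resolved along these lines. The function
name resolve_sync_schedule, the injectable `env` mapping and the choice to
treat an empty or whitespace-only value as unset are assumptions for this
sketch; only the env var name and the 'every 1h' literal come from the
commit message.

```python
import os

# Last-resort fallback named in the commit message.
_BUILTIN_DEFAULT_SCHEDULE = "every 1h"


def resolve_sync_schedule(table_schedule, env=None):
    """Pick the effective schedule: per-table, then env var, then 'every 1h'."""
    env = os.environ if env is None else env
    if table_schedule:
        # A per-table sync_schedule always wins.
        return table_schedule
    # Assumption: blank values count as "not set".
    env_default = (env.get("AGNES_DEFAULT_SYNC_SCHEDULE") or "").strip()
    if env_default:
        # Platform-wide fallback configured by the deployment.
        return env_default
    return _BUILTIN_DEFAULT_SCHEDULE
```

Passing `env` explicitly keeps the lookup testable without mutating the
process environment.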

* release: 0.54.3 — sync defer-probe race + AGNES_TEMP_DIR chown + default-schedule env knob

Last commit on the PR per CLAUDE.md hard rule. Patch bump (0.54.2 →
0.54.3) bundling three sync-ops fixes from agnes-dev steady-state
observation.

No DB migration. The trigger-hold window is additive: anything that
already saw locked=true still does, because the window only EXTENDS the
true period. The /data/tmp chown is a no-op when ownership is already
correct. With AGNES_DEFAULT_SYNC_SCHEDULE unset, the every-1h default is
unchanged.
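
The create-then-chown step, including the no-op case when ownership is
already correct, could be rendered in Python roughly as follows. The real
change lives in a shell script (scripts/ops/agnes-auto-upgrade.sh); the
helper name ensure_owned_dirs and its return value are hypothetical, and
this sketch does not recurse into directory contents.

```python
import os


def ensure_owned_dirs(dirs, uid, gid):
    """Create each directory if missing, then hand it to uid:gid.

    Returns the paths whose ownership was actually changed.
    """
    changed = []
    for path in dirs:
        os.makedirs(path, exist_ok=True)  # same effect as mkdir -p
        st = os.stat(path)
        if (st.st_uid, st.st_gid) != (uid, gid):
            # Changing the owner to another user requires privileges.
            os.chown(path, uid, gid)
            changed.append(path)
    return changed
```

On a host matching the commit message this would be called with something
like `ensure_owned_dirs(["/data/tmp"], 999, 999)` as root; the GID value
here is a guess, since only uid 999 is stated.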
2026-05-13 09:44:20 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| debug | chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) | 2026-04-27 20:24:34 +02:00 |
| dev | feat(home): state-aware /home + /setup-advanced + schema v26 (#228) | 2026-05-08 18:28:47 +02:00 |
| ops | fix(sync+ops): defer-probe race, AGNES_TEMP_DIR chown, default-schedule env knob (#283) | 2026-05-13 09:44:20 +00:00 |
| backfill_usage_attribution.py | Activity Center: audit log + telemetry + sessions + agnes_* tables (#278) | 2026-05-12 22:41:19 +02:00 |
| bootstrap-gcp.sh | fix(bootstrap): grant monitoring.editor + enable monitoring API | 2026-04-21 20:32:50 +02:00 |
| duckdb_manager.py | chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) | 2026-04-27 20:24:34 +02:00 |
| fetch-env-from-secrets.sh | chore(oss): isolate customer-specific deploy bits from scripts/grpn/ (#88, wave 1) (#94) | 2026-04-27 20:24:34 +02:00 |
| fix_description_escapes.py | fix(admin/tables): script to clean already-corrupted descriptions in registry | 2026-05-06 10:14:23 +02:00 |
| generate_openapi.py | feat: multi-instance deployment — all 14 must-have items from spec | 2026-04-10 11:57:42 +02:00 |
| generate_sample_data.py | feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) | 2026-04-29 22:54:21 +02:00 |
| init.sh | refactor: final cleanup — delete legacy auth, clean deps, fix hash, migrate to uv | 2026-03-31 19:18:30 +02:00 |
| migrate_json_to_duckdb.py | feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150) | 2026-04-30 22:02:16 +02:00 |
| migrate_metrics_to_duckdb.py | feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) | 2026-04-29 22:54:21 +02:00 |
| migrate_parquets_to_extracts.py | feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) | 2026-04-29 22:54:21 +02:00 |
| migrate_registry_to_duckdb.py | feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) | 2026-04-29 22:54:21 +02:00 |
| README.md | fix: rewrite Makefile and scripts/README.md | 2026-04-09 17:16:04 +02:00 |
| run-local-dev.ps1 | System plugins (schema v39) + marketplace UX polish + drop legacy pages (#241) | 2026-05-10 19:15:41 +00:00 |
| run-local-dev.sh | fix(security+ops) + release(0.12.1): #82 #85 #87 hardening + cut 0.12.1 (#104) | 2026-04-28 19:57:30 +02:00 |
| seed_corporate_memory.py | feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72) | 2026-04-29 07:16:22 +02:00 |
| seed_dummy_tables.py | feat(diagnose) + docs: warn on USER_PROJECT_DENIED footgun + document all newly-exposed knobs | 2026-05-01 20:27:24 +02:00 |
| smoke-test-materialized-bq.sh | feat(materialized): query_mode='materialized' for BigQuery + Keboola — admin SELECT → parquet → analyst | 2026-05-01 20:25:56 +02:00 |
| smoke-test.sh | fix(ci): smoke-test stale route + rollback ghcr auth + issues:write (#140) | 2026-04-30 09:42:27 +02:00 |
| tls-fetch.sh | feat(tls): corporate-CA HTTPS with URL-driven rotation, on-VM CSR gen, self-signed fallback (#51) | 2026-04-25 19:51:25 +00:00 |

Scripts

Utility and migration scripts for Agnes AI Data Analyst.

Active Scripts

| Script | Purpose |
| --- | --- |
| generate_sample_data.py | Generate sample data for development/demo |
| duckdb_manager.py | DuckDB database management utilities |
| init.sh | Initial server setup (install deps, create dirs) |

Migration Scripts (one-time use)

| Script | Purpose |
| --- | --- |
| migrate_json_to_duckdb.py | Migrate v1 JSON state files to DuckDB |
| migrate_parquets_to_extracts.py | Migrate v1 parquet layout to extract.duckdb |
| migrate_registry_to_duckdb.py | Migrate v1 table registry to DuckDB |