chore: clean repo for public release — fix references, remove drafts

- Replace padak/tmp_oss → keboola/agnes-the-ai-analyst in all docs, infra, CLI
- Replace your-org/ai-data-analyst → keboola/agnes-the-ai-analyst in README, Jira docs
- Remove real GCP project ID from terraform.tfvars.example
- Delete internal draft documents (dev_docs/draft/)
- Update infra/main.tf to clone from main branch
2026-04-08 19:27:25 +02:00
commit 89154d043b
parent 79443e0df4
15 changed files with 877 additions and 1011 deletions
--- a/README.md
+++ b/README.md
@@ -59,8 +59,8 @@ The short version:
 ```bash
 # 1. Clone the repository
-git clone https://github.com/your-org/ai-data-analyst.git
+git clone https://github.com/keboola/agnes-the-ai-analyst.git
-cd ai-data-analyst
+cd agnes-the-ai-analyst
 # 2. Copy and edit configuration
 cp config/instance.yaml.example config/instance.yaml
--- a/cli/skills/deploy.md
+++ b/cli/skills/deploy.md
@@ -19,9 +19,9 @@ ssh user@your-server-ip
 ### 2. Clone the repository
 ```bash
-git clone https://github.com/padak/tmp_oss.git /opt/data-analyst
+git clone https://github.com/keboola/agnes-the-ai-analyst.git /opt/data-analyst
 cd /opt/data-analyst
-git checkout feature/v2-fastapi-duckdb-docker-cli
+git checkout main
 ```
 ### 3. Create .env file
--- a/connectors/jira/README.md
+++ b/connectors/jira/README.md
@@ -500,11 +500,11 @@ The SLA polling job runs every 15 minutes via systemd timer (`jira-sla-poll.time
 3. Updates raw JSON atomically (`tempfile.mkstemp()` + `os.fchmod(fd, 0o660)` + `os.replace()`)
 4. Triggers incremental Parquet transform (inside advisory file lock)
-**Self-healing:** The poll fetches `status`, `resolution`, `resolutiondate`, and `updated` alongside the SLA fields. If a ticket is resolved in Jira but still appears "open" in Parquet (e.g. due to a missed webhook), the poll automatically corrects the status in JSON and re-transforms to Parquet. Log output: `Self-healing: SUPPORT-XXXX is resolved in Jira`. This was added in response to [#203](https://github.com/your-org/ai-data-analyst/issues/203) where 12 tickets were permanently stale after a permission bug prevented webhooks from updating JSON files.
+**Self-healing:** The poll fetches `status`, `resolution`, `resolutiondate`, and `updated` alongside the SLA fields. If a ticket is resolved in Jira but still appears "open" in Parquet (e.g. due to a missed webhook), the poll automatically corrects the status in JSON and re-transforms to Parquet. Log output: `Self-healing: SUPPORT-XXXX is resolved in Jira`. This was added in response to [#203](https://github.com/keboola/agnes-the-ai-analyst/issues/203) where 12 tickets were permanently stale after a permission bug prevented webhooks from updating JSON files.
 **File locking:** The entire read-modify-write + Parquet transform is wrapped in a per-issue advisory file lock (`connectors/jira/file_lock.py`) to prevent races with the webhook handler. The webhook handler (`connectors/jira/service.py`) uses the same lock. Different issue keys don't block each other.
-**Important — `mkstemp` and ACL:** The `issues/` directory uses POSIX ACLs with `default:mask::rwx`. `tempfile.mkstemp()` creates files with mode `0600`, which overrides the ACL mask to `---` and breaks group access for www-data (webhook handler) and deploy (batch transform). The `os.fchmod(fd, 0o660)` call immediately after `mkstemp()` restores the mask to `rw-`, preserving ACL-based access. See [#203](https://github.com/your-org/ai-data-analyst/issues/203) for the full incident report.
+**Important — `mkstemp` and ACL:** The `issues/` directory uses POSIX ACLs with `default:mask::rwx`. `tempfile.mkstemp()` creates files with mode `0600`, which overrides the ACL mask to `---` and breaks group access for www-data (webhook handler) and deploy (batch transform). The `os.fchmod(fd, 0o660)` call immediately after `mkstemp()` restores the mask to `rw-`, preserving ACL-based access. See [#203](https://github.com/keboola/agnes-the-ai-analyst/issues/203) for the full incident report.
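The `mkstemp()` + `fchmod()` + `os.replace()` sequence described above can be sketched as follows. This is a minimal illustration, not the connector's actual code: the helper name `atomic_write_json` and its parameters are assumptions made for the example.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, payload: dict) -> None:
    """Atomically replace `path` with `payload` (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp always creates the file with mode 0600, regardless of umask or ACLs.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            # Widen the mode immediately so group access (and the ACL mask) is rw-.
            os.fchmod(handle.fileno(), 0o660)
            json.dump(payload, handle)
        # Rename within the same directory: readers see the old or the new file,
        # never a partially written one.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no stray temp file behind if the write or the rename failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Creating the temp file in the target directory keeps the final rename on one filesystem, which is what makes `os.replace()` atomic.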
 ```bash
 # Manual run
--- a/dev_docs/disaster-recovery.md
+++ b/dev_docs/disaster-recovery.md
@@ -105,7 +105,7 @@ Disk Layout:
   ```bash
   mkdir -p /opt/data-analyst
   chown deploy:data-ops /opt/data-analyst
-   sudo -u deploy git clone git@github.com:your-org/ai-data-analyst.git /opt/data-analyst/repo
+   sudo -u deploy git clone git@github.com:keboola/agnes-the-ai-analyst.git /opt/data-analyst/repo
   git config --global --add safe.directory /opt/data-analyst/repo
   /opt/data-analyst/repo/server/setup.sh
   ```
--- a/dev_docs/draft/services-integration-claude.md
+++ b/dev_docs/draft/services-integration-claude.md
@@ -1,233 +0,0 @@
 # Service Connector - Integration of Internal APIs into Data Analyst Platform
 ## Context
 The data analyst platform currently supports only data analysis (parquet files + DuckDB). We want to extend it so analysts can also interact with internal services (Purchase Order system, Invoicing, CRM) through Claude Code. This requires:
 1. **API keys** delivered to the analyst's local machine (`.env` file)
 2. **Skills** teaching Claude Code how to use each service's API (`.claude/rules/` markdown files)
 3. **Seamless UX** - non-technical users click "Connect" in the web portal, everything else is automatic
 Key constraints:
 - All external services are internal apps (we can modify them)
 - They already have Google OAuth and Bearer token/API key authentication
 - They already have token generation UI
 - We target 2-3 services initially
 - Must reuse established patterns (sudo install, atomic JSON, sync_data.sh)
 ## Architecture Overview
 ```
 User clicks "Connect" on your-instance.example.com
    |
    v
 Webapp calls external service's internal token-exchange endpoint
    |  (service-to-service, shared secret)
    v
 API key returned, stored in /data/service-connectors/connections.json
    |
    v
 Webapp writes /home/{user}/.service_env (sudo install, mode 600)
 Webapp writes /home/{user}/.claude_rules/sc_{service}.md (skill file)
    |
    v
 Analyst runs sync_data.sh
    |
    v
 .service_env -> merged into ~/keboola-analysis/.env
 sc_*.md -> already synced with existing corporate memory rules sync
 ```
 ## Implementation Plan
 ### Phase 1: Service Registry & Infrastructure
 **1.1 Create service registry config**
 - File: `docs/setup/service_connectors.json`
 - Defines available services: id, name, description, URLs, env var names, skill file name
 - Deployed to `/data/docs/setup/` by deploy.sh
 **1.2 Create sudo helper script**
 - File: `server/bin/install-service-env`
 - Accepts: USERNAME, ENV_SOURCE_PATH, SKILLS_SOURCE_DIR
 - Installs `.service_env` (mode 600) to user home
 - Installs `sc_*.md` skill files to `.claude_rules/` (mode 600)
 - Only removes `sc_*.md` files (leaves `km_*.md` from corporate memory intact)
 - Template: `server/bin/install-user-rules` (63 lines, same structure)
 **1.3 Update sudoers**
 - File: `server/sudoers-webapp` - add entry for `install-service-env`
 **1.4 Update deploy.sh**
 - Create `/data/service-connectors/` directory (www-data:data-ops, 2770)
 - Deploy service registry and skill files
 - Add new env vars to .env block: `SC_SECRET_PURCHASE_ORDERS`, `SC_SECRET_INVOICING`, `SC_SECRET_CRM`
 **1.5 Add config entries**
 - File: `webapp/config.py` - no new config class entries needed (secrets read directly with `os.environ.get()` in the service module, same pattern as sync_settings_service.py)
 ### Phase 2: Backend Service
 **2.1 Create service connector module**
 - File: `webapp/service_connector_service.py`
 - Pattern: follows `webapp/sync_settings_service.py` exactly
 Key functions:
 ```python
 # Data storage
 CONNECTORS_DIR = Path(os.environ.get("CONNECTORS_DIR", "/data/service-connectors"))
 CONNECTIONS_FILE = CONNECTORS_DIR / "connections.json"
 # Core functions
 def get_available_services() -> dict                          # Load registry
 def get_user_connections(username: str) -> dict                # User's connection status
 def connect_service(username, service_id, user_email) -> (bool, str)  # Token exchange + install
 def disconnect_service(username, service_id) -> (bool, str)    # Revoke + cleanup
 def check_service_health(service_id) -> dict                   # Health check
 # Internal
 def _exchange_token(service, user_email) -> dict | None        # Call external service
 def _revoke_token(service, token_id) -> bool                   # Call revoke endpoint
 def _regenerate_user_env(username) -> bool                     # Write .service_env via sudo
 def _install_service_skills(username) -> bool                  # Write sc_*.md via sudo
 def _get_server_username(webapp_username) -> str               # Reuse WEBAPP_TO_SERVER_USERNAME
 ```
 Storage format (`connections.json`):
 ```json
 {
  "john": {
    "purchase_orders": {
      "connected": true,
      "api_key": "pk_live_abc123...",
      "token_id": "tok_xyz789",
      "connected_at": "2026-02-16T12:00:00Z",
      "expires_at": "2026-05-17T12:00:00Z"
    }
  }
 }
 ```
 Note: API keys are stored in connections.json (protected by 660 permissions, www-data:data-ops). This follows the same approach as telegram_users.json storing chat_ids. For internal services, this is an acceptable security level.
 **2.2 Add API routes to webapp**
 - File: `webapp/app.py` - add routes in `register_routes()`
 ```
 GET  /api/service-connectors          - List services + user connections
 POST /api/service-connectors/connect  - Connect to a service {service_id}
 POST /api/service-connectors/disconnect - Disconnect {service_id}
 GET  /api/service-connectors/health/<service_id> - Health check
 ```
 **2.3 Token exchange protocol**
 What each external service needs to implement:
 ```
 POST /api/internal/token-exchange
 Authorization: Bearer <shared_secret>
 Body: {"user_email": "john@your-domain.com", "ttl_days": 90}
 Response: {"status": "ok", "api_key": "...", "token_id": "...", "expires_at": "..."}
 POST /api/internal/token-revoke
 Authorization: Bearer <shared_secret>
 Body: {"token_id": "tok_xyz789"}
 Response: {"status": "ok"}
 ```
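From the webapp's side, the exchange above could look roughly like the sketch below. It only builds the request and validates the response, and sends nothing. The function names are hypothetical, and the stricter response check (requiring `api_key`) is an assumption rather than part of the protocol as written.

```python
import json
import urllib.request


def build_token_exchange_request(base_url: str, shared_secret: str,
                                 user_email: str, ttl_days: int = 90) -> urllib.request.Request:
    """Build the POST /api/internal/token-exchange request (hypothetical helper)."""
    body = json.dumps({"user_email": user_email, "ttl_days": ttl_days}).encode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/internal/token-exchange",
        data=body,
        headers={
            "Authorization": f"Bearer {shared_secret}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_token_exchange_response(raw: bytes) -> dict:
    """Return the fields the webapp stores; reject anything that is not status=ok."""
    data = json.loads(raw)
    if data.get("status") != "ok" or not data.get("api_key"):
        raise ValueError("token exchange rejected")
    return {key: data.get(key) for key in ("api_key", "token_id", "expires_at")}
```

The request object can be passed to `urllib.request.urlopen()` with a timeout; the revoke call would follow the same shape with `{"token_id": ...}` as the body.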
 ### Phase 3: Dashboard UI
 **3.1 Add Service Connectors card to dashboard**
 - File: `webapp/templates/dashboard.html`
 - New card in the existing 2-column layout (same pattern as Data Settings and Telegram cards)
 - Shows grid of service cards with Connect/Disconnect buttons
 - Connected = green badge + expiry date
 - AJAX calls to `/api/service-connectors/*` endpoints
 ### Phase 4: Sync & Skills
 **4.1 Extend sync_data.sh**
 - File: `scripts/sync_data.sh`
 - Add block after corporate memory rules sync (line ~418):
  1. Download `~/.service_env` from server via SCP
  2. If exists: merge into local `.env` using marker comments (`# --- SERVICE CONNECTOR START/END ---`)
  3. If not exists: clean old service connector block from `.env`
 ```bash
 # --- Sync service connector credentials ---
 if scp -q data-analyst:~/.service_env /tmp/.service_env_$$ 2>/dev/null; then
    # Remove old block, append new one with markers
    sed -i.bak '/^# --- SERVICE CONNECTOR START ---$/,/^# --- SERVICE CONNECTOR END ---$/d' ./.env 2>/dev/null
    { echo "# --- SERVICE CONNECTOR START ---"; cat /tmp/.service_env_$$; echo "# --- SERVICE CONNECTOR END ---"; } >> ./.env
    rm -f /tmp/.service_env_$$
 fi
 ```
 Note: `sc_*.md` skills are already synced by the existing corporate memory sync block (line 410: `scp -rq "data-analyst:~/.claude_rules/"* .claude/rules/`).
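The three merge steps in 4.1 (remove the old block, append the new one, or just clean up when no `.service_env` exists) can be expressed as one pure function. This is a sketch in Python rather than the shell used by `sync_data.sh`; `merge_service_block` is a hypothetical name, and only the marker strings come from the plan.

```python
from typing import Optional

START = "# --- SERVICE CONNECTOR START ---"
END = "# --- SERVICE CONNECTOR END ---"


def merge_service_block(env_text: str, service_env: Optional[str]) -> str:
    """Return env_text with the marker block replaced by service_env (or removed)."""
    kept, inside = [], False
    for line in env_text.splitlines():
        if line == START:
            inside = True          # start skipping the old block
        elif line == END and inside:
            inside = False         # old block ends, resume keeping lines
        elif not inside:
            kept.append(line)
    if service_env:
        # Append a fresh block; splitlines() tolerates a missing trailing newline.
        kept += [START, *service_env.splitlines(), END]
    return "\n".join(kept) + "\n" if kept else ""
```

Because the old block is always stripped before the new one is appended, running the merge repeatedly with the same input leaves `.env` unchanged.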
 **4.2 Create skill files**
 - Directory: `docs/service_connector_skills/`
 - Files: `sc_purchase_orders.md`, `sc_invoicing.md`, `sc_crm.md`
 - Content: Authentication setup, available endpoints, common patterns, data models
 - Deployed to `/data/docs/service_connector_skills/` by deploy.sh
 - Installed to user's `.claude_rules/` when they connect
 ### Phase 5: Tests
 **5.1 Unit tests**
 - File: `tests/test_service_connector_service.py`
 - Test: connect/disconnect flow, env generation, registry loading, error handling
 ## Files to Create
 | File | Purpose |
 |------|---------|
 | `webapp/service_connector_service.py` | Core service (connect, disconnect, env generation) |
 | `docs/setup/service_connectors.json` | Service registry config |
 | `docs/service_connector_skills/sc_purchase_orders.md` | PO API skill |
 | `server/bin/install-service-env` | Sudo helper for env + skills install |
 | `tests/test_service_connector_service.py` | Unit tests |
 ## Files to Modify
 | File | Change |
 |------|--------|
 | `webapp/app.py` | Import service_connector_service, add 4 API routes |
 | `webapp/templates/dashboard.html` | Add Service Connectors card widget |
 | `server/sudoers-webapp` | Add `install-service-env` entry |
 | `server/deploy.sh` | Create /data/service-connectors/, deploy skills, add env vars |
 | `scripts/sync_data.sh` | Add .service_env download and .env merge block |
 | `.github/workflows/deploy.yml` | Add SC_SECRET_* GitHub Secrets to env |
 ## Key Patterns Reused
 - **Sudo install**: `sync_settings_service.py:_regenerate_user_config()` (line 143-183)
 - **Atomic JSON**: `sync_settings_service.py:_write_json()` (line 61-74)
 - **Username mapping**: `corporate_memory_service.py:_get_server_username()` (line 56-59)
 - **Sudo helper script**: `server/bin/install-user-rules` (entire file)
 - **Dashboard AJAX pattern**: Sync settings toggles in `dashboard.html`
 ## Security Model
 | Stage | Protection |
 |-------|------------|
 | Token exchange (webapp <-> service) | HTTPS + shared secret in Authorization header |
 | Central storage (connections.json) | /data/service-connectors/ (2770), file 660 |
 | User home (.service_env) | Mode 600 (owner-only), sudo install |
 | Transit (sync) | SCP over SSH |
 | Client (.env) | Local filesystem; Claude Code settings deny Read(.env) |
 | Claude Code usage | Python `load_dotenv()` via Bash (allowed) |
 ## Verification
 1. **Unit tests**: `pytest tests/test_service_connector_service.py`
 2. **Manual flow**:
   - Deploy to server
   - Log into your-instance.example.com
   - Click "Connect" on PO system in dashboard
   - Verify `.service_env` appears in `/home/{user}/`
   - Run `sync_data.sh` on client
   - Verify `.env` contains PO_API_KEY
   - Verify `.claude/rules/sc_purchase_orders.md` exists
   - In Claude Code: `python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.environ.get('PO_API_KEY', 'NOT SET'))"`
 3. **Disconnect flow**: Click Disconnect, verify key removed from .env after sync
--- a/dev_docs/draft/services-integration-codex.md
+++ b/dev_docs/draft/services-integration-codex.md
@@ -1,180 +0,0 @@
 # Pilot: PO Integration for the Internal Analyst (without storing keys locally)
 ## Summary
 The goal is to deliver a production pilot for the **Purchase Order System** with the following behavior:
 1. In the dashboard, the user only clicks `Connect` (1 click + Google consent).
 2. The server securely stores a per-user credential (encrypted, not on the local machine).
 3. On the client, the token is loaded **on demand over SSH** for a single command only (ENV only in the child process).
 4. The knowledge that PO operations should use the PO wrapper/skill is synced into `.claude/rules`.
 5. We also ship a **full skill beta** (alongside the production rules+wrapper variant).
 ## Scope and boundaries
 - In-scope:
  - A general service registry model (extensible), with a provider implementation for `po` only.
  - Connect/Disconnect/Test flow in the dashboard catalog.
  - Server-side per-user encrypted credential store.
  - Runtime wrapper over SSH with a short-lived token.
  - Audit + revoke + 30-day rotation.
  - Rules sync + skill beta distribution.
 - Out-of-scope:
  - CRM/Revolut implementation (framework prepared only).
  - Full proxy architecture for API calls (calls go directly client -> PO API).
  - Migration to an external secret manager during the pilot.
 ## Proposed solution (decision-complete)
 ### 1) Service registry and data model
 - Add a new server-side registry JSON: `/data/integrations/registry.json`.
 - Service entry schema:
  - `service_id` (`po`)
  - `display_name`
  - `connect_type` (`oauth_bootstrap_link`)
  - `scopes`
  - `api_base_url`
  - `enabled`
  - `skill_slug` (`po-system`)
  - `rule_template_id`
 - Per-user credential metadata:
  - `/data/integrations/state/<service>/<username>.json`
  - Contents: `status`, `connected_at`, `last_rotated_at`, `expires_at`, `token_ref`, `audit_last_use_at`
 - Per-user encrypted secret:
  - `/data/integrations/secrets/<service>/<username>.enc`
  - AES-GCM encryption, key from `/etc/internal-analyst/integrations.key` (root:root, 600).
 ### 2) Dashboard UX (source catalog)
 - Add a new “Purchase Order System” card to `webapp/templates/catalog.html`.
 - Actions:
  - `Connect` -> redirect to the PO bootstrap URL.
  - `Disconnect` -> revoke + delete the secret.
  - `Test connection` -> verify that a short-lived token can be issued.
 - UI state:
  - `Not connected`, `Connected`, `Error`, `Reauthorization required`.
 ### 3) Webapp API interface
 - Add new endpoints (Flask):
  - `GET /api/integrations/catalog`
  - `POST /api/integrations/<service>/connect`
  - `GET /api/integrations/<service>/callback`
  - `POST /api/integrations/<service>/disconnect`
  - `POST /api/integrations/<service>/test`
 - Internal service layer:
  - new `webapp/integration_service.py`
  - provider interface:
    - `build_connect_url(user, state)`
    - `exchange_callback(code, state)`
    - `rotate(user)`
    - `issue_runtime_token(user, ttl_seconds)`
    - `revoke(user)`
 - `po` provider implementation as the first concrete provider.
 ### 4) PO bootstrap link + token lifecycle
 - Connect flow:
  1. The user clicks Connect in the dashboard.
  2. Redirect to the PO auth/bootstrap endpoint (Google auth + consent).
  3. Callback back to the webapp.
  4. The webapp stores the refresh credential encrypted.
 - Rotation:
  - a server job (cron/systemd) checks `last_rotated_at` daily.
  - rotates every 30 days.
 - Revoke:
  - Disconnect immediately revokes the token in PO and deletes the server secret.
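The daily rotation check described above reduces to a date comparison. A minimal sketch, assuming `last_rotated_at` is stored as an ISO 8601 UTC timestamp (the draft does not specify the format) and using a hypothetical function name:

```python
from datetime import datetime, timedelta, timezone

ROTATION_INTERVAL = timedelta(days=30)  # default from the plan


def needs_rotation(last_rotated_at: str, now: datetime) -> bool:
    """True once at least 30 days have passed since the last rotation."""
    # Assumed format: "2026-02-16T12:00:00Z"; fromisoformat() needs an explicit offset.
    rotated = datetime.fromisoformat(last_rotated_at.replace("Z", "+00:00"))
    return now - rotated >= ROTATION_INTERVAL
```

A cron or systemd job would call this once per user with `datetime.now(timezone.utc)` and invoke the provider's `rotate(user)` when it returns `True`.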
 ### 5) Runtime token injection over SSH (no local persistence)
 - Add a server helper:
  - `/usr/local/bin/integration-runtime-env`
  - Input: `--service po --ttl 300 --format env`
  - Output: shell-safe `export` lines for the child process.
 - Access:
  - sudoers allows self-service issuance only (a user can issue only for their own account).
  - the helper strictly validates the calling user and the service allowlist.
 - Local wrapper script in the synced scripts:
  - `server/scripts/po-run`
  - Execution:
    - `ssh data-analyst "/usr/local/bin/integration-runtime-env --service po --ttl 300 --format env"`
    - runs the target command in a subshell with the ENV
    - discards the ENV when finished (nothing is written to a file)
 - Direct API calls:
  - the client script/skill calls the PO API directly with the runtime ENV token.
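The wrapper behavior in this section (take the helper's `export` lines, expose them only to a child process, write nothing to disk) could be sketched as below. The function names and the `PO_API_TOKEN` variable used in the usage note are assumptions; in the real `po-run`, the export text would be the output of the `ssh data-analyst ...` call shown above.

```python
import os
import shlex
import subprocess


def parse_export_lines(text: str) -> dict[str, str]:
    """Parse shell-safe `export NAME=value` lines into a dict."""
    env = {}
    for line in text.splitlines():
        tokens = shlex.split(line)  # handles shell quoting of the value
        if len(tokens) == 2 and tokens[0] == "export" and "=" in tokens[1]:
            name, value = tokens[1].split("=", 1)
            env[name] = value
    return env


def run_with_runtime_env(command: list[str], export_text: str) -> subprocess.CompletedProcess:
    """Run `command` with the runtime variables visible only to the child process."""
    child_env = {**os.environ, **parse_export_lines(export_text)}
    # The parent's os.environ is never modified and nothing is written to a file.
    return subprocess.run(command, env=child_env, capture_output=True, text=True)
```

For example, `run_with_runtime_env(["curl", "..."], helper_output)` would give `curl` access to a variable such as `PO_API_TOKEN`, and the value is gone as soon as the child exits.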
 ### 6) Rules + skill beta distribution
 - Production:
  - generate the service rule file on the server: `/home/<user>/.claude_rules/svc_po.md`
  - `scripts/sync_data.sh` already downloads rules into `.claude/rules/`; only adjust the reporting so it also counts `svc_*.md`.
 - Skill beta:
  - add a sync folder `server/skills/po-system/` (SKILL.md + scripts + references).
  - add a script `server/scripts/install_skills.sh`:
    - installs into the user's Codex skill home.
  - `sync_data.sh` calls `install_skills.sh` after the sync (best-effort, does not block the data sync).
 ### 7) Audit, observability, security
 - Audit log file:
  - `/data/integrations/audit.log` (append-only)
  - events: `connect`, `callback_ok`, `runtime_issue`, `rotate`, `revoke`, `error`
 - Security guardrails:
  - never log the token
  - the helper returns only a short-lived access token
  - runtime token TTL defaults to 5 min
  - strict input validation of service/usernames
 - Monitoring:
  - dashboard health check for the integrations service
  - metrics: connect success rate, runtime issuance latency, rotate failures.
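The "strict input validation" guardrail, combined with the self-service rule from section 5, might be implemented along these lines. This is a sketch only: the allowlist contents, the username pattern, and the function name are assumptions, not taken from the draft.

```python
import re

ALLOWED_SERVICES = frozenset({"po"})  # assumed allowlist: only the pilot service
# Assumed pattern for Linux-style usernames; adjust to the server's real policy.
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_.-]{0,31}")


def validate_request(service: str, username: str, caller: str) -> None:
    """Raise if the request is malformed or targets another user's account."""
    if service not in ALLOWED_SERVICES:
        raise ValueError("unknown service")
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("invalid username")
    if username != caller:
        raise PermissionError("token issuance is self-service only")
```

Checking against a fixed allowlist, rather than sanitizing the input, means a value like `po; rm -rf /` is rejected outright instead of being passed anywhere near a shell.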
 ## Important changes to public APIs/interfaces/types
 - New REST endpoints under `/api/integrations/*`.
 - New data contract `registry.json` for the service catalog.
 - New provider interface in `webapp/integration_service.py`.
 - New CLI contract for the `integration-runtime-env` helper.
 - New local wrapper command `po-run`.
 ## Tests and validation scenarios
 ### Unit tests
 - `webapp/integration_service.py`:
  - state/nonce validation
  - secret encrypt/decrypt
  - 30-day rotation rule
 - provider `po`:
  - connect URL construction
  - callback exchange error handling
 - helper `integration-runtime-env`:
  - self-user enforcement
  - invalid service rejection
  - TTL bounds
 ### Integration tests
 - E2E Connect flow:
  - dashboard -> PO consent -> callback -> connected status.
 - E2E runtime:
  - `po-run "curl ..."` obtains a token over SSH and performs the call.
 - E2E Disconnect:
  - revoke in PO + removal of state/secrets + UI switches to not connected.
 - E2E rotation:
  - forced rotation scenario + audit log verification.
 ### Security tests
 - Attempt to issue a token for another user (must fail).
 - Attempt command injection via the service parameter (must fail).
 - Verify that the token is neither in the log nor on the client's disk after the command completes.
 ### Acceptance criteria
 - The user connects the PO service with 1 click + consent.
 - No persistent API key on the client's disk.
 - `po-run` works without manually exporting the token.
 - Disconnect disables further runtime issuance within 1 min.
 - Rotation happens automatically after 30 days.
 ## Rollout plan
 1. Deploy backend + helper + audit behind the feature flag `INTEGRATIONS_PO_ENABLED=false`.
 2. Enable internally for 1–2 test users.
 3. Verify E2E + the security checklist.
 4. Enable for all analysts.
 5. After stabilization, add another provider (CRM) through the same interface.
 ## Assumptions and chosen defaults
 - The PO system is fully under the team's control, and endpoints for issuance/rotate/revoke can be added.
 - User identity can be mapped between the dashboard and the server account.
 - Runtime credential transfer must go only through the SSH mechanism (no local secret file).
 - Default runtime token TTL: 300 seconds.
 - Default rotation: 30 days.
 - Pilot storage is not `.env`; it uses a per-user encrypted store.
 - The production path in the pilot is rules+wrapper; the full skill runs as a beta in parallel.
--- a/dev_docs/draft/services-integration.md
+++ b/dev_docs/draft/services-integration.md
@@ -1,577 +0,0 @@
 # Service Connector - Integration of Internal APIs into Data Analyst Platform
 ## Origin
 This plan was derived from two independent proposals (Claude and Codex), both reviewed by 3 AI models (6 reviews total). The reviews identified real issues but also pushed the design toward over-engineering. This final version applies KISS and YAGNI to keep only what matters.
 **Previous drafts** (kept for reference):
 - `services-integration-claude.md` - Claude plan (good patterns, missing encryption)
 - `services-integration-codex.md` - Codex plan (SSH runtime injection, encrypted store - overkill for pilot)
 **What we cut and why:**
 - Fernet encryption of connections.json - encryption key lives on the same server as the data, security theater
 - fcntl file locking - 5-10 users clicking Connect once a month, race condition probability ~0%
 - Transactional connect with rollback - if token exchange fails, user clicks again
 - Audit log + HMAC + logrotate - Flask access log is enough for 5 users
 - Auto-rotation timer + systemd units - set long TTL, user reconnects if expired
 - Feature flag rollout - just deploy when ready
 - Security tests (CSRF, injection) - internal tool behind Google OAuth, all users are employees
 ## Context
 The data analyst platform supports data analysis (parquet + DuckDB). We want analysts to also interact with internal services (Purchase Orders, Invoicing, CRM) through Claude Code.
 **What the analyst needs:**
 1. API key in their local `.env` file
 2. Skill file in `.claude/rules/` teaching Claude Code how to use the API
 **What we already have:**
 - `sync_data.sh` that syncs files from server to analyst's machine
 - `.claude_rules/` sync for corporate memory (skills already flow through this)
 - `sudo install` pattern for writing to user home dirs
 - Dashboard with AJAX cards (Data Settings, Telegram)
 **Trust model:** Employee laptops are trusted (corporate-managed, encrypted disks). If a laptop is compromised, we have bigger problems than API keys.
 ## Architecture Overview
 ```
 User clicks "Connect" on your-instance.example.com
    |
    v
 Webapp calls service's token-exchange endpoint (shared secret)
    |
    v
 API key stored in /data/service-connectors/connections.json (plaintext, 660)
    |
    v
 Webapp writes /home/{user}/.service_env (sudo install, mode 600)
 Webapp writes /home/{user}/.claude_rules/sc_{service}.md (skill file)
    |
    v
 Analyst runs sync_data.sh (existing)
    |
    v
 .service_env -> merged into local .env
 sc_*.md -> already synced via existing .claude_rules/ sync
 ```
 That's it. No encryption layer, no file locking, no audit log, no rotation timer.
 ## Implementation
 ### 1. Service Registry
 File: `docs/setup/service_connectors.json`
 ```json
 [
  {
    "id": "purchase_orders",
    "name": "Purchase Order System",
    "description": "Create and query purchase orders",
    "token_exchange_url": "https://po.internal.example.com/api/internal/token-exchange",
    "token_revoke_url": "https://po.internal.example.com/api/internal/token-revoke",
    "env_var_name": "PO_API_KEY",
    "skill_file": "sc_purchase_orders.md",
    "enabled": true
  }
 ]
 ```
 Deployed to `/data/docs/setup/` by deploy.sh (same as other config files).
 ### 2. Backend Service
 File: `webapp/service_connector_service.py`
 Follows `webapp/sync_settings_service.py` pattern exactly.
 ```python
 """
 Service connector - manages API key provisioning for internal services.
 Reads service registry from /data/docs/setup/service_connectors.json.
 Stores user connections in /data/service-connectors/connections.json.
 Writes .service_env and skill files to user home dirs via sudo install.
 """
 import json
 import logging
 import os
 import subprocess
 import tempfile
 from datetime import datetime
 from pathlib import Path
 from typing import Any
 import httpx
 logger = logging.getLogger(__name__)
 CONNECTORS_DIR = Path(os.environ.get("CONNECTORS_DIR", "/data/service-connectors"))
 CONNECTIONS_FILE = CONNECTORS_DIR / "connections.json"
 REGISTRY_FILE = Path(os.environ.get("REGISTRY_FILE", "/data/docs/setup/service_connectors.json"))
 SKILLS_DIR = Path(os.environ.get("SC_SKILLS_DIR", "/data/docs/service_connector_skills"))
 # Username mapping (reuse existing pattern)
 WEBAPP_TO_SERVER_USERNAME = {
    # Add overrides here if webapp username != server username
    # "jane.smith": "jane",
 }
 def get_available_services() -> list[dict]:
    """Load service registry."""
    if not REGISTRY_FILE.exists():
        return []
    with open(REGISTRY_FILE) as f:
        services = json.load(f)
    return [s for s in services if s.get("enabled", True)]
 def get_user_connections(username: str) -> dict:
    """Get user's active connections (without API keys)."""
    connections = _load_connections()
    user_conns = connections.get(username, {})
    # Strip API keys from response
    safe = {}
    for service_id, conn in user_conns.items():
        safe[service_id] = {
            "connected": conn.get("connected", False),
            "connected_at": conn.get("connected_at"),
            "expires_at": conn.get("expires_at"),
        }
    return safe
 def connect_service(username: str, service_id: str, user_email: str) -> tuple[bool, str]:
    """Exchange token with service, store key, install to user home."""
    service = _get_service_config(service_id)
    if not service:
        return False, "Unknown service"
    # Get shared secret for this service
    secret_env = f"SC_SECRET_{service_id.upper()}"
    shared_secret = os.environ.get(secret_env)
    if not shared_secret:
        logger.error(f"Missing {secret_env} environment variable")
        return False, "Service not configured"
    # Token exchange
    try:
        resp = httpx.post(
            service["token_exchange_url"],
            headers={"Authorization": f"Bearer {shared_secret}"},
            json={"user_email": user_email, "ttl_days": 365},
            timeout=30,
        )
        resp.raise_for_status()
        token_data = resp.json()
    except Exception as e:
        logger.error(f"Token exchange failed for {service_id}: {e}")
        return False, f"Token exchange failed: {e}"
    # Store in connections.json
    connections = _load_connections()
    connections.setdefault(username, {})[service_id] = {
        "connected": True,
        "api_key": token_data["api_key"],
        "token_id": token_data.get("token_id"),
        "connected_at": datetime.utcnow().isoformat() + "Z",
        "expires_at": token_data.get("expires_at"),
    }
    _save_connections(connections)
    # Write .service_env and skills to user home
    server_username = _get_server_username(username)
    _regenerate_user_env(server_username, connections.get(username, {}))
    _install_service_skills(server_username, connections.get(username, {}))
    return True, "Connected successfully"
 def disconnect_service(username: str, service_id: str) -> tuple[bool, str]:
    """Revoke token and remove credentials."""
    connections = _load_connections()
    conn = connections.get(username, {}).get(service_id)
    if not conn:
        return False, "Not connected"
    # Try to revoke remotely (best-effort)
    service = _get_service_config(service_id)
    token_id = conn.get("token_id")
    if service and token_id:
        try:
            secret_env = f"SC_SECRET_{service_id.upper()}"
            shared_secret = os.environ.get(secret_env, "")
            httpx.post(
                service["token_revoke_url"],
                headers={"Authorization": f"Bearer {shared_secret}"},
                json={"token_id": token_id},
                timeout=30,
            )
        except Exception as e:
            logger.warning(f"Remote revoke failed for {service_id}/{token_id}: {e}")
    # Always clean up locally
    connections.get(username, {}).pop(service_id, None)
    if username in connections and not connections[username]:
        del connections[username]
    _save_connections(connections)
    # Regenerate user files
    server_username = _get_server_username(username)
    _regenerate_user_env(server_username, connections.get(username, {}))
    _install_service_skills(server_username, connections.get(username, {}))
    return True, "Disconnected"
 # --- Internal helpers ---
 def _get_service_config(service_id: str) -> dict | None:
    """Find service in registry by ID."""
    for s in get_available_services():
        if s["id"] == service_id:
            return s
    return None
 def _get_server_username(webapp_username: str) -> str:
    """Map webapp username to server Linux username."""
    return WEBAPP_TO_SERVER_USERNAME.get(webapp_username, webapp_username)
 def _load_connections() -> dict:
    """Load connections.json."""
    if not CONNECTIONS_FILE.exists():
        return {}
    with open(CONNECTIONS_FILE) as f:
        return json.load(f)
 def _save_connections(data: dict) -> None:
    """Atomic write to connections.json (same pattern as sync_settings_service)."""
    fd, temp_path = tempfile.mkstemp(dir=str(CONNECTORS_DIR), suffix=".json")
    try:
        # os.fdopen takes ownership of fd and closes it exactly once
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        os.chmod(temp_path, 0o660)
        os.replace(temp_path, str(CONNECTIONS_FILE))
    except Exception:
        # Remove the temp file if the write or the rename failed
        if os.path.exists(temp_path):
            os.unlink(temp_path)
        raise
 def _regenerate_user_env(server_username: str, user_connections: dict) -> None:
    """Write .service_env to user home via sudo install."""
    # Build env file content
    lines = []
    for service_id, conn in user_connections.items():
        if not conn.get("connected"):
            continue
        service = _get_service_config(service_id)
        if service:
            lines.append(f"{service['env_var_name']}={conn['api_key']}")
    # Write to temp file, then sudo install
    fd, temp_path = tempfile.mkstemp(suffix=".env")
    try:
        os.write(fd, "\n".join(lines).encode() if lines else b"")
        os.close(fd)
        target = f"/home/{server_username}/.service_env"
        if lines:
            subprocess.run(
                ["sudo", "/usr/bin/install", "-o", server_username, "-g", server_username,
                 "-m", "600", temp_path, target],
                check=True, capture_output=True,
            )
        else:
            # No connections - remove .service_env if it exists
            subprocess.run(
                ["sudo", "rm", "-f", target],
                check=True, capture_output=True,
            )
    finally:
        os.unlink(temp_path)
 def _install_service_skills(server_username: str, user_connections: dict) -> None:
    """Install sc_*.md skill files to user's .claude_rules/ via sudo helper."""
    connected_services = [
        sid for sid, conn in user_connections.items() if conn.get("connected")
    ]
    # Copy relevant skill files to temp dir
    temp_dir = tempfile.mkdtemp()
    try:
        for service_id in connected_services:
            service = _get_service_config(service_id)
            if service and service.get("skill_file"):
                src = SKILLS_DIR / service["skill_file"]
                if src.exists():
                    dest = Path(temp_dir) / service["skill_file"]
                    dest.write_bytes(src.read_bytes())
        subprocess.run(
            ["sudo", "/usr/local/bin/install-service-env",
             server_username, temp_dir],
            check=True, capture_output=True,
        )
    finally:
        import shutil
        shutil.rmtree(temp_dir, ignore_errors=True)
 ```
 ### 3. API Routes
 Add to `webapp/app.py` in `register_routes()`:
 ```python
 from webapp import service_connector_service
@app.route("/api/service-connectors")
@login_required
 def api_service_connectors():
    username = get_username_from_email(session["user"]["email"])
    services = service_connector_service.get_available_services()
    connections = service_connector_service.get_user_connections(username)
    return jsonify({"services": services, "connections": connections})
@app.route("/api/service-connectors/connect", methods=["POST"])
@login_required
 def api_service_connect():
    username = get_username_from_email(session["user"]["email"])
    email = session["user"]["email"]
    service_id = request.json.get("service_id")
    ok, msg = service_connector_service.connect_service(username, service_id, email)
    return jsonify({"success": ok, "message": msg})
@app.route("/api/service-connectors/disconnect", methods=["POST"])
@login_required
 def api_service_disconnect():
    username = get_username_from_email(session["user"]["email"])
    service_id = request.json.get("service_id")
    ok, msg = service_connector_service.disconnect_service(username, service_id)
    return jsonify({"success": ok, "message": msg})
 ```
 ### 4. Sudo Helper
 File: `server/bin/install-service-env` (copy of `install-user-rules`, modified for `sc_*` prefix)
 ```bash
 #!/bin/bash
 # Install service connector skill files to user's .claude_rules/.
 # Called by webapp (www-data) via sudo.
 #
 # Usage: sudo install-service-env USERNAME SKILLS_SOURCE_DIR
 set -euo pipefail
 if [[ $EUID -ne 0 ]]; then
    echo "Must be run as root (via sudo)" >&2
    exit 1
 fi
 if [[ $# -lt 2 ]]; then
    echo "Usage: sudo install-service-env USERNAME SKILLS_SOURCE_DIR" >&2
    exit 1
 fi
 USERNAME="$1"
 SOURCE_DIR="$2"
 if ! id "$USERNAME" &>/dev/null; then
    echo "User '$USERNAME' does not exist" >&2
    exit 1
 fi
 if [[ ! -d "$SOURCE_DIR" ]]; then
    echo "Source directory '$SOURCE_DIR' does not exist" >&2
    exit 1
 fi
 USER_HOME=$(getent passwd "$USERNAME" | cut -d: -f6)
 RULES_DIR="${USER_HOME}/.claude_rules"
 mkdir -p "$RULES_DIR"
 chown "${USERNAME}:${USERNAME}" "$RULES_DIR"
 chmod 700 "$RULES_DIR"
 # Remove old service connector skills only (sc_*.md), preserve km_*.md
 rm -f "${RULES_DIR}"/sc_*.md
 # Install new skill files
 COUNT=0
 for src_file in "${SOURCE_DIR}"/*.md; do
    if [[ -f "$src_file" ]]; then
        /usr/bin/install -o "$USERNAME" -g "$USERNAME" -m 600 "$src_file" "$RULES_DIR/"
        COUNT=$((COUNT + 1))
    fi
 done
 echo "Installed ${COUNT} service skills for ${USERNAME} in ${RULES_DIR}"
 ```
 ### 5. Dashboard UI Card
 Add to `webapp/templates/dashboard.html` - new card in existing grid, same AJAX pattern as Data Settings toggles:
 - Grid of service cards (name + description from registry)
 - Green "Connected" badge or grey "Not connected"
 - Connect / Disconnect button
 - Shows `expires_at` if connected
 ### 6. Sync Extension
 Add to `scripts/sync_data.sh` after existing corporate memory sync block:
 ```bash
 # --- Sync service connector credentials ---
 if scp -q data-analyst:~/.service_env /tmp/.service_env_$$ 2>/dev/null; then
    # Remove old service connector block
    if [ -f ./.env ]; then
        awk '
            /^# --- SERVICE CONNECTOR START ---$/ { skip=1; next }
            /^# --- SERVICE CONNECTOR END ---$/ { skip=0; next }
            !skip { print }
        ' ./.env > ./.env.tmp && mv ./.env.tmp ./.env
    fi
    # Append new block
    {
        echo "# --- SERVICE CONNECTOR START ---"
        cat /tmp/.service_env_$$
        echo "# --- SERVICE CONNECTOR END ---"
    } >> ./.env
    rm -f /tmp/.service_env_$$
    echo "Service connector credentials synced"
 else
    # No active connections - clean old block if present
    if [ -f ./.env ] && grep -q "^# --- SERVICE CONNECTOR START ---$" ./.env 2>/dev/null; then
        awk '
            /^# --- SERVICE CONNECTOR START ---$/ { skip=1; next }
            /^# --- SERVICE CONNECTOR END ---$/ { skip=0; next }
            !skip { print }
        ' ./.env > ./.env.tmp && mv ./.env.tmp ./.env
        echo "Service connector credentials removed"
    fi
 fi
 ```
 Skill files (`sc_*.md`) are already synced by the existing `.claude_rules/` sync block.
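The awk filter above drops everything between the two marker comments, and the script then appends a fresh block. The same marker-delimited merge can be sketched in Python for illustration. `merge_service_block` is a hypothetical helper, not part of `sync_data.sh`, and it assumes the markers appear on lines of their own, exactly as the shell block writes them.

```python
START = "# --- SERVICE CONNECTOR START ---"
END = "# --- SERVICE CONNECTOR END ---"


def merge_service_block(env_text: str, service_lines: list[str]) -> str:
    """Drop any existing marker block, then append a new one if there are lines."""
    kept = []
    skip = False
    for line in env_text.splitlines():
        if line == START:
            skip = True  # start ignoring lines until the END marker
            continue
        if line == END:
            skip = False
            continue
        if not skip:
            kept.append(line)
    if service_lines:
        kept += [START, *service_lines, END]
    return "\n".join(kept) + "\n" if kept else ""
```

Passing an empty `service_lines` list corresponds to the `else` branch of the shell block: the old block is removed and nothing is appended.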
 ### 7. Deploy Changes
 Add to `server/deploy.sh`:
 ```bash
 # Service connectors directory
 mkdir -p /data/service-connectors
 chown www-data:data-ops /data/service-connectors
 chmod 2770 /data/service-connectors
 # Deploy skill files
 mkdir -p /data/docs/service_connector_skills
 cp -r docs/service_connector_skills/* /data/docs/service_connector_skills/ 2>/dev/null || true
 # Deploy service registry
 cp docs/setup/service_connectors.json /data/docs/setup/ 2>/dev/null || true
 # Install sudo helper
 install -m 755 server/bin/install-service-env /usr/local/bin/
 ```
 Add to `server/sudoers-webapp`:
 ```
 www-data ALL=(ALL) NOPASSWD: /usr/local/bin/install-service-env
 ```
 Add to `.github/workflows/deploy.yml` env block:
 ```yaml
 SC_SECRET_PURCHASE_ORDERS: ${{ secrets.SC_SECRET_PURCHASE_ORDERS }}
 ```
 ## Token Exchange Protocol
 What each internal service needs to implement (simple Bearer + JSON):
 ```
 POST /api/internal/token-exchange
 Authorization: Bearer <shared_secret>
 Content-Type: application/json
 Body: {"user_email": "john@your-domain.com", "ttl_days": 365}
 Response: {"status": "ok", "api_key": "...", "token_id": "...", "expires_at": "..."}
 POST /api/internal/token-revoke
 Authorization: Bearer <shared_secret>
 Content-Type: application/json
 Body: {"token_id": "tok_xyz789"}
 Response: {"status": "ok"}
 ```
 TTL is 365 days. If a key expires, the user clicks Reconnect. No auto-rotation is needed.
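To make the exchange request concrete, here is a minimal sketch of how the webapp side might build it with the standard library. The function name, the `base_url` parameter, and the example host are assumptions for illustration; only the path, headers, and body fields come from the protocol above. The request is built but not sent.

```python
import json
import urllib.request


def build_token_exchange_request(
    base_url: str, shared_secret: str, user_email: str, ttl_days: int = 365
) -> urllib.request.Request:
    """Build (but do not send) the POST request described in the protocol."""
    body = json.dumps({"user_email": user_email, "ttl_days": ttl_days}).encode()
    return urllib.request.Request(
        url=f"{base_url}/api/internal/token-exchange",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {shared_secret}",
            "Content-Type": "application/json",
        },
    )
```

Sending it would be a call to `urllib.request.urlopen(req, timeout=...)`, after which the caller reads `api_key`, `token_id`, and `expires_at` from the JSON response.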
 ## Files to Create
 | File | Purpose | Size estimate |
 |------|---------|---------------|
 | `webapp/service_connector_service.py` | Connect, disconnect, env generation | ~150 lines |
 | `docs/setup/service_connectors.json` | Service registry | ~20 lines |
 | `docs/service_connector_skills/sc_purchase_orders.md` | PO API skill for Claude Code | ~50 lines |
 | `server/bin/install-service-env` | Sudo helper (copy of install-user-rules) | ~30 lines |
 | `tests/test_service_connector_service.py` | Unit tests | ~100 lines |
 ## Files to Modify
 | File | Change | Size estimate |
 |------|--------|---------------|
 | `webapp/app.py` | Add 3 API routes | ~20 lines |
 | `webapp/templates/dashboard.html` | Service connectors card | ~60 lines |
 | `server/sudoers-webapp` | Add install-service-env entry | 1 line |
 | `server/deploy.sh` | Create dirs, deploy skills, add env vars | ~10 lines |
 | `scripts/sync_data.sh` | .service_env merge block | ~20 lines |
 | `.github/workflows/deploy.yml` | Add SC_SECRET_* to env | ~3 lines |
 **Total new code: ~350 lines** (vs ~800+ in the hybrid plan, ~1200+ in the Codex plan)
 ## Security Model
 | Stage | Protection |
 |-------|------------|
 | Token exchange | HTTPS + per-service shared secret |
 | Server storage (connections.json) | File permissions 660, dir 2770 (www-data:data-ops) |
 | User home (.service_env) | Mode 600, sudo install |
 | Transit | SCP over SSH |
 | Client (.env) | Local filesystem, Claude Code denies Read(.env) |
 | Trust model | Employee laptops trusted (corporate-managed, encrypted disks) |
 ## Key Patterns Reused
 - **Sudo install**: `sync_settings_service.py:_regenerate_user_config()`
 - **Atomic JSON write**: `sync_settings_service.py:_write_json()` (tempfile + os.replace)
 - **Username mapping**: `corporate_memory_service.py:_get_server_username()`
 - **Sudo helper**: `server/bin/install-user-rules` (same structure)
 - **Dashboard AJAX**: Sync settings toggles in `dashboard.html`
 ## Verification
 1. `pytest tests/test_service_connector_service.py`
 2. Deploy, click Connect on PO, verify `.service_env` in `/home/{user}/`
 3. Run `sync_data.sh`, verify `.env` contains `PO_API_KEY`
 4. Verify `.claude/rules/sc_purchase_orders.md` exists
 5. In Claude Code: `python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.environ.get('PO_API_KEY', 'NOT SET'))"`
 6. Click Disconnect, sync, verify key removed
 ## What We Might Add Later (only if needed)
 | Feature | When to add |
 |---------|-------------|
 | Encryption of connections.json | If we get a compliance requirement |
 | Auto-rotation | If services start issuing short-lived tokens |
 | Audit log | If we need forensics capability |
 | File locking | If we ever have concurrent connect/disconnect issues |
 | SSH runtime injection | If laptop trust model changes |
--- a/dev_docs/insights.md
+++ b/dev_docs/insights.md
@ -19,7 +19,7 @@ This is NOT a usage log. It is a **strategic command center** that:
 - **Current state**: Demo mockup with fictional data (DEMO badge in header)
 - **URL**: https://your-instance.example.com/activity-center (requires login)
- **PR**: https://github.com/your-org/ai-data-analyst/pull/122
+- **PR**: https://github.com/keboola/agnes-the-ai-analyst/pull/122
 - **Branch**: `feature/activity-center`
 - **Dashboard link**: Not added yet (UX placement decided, implementation pending)
--- a/dev_docs/server.md
+++ b/dev_docs/server.md
@ -145,7 +145,7 @@ except Exception:
    raise
 ```
-Use `0o660` for files accessed by services via data-ops group ACL, `0o644` for world-readable files (e.g., profiler output). See [#203](https://github.com/your-org/ai-data-analyst/issues/203) for a production incident caused by missing `fchmod`.
+Use `0o660` for files accessed by services via data-ops group ACL, `0o644` for world-readable files (e.g., profiler output). See [#203](https://github.com/keboola/agnes-the-ai-analyst/issues/203) for a production incident caused by missing `fchmod`.
 **Per-issue file locking for concurrent writers:**
@ -678,7 +678,7 @@ sudo cat /home/deploy/.ssh/id_ed25519.pub
 ```
 **3. Add Deploy Key to GitHub:**
- Go to: https://github.com/your-org/ai-data-analyst/settings/keys
+- Go to: https://github.com/keboola/agnes-the-ai-analyst/settings/keys
 - Click "Add deploy key"
 - Title: `data-broker-server`
 - Key: (paste public key from previous step)
@ -688,7 +688,7 @@ sudo cat /home/deploy/.ssh/id_ed25519.pub
 ```bash
 sudo mkdir -p /opt/data-analyst
 sudo chown deploy:data-ops /opt/data-analyst
-sudo -u deploy git clone git@github.com:your-org/ai-data-analyst.git /opt/data-analyst/repo
+sudo -u deploy git clone git@github.com:keboola/agnes-the-ai-analyst.git /opt/data-analyst/repo
 sudo git config --global --add safe.directory /opt/data-analyst/repo
 sudo -u deploy git config --global --add safe.directory /opt/data-analyst/repo
 sudo /opt/data-analyst/repo/server/setup.sh
@ -1144,7 +1144,7 @@ cat ~/.notifications/logs/runner.log
 ### Known Issues
 **On-demand script execution security hardening (partially resolved):**
-The `notify-scripts` helper replaced direct `sudo -H -u ... /usr/bin/env ...` calls with a single auditable entry point. Services no longer need filesystem access to user home directories (750 permissions are preserved). The bot still requires `NoNewPrivileges=false` and `/tmp` in `ReadWritePaths` for sudo execution. A queue-based approach ([#51](https://github.com/your-org/ai-data-analyst/issues/51)) could further improve this by having `notify-runner` pick up run requests from a queue instead of the bot calling sudo directly.
+The `notify-scripts` helper replaced direct `sudo -H -u ... /usr/bin/env ...` calls with a single auditable entry point. Services no longer need filesystem access to user home directories (750 permissions are preserved). The bot still requires `NoNewPrivileges=false` and `/tmp` in `ReadWritePaths` for sudo execution. A queue-based approach ([#51](https://github.com/keboola/agnes-the-ai-analyst/issues/51)) could further improve this by having `notify-runner` pick up run requests from a queue instead of the bot calling sudo directly.
 ## Data Sync Settings (Web Portal)
@ -1510,7 +1510,7 @@ for row in result:
 - API token has read-only access to Jira (no write permissions needed)
 - Webhook events are logged for audit purposes
 - Multiple services write to `/data/src_data/raw/jira/`: webapp (www-data), SLA poll (root), consistency check (root), backfill scripts (admin users)
- Concurrent writes to the same issue JSON are serialized via per-issue advisory file locking (`connectors/jira/file_lock.py`, `fcntl.flock`). Lock files in `issues/.locks/`. See [#203](https://github.com/your-org/ai-data-analyst/issues/203).
+- Concurrent writes to the same issue JSON are serialized via per-issue advisory file locking (`connectors/jira/file_lock.py`, `fcntl.flock`). Lock files in `issues/.locks/`. See [#203](https://github.com/keboola/agnes-the-ai-analyst/issues/203).
 ## Data Profiler
@ -1967,7 +1967,7 @@ The Corporate Memory page at `/corporate-memory` provides:
 - **No credentials stored**: Knowledge items are filtered before storage
 - **Source attribution**: Items track which users contributed (displayed as avatar initials)
 - **Read-only for analysts**: `/data/corporate-memory/` is only writable by data-ops group
- **Atomic writes**: All JSON file updates use `tempfile.mkstemp()` + `os.replace()` to prevent corruption. **Critical:** always call `os.fchmod(fd, 0o660)` (or appropriate mode) immediately after `mkstemp()` — otherwise the default `0600` mode overrides the POSIX ACL mask to `---`, breaking group-based access for other services. See [#203](https://github.com/your-org/ai-data-analyst/issues/203).
+- **Atomic writes**: All JSON file updates use `tempfile.mkstemp()` + `os.replace()` to prevent corruption. **Critical:** always call `os.fchmod(fd, 0o660)` (or appropriate mode) immediately after `mkstemp()` — otherwise the default `0600` mode overrides the POSIX ACL mask to `---`, breaking group-based access for other services. See [#203](https://github.com/keboola/agnes-the-ai-analyst/issues/203).
 ## Session Collector
--- a/docs/auto-install.md
+++ b/docs/auto-install.md
@ -3,8 +3,8 @@
 Step-by-step deployment of AI Data Analyst on a clean Ubuntu 24.04 VM.
 Two repos are involved:
- **OSS repo** (public/private): application code (`padak/tmp_oss`)
+- **OSS repo** (public/private): application code (`keboola/agnes-the-ai-analyst`)
- **Instance repo** (private): your config, secrets template, data schema (`padak/tmp_oss_cfg`)
+- **Instance repo** (private): your config, secrets template, data schema (your private instance config repo)
 ## Architecture on Server
--- a/docs/superpowers/plans/2026-04-08-final-integration-fixes.md
+++ b/docs/superpowers/plans/2026-04-08-final-integration-fixes.md
@ -0,0 +1,512 @@
 # Final Integration Fixes — Implementation Plan
 > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
 **Goal:** Fix all remaining integration gaps after the v2 refactoring so the system is fully operational — Jira webhooks, Docker services, dynamic login, profiler auto-trigger, scheduler auth.
 **Architecture:** Five independent fixes targeting: (1) Jira webhook FastAPI adapter, (2) Docker compose service entries, (3) dynamic auth provider detection on login page, (4) profiler integration with sync pipeline, (5) scheduler auto-authentication.
 **Tech Stack:** Python 3.11+, FastAPI, DuckDB, Docker Compose, httpx
 ---
 ### Task 1: Jira Webhook FastAPI Adapter
 **Files:**
 - Create: `app/api/jira_webhooks.py`
 - Modify: `app/main.py`
 - Modify: `connectors/jira/service.py` (add `JIRA_WEBHOOK_SECRET` to `_JiraConfig`)
 - Test: `tests/test_jira_webhooks.py`
 - [ ] **Step 1: Add JIRA_WEBHOOK_SECRET to config**
 ```python
 # connectors/jira/service.py — add to _JiraConfig class (line ~24)
 JIRA_WEBHOOK_SECRET = os.environ.get("JIRA_WEBHOOK_SECRET", "")
 DEBUG = os.environ.get("DEBUG", "").lower() in ("1", "true")
 ```
 - [ ] **Step 2: Write the failing test**
 ```python
 # tests/test_jira_webhooks.py
 """Tests for Jira webhook FastAPI adapter."""
 import hashlib
 import hmac
 import json
 import os
 import pytest
 from fastapi.testclient import TestClient
@pytest.fixture
 def webhook_client(tmp_path):
    os.environ["DATA_DIR"] = str(tmp_path)
    os.environ["JWT_SECRET_KEY"] = "test-secret-32chars-minimum!!"
    os.environ["JIRA_WEBHOOK_SECRET"] = "test-webhook-secret"
    (tmp_path / "state").mkdir()
    (tmp_path / "extracts" / "jira" / "raw" / "webhook_events").mkdir(parents=True)
    from app.main import create_app
    app = create_app()
    return TestClient(app)
 def _sign(payload: bytes, secret: str = "test-webhook-secret") -> str:
    return "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
 class TestJiraWebhook:
    def test_health(self, webhook_client):
        resp = webhook_client.get("/webhooks/jira/health")
        assert resp.status_code == 200
        assert resp.json()["status"] == "ok"
    def test_missing_signature_401(self, webhook_client):
        resp = webhook_client.post("/webhooks/jira", json={"webhookEvent": "test"})
        assert resp.status_code == 401
    def test_invalid_signature_401(self, webhook_client):
        body = json.dumps({"webhookEvent": "test"}).encode()
        resp = webhook_client.post(
            "/webhooks/jira", content=body,
            headers={"X-Hub-Signature-256": "sha256=invalid", "Content-Type": "application/json"},
        )
        assert resp.status_code == 401
    def test_valid_signature_accepted(self, webhook_client):
        body = json.dumps({
            "webhookEvent": "jira:issue_updated",
            "issue": {"key": "TEST-1", "fields": {"summary": "Test"}},
        }).encode()
        sig = _sign(body)
        resp = webhook_client.post(
            "/webhooks/jira", content=body,
            headers={"X-Hub-Signature-256": sig, "Content-Type": "application/json"},
        )
        # 200 or 503 (Jira not configured) — but NOT 401
        assert resp.status_code in (200, 503)
    def test_empty_payload_400(self, webhook_client):
        body = b""
        sig = _sign(body)
        resp = webhook_client.post(
            "/webhooks/jira", content=body,
            headers={"X-Hub-Signature-256": sig, "Content-Type": "application/json"},
        )
        assert resp.status_code == 400
 ```
 - [ ] **Step 3: Run test to verify it fails**
 Run: `pytest tests/test_jira_webhooks.py -v`
 Expected: FAIL — no `/webhooks/jira` route
 - [ ] **Step 4: Create the FastAPI adapter**
 ```python
 # app/api/jira_webhooks.py
 """Jira webhook endpoint — FastAPI adapter for connectors/jira/webhook.py logic."""
 import hashlib
 import hmac
 import json
 import logging
 from datetime import datetime
 from pathlib import Path
 from fastapi import APIRouter, HTTPException, Request
 from connectors.jira.service import Config, get_jira_service
 logger = logging.getLogger(__name__)
 router = APIRouter(prefix="/webhooks", tags=["webhooks"])
 WEBHOOK_LOG_DIR = Config.JIRA_DATA_DIR / "webhook_events"
 def _verify_signature(payload: bytes, signature: str | None) -> bool:
    secret = Config.JIRA_WEBHOOK_SECRET
    if not secret:
        logger.warning("JIRA_WEBHOOK_SECRET not configured, skipping verification")
        return True
    if not signature:
        return False
    if signature.startswith("sha256="):
        signature = signature[7:]
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)
 def _log_event(event_data: dict) -> None:
    try:
        WEBHOOK_LOG_DIR.mkdir(parents=True, exist_ok=True)
        ts = datetime.utcnow().strftime("%Y%m%d_%H%M%S_%f")
        event_type = event_data.get("webhookEvent", "unknown").replace(":", "_")
        path = WEBHOOK_LOG_DIR / f"{ts}_{event_type}.json"
        path.write_text(json.dumps(event_data, indent=2, default=str))
    except Exception as e:
        logger.warning("Failed to log webhook event: %s", e)
@router.post("/jira")
 async def receive_jira_webhook(request: Request):
    payload = await request.body()
    signature = request.headers.get("X-Hub-Signature-256") or request.headers.get("X-Hub-Signature")
    if not _verify_signature(payload, signature):
        raise HTTPException(status_code=401, detail="Invalid signature")
    try:
        event_data = json.loads(payload) if payload else None
    except (json.JSONDecodeError, ValueError):
        raise HTTPException(status_code=400, detail="Invalid JSON")
    if not event_data:
        raise HTTPException(status_code=400, detail="Empty payload")
    _log_event(event_data)
    webhook_event = event_data.get("webhookEvent", "unknown")
    issue_key = event_data.get("issue", {}).get("key", "unknown")
    logger.info("Received webhook: %s for %s", webhook_event, issue_key)
    jira_service = get_jira_service()
    if not jira_service.is_configured():
        raise HTTPException(status_code=503, detail="Jira not configured")
    success = jira_service.process_webhook_event(event_data)
    if success:
        return {"status": "ok", "event": webhook_event, "issue": issue_key}
    raise HTTPException(status_code=500, detail="Failed to process event")
@router.get("/jira/health")
 async def jira_webhook_health():
    jira_service = get_jira_service()
    return {
        "status": "ok",
        "configured": jira_service.is_configured(),
        "webhook_secret_set": bool(Config.JIRA_WEBHOOK_SECRET),
        "jira_domain": Config.JIRA_DOMAIN or "(not set)",
    }
 ```
 - [ ] **Step 5: Register router in app/main.py**
 ```python
 # After existing router imports, add:
 from app.api.jira_webhooks import router as jira_webhooks_router
 # In create_app(), add:
 app.include_router(jira_webhooks_router)
 ```
 - [ ] **Step 6: Run tests to verify they pass**
 Run: `pytest tests/test_jira_webhooks.py -v`
 Expected: 5 passed
 - [ ] **Step 7: Commit**
 ```bash
 git add app/api/jira_webhooks.py app/main.py connectors/jira/service.py tests/test_jira_webhooks.py
 git commit -m "feat: Jira webhook FastAPI adapter — replaces Flask Blueprint"
 ```
 ---
 ### Task 2: Docker Compose — Add Missing Services + Scheduler Auth
 **Files:**
 - Modify: `docker-compose.yml`
 - Modify: `services/scheduler/__main__.py`
 - Delete: `services/catalog_refresh/`, `services/data_refresh/` (empty dirs)
 - [ ] **Step 1: Add missing services to docker-compose.yml**
 Add after `telegram-bot` service:
 ```yaml
  ws-gateway:
    build: .
    command: python -m services.ws_gateway
    volumes:
      - data:/data
    env_file: .env
    environment:
      - DATA_DIR=/data
    depends_on:
      - app
    profiles:
      - full
    restart: unless-stopped
  corporate-memory:
    build: .
    command: python -m services.corporate_memory
    volumes:
      - data:/data
    env_file: .env
    environment:
      - DATA_DIR=/data
    depends_on:
      - app
    profiles:
      - full
    restart: unless-stopped
  session-collector:
    build: .
    command: python -m services.session_collector
    volumes:
      - data:/data
    env_file: .env
    environment:
      - DATA_DIR=/data
    depends_on:
      - app
    profiles:
      - full
    restart: unless-stopped
 ```
 - [ ] **Step 2: Fix scheduler auto-auth**
 In `services/scheduler/__main__.py`, add a token auto-fetch for when `SCHEDULER_API_TOKEN` is not set:
 ```python
 # After line 25 (SCHEDULER_API_TOKEN = ...), add:
 def _get_auth_token() -> str:
    """Get auth token — use SCHEDULER_API_TOKEN or auto-fetch from API."""
    token = SCHEDULER_API_TOKEN
    if token:
        return token
    # Auto-fetch: call /auth/token with SEED_ADMIN_EMAIL
    admin_email = os.environ.get("SEED_ADMIN_EMAIL", "")
    if not admin_email:
        logger.warning("No SCHEDULER_API_TOKEN or SEED_ADMIN_EMAIL — scheduler calls will be unauthenticated")
        return ""
    try:
        resp = httpx.post(f"{API_URL}/auth/token", json={"email": admin_email}, timeout=10)
        if resp.status_code == 200:
            token = resp.json().get("access_token", "")
            logger.info("Auto-fetched scheduler token for %s", admin_email)
            return token
    except Exception as e:
        logger.warning("Failed to fetch scheduler token: %s", e)
    return ""
 ```
 Update `_call_api` to use `_get_auth_token()`:
 ```python
 def _call_api(endpoint: str, method: str = "POST") -> bool:
    url = f"{API_URL}{endpoint}"
    headers = {}
    token = _get_auth_token()
    if token:
        headers["Authorization"] = f"Bearer {token}"
    # ... rest unchanged
 ```
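As written, `_call_api` calls `_get_auth_token()` on every request, so when `SCHEDULER_API_TOKEN` is unset each scheduled call also triggers a POST to `/auth/token`. One possible refinement, which is not part of this plan, is to cache the fetched token. The sketch below uses a hypothetical `TokenCache` class; the one-hour default TTL is an assumption, since the plan does not state how long tokens from `/auth/token` remain valid.

```python
import time
from typing import Callable


class TokenCache:
    """Cache a token from `fetch` and reuse it until `ttl_seconds` have passed."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float = 3600.0):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._token = ""
        self._fetched_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        expired = now - self._fetched_at >= self._ttl
        if not self._token or expired:
            token = self._fetch()
            if token:
                self._token = token
                self._fetched_at = now
            else:
                # Fetch failed: return "" now and retry on the next call.
                self._token = ""
        return self._token
```

In the scheduler this could wrap the auto-fetch branch, for example `TokenCache(fetch=_get_auth_token)`, with `_call_api` reading `cache.get()` instead of calling the function directly.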
 - [ ] **Step 3: Add SEED_ADMIN_EMAIL to scheduler in docker-compose.yml**
 ```yaml
  scheduler:
    # ... existing config ...
    environment:
      - DATA_DIR=/data
      - API_URL=http://app:8000
      - SEED_ADMIN_EMAIL=${SEED_ADMIN_EMAIL:-}
 ```
 - [ ] **Step 4: Delete empty service directories**
 ```bash
 rm -rf services/catalog_refresh/ services/data_refresh/
 ```
 - [ ] **Step 5: Commit**
 ```bash
 git add docker-compose.yml services/scheduler/__main__.py
 git add -A services/catalog_refresh/ services/data_refresh/
 git commit -m "feat: add Docker services (ws-gateway, corporate-memory, session-collector) + scheduler auto-auth"
 ```
 ---
 ### Task 3: Dynamic Auth Providers on Login Page
 **Files:**
 - Modify: `app/web/router.py`
 - [ ] **Step 1: Update login_page in router.py**
 Replace the hard-coded providers list (lines 152-158):
 ```python
@router.get("/login", response_class=HTMLResponse)
 async def login_page(request: Request):
    providers = []
    # Google OAuth — available if credentials configured
    try:
        from app.auth.providers.google import is_available as google_available
        if google_available():
            providers.append({"name": "google", "display_name": "Google", "icon": "google"})
    except Exception:
        pass
    # Password auth — always available
    providers.append({"name": "password", "display_name": "Email & Password", "icon": "key"})
    # Email magic link — available if configured
    try:
        from app.auth.providers.email import is_available as email_available
        if email_available():
            providers.append({"name": "email", "display_name": "Email Link", "icon": "mail"})
    except Exception:
        pass
    ctx = _build_context(request, providers=providers)
    return templates.TemplateResponse(request, "login.html", ctx)
 ```
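The detection pattern in `login_page` can be reduced to a small, testable function: each provider is paired with an availability check, and a check that raises is treated the same as one that returns `False`. This is a sketch only. `available_providers` and the `Provider` alias are hypothetical names, and the real handler imports each `is_available` lazily inside its own `try` block.

```python
from typing import Callable

Provider = dict[str, str]


def available_providers(
    candidates: list[tuple[Provider, Callable[[], bool]]],
) -> list[Provider]:
    """Return providers whose availability check passes, in the given order."""
    result = []
    for provider, is_available in candidates:
        try:
            if is_available():
                result.append(provider)
        except Exception:
            # A broken or unconfigured provider is skipped, not fatal.
            continue
    return result
```

Password auth, which the handler treats as always available, would be passed with a check such as `lambda: True`.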
 - [ ] **Step 2: Verify login page still renders**
 Run: `pytest tests/test_api_complete.py::TestWebUI::test_login_page -v`
 Expected: PASS
 - [ ] **Step 3: Commit**
 ```bash
 git add app/web/router.py
 git commit -m "feat: dynamic auth provider detection on login page"
 ```
 ---
 ### Task 4: Profiler Auto-Trigger After Sync
 **Files:**
 - Modify: `app/api/sync.py`
 - Modify: `app/api/catalog.py`
 - [ ] **Step 1: Add profiler call after orchestrator rebuild in sync.py**
 After the orchestrator rebuild line in `_run_sync`, add:
 ```python
        # Auto-profile synced tables
        try:
            from src.profiler import profile_table, TableInfo
            from src.db import get_system_db
            from src.repositories.profiles import ProfileRepository
            from pathlib import Path
            data_dir = Path(os.environ.get("DATA_DIR", "./data"))
            extracts_dir = data_dir / "extracts"
            sys_conn = get_system_db()
            try:
                profile_repo = ProfileRepository(sys_conn)
                # Profile each synced table from extract parquets
                for source_name, table_names in views.items():
                    for table_name in table_names[:10]:  # Limit to 10 per sync to avoid timeout
                        pq_path = extracts_dir / source_name / "data" / f"{table_name}.parquet"
                        if not pq_path.exists():
                            continue
                        try:
                            table_info = TableInfo(name=table_name, table_id=table_name)
                            profile = profile_table(table_info, pq_path, [], {}, {})
                            profile_repo.save(table_name, profile)
                        except Exception as pe:
                            print(f"[SYNC] Profile {table_name}: {pe}", file=_sys.stderr, flush=True)
            finally:
                sys_conn.close()
            print("[SYNC] Profiler complete", file=_sys.stderr, flush=True)
        except Exception as e:
            print(f"[SYNC] Profiler skipped: {e}", file=_sys.stderr, flush=True)
 ```
 - [ ] **Step 2: Add profile refresh endpoint to catalog.py**
 ```python
 # app/api/catalog.py — add after existing endpoints:
@router.post("/profile/{table_name}/refresh")
 async def refresh_profile(
    table_name: str,
    user: dict = Depends(get_current_user),
    conn: duckdb.DuckDBPyConnection = Depends(_get_db),
 ):
    """Re-generate profile for a table on demand."""
    import os as _os
    from pathlib import Path as _Path
    from src.profiler import profile_table, TableInfo
    from src.repositories.profiles import ProfileRepository
    data_dir = _Path(_os.environ.get("DATA_DIR", "./data"))
    extracts_dir = data_dir / "extracts"
    # Find parquet file
    candidates = list(extracts_dir.rglob(f"data/{table_name}.parquet"))
    if not candidates:
        raise HTTPException(status_code=404, detail=f"No parquet for '{table_name}'")
    try:
        table_info = TableInfo(name=table_name, table_id=table_name)
        profile = profile_table(table_info, candidates[0], [], {}, {})
        ProfileRepository(conn).save(table_name, profile)
        return {"status": "ok", "table": table_name, "columns": len(profile.get("columns", {}))}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Profile failed: {e}")
 ```
 - [ ] **Step 3: Commit**
 ```bash
 git add app/api/sync.py app/api/catalog.py
 git commit -m "feat: auto-profile after sync + on-demand profile refresh endpoint"
 ```
 ---
 ### Task 5: Final Verification
 - [ ] **Step 1: Run full test suite**
 ```bash
 pytest tests/ --ignore=tests/test_cli.py -v
 ```
 Expected: all pass
 - [ ] **Step 2: Redeploy to production**
 ```bash
 ssh -i ~/.ssh/google_compute_engine deploy@35.195.96.98 \
  'cd /home/deploy/app && GIT_SSH_COMMAND="ssh -i ~/.ssh/github_deploy -o StrictHostKeyChecking=no" git pull && sudo docker compose build --quiet && sudo docker compose up -d'
 ```
 - [ ] **Step 3: Verify on production**
 ```bash
 # Health
 curl http://35.195.96.98:8000/api/health
 # Jira webhook health
 curl http://35.195.96.98:8000/webhooks/jira/health
 # Login page
 curl -s http://35.195.96.98:8000/login | grep -o "password\|google\|email"
 # Scheduler logs (should show token auto-fetch)
 ssh deploy@35.195.96.98 'cd /home/deploy/app && sudo docker compose logs scheduler --tail 5'
 ```
 - [ ] **Step 4: Push**
 ```bash
 git push origin feature/v2-fastapi-duckdb-docker-cli
 git push fork feature/v2-fastapi-duckdb-docker-cli
 ```
--- a/docs/superpowers/plans/2026-04-08-security-hardening.md
+++ b/docs/superpowers/plans/2026-04-08-security-hardening.md
@ -0,0 +1,344 @@
 # Security Hardening — Implementation Plan
 > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
 **Goal:** Fix all critical/high security vulnerabilities found by audit before merging to main. Also update Python version, sync deps, fix Docker prod config.
 **Architecture:** 7 independent fix tasks targeting specific vulnerabilities. Each is self-contained.
 **Tech Stack:** Python 3.13, FastAPI, DuckDB, Docker
 ---
 ### Task 1: SQL Query Hardening
 **Files:**
 - Modify: `app/api/query.py`
 - Modify: `src/db.py` (add read-only analytics getter)
 - Test: existing `tests/test_security.py` should still pass
 - [ ] **Step 1: Add read-only analytics DB connection**
 In `src/db.py`, add after `get_analytics_db()`:
 ```python
 def get_analytics_db_readonly() -> duckdb.DuckDBPyConnection:
    """Read-only connection to analytics DB. Blocks writes and external access."""
    db_path = _get_data_dir() / "analytics" / "server.duckdb"
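    if not db_path.exists():
        # DuckDB cannot open a missing file in read-only mode; create an empty DB first.
        db_path.parent.mkdir(parents=True, exist_ok=True)
        duckdb.connect(str(db_path)).close()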
    conn = duckdb.connect(str(db_path), read_only=True)
    conn.execute("SET enable_external_access = false")
    return conn
 ```
 - [ ] **Step 2: Harden query endpoint**
 In `app/api/query.py`, replace `get_analytics_db()` with `get_analytics_db_readonly()`. Add to blocklist:
 ```python
 blocked = [
    "drop ", "delete ", "insert ", "update ", "alter ", "create ",
    "copy ", "attach ", "detach ", "load ", "install ",
    "export ", "import ", "pragma ",
    "read_csv", "read_json", "read_parquet(", "read_text",
    "write_csv", "write_parquet",
    "read_blob", "glob(", "read_ndjson",
    ";",
    "'/", '"/",  # Block absolute file paths in FROM clause
 ]
 ```
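The plan lists the blocked tokens but not how they are matched. A minimal sketch, assuming a case-insensitive substring match over the whitespace-normalized query. `is_blocked` and the shortened `BLOCKED` list are hypothetical names for illustration, not the code in `app/api/query.py`:

```python
# Hypothetical helper; the real check in app/api/query.py may differ.
BLOCKED = ["drop ", "copy ", "attach ", "read_csv", ";", "'/", '"/']


def is_blocked(sql: str) -> bool:
    """Return True if the query contains any blocked token (case-insensitive)."""
    # Collapse whitespace so "DROP\n TABLE" still matches "drop ".
    normalized = " ".join(sql.lower().split())
    return any(token in normalized for token in BLOCKED)


print(is_blocked("SELECT count(*) FROM orders"))   # False
print(is_blocked("SELECT * FROM '/etc/passwd'"))   # True
print(is_blocked("DROP\n TABLE orders"))           # True
```

Substring matching is coarse and can be sidestepped, which is presumably why the plan pairs it with the read-only connection from Step 1 rather than relying on it alone.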
 - [ ] **Step 3: Run tests**
 ```bash
 pytest tests/test_security.py tests/test_e2e_api.py -v --tb=short
 ```
 - [ ] **Step 4: Commit**
 ```bash
 git commit -m "security: query endpoint — read-only DB + disable external access + block file paths"
 ```
 ---
 ### Task 2: Upload Path Traversal Fix
 **Files:**
 - Modify: `app/api/upload.py`
 - [ ] **Step 1: Sanitize filenames**
 Replace lines 30-31 and 47-48 with:
 ```python
 # Sanitize filename — strip directory components to prevent traversal
 raw_name = file.filename or f"session_{uuid.uuid4().hex[:8]}.jsonl"
 filename = Path(raw_name).name  # strips ../../../ etc
 if not filename or filename.startswith("."):
    filename = f"upload_{uuid.uuid4().hex[:8]}"
 target = sessions_dir / filename
 ```
 Apply the same pattern to the artifact upload handler.
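Since both handlers need the same logic, it could live in one helper. The sketch below is an assumption about how that might look: `safe_filename` is a hypothetical name, and it uses `PurePosixPath` instead of `Path` so the behaviour is the same on every platform.

```python
import uuid
from pathlib import PurePosixPath
from typing import Optional


def safe_filename(raw_name: Optional[str], prefix: str = "upload") -> str:
    """Strip directory components; fall back to a random name if nothing usable remains."""
    name = PurePosixPath(raw_name or "").name
    # Reject empty names and dotfiles (this also covers "." and "..").
    if not name or name.startswith("."):
        name = f"{prefix}_{uuid.uuid4().hex[:8]}"
    return name


print(safe_filename("../../etc/passwd"))           # passwd
print(safe_filename("notes.jsonl"))                # notes.jsonl
print(safe_filename("..").startswith("upload_"))   # True
```

The result is always a single path component, so `sessions_dir / safe_filename(file.filename)` stays inside `sessions_dir`.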
 - [ ] **Step 2: Commit**
 ```bash
 git commit -m "security: sanitize upload filenames — prevent path traversal"
 ```
 ---
 ### Task 3: Script Sandbox Hardening
 **Files:**
 - Modify: `app/api/scripts.py`
 - [ ] **Step 1: Add AST-based validation**
 Add before the blocklist check:
 ```python
 import ast
 # Define the blocklists before the validation loop that reads them.
 BLOCKED_MODULES = {"os", "sys", "subprocess", "shutil", "ctypes", "importlib", "socket",
                   "requests", "urllib", "http", "signal", "pathlib", "builtins"}
 BLOCKED_FUNCTIONS = {"exec", "eval", "compile", "open", "globals", "locals",
                     "getattr", "setattr", "delattr", "breakpoint", "__import__"}
 # AST-based validation: catches obfuscated imports.
 # `source` is the submitted script text.
 try:
    tree = ast.parse(source)
 except SyntaxError as e:
    raise HTTPException(status_code=400, detail=f"Script syntax error: {e}")
 for node in ast.walk(tree):
    if isinstance(node, ast.Import):
        for alias in node.names:
            if alias.name.split(".")[0] in BLOCKED_MODULES:
                raise HTTPException(status_code=400, detail=f"Blocked import: {alias.name}")
    elif isinstance(node, ast.ImportFrom):
        if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
            raise HTTPException(status_code=400, detail=f"Blocked import: {node.module}")
    elif isinstance(node, ast.Call):
        if isinstance(node.func, ast.Name) and node.func.id in BLOCKED_FUNCTIONS:
            raise HTTPException(status_code=400, detail=f"Blocked function: {node.func.id}")
 ```
 Keep the string blocklist as a secondary check.
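To illustrate what the AST pass adds over string matching, here is a standalone sketch of the same walk that collects violations instead of raising. `find_violations` and the shortened blocklists are hypothetical, chosen only to keep the example small:

```python
import ast

# Shortened, illustrative blocklists (not the full sets from the plan).
BLOCKED_MODULES = {"os", "subprocess"}
BLOCKED_FUNCTIONS = {"eval", "exec", "__import__"}


def find_violations(source: str) -> list[str]:
    """Return the names of blocked imports and calls found in the script."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            violations += [a.name for a in node.names
                           if a.name.split(".")[0] in BLOCKED_MODULES]
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(node.module)
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id in BLOCKED_FUNCTIONS:
                violations.append(node.func.id)
    return violations


# Extra whitespace defeats a naive "import os" substring check, but not the parser.
print(find_violations("import   os.path as p"))       # ['os.path']
print(find_violations("from subprocess import run"))  # ['subprocess']
print(find_violations("x = eval('1+1')"))             # ['eval']
print(find_violations("import math"))                 # []
```

The walk only sees names that appear literally in the syntax tree, so it narrows the attack surface rather than providing a full sandbox.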
 - [ ] **Step 2: Commit**
 ```bash
 git commit -m "security: AST-based script validation — catches obfuscated imports"
 ```
 ---
 ### Task 4: Password Hashing + Auth Fixes
 **Files:**
 - Modify: `app/auth/providers/password.py` (remove SHA256 fallback)
 - Modify: `app/auth/providers/google.py` (add secure cookie flag)
 - Modify: `app/auth/dependencies.py` (fix get_optional_user bug)
 - Modify: `app/auth/jwt.py` (add secret length validation)
 - Modify: `pyproject.toml` (add missing deps)
 - [ ] **Step 1: Remove SHA256 fallback in password.py**
 Replace the try/except ImportError block with:
 ```python
 from argon2 import PasswordHasher
 ph = PasswordHasher()
 hashed = ph.hash(request.password)
 ```
 Remove all `except ImportError: import hashlib` blocks. Same in `app/auth/router.py` if present.
 - [ ] **Step 2: Add secure flag to Google OAuth cookie**
 In `app/auth/providers/google.py`, change set_cookie to:
 ```python
 is_https = os.environ.get("HTTPS", "").lower() in ("1", "true") or request.url.scheme == "https"
 response.set_cookie(
    key="access_token", value=jwt_token,
    httponly=True, max_age=86400 * 30, samesite="lax",
    secure=is_https,
 )
 ```
 - [ ] **Step 3: Fix get_optional_user argument bug**
 In `app/auth/dependencies.py`, change line 68:
 ```python
 # Before (wrong):
 return await get_current_user(authorization, conn)
 # After (correct):
 return await get_current_user(request=None, authorization=authorization, conn=conn)
 ```
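The bug is a positional-argument mismatch. The sketch below reproduces it with a hypothetical stand-in for `get_current_user`; the parameter order `(request, authorization, conn)` is an assumption inferred from the corrected call above, and the real dependency does more than echo its arguments.

```python
import asyncio


# Hypothetical stand-in with the parameter order implied by the fix above.
async def get_current_user(request=None, authorization=None, conn=None):
    return {"request": request, "authorization": authorization, "conn": conn}


# Positional call: the header lands in `request` and the connection in `authorization`.
wrong = asyncio.run(get_current_user("Bearer abc", "db-conn"))
# Keyword call: every value reaches the parameter it was meant for.
right = asyncio.run(
    get_current_user(request=None, authorization="Bearer abc", conn="db-conn")
)

print(wrong)  # {'request': 'Bearer abc', 'authorization': 'db-conn', 'conn': None}
print(right)  # {'request': None, 'authorization': 'Bearer abc', 'conn': 'db-conn'}
```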
 - [ ] **Step 4: Add JWT secret validation**
 In `app/auth/jwt.py`, add after SECRET_KEY:
 ```python
 if len(SECRET_KEY) < 32 and not os.environ.get("TESTING"):
    import warnings
    warnings.warn("JWT_SECRET_KEY is less than 32 characters — insecure for production", stacklevel=2)
 ```
 - [ ] **Step 5: Sync pyproject.toml deps**
 Add to pyproject.toml dependencies:
 ```toml
 "authlib>=1.3.0",
 "argon2-cffi>=23.1.0",
 ```
 - [ ] **Step 6: Commit**
 ```bash
 git commit -m "security: fix password hashing, OAuth cookie, JWT validation, optional_user bug"
 ```
 ---
 ### Task 5: CORS + Session Middleware
 **Files:**
 - Modify: `app/main.py`
 - [ ] **Step 1: Fix CORS**
 Replace wildcard CORS with environment-configured origins:
 ```python
 cors_origins = os.environ.get("CORS_ORIGINS", "http://localhost:3000,http://localhost:8000").split(",")
 app.add_middleware(
    CORSMiddleware,
    allow_origins=[o.strip() for o in cors_origins],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
 )
 ```
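One edge case in the parsing above: a trailing comma or an empty `CORS_ORIGINS` value leaves an empty string in the origin list. A small sketch of a stricter parser, where `parse_origins` is a hypothetical helper and not part of the plan:

```python
def parse_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list, trimming whitespace and dropping empties."""
    return [o.strip() for o in raw.split(",") if o.strip()]


print(parse_origins("https://a.example.com, https://b.example.com,"))
# ['https://a.example.com', 'https://b.example.com']
print(parse_origins(""))  # []
```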
 - [ ] **Step 2: Fix SessionMiddleware secret**
 ```python
 session_secret = os.environ.get("SESSION_SECRET", os.environ.get("JWT_SECRET_KEY", ""))
 if not session_secret:
    import secrets
    session_secret = secrets.token_hex(32)
 app.add_middleware(SessionMiddleware, secret_key=session_secret)
 ```
 - [ ] **Step 3: Commit**
 ```bash
 git commit -m "security: CORS from env config + session secret validation"
 ```
 ---
 ### Task 6: Docker Production Config
 **Files:**
 - Modify: `docker-compose.yml` (remove --reload)
 - Modify: `Dockerfile` (upgrade Python)
 - [ ] **Step 1: Remove --reload from docker-compose.yml**
 ```yaml
 # Before:
 command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
 # After:
 command: uvicorn app.main:app --host 0.0.0.0 --port 8000
 ```
 Create `docker-compose.override.yml` for dev:
 ```yaml
 services:
  app:
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    volumes:
      - .:/app
 ```
 - [ ] **Step 2: Upgrade Dockerfile to Python 3.13**
 ```dockerfile
 FROM python:3.13-slim
 ```
 - [ ] **Step 3: Commit**
 ```bash
 git commit -m "chore: Docker production config — no reload, Python 3.13"
 ```
 ---
 ### Task 7: Stale Docs Cleanup + datetime.utcnow Fix
 **Files:**
 - Modify: `CLAUDE.md` (fix test count)
 - Delete or update: stale docs referencing Flask
 - Modify: `app/api/jira_webhooks.py`, `src/profiler.py` (utcnow → now(UTC))
 - [ ] **Step 1: Fix CLAUDE.md test count**
 Update test count to current number.
 - [ ] **Step 2: Delete stale docs**
 ```bash
 rm docs/superpowers/plans/*.md  # agent plans, not needed in repo
 ```
 - [ ] **Step 3: Fix datetime.utcnow deprecation**
 Replace `datetime.utcnow()` with `datetime.now(timezone.utc)` in:
 - `app/api/jira_webhooks.py`
 - `src/profiler.py` (lines 1213, 1379)
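The replacement changes the type of value produced: `datetime.now(timezone.utc)` is timezone-aware, while `datetime.utcnow()` returned a naive datetime. A short sketch of the difference, using only the standard library; the `naive` value is built with `replace(tzinfo=None)` to mimic the old behaviour without calling the deprecated function:

```python
from datetime import datetime, timezone

aware = datetime.now(timezone.utc)
naive = aware.replace(tzinfo=None)  # same shape as an old utcnow() result

print(aware.tzinfo)                           # UTC
print(aware.isoformat().endswith("+00:00"))   # True
print(naive.tzinfo)                           # None

# Mixing the two kinds in a comparison raises TypeError.
try:
    naive < aware
except TypeError as exc:
    print("TypeError:", exc)
```

Any code that compares these timestamps with stored naive values, or parses them back from strings, may need the same update in the same change.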
 - [ ] **Step 4: Commit**
 ```bash
 git commit -m "chore: fix stale docs, test count, datetime deprecation"
 ```
 ---
 ## Execution order
 ```
 Task 1 (SQL)        ─┐
 Task 2 (Upload)      ├─ 3 parallel agents
 Task 3 (Scripts)    ─┘
 Task 4 (Auth)       ─┐
 Task 5 (CORS)        ├─ 3 parallel agents
 Task 6 (Docker)     ─┘
 Task 7 (Docs)       ── sequential after above
 ```
 ## Verification
 ```bash
 # All tests pass
 pytest tests/ --ignore=tests/test_cli.py -v
 # Security spot checks
 python3 -c "from app.api.query import *"  # no errors
 curl -X POST http://localhost:8000/api/query -H "Content-Type: application/json" -d '{"sql":"SELECT * FROM \"/etc/passwd\""}' # should fail
 # Docker build
 docker build -t test .
 ```
--- a/docs/testing/vm_test_plan.md
+++ b/docs/testing/vm_test_plan.md
@ -17,7 +17,7 @@ End-to-end test of the full platform on a clean VM with a new GitHub repository.
 **On your local machine:**
 ```bash
-cd /Users/padak/github/oss-ai-data-analyst
+cd /path/to/agnes-the-ai-analyst
 # Create repo on GitHub (pick org/name)
 gh repo create YOUR_ORG/ai-data-analyst --private --source=. --push
--- a/infra/main.tf
+++ b/infra/main.tf
@ -70,12 +70,12 @@ locals {
    echo "=== Cloning repository ==="
    APP_DIR="/opt/data-analyst"
    if [ ! -d "$APP_DIR" ]; then
-      git clone https://github.com/padak/tmp_oss.git "$APP_DIR"
+      git clone https://github.com/keboola/agnes-the-ai-analyst.git "$APP_DIR"
      cd "$APP_DIR"
-      git checkout feature/v2-fastapi-duckdb-docker-cli
+      git checkout main
    else
      cd "$APP_DIR"
-      git pull origin feature/v2-fastapi-duckdb-docker-cli || true
+      git pull origin main || true
    fi
    echo "=== Creating .env ==="
--- a/infra/terraform.tfvars.example
+++ b/infra/terraform.tfvars.example
@ -1,5 +1,5 @@
 # Copy to terraform.tfvars and fill in values
-project_id          = "kids-ai-data-analysis"
+project_id          = "your-gcp-project"
 region              = "europe-north1"
 zone                = "europe-north1-a"
 machine_type        = "e2-small"       # 2 vCPU, 2GB RAM, ~$7/mo