H1 - Sanitize dev_docs/ for public release:
- Replace all real employee names with generic placeholders
(padak->admin1, matejkys->admin2, dasa->admin3, petr->john, etc.)
- Replace GCP project ID (kids-ai-data-analysis -> your-gcp-project)
- Replace server hostname (data-broker-for-claude -> your-server)
- Replace real IP address (34.88.8.46 -> YOUR_SERVER_IP)
- Replace internal FQDN with placeholder
- Covers: security.md, server.md, disaster-recovery.md, desktop-app.md,
session_explore.md, plan-rsync-fix.md, draft/*.md
H3 - webapp-setup.sh: validate sudoers syntax BEFORE copying to /etc/sudoers.d
- Prevents broken sudo if syntax is invalid
- Uses install -m 440 for atomic copy with correct permissions
M1 - setup.sh: deploy user created with /usr/sbin/nologin instead of /bin/bash
- CI/CD service account does not need interactive shell
M2 - config/loader.py: warn on missing env vars, validate webapp_secret_key
- _resolve_env_refs now logs warnings for unset ${ENV_VAR} references
- _validate_config checks auth.webapp_secret_key is non-empty
- Prevents Flask signing sessions with empty secret key
All 118 tests pass.
# Service Connector - Integration of Internal APIs into Data Analyst Platform

## Context

The data analyst platform currently supports only data analysis (parquet files + DuckDB). We want to extend it so analysts can also interact with internal services (Purchase Order system, Invoicing, CRM) through Claude Code. This requires:

1. **API keys** delivered to the analyst's local machine (`.env` file)
2. **Skills** teaching Claude Code how to use each service's API (`.claude/rules/` markdown files)
3. **Seamless UX** - non-technical users click "Connect" in the web portal, everything else is automatic

Key constraints:
- All external services are internal apps (we can modify them)
- They already have Google OAuth and Bearer token/API key authentication
- They already have token generation UI
- We target 2-3 services initially
- Must reuse established patterns (sudo install, atomic JSON, sync_data.sh)

## Architecture Overview

```
User clicks "Connect" on your-instance.example.com
  |
  v
Webapp calls external service's internal token-exchange endpoint
  |  (service-to-service, shared secret)
  v
API key returned, stored in /data/service-connectors/connections.json
  |
  v
Webapp writes /home/{user}/.service_env (sudo install, mode 600)
Webapp writes /home/{user}/.claude_rules/sc_{service}.md (skill file)
  |
  v
Analyst runs sync_data.sh
  |
  v
  .service_env -> merged into ~/keboola-analysis/.env
  sc_*.md -> already synced with existing corporate memory rules sync
```

## Implementation Plan

### Phase 1: Service Registry & Infrastructure

**1.1 Create service registry config**
- File: `docs/setup/service_connectors.json`
- Defines available services: id, name, description, URLs, env var names, skill file name
- Deployed to `/data/docs/setup/` by deploy.sh
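
The registry schema is not spelled out in this plan, so the following is only a sketch of how the file might be loaded and validated. The field names (`name`, `description`, `base_url`, `env_var`, `skill_file`), the example URL, and the `load_registry` helper are assumptions made for illustration; only `PO_API_KEY` and `sc_purchase_orders.md` come from this document.

```python
import json
from pathlib import Path

# Hypothetical registry content; every field name here is an assumption.
EXAMPLE_REGISTRY = {
    "purchase_orders": {
        "name": "Purchase Orders",
        "description": "Internal purchase order system",
        "base_url": "https://po.example.com",
        "env_var": "PO_API_KEY",
        "skill_file": "sc_purchase_orders.md",
    }
}

REQUIRED_FIELDS = {"name", "description", "base_url", "env_var", "skill_file"}


def load_registry(path: Path) -> dict:
    """Load the registry, returning {} on a missing or malformed file.

    Entries that lack any required field are dropped rather than raising,
    so one bad entry does not hide the remaining services.
    """
    try:
        raw = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}
    if not isinstance(raw, dict):
        return {}
    return {
        service_id: entry
        for service_id, entry in raw.items()
        if isinstance(entry, dict) and REQUIRED_FIELDS <= entry.keys()
    }
```

Dropping incomplete entries instead of failing is one possible design choice; a stricter loader could refuse to start when the registry is invalid.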

**1.2 Create sudo helper script**
- File: `server/bin/install-service-env`
- Accepts: USERNAME, ENV_SOURCE_PATH, SKILLS_SOURCE_DIR
- Installs `.service_env` (mode 600) to user home
- Installs `sc_*.md` skill files to `.claude_rules/` (mode 600)
- Only removes `sc_*.md` files (leaves `km_*.md` from corporate memory intact)
- Template: `server/bin/install-user-rules` (63 lines, same structure)
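
The helper itself is a script run through sudo, but its skill-file handling can be sketched in Python to make the "only touch `sc_*.md`" rule concrete. The `install_skills` function and its parameters are hypothetical, and ownership changes (which the real helper would need) are left out of this sketch.

```python
import os
import shutil
from pathlib import Path


def install_skills(source_dir: Path, rules_dir: Path) -> list[str]:
    """Replace sc_*.md files in rules_dir with those from source_dir.

    Files that do not match sc_*.md (for example km_*.md from corporate
    memory) are never touched.
    """
    rules_dir.mkdir(parents=True, exist_ok=True)

    # Remove stale service-connector skills only.
    for stale in rules_dir.glob("sc_*.md"):
        stale.unlink()

    installed = []
    for src in sorted(source_dir.glob("sc_*.md")):
        dest = rules_dir / src.name
        shutil.copyfile(src, dest)
        os.chmod(dest, 0o600)  # owner-only, matching the plan
        installed.append(src.name)
    return installed
```

Removing first and copying second also covers the disconnect case: a user with no connected services ends up with no `sc_*.md` files at all.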

**1.3 Update sudoers**
- File: `server/sudoers-webapp` - add entry for `install-service-env`

**1.4 Update deploy.sh**
- Create `/data/service-connectors/` directory (www-data:data-ops, 2770)
- Deploy service registry and skill files
- Add new env vars to .env block: `SC_SECRET_PURCHASE_ORDERS`, `SC_SECRET_INVOICING`, `SC_SECRET_CRM`

**1.5 Add config entries**
- File: `webapp/config.py` - no new config class entries needed (secrets read directly with `os.environ.get()` in the service module, same pattern as sync_settings_service.py)
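
Reading the shared secrets straight from the environment could look roughly like this. The `get_shared_secret` helper is hypothetical; it assumes the env var name is always `SC_SECRET_` followed by the upper-cased service id, which matches the three names listed in 1.4.

```python
import os
from typing import Optional


def get_shared_secret(service_id: str) -> Optional[str]:
    """Return SC_SECRET_<SERVICE_ID> from the environment.

    Returns None when the variable is unset or empty, so callers can
    refuse to attempt a token exchange without a secret.
    """
    value = os.environ.get(f"SC_SECRET_{service_id.upper()}", "")
    return value or None
```

Treating an empty value the same as a missing one avoids sending `Authorization: Bearer ` with no secret to the service.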

### Phase 2: Backend Service

**2.1 Create service connector module**
- File: `webapp/service_connector_service.py`
- Pattern: follows `webapp/sync_settings_service.py` exactly

Key functions:
```python
from __future__ import annotations

import os
from pathlib import Path

# Data storage
CONNECTORS_DIR = Path(os.environ.get("CONNECTORS_DIR", "/data/service-connectors"))
CONNECTIONS_FILE = CONNECTORS_DIR / "connections.json"

# Core functions (signatures only; bodies omitted)
def get_available_services() -> dict: ...  # Load registry
def get_user_connections(username: str) -> dict: ...  # User's connection status
def connect_service(username, service_id, user_email) -> tuple[bool, str]: ...  # Token exchange + install
def disconnect_service(username, service_id) -> tuple[bool, str]: ...  # Revoke + cleanup
def check_service_health(service_id) -> dict: ...  # Health check

# Internal
def _exchange_token(service, user_email) -> dict | None: ...  # Call external service
def _revoke_token(service, token_id) -> bool: ...  # Call revoke endpoint
def _regenerate_user_env(username) -> bool: ...  # Write .service_env via sudo
def _install_service_skills(username) -> bool: ...  # Write sc_*.md via sudo
def _get_server_username(webapp_username) -> str: ...  # Reuse WEBAPP_TO_SERVER_USERNAME
```
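
Before `_regenerate_user_env` can hand a file to the sudo helper, something has to turn a user's connections into `.service_env` content. A minimal sketch of that rendering step follows; `render_service_env` is a hypothetical name, the `env_var` registry field is an assumption, and the newline check is an added precaution rather than something this plan specifies.

```python
def render_service_env(connections: dict, registry: dict) -> str:
    """Render KEY=value lines for every connected service of one user.

    `connections` maps service_id -> connection record (as stored in
    connections.json); `registry` maps service_id -> registry entry.
    """
    lines = []
    for service_id in sorted(connections):
        conn = connections[service_id]
        service = registry.get(service_id)
        api_key = conn.get("api_key") or ""
        if not service or not conn.get("connected") or not api_key:
            continue
        # A key containing a line break could inject extra variables.
        if "\n" in api_key or "\r" in api_key:
            continue
        lines.append(f'{service["env_var"]}={api_key}')
    return "\n".join(lines) + ("\n" if lines else "")
```

Sorting by service id keeps the output stable between runs, which makes the file easy to diff and test.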

Storage format (`connections.json`):
```json
{
  "john": {
    "purchase_orders": {
      "connected": true,
      "api_key": "pk_live_abc123...",
      "token_id": "tok_xyz789",
      "connected_at": "2026-02-16T12:00:00Z",
      "expires_at": "2026-05-17T12:00:00Z"
    }
  }
}
```

Note: API keys are stored in `connections.json` (protected by 660 permissions, owner www-data:data-ops). This follows the same approach as `telegram_users.json`, which stores chat_ids. For internal services, this is an acceptable security level.
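
Since `connections.json` holds every user's keys, a partial write would be costly. The plan reuses the atomic JSON pattern from `sync_settings_service.py`; that implementation is not shown here, so the following is only a sketch of the common write-to-temp-then-rename approach, with `write_json_atomic` as a hypothetical name.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, data: dict, mode: int = 0o660) -> None:
    """Write JSON so readers see either the old or the new file, never half."""
    # The temp file must live in the same directory so os.replace stays
    # on one filesystem and is therefore atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        os.chmod(tmp_name, mode)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

`tempfile.mkstemp` creates the file with owner-only permissions, so the key material is not exposed more widely before the final `chmod`.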

**2.2 Add API routes to webapp**
- File: `webapp/app.py` - add routes in `register_routes()`

```
GET  /api/service-connectors                      - List services + user connections
POST /api/service-connectors/connect              - Connect to a service {service_id}
POST /api/service-connectors/disconnect           - Disconnect {service_id}
GET  /api/service-connectors/health/<service_id>  - Health check
```

**2.3 Token exchange protocol**
What each external service needs to implement:

```
POST /api/internal/token-exchange
Authorization: Bearer <shared_secret>
Body: {"user_email": "john@your-domain.com", "ttl_days": 90}
Response: {"status": "ok", "api_key": "...", "token_id": "...", "expires_at": "..."}

POST /api/internal/token-revoke
Authorization: Bearer <shared_secret>
Body: {"token_id": "tok_xyz789"}
Response: {"status": "ok"}
```
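
On the webapp side, the exchange call could be assembled roughly as follows, using only the standard library. The function names and the `base_url` parameter are hypothetical; the path, header, and JSON fields are the ones listed in the protocol above.

```python
import json
import urllib.request
from typing import Optional


def build_exchange_request(
    base_url: str, shared_secret: str, user_email: str, ttl_days: int = 90
) -> urllib.request.Request:
    """Build the POST request for the token-exchange endpoint."""
    body = json.dumps({"user_email": user_email, "ttl_days": ttl_days})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/internal/token-exchange",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {shared_secret}",
            "Content-Type": "application/json",
        },
    )


def parse_exchange_response(raw: bytes) -> Optional[dict]:
    """Return the token fields, or None if the response is not usable."""
    try:
        payload = json.loads(raw)
    except ValueError:
        return None
    if not isinstance(payload, dict) or payload.get("status") != "ok":
        return None
    required = ("api_key", "token_id", "expires_at")
    if not all(payload.get(key) for key in required):
        return None
    return {key: payload[key] for key in required}
```

The request would then be sent with `urllib.request.urlopen(request, timeout=10)` and the response body passed to `parse_exchange_response`; the timeout value is an arbitrary choice for this sketch.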

### Phase 3: Dashboard UI

**3.1 Add Service Connectors card to dashboard**
- File: `webapp/templates/dashboard.html`
- New card in the existing 2-column layout (same pattern as Data Settings and Telegram cards)
- Shows grid of service cards with Connect/Disconnect buttons
- Connected = green badge + expiry date
- AJAX calls to `/api/service-connectors/*` endpoints
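
The badge only needs a connection's status and expiry date, so the data sent to the browser can omit the API key entirely. The following sketch shows one way to derive that view from a stored connection record; `summarize_connection` and the `expired` field are assumptions, not part of this plan.

```python
from datetime import datetime, timezone


def summarize_connection(conn: dict, now: datetime) -> dict:
    """Return the dashboard view of one connection, without the API key.

    `now` must be timezone-aware, since expires_at is stored in UTC.
    """
    expires_raw = conn.get("expires_at")
    expired = False
    if expires_raw:
        # fromisoformat() only accepts a trailing "Z" on newer Python
        # versions, so it is normalised to an explicit UTC offset.
        expires = datetime.fromisoformat(expires_raw.replace("Z", "+00:00"))
        expired = expires <= now
    return {
        "connected": bool(conn.get("connected")) and not expired,
        "expires_at": expires_raw,
        "expired": expired,
    }
```

In the webapp this would typically be called with `datetime.now(timezone.utc)`.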

### Phase 4: Sync & Skills

**4.1 Extend sync_data.sh**
- File: `scripts/sync_data.sh`
- Add block after corporate memory rules sync (line ~418):
  1. Download `~/.service_env` from server via SCP
  2. If exists: merge into local `.env` using marker comments (`# --- SERVICE CONNECTOR START/END ---`)
  3. If not exists: clean old service connector block from `.env`

```bash
# --- Sync service connector credentials ---
SC_TMP="$(mktemp)"  # private temp file; avoids a predictable /tmp path for secrets
if [ -f ./.env ]; then
    # Remove the old block first, so a missing .service_env leaves no stale keys
    sed -i.bak '/^# --- SERVICE CONNECTOR START ---$/,/^# --- SERVICE CONNECTOR END ---$/d' ./.env
    rm -f ./.env.bak  # the backup would otherwise keep a copy of the old keys
fi
if scp -q data-analyst:~/.service_env "$SC_TMP" 2>/dev/null; then
    # Append the new block with markers
    { echo "# --- SERVICE CONNECTOR START ---"; cat "$SC_TMP"; echo "# --- SERVICE CONNECTOR END ---"; } >> ./.env
fi
rm -f "$SC_TMP"
```

Note: `sc_*.md` skills are already synced by the existing corporate memory sync block (line 410: `scp -rq "data-analyst:~/.claude_rules/"* .claude/rules/`).
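
The marker-based merge is easy to get subtly wrong, so it may help to pin down the intended behaviour as a pure function that can be unit tested. This is a Python sketch of the same logic, not a replacement for the shell block; `merge_service_env` is a hypothetical name, and `None` stands for "no `.service_env` on the server".

```python
from typing import Optional

START = "# --- SERVICE CONNECTOR START ---"
END = "# --- SERVICE CONNECTOR END ---"


def merge_service_env(env_text: str, service_env: Optional[str]) -> str:
    """Drop any existing marker block, then append a fresh one if given."""
    kept = []
    inside = False
    for line in env_text.splitlines():
        if line == START:
            inside = True          # start skipping the old block
        elif line == END and inside:
            inside = False         # old block ends; marker itself is dropped
        elif not inside:
            kept.append(line)
    if service_env is not None:
        kept.append(START)
        kept.extend(service_env.splitlines())
        kept.append(END)
    return "\n".join(kept) + ("\n" if kept else "")
```

Running the merge twice with the same input yields the same result, which is the property that keeps repeated `sync_data.sh` runs from piling up duplicate blocks.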

**4.2 Create skill files**
- Directory: `docs/service_connector_skills/`
- Files: `sc_purchase_orders.md`, `sc_invoicing.md`, `sc_crm.md`
- Content: Authentication setup, available endpoints, common patterns, data models
- Deployed to `/data/docs/service_connector_skills/` by deploy.sh
- Installed to user's `.claude_rules/` when they connect

### Phase 5: Tests

**5.1 Unit tests**
- File: `tests/test_service_connector_service.py`
- Test: connect/disconnect flow, env generation, registry loading, error handling

## Files to Create

| File | Purpose |
|------|---------|
| `webapp/service_connector_service.py` | Core service (connect, disconnect, env generation) |
| `docs/setup/service_connectors.json` | Service registry config |
| `docs/service_connector_skills/sc_purchase_orders.md` | PO API skill |
| `server/bin/install-service-env` | Sudo helper for env + skills install |
| `tests/test_service_connector_service.py` | Unit tests |

## Files to Modify

| File | Change |
|------|--------|
| `webapp/app.py` | Import service_connector_service, add 4 API routes |
| `webapp/templates/dashboard.html` | Add Service Connectors card widget |
| `server/sudoers-webapp` | Add `install-service-env` entry |
| `server/deploy.sh` | Create /data/service-connectors/, deploy skills, add env vars |
| `scripts/sync_data.sh` | Add .service_env download and .env merge block |
| `.github/workflows/deploy.yml` | Add SC_SECRET_* GitHub Secrets to env |

## Key Patterns Reused

- **Sudo install**: `sync_settings_service.py:_regenerate_user_config()` (lines 143-183)
- **Atomic JSON**: `sync_settings_service.py:_write_json()` (lines 61-74)
- **Username mapping**: `corporate_memory_service.py:_get_server_username()` (lines 56-59)
- **Sudo helper script**: `server/bin/install-user-rules` (entire file)
- **Dashboard AJAX pattern**: Sync settings toggles in `dashboard.html`

## Security Model

| Stage | Protection |
|-------|------------|
| Token exchange (webapp <-> service) | HTTPS + shared secret in Authorization header |
| Central storage (connections.json) | /data/service-connectors/ (2770), file 660 |
| User home (.service_env) | Mode 600 (owner-only), sudo install |
| Transit (sync) | SCP over SSH |
| Client (.env) | Local filesystem; Claude Code settings deny Read(.env) |
| Claude Code usage | Python `load_dotenv()` via Bash (allowed) |

## Verification

1. **Unit tests**: `pytest tests/test_service_connector_service.py`
2. **Manual flow**:
   - Deploy to server
   - Log into your-instance.example.com
   - Click "Connect" on PO system in dashboard
   - Verify `.service_env` appears in `/home/{user}/`
   - Run `sync_data.sh` on client
   - Verify `.env` contains PO_API_KEY
   - Verify `.claude/rules/sc_purchase_orders.md` exists
   - In Claude Code: `python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.environ.get('PO_API_KEY', 'NOT SET'))"`
3. **Disconnect flow**: Click Disconnect, verify key removed from .env after sync