

Server Operations

Operational guide for the AI Data Analyst Docker deployment.

Basic Information

Parameter      Value
GCP Project    your-gcp-project
Zone           europe-north1-a
Machine type   e2-medium
OS             Debian 12 (bookworm)
External IP    YOUR_SERVER_IP

Docker Compose

Starting and stopping

# Start all services (app + scheduler)
docker compose up -d

# Include optional services (Telegram bot, etc.)
docker compose --profile full up -d

# Stop all services
docker compose down

# Restart a single service
docker compose restart app

# Pull latest images and redeploy
docker compose pull && docker compose up -d

Status

# List running containers and their state
docker compose ps

# Resource usage
docker stats

Log Viewing

# All services, follow
docker compose logs -f

# Single service
docker compose logs -f app
docker compose logs -f scheduler

# Last N lines
docker compose logs --tail=100 app

# Last hour (--since takes a relative duration or a timestamp such as 2026-04-09T18:00:00Z)
docker compose logs --since=1h app

Application logs are written to stdout/stderr and captured by Docker.

Health Check

# Quick check
curl https://your-instance.example.com/health

# With response body
curl -s https://your-instance.example.com/health | python3 -m json.tool

Expected response:

{"status": "ok"}

The /health endpoint also checks DuckDB connectivity and returns 503 if the database is unavailable.
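Because a database failure shows up as HTTP 503, a scripted check can rely on the status code alone instead of parsing the body. The following is a minimal sketch, assuming only that curl is installed; the function name check_health is illustrative and is not part of the project or the da CLI.

```shell
# check_health URL
# Prints the HTTP status code returned by URL and succeeds only on 200.
# check_health is a hypothetical helper name, not something the project ships.
check_health() {
    url="$1"
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
    # --max-time keeps the check from hanging on an unresponsive host.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || code="000"
    echo "$code"
    [ "$code" = "200" ]
}

# Usage:
#   check_health https://your-instance.example.com/health || echo "unhealthy"
```

A code of 000 means curl could not get an HTTP response at all (DNS failure, refused connection, or timeout), which is worth distinguishing from a 503 returned by the app itself.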

Data Sync

Trigger a manual sync

# Via API
curl -X POST http://localhost:8000/api/sync/trigger

# Via CLI inside the container
docker compose exec app da sync

# Sync a single table
docker compose exec app da sync --table table_name

Check sync status

curl -s http://localhost:8000/api/sync/status | python3 -m json.tool
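The two calls above can be combined so that a rejected trigger is not silently ignored. This is a rough sketch, assuming the endpoints are reachable from localhost without extra authentication, as in the commands above; the function name trigger_sync is hypothetical, and the shape of the status response is not assumed.

```shell
# trigger_sync [BASE_URL]
# Triggers a sync, then prints whatever the status endpoint returns.
# trigger_sync is a hypothetical helper name, not part of the da CLI.
trigger_sync() {
    base="${1:-http://localhost:8000}"
    # -f makes curl exit non-zero on HTTP 4xx/5xx, so failures are not silent.
    # -sS hides the progress meter but still prints error messages.
    curl -fsS -X POST "$base/api/sync/trigger" >/dev/null || {
        echo "sync trigger failed" >&2
        return 1
    }
    curl -fsS "$base/api/sync/status"
}

# Usage:
#   trigger_sync | python3 -m json.tool
```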

Data Structure

/data/                          # Persistent volume (GCP pd-balanced, snapshotted)
├── state/
│   └── system.duckdb           # Table registry, users, sync state, audit log
├── analytics/
│   └── server.duckdb           # Master analytics DB (rebuilt on startup)
└── extracts/
    └── {source_name}/
        ├── extract.duckdb      # Per-source extract DB with views
        └── data/               # Parquet files (local sources: Keboola, Jira)
            └── *.parquet

system.duckdb is the source of truth for configuration. Back it up before any destructive operation.
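One way to take that backup is a timestamped file copy made while the services are stopped (docker compose stop app scheduler), so the database is not being written during the copy. The sketch below is only an illustration: the function name backup_state_db and the default destination /data/backups are assumptions, not paths or tools defined by the project.

```shell
# backup_state_db [SRC] [DEST_DIR]
# Copies the state database to a timestamped file and prints the copy's path.
# Run it while the app and scheduler are stopped.
backup_state_db() {
    src="${1:-/data/state/system.duckdb}"
    dest_dir="${2:-/data/backups}"   # hypothetical location, adjust as needed
    [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    # -p preserves mode and timestamps of the original file.
    cp -p "$src" "$dest" || return 1
    echo "$dest"
}

# Usage:
#   docker compose stop app scheduler
#   backup_state_db
#   docker compose start app scheduler
```

A file copy complements the disk snapshots rather than replacing them: it is quick to restore for a single file, but it lives on the same disk unless DEST_DIR points elsewhere.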

Admin CLI

# List registered tables
docker compose exec app da admin tables list

# Register a new table
docker compose exec app da admin tables add

# User management
docker compose exec app da admin users list

# Query data directly
docker compose exec app da query "SELECT * FROM my_table LIMIT 10"

Application Deployment

The application is deployed as a Docker image. The recommended workflow:

  1. Push changes to the main branch
  2. CI builds and pushes a new image
  3. On the server, pull and restart:
    cd /opt/data-analyst
    docker compose pull
    docker compose up -d
    

To pin a specific image version, set the tag in docker-compose.yml before deploying.
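After the pull and restart in step 3, a container can be running before the app is ready to serve requests. A small retry loop against /health can confirm the deployment, sketched here under the assumption that curl is available on the server; wait_for_health and its default attempt count and delay are illustrative choices, not project settings.

```shell
# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it answers with a success status or the attempts run out.
# wait_for_health is a hypothetical helper name.
wait_for_health() {
    url="$1"
    attempts="${2:-12}"
    delay="${3:-5}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f treats HTTP errors such as 503 as a failure.
        if curl -fsS -o /dev/null --max-time 5 "$url"; then
            echo "healthy after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "still unhealthy after $attempts attempts" >&2
    return 1
}

# Usage:
#   cd /opt/data-analyst && docker compose pull && docker compose up -d \
#     && wait_for_health https://your-instance.example.com/health
```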

Environment configuration

# Edit .env (never commit this file)
nano /opt/data-analyst/.env

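# Recreate the app container to apply changes
# (docker compose restart does not pick up changed environment variables)
docker compose up -d --force-recreate app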

See config/.env.template for the full variable reference and config/instance.yaml.example for instance configuration.

Monitoring

GCP Cloud Monitoring

The VM reports metrics via the Google Cloud Ops Agent:

# Check agent status
sudo systemctl status google-cloud-ops-agent

Key metrics in GCP Console > Monitoring > Metrics Explorer:

  • agent.googleapis.com/disk/percent_used — watch /data partition
  • agent.googleapis.com/memory/percent_used
  • agent.googleapis.com/cpu/utilization

A disk space alert fires when /data exceeds 85% for 5 minutes.

Local checks

# Disk usage
df -h /data

# Data directory breakdown
du -sh /data/*

# Container resource usage
docker stats --no-stream
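The same 85% threshold used by the Cloud Monitoring alert can be checked locally, for example from a cron job or before a large sync. This is a minimal sketch; the function name disk_usage_check is hypothetical, and it only reads the output of df.

```shell
# disk_usage_check [MOUNT] [THRESHOLD_PERCENT]
# Exit status: 0 = at or below threshold, 1 = above threshold, 2 = could not read usage.
disk_usage_check() {
    mount="${1:-/data}"
    threshold="${2:-85}"
    # df -P prints one portable line per filesystem; column 5 is the usage percentage.
    used=$(df -P "$mount" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    [ -n "$used" ] || { echo "could not read usage for $mount" >&2; return 2; }
    if [ "$used" -gt "$threshold" ]; then
        echo "WARNING: $mount at ${used}% (threshold ${threshold}%)"
        return 1
    fi
    echo "OK: $mount at ${used}%"
}

# Usage:
#   disk_usage_check /data 85
```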

Backup and Disaster Recovery

The /data persistent disk is covered by a daily GCP snapshot schedule with 14-day retention.

# List existing snapshots
gcloud compute snapshots list --project=your-gcp-project \
  --filter="sourceDisk:data-disk" --sort-by=~creationTimestamp

# Create a manual snapshot before risky operations
gcloud compute disks snapshot data-disk \
  --project=your-gcp-project \
  --zone=europe-north1-a \
  --snapshot-names=data-disk-$(date +%Y%m%d)-manual

See disaster-recovery.md for full recovery procedures.

Web Application

The FastAPI app is available at https://your-instance.example.com.

  • Google OAuth: restricted to the allowed_domain set in config/instance.yaml
  • Email magic link: available out of the box (no external service required)
  • Admin API: POST /api/admin/tables/{id} — register/update tables
  • Sync API: POST /api/sync/trigger — trigger data extraction

Google OAuth setup

  1. Go to Google Cloud Console
  2. Create OAuth 2.0 Client ID (Web application)
  3. Authorized JavaScript origins: https://your-instance.example.com
  4. Authorized redirect URIs: https://your-instance.example.com/auth/google/callback
  5. Add GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET to .env

Jira Webhook Integration

The app receives webhooks from Atlassian Jira for real-time issue sync.

Configuration

Add to .env:

JIRA_WEBHOOK_SECRET=<generate with: python3 -c "import secrets; print(secrets.token_hex(32))">
JIRA_API_TOKEN=<API token from https://id.atlassian.com/manage-profile/security/api-tokens>

Add to config/instance.yaml:

jira:
  domain: "your-org.atlassian.net"
  email: "integration-user@your-domain.com"
  webhook_secret: "${JIRA_WEBHOOK_SECRET}"
  api_token: "${JIRA_API_TOKEN}"

Jira webhook setup

  1. Go to Jira Admin > System > WebHooks
  2. Create new webhook:
    • URL: https://your-instance.example.com/webhooks/jira
    • Secret: same value as JIRA_WEBHOOK_SECRET
    • Events: Issue created/updated/deleted, Comment created/updated, Attachment created

Monitoring

# Health check
curl https://your-instance.example.com/webhooks/jira/health

# Webhook processing logs
docker compose logs -f app | grep -i jira

Troubleshooting

Container won't start

docker compose logs --tail=50 app
# Look for configuration or DuckDB errors at startup

DuckDB locked

If the app hangs or crashes mid-write, a leftover process may still hold the write lock on a DuckDB file. Stop all containers, then start them again:

docker compose down
# Wait a few seconds, then:
docker compose up -d

DuckDB's file lock belongs to the process that opened the database and is released when that process exits, so a full stop and start resolves most lock issues.

Sync failing

# Check sync logs
docker compose logs app | grep -iE "sync|error|exception"

# Confirm the affected table is still registered, then verify the
# data source credentials in .env
docker compose exec app da admin tables list

Out of disk space

df -h /data
du -sh /data/extracts/*

# Take a manual snapshot before any cleanup (see Backup and Disaster Recovery)
# Then remove old parquet partitions if needed (check with orchestrator first)
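To decide where cleanup would help most, the extract directories can be ranked by size. A short sketch, with the hypothetical helper name largest_dirs; it only reads sizes and deletes nothing.

```shell
# largest_dirs [PARENT] [COUNT]
# Lists the COUNT largest entries under PARENT, biggest first, with sizes in KiB.
largest_dirs() {
    parent="${1:-/data/extracts}"
    count="${2:-5}"
    # du -sk reports plain kilobyte counts, which sort numerically.
    du -sk "$parent"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Usage:
#   largest_dirs /data/extracts 5
```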