ZdenekSrotyr
9962fc4d40
Merge: final deployment log iteration 2
2026-04-21 19:11:14 +02:00
ZdenekSrotyr
6470e23df3
docs: finalize deployment log — iteration 2 summary
2026-04-21 19:11:07 +02:00
ZdenekSrotyr
1073517969
Merge: onboarding race-condition fix
2026-04-21 19:10:12 +02:00
ZdenekSrotyr
0b4807a836
docs(onboarding): use 'gh repo create --clone' to avoid template-copy race
...
Separate 'gh repo create --clone=false' + 'git clone' races with GitHub's
template content propagation. '--clone' waits for it in one step.
2026-04-21 19:10:04 +02:00
ZdenekSrotyr
4501840893
Merge: onboarding docs — propagation, restore, monitoring
2026-04-21 19:06:27 +02:00
ZdenekSrotyr
3e9213bfc4
docs(onboarding): add module propagation, backup restore, monitoring setup
...
- 'Propagating module changes' — explains ignore_changes + -replace workflow
- 'Restoring from backup' — step-by-step disk swap from daily snapshot
- 'Monitoring alerts' — wiring notification channels
2026-04-21 19:06:20 +02:00
ZdenekSrotyr
85bca573a7
Merge: daily backup snapshot + monitoring alerts
2026-04-21 19:02:07 +02:00
ZdenekSrotyr
0842debf8a
feat(infra): add daily backup snapshot + monitoring alerts
...
- google_compute_resource_policy.daily_backup: daily snapshot at 02:00,
  30-day retention, labels (app=agnes, customer=<name>)
- google_compute_disk_resource_policy_attachment.data_backup: attach policy
  to each data disk (prod + dev)
- google_monitoring_uptime_check_config.health: per-VM /api/health uptime
  check every 60s, 10s timeout
- google_monitoring_alert_policy.health_failure: alert when uptime check
  fails for > 5 min
New opt-out: enable_monitoring = false (default true)
New opt-in: notification_channel_ids = [...] to wire alerts to email/Slack
Module API unchanged; existing customers pick up backups + monitoring on
next module upgrade. TF provider requirement unchanged.
2026-04-21 19:01:56 +02:00
ZdenekSrotyr
0ca8ed2bce
Merge: per-branch image tag :dev-<slug> for branch-aware dev deploys
2026-04-21 18:47:16 +02:00
ZdenekSrotyr
5188bd9127
ci: add per-branch image tag :dev-<slug> for branch-aware dev deploys
...
Extracts branch name from GITHUB_REF, slugifies it, and adds as extra tag
on feature branch builds. Main branch is unaffected (no branch_slug output).
Enables dev_instances tfvar with image_tag pinning specific feature branches.
2026-04-21 18:47:01 +02:00
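The slug derivation described in commit 5188bd9127 might look roughly like the sketch below. The actual CI workflow is not shown in this log, so the function names (`branch_slug`, `extra_image_tags`), the character rules, and the 40-character cap are all assumptions made for illustration.

```python
import re
from typing import List, Optional


def branch_slug(github_ref: str, max_len: int = 40) -> Optional[str]:
    """Return a tag-safe slug for a feature branch, or None for main and non-branch refs.

    Hypothetical rules: lowercase, runs of non-alphanumerics become '-',
    length capped at max_len (an assumed limit, not taken from the source).
    """
    prefix = "refs/heads/"
    if not github_ref.startswith(prefix):
        return None  # tags and pull-request refs get no branch slug
    branch = github_ref[len(prefix):]
    if branch == "main":
        return None  # main branch is unaffected, as the commit states
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    return slug or None


def extra_image_tags(github_ref: str) -> List[str]:
    """Return the additional :dev-<slug> tag for feature branches, else nothing."""
    slug = branch_slug(github_ref)
    return [f"dev-{slug}"] if slug else []
```

A dev instance could then pin `image_tag` to the resulting `dev-<slug>` value for the branch it should track.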
ZdenekSrotyr
1811a408de
Merge: fix CI smoke test — split host bind mount to separate overlay
2026-04-21 16:54:27 +02:00
ZdenekSrotyr
1acc89c486
fix(ci): move bind-mount of /data to separate overlay, fix CI smoke test
...
The CI smoke test failed because docker-compose.prod.yml forced a bind mount
to /data on the host — which doesn't exist on GitHub runners.
Split the bind mount into docker-compose.host-mount.yml, which is only
composed by the VM startup script (/data exists there, mounted from the
persistent disk). CI continues to use the default named volume.
Module startup script + auto-upgrade cron now compose all three:
-f docker-compose.yml -f docker-compose.prod.yml -f docker-compose.host-mount.yml
2026-04-21 16:54:18 +02:00
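The overlay split in commit 1acc89c486 can be illustrated with a small helper that builds the compose command line. This is only a sketch: the helper name `compose_command` is hypothetical, and the assumption that CI composes the base and prod files (without the host-mount overlay) is inferred from the commit text rather than confirmed by it.

```python
from typing import List, Sequence

BASE = "docker-compose.yml"
PROD = "docker-compose.prod.yml"
HOST_MOUNT = "docker-compose.host-mount.yml"  # bind-mounts host /data


def compose_command(on_vm: bool, action: Sequence[str] = ("up", "-d")) -> List[str]:
    """Build a 'docker compose' argv; only VMs add the host-mount overlay."""
    files = [BASE, PROD]
    if on_vm:
        # /data exists only on the VM, where the persistent disk is mounted
        files.append(HOST_MOUNT)
    argv = ["docker", "compose"]
    for name in files:
        argv += ["-f", name]
    return argv + list(action)
```

Keeping the host-specific bind mount in its own file means the CI runner never references a path that does not exist there.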
ZdenekSrotyr
a3b4b43e47
Merge: deployment log with final state
2026-04-21 16:51:28 +02:00
ZdenekSrotyr
03dd81c825
docs: update deployment log with final state and onboarding workflow
...
- Volume fix documented (Docker named volume → bind mount /data)
- Watchtower → cron-based auto-upgrade
- Final state snapshot of VMs, repos, tags, secrets
- Onboarding flow summary for 2nd customer
2026-04-21 16:51:20 +02:00
ZdenekSrotyr
85c6b114b0
Merge: add ONBOARDING.md
2026-04-21 16:49:54 +02:00
ZdenekSrotyr
a44e11a5e2
docs: add ONBOARDING.md — end-to-end per-customer deployment guide
2026-04-21 16:49:45 +02:00
ZdenekSrotyr
3dcdc52faf
Merge: replace watchtower with cron, bump infra module to v1.1.0
2026-04-21 16:47:05 +02:00
ZdenekSrotyr
cbd85c52ed
fix(infra): replace watchtower with cron for auto-upgrade
...
Watchtower container has Docker API mismatch (client 1.25 vs daemon 1.54+)
that can't be worked around without upstream fix. Simple cron job does the
same thing more reliably:
- Every 5 min: docker compose pull + detect digest change + up -d if changed
- Logs to /var/log/agnes-auto-upgrade.log
This removes the watchtower container and a Docker daemon dependency.
2026-04-21 16:46:55 +02:00
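The pull-then-compare step from commit cbd85c52ed could be sketched as follows. The real job is a shell cron entry that is not reproduced in this log; the function names here are hypothetical, and comparing local image IDs before and after the pull is one assumed way to "detect digest change".

```python
import subprocess
from typing import Callable


def local_image_id(image: str) -> str:
    """Return the local image ID via 'docker image inspect', or '' if absent."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Id}}", image],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else ""


def upgrade_if_changed(
    current_id: Callable[[], str],
    pull: Callable[[], None],
    restart: Callable[[], None],
) -> bool:
    """Pull, then restart only when the image ID changed. Returns True on restart."""
    before = current_id()
    pull()
    after = current_id()
    if after and after != before:
        restart()
        return True
    return False
```

The callables are injected so the decision logic can be exercised without a Docker daemon; in a real run, `pull` would wrap `docker compose pull` and `restart` would wrap `docker compose up -d`.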
ZdenekSrotyr
94b6a8eff2
Merge feature/multi-customer-deployment: multi-customer deployment infra
...
- infra/modules/customer-instance/ — reusable Terraform module (tag infra-v1.0.0)
- infra/examples/minimal/ — OSS self-host quickstart
- scripts/bootstrap-gcp.sh — per-customer GCP setup
- scripts/fetch-env-from-secrets.sh — VM-side .env from Secret Manager
- docker-compose.prod.yml — bind data volume to host /data for persistent disks
- docs/superpowers/specs/2026-04-21-multi-customer-deployment-spec.md
- docs/superpowers/plans/2026-04-21-multi-customer-deployment.md
- docs/superpowers/plans/2026-04-21-deployment-log.md
2026-04-21 16:43:06 +02:00
ZdenekSrotyr
52d63457ff
fix(prod): bind docker data volume to host /data for persistent disk
...
Without this override, docker-compose creates a named volume 'agnes_data'
on the boot disk, ignoring any persistent disk mounted at /data by the
VM startup script. This override makes the 'data' volume a bind mount
to host /data, so persistent disks work as expected.
2026-04-21 16:42:23 +02:00
ZdenekSrotyr
a2c05a5d97
infra: refactor Terraform into reusable customer-instance module
...
Breaking changes:
- infra/main.tf, variables.tf, outputs.tf, terraform.tfvars.example removed
- Single-file monolith replaced by reusable module + example
New structure:
- infra/modules/customer-instance/ — the module:
  - main.tf: VMs, disks, firewall, Secret Manager, dedicated VM SA
  - variables.tf: prod_instance + dev_instances flexible schema
  - outputs.tf: IPs, SA email, JWT secret reference
  - startup-script.sh.tpl: bootstraps VM, fetches secrets, runs compose,
    adds Watchtower for auto-upgrade
- infra/examples/minimal/ — OSS self-host quickstart using the module
Supports:
- Per-customer GCP project isolation
- Branch-aware dev VMs via dev_instances list (any image_tag)
- Persistent /data disk (rebuild-safe)
- OS Login (no per-user SSH keys)
- Caddy TLS mode (opt-in via tls_mode="caddy" + domain)
- Watchtower auto-upgrade (opt-in via upgrade_mode="auto")
2026-04-21 16:18:35 +02:00
ZdenekSrotyr
0dd8b13d62
infra: add fetch-env-from-secrets.sh for VM-side .env generation
...
Reads JWT_SECRET_KEY and KEBOOLA_STORAGE_TOKEN from Secret Manager,
combines with non-secret config, writes .env with chmod 600.
Run as part of VM startup or manually for rotation.
2026-04-21 16:18:35 +02:00
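The ".env with chmod 600" step from commit 0dd8b13d62 might be sketched like this. The real script is a shell script that reads from Secret Manager; here the secret values are simply passed in, and the function name `write_env_file` and the example keys other than JWT_SECRET_KEY and KEBOOLA_STORAGE_TOKEN are hypothetical.

```python
import os
import stat
import tempfile
from typing import Dict


def write_env_file(path: str, secrets: Dict[str, str], config: Dict[str, str]) -> None:
    """Merge secret and non-secret settings and write them as KEY=value lines.

    The file is created with mode 0600 so it is never briefly world-readable.
    """
    merged = {**config, **secrets}  # secrets win on key collisions
    lines = [f"{key}={value}\n" for key, value in sorted(merged.items())]
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.writelines(lines)
    os.chmod(path, 0o600)  # also tighten the mode if the file already existed


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), ".env")
    write_env_file(
        target,
        secrets={"JWT_SECRET_KEY": "example-only", "KEBOOLA_STORAGE_TOKEN": "example-only"},
        config={"DATA_DIR": "/data"},  # assumed non-secret setting
    )
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```

Because the function overwrites the file in place, the same call could serve manual rotation as well as VM startup.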
ZdenekSrotyr
5ad96e5f86
infra: add bootstrap-gcp.sh for per-customer GCP setup
...
Creates agnes-deploy SA with Terraform-scoped roles, GCS tfstate bucket,
and generates a JSON key. Idempotent — safe to re-run.
Expanded .gitignore to block *-key.json files from ever being committed.
2026-04-21 16:18:35 +02:00
ZdenekSrotyr
e514f57267
Merge pull request #6 from keboola/dependabot/uv/python-multipart-0.0.26
...
chore(deps): bump python-multipart from 0.0.24 to 0.0.26
2026-04-21 15:27:25 +02:00
dependabot[bot]
6e93461918
chore(deps): bump python-multipart from 0.0.24 to 0.0.26
...
Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.24 to 0.0.26.
- [Release notes](https://github.com/Kludex/python-multipart/releases)
- [Changelog](https://github.com/Kludex/python-multipart/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Kludex/python-multipart/compare/0.0.24...0.0.26)
---
updated-dependencies:
- dependency-name: python-multipart
  dependency-version: 0.0.26
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2026-04-21 13:26:19 +00:00
ZdenekSrotyr
e53de59a42
docs: multi-customer deployment spec + implementation plan
...
- Spec: pure self-deploy model with per-customer GCP project
- Public upstream repo with TF module; private template + per-customer repos
- Branch-aware dev VMs via dev_instances list
- Caddy TLS, Secret Manager for tokens, SA JSON key for CI (WIF follow-up)
- 6-phase implementation plan with bite-sized tasks
2026-04-21 15:25:17 +02:00
ZdenekSrotyr
cf8528b5cf
Merge pull request #7 from keboola/dependabot/uv/authlib-1.6.11
...
chore(deps): bump authlib from 1.6.9 to 1.6.11
2026-04-21 15:24:57 +02:00
ZdenekSrotyr
bd6921c4d5
docs,tests: anonymize customer references
...
Replace identifying customer names and infrastructure URLs in
documentation and test fixtures with generic placeholders.
Test semantics preserved.
2026-04-21 11:56:19 +02:00
dependabot[bot]
043ae4b378
chore(deps): bump authlib from 1.6.9 to 1.6.11
...
Bumps [authlib](https://github.com/authlib/authlib) from 1.6.9 to 1.6.11.
- [Release notes](https://github.com/authlib/authlib/releases)
- [Changelog](https://github.com/authlib/authlib/blob/v1.6.11/docs/changelog.rst)
- [Commits](https://github.com/authlib/authlib/compare/v1.6.9...v1.6.11)
---
updated-dependencies:
- dependency-name: authlib
  dependency-version: 1.6.11
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
2026-04-17 00:41:27 +00:00
ZdenekSrotyr
c74a1fab53
Merge pull request #4 from keboola/feature/v2-fastapi-duckdb-docker-cli
...
test: comprehensive test suite — 1169 tests, 4 layers
2026-04-13 16:14:11 +02:00
ZdenekSrotyr
5bbd82bacd
fix: address Devin review — docker-e2e .env, jira webhook test isolation
...
- Create empty .env before docker compose up in CI (env_file: .env is required)
- Mock get_jira_service in webhook HMAC test to isolate signature check
from Jira API availability — strict assert 200 instead of permissive 500
2026-04-13 14:36:31 +02:00
ZdenekSrotyr
863453b2e2
fix: address code review findings — duplicate fixture, JWT key length, async deprecation
...
- Remove duplicate mock_extract_factory fixture in conftest.py
- Use 32+ char JWT_SECRET_KEY everywhere (was 15 chars, triggered warnings)
- Replace deprecated asyncio.get_event_loop() with asyncio.run()
- Unify WebhookEventFactory sign methods (consistent json.dumps)
2026-04-13 13:47:51 +02:00
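The "consistent json.dumps" point in commit 863453b2e2 matters because an HMAC is computed over exact bytes: two sign helpers that serialize the same payload differently produce different signatures. The sketch below shows one way to keep them aligned. The helper names, the use of SHA-256, and the compact sorted-key encoding are assumptions, since the actual WebhookEventFactory code is not shown here.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_body(payload: Dict[str, Any]) -> bytes:
    """Serialize a payload the same way everywhere: sorted keys, no extra spaces."""
    return json.dumps(payload, separators=(",", ":"), sort_keys=True).encode("utf-8")


def sign(payload: Dict[str, Any], secret: str) -> str:
    """Return the hex HMAC-SHA256 of the canonical body."""
    return hmac.new(secret.encode("utf-8"), canonical_body(payload), hashlib.sha256).hexdigest()


def verify(body: bytes, signature: str, secret: str) -> bool:
    """Check a signature against the raw request bytes in constant time."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Routing every sign method through a single serializer like `canonical_body` means a test that signs a payload and a test that posts it cannot drift apart.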
ZdenekSrotyr
12480b8c35
fix: graceful skip for telegram bot tests when log dir unavailable in CI
2026-04-13 13:31:51 +02:00
ZdenekSrotyr
98af8e2df3
fix: make bot.py FileHandler resilient to missing log directory
2026-04-13 13:28:59 +02:00
ZdenekSrotyr
0045f5d324
fix: ensure DATA_DIR and notifications dir exist before bot.py import in CI
2026-04-13 13:26:18 +02:00
ZdenekSrotyr
1a68decd4e
fix: patch BOT_LOG_FILE at import time for CI/xdist compatibility
2026-04-13 13:21:04 +02:00
ZdenekSrotyr
9a144f8291
fix: unify JWT_SECRET_KEY across all test modules for xdist stability
2026-04-12 14:28:17 +02:00
ZdenekSrotyr
ed58075419
Merge branch 'worktree-agent-a417e289' into feature/v2-fastapi-duckdb-docker-cli
2026-04-12 14:24:39 +02:00
ZdenekSrotyr
325f785ef4
fix: get_instance_name reads nested instance.name from YAML
2026-04-12 14:23:54 +02:00
ZdenekSrotyr
209643becb
fix: return filename instead of absolute path in upload responses
2026-04-12 14:23:51 +02:00
ZdenekSrotyr
31e210c7e3
fix: require admin/km_admin role for web admin pages
2026-04-12 14:23:47 +02:00
ZdenekSrotyr
01b5f80ef9
fix: restrict script deploy/execute to analyst role, undeploy to admin
2026-04-12 14:23:44 +02:00
ZdenekSrotyr
5bfff6616c
ci: add parallel test execution and nightly Docker E2E job
2026-04-12 14:15:46 +02:00
ZdenekSrotyr
2ec50b4e4f
test: add telegram API endpoint tests (verify, unlink, status)
2026-04-12 14:12:28 +02:00
ZdenekSrotyr
e25a7aba7d
fix: resolve JWT secret key test isolation issue
...
Replace module-level SECRET_KEY cache with lazy _get_cached_secret_key()
that re-reads env vars in test mode. This fixes 20 test failures caused
by JWT secret mismatch when test modules load in different orders.
2026-04-12 14:05:41 +02:00
ZdenekSrotyr
833de96cd7
merge: resolve Block E conflicts in pytest.ini and conftest.py
2026-04-12 11:17:26 +02:00
ZdenekSrotyr
d70d645902
Merge branch 'worktree-agent-afb2461f' into feature/v2-fastapi-duckdb-docker-cli
2026-04-12 11:15:35 +02:00
ZdenekSrotyr
8e22eed669
Merge branch 'worktree-agent-aaa8db4c' into feature/v2-fastapi-duckdb-docker-cli
2026-04-12 11:15:34 +02:00
ZdenekSrotyr
44317a86c6
merge: resolve factories.py conflict — keep Faker factories + add Block D convenience methods
2026-04-12 11:15:15 +02:00
ZdenekSrotyr
7967279181
test: add E2E journey tests (J1-J8) covering full user flows
...
40 tests across 8 files covering bootstrap/auth, sync+query, hybrid
queries, RBAC+access-requests, Jira webhooks, corporate memory,
analyst uploads, and multi-source orchestration. Adds mock_extract_factory
and admin_user fixtures to conftest, and journey marker to pytest.ini.
2026-04-12 11:13:51 +02:00