Commit graph

74 commits

Author SHA1 Message Date
ZdenekSrotyr
b091cf7003 feat(ui): version badge in footer + /api/version endpoint
UI now shows a small footer badge with:
- release channel + CalVer version (e.g. 'stable-2026.04.47')
- floating image tag (e.g. 'stable')
- time since last container restart (proxy for 'last deployed')

Backend:
- app/api/health.py: /api/health returns image_tag, commit_sha, deployed_at
- app/api/health.py: new /api/version endpoint (lightweight, no DB hit, for
  footer badge polling)

Infra:
- startup-script.sh.tpl: resolves image digest from ghcr pull, derives
  channel + version from the tag name, and writes AGNES_VERSION /
  RELEASE_CHANNEL / AGNES_COMMIT_SHA into .env so the app can surface them
  to the UI.

UI:
- app/web/templates/base.html: footer loads /api/version asynchronously and
  renders '<channel>-<version> · <tag> · deployed <relative> (<UTC>)'.
  Tooltip shows full detail (commit sha, schema version).
2026-04-21 20:19:40 +02:00
ZdenekSrotyr
2b17973796 fix(auth): /auth/bootstrap activates seed users, disabled only by real password
Bug: SEED_ADMIN_EMAIL creates a password-less user at app startup, which made
/auth/bootstrap return 403 '1 users already exist' on a fresh deployment —
leaving the operator no way to log in (the seed user has no password, and
/auth/token requires one).

Fix: bootstrap is now disabled only when at least one user has a
password_hash set. On a fresh deploy with a seed user:
- POST /auth/bootstrap { email: <matches seed>, password: X } → sets the
  password on the seed user, promotes to admin, returns token.
- With a non-matching email, a new admin is created alongside the seed user.

Lock semantics: bootstrap self-deactivates as soon as any password is set.

Tests: 8 passing, including new test_bootstrap_activates_seed_user and
test_bootstrap_disabled_when_password_user_exists covering the two halves.
2026-04-21 20:01:20 +02:00
ZdenekSrotyr
e25a7aba7d fix: resolve JWT secret key test isolation issue
Replace module-level SECRET_KEY cache with lazy _get_cached_secret_key()
that re-reads env vars in test mode. This fixes 20 test failures caused
by JWT secret mismatch when test modules load in different orders.
2026-04-12 14:05:41 +02:00
ZdenekSrotyr
ed43feb4e6 feat: add POST /api/query/hybrid endpoint for two-phase BQ+DuckDB queries 2026-04-11 11:09:42 +02:00
ZdenekSrotyr
fbad3f5538 fix: address Devin review — partial download cleanup, category validation, path escaping, docs
- cli/commands/analyst.py: delete partial parquet file on download failure to unblock re-download
- cli/commands/analyst.py: escape single quotes in parquet path to prevent SQL injection
- app/api/metrics.py: replace tempfile-based import with inline YAML parse + direct repo.create(); validates name+category upfront and returns 400 if missing; removes os/tempfile imports
- CLAUDE.md: update schema version text to v4 with full migration chain

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 09:41:29 +02:00
ZdenekSrotyr
4b4c071959 fix: async httpx in metadata push, guard access_token, add push test
- Replace synchronous httpx.post() with async httpx.AsyncClient in push_metadata_to_source endpoint to avoid blocking the event loop
- Guard data["access_token"] in CLI analyst setup with .get() and a clear error message on missing key
- Add test_push_non_keboola_table_fails and test_push_keboola_table to TestMetadataAPI, covering 400/404 path and the happy path with mocked async httpx
2026-04-11 08:33:10 +02:00
ZdenekSrotyr
126d151413 fix: address code review — path injection, multi-table search, metrics import API, error handling
- Validate view names with _SAFE_IDENTIFIER regex and check path traversal in _initialize_duckdb()
- find_by_table() and get_table_map() now also search the tables[] array field
- Add POST /api/admin/metrics/import endpoint for YAML file upload
- Replace generic except in _connect_to_instance() with specific HTTPStatusError/TimeoutException handlers
- Generate .claude/settings.json in _generate_claude_md() bootstrap
- Update test_find_by_table and test_get_table_map to cover tables[] array lookups
- Add test_import_metrics_yaml in TestMetricsAPI
2026-04-10 19:56:00 +02:00
ZdenekSrotyr
d0cdfcf8c1 feat: add column metadata API with Keboola push support 2026-04-10 19:44:03 +02:00
ZdenekSrotyr
5cf0df77fc feat: add Metrics API endpoints (GET/POST/DELETE) with admin auth
- New app/api/metrics.py: GET /api/metrics, GET /api/metrics/{id:path},
  POST /api/admin/metrics (201), DELETE /api/admin/metrics/{id:path}
- Add require_admin dependency to app/auth/dependencies.py
- Register metrics_router in app/main.py before web_router
- Deprecate GET /api/catalog/metrics/{path} with 301 redirect to new endpoint
- 7 new tests in TestMetricsAPI covering CRUD, 404, RBAC, category filter
2026-04-10 19:32:13 +02:00
ZdenekSrotyr
5836bcde4c fix: /setup redirects to /login instead of /dashboard for unauthenticated users 2026-04-10 17:38:08 +02:00
ZdenekSrotyr
795f602348 fix: verify_token → test_connection, url → stack_url (Devin round 6)
- KeboolaClient has test_connection() not verify_token() — every
  /api/admin/configure call for Keboola was failing with AttributeError
- Renamed data_source.keboola.url → stack_url to match
  instance.yaml.example (line 106) and avoid user confusion

663 tests pass.
2026-04-10 15:34:17 +02:00
ZdenekSrotyr
44b99f25ca fix: address Devin review round 5 — empty secret file, CI .env
- secrets.py: validate file content is non-empty before using it;
  regenerate if file exists but is empty/corrupted
- release.yml: touch .env before docker compose in smoke test
  (env_file: .env in docker-compose.yml requires the file to exist)

663 tests pass.
2026-04-10 14:55:31 +02:00
ZdenekSrotyr
dc8a9275e6 fix: address Devin review round 3 — retry exhaustion, discover path, WAL snapshot
- CalVer retry loop now exits with error if all 5 attempts fail
  (prevents pushing Docker image with unclaimed version tag)
- discover_tables endpoint reads data_source.keboola.url (consistent
  with configure_instance and _discover_and_register_tables)
- Pre-migration snapshot flushes WAL via CHECKPOINT before copying
  and copies .wal file if it still exists after flush

663 tests pass.
2026-04-10 14:11:17 +02:00
ZdenekSrotyr
c79d85f87c fix: config path mismatch + CalVer race condition (Devin review round 2)
- _discover_and_register_tables reads from data_source.keboola.url
  (matches what /api/admin/configure writes) instead of top-level
  keboola.url which doesn't exist
- CalVer: claim git tag BEFORE Docker build with retry loop (up to 5
  attempts). Prevents race where two concurrent CI runs get same N.
  Git tag acts as a distributed lock for version uniqueness.

663 tests pass.
2026-04-10 13:30:05 +02:00
ZdenekSrotyr
49f109bf73 fix: address PR review findings — config write, CalVer, error handling
- Config writes to DATA_DIR/state/instance.yaml (writable) instead of
  CONFIG_DIR (read-only :ro in Docker)
- instance_config.py checks DATA_DIR/state/ first, then falls back to
  CONFIG_DIR for backward compat
- CalVer counter is now global across channels (*-YYYY.MM.*) per spec
- Keboola error messages sanitized — log full error, return generic msg
- chmod in secrets.py wrapped in try/except for Windows compat
- Setup wizard JS handles 401 (expired JWT) with user-facing message
- deploy.yml changed to workflow_dispatch only (no duplicate test runs)
- Smoke test uses docker-compose.prod.yml + AGNES_TAG instead of sed
- docker-compose.prod.yml uses ${AGNES_TAG:-stable} env var

663 tests pass. 8 E2E verification tests pass.
2026-04-10 13:16:40 +02:00
ZdenekSrotyr
6c53082295 feat: multi-instance deployment — all 14 must-have items from spec
CalVer CI (release.yml) with stable/dev channels, health endpoint
with version/channel/schema_version, JWT secret auto-generation with
file persistence, smoke test script + Docker-in-CI, pre-migration
snapshot, /api/admin/configure for headless setup, /api/admin/
discover-and-register, /setup wizard, OpenAPI snapshot test, custom
connector mount support, CHANGELOG, migration safety tests, startup
banner.

663 tests pass (6 new migration safety + 3 OpenAPI snapshot + 1
updated JWT test).
2026-04-10 11:57:42 +02:00
ZdenekSrotyr
b7a3c8dd13 fix: hide Google login button when OAuth is not configured 2026-04-09 19:44:59 +02:00
ZdenekSrotyr
582e06c859 fix: cookie secure flag based on DOMAIN env — allows HTTP for dev/staging 2026-04-09 19:37:25 +02:00
ZdenekSrotyr
5ae13b199c feat: add web login handler — form POST sets cookie and redirects to dashboard 2026-04-09 19:33:25 +02:00
ZdenekSrotyr
d49844c1fe fix: Flask url_for compatibility shim + login template routes 2026-04-09 19:28:37 +02:00
ZdenekSrotyr
ad73af47b7 fix: /login/password shows login form, not account setup form 2026-04-09 19:20:44 +02:00
ZdenekSrotyr
86042f17d8 fix: add missing /login/password and /login/email web routes 2026-04-09 19:15:05 +02:00
ZdenekSrotyr
7d036760f5 fix: wrap Google OAuth DB connection in try/finally to ensure it is always closed
The system DB connection opened in google_callback is now closed in a
finally block, so it is released even when an exception occurs between
open and close.
2026-04-09 18:42:56 +02:00
ZdenekSrotyr
471982d3f9 fix: route admin_edit through KnowledgeRepository.update instead of raw SQL 2026-04-09 18:42:52 +02:00
ZdenekSrotyr
7e0cb80ed2 fix: move argon2 imports to top-level and catch VerifyMismatchError specifically
PasswordHasher and VerifyMismatchError are now imported at module level in
router.py and providers/password.py. Wrong-password errors are caught as
VerifyMismatchError (401); unexpected errors fall through to a 500 with logging.
2026-04-09 18:42:51 +02:00
ZdenekSrotyr
f6d2d1487f fix: remove duplicate Path alias in upload.py, replace _Path with Path 2026-04-09 18:42:48 +02:00
ZdenekSrotyr
5fe177c309 feat: add audit logging for authentication events
Log token_created, login_failed, and bootstrap_completed events via
AuditRepository. Extracts a shared _audit() helper that swallows
errors so audit failures never block auth. Also tightens password
verification to catch VerifyMismatchError specifically and log
unexpected errors at 500 rather than silently swallowing them.
2026-04-09 18:42:38 +02:00
ZdenekSrotyr
7f523788c2 fix: correct YAML path for instance name and subtitle
get_instance_name and get_instance_subtitle now look up the nested
instance.name and instance.subtitle keys to match the YAML structure.
2026-04-09 16:31:56 +02:00
ZdenekSrotyr
7bada9f32b fix: force secure cookie in production, reduce max_age to 1 day
Use TESTING env var to detect production instead of relying on
request scheme, and align cookie max_age with JWT expiry (86400s).
2026-04-09 16:31:50 +02:00
ZdenekSrotyr
c55dd02196 fix: stop leaking server file paths in upload responses
Return filename instead of full server-side path in upload_session
and upload_artifact responses.
2026-04-09 16:31:46 +02:00
ZdenekSrotyr
2043594670 fix: restrict script execution endpoints to analyst/admin roles
deploy, run, and run-deployed require analyst; undeploy requires admin.
Update test to use admin token for undeploy.
2026-04-09 16:31:42 +02:00
ZdenekSrotyr
449053bf8a fix: enforce per-table access control on catalog profile endpoints
Add can_access_table check to GET /api/catalog/profile/{table_name} and
POST /api/catalog/profile/{table_name}/refresh, returning 403 for
unauthorized tables. Update test_api_complete to cover new 403 behaviour
and fix the existing 404 test to use admin token.
2026-04-09 16:30:24 +02:00
ZdenekSrotyr
ad6b3a96e4 fix: enforce role guards on admin web pages
Add require_role(Role.ADMIN) to /admin/tables and /admin/permissions,
and require_role(Role.KM_ADMIN) to /corporate-memory/admin so that
non-admin users receive 403 instead of being served the page.

Fix admin_cookie test fixture to supply a password_hash (required since
the /auth/token endpoint blocks passwordless requests). Add analyst
fixture and TestAdminRoleGuards tests verifying analysts get 403 and
admins get 200 on the protected routes.
2026-04-09 16:30:13 +02:00
ZdenekSrotyr
3205a8d300 fix: block /auth/token for OAuth-only users without password_hash
Users without a password_hash (Google OAuth / magic-link accounts) could
obtain a JWT by simply posting their email to /auth/token. Add an else
clause that rejects such requests with 401, directing them to their
configured auth provider. Update and extend tests accordingly.
2026-04-09 16:29:47 +02:00
ZdenekSrotyr
55515266ea fix: block DuckDB metadata functions and relative paths in query endpoint
Add information_schema, duckdb_* introspection functions, pragma_* functions,
and relative path traversal patterns to the SQL blocklist so users cannot
enumerate schema metadata regardless of RBAC. Add six corresponding tests.
2026-04-09 16:29:11 +02:00
ZdenekSrotyr
8df8183a9f feat: add 50 MB upload size limit to session and artifact endpoints
Rejects files exceeding MAX_UPLOAD_SIZE with HTTP 413 before writing to disk.
2026-04-09 07:14:16 +02:00
ZdenekSrotyr
c56511f69f refactor: replace local _get_data_dir() with shared app.utils.get_data_dir()
Replace copy-pasted _get_data_dir() functions in catalog.py and upload.py
with import from app.utils.get_data_dir(). sync.py and data.py already use
the shared utility.
2026-04-09 07:05:50 +02:00
ZdenekSrotyr
53a9e838f9 feat: add graceful shutdown handler
- Add close_system_db() function in src/db.py to cleanly close shared DB connection
- Add lifespan context manager in app/main.py to trigger shutdown on app exit
- Integrate lifespan into FastAPI app initialization
- All API tests pass (77/77)
2026-04-09 07:03:45 +02:00
ZdenekSrotyr
1b3acce7e9 fix: replace substring table access check with word-boundary regex
Replace substring matching with word-boundary regex in query endpoint's
table access validation. Prevents false positives where short table names
like 'id' would block any query containing the word. Uses re.escape() to
safely handle special characters in table names.

- Import re module at top
- Use regex pattern with word boundaries (\b) for matching
- Add tests to verify no false positives and proper blocking
2026-04-09 07:00:48 +02:00
ZdenekSrotyr
1b219cabe9 fix: remove dead PRAGMA enable_wal code
DuckDB has used WAL by default since v0.8, so this pragma is not
valid DuckDB syntax. Removed obsolete try-except block that attempted
to enable WAL on system database initialization.
2026-04-09 06:59:57 +02:00
ZdenekSrotyr
535b5fb1bf security: strip VIRTUAL_ENV/PYTHONPATH from script sandbox and block httpx
Replace inherited env vars with a minimal env dict (PATH, DATA_DIR, HOME only),
omitting VIRTUAL_ENV and PYTHONPATH to prevent subprocess access to installed
packages. Switch subprocess invocation to sys.executable so the correct
interpreter is used with the restricted PATH. Add httpx to blocked_patterns
and BLOCKED_MODULES. Add test_sandbox_cannot_import_httpx to test_security.py.
2026-04-09 06:58:26 +02:00
ZdenekSrotyr
3321d2e266 security: reduce JWT expiry to 24h and add jti claim
Tokens previously lasted 30 days with no revocation path. Expiry is now
24 hours and every token carries a unique jti (UUID hex) to support future
revocation checks.
2026-04-09 06:57:23 +02:00
ZdenekSrotyr
23ae6a602c security: harden query endpoint SQL blocklist and disable external access
Expand blocked keywords to cover parquet_scan, read_csv_auto, query_table,
iceberg_scan, delta_scan, call, URL schemes (http/https/s3/gcs), and
additional file-scan functions. Set enable_external_access=false on the
non-read-only analytics connection path. Add three new tests covering
parquet_scan, read_csv_auto, and query_table blocking.
2026-04-09 06:54:58 +02:00
ZdenekSrotyr
4aa97c23d2 fix: raise RuntimeError on missing JWT_SECRET_KEY in non-test environments
Prevents production deployments from silently using a hardcoded default
secret. TESTING=1 still resolves to a built-in test key so the existing
test suite is unaffected. Adds a test that verifies the RuntimeError is
raised when neither JWT_SECRET_KEY nor TESTING is set.
2026-04-09 06:54:29 +02:00
ZdenekSrotyr
94c6b0f839 fix: require password verification when user has password_hash in /auth/token
Previously the password check was gated on both user.password_hash and
request.password being truthy, so an attacker could omit the password
field (which defaults to "") and receive a valid JWT. Now any user with a
stored hash must supply a non-empty password that passes argon2 verification.

Adds six TestTokenEndpoint tests covering empty, missing, wrong, and correct
password, plus no-hash user and unknown user cases.
2026-04-09 06:44:31 +02:00
ZdenekSrotyr
92fbb88c15 chore: Docker prod config (Python 3.13, no reload), fix utcnow deprecation, update docs 2026-04-08 12:10:47 +02:00
ZdenekSrotyr
05a1b452e9 security: harden query (read-only DB), uploads (path sanitization), scripts (AST validation) 2026-04-08 12:09:19 +02:00
ZdenekSrotyr
224635b88d security: fix auth (argon2, cookie, JWT), CORS, session middleware, pyproject.toml 2026-04-08 12:08:52 +02:00
ZdenekSrotyr
d5659d7091 fix: login page uses login_buttons format expected by template 2026-04-08 07:11:03 +02:00
ZdenekSrotyr
67a1e0bb45 feat: Jira webhook FastAPI adapter — replaces Flask Blueprint 2026-04-08 07:04:50 +02:00