agnes-the-ai-analyst

Author	SHA1	Message	Date
ZdenekSrotyr	e9d7af3cce	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening This squashes 13 commits from ma/staging plus a small docstring translation into a single coherent unit. Three workstreams. == RBAC v13 redesign == - Drops core.viewer/analyst/km_admin/admin hierarchy and the internal_roles / group_mappings / user_role_grants / plugin_access tables. - Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry. - Two authorization primitives in app.auth.access: require_admin — Admin-group god-mode require_resource_access(rt, "{path}") — entity-scoped grants Single DB lookup per request; no session cache; no implies BFS. - /admin/access UI (single page) replaces /admin/role-mapping + /admin/plugin-access. CLI `da admin group/grant ` replaces `da admin role/mapping/grant-role/revoke-role/effective-roles`. - ResourceType.TABLE listing-only — admins can record table grants, runtime enforcement still flows through legacy dataset_permissions (migration plan in docs/TODO-rbac-data-enforcement.md). == Claude Code marketplace == - Aggregated /marketplace.zip + /marketplace.git/ (PAT-gated, RBAC-filtered, content-addressed cache via dulwich). - Admin god-mode dropped on the marketplace surface — admins curate their own view via grants like everyone else. - Bare-repo cache materializes per RBAC-filtered ETag; stale entries not pruned in this iteration (disclaimed in git_backend.py docstring). == #81 #83 #44 security/ops hardening == - #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias). - #81 Group B — Keboola extractor 3-state exit codes: 0 success / 1 total fail / 2 PARTIAL fail Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary alerting must teach it the new partial signal. - #81 Group C — schema v10 view_ownership; rejects silent overwrite of a prior connector's view name on collision. - #81 Group D — extractor-side identifier validation. - #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset + path-traversal fix. - #44 — entire /api/scripts/* surface is admin-only (planted-script + sandbox-bypass risk closed). == Web UI polish + deploy fix == - /admin/access: live grant-count badges (no stale snapshot revert), shared-header CSS link added to /catalog and /admin/{tables,permissions}, per-resource-type colored stripes. - docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't silently shadow sub-mounts and write state to the wrong disk. == OSS vendor-neutralization (waves 1+2) == - scripts/grpn/ → scripts/ops/. Customer-specific identifiers (project IDs, internal hostnames, dev/prod VM IPs, brand names) replaced with placeholders across code, docs, Terraform, Caddyfile, OAuth probe, and planning docs. Downstream infra repos that copied scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must update the path. == Translation == - src/repositories/user_groups.py::ensure_system docstring translated from Czech to English for codebase consistency. Co-authored-by: Mina Rustamyan <mina@keboola.com>	2026-04-28 14:25:04 +02:00
ZdenekSrotyr	24e81fb671	fix(security): gate Script-API /run on admin role (#44 ) (#92 ) * fix(security): gate Script-API /run on admin role (#44) The AST + string-blocklist sandbox in `_execute_script` is defense-in-depth, not a primary trust boundary. It does not block `vars()`, `type()`, or `__class__.__bases__` introspection chains, and the string blocklist is trivially evadable via concatenation/dunder encoding. Treat the role gate as the actual barrier: only admin can run scripts. - `POST /api/scripts/run` and `POST /api/scripts/{id}/run` now require admin. - `POST /api/scripts/deploy` stays analyst-accessible (storing != executing). - Existing /run tests retargeted to admin_token; added regression tests asserting analyst → 403 on both endpoints. - CHANGELOG: BREAKING (security) bullet under Unreleased/Changed. Closes #44. * fix(security): admin-gate /deploy + harden sandbox blocklist (review #92) Reviewer of PR #92 flagged three MUST-FIXes that #44 wasn't fully closed: 1. /api/scripts/deploy still accepted analyst → planted-script attack path (analyst plants malicious source, waits for admin to /run). Now: /deploy also requires admin; the entire Script API is admin-only. 2. The "Minimum (same-day)" blocklist mitigations from issue #44 weren't applied. Added the introspection-chain dunders that the issue PoC pivots through: __subclasses__, __globals__, __class__, __base__, __bases__, __mro__, __dict__, __code__, __builtins__. Plus `vars` in BLOCKED_FUNCTIONS. Deliberately NOT adding __init__ / __getattribute__ (substring match would flag every legit `def __init__`) nor `type`/`dir` (frequent in legitimate admin scripts). Documented the trade-off inline. 3. Tests didn't cover the actual PoC payload nor non-analyst non-admin roles. Added test_run_pwn_payload_blocked parametrized over the issue's own PoC + two equivalent variants (lambda+__globals__, __mro__ traversal); these stay green only as long as the dunder list does. test__requires_admin tests now parametrize over (analyst, viewer, km_admin) so all three non-admin core roles are pinned at 403. Conftest extension: seeded_app now exposes viewer_token and km_admin_token as siblings to admin_token / analyst_token. CHANGELOG bullet updated to reflect /deploy gate change and new internal regression tests. 35/35 scripts tests pass locally. Refs review of #92. fix(tests): test_security TestScriptSandbox needs admin token after #44 hardening CI failure on PR #92 caught a missed test file. tests/test_security.py seeded only an analyst user and used the analyst token to drive sandbox tests. After the #44 admin-gate (deploy + run both admin-only), every sandbox test got 403 from the role gate before the AST/string check could run, so 'blocks os.system' / 'blocks eval' / etc. all failed. Fix: extend the fixture to also seed an admin user and return the admin token. Sandbox tests now reach the sandbox layer; access-control tests further down in the module continue to use the analyst that was kept around. 41/41 test_security.py tests pass locally. * fix(security): #92 round-3 — gate GET /api/scripts on admin role Devin Review caught: GET /api/scripts (app/api/scripts.py:44-51) was left on Depends(get_current_user) when the rest of the API moved to admin-only. ScriptRepository.list_all() does SELECT * FROM script_registry which returns ALL columns including 'source' (the full script body). So any authenticated user (viewer / analyst / km_admin) could read admin-deployed scripts — leak of code that may contain credentials, business logic, or admin-only operational details. CHANGELOG already says 'The entire Script API is now admin-only', which was true for /deploy, /run, /{id}/run, DELETE — just not for GET. Now consistent: every Script endpoint requires admin. Tests: - New parametrized test_list_scripts_requires_admin over (analyst, viewer, km_admin) tokens — all assert 403. - Updated test_list_scripts_empty in both test_scripts_api.py and test_api_scripts.py to use admin_token. 79 tests pass. Refs Devin Review of #92. * fix: cleanup unused imports, stale docstrings, and incomplete CHANGELOG - Remove unused imports: Path, List, get_current_user (ruff F401) - Trim docstrings to describe current behavior, not change history - CHANGELOG now lists GET /api/scripts among admin-gated endpoints - Remove diff-commenting inline comments from tests Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> * fix: merge duplicate Changed sections into one per CLAUDE.md convention Co-Authored-By: zdenek.srotyr <zdenek.srotyr@keboola.com> --------- Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-04-27 21:13:56 +02:00
ZdenekSrotyr	863453b2e2	fix: address code review findings — duplicate fixture, JWT key length, async deprecation - Remove duplicate mock_extract_factory fixture in conftest.py - Use 32+ char JWT_SECRET_KEY everywhere (was 15 chars, triggered warnings) - Replace deprecated asyncio.get_event_loop() with asyncio.run() - Unify WebhookEventFactory sign methods (consistent json.dumps)	2026-04-13 13:47:51 +02:00
ZdenekSrotyr	0045f5d324	fix: ensure DATA_DIR and notifications dir exist before bot.py import in CI	2026-04-13 13:26:18 +02:00
ZdenekSrotyr	9a144f8291	fix: unify JWT_SECRET_KEY across all test modules for xdist stability	2026-04-12 14:28:17 +02:00
ZdenekSrotyr	833de96cd7	merge: resolve Block E conflicts in pytest.ini and conftest.py	2026-04-12 11:17:26 +02:00
ZdenekSrotyr	7967279181	test: add E2E journey tests (J1-J8) covering full user flows 40 tests across 8 files covering bootstrap/auth, sync+query, hybrid queries, RBAC+access-requests, Jira webhooks, corporate memory, analyst uploads, and multi-source orchestration. Adds mock_extract_factory and admin_user fixtures to conftest, and journey marker to pytest.ini.	2026-04-12 11:13:51 +02:00
ZdenekSrotyr	510608813c	test: add shared test infrastructure (fixtures, factories, assertions, mocks) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 11:05:35 +02:00
ZdenekSrotyr	53a9e838f9	feat: add graceful shutdown handler - Add close_system_db() function in src/db.py to cleanly close shared DB connection - Add lifespan context manager in app/main.py to trigger shutdown on app exit - Integrate lifespan into FastAPI app initialization - All API tests pass (77/77)	2026-04-09 07:03:45 +02:00
ZdenekSrotyr	617e724d21	feat: add E2E test suite — API, extractor, Docker tests/conftest.py: shared fixtures (e2e_env, seeded_app, create_mock_extract) tests/test_e2e_api.py: 11 tests — full sync flow, RBAC, table lifecycle tests/test_e2e_extract.py: 6 tests — Keboola/BQ/Jira pipelines, multi-source, corrupt handling tests/test_e2e_docker.py: 3 tests — Docker health + full flow (opt-in via -m docker) Fix admin update route (duplicate id kwarg, .dict() → .model_dump()). 705 tests passing.	2026-03-31 08:18:54 +02:00

10 commits