agnes-the-ai-analyst/tests
Petr Simecek 6c36b26979
release(0.11.3): internal roles + external→internal group mapping (foundation) (#71)
* feat(auth): internal roles + external→internal group mapping (foundation)

Two-layer authorization model: external Cloud Identity groups (org-managed)
get mapped onto internal Agnes-defined capabilities (app-managed) via an
admin-curated many-to-many table. Per-request permission checks read off
the session (no DB hit). Refreshing a user's roles requires re-login.

Schema v8 — new tables:
- internal_roles (id, key UNIQUE, display_name, description, owner_module, …)
  — app-defined capabilities like 'context_admin'. Modules self-register at
  import; the startup hook syncs the registry into this table (idempotent).
- group_mappings (id, external_group_id, internal_role_id FK, …)
  — admin-managed bindings, UNIQUE(external_group_id, internal_role_id).
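
A minimal sketch of the two v8 tables, using Python's built-in sqlite3 as
a stand-in for the project's real database. Only the column names and the
two UNIQUE constraints come from the description above; the column types,
the map_group helper and the omission of the elided ("…") columns are
assumptions made for illustration.

```python
import sqlite3

# Illustrative DDL only: column types are assumptions, and the columns
# elided with "…" in the commit description are deliberately left out.
SCHEMA_V8_SKETCH = """
CREATE TABLE internal_roles (
    id           INTEGER PRIMARY KEY,
    key          TEXT NOT NULL UNIQUE,
    display_name TEXT NOT NULL,
    description  TEXT,
    owner_module TEXT
);
CREATE TABLE group_mappings (
    id                INTEGER PRIMARY KEY,
    external_group_id TEXT NOT NULL,
    internal_role_id  INTEGER NOT NULL REFERENCES internal_roles(id),
    UNIQUE (external_group_id, internal_role_id)
);
"""


def open_sketch_db() -> sqlite3.Connection:
    """Create an in-memory database holding the two sketched tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA_V8_SKETCH)
    return conn


def map_group(conn: sqlite3.Connection, external_group_id: str, role_key: str) -> bool:
    """Hypothetical helper: bind a group to a role.

    Returns False when the (group, role) pair already exists, which is
    what the UNIQUE(external_group_id, internal_role_id) constraint catches.
    """
    row = conn.execute(
        "SELECT id FROM internal_roles WHERE key = ?", (role_key,)
    ).fetchone()
    if row is None:
        raise KeyError(role_key)
    try:
        conn.execute(
            "INSERT INTO group_mappings (external_group_id, internal_role_id) "
            "VALUES (?, ?)",
            (external_group_id, row[0]),
        )
    except sqlite3.IntegrityError:
        return False
    return True
```

The composite UNIQUE constraint is what keeps the mapping many-to-many
without allowing the same binding to be stored twice.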

app/auth/role_resolver.py — new module:
- register_internal_role(key, display_name, description, owner_module)
  Module-author entry point. lower_snake_case key, immutable, validated.
  Same key + same fields = no-op (re-import safe); same key + different
  fields = ValueError so two modules can't silently overwrite each other.
- sync_registered_roles_to_db(conn) — startup reconciliation. Inserts new
  keys, updates drifted metadata, never deletes (preserves mappings).
- resolve_internal_roles(external_groups, conn) — joins group_mappings.
  Sorted, deduplicated role-key list. Plugged into google_callback +
  dev-bypass branch in get_current_user.
- require_internal_role('key') — FastAPI dependency factory; reads
  session.internal_roles; 403 with explicit message when missing.

Resolution runs at sign-in only (Google callback + LOCAL_DEV_GROUPS change
in dev-bypass) — same semantics as session.google_groups. No admin UI yet;
mappings created via repository directly until follow-up PR ships UI.
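
As a rough illustration of the sign-in resolution step, the join over
group_mappings might look like the sketch below. It uses sqlite3 as a
stand-in database and assumes external groups arrive as plain ID
strings; how the real resolver treats malformed input is not specified
here, so dropping non-string entries is an assumption.

```python
import sqlite3
from typing import Iterable, Optional


def resolve_internal_roles(
    external_groups: Optional[Iterable[object]], conn: sqlite3.Connection
) -> list[str]:
    """Return the sorted, de-duplicated role keys mapped to the given groups."""
    # Assumed tolerance for malformed input: keep only non-empty strings.
    group_ids = sorted(
        {g for g in (external_groups or []) if isinstance(g, str) and g}
    )
    if not group_ids:
        return []
    placeholders = ", ".join("?" for _ in group_ids)
    rows = conn.execute(
        f"""
        SELECT DISTINCT r.key
        FROM group_mappings AS m
        JOIN internal_roles AS r ON r.id = m.internal_role_id
        WHERE m.external_group_id IN ({placeholders})
        ORDER BY r.key
        """,
        group_ids,
    ).fetchall()
    return [key for (key,) in rows]
```

The result is what would be written to session["internal_roles"] once
per sign-in, so later requests only need a list membership check.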

21 new tests in tests/test_role_resolver.py: register/list, idempotency,
collision detection, key-format validation; sync insert/update/no-delete;
resolve empty/single/many-to-many/malformed-input; e2e via
LOCAL_DEV_GROUPS — gated endpoint allowed/denied + direct session-cookie
inspection. Full sweep: 178/178 passed across auth + db + repo tests.
(Two pre-existing test_catalog_export.py failures verified unrelated.)
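
The insert/update/no-delete behavior that the sync tests pin down can be
sketched as below. This is an illustration against sqlite3, not the
project's implementation: the real function takes only conn and reads
the module-level registry, whereas this sketch takes the registry as an
explicit argument (an assumption) to stay self-contained.

```python
import sqlite3


def sync_registered_roles_to_db(
    conn: sqlite3.Connection, registry: dict[str, tuple[str, str, str]]
) -> dict[str, int]:
    """Reconcile registry entries into internal_roles.

    registry maps key -> (display_name, description, owner_module).
    Rows whose key is no longer registered are left alone so that any
    group_mappings pointing at them survive.
    """
    counts = {"inserted": 0, "updated": 0}
    for key, fields in sorted(registry.items()):
        row = conn.execute(
            "SELECT display_name, description, owner_module "
            "FROM internal_roles WHERE key = ?",
            (key,),
        ).fetchone()
        if row is None:
            conn.execute(
                "INSERT INTO internal_roles "
                "(key, display_name, description, owner_module) "
                "VALUES (?, ?, ?, ?)",
                (key, *fields),
            )
            counts["inserted"] += 1
        elif tuple(row) != tuple(fields):
            # Metadata drifted since the last startup: refresh it in place.
            conn.execute(
                "UPDATE internal_roles "
                "SET display_name = ?, description = ?, owner_module = ? "
                "WHERE key = ?",
                (*fields, key),
            )
            counts["updated"] += 1
    conn.commit()
    return counts
```

Running it twice with the same registry changes nothing on the second
pass, which is the idempotency the startup hook relies on.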

* fix(auth): polish review feedback — first-request dev populate + PAT doc

Two follow-ups from a code-reviewer pass on the foundation commit before
opening the PR:

- Dev-bypass populates session["internal_roles"] on the first request
  after sign-in, not just when external groups change. The previous
  guard only resolved when groups_changed=True, which left a hole for
  the LOCAL_DEV_GROUPS=`""` (explicit empty) flow: target=[],
  current=None, neither write branch fires, internal_roles stays
  unset, and require_internal_role then 403s with no roles to check
  against. The OAuth callback writes session["internal_roles"]
  unconditionally on sign-in (even []); dev-bypass now matches those
  semantics. Adds a single-pass populate gated on the key being
  absent from the session, so subsequent same-state requests still
  no-op (cheap session lookup, no resolver call).

- Document that internal roles are session-scoped and PAT/headless
  clients will get 403 from any require_internal_role(...) endpoint.
  Same constraint already applies to session.google_groups (PAT JWTs
  deliberately don't snapshot group memberships — they could change
  after issuance with no way to re-sign), but the doc didn't surface
  this — an operator pointing a CLI at a role-gated endpoint would
  see 403 with no clue why. New "PAT and headless requests" section
  spells out the constraint, the rationale, and the three escape
  valves (use users.role for the gate; route through OAuth; wait for
  the planned `da admin grant-role` CLI helper).
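
The first bullet's single-pass populate can be pictured with the sketch
below. The session is modelled as a plain dict and the resolver is
passed in as a callable; the function name sync_dev_session and the way
"current" is normalized are assumptions, chosen to reproduce the
target=[] / current=None case described above.

```python
from typing import Callable, MutableMapping, Sequence


def sync_dev_session(
    session: MutableMapping[str, object],
    target_groups: Sequence[str],
    resolve: Callable[[Sequence[str]], list],
) -> bool:
    """Hypothetical dev-bypass step; returns True if the resolver was called."""
    target = list(target_groups)
    # None and [] compare as equal here, which is the situation where a
    # "groups changed" guard alone never fires for LOCAL_DEV_GROUPS="".
    current = session.get("google_groups") or []
    groups_changed = current != target
    # The added guard: populate once when the key is absent from the session.
    needs_first_populate = "internal_roles" not in session
    if not (groups_changed or needs_first_populate):
        return False  # cheap session lookup only, no resolver call
    session["google_groups"] = target
    session["internal_roles"] = resolve(target)
    return True
```

With an empty session and an empty target, the first call writes
internal_roles = [] and later calls in the same state do nothing.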

54 auth tests still pass locally (21 role-resolver + 33 existing
auth-provider).

* release(0.11.3): cut release for the internal-roles foundation

Bumps pyproject.toml 0.11.2 → 0.11.3 and renames CHANGELOG's
[Unreleased] section to [0.11.3] — 2026-04-26 (with a fresh
empty [Unreleased] skeleton appended). Adds the matching
[0.11.3] link reference at the bottom of CHANGELOG so the
section heading renders as a hyperlink to the GitHub release
page once the tag lands.

The bullet's substance is unchanged; the rephrasing of
"dev-bypass when external groups change" → "dev-bypass —
populates on first request and whenever external groups
change, mirroring the OAuth callback's always-write
semantics" reflects the polish committed in d590579, plus
the appended PAT/headless caveat pointing at the doc
section that landed in the same polish pass.

* fix(auth): address review feedback from Pavel — PAT-specific 403, audit logs, hardening

Round-2 polish over the internal-roles foundation, addressing Pavel's review
on PR #71. No behavior change for the happy path; tightens the safety rails
and makes the failure modes self-explanatory.

User-visible:
- require_internal_role now distinguishes "no session" (Bearer/PAT caller)
  from "signed in but missing role" and surfaces a PAT-specific 403 detail
  in the first case ("This endpoint needs an interactive (OAuth) session
  — Bearer/PAT tokens do not carry session-resolved roles by design").
- docs/internal-roles.md documents deactivate+reactivate as the supported
  "force re-resolve now" lever for users who can't be made to log out.

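A framework-free sketch of the two 403 cases from the User-visible list.
The real dependency is a FastAPI dependency factory; here Forbidden
stands in for an HTTP 403 response, the session is a plain mapping, and
treating "no internal_roles key" as the Bearer/PAT case is an assumption
about how the distinction is made. The detail strings are paraphrases,
not the shipped wording.

```python
from typing import Callable, Mapping, Optional

# Paraphrased detail text; the shipped wording is quoted in the list above.
PAT_DETAIL = (
    "Interactive (OAuth) session required; "
    "Bearer/PAT tokens carry no session-resolved roles"
)


class Forbidden(Exception):
    """Stand-in for an HTTP 403 (for example FastAPI's HTTPException)."""

    def __init__(self, detail: str) -> None:
        super().__init__(detail)
        self.status_code = 403
        self.detail = detail


def require_internal_role(key: str) -> Callable[[Optional[Mapping]], None]:
    """Build a checker that passes only when the session carries `key`."""

    def check(session: Optional[Mapping]) -> None:
        # Case 1: no session-resolved roles at all (Bearer/PAT caller).
        if not session or "internal_roles" not in session:
            raise Forbidden(PAT_DETAIL)
        # Case 2: signed in interactively, but the role is missing.
        if key not in session["internal_roles"]:
            raise Forbidden(f"Missing internal role: {key}")

    return check
```

Keeping the two messages distinct lets a CLI user tell "wrong kind of
credential" apart from "ask an admin for the mapping".
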
Internal hardening:
- INFO-level audit log on every successful resolve (OAuth callback +
  dev-bypass) so a wrong-role complaint is debuggable from the log alone.
- Startup warning when SESSION_SECRET is shorter than 32 chars, matching
  the existing JWT_SECRET_KEY gate — both HMAC surfaces sign trust-laden
  state (session.internal_roles, session.google_groups, JWTs).
- _clear_registry_for_tests() now refuses to run unless TESTING=1 so a
  stray import path in production can't drop the registered capabilities.

Tests:
- 4 new tests in tests/test_role_resolver.py covering: stale-session
  contract after a mid-session mapping revoke (pins the documented
  limitation), PAT 403 detail wording, OAuth pipeline data flow from
  external groups to internal_roles, and the dev-bypass empty-list
  fallback when the resolver raises.

CHANGELOG.md updated under [0.11.3] (### Changed + ### Internal).
CLAUDE.md schema doc bumped from v7 to v8.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-26 23:49:10 +02:00
helpers fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
snapshots feat: multi-instance deployment — all 14 must-have items from spec 2026-04-10 11:57:42 +02:00
__init__.py Initial commit: OSS data distribution platform 2026-03-08 23:31:28 +01:00
conftest.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_access_control.py refactor: remove legacy webapp + add missing tests + housekeeping 2026-03-31 13:44:06 +02:00
test_access_requests_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_admin_configure_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_admin_tokens_ui.py feat(auth): Google Workspace groups on /profile + tag-triggered Keboola deploy workflow (#56) 2026-04-26 00:56:44 +02:00
test_analyst_bootstrap.py feat: add da analyst setup command with bootstrap flow 2026-04-10 19:43:36 +02:00
test_api.py feat: add POST /api/query/hybrid endpoint for two-phase BQ+DuckDB queries 2026-04-11 11:09:42 +02:00
test_api_complete.py fix: return filename instead of absolute path in upload responses 2026-04-12 14:23:51 +02:00
test_api_scripts.py fix: restrict script execution endpoints to analyst/admin roles 2026-04-09 16:31:42 +02:00
test_app_version.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_auth_providers.py release(0.11.2): LOCAL_DEV_GROUPS dev mock + Makefile defaults + docs/local-development.md (#70) 2026-04-26 16:48:55 +02:00
test_auto_profiling.py Add self-service data onboarding system 2026-03-09 14:25:37 +01:00
test_bigquery_extractor.py feat: add graceful shutdown handler 2026-04-09 07:03:45 +02:00
test_bigquery_extractor_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_bootstrap.py fix(auth): /auth/bootstrap activates seed users, disabled only by real password 2026-04-21 20:01:20 +02:00
test_catalog_export.py refactor: delete old sync pipeline — 9,500 lines removed 2026-03-31 07:50:37 +02:00
test_cli.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_cli_admin.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_cli_analyst.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_artifacts.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_cli_auth.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_cli_diagnose.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_explore.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_metrics.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_query.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_server.py test: add CLI gap tests for all 9 command groups 2026-04-12 11:13:15 +02:00
test_cli_sync.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_cli_update_check.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_column_metadata.py feat: add ColumnMetadataRepository with CRUD and proposal import 2026-04-10 19:41:53 +02:00
test_connector_kit_poc.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_corporate_memory_collector.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_db.py test: add correctness test for _reattach_remote_extensions 2026-04-12 08:40:12 +02:00
test_docker_full.py fix(tests): refresh nightly docker-e2e asserts after auth + health refactors (#69) 2026-04-26 16:12:20 +02:00
test_duckdb_manager.py Add per-partition streaming sync and hybrid query architecture 2026-03-12 13:20:41 +01:00
test_e2e_api.py refactor: remove legacy webapp + add missing tests + housekeeping 2026-03-31 13:44:06 +02:00
test_e2e_docker.py fix(tests): refresh nightly docker-e2e asserts after auth + health refactors (#69) 2026-04-26 16:12:20 +02:00
test_e2e_extract.py fix: use SCHEMA_VERSION constant in e2e migration test 2026-04-10 19:39:19 +02:00
test_generate_sample_data.py Add --format parquet using project's ParquetManager 2026-03-10 21:46:20 +01:00
test_instance_config.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_jira_incremental.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_jira_service.py test: add missing coverage for web UI, Jira extract, instance config, and concurrent rebuild 2026-04-09 07:15:14 +02:00
test_jira_service_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_jira_webhooks.py fix: address Devin review — docker-e2e .env, jira webhook test isolation 2026-04-13 14:36:31 +02:00
test_journey_analyst.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_bootstrap_auth.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_hybrid.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_jira.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_memory.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_multisource.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_rbac.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_journey_sync_query.py test: add E2E journey tests (J1-J8) covering full user flows 2026-04-12 11:13:51 +02:00
test_keboola_extractor.py feat: add graceful shutdown handler 2026-04-09 07:03:45 +02:00
test_keboola_extractor_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_live_bigquery.py test: add Docker E2E and live connector test files 2026-04-12 11:10:06 +02:00
test_live_jira.py test: add Docker E2E and live connector test files 2026-04-12 11:10:06 +02:00
test_live_keboola.py test: add Docker E2E and live connector test files 2026-04-12 11:10:06 +02:00
test_llm_connector.py Add modular LLM connector for Corporate Memory 2026-03-23 12:08:33 +01:00
test_llm_providers_full.py test: add connector test suite (Block D) — 5 files, 58 tests 2026-04-12 11:12:50 +02:00
test_memory_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_metadata_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_metrics.py fix: address code review — path injection, multi-table search, metrics import API, error handling 2026-04-10 19:56:00 +02:00
test_migration.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_openapi_snapshot.py feat: multi-instance deployment — all 14 must-have items from spec 2026-04-10 11:57:42 +02:00
test_openmetadata_client.py Implement OpenMetadata catalog integration (Phase 1) 2026-03-12 14:07:13 +01:00
test_openmetadata_enricher.py refactor: delete old sync pipeline — 9,500 lines removed 2026-03-31 07:50:37 +02:00
test_openmetadata_transformer.py docs,tests: anonymize customer references 2026-04-21 11:56:19 +02:00
test_orchestrator.py test: add missing coverage for web UI, Jira extract, instance config, and concurrent rebuild 2026-04-09 07:15:14 +02:00
test_password_flows.py feat(auth): password reset & invite flows for web + admin (#34) (#37) 2026-04-22 17:43:57 +02:00
test_pat.py feat(auth): Google Workspace groups on /profile + tag-triggered Keboola deploy workflow (#56) 2026-04-26 00:56:44 +02:00
test_permissions.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_permissions_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_profiler.py Initial commit: OSS data distribution platform 2026-03-08 23:31:28 +01:00
test_rbac.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_remote_query.py fix: BQ COUNT subquery alias, wrap ImportError in RemoteQueryError 2026-04-11 20:29:03 +02:00
test_repositories.py fix: replace os.environ direct assignment with monkeypatch.setenv in test fixtures 2026-04-09 07:11:36 +02:00
test_role_resolver.py release(0.11.3): internal roles + external→internal group mapping (foundation) (#71) 2026-04-26 23:49:10 +02:00
test_scheduler.py Support multiple daily sync times (e.g., "daily 07:00,13:00,18:00") 2026-03-16 23:09:48 +01:00
test_scheduler_full.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_scripts_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_security.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_selective_gzip.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_session_collector.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_settings_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_setup_instructions.py release(2.1.0): durable sync, CLI auto-update, versioned wheel URL, version unification (#43) 2026-04-22 21:18:18 +02:00
test_telegram_api.py test: add telegram API endpoint tests (verify, unlink, status) 2026-04-12 14:12:28 +02:00
test_telegram_bot.py fix: address code review findings — duplicate fixture, JWT key length, async deprecation 2026-04-13 13:47:51 +02:00
test_telegram_storage.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00
test_upload_api.py test: add 132 API gap tests across 8 endpoint modules 2026-04-12 11:13:24 +02:00
test_user_management.py User management + PAT + CLI distribution + HTML auth redirect (#9 #10 #11 #12) (#28) 2026-04-22 14:24:28 +02:00
test_web_ui.py feat(auth): Google Workspace groups on /profile + tag-triggered Keboola deploy workflow (#56) 2026-04-26 00:56:44 +02:00
test_ws_gateway.py test: add Block C services tests (68 tests across 6 files) 2026-04-12 11:11:48 +02:00