Commit graph

10 commits

Author SHA1 Message Date
Vojtech
c552bf8243
feat(api): enforce API design rules via pytest + fix DELETE/status-code violations (#338)
* feat(api): enforce API design rules via pytest + fix DELETE/status-code violations

Adds tests/test_api_design_rules.py with four forward-only design guardrails
that prevent new endpoints from accumulating REST debt:

  Rule 1 — No new verbs in URL paths (existing 28 grandfathered via allowlist)
  Rule 2 — DELETE must declare 204 No Content (zero allowlist entries)
  Rule 3 — Creator POSTs (path has GET counterpart) must declare 201/202
  Rule 4 — All protected /api/* routes must declare 401 and 403

Fixes found by running the rules:

- DELETE /api/admin/metrics/{metric_id}: return 204, drop redundant body
- DELETE /api/memory/{item_id}/dismiss (undismiss): return 204, drop body
- POST /api/memory/admin/contradictions: add status_code=201 (creates a resource)
- app/main.py: _add_auth_error_responses() injected into app.openapi() at startup;
  declares 401/403 on all protected /api/* operations centrally, fixing the 120
  routes that previously omitted these response codes from the spec.

Closes #337

* fix(api): resolve CI failures — extend 204 fixes + complete allowlists

- Fix remaining 6 DELETE endpoints to return 204: store entities,
  store entity install, marketplace curated install, marketplace plugin
  system flag, admin store submission, and observability view
- Update all affected tests to expect 204 (removed body assertions)
- Add 4 missing verb paths to _VERB_PATH_ALLOWLIST in test_api_design_rules.py
- Add 2 upsert endpoints to _CREATOR_POST_ALLOWLIST
- Update admin_marketplaces.html to not call r.json() on 204 DELETE

* fix(tests): align 2 DELETE-asserting tests with 204 contract (post-#339 rebase)

CI's test-shard (1) and (4) failures on this PR were caused by
Vojta's second commit (`fix(api): resolve CI failures — extend 204
fixes`) flipping more DELETE endpoints to status_code=204 than just
the two mentioned in the PR body. Two tests assert status_code==200
on the DELETE response and broke:

- tests/test_admin_store_submissions.py::TestQuarantineGates::test_admin_can_delete_quarantined
  (DELETE /api/store/entities/{entity_id})
- tests/test_store_api.py::TestInstallCycle::test_admin_hard_delete_cascades_installs
  (DELETE /api/store/entities/{entity_id}?hard=true)

Updated both to assert 204 with a comment pointing at
tests/test_api_design_rules.py rule 2 so future reviewers can
trace the contract. Verified via broader scan that no other test
asserts == 200 on a .delete() response directly (4 other sites do
.delete() then check 200 on a subsequent GET — those are fine).

* release: 0.54.26 — API design rules (test_api_design_rules.py) + 8 DELETE endpoints flip to 204

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
2026-05-18 15:25:07 +02:00
ZdenekSrotyr
9e948abc9c
release(0.54.18): Curated Memory restructure + per-user Dismiss + bundled adversarial-review fixes (#316/#320/#322) (#324)
* feat(web): Curated Memory restructure + per-user Dismiss + filter-state utility

Squashed from cvrysanek/zsrotyr's 4-commit PR branch + rebased onto
current main + CHANGELOG bullets spliced into [Unreleased] (preserves
existing #316/#320/#322 entries that landed on main since the branch
was authored).

Routes + access:
- /corporate-memory now user-facing (get_current_user), in primary
  nav next to "Data Packages" — same gate as /api/memory/*.
- /admin/corporate-memory is the new admin review queue location
  (was /corporate-memory/admin); reached via Admin dropdown. Template
  renamed: corporate_memory_admin.html → admin_corporate_memory.html.

Visual chrome:
- Both pages migrate to shared _page_hero.html blue hero band.

Per-user Dismiss (new feature, schema v46):
- knowledge_item_user_dismissed(user_id, item_id, dismissed_at) + index.
- POST /api/memory/{id}/dismiss + DELETE (idempotent).
- Mandatory items can never be dismissed — enforced at 2 layers.
- GET /api/memory: hide_dismissed=false default + dismissed_by_me flag.
- GET /api/memory/bundle: always excludes dismissed for the caller.
- UI: Dismiss/Undismiss button per item (hidden for mandatory),
  gray-out + line-through for dismissed rows, Hide-dismissed toggle.

Admin edit modal:
- Category as <select> + "Add new category…" reveal.
- Audience as <select> with (unset)/all/group:<name> from RBAC.
- Tags: full tag-input widget (pills, ×-remove, Backspace pop,
  Enter/comma to add, ↑/↓ typeahead from EXISTING_TAGS).

Bulk-edit modal pickers (closes #128):
- Move-to-category / Add-tag: <select> + add-new.
- Set-audience: <select> (no more typo-able 'gourp:eng').
- Remove-tag: closed-set picker.

FilterState utility:
- app/web/static/js/filter-state.js — save/load/clear/bindInputs
  for per-page localStorage filter state. Adopted on /corporate-memory.

E2E verified live on a real VM through the API + browser flow.

* release: 0.54.18 — Curated Memory restructure + 4 adversarial-review fixes

Bundles together:
- #316  fix(store): surface review failures + harden publish gate
        (BREAKING fail-CLOSED guardrail, override v2+ promote, restore guard,
        retry/rescan staged-bundle, banner widening, LLM truncation retry)
- #320  fix(store): C2 bundle export RBAC + H2 per-entity write lock +
        H3 update_status compare-and-swap with bg_verdict_skipped audit
- #322  fix(store): M1 prompt sentinel filename escape + M2 atomic
        promote_to_version helper + L1 admin forensic download per-version
- #324  Curated Memory restructure + per-user Dismiss + FilterState utility

Bump from 0.54.17 → 0.54.18 (patch — pre-1.0 policy: every cycle is patch).
2026-05-15 18:51:05 +02:00
ZdenekSrotyr
70672204fe
feat(memory): admin Edit + MEMORY_DOMAIN RBAC + ai-section UI (#141)
Cuts release 0.23.0.

## Highlights
- Single-item Edit button on every memory item card (modal hits PATCH /api/memory/admin/{id}).
- MEMORY_DOMAIN RBAC resource type — admins grant user_groups access to specific domains via /admin/access. Composes with existing audience filter (OR semantics, no-op when no grants).
- ai: section editable in /admin/server-config — admins can set ANTHROPIC_API_KEY / model / provider / base_url for the corporate-memory extractor without editing instance.yaml directly. api_key auto-masked.

## Devin findings addressed
- Modal NULL→empty fix (audience visibility wouldn't break).
- Stats endpoint granted_domains parity with list endpoint.
- Documented intentional MEMORY_DOMAIN→audience bypass.
- Documented conscious ai.base_url SSRF exclusion (legit internal LiteLLM/vLLM proxies).

See CHANGELOG [0.23.0] for full notes.
2026-04-30 11:04:41 +02:00
ZdenekSrotyr
82c5d71d63
feat(memory): #62 — duplicate hints + tree-view + bulk-edit (#126)
Issue #62. Tree view with cross-axis filtering, duplicate-candidate hints (Jaccard score on entity overlap), bulk-edit endpoints (PATCH /api/memory/admin/{id} + POST /api/memory/admin/bulk-update), schema v17 (knowledge_item_relations), full CLI parity (da admin memory tree/edit/bulk-edit/duplicates list/resolve).
2026-04-29 13:55:15 +02:00
PavelDo
e1108b6112
feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72)
Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).
2026-04-29 07:16:22 +02:00
ZdenekSrotyr
5f6bb7a4b2
fix(security+ops) + release(0.12.1): #82 #85 #87 hardening + cut 0.12.1 (#104)
* fix(security+ops): #82 #85 #87 — auth hardening, API validation, deploy posture

Security and operational hardening across three issue groups:

- M23: docker-compose.override.yml → docker-compose.dev.yml (BREAKING, prod foot-gun)
- C13: Container runs as non-root user 'agnes' (USER directive in Dockerfile)
- M21: Docker resource limits (mem_limit, cpus) on app + scheduler
- M22: Caddyfile security headers (X-Frame-Options, X-Content-Type-Options, Referrer-Policy, -Server)
- M17: /api/health split into minimal (unauth) + /api/health/detailed (auth) (BREAKING)
- M26: release.yml restricts build-and-push to main + workflow_dispatch; paths-ignore for docs

- C2: table_id traversal validation on /api/data/{table_id}/download
- M4: Upload streaming (chunk-read + temp file) instead of full-buffer; /local-md hashed filename

- C5: reset_token removed from POST /api/users/{id}/reset-password response
- C8: Startup WARNING when no user has password_hash (bootstrap window visible)
- M9: Audit log on failed web form login (mirrors /auth/token endpoint)
- M10: Atomic magic-link consume via compare-and-swap (CONSUMED: marker + DuckDB conflict catch)

Also: SSRF protection on /api/admin/configure (#46), memory stats SQL aggregation (#90)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF 169.254.x.x + IPv6 multicast; M10 marker cleanup safety

Review fixes:
- Add 169.254.0.0/16 (link-local, cloud metadata) to SSRF regex — was
  missing, allowing requests to AWS/GCP/Azure metadata endpoints
- Add ff[0-9a-f]{2}: (IPv6 multicast) to SSRF regex
- M10: wrap Step 3 (CONSUMED marker cleanup) in try-except with
  warning log — prevents unhandled exception if DB write fails after
  successful token consumption
- Add test for 169.254.169.254 SSRF rejection

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): SSRF IPv6 bypass, CLI health endpoint, upload FD leak

Address Devin Review findings on PR #104:

1. SSRF IPv6 bypass: Replace hostname regex with DNS resolution +
   ipaddress module checks. The old regex patterns like `fe80:` only
   matched up to the first colon, missing real IPv6 addresses like
   `fe80::1`, `fc00::1`, `ff02::1`. The new approach resolves the
   hostname via getaddrinfo and checks each resulting IP against
   ipaddress.is_private/is_loopback/is_link_local/is_reserved/is_multicast.

2. CLI commands broken: `da setup test-connection`, `da setup verify`,
   `da diagnose`, `da status` all called /api/health expecting the old
   format (status=="healthy", services dict). Now they call
   /api/health/detailed for service-level checks (with graceful fallback
   to the minimal endpoint when auth is not configured).

3. Temp file handle leak: _stream_to_temp returns an open
   NamedTemporaryFile; callers now close it before shutil.move() to
   prevent FD leaks until GC.

Also adds IPv6 SSRF test cases (loopback, link-local, unique-local,
multicast) with mocked DNS resolution for test environment independence.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(review): download regex blocks hyphenated IDs; document health split

Address Devin Review round-3 findings on PR #104:

1. _SAFE_IDENTIFIER regex blocked hyphenated table IDs: The download
   endpoint used the strict SQL-identifier regex which does not allow
   dots or hyphens, but Keboola table IDs like in.c-crm.orders
   contain both. Switched to _SAFE_QUOTED_IDENTIFIER which allows dots
   and hyphens while still blocking path-traversal chars (/, .., \)
   and quote/control characters. Added test for hyphenated/dotted IDs.

2. Documented health endpoint split in DEPLOYMENT.md: Added Health
   checks & external monitoring section explaining both endpoints
   (minimal unauth /api/health vs authenticated /api/health/detailed)
   and how to wire external monitoring tools to the detailed endpoint
   with a PAT.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* release(0.12.1): cut hotfix for snapshot integrity + #82/#85/#87 hardening

* fix(security): apply CAS pattern to password reset confirm (#82/M10 follow-up)

Devin review on the rebased PR flagged the asymmetry: magic-link verify
got the atomic compare-and-swap pattern in the original M10 fix, but
password reset confirm at /auth/password/reset/confirm was still using
read-validate-clear. Two concurrent POSTs with the same valid reset
token could both succeed in setting different new passwords (last-write-
wins). Lower severity than the magic-link race because the attacker
would need the reset token AND to race the legitimate user, but the
asymmetry was a polish gap.

Mirrors app/auth/providers/email.py::_consume_token CAS exactly: write
unique CONSUMED:<random> marker via UPDATE...WHERE token=old_token, then
SELECT to verify our marker won, then proceed. Only the winner clears
the marker and applies the password change.

New regression test_concurrent_reset_only_one_wins in
tests/test_password_flows.py::TestResetConfirm pins the contract: two
ThreadPoolExecutor workers + Barrier hit /reset/confirm with the same
token; exactly one gets 302 (password applied), the other gets 200 with
'Invalid or expired'. Sanity-checked against the pre-CAS code — both
POSTs got 302 (race confirmed).

---------

Co-authored-by: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-04-28 19:57:30 +02:00
ZdenekSrotyr
e9d7af3cce feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening
This squashes 13 commits from ma/staging plus a small docstring translation
into a single coherent unit. Three workstreams.

== RBAC v13 redesign ==
- Drops core.viewer/analyst/km_admin/admin hierarchy and the
  internal_roles / group_mappings / user_role_grants / plugin_access tables.
- Replaced by user_group_members + resource_grants. Atomic v12→v13 backfill
  wrapped in BEGIN/COMMIT; ROLLBACK leaves schema_version at 12 for retry.
- Two authorization primitives in app.auth.access:
    require_admin                        — Admin-group god-mode
    require_resource_access(rt, "{path}") — entity-scoped grants
  Single DB lookup per request; no session cache; no implies BFS.
- /admin/access UI (single page) replaces /admin/role-mapping +
  /admin/plugin-access. CLI `da admin group/grant *` replaces
  `da admin role/mapping/grant-role/revoke-role/effective-roles`.
- ResourceType.TABLE listing-only — admins can record table grants,
  runtime enforcement still flows through legacy dataset_permissions
  (migration plan in docs/TODO-rbac-data-enforcement.md).

== Claude Code marketplace ==
- Aggregated /marketplace.zip + /marketplace.git/* (PAT-gated,
  RBAC-filtered, content-addressed cache via dulwich).
- Admin god-mode dropped on the marketplace surface — admins curate
  their own view via grants like everyone else.
- Bare-repo cache materializes per RBAC-filtered ETag; stale entries
  not pruned in this iteration (disclaimed in git_backend.py docstring).

== #81 #83 #44 security/ops hardening ==
- #81 Group A — orchestrator ATTACH allow-listing (extension/url/alias).
- #81 Group B — Keboola extractor 3-state exit codes:
    0 success / 1 total fail / 2 PARTIAL fail
  Sync API logs PARTIAL FAILURE alert on exit 2. Operators with binary
  alerting must teach it the new partial signal.
- #81 Group C — schema v10 view_ownership; rejects silent overwrite
  of a prior connector's view name on collision.
- #81 Group D — extractor-side identifier validation.
- #83 — Jira webhook fail-closed when JIRA_WEBHOOK_SECRET unset
  + path-traversal fix.
- #44 — entire /api/scripts/* surface is admin-only (planted-script +
  sandbox-bypass risk closed).

== Web UI polish + deploy fix ==
- /admin/access: live grant-count badges (no stale snapshot revert),
  shared-header CSS link added to /catalog and /admin/{tables,permissions},
  per-resource-type colored stripes.
- docker-compose.host-mount.yml: bind,rbind so dual-disk hosts don't
  silently shadow sub-mounts and write state to the wrong disk.

== OSS vendor-neutralization (waves 1+2) ==
- scripts/grpn/ → scripts/ops/. Customer-specific identifiers
  (project IDs, internal hostnames, dev/prod VM IPs, brand names)
  replaced with placeholders across code, docs, Terraform, Caddyfile,
  OAuth probe, and planning docs. Downstream infra repos that copied
  scripts/grpn/agnes-tls-rotate.sh or agnes-auto-upgrade.sh must
  update the path.

== Translation ==
- src/repositories/user_groups.py::ensure_system docstring translated
  from Czech to English for codebase consistency.

Co-authored-by: Mina Rustamyan <mina@keboola.com>
2026-04-28 14:25:04 +02:00
ZdenekSrotyr
471982d3f9 fix: route admin_edit through KnowledgeRepository.update instead of raw SQL 2026-04-09 18:42:52 +02:00
ZdenekSrotyr
1287e63ed9 feat: complete system — web UI, all API endpoints, governance, admin, CLI commands
Major additions:
- Web UI: Jinja2 templates in FastAPI (login, dashboard, catalog, corporate memory, admin)
- API: catalog profiles/metrics, telegram verify/unlink/status, admin table registry CRUD
- Corporate memory governance: approve/reject/mandate/revoke/edit/batch + audit log
- Sync: real DataSyncManager trigger, sync-settings, table-subscriptions
- CLI: setup (init/test/deploy/verify), server (logs/restart/deploy/backup), explore
- Instance config integration (instance.yaml loaded at startup)
- 140 tests passing (25 new)
2026-03-27 16:52:22 +01:00
ZdenekSrotyr
a3918d3833 feat: add FastAPI server with auth, RBAC, and all API endpoints
- JWT auth with role-based access control (viewer/analyst/admin/km_admin)
- Endpoints: health, sync manifest, data download, query, users CRUD,
  corporate memory, session/artifact upload
- 18 API tests covering auth, RBAC, all endpoints
2026-03-27 15:19:18 +01:00