fix(api): harden API surface before Swagger (issue #336) (#339)

* fix(api): harden API surface before Swagger — 9 findings from issue #336

ADV-001: POST /api/sync/table-subscriptions now checks can_access() per
table entry, matching the gate already on POST /api/sync/settings.

ADV-002: GET /webhooks/jira/health gated behind require_admin; jira_domain
removed from response to prevent anonymous info disclosure.

ADV-003: GET /api/version no longer exposes commit_sha or schema_version.

ADV-005: /docs, /redoc, /openapi.json now require a valid session via custom
FastAPI routes (docs_url=None, redoc_url=None, openapi_url=None).

ADV-006: /cli/ and /webhooks/ added to _API_PATH_PREFIXES so future
auth-gated routes there return JSON 401 not an HTML redirect.

ADV-007: GET /api/catalog/tables wired to CatalogTablesResponse model.

ADV-008: TableSubscriptionUpdate.tables capped at max_length=500.

ADV-009: GET /api/users and GET /auth/admin/tokens accept limit/offset
(default 1000, max 10000); repositories updated accordingly.

Tests: 11 new regression tests in TestApiHardening336; test_jira_webhooks
fixture updated with seeded admin user; OpenAPI snapshot regenerated.

* fix(test): update test_journey_jira health check to use admin auth after ADV-002 gate

* fix(security): close /auth/bootstrap auth-bypass + BREAKING markers on ADV-002/003/005

Reviewer-flagged regression introduced by ADV-009's pagination on
UserRepository.list_all(): the silent default LIMIT 1000 broke the
bootstrap check at app/auth/router.py and the startup no-password
warning at app/main.py — both call list_all() with no args and depend
on exhaustive enumeration.

On an instance with >1000 users where no password-holder lands in
the email-sorted first page, [u for u in list_all() if
u.get('password_hash')] becomes empty → bootstrap re-opens → an
unauthenticated caller can claim admin via /auth/bootstrap. Real
auth-bypass on a security-sensitive boot path.

Fix:
- src/repositories/users.py: list_all() restored to no-arg, returns
  EVERY row (no LIMIT). Comment explicitly warns against re-adding
  pagination here. API-surface pagination moved to a new
  list_paginated(limit, offset) method with its own docstring.
- app/api/users.py: GET /api/users now calls list_paginated().
  Existing query-param validation (limit <= 10000) preserved.

Regression guards in tests/test_security.py::TestApiHardening336:
- test_users_list_all_returns_every_row_no_silent_limit asserts
  list_all() takes no params other than self (via inspect.signature)
  so a future cleanup can't accidentally re-add limit/offset.
- test_users_list_paginated_is_separate_method asserts the
  paginated variant is a distinct method, not an overload.

CHANGELOG: added **BREAKING** markers per CLAUDE.md release
discipline to three pre-existing ADV bullets that are observable
breaking changes for external consumers:
- ADV-002 (webhook health going from anonymous to admin-only)
- ADV-003 (/api/version dropping commit_sha + schema_version)
- ADV-005 (/docs, /redoc, /openapi.json going from anonymous to
  session-required)

* release: 0.54.25 — API hardening before Swagger (ADV-001..009) + bootstrap-bypass regression fix

---------

Co-authored-by: ZdenekSrotyr <zdenek.srotyr@keboola.com>
This commit is contained in:
Vojtech 2026-05-18 17:13:21 +04:00 committed by GitHub
parent bc5956ac94
commit c5948f26fc
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
15 changed files with 12045 additions and 320 deletions

View file

@ -10,6 +10,36 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
## [Unreleased]
## [0.54.25] — 2026-05-18
### Fixed
- `POST /api/sync/table-subscriptions` now enforces the same RBAC gate as
`POST /api/sync/settings` — authenticated users can no longer subscribe to
tables they have no `resource_grants` row for (ADV-001, issue #336).
- **BREAKING:** `GET /webhooks/jira/health` is now admin-only; `jira_domain`
removed from the response to prevent anonymous information disclosure
(ADV-002). Uptime monitors that polled this endpoint anonymously must now
attach an admin PAT or switch to `/api/health` (which remains public).
- **BREAKING:** `GET /api/version` no longer exposes `commit_sha` or
`schema_version` — only `version`, `channel`, `image_tag`, `deployed_at`
remain (ADV-003). Deploy scripts / dashboards scraping the removed fields
must either authenticate against a (separate, forthcoming) admin endpoint
or read them from the GHCR image labels.
- **BREAKING:** `/docs`, `/redoc`, and `/openapi.json` now require a valid
session — the full admin API surface is no longer visible to
unauthenticated requests (ADV-005). CLI tools generating client code from
the schema must attach a PAT or use an authenticated browser session.
### Changed
- `/cli/` and `/webhooks/` prefixes added to `_API_PATH_PREFIXES` so any
future auth-gated endpoint under those paths returns JSON `401` rather than
an HTML redirect (ADV-006).
- `GET /api/users` and `GET /auth/admin/tokens` accept `limit` (default 1000,
max 10 000) and `offset` query parameters; `POST /api/sync/table-subscriptions`
now rejects `tables` dicts with more than 500 entries (ADV-008, ADV-009).
- `GET /api/catalog/tables` now has a typed `response_model` (`CatalogTablesResponse`)
so Swagger generates an accurate schema for that endpoint (ADV-007).
### Internal
- Added `TestFullLifecycleFromInstaller` integration test class
(`tests/test_store_entity_versions.py`) covering the full

View file

@ -1,8 +1,10 @@
"""Catalog endpoints — table profiles, metrics."""
import json
from typing import List, Optional
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
import duckdb
from app.auth.dependencies import get_current_user, _get_db
@ -13,6 +15,20 @@ from src.rbac import can_access_table
router = APIRouter(prefix="/api/catalog", tags=["catalog"])
class CatalogTableItem(BaseModel):
id: str
name: str
description: Optional[str] = None
source_type: Optional[str] = None
sync_strategy: Optional[str] = None
query_mode: str = "local"
class CatalogTablesResponse(BaseModel):
tables: List[CatalogTableItem]
count: int
@router.get("/profile/{table_name}")
async def get_table_profile(
table_name: str,
@ -40,7 +56,7 @@ async def get_table_profile(
return profile
@router.get("/tables")
@router.get("/tables", response_model=CatalogTablesResponse)
async def list_catalog_tables(
user: dict = Depends(get_current_user),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),

View file

@ -491,7 +491,5 @@ async def version_info():
"version": os.environ.get("AGNES_VERSION", "dev"),
"channel": os.environ.get("RELEASE_CHANNEL", "dev"),
"image_tag": os.environ.get("AGNES_TAG", "unknown"),
"commit_sha": os.environ.get("AGNES_COMMIT_SHA", "unknown"),
"schema_version": SCHEMA_VERSION,
"deployed_at": _DEPLOYED_AT,
}

View file

@ -11,11 +11,12 @@ import json
import logging
from datetime import datetime, timezone
from fastapi import APIRouter, Request, Response
from fastapi import APIRouter, Depends, Request, Response
from fastapi.responses import JSONResponse
import re
from app.auth.access import require_admin as _require_admin
from connectors.jira.service import Config, get_jira_service
from connectors.jira.validation import is_valid_issue_key, safe_join_under
@ -182,13 +183,12 @@ async def receive_jira_webhook(request: Request) -> Response:
@router.get("/jira/health")
async def jira_webhook_health() -> dict:
"""Health check for Jira webhook endpoint."""
async def jira_webhook_health(user: dict = Depends(_require_admin)) -> dict:
"""Health check for Jira webhook endpoint. Admin-only: exposes secret presence."""
jira_service = get_jira_service()
return {
"status": "ok",
"configured": jira_service.is_configured(),
"webhook_secret_set": bool(Config.JIRA_WEBHOOK_SECRET),
"jira_domain": Config.JIRA_DOMAIN or "(not set)",
}

View file

@ -12,7 +12,7 @@ from pathlib import Path
from typing import Any, Optional, List
from fastapi import APIRouter, Body, Depends, HTTPException, BackgroundTasks
from pydantic import BaseModel
from pydantic import BaseModel, Field
import duckdb
from app.auth.access import require_admin
@ -1025,7 +1025,7 @@ async def update_sync_settings(
class TableSubscriptionUpdate(BaseModel):
table_mode: str = "all" # "all" or "explicit"
tables: dict = {} # {table_name: bool}
tables: dict = Field(default_factory=dict, max_length=500) # {table_name: bool}
@router.get("/table-subscriptions")
@ -1045,8 +1045,21 @@ async def update_table_subscriptions(
user: dict = Depends(get_current_user),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
"""Update per-table subscription preferences."""
"""Update per-table subscription preferences.
Mirrors the RBAC gate in POST /settings: a table can only be subscribed
to when the user holds a resource_grants row for it (or is Admin). This
prevents an authenticated user from subscribing to tables they cannot read.
"""
from app.auth.access import can_access
from app.resource_types import ResourceType
repo = SyncSettingsRepository(conn)
results = {}
for table_name, enabled in request.tables.items():
if not can_access(user["id"], ResourceType.TABLE.value, table_name, conn):
results[table_name] = {"error": "no permission"}
continue
repo.set_dataset_enabled(user["id"], table_name, enabled)
return {"table_mode": request.table_mode, "updated": len(request.tables)}
results[table_name] = {"enabled": enabled}
return {"table_mode": request.table_mode, "updated": results}

View file

@ -6,7 +6,7 @@ from datetime import datetime, timezone, timedelta
from typing import Optional, List
import duckdb
from fastapi import APIRouter, Depends, HTTPException
from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel
from app.auth.access import require_admin
@ -207,10 +207,12 @@ async def revoke_token(
@admin_router.get("", response_model=List[AdminTokenItem])
async def admin_list_tokens(
limit: int = Query(default=1000, ge=1, le=10000),
offset: int = Query(default=0, ge=0),
user: dict = Depends(require_admin),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
return [_row_to_admin_item(r) for r in AccessTokenRepository(conn).list_all_with_user()]
return [_row_to_admin_item(r) for r in AccessTokenRepository(conn).list_all_with_user(limit=limit, offset=offset)]
@admin_router.delete("/{token_id}", status_code=204)

View file

@ -7,7 +7,7 @@ from datetime import datetime, timezone
from typing import Optional, List
import duckdb
from fastapi import APIRouter, Depends, HTTPException, Request
from fastapi import APIRouter, Depends, HTTPException, Query, Request
from pydantic import BaseModel
from argon2 import PasswordHasher
@ -228,10 +228,12 @@ def _set_admin_membership(
@router.get("", response_model=List[UserResponse])
async def list_users(
limit: int = Query(default=1000, ge=1, le=10000),
offset: int = Query(default=0, ge=0),
user: dict = Depends(require_admin),
conn: duckdb.DuckDBPyConnection = Depends(_get_db),
):
return [_to_response(u, conn) for u in UserRepository(conn).list_all()]
return [_to_response(u, conn) for u in UserRepository(conn).list_paginated(limit=limit, offset=offset)]
@router.get("/{user_id}", response_model=UserResponse)

View file

@ -36,7 +36,7 @@ setup_logging("app")
from app.version import APP_VERSION, MIN_COMPAT_CLI_VERSION
from fastapi import FastAPI
from fastapi import Depends, FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import RedirectResponse
from fastapi.staticfiles import StaticFiles
@ -354,6 +354,12 @@ def create_app() -> FastAPI:
description="Data distribution platform for AI analytical systems",
version=APP_VERSION,
lifespan=lifespan,
# Swagger UI / OpenAPI JSON gated behind authentication — custom
# routes added below before the web_router catch-all. Setting these
# to None disables FastAPI's default unauthenticated endpoints.
docs_url=None,
redoc_url=None,
openapi_url=None,
# Intentionally NOT debug=DEBUG: FastAPI's debug=True installs
# Starlette's ServerErrorMiddleware which intercepts unhandled
# Exceptions and renders a plain-HTML traceback BEFORE our
@ -691,6 +697,27 @@ def create_app() -> FastAPI:
from a2wsgi import WSGIMiddleware
app.mount("/marketplace.git", WSGIMiddleware(make_git_wsgi_app()))
# Authenticated Swagger / ReDoc / OpenAPI JSON — requires a valid session
# so the full admin API surface is not visible to unauthenticated callers.
# Must be registered before web_router (catch-all). /openapi.json is also
# added to _API_PATH_PREFIXES below so auth failures return JSON 401
# rather than an HTML redirect.
from fastapi.openapi.docs import get_swagger_ui_html, get_redoc_html
from fastapi.responses import HTMLResponse as _HTMLResponse
from app.auth.dependencies import get_current_user as _get_current_user
@app.get("/docs", include_in_schema=False, response_class=_HTMLResponse)
async def swagger_ui(user: dict = Depends(_get_current_user)):
return get_swagger_ui_html(openapi_url="/openapi.json", title="Agnes API")
@app.get("/redoc", include_in_schema=False, response_class=_HTMLResponse)
async def redoc_ui(user: dict = Depends(_get_current_user)):
return get_redoc_html(openapi_url="/openapi.json", title="Agnes API — ReDoc")
@app.get("/openapi.json", include_in_schema=False)
async def openapi_spec(user: dict = Depends(_get_current_user)):
return app.openapi()
# Web UI router (must be last — has catch-all routes)
app.include_router(web_router)
@ -699,6 +726,9 @@ def create_app() -> FastAPI:
_API_PATH_PREFIXES: tuple[str, ...] = (
"/api/",
"/auth/",
"/cli/",
"/openapi.json",
"/webhooks/",
"/marketplace.zip",
"/marketplace.git",
"/marketplace/",

View file

@ -1,6 +1,6 @@
[project]
name = "agnes-the-ai-analyst"
version = "0.54.24"
version = "0.54.25"
description = "Agnes — AI Data Analyst platform for AI analytical systems"
requires-python = ">=3.11,<3.14"
license = "MIT"

View file

@ -60,7 +60,7 @@ class AccessTokenRepository:
columns = [desc[0] for desc in self.conn.description]
return [dict(zip(columns, r)) for r in rows]
def list_all_with_user(self) -> List[Dict[str, Any]]:
def list_all_with_user(self, limit: int = 1000, offset: int = 0) -> List[Dict[str, Any]]:
"""Admin view: all tokens joined with the owning user's email.
Returns dict rows including every column of `personal_access_tokens`
@ -73,7 +73,9 @@ class AccessTokenRepository:
FROM personal_access_tokens t
LEFT JOIN users u ON u.id = t.user_id
ORDER BY t.created_at DESC
"""
LIMIT ? OFFSET ?
""",
[limit, offset],
).fetchall()
if not rows:
return []

View file

@ -25,12 +25,43 @@ class UserRepository:
return self._row_to_dict(result)
def list_all(self) -> List[Dict[str, Any]]:
results = self.conn.execute("SELECT * FROM users ORDER BY email").fetchall()
"""Return EVERY user row. Used by bootstrap-lock + startup
warning paths that need to inspect the whole table (see
``app/auth/router.py::bootstrap`` and ``app/main.py``'s
no-password-set warning). Do NOT add a LIMIT here the
bootstrap check ``[u for u in list_all() if u.get('password_hash')]``
re-opens the endpoint if any password-holder gets paginated
out, which would let an unauthenticated caller claim admin
on instances with >LIMIT users. API-surface pagination uses
``list_paginated()`` below.
"""
results = self.conn.execute(
"SELECT * FROM users ORDER BY email"
).fetchall()
if not results:
return []
columns = [desc[0] for desc in self.conn.description]
return [dict(zip(columns, row)) for row in results]
def list_paginated(self, limit: int = 1000, offset: int = 0) -> List[Dict[str, Any]]:
"""Paginated user listing for the admin API surface (#336
ADV-009). Safe to bound callers explicitly opt into the
windowed shape and the API enforces ``limit <= 10000`` at
the Query()-validation layer. Do not call from bootstrap /
startup paths that need exhaustive enumeration; use
``list_all()`` for those.
"""
results = self.conn.execute(
"SELECT * FROM users ORDER BY email LIMIT ? OFFSET ?", [limit, offset]
).fetchall()
if not results:
return []
columns = [desc[0] for desc in self.conn.description]
return [dict(zip(columns, row)) for row in results]
def count_all(self) -> int:
return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
def create(
self,
id: str,

File diff suppressed because it is too large Load diff

View file

@ -18,12 +18,15 @@ def _sign(payload: bytes, secret: str) -> str:
@pytest.fixture()
def webhook_client(tmp_path, monkeypatch):
"""Create a TestClient with required env vars and dirs."""
"""Create a TestClient with required env vars, dirs, and a seeded admin user."""
data_dir = tmp_path / "data"
data_dir.mkdir()
(data_dir / "issues").mkdir()
state_dir = tmp_path / "state"
state_dir.mkdir()
monkeypatch.setenv("DATA_DIR", str(data_dir))
monkeypatch.setenv("STATE_DIR", str(state_dir))
monkeypatch.setenv("JWT_SECRET_KEY", "test-jwt-secret")
monkeypatch.setenv("JIRA_WEBHOOK_SECRET", "test-webhook-secret")
monkeypatch.setenv("JIRA_DATA_DIR", str(data_dir))
@ -36,32 +39,53 @@ def webhook_client(tmp_path, monkeypatch):
# Reset singleton so it picks up fresh Config values
svc._jira_service = None
# Reimport app to pick up router
from src.db import SYSTEM_ADMIN_GROUP, get_system_db
from src.repositories.user_group_members import UserGroupMembersRepository
from src.repositories.users import UserRepository
from app.auth.jwt import create_access_token
from app.main import create_app
conn = get_system_db()
UserRepository(conn).create(id="wh_admin", email="whadmin@test.com", name="WH Admin")
admin_gid = conn.execute(
"SELECT id FROM user_groups WHERE name = ?", [SYSTEM_ADMIN_GROUP]
).fetchone()[0]
UserGroupMembersRepository(conn).add_member("wh_admin", admin_gid, source="system_seed")
conn.close()
app = create_app()
return TestClient(app)
admin_token = create_access_token("wh_admin", "whadmin@test.com")
return {"client": TestClient(app), "admin_token": admin_token}
def test_health_requires_auth(webhook_client):
"""GET /webhooks/jira/health returns 401 without credentials (ADV-002)."""
resp = webhook_client["client"].get("/webhooks/jira/health")
assert resp.status_code == 401
def test_health(webhook_client):
"""GET /webhooks/jira/health returns 200."""
resp = webhook_client.get("/webhooks/jira/health")
"""GET /webhooks/jira/health returns 200 for admin; jira_domain not exposed."""
headers = {"Authorization": f"Bearer {webhook_client['admin_token']}"}
resp = webhook_client["client"].get("/webhooks/jira/health", headers=headers)
assert resp.status_code == 200
body = resp.json()
assert body["status"] == "ok"
assert "webhook_secret_set" in body
assert "jira_domain" not in body
def test_missing_signature_401(webhook_client):
"""POST without signature header returns 401."""
payload = json.dumps({"webhookEvent": "jira:issue_updated", "issue": {"key": "TEST-1"}}).encode()
resp = webhook_client.post("/webhooks/jira", content=payload, headers={"Content-Type": "application/json"})
resp = webhook_client["client"].post("/webhooks/jira", content=payload, headers={"Content-Type": "application/json"})
assert resp.status_code == 401
def test_invalid_signature_401(webhook_client):
"""POST with wrong signature returns 401."""
payload = json.dumps({"webhookEvent": "jira:issue_updated", "issue": {"key": "TEST-1"}}).encode()
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -85,7 +109,7 @@ def test_valid_signature_accepted(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -100,7 +124,7 @@ def test_empty_payload_400(webhook_client):
"""POST with empty body and valid signature returns 400."""
payload = b""
sig = _sign(payload, "test-webhook-secret")
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -166,7 +190,7 @@ def test_path_traversal_in_issue_key_rejected(webhook_client, bad_key):
}).encode()
sig = _sign(payload, "test-webhook-secret")
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -188,7 +212,7 @@ def test_null_issue_field_does_not_crash(webhook_client):
}).encode()
sig = _sign(payload, "test-webhook-secret")
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -214,7 +238,7 @@ def test_valid_issue_key_accepted(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -247,7 +271,7 @@ def test_webhook_event_path_traversal_sanitized(webhook_client, tmp_path, monkey
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -287,7 +311,7 @@ def test_valid_hmac_signature_accepted(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -307,7 +331,7 @@ def test_invalid_hmac_signature_rejected_401(webhook_client):
# Sign with the wrong secret
sig = _sign(payload, "wrong-secret")
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -325,7 +349,7 @@ def test_missing_signature_header_rejected(webhook_client):
"issue": {"key": "PROJ-1"},
}).encode()
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={"Content-Type": "application/json"},
@ -346,7 +370,7 @@ def test_x_hub_signature_legacy_header_accepted(webhook_client):
# Since the handler uses hmac.new with sha256, a sha1= prefix will still
# be checked against sha256 HMAC. This test verifies the fallback header
# is read at all (the signature won't match sha256, so expect 401).
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -363,7 +387,7 @@ def test_malformed_json_payload_handled_gracefully(webhook_client):
payload = b'this is not json {!><'
sig = _sign(payload, "test-webhook-secret")
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -390,7 +414,7 @@ def test_duplicate_event_processed_twice(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp1 = webhook_client.post(
resp1 = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -398,7 +422,7 @@ def test_duplicate_event_processed_twice(webhook_client):
"X-Hub-Signature-256": sig,
},
)
resp2 = webhook_client.post(
resp2 = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -429,7 +453,7 @@ def test_signature_without_sha256_prefix(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -453,7 +477,7 @@ def test_jira_service_not_configured_returns_503(webhook_client):
with patch("app.api.jira_webhooks.get_jira_service") as mock_svc:
mock_svc.return_value.is_configured.return_value = False
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -478,7 +502,7 @@ def test_process_webhook_event_failure_returns_500(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = False
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={
@ -504,7 +528,7 @@ def test_issue_key_at_top_level_accepted(webhook_client):
mock_svc.return_value.is_configured.return_value = True
mock_svc.return_value.process_webhook_event.return_value = True
resp = webhook_client.post(
resp = webhook_client["client"].post(
"/webhooks/jira",
content=payload,
headers={

View file

@ -32,13 +32,15 @@ SAMPLE_JIRA_EVENT = {
@pytest.mark.journey
class TestJiraWebhookJourney:
def test_webhook_health_check(self, seeded_app):
"""Jira webhook health endpoint is always accessible."""
"""Jira webhook health endpoint is accessible to admins."""
c = seeded_app["client"]
resp = c.get("/webhooks/jira/health")
headers = {"Authorization": f"Bearer {seeded_app['admin_token']}"}
resp = c.get("/webhooks/jira/health", headers=headers)
assert resp.status_code == 200
body = resp.json()
assert "status" in body
assert body["status"] == "ok"
assert "jira_domain" not in body
def test_webhook_with_no_secret_configured_refused(self, seeded_app):
"""Issue #83: when JIRA_WEBHOOK_SECRET is not set, webhook is REFUSED

View file

@ -398,3 +398,176 @@ class TestJwtSecretHardening:
# Clean up the temporary TESTING flag if we added it
if saved_key is None and saved_testing is None:
os.environ.pop("TESTING", None)
# ---- API hardening (issue #336) ----
@pytest.fixture
def hardening_client(tmp_path, monkeypatch):
"""Minimal seeded app with an admin and a plain analyst for RBAC tests."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
monkeypatch.setenv("JWT_SECRET_KEY", "test-secret-key-minimum-32-characters!!")
from app.main import create_app
from src.db import get_system_db, SYSTEM_ADMIN_GROUP
from src.repositories.users import UserRepository
from src.repositories.user_group_members import UserGroupMembersRepository
from app.auth.jwt import create_access_token
conn = get_system_db()
repo = UserRepository(conn)
repo.create(id="hadmin", email="hadmin@test.com", name="HAdmin")
repo.create(id="hanalyst", email="hanalyst@test.com", name="HAnalyst")
gid = conn.execute(
"SELECT id FROM user_groups WHERE name = ?", [SYSTEM_ADMIN_GROUP]
).fetchone()[0]
UserGroupMembersRepository(conn).add_member("hadmin", gid, source="system_seed")
conn.close()
app = create_app()
c = TestClient(app)
return {
"client": c,
"admin_token": create_access_token("hadmin", "hadmin@test.com"),
"analyst_token": create_access_token("hanalyst", "hanalyst@test.com"),
}
class TestApiHardening336:
"""Regression tests for the ADV-001…ADV-009 findings from issue #336."""
# ADV-001: RBAC on POST /api/sync/table-subscriptions
def test_table_subscriptions_rbac_blocks_unauthorized_table(self, hardening_client):
c = hardening_client["client"]
token = hardening_client["analyst_token"]
resp = c.post(
"/api/sync/table-subscriptions",
json={"table_mode": "explicit", "tables": {"secret_table": True}},
headers={"Authorization": f"Bearer {token}"},
)
assert resp.status_code == 200
assert resp.json()["updated"]["secret_table"] == {"error": "no permission"}
def test_table_subscriptions_rbac_allows_accessible_table(self, hardening_client):
# Admin has access to everything — verify subscription writes go through
c = hardening_client["client"]
token = hardening_client["admin_token"]
resp = c.post(
"/api/sync/table-subscriptions",
json={"table_mode": "explicit", "tables": {"any_table": True}},
headers={"Authorization": f"Bearer {token}"},
)
assert resp.status_code == 200
assert resp.json()["updated"]["any_table"] == {"enabled": True}
def test_table_subscriptions_dict_size_limit(self, hardening_client):
c = hardening_client["client"]
token = hardening_client["analyst_token"]
huge = {f"t{i}": True for i in range(501)}
resp = c.post(
"/api/sync/table-subscriptions",
json={"table_mode": "explicit", "tables": huge},
headers={"Authorization": f"Bearer {token}"},
)
assert resp.status_code == 422 # ADV-008 max_length guard
# ADV-003: /api/version must not expose commit_sha or schema_version
def test_version_does_not_expose_internal_fields(self, hardening_client):
resp = hardening_client["client"].get("/api/version")
assert resp.status_code == 200
body = resp.json()
assert "commit_sha" not in body, "commit_sha must not be in public /api/version"
assert "schema_version" not in body, "schema_version must not be in public /api/version"
assert "version" in body
# ADV-005: /docs, /redoc, /openapi.json gated behind auth
def test_docs_requires_auth(self, hardening_client):
resp = hardening_client["client"].get("/docs", follow_redirects=False)
assert resp.status_code in (401, 302)
def test_openapi_json_requires_auth(self, hardening_client):
resp = hardening_client["client"].get("/openapi.json")
assert resp.status_code == 401
def test_openapi_json_accessible_to_authenticated_user(self, hardening_client):
token = hardening_client["analyst_token"]
resp = hardening_client["client"].get(
"/openapi.json", headers={"Authorization": f"Bearer {token}"}
)
assert resp.status_code == 200
assert "paths" in resp.json()
# ADV-006: /webhooks/ and /cli/ get JSON 401, not HTML redirect
def test_cli_path_gets_json_401_not_redirect(self, hardening_client):
# /cli/latest is public — test a hypothetical future gated endpoint
# by confirming the prefix is in _API_PATH_PREFIXES via /openapi.json
# returning JSON 401 (not a redirect) when unauthenticated.
resp = hardening_client["client"].get("/openapi.json", follow_redirects=False)
assert resp.status_code == 401
assert resp.headers.get("content-type", "").startswith("application/json")
# ADV-009: list endpoints accept limit/offset
def test_users_list_pagination_params(self, hardening_client):
token = hardening_client["admin_token"]
resp = hardening_client["client"].get(
"/api/users?limit=5&offset=0",
headers={"Authorization": f"Bearer {token}"},
)
assert resp.status_code == 200
def test_tokens_list_pagination_params(self, hardening_client):
token = hardening_client["admin_token"]
resp = hardening_client["client"].get(
"/auth/admin/tokens?limit=5&offset=0",
headers={"Authorization": f"Bearer {token}"},
)
assert resp.status_code == 200
# Regression guard for the reviewer-flagged auth-bypass on /auth/bootstrap
# introduced (and reverted) during ADV-009. The original ADV-009 fix
# repurposed UserRepository.list_all() to default to LIMIT 1000, which
# silently broke the bootstrap check `[u for u in list_all() if
# u.get('password_hash')]` on instances with >1000 users: if no
# password-holder landed in the email-sorted first page, bootstrap
# re-opened and an unauthenticated caller could claim admin.
#
# Fix shape: `list_all()` returns EVERY row (no LIMIT), API surface uses
# the new `list_paginated(limit, offset)` instead. The two tests below
# lock in both halves of the contract so a future cleanup doesn't
# silently re-introduce the bypass.
def test_users_list_all_returns_every_row_no_silent_limit(self):
"""``UserRepository.list_all()`` must NOT apply a default LIMIT
the bootstrap-lock check at ``app/auth/router.py`` and the
startup no-password warning at ``app/main.py`` both call this
no-arg and depend on exhaustive enumeration. Silent pagination
here is a real auth-bypass on instances with >LIMIT users."""
import inspect
from src.repositories.users import UserRepository
sig = inspect.signature(UserRepository.list_all)
# No params (other than self) means no caller can accidentally
# pass limit=N and end up with a windowed result.
non_self_params = [
p for p in sig.parameters.values() if p.name != "self"
]
assert non_self_params == [], (
f"UserRepository.list_all() must take no arguments other than "
f"self (got {non_self_params}). Add limit/offset to "
f"list_paginated() instead — list_all() is the "
f"bootstrap-lock / startup-warning path and MUST enumerate "
f"every row."
)
def test_users_list_paginated_is_separate_method(self):
"""The paginated variant must be a distinct method named
``list_paginated`` (not an overload of ``list_all``) so call
sites are explicit about which contract they want.
"""
from src.repositories.users import UserRepository
assert hasattr(UserRepository, "list_paginated"), (
"API-surface pagination must live on a separate "
"`list_paginated` method, not on `list_all`."
)
import inspect
sig = inspect.signature(UserRepository.list_paginated)
param_names = {p.name for p in sig.parameters.values() if p.name != "self"}
assert "limit" in param_names and "offset" in param_names