agnes-the-ai-analyst/tests/test_jira_webhooks.py
ZdenekSrotyr 2f783c5c0a
fix(security): close Jira webhook fail-open + path traversal (#83) (#93)
* fix(security): close Jira webhook fail-open + path traversal (#83)

Two related vulnerabilities:

1. Fail-open signature check: when JIRA_WEBHOOK_SECRET was unset,
   _verify_signature returned True and any unauthenticated POST to
   /webhooks/jira would run the full ingest pipeline. Now fail-closed —
   the handler short-circuits with 503 (operator-misconfiguration signal,
   distinct from 401 wrong-signature) when the secret is missing.

2. Path traversal via attacker-controlled issue_key: webhook payloads
   carry issue.key, which flowed unsanitized into save_issue (issues_dir /
   "{issue_key}.json"), download_attachment (attachments_dir / issue_key),
   and incremental_transform (raw_dir / "issues" / "{issue_key}.json"). A
   crafted webhook with issue.key="../../etc/passwd" could write outside
   the Jira data dir.
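
For illustration, the fail-closed check in item 1 could look roughly like
the sketch below. The name `verify_signature` and its status-code return
value are hypothetical; only the 503/401 split and the `sha256=<hex>`
header format come from this change and its tests.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(payload: bytes, header: Optional[str], secret: str) -> int:
    """Return the HTTP status the handler would answer with (hypothetical helper)."""
    if not secret:
        # Fail closed: a missing secret is an operator error, not "auth disabled".
        return 503
    if not header or not header.startswith("sha256="):
        return 401
    expected = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    supplied = header[len("sha256="):]
    # Compare as bytes in constant time so arbitrary header content is safe.
    if not hmac.compare_digest(supplied.encode("utf-8"), expected.encode("ascii")):
        return 401
    return 200
```

Keeping 503 separate from 401 lets alerting tell a misconfigured
deployment apart from a forged request.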

Defense-in-depth: new connectors/jira/validation.py exposes
is_valid_issue_key (whitelist regex ^[A-Z][A-Z0-9_]{0,31}-\d{1,12}$) and
safe_join_under (Path.resolve() containment check). Both are enforced at
the webhook entry point AND at every filesystem boundary in the connector.
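
A minimal sketch of a Path.resolve() containment check in the spirit of
safe_join_under. The signature and the ValueError are assumptions; the
real helper in connectors/jira/validation.py may differ.

```python
from pathlib import Path


def safe_join_under(base: Path, *parts: str) -> Path:
    """Join parts onto base and refuse any result that lands outside base."""
    base_resolved = base.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = base_resolved.joinpath(*parts).resolve()
    if candidate != base_resolved and base_resolved not in candidate.parents:
        raise ValueError(f"path escapes {base_resolved}: {candidate}")
    return candidate
```

An absolute component such as "/etc/passwd" replaces the base in
joinpath, so it is caught by the same containment test.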

Tests:
- New tests/test_jira_validation.py — unit tests for both helpers
  (parametrized invalid keys, traversal/symlink/absolute-path cases).
- Webhook tests: test_unconfigured_secret_returns_503,
  test_path_traversal_in_issue_key_rejected (parametrized over 10 bad keys),
  test_valid_issue_key_accepted.

CHANGELOG: two CRITICAL Fixed bullets under Unreleased.

Closes #83.

* fix(security): close remaining #83 review findings — webhookEvent traversal, _handle_deletion guard, regex tightening

Reviewer of PR #93 flagged four MUST-FIXes:

1. _log_webhook_event used the attacker-controlled `webhookEvent` field
   as a filename component without sanitization. Payload with
   `webhookEvent: "../../tmp/pwn"` could escape WEBHOOK_LOG_DIR. Now:
   - non-`[A-Za-z0-9_-]` runs are replaced with `_` (dot excluded so
     `..` cannot survive sanitization as a directory component)
   - length capped at 64 chars
   - final path routed through safe_join_under
   New regression test `test_webhook_event_path_traversal_sanitized`.

2. _handle_deletion (connectors/jira/service.py:530) and
   process_webhook_event (line 487) still used raw issue_key in path
   builds. Even though the webhook handler validates upstream, the
   "defense-in-depth at every filesystem boundary" claim required these
   too. Both now run is_valid_issue_key and safe_join_under guards.

3. Regex `^[A-Z][A-Z0-9_]{0,31}-\d{1,12}$` permitted underscores in
   project keys. Atlassian's project-key validator does not — `A_B-1`
   is rejected by Jira itself. Tightened to `[A-Z0-9]` and updated
   tests: `ABC_DEF-1` is now invalid, added Cyrillic А-1 (lookalike),
   CRLF, and oversize cases to the bad-key parametrization.

4. Existing test test_deletion_of_nonexistent_issue_returns_true used
   `PROJ-NOEXIST` which is not a real Jira key shape. Updated to
   `PROJ-99999`. The test still exercises the same intent (deletion of
   issue with no local file is idempotent).
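
The webhookEvent sanitization in item 1 can be sketched as follows. The
function name and the "unknown" fallback for an empty result are
assumptions made for this example; the character class and the 64-char
cap are taken from the description above.

```python
import re

# Anything outside [A-Za-z0-9_-] is unsafe; dot is excluded on purpose.
_UNSAFE_RUN = re.compile(r"[^A-Za-z0-9_-]+")


def sanitize_event_name(raw: str, max_len: int = 64) -> str:
    """Collapse each run of unsafe characters to one underscore, then cap length."""
    cleaned = _UNSAFE_RUN.sub("_", raw)[:max_len]
    return cleaned or "unknown"


print(sanitize_event_name("../../tmp/pwn"))       # _tmp_pwn
print(sanitize_event_name("jira:issue_updated"))  # jira_issue_updated
```

The result would still be routed through safe_join_under before any
file is opened.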

73/73 jira tests pass locally (test_jira_webhooks + test_jira_validation
+ test_jira_service + test_jira_service_full + test_jira_incremental).

CHANGELOG updated to document the regex tightening and the new
webhookEvent sanitization.

Refs review of #93.

* fix(tests): test_journey_jira tests assumed fail-open before #83 fix

CI failure on PR #93 caught two journey tests that pinned the OLD
fail-open contract:

- test_webhook_with_no_secret_configured_accepted asserted 200 when
  JIRA_WEBHOOK_SECRET was unset. After the #83 fix that's a 503
  (operator misconfig). Renamed to _refused and flipped the assertion.

- test_webhook_empty_payload_rejected didn't set the secret, so the
  503 short-circuit fired before the empty-payload 400 could. Set
  JIRA_WEBHOOK_SECRET in the patched Config so the test exercises the
  intended path.

56/56 jira journey + webhook + validation tests now pass.

* fix(security): #93 round-3 — webhook fallback format + save_issue early validation

Devin Review caught two real findings:

1. Webhook handler regression: the round-2 fix extracted issue_key only
   from event_data['issue']['key'], but process_webhook_event has long
   supported a fallback top-level 'issue_key' field for certain Jira
   event formats (e.g. delete events historically). As a result, the
   handler rejected those events with 400 before they reached the
   service layer.
   Fix: mirror process_webhook_event's fallback in the handler: try
   issue.key first, then fall through to event_data.get('issue_key')
   when it is empty. is_valid_issue_key still validates whichever
   source provided the key.

2. save_issue defense-in-depth was incomplete: is_valid_issue_key ran
   AFTER fetch_remote_links and fetch_sla_fields had already used the
   unvalidated issue_key in HTTP URL construction
   ({base_url}/issue/{issue_key}/remotelink etc.). A future internal
   caller invoking save_issue directly with attacker-controlled input
   could trigger outbound requests with a malicious path component
   (limited SSRF / URL-path manipulation against the Jira API server).
   Fix: move the is_valid_issue_key check to immediately after the
   null guard, before any HTTP request or filesystem op. The webhook
   layer still validates upstream; this is the second layer.
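
The fallback lookup from item 1 might be sketched like this. The helper
name `extract_issue_key` and the isinstance guard are hypothetical;
whichever key is returned still has to pass is_valid_issue_key.

```python
from typing import Any, Mapping, Optional


def extract_issue_key(event_data: Mapping[str, Any]) -> Optional[str]:
    """Prefer issue.key, then fall back to a top-level issue_key field."""
    issue = event_data.get("issue")
    key = issue.get("key") if isinstance(issue, dict) else None
    if not key:
        # Some event formats carry the key at the top level instead.
        key = event_data.get("issue_key")
    return key
```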

66 jira tests pass.

Refs Devin Review of #93.

* fix(changelog): #93 round-4 — add BREAKING marker to fail-closed bullet

Devin Review caught: the JIRA_WEBHOOK_SECRET fail-closed change is a
behavior change for operators (response code 503 vs old 200) that
existing alerting may treat differently. Per CLAUDE.md changelog
discipline rule, operators grep for **BREAKING** before bumping the
pin. Added the marker + a short note on what action operators need
to take (set the env var if they haven't).

Refs Devin Review of #93.

* fix: #93 round-5 — null-issue crash + comment drift

Devin Review caught two findings on the round-4 commit:

1. Pre-existing crash on null issue field: a webhook payload with
   {"issue": null} (rather than omitting the key) caused
   event_data.get("issue", {}) to return None, then issue.get("key")
   raised AttributeError → unhandled 500. Pre-existing but reachable.
   Fix: 'event_data.get("issue") or {}' normalises None to {}, then
   the existing fallback / validation path returns 400 cleanly.
   New regression test test_null_issue_field_does_not_crash.

2. Inline comment drift: the comment at line 77 documented the allowed
   character class as [A-Za-z0-9._-] (with dot) but the regex at line 27
   excludes dot deliberately (so '..' cannot survive sanitization).
   Fixed the comment to match.
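
The difference behind item 1 is easy to reproduce in isolation. The
snippet below is a standalone demonstration, not code from the handler:

```python
import json

event_data = json.loads('{"webhookEvent": "jira:issue_updated", "issue": null}')

# The default only applies when the key is absent, so an explicit null wins.
with_default = event_data.get("issue", {})
# `or {}` also replaces None (and any other falsy value) with an empty dict.
normalised = event_data.get("issue") or {}

print(with_default)  # None
print(normalised)    # {}
```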

52 jira tests pass.

Refs Devin Review of #93 round 5.

* fix: #93 round-6 — process_webhook_event also normalises null issue field

Devin Review caught: the webhook handler at app/api/jira_webhooks.py
correctly handles {"issue": null} via 'event_data.get("issue") or {}',
but process_webhook_event at connectors/jira/service.py:509 still
used the bare 'event_data.get("issue", {})' which returns None on
explicit null. Internal callers (anything that invokes
process_webhook_event without going through the HTTP handler) would
hit the same AttributeError the round-5 fix closed at the handler
layer. Same one-line fix.

32 jira tests pass.

Refs Devin Review of #93 round 5.

* fix: #93 round-7 — issue-key regex uses [0-9] not \d

Devin Review caught: Python 3's \d matches any Unicode decimal digit
(Arabic-Indic ٣, Bengali ৩, Devanagari ३, …). A key like TEST-٣ would
pass the regex even though it's not a valid Jira input. Tightened to
[0-9] (ASCII only).
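
A standalone demonstration of the difference (the two pattern names are
only for this example):

```python
import re

unicode_digits = re.compile(r"^[A-Z][A-Z0-9]{0,31}-\d{1,12}$")
ascii_digits = re.compile(r"^[A-Z][A-Z0-9]{0,31}-[0-9]{1,12}$")

key = "TEST-\u0663"  # ARABIC-INDIC DIGIT THREE

# For str patterns, \d matches any Unicode decimal digit.
print(unicode_digits.match(key) is not None)  # True
# [0-9] matches only the ten ASCII digits.
print(ascii_digits.match(key) is not None)    # False
```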

Added three Unicode-digit cases to the bad-key parametrization in
test_jira_validation.py to lock in the contract.

Refs Devin Review of #93 round 6.

* fix: #93 round-8 — use \\Z anchor not $ in issue-key regex

Devin Review caught: Python's $ anchor matches before a trailing \n,
so re.match('…$', 'TEST-1\n') returns a match. is_valid_issue_key
returned True for keys with a trailing newline. \Z is hard end-of-string
and closes that bypass.

Manual verification:
  is_valid_issue_key('TEST-1\n') → False (was True before fix)
  is_valid_issue_key('TEST-1\r\n') → False
  is_valid_issue_key('TEST-1') → True
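
The anchor behaviour can be checked with a short sketch. The pattern
below combines the tightenings described in this PR and is an
approximation of the final validator, not a copy of it:

```python
import re

with_dollar = re.compile(r"^[A-Z][A-Z0-9]{0,31}-[0-9]{1,12}$")
with_hard_end = re.compile(r"^[A-Z][A-Z0-9]{0,31}-[0-9]{1,12}\Z")


def is_valid_issue_key(key: object) -> bool:
    """Approximate validator: ASCII-only key shape, anchored at true end of string."""
    return isinstance(key, str) and with_hard_end.match(key) is not None


# $ tolerates one trailing newline; \Z does not.
print(with_dollar.match("TEST-1\n") is not None)  # True
print(is_valid_issue_key("TEST-1\n"))             # False
print(is_valid_issue_key("TEST-1"))               # True
```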

Refs Devin Review of #93 round 7.

* docs: #93 round-9 — CHANGELOG regex matches implementation
2026-04-27 19:53:55 +02:00

268 lines
9.3 KiB
Python

"""Tests for Jira webhook FastAPI router."""
import hashlib
import hmac
import json
import os
import tempfile

import pytest
from fastapi.testclient import TestClient


def _sign(payload: bytes, secret: str) -> str:
    """Compute sha256=<HMAC hex> for a given payload and secret."""
    mac = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    return f"sha256={mac}"


@pytest.fixture()
def webhook_client(tmp_path, monkeypatch):
    """Create a TestClient with required env vars and dirs."""
    data_dir = tmp_path / "data"
    data_dir.mkdir()
    (data_dir / "issues").mkdir()

    monkeypatch.setenv("DATA_DIR", str(data_dir))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-jwt-secret")
    monkeypatch.setenv("JIRA_WEBHOOK_SECRET", "test-webhook-secret")
    monkeypatch.setenv("JIRA_DATA_DIR", str(data_dir))

    # Re-read env into Config (class attrs read os.environ at import time)
    from connectors.jira import service as svc
    monkeypatch.setattr(svc.Config, "JIRA_WEBHOOK_SECRET", "test-webhook-secret")
    monkeypatch.setattr(svc.Config, "JIRA_DATA_DIR", data_dir)
    # Reset singleton so it picks up fresh Config values
    svc._jira_service = None

    # Reimport app to pick up router
    from app.main import create_app
    app = create_app()
    return TestClient(app)


def test_health(webhook_client):
    """GET /webhooks/jira/health returns 200."""
    resp = webhook_client.get("/webhooks/jira/health")
    assert resp.status_code == 200
    body = resp.json()
    assert body["status"] == "ok"
    assert "webhook_secret_set" in body


def test_missing_signature_401(webhook_client):
    """POST without signature header returns 401."""
    payload = json.dumps({"webhookEvent": "jira:issue_updated", "issue": {"key": "TEST-1"}}).encode()
    resp = webhook_client.post("/webhooks/jira", content=payload, headers={"Content-Type": "application/json"})
    assert resp.status_code == 401


def test_invalid_signature_401(webhook_client):
    """POST with wrong signature returns 401."""
    payload = json.dumps({"webhookEvent": "jira:issue_updated", "issue": {"key": "TEST-1"}}).encode()
    resp = webhook_client.post(
        "/webhooks/jira",
        content=payload,
        headers={
            "Content-Type": "application/json",
            "X-Hub-Signature-256": "sha256=badhex",
        },
    )
    assert resp.status_code == 401


def test_valid_signature_accepted(webhook_client):
    """POST with correct HMAC-SHA256 passes signature check (not 401)."""
    from unittest.mock import patch

    payload = json.dumps({"webhookEvent": "jira:issue_updated", "issue": {"key": "TEST-1"}}).encode()
    sig = _sign(payload, "test-webhook-secret")
    # Mock process_webhook_event so the test only checks HMAC validation,
    # not the full Jira API flow (which requires a real Jira connection).
    with patch("app.api.jira_webhooks.get_jira_service") as mock_svc:
        mock_svc.return_value.is_configured.return_value = True
        mock_svc.return_value.process_webhook_event.return_value = True
        resp = webhook_client.post(
            "/webhooks/jira",
            content=payload,
            headers={
                "Content-Type": "application/json",
                "X-Hub-Signature-256": sig,
            },
        )
    assert resp.status_code == 200


def test_empty_payload_400(webhook_client):
    """POST with empty body and valid signature returns 400."""
    payload = b""
    sig = _sign(payload, "test-webhook-secret")
    resp = webhook_client.post(
        "/webhooks/jira",
        content=payload,
        headers={
            "Content-Type": "application/json",
            "X-Hub-Signature-256": sig,
        },
    )
    assert resp.status_code == 400


def test_unconfigured_secret_returns_503(tmp_path, monkeypatch):
    """Issue #83: missing JIRA_WEBHOOK_SECRET must fail-closed (no fall-through to 200)."""
    data_dir = tmp_path / "data"
    data_dir.mkdir()
    (data_dir / "issues").mkdir()

    monkeypatch.setenv("DATA_DIR", str(data_dir))
    monkeypatch.setenv("JWT_SECRET_KEY", "test-jwt-secret")
    monkeypatch.delenv("JIRA_WEBHOOK_SECRET", raising=False)
    monkeypatch.setenv("JIRA_DATA_DIR", str(data_dir))

    from connectors.jira import service as svc
    monkeypatch.setattr(svc.Config, "JIRA_WEBHOOK_SECRET", "")
    monkeypatch.setattr(svc.Config, "JIRA_DATA_DIR", data_dir)
    svc._jira_service = None

    from app.main import create_app
    client = TestClient(create_app())

    payload = json.dumps({"webhookEvent": "jira:issue_updated", "issue": {"key": "TEST-1"}}).encode()
    resp = client.post(
        "/webhooks/jira",
        content=payload,
        headers={"Content-Type": "application/json"},
    )
    assert resp.status_code == 503
    assert "secret" in resp.json()["detail"].lower()


@pytest.mark.parametrize(
    "bad_key",
    [
        "../../etc/passwd",
        "../foo",
        "TEST-1/../../../bar",
        "TEST-1\x00.json",
        "TEST-1\r\n",  # CRLF injection
        "test-1",  # lowercase project; Jira keys are uppercase
        "TEST",  # missing -<num>
        "TEST-",  # missing num
        "-1",  # missing project
        "",  # empty
        "A" * 100 + "-1",  # absurd length
        "ABC_DEF-1",  # underscore; not allowed in real Jira
        "А-1",  # Cyrillic А (looks like Latin A)
    ],
)
def test_path_traversal_in_issue_key_rejected(webhook_client, bad_key):
    """Issue #83: malformed issue keys must be rejected with 400, not used in paths."""
    payload = json.dumps({
        "webhookEvent": "jira:issue_updated",
        "issue": {"key": bad_key},
    }).encode()
    sig = _sign(payload, "test-webhook-secret")
    resp = webhook_client.post(
        "/webhooks/jira",
        content=payload,
        headers={
            "Content-Type": "application/json",
            "X-Hub-Signature-256": sig,
        },
    )
    assert resp.status_code == 400, f"key {bad_key!r} should have been rejected, got {resp.status_code}"


def test_null_issue_field_does_not_crash(webhook_client):
    """Issue #83 round-5: a payload with `issue: null` (not just missing)
    used to raise AttributeError on `issue.get('key')` → unhandled 500.
    The handler now normalises None to {} and falls through to the
    400 'Malformed or missing issue key' response."""
    payload = json.dumps({
        "webhookEvent": "jira:issue_updated",
        "issue": None,
    }).encode()
    sig = _sign(payload, "test-webhook-secret")
    resp = webhook_client.post(
        "/webhooks/jira",
        content=payload,
        headers={
            "Content-Type": "application/json",
            "X-Hub-Signature-256": sig,
        },
    )
    assert resp.status_code == 400
    assert "issue key" in resp.json()["detail"].lower()


def test_valid_issue_key_accepted(webhook_client):
    """Sanity: a well-formed issue key still passes validation."""
    from unittest.mock import patch

    payload = json.dumps({
        "webhookEvent": "jira:issue_updated",
        "issue": {"key": "PROJ-42"},
    }).encode()
    sig = _sign(payload, "test-webhook-secret")
    with patch("app.api.jira_webhooks.get_jira_service") as mock_svc:
        mock_svc.return_value.is_configured.return_value = True
        mock_svc.return_value.process_webhook_event.return_value = True
        resp = webhook_client.post(
            "/webhooks/jira",
            content=payload,
            headers={
                "Content-Type": "application/json",
                "X-Hub-Signature-256": sig,
            },
        )
    assert resp.status_code == 200


def test_webhook_event_path_traversal_sanitized(webhook_client, tmp_path, monkeypatch):
    """Issue #83: `webhookEvent` is attacker-controlled and was used to build
    the webhook log filename. A payload with `../../tmp/pwn` for `webhookEvent`
    must NOT escape the WEBHOOK_LOG_DIR; the file (if written at all) lands
    under WEBHOOK_LOG_DIR with the traversal characters sanitized."""
    from unittest.mock import patch
    import app.api.jira_webhooks as wh

    log_dir = tmp_path / "webhook_log"
    log_dir.mkdir()
    monkeypatch.setattr(wh, "WEBHOOK_LOG_DIR", log_dir)

    payload = json.dumps({
        "webhookEvent": "../../tmp/pwn",
        "issue": {"key": "TEST-1"},
    }).encode()
    sig = _sign(payload, "test-webhook-secret")
    with patch("app.api.jira_webhooks.get_jira_service") as mock_svc:
        mock_svc.return_value.is_configured.return_value = True
        mock_svc.return_value.process_webhook_event.return_value = True
        resp = webhook_client.post(
            "/webhooks/jira",
            content=payload,
            headers={
                "Content-Type": "application/json",
                "X-Hub-Signature-256": sig,
            },
        )
    assert resp.status_code == 200

    # No file landed outside log_dir.
    parent = log_dir.parent
    assert not (parent / "tmp" / "pwn.json").exists(), "path traversal succeeded"

    # Either nothing was written (refused), or file is under log_dir with
    # traversal chars replaced by underscores.
    written = list(log_dir.glob("*.json"))
    for f in written:
        assert f.is_relative_to(log_dir), f"file {f} escaped log dir"
        assert "/" not in f.name and ".." not in f.name