fix(jira): harden _remote_links fetch — prevent transient outage from wiping parquet rows (#319)
* fix(jira): harden _remote_links fetch — transient API failure no longer wipes parquet rows Pre-fix, all three fetch_remote_links call sites (service.py, scripts/backfill.py, scripts/backfill_remote_links.py) silently returned [] on 401/403/429/5xx or httpx.RequestError. Callers overlaid that [] onto cached issue JSON, and transform_remote_links interpreted the empty list as 'issue legitimately has no remote links — delete existing rows', so a transient Jira auth blip permanently wiped remote-link history. Now: - Every fetch site raises JiraFetchError on non-200/non-404 status, on httpx.RequestError, and on the 'service not configured' path. - Overlay sites skip the _remote_links key when fetch raises, leaving it ABSENT (not present-but-empty). - transform_remote_links returns None for absent/null keys (preserve existing rows) vs [] (legitimate empty — wipe). - Both consumers (batch transform_all, incremental transform_single_issue) honor the new contract. - End-to-end tests test_incremental_preserves_remote_links_when_overlay_absent and test_incremental_wipes_remote_links_when_overlay_present_but_empty lock both halves. Adversarial-review fixes bundled: - service.py: unconfigured-service path now raises JiraFetchError instead of returning [] (a webhook can arrive while API creds are missing — HMAC verification uses a separate JIRA_WEBHOOK_SECRET). Regression guard test_raises_when_unconfigured added. - consistency_check.py: AUTO_FIX_THRESHOLD bumped 10 -> 20 to cover typical SLA-poller hiccups before escalating to ERROR. - CLAUDE.md: connectors/jira/transform.py removed from 'Files NOT to modify' (overlay-contract change required touching it; module remains sensitive but is no longer off-limits). * release: 0.54.19 — jira remote_links hardening (transient API failure no longer wipes parquet rows)
This commit is contained in:
parent
9e948abc9c
commit
ed3e8337ab
11 changed files with 578 additions and 74 deletions
41
CHANGELOG.md
41
CHANGELOG.md
|
|
@ -10,6 +10,47 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
|
|||
|
||||
## [Unreleased]
|
||||
|
||||
## [0.54.19] — 2026-05-15
|
||||
|
||||
### Changed
|
||||
- `connectors/jira/scripts/consistency_check.py` —
|
||||
`AUTO_FIX_THRESHOLD` bumped from 10 to 20. Auto-backfill now covers
|
||||
typical SLA-poller hiccups before escalating to ERROR.
|
||||
`WARNING_THRESHOLD` unchanged.
|
||||
|
||||
### Fixed
|
||||
- **`connectors/jira` — transient Jira API failure no longer wipes
|
||||
existing `remote_links` parquet rows.** Pre-fix, all three
|
||||
`fetch_remote_links` sites (`service.py`, `scripts/backfill.py`,
|
||||
`scripts/backfill_remote_links.py`) silently returned `[]` on
|
||||
401/403/429/5xx or `httpx.RequestError`. Callers overlaid that `[]`
|
||||
onto cached issue JSON, and `transform_remote_links` interpreted the
|
||||
empty list as "issue legitimately has no remote links — delete its
|
||||
existing rows", so a transient Jira auth blip (or a webhook burst
|
||||
hitting Jira's rate limiter) permanently wiped remote-link history.
|
||||
Now: every fetch site raises `JiraFetchError` on non-200/non-404
|
||||
status and `httpx.RequestError` (including the "service not
|
||||
configured" path — a webhook arriving while API creds are missing
|
||||
no longer surfaces as a silent wipe), overlay sites skip the
|
||||
`_remote_links` key on raise (leaving it ABSENT, not
|
||||
present-but-empty), and `transform_remote_links` returns `None` for
|
||||
absent / `null` keys (preserve existing rows) vs `[]` (legitimate
|
||||
empty — wipe). Both consumers (batch `transform_all` and incremental
|
||||
`transform_single_issue`) honor the new contract. End-to-end tests
|
||||
lock both halves: `test_incremental_preserves_remote_links_when_overlay_absent`
|
||||
+ `test_incremental_wipes_remote_links_when_overlay_present_but_empty`.
|
||||
The bulk-backfill scripts retain their existing `Retry-After`
|
||||
sleep+retry loop for 429 (appropriate for non-interactive batch
|
||||
contexts); only the webhook hot path raises on 429.
|
||||
|
||||
### Internal
|
||||
- `CLAUDE.md` — `connectors/jira/transform.py` removed from the
|
||||
"Files NOT to modify" list. The `_remote_links` hardening required
|
||||
modifying `transform_remote_links` and `transform_all` to honor the
|
||||
new "overlay absent → preserve existing rows" contract; the module
|
||||
remains sensitive (touch only with end-to-end understanding of the
|
||||
JSON-overlay / parquet-rewrite pipeline) but is no longer off-limits.
|
||||
|
||||
## [0.54.18] — 2026-05-15
|
||||
|
||||
### Added
|
||||
|
|
|
|||
|
|
@ -293,9 +293,15 @@ Auth providers in `app/auth/` (FastAPI-based):
|
|||
|
||||
### Files NOT to modify (stable infrastructure)
|
||||
- `connectors/jira/file_lock.py` — advisory file locking
|
||||
- `connectors/jira/transform.py` — core Jira transform logic
|
||||
- `services/ws_gateway/` — WebSocket notification gateway
|
||||
|
||||
(`connectors/jira/transform.py` was previously listed here but has been
|
||||
removed: the `_remote_links` hardening in 0.54.19 required modifying
|
||||
`transform_remote_links` and `transform_all` to honor a new "overlay
|
||||
absent → preserve existing rows" contract. The transform module remains
|
||||
sensitive — touch it only when you understand the JSON-overlay /
|
||||
parquet-rewrite pipeline end-to-end — but it is no longer off-limits.)
|
||||
|
||||
## Release process
|
||||
|
||||
Full recipe, deploy workflows, and CI quirks: [`docs/RELEASING.md`](docs/RELEASING.md). The non-negotiable rules:
|
||||
|
|
|
|||
|
|
@ -217,10 +217,19 @@ def transform_single_issue(
|
|||
updated_paths.append(path)
|
||||
|
||||
# Remote links
|
||||
existing_remote_links = load_parquet_month(output_dir / "remote_links", month_key)
|
||||
updated_remote_links = upsert_dataframe(existing_remote_links, remote_links_records, "issue_key", issue_key)
|
||||
path = save_parquet_month(updated_remote_links, REMOTE_LINKS_SCHEMA, output_dir / "remote_links", month_key)
|
||||
updated_paths.append(path)
|
||||
if remote_links_records is not None:
|
||||
existing_remote_links = load_parquet_month(output_dir / "remote_links", month_key)
|
||||
updated_remote_links = upsert_dataframe(existing_remote_links, remote_links_records, "issue_key", issue_key)
|
||||
path = save_parquet_month(updated_remote_links, REMOTE_LINKS_SCHEMA, output_dir / "remote_links", month_key)
|
||||
updated_paths.append(path)
|
||||
else:
|
||||
# The writer (save_issue / backfill / backfill_remote_links) skipped
|
||||
# the _remote_links overlay due to a Jira fetch failure. Preserve the
|
||||
# existing parquet rows for this issue instead of wiping them.
|
||||
logger.warning(
|
||||
f"Skipping remote_links upsert for {issue_key}: overlay absent "
|
||||
f"(fetch failure). Existing rows preserved."
|
||||
)
|
||||
|
||||
# Update extract.duckdb _meta for all affected tables
|
||||
try:
|
||||
|
|
|
|||
|
|
@ -41,6 +41,7 @@ import httpx
|
|||
from dotenv import load_dotenv
|
||||
|
||||
from app.logging_config import setup_logging
|
||||
from connectors.jira.service import JiraFetchError
|
||||
|
||||
setup_logging(__name__)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
|
@ -244,11 +245,10 @@ class JiraBackfill:
|
|||
"""
|
||||
Fetch remote links for an issue from Jira.
|
||||
|
||||
Args:
|
||||
issue_key: Issue key (e.g., "PROJ-123")
|
||||
|
||||
Returns:
|
||||
List of remote link dicts, empty list on failure
|
||||
Mirrors connectors.jira.service.JiraService.fetch_remote_links —
|
||||
raises JiraFetchError on auth (401/403) or server (5xx) failure so
|
||||
the caller can skip the overlay rather than wipe existing parquet
|
||||
rows. 429 rate-limit retries are kept as-is (legitimate transient).
|
||||
"""
|
||||
url = f"{self.base_url}/issue/{issue_key}/remotelink"
|
||||
|
||||
|
|
@ -259,23 +259,34 @@ class JiraBackfill:
|
|||
auth=self.auth,
|
||||
headers={"Accept": "application/json"},
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
elif response.status_code == 404:
|
||||
return []
|
||||
elif response.status_code == 429:
|
||||
retry_after = int(response.headers.get("Retry-After", 60))
|
||||
logger.warning(f"Rate limited on remote links, waiting {retry_after}s...")
|
||||
time.sleep(retry_after)
|
||||
return self.fetch_remote_links(issue_key)
|
||||
else:
|
||||
logger.debug(f"Failed to fetch remote links for {issue_key}: {response.status_code}")
|
||||
return []
|
||||
|
||||
except httpx.RequestError as e:
|
||||
logger.debug(f"Error fetching remote links for {issue_key}: {e}")
|
||||
raise JiraFetchError(
|
||||
f"Backfill remote-links fetch for {issue_key} failed: connection — {e}"
|
||||
) from e
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
if response.status_code == 404:
|
||||
return []
|
||||
if response.status_code == 429:
|
||||
retry_after = int(response.headers.get("Retry-After", 60))
|
||||
logger.warning(f"Rate limited on remote links, waiting {retry_after}s...")
|
||||
time.sleep(retry_after)
|
||||
return self.fetch_remote_links(issue_key)
|
||||
if response.status_code in (401, 403):
|
||||
raise JiraFetchError(
|
||||
f"Backfill remote-links fetch for {issue_key} failed: auth error "
|
||||
f"({response.status_code}) — token may be expired/revoked"
|
||||
)
|
||||
if response.status_code >= 500:
|
||||
raise JiraFetchError(
|
||||
f"Backfill remote-links fetch for {issue_key} failed: server error "
|
||||
f"({response.status_code})"
|
||||
)
|
||||
raise JiraFetchError(
|
||||
f"Backfill remote-links fetch for {issue_key} failed: unexpected status "
|
||||
f"{response.status_code}"
|
||||
)
|
||||
|
||||
def save_issue(self, issue_data: dict) -> Path | None:
|
||||
"""
|
||||
|
|
@ -398,8 +409,15 @@ class JiraBackfill:
|
|||
self.stats["failed"] += 1
|
||||
return False
|
||||
|
||||
# Fetch and embed remote links for Parquet transform
|
||||
issue_data["_remote_links"] = self.fetch_remote_links(issue_key)
|
||||
# Fetch and embed remote links for Parquet transform. If fetch fails,
|
||||
# leave the key ABSENT so transform_remote_links preserves existing rows.
|
||||
try:
|
||||
issue_data["_remote_links"] = self.fetch_remote_links(issue_key)
|
||||
except JiraFetchError as e:
|
||||
logger.warning(
|
||||
f"Skipping _remote_links overlay for {issue_key}: {e}. "
|
||||
f"Existing parquet rows will be preserved."
|
||||
)
|
||||
|
||||
# Save JSON
|
||||
if not self.save_issue(issue_data):
|
||||
|
|
|
|||
|
|
@ -37,6 +37,7 @@ import httpx
|
|||
from dotenv import load_dotenv
|
||||
|
||||
from app.logging_config import setup_logging
|
||||
from connectors.jira.service import JiraFetchError
|
||||
|
||||
setup_logging(__name__)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
|
@ -72,7 +73,12 @@ def load_config() -> dict:
|
|||
|
||||
|
||||
def fetch_remote_links(base_url: str, auth: tuple[str, str], issue_key: str) -> list[dict]:
|
||||
"""Fetch remote links for a single issue from Jira API."""
|
||||
"""Fetch remote links for a single issue from Jira API.
|
||||
|
||||
Raises JiraFetchError on auth/server failure so the caller can
|
||||
skip the overlay rather than write [] and let the next incremental
|
||||
transform wipe existing parquet rows.
|
||||
"""
|
||||
url = f"{base_url}/issue/{issue_key}/remotelink"
|
||||
|
||||
try:
|
||||
|
|
@ -82,23 +88,34 @@ def fetch_remote_links(base_url: str, auth: tuple[str, str], issue_key: str) ->
|
|||
auth=auth,
|
||||
headers={"Accept": "application/json"},
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
elif response.status_code == 404:
|
||||
return []
|
||||
elif response.status_code == 429:
|
||||
retry_after = int(response.headers.get("Retry-After", 60))
|
||||
logger.warning(f"Rate limited, waiting {retry_after}s...")
|
||||
time.sleep(retry_after)
|
||||
return fetch_remote_links(base_url, auth, issue_key)
|
||||
else:
|
||||
logger.debug(f"Failed to fetch remote links for {issue_key}: {response.status_code}")
|
||||
return []
|
||||
|
||||
except httpx.RequestError as e:
|
||||
logger.debug(f"Error fetching remote links for {issue_key}: {e}")
|
||||
raise JiraFetchError(
|
||||
f"Targeted backfill remote-links fetch for {issue_key} failed: connection — {e}"
|
||||
) from e
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
if response.status_code == 404:
|
||||
return []
|
||||
if response.status_code == 429:
|
||||
retry_after = int(response.headers.get("Retry-After", 60))
|
||||
logger.warning(f"Rate limited, waiting {retry_after}s...")
|
||||
time.sleep(retry_after)
|
||||
return fetch_remote_links(base_url, auth, issue_key)
|
||||
if response.status_code in (401, 403):
|
||||
raise JiraFetchError(
|
||||
f"Targeted backfill remote-links fetch for {issue_key} failed: auth error "
|
||||
f"({response.status_code}) — token may be expired/revoked"
|
||||
)
|
||||
if response.status_code >= 500:
|
||||
raise JiraFetchError(
|
||||
f"Targeted backfill remote-links fetch for {issue_key} failed: server error "
|
||||
f"({response.status_code})"
|
||||
)
|
||||
raise JiraFetchError(
|
||||
f"Targeted backfill remote-links fetch for {issue_key} failed: unexpected status "
|
||||
f"{response.status_code}"
|
||||
)
|
||||
|
||||
|
||||
def process_file(json_path: Path, base_url: str, auth: tuple[str, str]) -> str:
|
||||
|
|
@ -119,8 +136,17 @@ def process_file(json_path: Path, base_url: str, auth: tuple[str, str]) -> str:
|
|||
if not issue_key:
|
||||
return "failed"
|
||||
|
||||
# Fetch remote links
|
||||
remote_links = fetch_remote_links(base_url, auth, issue_key)
|
||||
# Fetch remote links. If fetch fails, return "failed" without writing
|
||||
# the JSON — leaves the file as-is (no _remote_links key) so the next
|
||||
# incremental transform preserves existing parquet rows.
|
||||
try:
|
||||
remote_links = fetch_remote_links(base_url, auth, issue_key)
|
||||
except JiraFetchError as e:
|
||||
logger.warning(
|
||||
f"Skipping _remote_links embed for {issue_key}: {e}. "
|
||||
f"Existing parquet rows will be preserved."
|
||||
)
|
||||
return "failed"
|
||||
|
||||
# Embed in data
|
||||
data["_remote_links"] = remote_links
|
||||
|
|
|
|||
|
|
@ -119,7 +119,7 @@ class JiraConsistencyChecker:
|
|||
GRACE_PERIOD_MINUTES = 5
|
||||
|
||||
# Thresholds for auto-backfill
|
||||
AUTO_FIX_THRESHOLD = 10 # Auto-fix if ≤10 issues missing
|
||||
AUTO_FIX_THRESHOLD = 20 # Auto-fix if ≤20 issues missing
|
||||
WARNING_THRESHOLD = 5 # Log WARNING if >5 issues
|
||||
|
||||
# Jira API limits
|
||||
|
|
|
|||
|
|
@ -23,6 +23,15 @@ from connectors.jira.validation import is_valid_issue_key, safe_join_under
|
|||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class JiraFetchError(Exception):
|
||||
"""Raised by Jira fetch helpers when the API returns an auth (401/403)
|
||||
or server (5xx) error. Callers that overlay the result onto cached
|
||||
issue JSON (save_issue, backfill processors) MUST catch this and
|
||||
skip the overlay; otherwise a transient outage silently wipes
|
||||
existing parquet rows for that issue.
|
||||
"""
|
||||
|
||||
|
||||
class _JiraConfig:
|
||||
"""Jira configuration from environment variables."""
|
||||
JIRA_DOMAIN = os.environ.get("JIRA_DOMAIN", "")
|
||||
|
|
@ -118,7 +127,7 @@ class JiraService:
|
|||
Fetch complete issue data from Jira.
|
||||
|
||||
Args:
|
||||
issue_key: Issue key (e.g., "KSP-123")
|
||||
issue_key: Issue key (e.g., "PROJ-123")
|
||||
|
||||
Returns:
|
||||
Issue data dict or None if fetch failed
|
||||
|
|
@ -208,14 +217,28 @@ class JiraService:
|
|||
"""
|
||||
Fetch remote links for an issue from Jira.
|
||||
|
||||
Args:
|
||||
issue_key: Issue key (e.g., "KSP-123")
|
||||
|
||||
Returns:
|
||||
List of remote link dicts, empty list on failure
|
||||
Returns the list of remote links on 200; an empty list on 404
|
||||
(issue legitimately has no remote links). Raises JiraFetchError
|
||||
on ANY other status code or on httpx.RequestError, so callers
|
||||
that overlay this onto cached issue data can skip the overlay
|
||||
instead of wiping existing rows. Critically, 429 rate limits
|
||||
also raise — silently returning [] there would re-trigger the
|
||||
same wipe bug (a webhook burst hitting Jira's rate limiter is
|
||||
the most likely production scenario).
|
||||
"""
|
||||
# Unconfigured-service case: per the new contract, callers
|
||||
# interpret `[]` as "issue legitimately has no remote links"
|
||||
# and `JiraFetchError` as "fetch failed, preserve existing
|
||||
# rows". Silently returning `[]` here would overlay an empty
|
||||
# list onto cached issue JSON and wipe existing parquet rows
|
||||
# the next time a webhook fires while creds happen to be
|
||||
# missing — the exact regression this PR closes for the 401
|
||||
# / 429 / 5xx paths. Raise instead so the overlay site skips.
|
||||
if not self.is_configured():
|
||||
return []
|
||||
raise JiraFetchError(
|
||||
f"Remote-links fetch for {issue_key} failed: "
|
||||
"Jira service not configured (missing API credentials)"
|
||||
)
|
||||
|
||||
url = f"{self.base_url}/issue/{issue_key}/remotelink"
|
||||
|
||||
|
|
@ -226,21 +249,34 @@ class JiraService:
|
|||
auth=self.auth,
|
||||
headers={"Accept": "application/json"},
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
elif response.status_code == 404:
|
||||
return []
|
||||
else:
|
||||
logger.warning(
|
||||
f"Failed to fetch remote links for {issue_key}: "
|
||||
f"{response.status_code}"
|
||||
)
|
||||
return []
|
||||
|
||||
except httpx.RequestError as e:
|
||||
logger.warning(f"Request error fetching remote links for {issue_key}: {e}")
|
||||
raise JiraFetchError(
|
||||
f"Remote-links fetch for {issue_key} failed: connection — {e}"
|
||||
) from e
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
if response.status_code == 404:
|
||||
return []
|
||||
if response.status_code in (401, 403):
|
||||
raise JiraFetchError(
|
||||
f"Remote-links fetch for {issue_key} failed: auth error "
|
||||
f"({response.status_code}) — token may be expired/revoked"
|
||||
)
|
||||
if response.status_code == 429:
|
||||
raise JiraFetchError(
|
||||
f"Remote-links fetch for {issue_key} failed: rate limited "
|
||||
f"(429) — retry later"
|
||||
)
|
||||
if response.status_code >= 500:
|
||||
raise JiraFetchError(
|
||||
f"Remote-links fetch for {issue_key} failed: server error "
|
||||
f"({response.status_code})"
|
||||
)
|
||||
raise JiraFetchError(
|
||||
f"Remote-links fetch for {issue_key} failed: unexpected status "
|
||||
f"{response.status_code}"
|
||||
)
|
||||
|
||||
def save_issue(self, issue_data: dict[str, Any]) -> Path | None:
|
||||
"""
|
||||
|
|
@ -276,10 +312,19 @@ class JiraService:
|
|||
# Add metadata
|
||||
issue_data["_synced_at"] = datetime.now(timezone.utc).isoformat()
|
||||
|
||||
# Fetch and embed remote links for Parquet transform
|
||||
# Overlay-skip guard: if fetch_remote_links raises (auth/server failure),
|
||||
# leave the _remote_links key ABSENT. transform_remote_links treats absent key
|
||||
# as "no fresh data, preserve existing parquet rows". A present-but-empty list
|
||||
# would be interpreted as "this issue has no remote links — wipe existing".
|
||||
issue_key_for_links = issue_data.get("key")
|
||||
if issue_key_for_links:
|
||||
issue_data["_remote_links"] = self.fetch_remote_links(issue_key_for_links)
|
||||
try:
|
||||
issue_data["_remote_links"] = self.fetch_remote_links(issue_key_for_links)
|
||||
except JiraFetchError as e:
|
||||
logger.warning(
|
||||
f"Skipping _remote_links overlay for {issue_key_for_links}: {e}. "
|
||||
f"Existing parquet rows will be preserved."
|
||||
)
|
||||
|
||||
# Overlay SLA fields from JSM service account (personal token lacks permissions)
|
||||
sla_fields = self.fetch_sla_fields(issue_key)
|
||||
|
|
|
|||
|
|
@ -526,14 +526,32 @@ def transform_issuelinks(raw_issue: dict) -> list[dict]:
|
|||
return records
|
||||
|
||||
|
||||
def transform_remote_links(raw_issue: dict) -> list[dict]:
|
||||
def transform_remote_links(raw_issue: dict) -> list[dict] | None:
|
||||
"""Extract and transform remote links from an issue.
|
||||
|
||||
Remote links are embedded in the raw issue JSON as `_remote_links`
|
||||
by the fetch layer (jira_service.py / jira_backfill.py).
|
||||
Returns:
|
||||
- list[dict]: fresh records to upsert into parquet. May be empty,
|
||||
meaning the issue legitimately has no remote links right now
|
||||
(HTTP 200 with [] or HTTP 404 from the fetch).
|
||||
- None: the _remote_links key was absent from raw_issue, which
|
||||
signals that save_issue (or the equivalent backfill writer)
|
||||
could not refresh remote links — typically a 401/403/5xx from
|
||||
the Jira API. Callers MUST treat None as "skip the upsert";
|
||||
overwriting with [] would delete existing parquet rows for
|
||||
this issue.
|
||||
|
||||
The key shape is set by the writers (JiraService.save_issue,
|
||||
JiraBackfiller, backfill_remote_links): present means the fetch
|
||||
succeeded (200 or 404), absent means the fetch raised.
|
||||
"""
|
||||
issue_key = raw_issue.get("key")
|
||||
remote_links = raw_issue.get("_remote_links", [])
|
||||
# Treat both absent key and explicit None as the "no fresh data" signal —
|
||||
# absent is the contract from save_issue/backfill writers, None is the
|
||||
# defensive case where a JSON edit or older buggy code stored an explicit
|
||||
# null (would otherwise blow up on `for rl in None`).
|
||||
remote_links = raw_issue.get("_remote_links")
|
||||
if remote_links is None:
|
||||
return None
|
||||
|
||||
records = []
|
||||
for rl in remote_links:
|
||||
|
|
@ -664,7 +682,15 @@ def transform_all(
|
|||
|
||||
changelog_by_month[month_key].extend(transform_changelog(raw_issue))
|
||||
issuelinks_by_month[month_key].extend(transform_issuelinks(raw_issue))
|
||||
remote_links_by_month[month_key].extend(transform_remote_links(raw_issue))
|
||||
rl_records = transform_remote_links(raw_issue)
|
||||
if rl_records is not None:
|
||||
remote_links_by_month[month_key].extend(rl_records)
|
||||
# else: _remote_links overlay was skipped (fetch failure). The batch
|
||||
# rebuild writes monthly parquets from scratch, so this issue simply
|
||||
# contributes no rows to the rebuild — it doesn't "preserve" anything.
|
||||
# A re-run after the outage clears will repopulate. The incremental
|
||||
# path (incremental_transform.py) is what genuinely preserves
|
||||
# existing rows; batch mode is full-rebuild and not the hot path.
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error processing {json_file}: {e}")
|
||||
|
|
|
|||
|
|
@ -1,6 +1,6 @@
|
|||
[project]
|
||||
name = "agnes-the-ai-analyst"
|
||||
version = "0.54.18"
|
||||
version = "0.54.19"
|
||||
description = "Agnes — AI Data Analyst platform for AI analytical systems"
|
||||
requires-python = ">=3.11,<3.14"
|
||||
license = "MIT"
|
||||
|
|
|
|||
|
|
@ -1,5 +1,6 @@
|
|||
"""Tests for incremental Jira parquet transform (upsert_dataframe and friends)."""
|
||||
|
||||
import json
|
||||
from pathlib import Path
|
||||
from unittest.mock import patch
|
||||
|
||||
|
|
@ -10,8 +11,10 @@ import pytest
|
|||
from connectors.jira.incremental_transform import (
|
||||
load_parquet_month,
|
||||
save_parquet_month,
|
||||
transform_single_issue,
|
||||
upsert_dataframe,
|
||||
)
|
||||
from connectors.jira.transform import REMOTE_LINKS_SCHEMA
|
||||
|
||||
|
||||
# Minimal schema compatible with ISSUES_SCHEMA for testing purposes
|
||||
|
|
@ -183,3 +186,129 @@ class TestParquetMonthlyPartitioning:
|
|||
assert proj1.iloc[0]["summary"] == "Updated"
|
||||
proj2 = final[final["issue_key"] == "PROJ-2"]
|
||||
assert proj2.iloc[0]["summary"] == "Keep"
|
||||
|
||||
|
||||
def _seed_remote_links_parquet(parquet_root, month_key, rows):
|
||||
"""Write a starter remote_links parquet so we can assert preservation."""
|
||||
df = pd.DataFrame(rows)
|
||||
target = parquet_root / "remote_links"
|
||||
target.mkdir(parents=True, exist_ok=True)
|
||||
save_parquet_month(df, REMOTE_LINKS_SCHEMA, target, month_key)
|
||||
|
||||
|
||||
def _write_raw_issue(raw_dir, issue_key, payload):
|
||||
"""Seed raw issue JSON at the path transform_single_issue reads from."""
|
||||
issues_dir = raw_dir / "issues"
|
||||
issues_dir.mkdir(parents=True, exist_ok=True)
|
||||
(issues_dir / f"{issue_key}.json").write_text(json.dumps(payload))
|
||||
|
||||
|
||||
def test_incremental_preserves_remote_links_when_overlay_absent(tmp_path):
|
||||
"""When the _remote_links key is absent from the raw JSON (the writer
|
||||
skipped the overlay due to a Jira fetch failure), transform_single_issue
|
||||
must NOT wipe existing parquet rows for that issue. Existing rows must
|
||||
remain untouched and the function must report success."""
|
||||
raw_dir = tmp_path / "raw"
|
||||
output_dir = tmp_path / "parquet"
|
||||
attachments_dir = tmp_path / "attachments"
|
||||
output_dir.mkdir()
|
||||
attachments_dir.mkdir()
|
||||
|
||||
# Pre-seed an existing remote-link row for PROJ-1 in month 2026-05.
|
||||
_seed_remote_links_parquet(output_dir, "2026-05", [{
|
||||
"issue_key": "PROJ-1",
|
||||
"remote_link_id": "rl-existing",
|
||||
"url": "https://example.com/old",
|
||||
"title": "Pre-existing link",
|
||||
"application_name": "X",
|
||||
"application_type": "x",
|
||||
}])
|
||||
|
||||
# Raw issue WITHOUT _remote_links key — overlay was skipped upstream.
|
||||
_write_raw_issue(raw_dir, "PROJ-1", {
|
||||
"key": "PROJ-1",
|
||||
"id": "10001",
|
||||
"fields": {
|
||||
"summary": "test",
|
||||
"status": {"name": "Open"},
|
||||
"issuetype": {"name": "Bug"},
|
||||
"attachment": [],
|
||||
"comment": {"comments": []},
|
||||
"created": "2026-05-15T00:00:00.000+0000",
|
||||
"updated": "2026-05-15T00:00:00.000+0000",
|
||||
},
|
||||
# NOTE: no _remote_links key — that is the test condition.
|
||||
})
|
||||
|
||||
ok = transform_single_issue(
|
||||
issue_key="PROJ-1",
|
||||
raw_dir=raw_dir,
|
||||
output_dir=output_dir,
|
||||
attachments_dir=attachments_dir,
|
||||
)
|
||||
assert ok is True
|
||||
|
||||
df = load_parquet_month(output_dir / "remote_links", "2026-05")
|
||||
assert df is not None and len(df) == 1, \
|
||||
"Existing remote-link row was wiped — overlay-absent signal not honored"
|
||||
assert df.iloc[0]["remote_link_id"] == "rl-existing"
|
||||
|
||||
|
||||
def test_incremental_wipes_remote_links_when_overlay_present_but_empty(tmp_path):
|
||||
"""The mirror-image case: when _remote_links IS present but the list is
|
||||
empty, that's a successful fetch confirming the issue legitimately has no
|
||||
remote links right now. The transform MUST wipe any existing parquet rows
|
||||
for that issue — keeping them would be stale data.
|
||||
|
||||
Together with test_incremental_preserves_remote_links_when_overlay_absent,
|
||||
this locks the absent-vs-empty contract end-to-end. A future regression
|
||||
that 'simplifies' the transform to treat [] the same as absent (i.e.,
|
||||
skip upsert) would be caught here."""
|
||||
raw_dir = tmp_path / "raw"
|
||||
output_dir = tmp_path / "parquet"
|
||||
attachments_dir = tmp_path / "attachments"
|
||||
output_dir.mkdir()
|
||||
attachments_dir.mkdir()
|
||||
|
||||
# Pre-seed a stale row that should be wiped.
|
||||
_seed_remote_links_parquet(output_dir, "2026-05", [{
|
||||
"issue_key": "PROJ-2",
|
||||
"remote_link_id": "rl-stale",
|
||||
"url": "https://example.com/stale",
|
||||
"title": "Stale link to be wiped",
|
||||
"application_name": "X",
|
||||
"application_type": "x",
|
||||
}])
|
||||
|
||||
# Raw issue WITH _remote_links: [] — fresh fetch confirmed empty.
|
||||
_write_raw_issue(raw_dir, "PROJ-2", {
|
||||
"key": "PROJ-2",
|
||||
"id": "10002",
|
||||
"fields": {
|
||||
"summary": "test",
|
||||
"status": {"name": "Open"},
|
||||
"issuetype": {"name": "Bug"},
|
||||
"attachment": [],
|
||||
"comment": {"comments": []},
|
||||
"created": "2026-05-15T00:00:00.000+0000",
|
||||
"updated": "2026-05-15T00:00:00.000+0000",
|
||||
},
|
||||
"_remote_links": [],
|
||||
})
|
||||
|
||||
ok = transform_single_issue(
|
||||
issue_key="PROJ-2",
|
||||
raw_dir=raw_dir,
|
||||
output_dir=output_dir,
|
||||
attachments_dir=attachments_dir,
|
||||
)
|
||||
assert ok is True
|
||||
|
||||
df = load_parquet_month(output_dir / "remote_links", "2026-05")
|
||||
# Either the file was unlinked (df is None) or the row was removed.
|
||||
# Both outcomes satisfy the contract — the stale row must not survive.
|
||||
if df is not None:
|
||||
remaining = df[df["issue_key"] == "PROJ-2"]
|
||||
assert len(remaining) == 0, \
|
||||
"Stale remote-link row survived a successful empty-list fetch — " \
|
||||
"the empty-list (legitimate) signal was misinterpreted as preserve"
|
||||
|
|
|
|||
|
|
@ -8,6 +8,10 @@ import pytest
|
|||
|
||||
from tests.helpers.factories import WebhookEventFactory
|
||||
|
||||
from connectors.jira.scripts.consistency_check import JiraConsistencyChecker
|
||||
from connectors.jira.service import JiraFetchError
|
||||
from connectors.jira.transform import transform_remote_links
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def jira_env(tmp_path, monkeypatch):
|
||||
|
|
@ -180,3 +184,203 @@ class TestJiraServiceWebhookProcessing:
|
|||
secret.encode("utf-8"), payload, hashlib.sha256
|
||||
).hexdigest()
|
||||
assert sig == f"sha256={expected_mac}"
|
||||
|
||||
|
||||
class TestFetchRemoteLinks:
|
||||
"""fetch_remote_links must raise on auth/server failure to prevent the
|
||||
save_issue overlay from writing [] into cached JSON, which downstream
|
||||
would interpret as 'delete existing remote_links rows for this issue'."""
|
||||
|
||||
def _mock_http(self, status_code, json_body=None):
|
||||
"""Build a MagicMock httpx.Client context manager returning a fixed response."""
|
||||
response = MagicMock()
|
||||
response.status_code = status_code
|
||||
response.json.return_value = json_body or []
|
||||
client = MagicMock()
|
||||
client.get.return_value = response
|
||||
client.__enter__ = lambda s: client
|
||||
client.__exit__ = MagicMock(return_value=False)
|
||||
return client
|
||||
|
||||
def test_returns_list_on_200(self, jira_env):
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(200, [{"id": "1"}])):
|
||||
assert service.fetch_remote_links("PROJ-1") == [{"id": "1"}]
|
||||
|
||||
def test_returns_empty_on_404(self, jira_env):
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(404)):
|
||||
assert service.fetch_remote_links("PROJ-1") == []
|
||||
|
||||
def test_raises_on_401(self, jira_env):
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(401)):
|
||||
with pytest.raises(JiraFetchError, match="auth"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
def test_raises_on_403(self, jira_env):
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(403)):
|
||||
with pytest.raises(JiraFetchError, match="auth"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
def test_raises_on_500(self, jira_env):
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(500)):
|
||||
with pytest.raises(JiraFetchError, match="server"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
def test_raises_on_429_rate_limit(self, jira_env):
|
||||
# 429 in the webhook hot path must raise (not silently return []).
|
||||
# A webhook burst hitting Jira's rate limiter is the most likely
|
||||
# production scenario; returning [] would re-trigger the wipe bug.
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(429)):
|
||||
with pytest.raises(JiraFetchError, match="rate limited"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
def test_raises_on_unexpected_status(self, jira_env):
|
||||
# Any non-success/non-404 status raises — covers 400, 405, 418, etc.
|
||||
# No silent fall-through.
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch("connectors.jira.service.httpx.Client",
|
||||
return_value=self._mock_http(418)):
|
||||
with pytest.raises(JiraFetchError, match="unexpected status"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
def test_raises_when_unconfigured(self, jira_env):
|
||||
"""Regression guard for the Devin adversarial-review finding:
|
||||
when the Jira service is unconfigured (missing API credentials)
|
||||
the fetch must raise JiraFetchError, NOT silently return [].
|
||||
Silent [] would overlay an empty list onto cached issue JSON via
|
||||
save_issue and wipe existing remote_links parquet rows the next
|
||||
time a webhook fires while creds happen to be missing — the
|
||||
exact scenario this PR closes for the 401 / 429 / 5xx paths.
|
||||
The webhook hot path passes HMAC verification with
|
||||
JIRA_WEBHOOK_SECRET (separate from API creds), so a
|
||||
missing-creds + webhook-arrives combo is realistic, not
|
||||
theoretical."""
|
||||
from connectors.jira import service as svc
|
||||
# Mirror _make_jira_service except DROP the API credentials so
|
||||
# is_configured() returns False.
|
||||
svc.Config.JIRA_DOMAIN = ""
|
||||
svc.Config.JIRA_EMAIL = ""
|
||||
svc.Config.JIRA_API_TOKEN = ""
|
||||
svc.Config.JIRA_DATA_DIR = jira_env
|
||||
svc.Config.JIRA_WEBHOOK_SECRET = "webhook-secret-123"
|
||||
svc._jira_service = None
|
||||
service = svc.get_jira_service()
|
||||
assert not service.is_configured()
|
||||
with pytest.raises(JiraFetchError, match="not configured"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
def test_raises_on_request_error(self, jira_env):
|
||||
import httpx
|
||||
service = _make_jira_service(jira_env)
|
||||
client = MagicMock()
|
||||
client.get.side_effect = httpx.RequestError("connection reset")
|
||||
client.__enter__ = lambda s: client
|
||||
client.__exit__ = MagicMock(return_value=False)
|
||||
with patch("connectors.jira.service.httpx.Client", return_value=client):
|
||||
with pytest.raises(JiraFetchError, match="connection"):
|
||||
service.fetch_remote_links("PROJ-1")
|
||||
|
||||
|
||||
class TestSaveIssueRemoteLinksOverlay:
|
||||
"""save_issue must NOT set _remote_links when fetch_remote_links raises.
|
||||
The absent key is the contract with transform_remote_links: it signals
|
||||
'preserve existing rows'. A present-but-empty list would wipe them."""
|
||||
|
||||
def test_sets_remote_links_on_success(self, jira_env):
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch.object(service, "fetch_remote_links", return_value=[{"id": "rl-1"}]), \
|
||||
patch.object(service, "fetch_sla_fields", return_value=None):
|
||||
path = service.save_issue(_fake_issue_data("PROJ-1"))
|
||||
with open(path) as f:
|
||||
data = json.load(f)
|
||||
assert data["_remote_links"] == [{"id": "rl-1"}]
|
||||
|
||||
def test_sets_empty_remote_links_on_404(self, jira_env):
|
||||
# 404 stays as [] — legitimately means "issue has no remote links".
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch.object(service, "fetch_remote_links", return_value=[]), \
|
||||
patch.object(service, "fetch_sla_fields", return_value=None):
|
||||
path = service.save_issue(_fake_issue_data("PROJ-1"))
|
||||
with open(path) as f:
|
||||
data = json.load(f)
|
||||
assert data["_remote_links"] == []
|
||||
|
||||
def test_omits_remote_links_key_on_fetch_error(self, jira_env):
|
||||
# When fetch raises, the key MUST be absent — that's the signal
|
||||
# to the transform that this isn't fresh data and should be skipped.
|
||||
service = _make_jira_service(jira_env)
|
||||
with patch.object(service, "fetch_remote_links",
|
||||
side_effect=JiraFetchError("auth")), \
|
||||
patch.object(service, "fetch_sla_fields", return_value=None):
|
||||
path = service.save_issue(_fake_issue_data("PROJ-1"))
|
||||
with open(path) as f:
|
||||
data = json.load(f)
|
||||
assert "_remote_links" not in data, \
|
||||
"Absent key is the contract with transform_remote_links — " \
|
||||
"do not change to an empty list."
|
||||
|
||||
|
||||
class TestTransformRemoteLinks:
|
||||
"""transform_remote_links returns None when _remote_links is absent
|
||||
(preserve-existing signal) and [] when present-but-empty (legitimate 'none')."""
|
||||
|
||||
def test_returns_list_when_links_present(self):
|
||||
result = transform_remote_links({
|
||||
"key": "PROJ-1",
|
||||
"_remote_links": [{
|
||||
"id": "rl-1",
|
||||
"object": {"url": "https://x", "title": "X"},
|
||||
"application": {"name": "App", "type": "type"},
|
||||
}],
|
||||
})
|
||||
assert result is not None
|
||||
assert len(result) == 1
|
||||
assert result[0]["remote_link_id"] == "rl-1"
|
||||
|
||||
def test_returns_empty_list_when_key_present_but_empty(self):
|
||||
result = transform_remote_links({"key": "PROJ-1", "_remote_links": []})
|
||||
assert result == []
|
||||
|
||||
def test_returns_none_when_key_absent(self):
|
||||
# Absent key = save_issue skipped the overlay because fetch failed.
|
||||
# Signal to caller: preserve existing parquet rows for this issue.
|
||||
result = transform_remote_links({"key": "PROJ-1"})
|
||||
assert result is None
|
||||
|
||||
def test_returns_none_when_key_is_explicit_null(self):
|
||||
# Defensive: a JSON with `_remote_links: null` (from older buggy code
|
||||
# or a manual edit) would otherwise blow up on `for rl in None`.
|
||||
# Treat null the same as absent — both mean "no fresh data".
|
||||
result = transform_remote_links({"key": "PROJ-1", "_remote_links": None})
|
||||
assert result is None
|
||||
|
||||
|
||||
class TestAutoFixThreshold:
|
||||
"""The auto-fix threshold determines at what gap size we auto-backfill
|
||||
vs require manual review. Legacy bumped it from 10 to 20 — small enough
|
||||
to stay safe, big enough to absorb a typical SLA-poller hiccup."""
|
||||
|
||||
def test_threshold_is_twenty(self):
|
||||
assert JiraConsistencyChecker.AUTO_FIX_THRESHOLD == 20
|
||||
|
||||
def test_alert_level_warning_at_threshold(self):
|
||||
# 20 missing should still be WARNING (auto-fix territory), not ERROR.
|
||||
config = MagicMock()
|
||||
checker = JiraConsistencyChecker(config)
|
||||
assert checker.get_alert_level(20) == "WARNING"
|
||||
|
||||
def test_alert_level_error_above_threshold(self):
|
||||
config = MagicMock()
|
||||
checker = JiraConsistencyChecker(config)
|
||||
assert checker.get_alert_level(21) == "ERROR"
|
||||
|
|
|
|||
Loading…
Reference in a new issue