* feat(observability): optional PostHog integration (errors, LLM traces, replay, flags)

Off by default. Activates when POSTHOG_API_KEY is set in env. Defaults
to PostHog Cloud EU; override host for US Cloud or self-hosted.

Coverage:
- FastAPI 500 handler captures unhandled exceptions
- src/orchestrator.py rebuild + rebuild_source failures
- services/scheduler/ HTTP-job failures
- cli/main.py uncaught CLI errors (typer.Exit/SystemExit/KeyboardInterrupt
  skipped; flushes before re-raise so short-lived CLI invocations don't
  drop events)
- connectors/llm/anthropic_provider.py + openai_compat.py emit
  $ai_generation events with provider, model, latency, token counts
  (prompt/completion bodies stay off unless POSTHOG_LLM_PAYLOADS=1
  because LLM prompts here routinely include customer SQL/data)
- Browser snippet injected into every text/html response by
  PosthogInjectionMiddleware, registered inside the GZip layer so it
  sees uncompressed HTML before compression. Many templates are
  standalone (their own DOCTYPE) and never extend base.html, so a
  per-template include would miss them.
- Frontend: $pageview, $pageleave, JS error capture via window.error
  and unhandledrejection handlers, masked session replay
  (maskAllInputs: true plus CSS-selector mask for known data surfaces),
  feature flags (browser posthog.isFeatureEnabled + server-side
  feature_enabled with fallback for older SDKs).

Identification mode is operator-configurable: none / id / email / full.
The default, email, ships user.id + email but never name. CLI entry
point moves from cli.main:app to cli.main:main (Typer wrapper).

Files:
- src/observability/posthog_client.py: lazy singleton, no network
  when disabled, single-process flush on shutdown
- src/observability/llm_tracing.py: trace_generation context manager
- app/middleware/posthog_inject.py: HTML rewrite middleware
- app/web/templates/_posthog.html: browser snippet template
- docs/observability.md: operator guide
- config/.env.template: documented POSTHOG_* knobs
- tests/test_posthog_disabled.py + tests/test_posthog_client.py +
  tests/test_llm_tracing.py: 18 tests covering disabled state,
  identify-mode payloads, $ai_generation shape, error variant.

CHANGELOG entry under [Unreleased] Added.
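
The CLI behaviour listed under Coverage (skip control-flow exits, capture everything else, flush before re-raising) could look roughly like the sketch below. It is an illustration only, not the code in cli/main.py: `run_cli`, `capture`, and `flush` are hypothetical names, and the typer.Exit case is left out so the block stays stdlib-only.

```python
from typing import Callable


def run_cli(
    app: Callable[[], None],
    capture: Callable[[BaseException], None],
    flush: Callable[[], None],
) -> None:
    """Run `app`, reporting unexpected errors before they propagate.

    All three parameters are hypothetical stand-ins: `app` for the Typer
    application, `capture` and `flush` for the observability client.
    """
    try:
        app()
    except (SystemExit, KeyboardInterrupt):
        # Ordinary control flow for a CLI, so it is never reported.
        raise
    except Exception as exc:
        capture(exc)
        # Flush synchronously: the process exits right after the re-raise,
        # so a background sender might never deliver the event.
        flush()
        raise
```

Passing the client calls in as parameters keeps the wrapper testable without a network connection.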

* feat(observability): tag every PostHog event with environment + release

Splits PostHog dashboards cleanly between localhost / dev / staging /
production without manual tagging on every capture call.

- POSTHOG_ENVIRONMENT explicit override; auto-resolves to "local" when
  LOCAL_DEV_MODE=1, else RELEASE_CHANNEL, else AGNES_DEPLOYMENT_ENV,
  else "unknown".
- AGNES_VERSION → RELEASE_CHANNEL fallback feeds the `release` property
  for "is this error new in this release?" cohorting.
- Backend gets both via the PostHog SDK's super_properties constructor
  arg (every captured event picks them up automatically).
- Browser snippet calls posthog.register({environment, release}) inside
  the loaded callback so $pageview, $exception, autocapture, etc. all
  carry the same labels.
- request.state.user now populated by auth dependencies so the snippet
  can actually call posthog.identify(user_id, {email}) for logged-in
  users (previously the user block always resolved to None because
  nothing wrote to request.state.user).

4 new tests cover env resolution (explicit > LOCAL_DEV_MODE > channel
> unknown in precedence), plus super-properties forwarding into the SDK
constructor.
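
The resolution order described in this commit can be sketched as two small functions. The function names are hypothetical, and two details are assumptions rather than facts from the commit: empty strings are treated as unset, and `release` resolves to None when neither variable is present.

```python
from typing import Mapping, Optional


def resolve_environment(env: Mapping[str, str]) -> str:
    """Pick the `environment` label: explicit > local dev > channel > deployment env."""
    explicit = env.get("POSTHOG_ENVIRONMENT")
    if explicit:
        return explicit
    if env.get("LOCAL_DEV_MODE") == "1":
        return "local"
    return env.get("RELEASE_CHANNEL") or env.get("AGNES_DEPLOYMENT_ENV") or "unknown"


def resolve_release(env: Mapping[str, str]) -> Optional[str]:
    """Pick the `release` label: AGNES_VERSION first, then RELEASE_CHANNEL.

    Returning None when both are missing is an assumption of this sketch.
    """
    return env.get("AGNES_VERSION") or env.get("RELEASE_CHANNEL") or None
```

Taking a mapping instead of reading os.environ directly makes each precedence rule easy to exercise in isolation; a caller would pass os.environ.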

* feat(observability): inline user attrs on every PostHog event + debug throw route

PostHog's UI shows person properties on the Person profile page, not
inline on each event, so a reviewer triaging an exception couldn't tell
which user hit the bug without clicking through. Fix it on both sides.

- Backend capture_exception merges user_id / user_email / user_name into
  the event properties (gated by POSTHOG_IDENTIFY_PII: none/id/email/full).
  Backed by a new _user_props_for_event helper on PosthogClient.
- Browser snippet registers user_id + user_email + user_name as
  super-properties via posthog.register({...}) so every $pageview,
  custom event, and $exception coming from posthog.captureException()
  carries them inline. Mirrors the backend so cross-referencing
  client/server events doesn't require a person-profile lookup.
- /api/debug/throw: debug-only endpoint gated by DEBUG=1 (404 in prod).
  Runs Depends(get_current_user) first so request.state.user is set when
  the unhandled-exception handler captures the event. Lets operators
  exercise the full observability path end-to-end without hand-rolling
  a TestClient script. Configurable via ?kind=ValueError&msg=...

7 new tests cover: backend user-attr merge across identify modes,
anonymous request fall-through, browser snippet super-prop emission for
logged-in / anonymous / id-only / full-name cases.
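
A minimal sketch of how the identify mode might gate the inline user properties follows. It is not the actual _user_props_for_event helper: the function name, the dict-shaped `user`, and the choice to treat unknown modes like "none" are all assumptions made for illustration.

```python
from typing import Dict, Mapping, Optional, Tuple

# Which user fields each identify mode is allowed to expose.
_FIELDS_BY_MODE: Dict[str, Tuple[str, ...]] = {
    "none": (),
    "id": ("id",),
    "email": ("id", "email"),
    "full": ("id", "email", "name"),
}


def user_props_for_event(mode: str, user: Optional[Mapping[str, object]]) -> Dict[str, str]:
    """Return user_id / user_email / user_name properties permitted by `mode`.

    Anonymous requests (user is None) and unrecognised modes yield no
    properties; the latter is an assumption of this sketch.
    """
    if not user:
        return {}
    fields = _FIELDS_BY_MODE.get(mode, ())
    return {
        f"user_{field}": str(user[field])
        for field in fields
        if user.get(field) is not None
    }
```

The returned dict can be merged into an event's properties before capture, so the default "email" mode never carries a name.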

* fix(observability): address minasarustamyan PR #231 review

Two bugs caught in review.

1. PosthogInjectionMiddleware dropped Response.background on every
   return path. BaseHTTPMiddleware materialises the body and asks
   subclasses to return a fresh Response; three paths in dispatch()
   omitted background=, silently dropping any BackgroundTask /
   BackgroundTasks the route attached (audit logging, async webhooks,
   email sends) with no log line. Fix: route every return through a
   _passthrough() helper that forwards background.

   Also adds a _MAX_BUFFER_BYTES (4 MB) cap so a streamed-HTML response
   can't balloon RSS during buffering. Bigger bodies short-circuit
   through with a warning rather than being injected.

   Regression tests in tests/test_posthog_inject_middleware.py exercise
   four return paths (snippet present, render-fail, double-injection
   guard, non-HTML passthrough) plus the streaming-guard short-circuit.

2. $ai_input / $ai_output_choices were emitted without truncation, so
   with POSTHOG_LLM_PAYLOADS=1 any event over PostHog's ~32 KB per-event
   ingest limit was silently dropped. Those are exactly the calls (large
   prompts with schemas / sample rows / SQL) an operator would want to
   inspect.

   Fix: clip both at POSTHOG_LLM_PAYLOAD_MAX_CHARS (default 30000) with
   an explicit "…[truncated N chars]" marker so readers don't mistake
   truncated captures for complete ones. Metadata (provider, model,
   tokens, latency, error) flows regardless. Three new tests cover
   default-cap clipping, env-override, and pass-through under the cap.

37 PostHog tests pass.
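
The payload clipping described in item 2 might be sketched as below. The helper names are hypothetical. Whether the real code counts the marker toward the limit is not stated in the commit, so this sketch simply keeps the first `max_chars` characters and appends the marker after them.

```python
from typing import Mapping

DEFAULT_MAX_CHARS = 30000  # default named in the commit message


def max_payload_chars(env: Mapping[str, str]) -> int:
    """Read the cap from POSTHOG_LLM_PAYLOAD_MAX_CHARS, falling back to the default."""
    return int(env.get("POSTHOG_LLM_PAYLOAD_MAX_CHARS", DEFAULT_MAX_CHARS))


def clip_payload(text: str, max_chars: int = DEFAULT_MAX_CHARS) -> str:
    """Return `text` unchanged if it fits, else a clipped copy with a marker.

    The marker states how many characters were removed, so a reader cannot
    mistake a truncated capture for a complete one.
    """
    if len(text) <= max_chars:
        return text
    dropped = len(text) - max_chars
    return f"{text[:max_chars]}…[truncated {dropped} chars]"
```

For example, `clip_payload("a" * 10, 4)` returns `"aaaa…[truncated 6 chars]"`, while any string within the cap passes through untouched.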
174 lines
6 KiB
Python
"""Anthropic provider for structured JSON extraction.

Uses the Anthropic API with native structured output (json_schema)
for reliable JSON extraction. Includes retry logic for transient errors.
"""

import json
import logging
import time

import anthropic

from .exceptions import (
    LLMAuthError,
    LLMFormatError,
    LLMRateLimitError,
    LLMRefusalError,
    LLMTimeoutError,
)

logger = logging.getLogger(__name__)

# Retry configuration
MAX_RETRIES = 3
INITIAL_BACKOFF_SECONDS = 2
BACKOFF_MULTIPLIER = 2


def _strict_json_schema(schema):
    """Return a copy of the schema with additionalProperties=False on every object type.

    The Anthropic structured-output API rejects schemas where a `{"type": "object"}` node
    omits `additionalProperties` (HTTP 400 invalid_request_error). We walk the schema
    recursively and force the field where missing.
    """
    if isinstance(schema, dict):
        out = {k: _strict_json_schema(v) for k, v in schema.items()}
        if out.get("type") == "object" and "additionalProperties" not in out:
            out["additionalProperties"] = False
        return out
    if isinstance(schema, list):
        return [_strict_json_schema(item) for item in schema]
    return schema


class AnthropicExtractor:
    """Structured JSON extractor using the Anthropic API.

    Uses output_config with json_schema format for structured output.
    Retries transient errors (rate limit, timeout, connection) with
    exponential backoff.
    """

    def __init__(self, api_key: str, model: str) -> None:
        """Initialize the Anthropic extractor.

        Args:
            api_key: Anthropic API key.
            model: Model identifier (e.g., "claude-haiku-4-5-20251001").
        """
        self._client = anthropic.Anthropic(api_key=api_key)
        self._model = model

    def extract_json(
        self,
        prompt: str,
        max_tokens: int,
        json_schema: dict,
        schema_name: str,
    ) -> dict:
        """Extract structured JSON using the Anthropic API.

        Args:
            prompt: The extraction prompt to send to the model.
            max_tokens: Maximum tokens in the response.
            json_schema: JSON Schema that the response must conform to.
            schema_name: Human-readable name for the schema.

        Returns:
            Parsed JSON dictionary conforming to the provided schema.

        Raises:
            LLMAuthError: Invalid API key.
            LLMRateLimitError: Rate limited after all retries.
            LLMTimeoutError: Timeout/connection error after all retries.
            LLMFormatError: Response is not valid JSON.
            LLMRefusalError: Model refused to respond.
        """
        last_exception: Exception | None = None

        for attempt in range(1, MAX_RETRIES + 1):
            try:
                return self._attempt_extraction(
                    prompt, max_tokens, json_schema, schema_name, attempt,
                )
            except LLMAuthError:
                raise
            except LLMRefusalError:
                raise
            except (LLMRateLimitError, LLMTimeoutError) as e:
                last_exception = e
                if attempt < MAX_RETRIES:
                    delay = INITIAL_BACKOFF_SECONDS * (BACKOFF_MULTIPLIER ** (attempt - 1))
                    logger.warning(
                        "Transient error on attempt %d/%d for model %s, "
                        "retrying in %ds: %s",
                        attempt, MAX_RETRIES, self._model, delay,
                        type(e).__name__,
                    )
                    time.sleep(delay)

        raise last_exception  # type: ignore[misc]

    def _attempt_extraction(
        self,
        prompt: str,
        max_tokens: int,
        json_schema: dict,
        schema_name: str,
        attempt: int,
    ) -> dict:
        """Single extraction attempt against the Anthropic API."""
        logger.info(
            "Anthropic extraction attempt %d/%d, model=%s, schema=%s",
            attempt, MAX_RETRIES, self._model, schema_name,
        )

        from src.observability import trace_generation

        try:
            with trace_generation(provider="anthropic", model=self._model) as _trace:
                _trace.set_input(prompt)
                response = self._client.messages.create(
                    model=self._model,
                    max_tokens=max_tokens,
                    messages=[{"role": "user", "content": prompt}],
                    output_config={
                        "format": {
                            "type": "json_schema",
                            "schema": _strict_json_schema(json_schema),
                        },
                    },
                )
                _trace.set_output_from_anthropic(response)
        except anthropic.AuthenticationError as e:
            raise LLMAuthError("Anthropic authentication failed (check API key)") from e
        except anthropic.RateLimitError as e:
            raise LLMRateLimitError("Anthropic rate limited") from e
        except (anthropic.APITimeoutError, anthropic.APIConnectionError) as e:
            raise LLMTimeoutError(
                f"Anthropic connection error ({type(e).__name__})"
            ) from e

        # Check for truncation - raise and let outer retry loop handle it
        if response.stop_reason == "max_tokens":
            raise LLMFormatError(
                f"Response truncated (max_tokens) for schema {schema_name}"
            )

        # Check for refusal
        if response.stop_reason == "end_turn" and not response.content:
            raise LLMRefusalError(
                f"Model refused to generate response for schema {schema_name}"
            )

        # Parse JSON from response
        try:
            text = response.content[0].text
            return json.loads(text)
        except (json.JSONDecodeError, IndexError, AttributeError) as e:
            raise LLMFormatError(
                f"Failed to parse Anthropic response as JSON for "
                f"schema {schema_name} ({type(e).__name__})"
            ) from e
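
With the retry constants in this module, the sleep between attempts works out to 2 s and then 4 s. There is no sleep after the final attempt, because the loop re-raises the last exception instead. The `backoff_delays` helper below is hypothetical and not part of the module; it only makes that schedule explicit.

```python
from typing import List

# Same values as the module-level retry configuration.
MAX_RETRIES = 3
INITIAL_BACKOFF_SECONDS = 2
BACKOFF_MULTIPLIER = 2


def backoff_delays(
    max_retries: int = MAX_RETRIES,
    initial: int = INITIAL_BACKOFF_SECONDS,
    multiplier: int = BACKOFF_MULTIPLIER,
) -> List[int]:
    """Delays slept after attempts 1..max_retries-1, using the module's formula."""
    return [initial * multiplier ** (attempt - 1) for attempt in range(1, max_retries)]
```

So a call that fails on every attempt spends 6 s sleeping in total, on top of the request time for the three attempts.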