* feat(observability): optional PostHog integration (errors, LLM traces, replay, flags)

Off by default. Activates when POSTHOG_API_KEY is set in env. Defaults
to PostHog Cloud EU; override host for US Cloud or self-hosted.

Coverage:
- FastAPI 500 handler captures unhandled exceptions
- src/orchestrator.py rebuild + rebuild_source failures
- services/scheduler/ HTTP-job failures
- cli/main.py uncaught CLI errors (typer.Exit/SystemExit/KeyboardInterrupt
  skipped; flushes before re-raise so short-lived CLI invocations don't
  drop events)
- connectors/llm/anthropic_provider.py + openai_compat.py emit
  $ai_generation events with provider, model, latency, token counts
  (prompt/completion bodies stay off unless POSTHOG_LLM_PAYLOADS=1
  because LLM prompts here routinely include customer SQL/data)
- Browser snippet injected into every text/html response by
  PosthogInjectionMiddleware, registered inside the GZip layer so it
  sees uncompressed HTML before compression. Many templates are
  standalone (their own DOCTYPE) and never extend base.html, so a
  per-template include would miss them.
- Frontend: $pageview, $pageleave, JS error capture via window error
  and unhandledrejection event handlers, masked session replay
  (maskAllInputs: true plus CSS-selector mask for known data surfaces),
  feature flags (browser posthog.isFeatureEnabled + server-side
  feature_enabled with fallback for older SDKs).

Identification mode is operator-configurable: none / id / email / full.
The default, email, ships user.id + email but never name.

CLI entry point moves from cli.main:app to cli.main:main (Typer wrapper).

Files:
- src/observability/posthog_client.py: lazy singleton, no network
  when disabled, single-process flush on shutdown
- src/observability/llm_tracing.py: trace_generation context manager
- app/middleware/posthog_inject.py: HTML rewrite middleware
- app/web/templates/_posthog.html: browser snippet template
- docs/observability.md: operator guide
- config/.env.template: documented POSTHOG_* knobs
- tests/test_posthog_disabled.py + tests/test_posthog_client.py +
  tests/test_llm_tracing.py: 18 tests covering disabled state,
  identify-mode payloads, $ai_generation shape, error variant.

CHANGELOG entry under [Unreleased] Added.
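
For illustration only, a minimal sketch of how a trace_generation-style
context manager could time a provider call and emit one $ai_generation
event. It is not the contents of src/observability/llm_tracing.py: the
GenerationTrace holder, the capture(event, properties) callable, the
include_payloads parameter, and the exact $ai_* property names are
assumptions made for this example. Payload bodies are attached only
when POSTHOG_LLM_PAYLOADS=1, mirroring the default described above.

```python
from __future__ import annotations

import os
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Callable, Iterator


@dataclass
class GenerationTrace:
    """Hypothetical holder the caller fills in while the provider call runs."""

    input_tokens: int | None = None
    output_tokens: int | None = None
    prompt: Any = None
    completion: Any = None


@contextmanager
def trace_generation(
    capture: Callable[[str, dict], None],  # assumed shape: capture(event, properties)
    provider: str,
    model: str,
    include_payloads: bool | None = None,
) -> Iterator[GenerationTrace]:
    if include_payloads is None:
        # Payload bodies stay off unless the operator opts in.
        include_payloads = os.environ.get("POSTHOG_LLM_PAYLOADS") == "1"
    trace = GenerationTrace()
    started = time.monotonic()
    error: Exception | None = None
    try:
        yield trace
    except Exception as exc:
        error = exc
        raise  # never swallow the provider error
    finally:
        # Metadata is always sent, success or failure.
        props: dict[str, Any] = {
            "$ai_provider": provider,
            "$ai_model": model,
            "$ai_latency": time.monotonic() - started,
            "$ai_input_tokens": trace.input_tokens,
            "$ai_output_tokens": trace.output_tokens,
            "$ai_is_error": error is not None,
        }
        if error is not None:
            props["$ai_error"] = repr(error)
        if include_payloads:
            props["$ai_input"] = trace.prompt
            props["$ai_output_choices"] = trace.completion
        capture("$ai_generation", props)
```

A provider would wrap its API call in `with trace_generation(...) as t:`
and set t.input_tokens / t.output_tokens from the response before the
block exits.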
* feat(observability): tag every PostHog event with environment + release

Splits PostHog dashboards cleanly between localhost / dev / staging /
production without manual tagging on every capture call.

- POSTHOG_ENVIRONMENT explicit override; auto-resolves to "local" when
  LOCAL_DEV_MODE=1, else RELEASE_CHANNEL, else AGNES_DEPLOYMENT_ENV,
  else "unknown".
- AGNES_VERSION → RELEASE_CHANNEL fallback feeds the `release` property
  for "is this error new in this release?" cohorting.
- Backend gets both via the PostHog SDK's super_properties constructor
  arg (every captured event picks them up automatically).
- Browser snippet calls posthog.register({environment, release}) inside
  the loaded callback so $pageview, $exception, autocapture, etc. all
  carry the same labels.
- request.state.user now populated by auth dependencies so the snippet
  can actually call posthog.identify(user_id, {email}) for logged-in
  users (previously the user block always resolved to None because
  nothing wrote to request.state.user).
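
The environment and release fallbacks from the first two bullets can be
sketched as below. The function names and the treatment of empty
strings as "unset" are assumptions for this example; only the variable
names and their precedence come from this commit.

```python
from __future__ import annotations

import os
from typing import Mapping


def resolve_environment(env: Mapping[str, str] | None = None) -> str:
    """explicit > LOCAL_DEV_MODE > RELEASE_CHANNEL > AGNES_DEPLOYMENT_ENV > unknown."""
    env = os.environ if env is None else env
    explicit = env.get("POSTHOG_ENVIRONMENT", "").strip()
    if explicit:
        return explicit
    if env.get("LOCAL_DEV_MODE") == "1":
        return "local"
    for key in ("RELEASE_CHANNEL", "AGNES_DEPLOYMENT_ENV"):
        value = env.get(key, "").strip()
        if value:
            return value
    return "unknown"


def resolve_release(env: Mapping[str, str] | None = None) -> str | None:
    """AGNES_VERSION first, then RELEASE_CHANNEL; None when neither is set."""
    env = os.environ if env is None else env
    for key in ("AGNES_VERSION", "RELEASE_CHANNEL"):
        value = env.get(key, "").strip()
        if value:
            return value
    return None
```

Passing the mapping explicitly keeps the resolution testable without
patching os.environ.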
4 new tests cover env resolution: explicit > LOCAL_DEV_MODE > channel
> unknown, plus super-properties forwarding into the SDK constructor.
* feat(observability): inline user attrs on every PostHog event + debug throw route

PostHog's UI shows person properties on the Person profile page, not
inline on each event, so a reviewer triaging an exception couldn't tell
which user hit the bug without clicking through. Fix it on both sides.

- Backend capture_exception merges user_id / user_email / user_name into
  the event properties (gated by POSTHOG_IDENTIFY_PII: none/id/email/full).
  Backed by a new _user_props_for_event helper on PosthogClient.
- Browser snippet registers user_id + user_email + user_name as super-
  properties via posthog.register({...}) so every $pageview, custom
  event, and $exception sent via posthog.captureException() carries
  them inline. Mirrors the backend so cross-referencing client/server
  events doesn't require a person-profile lookup.
- /api/debug/throw: debug-only endpoint gated by DEBUG=1 (404 in prod).
  Runs Depends(get_current_user) first so request.state.user is set when
  the unhandled-exception handler captures the event. Lets operators
  exercise the full observability path end-to-end without hand-rolling
  a TestClient script. Configurable via ?kind=ValueError&msg=...

7 new tests cover: backend user-attr merge across identify modes,
anonymous request fall-through, browser snippet super-prop emission for
logged-in / anonymous / id-only / full-name cases.
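
A rough sketch of the identify-mode gating described above. The dict
shape of `user`, the standalone function name, and the choice to treat
an unrecognised mode as "none" are assumptions for illustration; the
real helper is the _user_props_for_event method on PosthogClient.

```python
from __future__ import annotations

from typing import Any, Mapping

# Each mode includes everything the previous one does.
_LEVELS = {"none": 0, "id": 1, "email": 2, "full": 3}


def user_props_for_event(user: Mapping[str, Any] | None, mode: str) -> dict[str, Any]:
    """Return the user attributes allowed on an event for the given mode."""
    level = _LEVELS.get(mode, 0)  # assumption: unknown modes fall back to "none"
    if user is None or level == 0:
        return {}  # anonymous request, or identification disabled
    props: dict[str, Any] = {"user_id": user.get("id")}
    if level >= 2:
        props["user_email"] = user.get("email")
    if level >= 3:
        props["user_name"] = user.get("name")
    # Drop attributes the user record does not carry.
    return {k: v for k, v in props.items() if v is not None}
```

The returned dict would then be merged into the event properties before
capture, so the same gating applies to every event type.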
* fix(observability): address minasarustamyan PR #231 review

Two bugs caught in review.

1. PosthogInjectionMiddleware dropped Response.background on every
   return path. BaseHTTPMiddleware materialises the body and asks
   subclasses to return a fresh Response; three paths in dispatch()
   omitted background=, silently cancelling any BackgroundTask /
   BackgroundTasks the route attached (audit logging, async webhooks,
   email sends) with no log line. Fix: route every return through a
   _passthrough() helper that forwards background.

   Also adds a _MAX_BUFFER_BYTES (4 MB) cap so a streamed-HTML response
   can't balloon RSS during buffering. Bigger bodies short-circuit
   through with a warning rather than being injected.

   Regression tests in tests/test_posthog_inject_middleware.py exercise
   four return paths (snippet present, render-fail, double-injection
   guard, non-HTML passthrough) plus the streaming-guard short-circuit.

2. $ai_input / $ai_output_choices were emitted without truncation, so
   POSTHOG_LLM_PAYLOADS=1 silently dropped events past PostHog's ~32 KB
   per-event ingest limit: exactly the calls (large prompts with
   schemas / sample rows / SQL) an operator would want to inspect.
   Fix: clip both at POSTHOG_LLM_PAYLOAD_MAX_CHARS (default 30000) with
   an explicit "…[truncated N chars]" marker so readers don't mistake
   truncated captures for complete ones. Metadata (provider, model,
   tokens, latency, error) flows regardless. Three new tests cover
   default-cap clipping, env-override, and pass-through under the cap.

37 PostHog tests pass.
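
The clipping rule in item 2 can be sketched as follows. The function
name is hypothetical, and reading N as the number of characters removed
is an assumption about the marker; the env var name, its default, and
the marker text come from this commit.

```python
from __future__ import annotations

import os


def clip_payload(text: str, max_chars: int | None = None) -> str:
    """Clip text to max_chars and append an explicit truncation marker."""
    if max_chars is None:
        max_chars = int(os.environ.get("POSTHOG_LLM_PAYLOAD_MAX_CHARS", "30000"))
    if len(text) <= max_chars:
        return text  # under the cap: pass through unchanged
    dropped = len(text) - max_chars
    # The marker is appended after the cap, so the result is slightly
    # longer than max_chars.
    return f"{text[:max_chars]}…[truncated {dropped} chars]"
```

For example, clip_payload("abcdefghij", 4) keeps "abcd" and appends the
marker reporting 6 dropped characters.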
156 lines
6.3 KiB
Python
"""HTML-injection middleware that places the PostHog snippet in every page.

Many of this app's Jinja templates are standalone (their own ``<!DOCTYPE
html>``) and do not extend ``base.html`` / ``base_login.html``, including
the dashboard, catalog, admin pages, and activity center. Adding
``{% include '_posthog.html' %}`` to each one is fragile and easy to miss.

Instead, this middleware rewrites every HTML response to inject the
rendered snippet immediately before ``</head>``. When PostHog is disabled
(no ``POSTHOG_API_KEY``) the middleware is a no-op.

Skips:
  * Non-HTML responses (everything API, JSON, parquet, CSV).
  * Responses larger than ``_MAX_BUFFER_BYTES``: defends against
    genuine HTML streams (rare but legal: large dashboards rendered
    as chunked transfer) where buffering the entire body would balloon
    memory. Snippet injection is best-effort.
  * Responses that already contain ``posthog.init`` (defensive: keeps
    base-extending templates from getting a double-injection if a
    future change re-includes the partial there).

Background tasks attached to a route via ``Response.background`` are
preserved on every return path. ``BaseHTTPMiddleware`` materialises the
body and asks subclasses to return a fresh ``Response``; forgetting to
forward ``background`` would silently cancel any deferred work the
handler scheduled (audit logging, async webhooks, deferred email sends),
with no log line. Caught in PR #231 review (minasarustamyan).
"""

from __future__ import annotations

import logging
from typing import Awaitable, Callable

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response
from starlette.types import ASGIApp

logger = logging.getLogger(__name__)


_HEAD_CLOSE = b"</head>"

# Hard ceiling on how much body we're willing to buffer in memory just to
# inject ~3 KB of snippet. 4 MB covers every HTML page this app currently
# emits with ample headroom while preventing a pathological streamed-HTML
# response from ballooning RSS. Adjust if a legitimate page exceeds it.
_MAX_BUFFER_BYTES = 4 * 1024 * 1024


def _passthrough(body: bytes, response: Response) -> Response:
    """Return a fresh ``Response`` carrying ``body`` plus every attribute of
    ``response`` that ``BaseHTTPMiddleware`` would otherwise drop,
    importantly ``background`` so any ``BackgroundTask`` /
    ``BackgroundTasks`` the handler attached still fires.
    """
    return Response(
        content=body,
        status_code=response.status_code,
        headers=dict(response.headers),
        media_type=response.media_type,
        background=response.background,
    )


class PosthogInjectionMiddleware(BaseHTTPMiddleware):
    """Inject the PostHog snippet into every HTML response."""

    def __init__(self, app: ASGIApp) -> None:
        super().__init__(app)

    async def dispatch(
        self,
        request: Request,
        call_next: Callable[[Request], Awaitable[Response]],
    ) -> Response:
        from src.observability import get_posthog
        if not get_posthog().enabled:
            return await call_next(request)

        response = await call_next(request)

        content_type = response.headers.get("content-type", "")
        if "text/html" not in content_type.lower():
            return response

        # Buffer the body. ``BaseHTTPMiddleware`` consumes
        # ``response.body_iterator`` here; once we iterate it, the only
        # way to forward the response is to return a new one. Bail out
        # past ``_MAX_BUFFER_BYTES`` so a streamed HTML response (rare but
        # legal) doesn't balloon memory.
        chunks: list[bytes] = []
        total = 0
        too_big = False
        async for chunk in response.body_iterator:  # type: ignore[attr-defined]
            buf = chunk if isinstance(chunk, (bytes, bytearray)) else chunk.encode("utf-8")
            total += len(buf)
            if total > _MAX_BUFFER_BYTES:
                too_big = True
                # Still need to drain the iterator to avoid breaking the
                # ASGI stream contract; but stop appending so we don't
                # hold every chunk.
                continue
            chunks.append(buf)
        if too_big:
            logger.warning(
                "PostHog snippet injection skipped: HTML response > %d bytes (path=%s)",
                _MAX_BUFFER_BYTES, request.url.path,
            )
            # We've consumed the iterator; rebuild from the chunks we
            # captured before the cap. Better to serve a truncated body
            # than to crash, but in practice the cap is set so this
            # branch shouldn't fire for legitimate pages.
            return _passthrough(b"".join(chunks), response)

        body = b"".join(chunks)

        if _HEAD_CLOSE not in body or b"posthog.init" in body:
            return _passthrough(body, response)

        try:
            snippet = _render_snippet(request)
        except Exception:
            logger.exception("PostHog snippet render failed; serving response unmodified")
            return _passthrough(body, response)

        body = body.replace(_HEAD_CLOSE, snippet.encode("utf-8") + _HEAD_CLOSE, 1)
        # content-length must reflect the rewritten body; Starlette's
        # ``Response`` sets it for us when we drop the prior header.
        new_headers = {k: v for k, v in response.headers.items() if k.lower() != "content-length"}
        return Response(
            content=body,
            status_code=response.status_code,
            headers=new_headers,
            media_type=response.media_type,
            background=response.background,
        )


def _render_snippet(request: Request) -> str:
    """Render ``_posthog.html`` with the current request's identify state."""
    from app.web.router import templates, _posthog_user_block, _posthog_config_global

    cfg = _posthog_config_global()
    user_block = _posthog_user_block(request)

    template = templates.get_template("_posthog.html")
    return template.render(
        request=request,
        posthog_config=cfg,
        # ``_posthog.html`` calls ``posthog_user_block(request)`` itself;
        # provide the same callable so the template renders identically
        # to the inline-include path.
        posthog_user_block=lambda _r: user_block,
    )