* feat(observability): optional PostHog integration (errors, LLM traces, replay, flags)
Off by default. Activates when POSTHOG_API_KEY is set in env. Defaults
to PostHog Cloud EU; override host for US Cloud or self-hosted.
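A minimal sketch of the activation rule above, not the actual client code. `POSTHOG_HOST` is an assumed name for the host override (the real knob is not named here), and `load_settings` / `PosthogSettings` are hypothetical helpers:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Mapping, Optional

# Public PostHog Cloud ingestion hosts.
EU_HOST = "https://eu.i.posthog.com"
US_HOST = "https://us.i.posthog.com"


@dataclass(frozen=True)
class PosthogSettings:
    api_key: str
    host: str


def load_settings(env: Mapping[str, str]) -> Optional[PosthogSettings]:
    """Return settings when POSTHOG_API_KEY is set, else None (integration off)."""
    api_key = env.get("POSTHOG_API_KEY", "").strip()
    if not api_key:
        return None  # disabled: callers should make no network calls
    # POSTHOG_HOST is an assumed variable name for the host override.
    host = env.get("POSTHOG_HOST", "").strip() or EU_HOST
    return PosthogSettings(api_key=api_key, host=host)
```

Returning `None` rather than a disabled object keeps the "off" state explicit at every call site.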
Coverage:
- FastAPI 500 handler captures unhandled exceptions
- src/orchestrator.py rebuild + rebuild_source failures
- services/scheduler/ HTTP-job failures
- cli/main.py uncaught CLI errors (Typer.Exit/SystemExit/KeyboardInterrupt
skipped; flushes before re-raise so short-lived CLI invocations don't
drop events)
- connectors/llm/anthropic_provider.py + openai_compat.py emit
$ai_generation events with provider, model, latency, token counts
(prompt/completion bodies stay off unless POSTHOG_LLM_PAYLOADS=1
because LLM prompts here routinely include customer SQL/data)
- Browser snippet injected into every text/html response by
PosthogInjectionMiddleware — registered inside the GZip layer so it
sees uncompressed HTML before compression. Many templates are
standalone (their own DOCTYPE) and never extend base.html, so a
per-template include would miss them.
- Frontend: $pageview, $pageleave, JS error capture via window `error`
and `unhandledrejection` event handlers, masked session replay
(maskAllInputs: true plus CSS-selector mask for known data surfaces),
feature flags (browser posthog.isFeatureEnabled + server-side
feature_enabled with fallback for older SDKs).
Identification mode is operator-configurable: none / id / email / full.
The default (email) ships user.id + email but never name. The CLI entry
point moves from cli.main:app to cli.main:main (Typer wrapper).
Files:
- src/observability/posthog_client.py — lazy singleton, no network
when disabled, single-process flush on shutdown
- src/observability/llm_tracing.py — trace_generation context manager
- app/middleware/posthog_inject.py — HTML rewrite middleware
- app/web/templates/_posthog.html — browser snippet template
- docs/observability.md — operator guide
- config/.env.template — documented POSTHOG_* knobs
- tests/test_posthog_disabled.py + tests/test_posthog_client.py +
tests/test_llm_tracing.py — 18 tests covering disabled state,
identify-mode payloads, $ai_generation shape, error variant.
CHANGELOG entry under [Unreleased] Added.
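The CLI behaviour described under Coverage (skip normal exits, flush before re-raise) can be sketched as follows. This is a stdlib-only illustration: `run_cli`, `capture` and `flush` are hypothetical names, and a real wrapper would presumably also pass `typer.Exit` through untouched:

```python
from __future__ import annotations

from typing import Callable


def run_cli(
    app: Callable[[], None],
    capture: Callable[[BaseException], None],
    flush: Callable[[], None],
) -> None:
    """Run `app`, reporting unexpected errors before they propagate."""
    try:
        app()
    except (SystemExit, KeyboardInterrupt):
        # Normal exits and Ctrl-C are not bugs; pass them through unreported.
        raise
    except Exception as exc:
        capture(exc)
        # Send queued events now: a short-lived CLI process may exit
        # before a background sender thread gets a chance to run.
        flush()
        raise
```

Re-raising after the flush keeps the exit code and traceback the user sees unchanged.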
* feat(observability): tag every PostHog event with environment + release
Splits PostHog dashboards cleanly between localhost / dev / staging /
production without manual tagging on every capture call.
- POSTHOG_ENVIRONMENT explicit override; auto-resolves to "local" when
LOCAL_DEV_MODE=1, else RELEASE_CHANNEL, else AGNES_DEPLOYMENT_ENV,
else "unknown".
- AGNES_VERSION → RELEASE_CHANNEL fallback feeds the `release` property
for "is this error new in this release?" cohorting.
- Backend gets both via the PostHog SDK's super_properties constructor
arg (every captured event picks them up automatically).
- Browser snippet calls posthog.register({environment, release}) inside
the loaded callback so $pageview, $exception, autocapture, etc. all
carry the same labels.
- request.state.user now populated by auth dependencies so the snippet
can actually call posthog.identify(user_id, {email}) for logged-in
users (previously the user block always resolved to None because
nothing wrote to request.state.user).
4 new tests cover env resolution: explicit > LOCAL_DEV_MODE > channel
> unknown, plus super-properties forwarding into the SDK constructor.
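The resolution order above could be written roughly like this. The function names are hypothetical, and returning `None` when no release can be determined is an assumption (the fallback for that case is not stated here):

```python
from __future__ import annotations

from typing import Mapping, Optional


def resolve_environment(env: Mapping[str, str]) -> str:
    """explicit override > LOCAL_DEV_MODE > channel > deployment env > unknown."""
    explicit = env.get("POSTHOG_ENVIRONMENT", "").strip()
    if explicit:
        return explicit
    if env.get("LOCAL_DEV_MODE") == "1":
        return "local"
    for name in ("RELEASE_CHANNEL", "AGNES_DEPLOYMENT_ENV"):
        value = env.get(name, "").strip()
        if value:
            return value
    return "unknown"


def resolve_release(env: Mapping[str, str]) -> Optional[str]:
    """AGNES_VERSION first, then RELEASE_CHANNEL; None is an assumed fallback."""
    for name in ("AGNES_VERSION", "RELEASE_CHANNEL"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

Both values would then be passed once as super properties, so individual capture calls never need to tag events themselves.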
* feat(observability): inline user attrs on every PostHog event + debug throw route
PostHog's UI shows person properties on the Person profile page, not
inline on each event — so a reviewer triaging an exception couldn't tell
which user hit the bug without clicking through. Fix it on both sides.
- Backend capture_exception merges user_id / user_email / user_name into
the event properties (gated by POSTHOG_IDENTIFY_PII: none/id/email/full).
Backed by a new _user_props_for_event helper on PosthogClient.
- Browser snippet registers user_id + user_email + user_name as
super-properties via posthog.register({...}) so every $pageview,
custom event, and $exception sent via posthog.captureException()
carries them inline. Mirrors the backend so cross-referencing
client/server events doesn't require a person-profile lookup.
- /api/debug/throw — debug-only endpoint gated by DEBUG=1 (404 in prod).
Runs Depends(get_current_user) first so request.state.user is set when
the unhandled-exception handler captures the event. Lets operators
exercise the full observability path end-to-end without hand-rolling
a TestClient script. Configurable via ?kind=ValueError&msg=...
7 new tests cover: backend user-attr merge across identify modes,
anonymous request fall-through, browser snippet super-prop emission for
logged-in / anonymous / id-only / full-name cases.
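A sketch of how the identify-mode gating might work for the backend merge. This is an illustration, not the real `_user_props_for_event`: the user is modelled as a plain mapping, and treating an unrecognised mode as "none" is an assumption:

```python
from __future__ import annotations

from typing import Mapping, Optional


def user_props_for_event(
    user: Optional[Mapping[str, str]], mode: str = "email"
) -> dict[str, str]:
    """Return the user attributes allowed on an event for the given mode."""
    if user is None or mode not in ("id", "email", "full"):
        # Anonymous request, mode "none", or an unknown mode: ship nothing.
        return {}
    props: dict[str, str] = {}
    if user.get("id"):
        props["user_id"] = user["id"]
    if mode in ("email", "full") and user.get("email"):
        props["user_email"] = user["email"]
    if mode == "full" and user.get("name"):
        props["user_name"] = user["name"]
    return props
```

Each mode is a strict superset of the previous one, so widening the setting only ever adds fields.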
* fix(observability): address minasarustamyan PR #231 review
Two bugs caught in review.
1. PosthogInjectionMiddleware dropped Response.background on every
return path. BaseHTTPMiddleware materialises the body and asks
subclasses to return a fresh Response — three paths in dispatch()
omitted background=, silently cancelling any BackgroundTask /
BackgroundTasks the route attached (audit logging, async webhooks,
email sends) with no log line. Fix: route every return through a
_passthrough() helper that forwards background.
Also adds a _MAX_BUFFER_BYTES (4 MB) cap so a streamed-HTML response
can't balloon RSS during buffering. Bigger bodies short-circuit
through with a warning rather than being injected.
Regression tests in tests/test_posthog_inject_middleware.py exercise
four return paths (snippet present, render-fail, double-injection
guard, non-HTML passthrough) plus the streaming-guard short-circuit.
2. $ai_input / $ai_output_choices were emitted without truncation, so
with POSTHOG_LLM_PAYLOADS=1 any event over PostHog's ~32 KB per-event
ingest limit was silently dropped. Those are exactly the calls (large
prompts with schemas / sample rows / SQL) an operator would want to
inspect.
Fix: clip both at POSTHOG_LLM_PAYLOAD_MAX_CHARS (default 30000) with
an explicit "…[truncated N chars]" marker so readers don't mistake
truncated captures for complete ones. Metadata (provider, model,
tokens, latency, error) flows regardless. Three new tests cover
default-cap clipping, env-override, and pass-through under the cap.
37 PostHog tests pass.
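The payload clipping from item 2 could look roughly like this. `clip_payload` and `max_payload_chars` are hypothetical names. Two assumptions: N in the marker is the number of characters dropped, and an unset or invalid override falls back to the default:

```python
from __future__ import annotations

import os
from typing import Mapping

DEFAULT_MAX_CHARS = 30000


def max_payload_chars(env: Mapping[str, str] = os.environ) -> int:
    """Read POSTHOG_LLM_PAYLOAD_MAX_CHARS, falling back to the default."""
    raw = env.get("POSTHOG_LLM_PAYLOAD_MAX_CHARS", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_CHARS  # unset or not a number
    return value if value > 0 else DEFAULT_MAX_CHARS


def clip_payload(text: str, max_chars: int = DEFAULT_MAX_CHARS) -> str:
    """Clip `text` to `max_chars`, appending an explicit truncation marker."""
    if len(text) <= max_chars:
        return text  # under the cap: pass through unchanged
    dropped = len(text) - max_chars
    return f"{text[:max_chars]}…[truncated {dropped} chars]"
```

The marker is appended after the cap, so a clipped value is slightly longer than `max_chars`; the 30000 default leaves headroom below the ~32 KB limit for it.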
180 lines
5.9 KiB
Python
"""Regression tests for ``app/middleware/posthog_inject.py``.

Two narrow concerns from PR #231 review (minasarustamyan):

1. ``Response.background`` MUST be forwarded on every return path.
   ``BaseHTTPMiddleware`` materialises the body and asks subclasses to
   return a fresh ``Response``; a missed ``background`` parameter cancels
   any ``BackgroundTask`` / ``BackgroundTasks`` the route attached, with
   no log line.
2. Oversized HTML responses must short-circuit gracefully: the
   middleware buffers in memory by design, so a streamed-HTML route
   would blow up RSS without a cap.

Tests boot a minimal FastAPI app (no DB, no auth, no real PostHog) and
run via ``TestClient`` so they exercise the actual middleware stack.
"""

from __future__ import annotations

import os
from unittest.mock import patch

import pytest
from fastapi import FastAPI
from fastapi.responses import HTMLResponse
from fastapi.testclient import TestClient
from starlette.background import BackgroundTask


@pytest.fixture
def posthog_enabled(monkeypatch):
    monkeypatch.setenv("POSTHOG_API_KEY", "phc_test")
    from src.observability import reset_posthog
    reset_posthog()
    yield
    reset_posthog()


def _make_app() -> FastAPI:
    """Minimal FastAPI app with the injection middleware mounted.

    Avoids importing ``app.main`` so the test stays fast and self-contained.
    """
    from app.middleware.posthog_inject import PosthogInjectionMiddleware

    app = FastAPI()
    app.add_middleware(PosthogInjectionMiddleware)
    return app


def test_background_task_runs_on_html_response(posthog_enabled):
    """A BackgroundTask attached to an HTMLResponse must still fire after
    the middleware rewrites the body. Was silently dropped before fix."""
    fired: list[bool] = []

    def _mark():
        fired.append(True)

    with patch("posthog.Posthog"):
        # ``_render_snippet`` reaches into app.web.router; stub it so the
        # middleware doesn't drag in the full app dependency tree.
        with patch("app.middleware.posthog_inject._render_snippet", return_value="<!--ph-->"):
            app = _make_app()

            @app.get("/page", response_class=HTMLResponse)
            def page():
                return HTMLResponse(
                    "<html><head></head><body>x</body></html>",
                    background=BackgroundTask(_mark),
                )

            client = TestClient(app)
            res = client.get("/page")

    assert res.status_code == 200
    assert "<!--ph-->" in res.text  # snippet injected
    # Background task ran. Without the fix, fired stays [].
    assert fired == [True]


def test_background_task_runs_when_snippet_render_fails(posthog_enabled):
    """If snippet rendering raises, the response still serves and the
    background task still fires."""
    fired: list[bool] = []

    def _mark():
        fired.append(True)

    with patch("posthog.Posthog"):
        with patch(
            "app.middleware.posthog_inject._render_snippet",
            side_effect=RuntimeError("template blew up"),
        ):
            app = _make_app()

            @app.get("/page", response_class=HTMLResponse)
            def page():
                return HTMLResponse(
                    "<html><head></head><body>x</body></html>",
                    background=BackgroundTask(_mark),
                )

            client = TestClient(app)
            res = client.get("/page")

    assert res.status_code == 200
    assert fired == [True]


def test_background_task_runs_when_snippet_already_present(posthog_enabled):
    """Defensive double-injection guard path: body unchanged but
    background still forwarded."""
    fired: list[bool] = []

    def _mark():
        fired.append(True)

    with patch("posthog.Posthog"):
        with patch("app.middleware.posthog_inject._render_snippet", return_value="<!--ph-->"):
            app = _make_app()

            @app.get("/page", response_class=HTMLResponse)
            def page():
                # Body already contains posthog.init -> middleware skips re-injection.
                return HTMLResponse(
                    "<html><head><script>posthog.init('x')</script></head><body></body></html>",
                    background=BackgroundTask(_mark),
                )

            client = TestClient(app)
            res = client.get("/page")

    assert res.status_code == 200
    assert fired == [True]


def test_non_html_response_passthrough_does_not_buffer(posthog_enabled):
    """JSON / non-HTML responses must skip the middleware entirely:
    no body materialisation, no background-task interference."""
    fired: list[bool] = []

    def _mark():
        fired.append(True)

    with patch("posthog.Posthog"):
        app = _make_app()

        @app.get("/api/health")
        def health():
            from fastapi.responses import JSONResponse
            return JSONResponse({"ok": True}, background=BackgroundTask(_mark))

        client = TestClient(app)
        res = client.get("/api/health")

    assert res.status_code == 200
    assert res.json() == {"ok": True}
    assert fired == [True]


def test_oversized_html_response_short_circuits(posthog_enabled, monkeypatch):
    """Body bigger than the buffer cap serves without injection rather
    than buffering arbitrarily large streams in memory."""
    monkeypatch.setattr("app.middleware.posthog_inject._MAX_BUFFER_BYTES", 1024)

    with patch("posthog.Posthog"):
        with patch("app.middleware.posthog_inject._render_snippet", return_value="<!--ph-->"):
            app = _make_app()

            @app.get("/big", response_class=HTMLResponse)
            def big():
                # 2 KB body, twice the patched cap.
                return HTMLResponse("<html><head></head><body>" + ("X" * 2048) + "</body></html>")

            client = TestClient(app)
            res = client.get("/big")

    assert res.status_code == 200
    # Snippet NOT injected: middleware bailed out at the cap.
    assert "<!--ph-->" not in res.text