* feat(home): status frame on /home — last sync, sessions, prompts, tokens, projects
Adds the homepage status frame: a 5-card row above the install-hero /
offboard-strip on /home showing the calling user's Last sync (their
last `agnes pull`), Sessions, Prompts, Tokens used, and Projects worked
on, with a 24h/7d pill toggle.
Backed by `GET /api/me/home-stats?window=` (one DuckDB CTE joining
`users` + `usage_session_summary` + `usage_events`) and SSR'd from the
same `compute_home_stats` helper on initial paint so there's no
spinner. The window toggle is the only JS-driven path.
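A minimal sketch of how the `window` query parameter might map to a lookback cutoff. The `window_cutoff` helper, the `WINDOWS` table, and the fallback to 24h for unknown values are assumptions for illustration; the actual `compute_home_stats` code is not shown in this PR text:

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Hypothetical mapping; only the 24h and 7d windows are named in this PR.
WINDOWS = {"24h": timedelta(hours=24), "7d": timedelta(days=7)}


def window_cutoff(window: str | None, now: datetime) -> datetime:
    """Return the earliest timestamp included in the requested window.

    Assumption: unknown or missing values fall back to the 24h window.
    """
    return now - WINDOWS.get(window or "", WINDOWS["24h"])


now = datetime(2026, 5, 12, 12, 0, tzinfo=timezone.utc)
print(window_cutoff("7d", now))     # 7 days before `now`
print(window_cutoff("bogus", now))  # falls back to 24 hours before `now`
```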
Side surfaces:
- `GET /api/sync/manifest` now stamps `users.last_pull_at` so
`agnes pull` (and the Claude Code SessionStart hook that wraps it)
imprints the analyst's last sync time for the new card.
- `usage_session_summary` gains four BIGINT token counters
(input_tokens, output_tokens, cache_read_tokens, cache_creation_tokens)
summed from JSONL `message.usage.*` per assistant turn.
- `USAGE_PROCESSOR_VERSION` bumps 1 → 2 so the session-pipeline
reprocess loop invalidates stale summaries and backfills tokens
on the next tick.
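As a rough sketch of the invalidation in the last bullet: a summary row counts as stale when its stored `processor_version` differs from the current `USAGE_PROCESSOR_VERSION`. The `stale_sessions` helper and the in-memory row shape are hypothetical; the real reprocess loop reads `usage_session_summary` and may compare versions differently:

```python
from __future__ import annotations

USAGE_PROCESSOR_VERSION = 2


def stale_sessions(rows: list[dict]) -> list[str]:
    """Return session files whose summary was built by another processor version."""
    return [
        r["session_file"]
        for r in rows
        if r.get("processor_version") != USAGE_PROCESSOR_VERSION
    ]


rows = [
    {"session_file": "a.jsonl", "processor_version": 1},  # pre-bump, no token counters
    {"session_file": "b.jsonl", "processor_version": 2},  # already current
]
print(stale_sessions(rows))  # ['a.jsonl']
```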
Schema migration v43 → v44 consists of idempotent ALTERs (last_pull_at +
4 token columns): fresh installs receive the columns from `_SYSTEM_SCHEMA`,
and the upgrade path runs `_v43_to_v44`. Defaults (NULL / 0) backfill
existing rows cleanly.
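The idempotency idea can be sketched as follows. This is not the real `_v43_to_v44`: it uses the standard-library `sqlite3` as a stand-in for DuckDB, the starting table shapes are made up, and the assumption that `last_pull_at` defaults to NULL while the token counters default to 0 is inferred from "NULL / 0" above:

```python
import sqlite3

# Columns added by the migration (names from this PR; declarations assumed).
NEW_COLUMNS = {
    "users": [("last_pull_at", "TIMESTAMP")],
    "usage_session_summary": [
        ("input_tokens", "BIGINT DEFAULT 0"),
        ("output_tokens", "BIGINT DEFAULT 0"),
        ("cache_read_tokens", "BIGINT DEFAULT 0"),
        ("cache_creation_tokens", "BIGINT DEFAULT 0"),
    ],
}


def migrate_v43_to_v44(conn: sqlite3.Connection) -> None:
    """Add each column only if it is missing, so reruns are no-ops."""
    for table, columns in NEW_COLUMNS.items():
        existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        for name, decl in columns:
            if name not in existing:
                conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {decl}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT)")  # hypothetical pre-migration shape
conn.execute("CREATE TABLE usage_session_summary (session_id TEXT)")
conn.execute("INSERT INTO users VALUES ('u1')")
conn.execute("INSERT INTO usage_session_summary VALUES ('s1')")

migrate_v43_to_v44(conn)
migrate_v43_to_v44(conn)  # second run changes nothing

print(conn.execute("SELECT input_tokens FROM usage_session_summary").fetchone())  # (0,)
print(conn.execute("SELECT last_pull_at FROM users").fetchone())                  # (None,)
```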
9 new tests in tests/test_home_stats.py cover the migration,
endpoint shapes (24h/7d/unknown/empty/missing-user), and the
manifest-side last_pull_at bump.
* docs(CHANGELOG): homepage status frame entries under [Unreleased]
The post-rebase release-cut now belongs to whichever PR lands next
after main rolled to 0.54.9. This PR logs its bullets under
[Unreleased] (Added: homepage status frame, per-user pull tracking,
token counters; Changed: schema v43 → v44 migration) so they ride
out with the next release-cut.
* fix(tests): bump test_schema_v42_migration asserts to v44
CI failed because tests/test_schema_v42_migration.py hardcoded
`assert SCHEMA_VERSION == 43` and `assert v == 43` after init.
v44 (homepage stats frame backing columns) was introduced in the
preceding feat commit; this aligns the existing v42-era migration
tests with the new schema version.
* feat(home): gate status frame on operator flag + user.onboarded
Two gates on the homepage status frame:
1. **Operator master switch** — `get_home_status_frame_visibility()` in
app/instance_config.py mirrors the existing `get_home_automode_visibility()`
shape: env var `AGNES_HOME_SHOW_STATUS_FRAME` > yaml
`instance.home.show_status_frame` > default `True`. Cautious-rollout
instances can disable the frame without forking; the yaml example
documents both knobs.
2. **Onboarded gate** — the template only renders the frame when the
caller's `users.onboarded` is true. First-day users see a clean
install-hero instead of all-zero stat cards; the frame appears
automatically on the next render after `agnes init` POSTs
`/api/me/onboarded`.
Router skips the `compute_home_stats` DB read entirely when either
gate is closed; `home_stats` arrives at the template as None in that
branch and the `{% if %}` shortcuts the include.
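The two gates and the skipped DB read could look roughly like this. Everything here is an illustrative sketch: `status_frame_visible`, `home_stats_or_none`, the set of falsey spellings, and treating an empty env var as unset are assumptions, not the code in app/instance_config.py or the router:

```python
from __future__ import annotations

import os
from typing import Mapping

ENV_VAR = "AGNES_HOME_SHOW_STATUS_FRAME"
FALSEY = {"0", "false", "no", "off"}  # assumption: exact spellings are not listed in this PR


def status_frame_visible(yaml_cfg: dict, environ: Mapping[str, str] = os.environ) -> bool:
    """Operator switch: env var > yaml instance.home.show_status_frame > True."""
    raw = environ.get(ENV_VAR)
    if raw is not None and raw.strip():
        return raw.strip().lower() not in FALSEY
    value = yaml_cfg.get("instance", {}).get("home", {}).get("show_status_frame")
    return True if value is None else bool(value)


def home_stats_or_none(user, yaml_cfg, environ, compute):
    """Skip the stats query entirely unless both gates are open."""
    if not status_frame_visible(yaml_cfg, environ) or not user.get("onboarded"):
        return None  # the template's {% if %} then skips the include
    return compute(user)


calls: list[str] = []


def fake_compute(user: dict) -> dict:
    calls.append(user["id"])  # records that the "DB read" happened
    return {"sessions": 0}


new_user = {"id": "u1", "onboarded": False}
old_user = {"id": "u2", "onboarded": True}
print(home_stats_or_none(new_user, {}, {}, fake_compute))              # None
print(home_stats_or_none(old_user, {}, {ENV_VAR: "0"}, fake_compute))  # None
print(home_stats_or_none(old_user, {}, {}, fake_compute))              # {'sessions': 0}
print(calls)                                                           # ['u2']
```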
Why both gates: PostHog feature flags were evaluated and rejected. This
codebase uses PostHog for analytics capture only, not feature gating;
adding a per-user feature_enabled() call on the /home critical path
would couple the homepage render to a remote eval and still require
an admin master switch. The onboarded gate is a UX coherence rule
layered on top of the operator switch, not an A/B test signal.
3 new tests in test_home_stats.py cover the env-var resolution
(falsey values + default-true). The yaml example gets a `home:`
block documenting both `show_automode` (pre-existing flag, was
undocumented in the example) and `show_status_frame`.
469 lines
18 KiB
Python
"""Pure helpers for UsageProcessor — event extraction from Claude Code session jsonls.

Session JSONL shape (as documented in dev_docs/session_explore.md and verified
against live samples):

Each line is a top-level event dict with:
  {
    "type": "user" | "assistant" | "progress" | "system" |
            "tool_use_result" | "summary" | "file-history-snapshot" |
            "queue-operation" | ...,
    "uuid": "event-uuid",
    "parentUuid": "parent-event-uuid",
    "sessionId": "session-uuid",
    "timestamp": "2026-05-12T07:30:00.000Z",
    "cwd": "/path/to/cwd",
    "message": {
      "role": "user" | "assistant",
      "model": "claude-...",  # present on assistant turns
      "content": [  # array or plain string on user turns
        {"type": "text", "text": "..."},
        {"type": "tool_use", "id": "tu_123", "name": "Bash", "input": {...}},
        {"type": "tool_result", "tool_use_id": "tu_123", "is_error": false, "content": [...]}
      ]
    }
  }

Tool results appear as:
  - Inline content items of type "tool_result" inside a user-role message, OR
  - As top-level events of type "tool_use_result" (older Claude Code versions)

is_error correlation: build a map of {tool_use_id: True} from tool_result
items on the first pass, then apply to matching tool_use events.
"""

from __future__ import annotations

import re
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone, timedelta
from typing import Iterator

USAGE_PROCESSOR_VERSION = 2

BUILTIN_TOOLS = frozenset({
    "Bash", "Read", "Edit", "Write", "Grep", "Glob", "TodoWrite",
    "Task", "Agent", "NotebookEdit", "WebFetch", "WebSearch", "ExitPlanMode",
    "LS",  # also built-in
})

# Slash commands: "/something" or "/namespace:something" at start of user text
SLASH_RE = re.compile(r"^\s*/([A-Za-z][\w:-]*)")

# Event types to skip entirely
_SKIP_TYPES = frozenset({
    "system", "summary", "file-history-snapshot",
    "queue-operation", "progress",
})


@dataclass(frozen=True)
class ParsedEvent:
    event_uuid: str | None
    parent_uuid: str | None
    tool_id: str | None  # tool_use 'id' (tu_xxx) from message.content item; None for slash_command
    event_type: str  # 'tool_use' | 'slash_command' | 'subagent' | 'mcp_call'
    tool_name: str | None
    skill_name: str | None
    subagent_type: str | None
    command_name: str | None
    is_error: bool
    model: str | None
    cwd: str | None
    occurred_at: datetime


def _parse_ts(ts_str: str | None) -> datetime | None:
    """Parse ISO 8601 timestamp to aware datetime. Returns None on failure."""
    if not ts_str:
        return None
    try:
        ts_str = ts_str.replace("Z", "+00:00")
        return datetime.fromisoformat(ts_str)
    except (ValueError, TypeError):
        return None


def _collect_error_map(turns: list[dict]) -> dict[str, bool]:
    """First-pass: collect tool_use_id → is_error from all tool_result items.

    Tool results appear in two places:
      1. As content items inside user-role messages (type='tool_result')
      2. As top-level events of type='tool_use_result'
    """
    errors: dict[str, bool] = {}
    for turn in turns:
        turn_type = turn.get("type", "")

        # Top-level tool_use_result events (older Claude Code)
        if turn_type == "tool_use_result":
            tu_id = turn.get("tool_use_id") or turn.get("toolUseId")
            if tu_id and turn.get("is_error"):
                errors[tu_id] = True

        # Inline tool_result content blocks inside user messages
        msg = turn.get("message", {}) or {}
        content = msg.get("content", [])
        if isinstance(content, list):
            for item in content:
                if not isinstance(item, dict):
                    continue
                if item.get("type") == "tool_result":
                    tu_id = item.get("tool_use_id")
                    if tu_id and item.get("is_error"):
                        errors[tu_id] = True

    return errors


def iter_events(turns: list[dict]) -> Iterator[ParsedEvent]:
    """Walk parsed JSONL turns and yield ParsedEvent for each observable event.

    Recognises:
      - Assistant tool_use blocks → event_type='tool_use' (or 'subagent'/'mcp_call')
      - Skill tool → also extracts skill_name
      - Task/Agent tools → event_type='subagent'
      - mcp__* tools → event_type='mcp_call'
      - User messages starting with '/' → event_type='slash_command'

    Skips: system, summary, file-history-snapshot, queue-operation, progress.
    """
    error_map = _collect_error_map(turns)

    for turn in turns:
        turn_type = turn.get("type", "")
        if turn_type in _SKIP_TYPES:
            continue

        ts = _parse_ts(turn.get("timestamp")) or datetime.now(timezone.utc)
        cwd = turn.get("cwd")
        event_uuid = turn.get("uuid")
        parent_uuid = turn.get("parentUuid")

        msg = turn.get("message", {}) or {}
        content = msg.get("content", [])
        model = msg.get("model")

        if turn_type == "assistant":
            if isinstance(content, list):
                for item in content:
                    if not isinstance(item, dict):
                        continue
                    if item.get("type") != "tool_use":
                        continue

                    tool_id = item.get("id", "")
                    tool_name = item.get("name") or ""
                    inp = item.get("input") or {}
                    is_error = error_map.get(tool_id, False)

                    # Classify event type
                    skill_name: str | None = None
                    subagent_type: str | None = None
                    command_name: str | None = None

                    if tool_name == "Skill":
                        event_type = "tool_use"
                        # Real Skill input shape varies; check both keys
                        skill_name = inp.get("skill") or inp.get("name") or None
                    elif tool_name in ("Task", "Agent"):
                        event_type = "subagent"
                        subagent_type = inp.get("subagent_type") or tool_name
                    elif tool_name.startswith("mcp__"):
                        event_type = "mcp_call"
                    else:
                        event_type = "tool_use"

                    yield ParsedEvent(
                        event_uuid=event_uuid,
                        parent_uuid=parent_uuid,
                        tool_id=tool_id or None,
                        event_type=event_type,
                        tool_name=tool_name or None,
                        skill_name=skill_name,
                        subagent_type=subagent_type,
                        command_name=command_name,
                        is_error=is_error,
                        model=model,
                        cwd=cwd,
                        occurred_at=ts,
                    )

        elif turn_type == "user":
            # Slash-command detection from text content
            if isinstance(content, str):
                text_parts = [content]
            elif isinstance(content, list):
                text_parts = [
                    item.get("text", "")
                    for item in content
                    if isinstance(item, dict) and item.get("type") == "text"
                ]
            else:
                text_parts = []

            for text in text_parts:
                if not text:
                    continue
                m = SLASH_RE.match(text)
                if m:
                    yield ParsedEvent(
                        event_uuid=event_uuid,
                        parent_uuid=parent_uuid,
                        tool_id=None,
                        event_type="slash_command",
                        tool_name=None,
                        skill_name=None,
                        subagent_type=None,
                        command_name=m.group(1),
                        is_error=False,
                        model=None,
                        cwd=cwd,
                        occurred_at=ts,
                    )


class AttributionLookup:
    """Preloads attribution tables into memory for O(1) event attribution.

    Resolves (source, ref_id) for each event. Built-in tools and unknowns
    return ('builtin', None). curated wins over flea (alphabetical ordering
    means 'curated' < 'flea' → first-write-wins when iterating ORDER BY source).
    """

    def __init__(self, conn):
        self._skills: dict[str, tuple[str, str]] = {}  # name -> (source, ref_id)
        self._agents: dict[str, tuple[str, str]] = {}
        self._commands: dict[str, tuple[str, str]] = {}

        for row in conn.execute(
            "SELECT skill_name, source, ref_id FROM usage_attribution_skills ORDER BY source ASC"
        ).fetchall():
            self._skills.setdefault(row[0], (row[1], row[2]))

        for row in conn.execute(
            "SELECT agent_name, source, ref_id FROM usage_attribution_agents ORDER BY source ASC"
        ).fetchall():
            self._agents.setdefault(row[0], (row[1], row[2]))

        for row in conn.execute(
            "SELECT command_name, source, ref_id FROM usage_attribution_commands ORDER BY source ASC"
        ).fetchall():
            self._commands.setdefault(row[0], (row[1], row[2]))

    def attribute(self, event: ParsedEvent) -> tuple[str, str | None]:
        """Resolve (source, ref_id). Returns ('builtin', None) for built-ins or unknowns.

        Lookup order:
          1. Skill invocations → skill attribution table (bypasses BUILTIN_TOOLS check
             because Skill tool is built-in but the *skill name* identifies the plugin).
          2. Subagent dispatches → agent attribution table (bypasses BUILTIN_TOOLS check
             because Task/Agent are built-in but the *subagent_type* identifies the plugin).
          3. Slash commands → command attribution table.
          4. Built-in tool names → ('builtin', None).
          5. Unknown tool names → ('builtin', None) fallback.
        """
        # Skill name takes priority over tool_name check
        if event.skill_name and event.skill_name in self._skills:
            return self._skills[event.skill_name]

        # Subagent type takes priority over tool_name check
        if event.subagent_type and event.subagent_type in self._agents:
            return self._agents[event.subagent_type]

        # Slash command attribution
        if event.command_name and event.command_name in self._commands:
            return self._commands[event.command_name]

        # Built-in tool names (Task/Agent fall through to here only when
        # their subagent_type is not in the attribution table)
        if event.tool_name in BUILTIN_TOOLS:
            return ("builtin", None)

        # Unknown tool name → builtin fallback
        return ("builtin", None)


def compute_active_seconds(timestamps: list[datetime]) -> int:
    """Sum of intra-block durations. Gap >10 minutes = new block."""
    if not timestamps:
        return 0
    timestamps = sorted(timestamps)
    GAP = 600  # 10 minutes
    blocks = []
    block_start = timestamps[0]
    prev = timestamps[0]
    for ts in timestamps[1:]:
        gap = (ts - prev).total_seconds()
        if gap > GAP:
            blocks.append((block_start, prev))
            block_start = ts
        prev = ts
    blocks.append((block_start, prev))
    return int(sum((end - start).total_seconds() for start, end in blocks))


def compute_summary(turns: list[dict], events: list[dict]) -> dict:
    """Build the usage_session_summary row dict from parsed turns and event rows.

    Caller must fill in 'session_file' and 'username' after calling this.
    events is a list of dicts (as produced by UsageProcessor, not ParsedEvent).
    """
    # session_id: first turn with a sessionId field
    session_id = None
    for t in turns:
        sid = t.get("sessionId")
        if sid:
            session_id = sid
            break

    # Timestamps from all turns that have one
    timestamps: list[datetime] = []
    user_messages = 0
    assistant_messages = 0
    model_counter: Counter = Counter()
    input_tokens = 0
    output_tokens = 0
    cache_read_tokens = 0
    cache_creation_tokens = 0

    for t in turns:
        ts = _parse_ts(t.get("timestamp"))
        if ts:
            timestamps.append(ts)
        turn_type = t.get("type", "")
        if turn_type == "user":
            user_messages += 1
        elif turn_type == "assistant":
            assistant_messages += 1
            msg = t.get("message", {}) or {}
            m = msg.get("model")
            if m:
                model_counter[m] += 1
            # Anthropic API usage block on assistant turns. Older sessions
            # may lack `cache_*` keys (pre-prompt-caching); `.get(k, 0)`
            # tolerates that. Non-int values (corrupted JSONL) are skipped
            # to keep one bad turn from poisoning the whole summary.
            usage = msg.get("usage") or {}
            for key, accum in (
                ("input_tokens", "input_tokens"),
                ("output_tokens", "output_tokens"),
                ("cache_read_input_tokens", "cache_read_tokens"),
                ("cache_creation_input_tokens", "cache_creation_tokens"),
            ):
                v = usage.get(key, 0)
                if isinstance(v, int):
                    if accum == "input_tokens":
                        input_tokens += v
                    elif accum == "output_tokens":
                        output_tokens += v
                    elif accum == "cache_read_tokens":
                        cache_read_tokens += v
                    elif accum == "cache_creation_tokens":
                        cache_creation_tokens += v

    started_at = min(timestamps) if timestamps else None
    ended_at = max(timestamps) if timestamps else None
    wall_seconds = (
        int((ended_at - started_at).total_seconds()) if started_at and ended_at else 0
    )
    active_seconds = compute_active_seconds(timestamps)

    # Aggregate counts from events
    tool_calls = sum(1 for e in events if e["event_type"] == "tool_use")
    tool_errors = sum(1 for e in events if e.get("is_error"))
    skill_invocations = sum(1 for e in events if e.get("skill_name"))
    subagent_dispatches = sum(1 for e in events if e["event_type"] == "subagent")
    mcp_calls = sum(1 for e in events if e["event_type"] == "mcp_call")
    slash_commands = sum(1 for e in events if e["event_type"] == "slash_command")
    distinct_tools = len({e["tool_name"] for e in events if e.get("tool_name")})
    distinct_skills = len({e["skill_name"] for e in events if e.get("skill_name")})
    primary_model = model_counter.most_common(1)[0][0] if model_counter else None

    return {
        "session_id": session_id or "",
        "started_at": started_at,
        "ended_at": ended_at,
        "active_seconds": active_seconds,
        "wall_seconds": wall_seconds,
        "user_messages": user_messages,
        "assistant_messages": assistant_messages,
        "tool_calls": tool_calls,
        "tool_errors": tool_errors,
        "skill_invocations": skill_invocations,
        "subagent_dispatches": subagent_dispatches,
        "mcp_calls": mcp_calls,
        "slash_commands": slash_commands,
        "distinct_tools": distinct_tools,
        "distinct_skills": distinct_skills,
        "primary_model": primary_model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cache_read_tokens": cache_read_tokens,
        "cache_creation_tokens": cache_creation_tokens,
        "processor_version": USAGE_PROCESSOR_VERSION,
    }


def rebuild_rollups(conn, *, since_day=None) -> None:
    """Rebuild daily rollups from usage_events.

    since_day=None (the default) resolves to today (UTC) minus 7 days, the
    incremental refresh run on every tick. None does not trigger a full
    rebuild: for a full rebuild on reprocess, pass an explicit date early
    enough to cover every event.

    Both rollup tables are updated inside a single transaction so a partial
    failure never leaves them inconsistent.
    """
    if since_day is None:
        since_day = (datetime.now(timezone.utc) - timedelta(days=7)).date()

    try:
        conn.execute("BEGIN")
        conn.execute("DELETE FROM usage_tool_daily WHERE day >= ?", [since_day])
        conn.execute(
            """
            INSERT INTO usage_tool_daily
                (day, tool_name, source, invocations, error_count, distinct_users, distinct_sessions)
            SELECT
                CAST(occurred_at AS DATE) AS day,
                tool_name,
                source,
                COUNT(*) AS invocations,
                SUM(CASE WHEN is_error THEN 1 ELSE 0 END) AS error_count,
                COUNT(DISTINCT username) AS distinct_users,
                COUNT(DISTINCT session_id) AS distinct_sessions
            FROM usage_events
            WHERE CAST(occurred_at AS DATE) >= ?
              AND tool_name IS NOT NULL
            GROUP BY day, tool_name, source
            """,
            [since_day],
        )
        conn.execute("DELETE FROM usage_plugin_daily WHERE day >= ?", [since_day])
        conn.execute(
            """
            INSERT INTO usage_plugin_daily
                (day, source, ref_id, invocations, distinct_users, distinct_sessions)
            SELECT
                CAST(occurred_at AS DATE) AS day,
                source,
                ref_id,
                COUNT(*),
                COUNT(DISTINCT username),
                COUNT(DISTINCT session_id)
            FROM usage_events
            WHERE CAST(occurred_at AS DATE) >= ?
              AND ref_id IS NOT NULL
              AND source IN ('curated', 'flea')
            GROUP BY day, source, ref_id
            """,
            [since_day],
        )
        conn.execute("COMMIT")
    except Exception:
        try:
            conn.execute("ROLLBACK")
        except Exception:
            pass
        raise