agnes-the-ai-analyst/app
ZdenekSrotyr cc1886c97c
release: 0.47.4 — Docker collector skip + FIFO session-pipeline check (#229)
## Summary

Two minimum-viable fixes after today's 0.44.0 → 0.47.3 release train and the production 30-user launch. Devil's advocate review of a 3-PR / 7-item plan cut scope to these 2 — the rest is deferred to a separate "operate-first, instrument-second" backlog item.

### B2 — Docker session_collector log skip

`services/session_collector` was logging `Collection complete: 0 users, 0 files copied` + `WARNING: Group 'data-ops' not found, using default group` every 10 minutes in the Docker layout (where `/home/*/user/sessions/` doesn't exist). New env var `AGNES_SKIP_LEGACY_COLLECTOR=1` set by default in `docker-compose.yml` short-circuits the collector pass.

The bare-VM deployment path (where /home/* IS populated by Claude Code) leaves the env var unset and continues to scan normally — including the data-ops warning, which is load-bearing for catching missing-group mis-deploys.

### O2 — FIFO check in `_check_session_pipeline`

The existing check compares `MAX(processed_at)` to newest jsonl mtime — catches "detector hasn't run lately" but blind to "old file was skipped while newer ones were processed". New code finds the oldest FS jsonl that's NOT in `session_extraction_state.session_file` and flags if its mtime is older than `SESSION_PIPELINE_STUCK_FILE_GRACE_SECONDS` (default 4× the existing grace = 2h).

Severity intentionally starts at `info` so we can collect prod data on false-positive rate before tightening to `warning`. The aggregator already treats `info` as non-promoting (see the severity vocabulary docstring at the top of `app/api/health.py`), so the headline `status` stays at `healthy` even when this fires — the operator sees the entry in the per-check breakdown but no spurious `degraded` overall.

## Test plan

- [x] `pytest tests/test_session_collector.py` — 17 tests pass (existing 9 + new 8 covering env-set/unset, truthy variants, falsy non-skip).
- [x] `pytest tests/test_health_session_pipeline.py` — 8 tests pass (existing 4 + new 4 FIFO tests covering stuck-file, under-threshold, all-processed, env-override).
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/keboola/agnes-the-ai-analyst/pull/229" target="_blank">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
    <img src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1" alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->
2026-05-08 09:38:21 +02:00
..
api release: 0.47.4 — Docker collector skip + FIFO session-pipeline check (#229) 2026-05-08 09:38:21 +02:00
auth feat(tokens): add scope + ttl_seconds fields with bootstrap-analyst clamp 2026-05-04 17:00:54 +02:00
debug feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
marketplace_server feat(store): /store + /my-ai-stack — community marketplace + per-user composition 2026-05-05 02:53:49 +02:00
middleware feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
web release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217) 2026-05-07 19:01:27 +02:00
__init__.py feat: add FastAPI server with auth, RBAC, and all API endpoints 2026-03-27 15:19:18 +01:00
instance_config.py fix: Devin Review on #194 round 2 — 3 BUG-class findings 2026-05-05 20:02:50 +02:00
logging_config.py feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136) 2026-04-29 22:54:21 +02:00
main.py release: 0.47.0 — source-agnostic catalog metadata + cache discipline (#223) 2026-05-07 18:33:55 +02:00
resource_types.py feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150) 2026-04-30 22:02:16 +02:00
secrets.py feat: STATE_DIR env var + flat-mount overlay (parallel disks) 2026-05-05 19:28:07 +02:00
utils.py feat(store): /store + /my-ai-stack — community marketplace + per-user composition 2026-05-05 02:53:49 +02:00
version.py docs(version): clarify APP_VERSION scope + middleware /api prefix rationale 2026-05-06 23:23:23 +02:00