agnes-the-ai-analyst

ZdenekSrotyr b6cdd68e8d feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene	2026-05-12 10:37:35 +02:00

Three behavioural improvements driven by the sub-agent end-to-end test findings, plus scheduler tweaks to prevent the post-deploy contention burst we measured.

CATALOG (catalog-side bugs the test agents tripped on):
- new entity_type field per remote row (BASE TABLE / VIEW / MATERIALIZED VIEW). For views, rows + size_bytes return null instead of the misleading 0 that __TABLES__ reports.
- where_examples now validates against the table's actual schema (cached known_columns from refresh). The pre-fix behavior blindly advertised `country_code = 'CZ'` on tables with no country_code column; the sub-agent tests reliably hit this on unit_economics.
- new known_columns + entity_type columns on bq_metadata_cache; populated by bq_metadata_refresh.refresh_one from the same fetch_bq_columns_full call (no extra BQ roundtrip) plus a cheap INFORMATION_SCHEMA.TABLES lookup for table_type.

QUERY COST-GUARD:
- remote_scan_too_large suggestion now names views explicitly: `Target(s) <ids> are VIEW or MATERIALIZED VIEW. BigQuery does not push LIMIT into the view body — SELECT * FROM <view> LIMIT 1 still runs the full underlying scan.` Programmatic consumers get a new view_targets field on the error detail.

SCHEDULER HYGIENE (the post-deploy 1-minute window where concurrent parquet downloads dropped to ~1 MB/s):
- SCHEDULER_STARTUP_GRACE_SECONDS (default 60) holds the first tick so the burst doesn't overlap cache_warmup writes.
- SCHEDULER_BQ_METADATA_INITIAL_OFFSET_MAX_SECONDS (default 900) randomises bq-metadata-refresh's first-fire offset.

TESTS:
- test_bq_metadata_cache_repo: entity_type + known_columns round-trip
- test_v2_catalog_remote_metadata: where_examples validation, views return null rows/size_bytes, cold rows have empty examples
- test_api_query_guardrail: VIEW-aware suggestion text + view_targets
- test_connectors_bigquery_metadata: entity_type lookup mock + new fields in TableMetadata expectations
- test_scheduler_sidecar: grace + jitter env-var resolution
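
The where_examples validation described in the commit message can be pictured with a minimal sketch. This is not the code in v2_catalog.py. The function name `filter_where_examples`, the regex, and the assumption that every example begins with the column it filters on are all hypothetical, chosen only for illustration. The sketch drops any example whose leading column is missing from the cached known_columns and returns no examples for a cold row (no cached schema), matching the behaviour the tests above describe.

```python
import re
from typing import Iterable, List, Optional

# Hypothetical helper: assumes each example starts with the column it
# filters on, optionally backtick-quoted, e.g. "country_code = 'CZ'".
_LEADING_IDENT = re.compile(r"\s*`?([A-Za-z_][A-Za-z0-9_]*)`?")


def filter_where_examples(
    examples: Iterable[str],
    known_columns: Optional[Iterable[str]],
) -> List[str]:
    """Keep only examples whose leading column exists in known_columns."""
    if not known_columns:
        # Cold row: the schema has not been cached yet, so advertise nothing
        # rather than guess at columns that may not exist.
        return []
    # Assumption: column names are compared case-insensitively.
    known = {col.lower() for col in known_columns}
    kept = []
    for example in examples:
        match = _LEADING_IDENT.match(example)
        if match and match.group(1).lower() in known:
            kept.append(example)
    return kept


if __name__ == "__main__":
    candidates = ["country_code = 'CZ'", "order_date >= '2026-01-01'"]
    # A table without a country_code column keeps only the order_date example.
    print(filter_where_examples(candidates, ["order_date", "revenue"]))
```

A real implementation would more likely reuse the project's own WHERE parsing (where_validator.py appears in the listing below) than a leading-identifier regex; the regex only keeps the sketch self-contained.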
__init__.py	feat: add FastAPI server with auth, RBAC, and all API endpoints	2026-03-27 15:19:18 +01:00
_metadata_models.py	feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene	2026-05-12 10:37:35 +02:00
access.py	System plugins (schema v39) + marketplace UX polish + drop legacy pages (#241)	2026-05-10 19:15:41 +00:00
admin.py	Flea-market upload guardrails + soft delete + JOIN-based admin queue (#233)	2026-05-09 17:32:53 +04:00
admin_bigquery_test.py	feat(admin): #160 BQ test-connection endpoint + billing_project placeholder UI	2026-05-04 10:31:35 +02:00
bq_metadata_refresh.py	feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene	2026-05-12 10:37:35 +02:00
cache_warmup.py	release: 0.50.0 — persistent BQ metadata cache + scheduled refresh; catalog never blocks on BigQuery	2026-05-11 20:37:17 +02:00
catalog.py	feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150)	2026-04-30 22:02:16 +02:00
claude_md.py	chore(cli-rename): replace stale `da` verbs in active code paths	2026-05-04 21:10:43 +02:00
cli_artifacts.py	chore: rename stale 'da' references to 'agnes' + CHANGELOG	2026-05-06 23:23:59 +02:00
data.py	feat(caddy): file_server for parquet downloads — bypass uvicorn	2026-05-05 16:41:33 +02:00
health.py	Extract session-pipeline framework + UsageProcessor skeleton (#232)	2026-05-08 19:47:46 +02:00
jira_webhooks.py	fix(security): close Jira webhook fail-open + path traversal (#83) (#93)	2026-04-27 19:53:55 +02:00
marketplace.py	System plugins (schema v39) + marketplace UX polish + drop legacy pages (#241)	2026-05-10 19:15:41 +00:00
marketplaces.py	System plugins (schema v39) + marketplace UX polish + drop legacy pages (#241)	2026-05-10 19:15:41 +00:00
me.py	feat(home): state-aware /home + /setup-advanced + schema v26 (#228)	2026-05-08 18:28:47 +02:00
me_debug.py	feat(auth): /me/debug self-only auth diagnostic page (#116)	2026-04-29 06:36:28 +02:00
memory.py	feat(memory): admin Edit + MEMORY_DOMAIN RBAC + ai-section UI (#141)	2026-04-30 11:04:41 +02:00
metadata.py	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening	2026-04-28 14:25:04 +02:00
metrics.py	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening	2026-04-28 14:25:04 +02:00
my_stack.py	System plugins (schema v39) + marketplace UX polish + drop legacy pages (#241)	2026-05-10 19:15:41 +00:00
news.py	feat(home): state-aware /home + /setup-advanced + schema v26 (#228)	2026-05-08 18:28:47 +02:00
query.py	feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene	2026-05-12 10:37:35 +02:00
query_hybrid.py	feat(rbac+marketplace): RBAC v13 + Claude Code marketplace + #81/#83/#44 hardening	2026-04-28 14:25:04 +02:00
scripts.py	feat(scheduler): re-wire sync_schedule + script.schedule; tune via env; OpenMetadata TLS (#135)	2026-04-29 22:06:30 +02:00
settings.py	feat(rbac): drop dataset_permissions + users.role + is_public; v19 migration (#150)	2026-04-30 22:02:16 +02:00
store.py	Flea-market edit feature with version history (schema v37) (#239)	2026-05-10 00:14:33 +04:00
sync.py	release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217)	2026-05-07 19:01:27 +02:00
telegram.py	feat: complete system — web UI, all API endpoints, governance, admin, CLI commands	2026-03-27 16:52:22 +01:00
tokens.py	chore(lint): final ruff fixes	2026-05-04 19:32:52 +02:00
upload.py	fix(security+ops) + release(0.12.1): #82 #85 #87 hardening + cut 0.12.1 (#104)	2026-04-28 19:57:30 +02:00
users.py	System plugins (schema v39) + marketplace UX polish + drop legacy pages (#241)	2026-05-10 19:15:41 +00:00
v2_arrow.py	feat(v2): claude-driven fetch primitives + 0.14.0 (#102)	2026-04-29 01:07:19 +02:00
v2_cache.py	feat(v2): claude-driven fetch primitives + 0.14.0 (#102)	2026-04-29 01:07:19 +02:00
v2_catalog.py	feat(catalog): entity_type + validated where_examples + view-aware cost-guard + scheduler hygiene	2026-05-12 10:37:35 +02:00
v2_quota.py	refactor(quota): #160 relocate _build_quota_tracker to v2_quota.py	2026-05-04 10:31:35 +02:00
v2_sample.py	release: 0.46.5 — agnes describe -n parses, server sanitizes NaN (#224)	2026-05-07 18:16:21 +02:00
v2_scan.py	perf: Tier 1 event-loop unblocking — async def → def on BQ-bound handlers	2026-05-05 17:44:08 +02:00
v2_schema.py	release: 0.47.0 — source-agnostic catalog metadata + cache discipline (#223)	2026-05-07 18:33:55 +02:00
welcome.py	fix(devin-review): dashboard CTA respects override; PUT validates anon path	2026-05-03 21:45:32 +02:00
where_validator.py	feat(v2): claude-driven fetch primitives + 0.14.0 (#102)	2026-04-29 01:07:19 +02:00