History

ZdenekSrotyr 9f5adbce37 ci: consolidate release pipeline (salvageable subset of #139 ) (#314 ) * ci: add actionlint workflow lint, drop superseded deploy.yml stub * ci: extract rollback into reusable rollback.yml, wire into release smoke-test * ci: add weekly prune-dev-tags workflow for legacy CalVer tag/image cleanup * release: 0.54.17 — CI/release workflow consolidation * fix(ci): warn when rollback.yml receives a non-stable failed_image_tag * fix(ci): rollback.yml + prune-dev-tags.sh review findings rollback.yml: - Pass workflow_dispatch inputs (failed_image_tag, target_image_tag) through env: instead of textual ${{ }} splicing into bash run blocks — prevents an actor with workflow_dispatch privilege from injecting shell via quote/backtick payloads. - Guard against TARGET == FAILED when only one stable-* tag exists (fresh repo, or aggressive pruning at month boundary). Fail loudly rather than re-push the broken image as :stable. - Add commit SHA to the rollback tracking-issue body — github.sha is inherited across workflow_call, so on-call no longer has to navigate rollback run → caller-workflow breadcrumb → failing commit. prune-dev-tags.sh: - Replace 'printf … \| head -20' preview pipeline with array slice ('"${TO_PRUNE[@]:0:20}"'). Under set -o pipefail, head closing the pipe early SIGPIPEs printf (exit 141) and aborts the script before any deletion runs — exactly the multi-month-backlog scenario the script targets. - Refactor GHCR-pass: fetch versions JSON once before the loop, then build a tag→version-id map up-front. Closes two problems: 1. O(N × pages) GHCR API calls collapse to one paginated listing — months of accumulated CalVer tags no longer risk tripping abuse detection. 2. The new jq filter excludes any version that ALSO carries a floating alias (:stable, :dev, -latest). GHCR DELETE-version drops the entire manifest, so pruning a CalVer tag that shares a manifest with :stable (e.g. after a rollback re-tag) would have vaporized :stable. Now it's skipped with a log line. lint-workflows.yml: - Add an explicit shellcheck step. actionlint only walks .github/workflows/ and the shell embedded in their run: blocks, so freestanding scripts/ops/.sh (which are in the workflow's path filter) were never actually validated despite triggering CI. * fix(ci): shellcheck --severity=warning to skip pre-existing info findings The new shellcheck step caught info-level findings (SC1091, SC2015) in agnes-auto-upgrade.sh / agnes-tls-rotate.sh — pre-existing, not regressed by this PR. Constrain shellcheck to warning+ severity (real bugs) so info and style findings don't block CI; mirrors the actionlint step's continue-on-error initial-rollout posture. * fix(ci): second-pass review findings — concurrency, walk-back, failure propagation rollback.yml: - Add own concurrency block (group: rollback-<repo>-<failed_tag>, cancel-in-progress: false). The caller release.yml uses cancel-in-progress: true to avoid duplicate CalVer claims, but a second push to main mid-rollback would otherwise kill the workflow between the :stable recovery push and the :deprecated-* audit push, leaving :stable stuck on the broken image. A reusable workflow's own concurrency overrides the inherited one. - Walk back through stable-* tags newest-first, skipping any whose :deprecated-<stripped> GHCR alias already exists (carries the mark of a prior failed rollback). The previous 'second-most-recent' heuristic could re-point :stable at a known-broken image on cascading failures. - Reorder re-tag step: push :stable recovery FIRST, then the :deprecated-* audit tag. Defense in depth — even if the concurrency block somehow misfires, the worst case is missing audit metadata rather than production stuck on the broken image. - Move GHCR login before resolve step so 'docker manifest inspect' can probe for :deprecated-* aliases during walk-back. - Document the top-level permissions block's dual semantics (workflow_dispatch grants directly; workflow_call acts as a cap intersected with the caller's job-level permissions). release.yml: - Rewrite the 'issues: write' comment. Old wording ('default for jobs') was factually wrong — GITHUB_TOKEN's default for issues is never write — and read as 'this line just documents a default', so a future cleanup PR could delete it. The line is load-bearing: workflow_call permissions are bounded by the caller's GITHUB_TOKEN scope, and removing it would silently 403 rollback.yml's gh issue create step. prune-dev-tags.sh: - Drop the '\|\| echo "[]"' fallback on the GHCR versions fetch. The fallback turned every API failure (403 missing scope, 429 rate limit, transient 5xx) into a silent no-op with exit 0 — operators saw a green run while every TAG fell through to the same 'no eligible version' skip message used for legitimate manifest-collision skips. - Reorder: fetch GHCR versions BEFORE any git-tag deletion. Git-tag delete is irrecoverable (next run rebuilds TO_PRUNE from 'git tag -l', so an orphan GHCR image is never enumerated again). Fetching first means an API failure aborts cleanly with no state change. - Track PRUNE_FAILED flag. 'git push --delete' fallback is no longer unconditional — local 'git tag -d' is gated on successful remote push, so a refused remote delete (tag-protection rule, missing contents:write) leaves the local tag in place for retry. The flag propagates to a final 'exit 1' so the cron run turns red on any push or DELETE failure. lint-workflows.yml: - shellcheck step now uses 'find scripts/ops -type f -name .sh' to match the workflow's recursive 'scripts/ops/.sh' path filter. The previous bare 'scripts/ops/.sh' glob only matched top-level files; a future script under a subdirectory would have triggered the workflow but never been linted. * docs(releasing): document rollback.yml, prune-dev-tags.yml, lint-workflows.yml Reflects the new operational workflows landing in this release: - Auto-rollback paragraph in release.yml description (smoke-test job + rollback-on-smoke-fail → rollback.yml) - rollback.yml subsection — workflow_call + workflow_dispatch entry points, walk-back target resolution, immutability + concurrency guarantees, manual operator gh workflow run examples - prune-dev-tags.yml subsection — weekly cron, KEEP_MONTHS retention semantics, floating-alias safety, dry_run preview, failure-propagation exit-non-zero behavior - lint-workflows.yml CI quirk — actionlint (continue-on-error) + shellcheck (--severity=warning blocking) advisory checks CLAUDE.md non-negotiable rules unchanged — still high-level and correct (changelog discipline + release-cut belongs to the PR + run the full test suite).		2026-05-15 14:06:59 +02:00
..
admin	release: 0.47.0 — source-agnostic catalog metadata + cache discipline (#223 )	2026-05-07 18:33:55 +02:00
archive	docs: consolidate and de-clutter the documentation tree (#306 )	2026-05-14 18:54:22 +00:00
examples	Marketplace UX overhaul: rich plugin/skill/agent detail + filename rename (#251 )	2026-05-12 08:38:39 +00:00
HOWTO	feat(cli): agnes marketplace search/detail/add/remove + retire stale subcommands (#280 )	2026-05-13 05:20:56 +00:00
metrics	chore(docs): replace stale `da` verbs and vendor-specific install paths	2026-05-04 21:22:19 +02:00
operator	feat(home): state-aware /home + /setup-advanced + schema v26 (#228 )	2026-05-08 18:28:47 +02:00
setup	release: 0.47.1 — Keboola connector v27 (incremental, partitioned, where_filters, typed parquet) (#217 )	2026-05-07 19:01:27 +02:00
testing	docs(testing): add coverage honesty + prerequisites to E2E plan	2026-05-04 19:59:47 +02:00
ADR-corporate-memory-v1.md	feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72 )	2026-04-29 07:16:22 +02:00
agent-setup-prompt.md	fix(api): align PUT validation autoescape with runtime (False); docs match	2026-05-03 21:30:24 +02:00
agent-workspace-prompt.md	chore(docs): replace stale `da` verbs and vendor-specific install paths	2026-05-04 21:22:19 +02:00
architecture.md	fix(compose): drop corporate-memory + session-collector services (#176 )	2026-05-04 23:59:44 +02:00
auth-google-oauth.md	refactor(docs): sweep DA_* env vars + surviving da-verbs in docs/*.md (Task 0.5 fix)	2026-05-04 16:43:15 +02:00
auth-groups.md	refactor(docs): sweep DA_* env vars + surviving da-verbs in docs/*.md (Task 0.5 fix)	2026-05-04 16:43:15 +02:00
CONFIGURATION.md	Activity Center: audit log + telemetry + sessions + agnes_* tables (#278 )	2026-05-12 22:41:19 +02:00
corporate-memory-governance.md	Add Corporate Memory governance — Phase 1 (data model + admin API)	2026-03-23 19:15:33 +01:00
curated-marketplace-format.md	Marketplace UX overhaul: rich plugin/skill/agent detail + filename rename (#251 )	2026-05-12 08:38:39 +00:00
DATA_SOURCES.md	remove agnes query --register-bq from client CLI	2026-05-12 18:18:13 +02:00
DEPLOYMENT.md	Activity Center: audit log + telemetry + sessions + agnes_* tables (#278 )	2026-05-12 22:41:19 +02:00
development.md	feat(observability): request_id end-to-end + dev debug toolbar + centralized logging (#136 )	2026-04-29 22:54:21 +02:00
HEADLESS_USAGE.md	feat(web): consolidate the personal /me/* surface — /me/activity + /me/profile (#304 )	2026-05-14 21:29:51 +02:00
initial-workspace-override.md	fix: refresh-marketplace enables stack plugins; override sentinel is init-time only (#307 )	2026-05-14 18:43:32 +02:00
llm-routing.md	docs,tests: anonymize customer references	2026-04-21 11:56:19 +02:00
local-development.md	feat(web): consolidate the personal /me/* surface — /me/activity + /me/profile (#304 )	2026-05-14 21:29:51 +02:00
marketplace.md	docs: consolidate and de-clutter the documentation tree (#306 )	2026-05-14 18:54:22 +00:00
observability.md	feat(observability): optional PostHog integration (#231 )	2026-05-08 17:57:10 +04:00
ONBOARDING.md	Activity Center: audit log + telemetry + sessions + agnes_* tables (#278 )	2026-05-12 22:41:19 +02:00
PLATFORM_SETUP.md	Activity Center: audit log + telemetry + sessions + agnes_* tables (#278 )	2026-05-12 22:41:19 +02:00
QUICKSTART.md	docs: consolidate and de-clutter the documentation tree (#306 )	2026-05-14 18:54:22 +00:00
RBAC.md	refactor(docs): sweep DA_* env vars + surviving da-verbs in docs/*.md (Task 0.5 fix)	2026-05-04 16:43:15 +02:00
README.md	docs: consolidate and de-clutter the documentation tree (#306 )	2026-05-14 18:54:22 +00:00
RELEASE_CHECKLIST.md	docs: consolidate and de-clutter the documentation tree (#306 )	2026-05-14 18:54:22 +00:00
RELEASING.md	ci: consolidate release pipeline (salvageable subset of #139 ) (#314 )	2026-05-15 14:06:59 +02:00
sample-data.md	chore(docs): replace stale `da` verbs and vendor-specific install paths	2026-05-04 21:22:19 +02:00
state-dir.md	fix: Devin Review on #194 round 2 — 3 BUG-class findings	2026-05-05 20:02:50 +02:00
STORE_GUARDRAILS.md	Flea-market upload guardrails + soft delete + JOIN-based admin queue (#233 )	2026-05-09 17:32:53 +04:00
theme-reference.html	Fix clipped annotation badges in theme-reference.html	2026-03-11 14:09:04 +01:00

README.md

Agnes documentation

Index of all documentation, organized by who needs it. New here? Start with the row that matches your role.

You are…	Start with
Analyst — using Agnes to query data	`QUICKSTART.md`, then `HOWTO/`
Operator — deploying & running an instance	`PLATFORM_SETUP.md`
Developer — working on Agnes itself	`../ARCHITECTURE.md` + `architecture.md`

For analysts

Using the platform to analyze data.

QUICKSTART.md — local setup + first sync
HOWTO/ — task-oriented cookbook (querying, snapshots, common workflows)
DATA_SOURCES.md — data source connectors (Keboola, BigQuery, CSV) and how tables surface
metrics/ — canonical business-metric definitions (YAML)
HEADLESS_USAGE.md — PAT auth for CI / headless clients

For operators

Deploying, configuring, and running an Agnes instance.

PLATFORM_SETUP.md — the consolidated operator playbook (bootstrap, TLS, marketplaces, scheduler, telemetry)
ONBOARDING.md — end-to-end Terraform deployment into a new GCP project
DEPLOYMENT.md — picks between the Terraform and Docker Compose paths
CONFIGURATION.md — instance.yaml, env vars, per-instance options
state-dir.md — persistent data layout (data + state tiers, mount layouts, migration)
RBAC.md — access control: groups, members, resource grants
auth-google-oauth.md — Google OAuth setup + operator gotchas
auth-groups.md — Google Workspace group sync
admin/query-modes.md — table registration query modes
agent-setup-prompt.md — customize the /setup page banner
agent-workspace-prompt.md — customize the generated analyst CLAUDE.md
initial-workspace-override.md — per-instance analyst-workspace skeleton override
curated-marketplace-format.md — authoring marketplace-metadata.json for curated marketplaces
observability.md — PostHog integration (exceptions, tracing, session replay)
operator/news-content-guide.md — editorial guidelines for in-app news content

For developers

Working on the Agnes codebase.

../ARCHITECTURE.md — high-level system overview (the summary)
architecture.md — detailed architecture reference (module map, extract.duckdb contract, components)
../CLAUDE.md — project instructions for AI agents working in this repo
development.md — logging, request correlation, debug toolbar
local-development.md — LOCAL_DEV_MODE setup (what's mocked vs. real)
RELEASING.md — release process, deploy workflows, CI quirks
RELEASE_CHECKLIST.md — pre-merge checks for bootstrap-path changes
testing/ — test plans (clean-analyst bootstrap, VM test)
marketplace.md — Claude Code marketplace ingestion + re-serving internals
STORE_GUARDRAILS.md — flea-market upload guardrails (static checks + LLM review)
corporate-memory-governance.md — knowledge-distribution governance design
ADR-corporate-memory-v1.md — ADR: corporate-memory v1 decisions
llm-routing.md — design: provider-agnostic LLM routing
sample-data.md — sample data generator (e-commerce schema, size presets)
theme-reference.html — web UI theme/color reference
../dev_docs/ — server/developer-internal docs (not synced to analyst machines): server ops, disaster recovery, security audit, desktop app, design system, Telegram bot

Code-adjacent READMEs: ../connectors/jira/README.md, ../services/corporate_memory/README.md, ../scripts/README.md. Agent skill files: ../cli/skills/.

Other

../CHANGELOG.md — full change history (Keep-a-Changelog format)
archive/ — historical planning artifacts and superseded docs; not maintained, see archive/README.md