agnes-the-ai-analyst/docs/ADR-corporate-memory-v1.md
PavelDo e1108b6112
feat(memory): corporate memory v1+v1.5 + 0.15.0 (#72)
Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).
2026-04-29 07:16:22 +02:00

18 KiB
Raw Permalink Blame History

ADR — Corporate Memory V1 must-fix changes

Status: Accepted Date: 2026-04-25 Branch: pabu/local-dev

This ADR records the architectural decisions made to address the three high-severity findings raised before V1 of the corporate-memory feature ships. V1.5 follow-ups are tracked separately in docs/TODO.md.


Change set overview

Fix Files Reason
1. is_personal privacy enforcement app/api/memory.py, src/repositories/knowledge.py, tests/test_memory_api.py Personal items were leaking through three vectors.
2. Wire contradiction detection + fix candidate SQL services/verification_detector/detector.py, src/repositories/knowledge.py, tests/test_corporate_memory_v1.py The feature shipped inert; the SQL pre-filter was structurally broken.
3. Stop trusting LLM-supplied confidence + persist evidence services/verification_detector/{schemas,prompts,detector}.py, src/db.py, src/repositories/knowledge.py, tests/test_corporate_memory_v1.py The LLM should not set its own credibility, and the most valuable signal (the user quote) was being discarded.

Schema bumped from v8 to v10 (v9: audience column; v10: users.groups column) (SCHEMA_VERSION = 10 in src/db.py) — v9 adds the verification_evidence table and knowledge_items.audience column; v10 adds the users.groups JSON column.


Decision 1 — is_personal is a hard privacy boundary, not a UI hint

Context

Five paths interacted with is_personal:

  1. GET /api/memory — accepted caller-controlled exclude_personal (default True) with no role check. Any authenticated user could pass exclude_personal=false and read everyone's personal items.
  2. repo.search() — ignored is_personal entirely. GET /api/memory?search=… bypassed the filter unconditionally.
  3. GET /{id}/provenance — no per-item privacy check.
  4. POST /{id}/vote — no per-item privacy check.
  5. POST /{id}/personal — already correctly gated to the contributor.

The toggle UI presents is_personal as "make this item private to me", so treating it as a UI convenience would have been a confidentiality breach.

Decision

Treat is_personal as an authorization rule enforced at every read site.

  • KnowledgeRepository.search() gains an exclude_personal parameter and appends AND (is_personal = FALSE OR is_personal IS NULL) when set.
  • app/api/memory.list_knowledge derives effective_exclude_personal:
    • Privileged viewers (km_admin, admin) — caller's choice respected.
    • Everyone else — silently coerced to True. The query parameter is intentionally not rejected with 403 to avoid encouraging client-side probing for the role boundary.
  • A shared _can_view_item(user, item) helper centralizes the rule (not is_personal OR contributor OR privileged_viewer) and is invoked from /{id}/provenance and /{id}/vote.
  • Denied access returns 404, not 403. The motivation is existence- hiding: a 403 response would let an attacker enumerate item IDs and learn which ones exist as personal items belonging to others.
  • Contributors continue to see their own personal items via /api/memory/my-contributions, which already has the correct semantics.

Alternatives considered

  • Reject exclude_personal=false from non-admins with 403. Simpler to log, but advertises the role boundary. The silent-coerce path matches the principle "the API behaves the same regardless of whether private data exists."
  • Strip is_personal from query/response entirely for non-admins. Too destructive — admins still need the toggle and the indicator. Centralizing on _can_view_item keeps the mental model uniform.

Trade-offs

  • 404-vs-403 is a deliberate UX cost: a contributor who mistypes a URL gets the same 404 as an attacker, with no hint that the issue is auth. The existence-leak risk dominates.
  • The repo search() signature changed (added one parameter with a default of False). Existing callers need not change, but the API layer now always passes the effective flag.

Decision 2 — Contradiction detection runs inline; SQL pre-filter is structured

Context

services/corporate_memory/contradiction.detect_and_record() exists, but services/verification_detector/detector.run() never called it. Items were being created and the contradictions table was always empty. Combined with:

  • find_contradiction_candidates in src/repositories/knowledge.py joined domain and keyword conditions with OR. Combined with ORDER BY updated_at DESC LIMIT 10, recent same-domain noise crowded out genuine conflicts and cross-domain conflicts were treated as same-domain noise.

The feature was effectively a stub.

Decision

  1. Wire the call inline. After repo.create(...) succeeds in detector.run(), the new item is reloaded and passed to contradiction.detect_and_record(extractor, new_item, repo).
    • Wrapped in try/except so an LLM judge failure does not abort the session — the verification still lands, the contradiction simply isn't recorded for that item.
    • A new contradictions_recorded counter is added to stats.
  2. Fix the SQL pre-filter. find_contradiction_candidates now treats domain as a top-level conjunct and keywords as an inner OR group:
    • Both supplied → domain = ? AND (keyword OR keyword OR …)
    • Domain only → domain = ?
    • Keywords only → (keyword OR keyword) (cross-domain fallback)
    • Neither supplied → return empty list.

Alternatives considered

  • Enqueue contradiction checks for a nightly batch job. This is the V2 plan (Anthropic Batch API, 50 % cost reduction), tracked in docs/TODO.md. Inline is the right V1 choice: detector volume is low, governance is high-priority, and admin queues need fresh signal.
  • Keep keyword-only candidate matching. Rejected — the original bug meant cross-domain hits were treated as conflicts (e.g. a "data pipeline churn" doc would conflict with a "finance churn metric"). Adding cross-domain fallback only when domain is missing preserves recall without re-introducing the bug.

Trade-offs

  • Each new verification now triggers up to 10 extra LLM calls (one per candidate). At V1 session volume this is acceptable; if it bites, the V2 batch executor (docs/TODO.md) is the upgrade path.
  • The detector loop is now tighter coupled to the corporate-memory module via direct import. This is fine for now — both modules are in the same service tier — but if a future split is needed, the call would move behind a queue.

Decision 3 — Confidence is computed in code; user_quote is persistent evidence

Context

services/verification_detector/schemas.py required base_confidence from the LLM, and detector.py stored it directly into knowledge_items.confidence. The handcrafted compute_confidence(...) table existed but was bypassed. Two failure modes follow:

  1. Trust: The LLM (or a prompt injection inside a session) can elevate confidence at will, undermining the entire governance flywheel.
  2. Calibration: Hardcoded numbers in _BASE_CONFIDENCE cannot be tuned without a deploy, and the raw evidence (the user's exact quote that triggered the verification) was extracted by the LLM but immediately discarded.

Decision

  1. Drop base_confidence from the schema. Removed from VERIFICATION_SCHEMA properties and required. The prompt no longer asks the LLM for it.
  2. Compute confidence in code. confidence = compute_confidence("user_verification", v["detection_type"]). If detection_type is unknown (e.g. an LLM hallucinated a new value), we fall back to the confirmation baseline rather than the LLM's number.
  3. Persist evidence. New table verification_evidence:
    CREATE TABLE verification_evidence (
        id              VARCHAR PRIMARY KEY,
        item_id         VARCHAR NOT NULL,
        source_user     VARCHAR,
        source_ref      VARCHAR,
        detection_type  VARCHAR,
        user_quote      TEXT,
        created_at      TIMESTAMP DEFAULT current_timestamp
    )
    
    Indexed on item_id. Multiple rows per item are expected — one per distinct verification event. Repository methods: create_evidence(...) and list_evidence(item_id).
  4. Schema version bump. SCHEMA_VERSION = 9. Added _V8_TO_V9_MIGRATIONS following the existing migration pattern; the table is also added to the fresh-install _SYSTEM_SCHEMA block.

Alternatives considered

  • Keep base_confidence in the schema but ignore it on read. Rejected — it would still consume LLM tokens and invite regressions where someone later "improves" the pipeline by reading the field.
  • Store evidence inline on knowledge_items (e.g. last_user_quote TEXT). Rejected — multiple analysts independently verifying the same item over time is exactly the signal we want to accumulate. A 1:N table is the right shape.
  • Compute confidence as a derived view over verification_evidence. Tempting but premature — the apply_decay(), modifier, and per-source floor logic make this non-trivial. Stored materialized value with a recompute job (V1.5, docs/TODO.md) is the staged plan.

Trade-offs

  • _BASE_CONFIDENCE is still hardcoded. Externalizing it to instance.yaml is a CLAUDE.md violation we are knowingly carrying through V1. It is the first item in the V1.5 plan (docs/TODO.md § Q3) and should land before any production rollout beyond a single instance.
  • Evidence rows accumulate without bounded retention. At V1 volumes this is fine; once a re-confidence job is added (V1.5), retention can piggy- back on that pipeline.

Decisions explicitly deferred (NOT made in this change)

These were proposed in the review but pushed to V1.5 because they are either non-blocking or require design work that should not delay the must-fix bundle. Each is tracked in docs/TODO.md.

  • Explicit duplicate-candidate hints (knowledge_item_relations table, entity-overlap detection, admin UI surface).
  • Embedding pre-filter for contradiction candidates.
  • Anthropic Batch API executor for nightly contradiction sweeps.
  • Externalized confidence config in instance.yaml, exponential decay with per-source floors, per-domain decay policy.
  • Bayesian prior calibration from admin actions, with random-sample holdout.
  • Typed entity registry with canonical IDs, word-boundary regex, hierarchical parent_id.
  • Multi-evidence boost re-computation on verification_evidence insert.

Schema v9 → v10: Audience Distribution (users.groups)

Change: Added groups JSON column to the users table.

Purpose: Enables audience-based knowledge distribution. knowledge_items.audience (added in v9) restricts item visibility to specific groups. users.groups stores the list of Agnes groups a user belongs to (e.g. ["finance", "engineering"]), resolved at login from the identity provider.

Migration: ALTER TABLE users ADD COLUMN IF NOT EXISTS groups JSON — runs automatically at startup via _ensure_schema().

Privacy note: The user_groups filter in KnowledgeRepository.list_items() and search() applies an audience IN (...) predicate at the SQL layer. Items with audience IS NULL or audience = 'all' are visible to everyone; items with a group audience require the caller to be a member of that group. Admin callers bypass this filter entirely.

Follow-up: The current implementation stores raw group strings in users.groups. A future Agnes Groups Mapping UI (tracked in GitHub Issues) will replace this with a first-class Agnes roles registry, where admins map identity-provider groups to Agnes roles.


Decision 4 — Topics + judgment + resolution as one Haiku structured call

Context

Decision 2 wired detect_and_record(...) into the detector and fixed the OR → AND SQL bug, but kept the underlying architecture: a SQL keyword pre-filter (title.split() words >3 chars matched via ILIKE) followed by N sequential LLM judge calls (one per candidate). Two structural problems with that design carried over:

  1. Recall holes in the SQL pre-filter. Synonyms, paraphrases, language variants, and metric aliases miss. "Churn", "attrition", and "logo churn" are the same concept; the keyword filter would treat them as unrelated.
  2. N sequential calls. Each new verification fired up to 10 Haiku calls. Latency and cost both grow linearly with the candidate cap.

Anthropic's Structured Outputs feature is now GA on Haiku 4.5, with strict schema enforcement (required fields guaranteed, types guaranteed, additionalProperties: false honored). This makes a single batched structured-output call reliable enough to replace both the keyword filter and the per-candidate loop.

Decision

Replace the keyword SQL pre-filter and the N-call judge loop with one batched Haiku call that returns:

  • a judgment per same-domain candidate (is_contradiction, severity, explanation),
  • and a structured resolution suggestion in the same response (resolution_action{kept_a, kept_b, merge, both_valid}, with resolution_merged_content populated only when merge, plus resolution_justification).

Concretely:

  • services/corporate_memory/prompts.py adds BATCH_CONTRADICTION_PROMPT
    • BATCH_CONTRADICTION_SCHEMA. Resolution is flattened into the judgment object (resolution_action, resolution_merged_content, resolution_justification) rather than nested, to stay inside the strict-output constraints (no recursion, predictable required fields).
  • services/corporate_memory/contradiction.py's find_and_judge(...) loads same-domain candidates from SQL and runs one Haiku call. detect_and_record(...) persists the resulting records.
  • src/repositories/knowledge.py:
    • find_contradiction_candidates is reduced to a domain-only SELECT. The title_words parameter is removed entirely. Domain remains a hard SQL conjunct (cheap; bounds prompt size as the corpus grows).
    • create_contradiction(suggested_resolution=…) now accepts either a dict (JSON-encoded into the existing TEXT column) or a string (legacy/free-form). list_contradictions and get_contradiction decode JSON-shaped values back into dicts on read.

Defenses against LLM failure modes

  • Hallucinated candidate IDs: every returned candidate_id is validated against the input set; rows with unknown IDs are dropped.
  • Out-of-enum severity: normalized to None before persistence (the schema enum should already block this, but we don't trust it).
  • Out-of-enum resolution_action: dropped from the persisted record. The contradiction itself stays — only the resolution suggestion is discarded.
  • Empty corpus: short-circuits before any LLM call (cost guard).
  • LLM error: caught and logged; record creation simply yields no contradictions for that item, the verification still lands, the session is still marked processed.

Alternatives considered

  • Keep N sequential calls but switch them to structured output. Half the gain. Latency and per-item cost stay linear in candidate count.
  • No SQL at all — pass the entire knowledge_items corpus to Haiku. Rejected — token cost grows with the corpus size and gets unbounded for large instances. Domain filter is a cheap hard ACL.
  • Per-domain shards via embedding cosine + Haiku judge. Tracked as a future V2 refinement when a domain corpus exceeds ~100 items (docs/TODO.md). Not needed for current scale.
  • Nested resolution object in the schema. Tested but flattened — Anthropic's strict structured output works most reliably with no nested optional objects; required-fields-with-null is more reproducible than anyOf/oneOf.

Trade-offs

  • Fail-open on the LLM. If Haiku says "no contradiction" for everything, contradictions silently never surface. The previous SQL keyword filter at least seeded obvious matches. Mitigation (deferred): log + alert when a busy domain shows zero contradictions for N consecutive new items.
  • No textual fallback. When the LLM is down, no contradictions are detected at all. Acceptable because the judge was already LLM-dependent — the only thing additionally lost is the candidate seed.
  • Prompt size growth. Prompt scales with same-domain corpus size. At ~100 items × ~50 tokens each the input is ~5k tokens — still cheap on Haiku 4.5. Sharding kicks in via DEFAULT_CANDIDATE_LIMIT = 100 if a domain blows past that.
  • Test rewrites. All contradiction-side tests previously asserted on the legacy single-pair {contradicts, …} schema and title_words parameter. They were rewritten to the batched {judgments: [{candidate_id, is_contradiction, severity, …}]} shape.

Schema-side impact

None. knowledge_contradictions.suggested_resolution is already TEXT; JSON-encoded dicts fit there without a migration.


Verification

Tests covering each decision live in:

  • Decision 1: tests/test_memory_api.py::TestPersonalItemPrivacy
  • Decision 2: tests/test_corporate_memory_v1.py::TestContradictionCandidateSqlNarrowing, TestDetectorWiresContradictionDetection
  • Decision 3: tests/test_corporate_memory_v1.py::TestDetectorIgnoresLLMConfidence, TestDetectorPersistsEvidence
  • Decision 4: tests/test_corporate_memory_v1.py::TestContradictionDetectionIntegration, TestBatchedContradictionFindAndJudge (single-call cost guarantee, hallucinated-id defense, mixed batch, severity normalization, resolution round-trip, legacy-string back-compat)

Run:

pytest tests/test_memory_api.py tests/test_corporate_memory_v1.py -v