Adds corporate memory v1 (verification flywheel + contradiction detection + confidence scoring) and v1.5 (audience-based distribution + per-item privacy + admin curation). Server: GET /api/memory/bundle returns mandatory + ranked-approved items within a token budget; POST /api/memory/admin/mandate accepts an audience field gated against user_group_members; /api/memory/stats uses SQL aggregation. CLI: da sync writes received items to .claude/rules/km_*.md. Verification detector extracts knowledge candidates from session JSONL files. Auto-tagging via Haiku when ai: is configured. Adapted from the v9-era branch onto v13/v14 RBAC: _is_privileged_viewer + _effective_groups now query user_group_members JOIN user_groups; require_role(Role.KM_ADMIN) replaced with require_admin (km_admin collapsed into admin). Schema v15: knowledge_items context-engineering columns + knowledge_contradictions + session_extraction_state. Schema v16: verification_evidence. Cuts release v0.15.0 (also bundles #116 /me/debug page).
27 KiB
Corporate Memory V1 — review notes (pd × ps)
Branch: pabu/local-dev (worktree review)
Date: 2026-04-25
Reviewers: pd (with Claude Code), independent second opinion from Codex (GPT-5.5, xhigh reasoning)
Scope: verification flywheel + corporate memory V1 — services/verification_detector/, services/corporate_memory/, app/api/memory.py, schema v7→v8.
This document captures open questions and proposed changes raised during a walkthrough of the branch. It is not a blocker list — it is a starting point for a follow-up conversation. Severity ratings are pd's initial estimate; please push back where they feel off.
Context
The branch implements a "verification flywheel": session JSONLs are scanned by verification_detector, an LLM extracts corrections / confirmations / unprompted definitions, and the resulting facts feed knowledge_items with calibrated confidence. Admin governance (HITL) gates everything before approval. New schema columns track provenance (source_type, source_ref, confidence, domain, entities, valid_from/until, supersedes, sensitivity, is_personal) and two new tables (knowledge_contradictions, session_extraction_state).
The mechanism is sound. The notes below are about edges where V1 calibration choices may not survive contact with real data, and one suspected access-control issue.
1. Knowledge-item deduplication
Problem
services/verification_detector/detector.py derives the item ID as:
id = "kv_" + sha256(f"{title}:{content}").hexdigest()[:12]
Two analysts independently surfacing the same fact will almost never produce identical title and content strings — LLM phrasing varies even on the same underlying claim. So this hash collides ~never in practice. It is a deterministic ID, not a deduplication mechanism.
The branch has no other dedup safeguard at create time. Practical result: the admin review queue will grow with near-duplicates that humans must triage.
Our proposal
Do not attempt auto-dedup at create time. Instead, surface near-duplicates into the admin queue via three layers:
- V1.5 — entity-tag match: items sharing ≥N normalized entities become a "likely duplicate" candidate pair, exposed in the admin UI alongside the existing contradictions tab.
- V2 — embedding similarity: at create time, compute cosine against existing approved items in the same domain; threshold (e.g. ≥ 0.85) → flag as
likely_duplicate_of: <id>for admin merge. - The contradiction detector already surfaces "soft contradictions" — these can be repurposed to also catch near-duplicates with a single LLM judge call (same infra, different prompt).
Reasoning: auto-merge at create time is risky (false positives bury new nuance under stale items). Admin queue spam is the lesser evil; embedding pre-filter at V2 keeps admin load bounded.
Codex second opinion
Codex run:
gpt-5.5,model_reasoning_effort = xhigh, executed against the same files in worktree.
- Verdict: Partial. Don't auto-merge at create time, but pre-check likely duplicates before inserting into the admin queue (don't punt entirely to admin manual review).
- Blind spot we missed: contradiction detection is not a duplicate detector, and in the current pipeline it is not wired into
verification_detector.run()at all. Items are created atdetector.py:178butdetect_and_record()is never called. The contradiction prompt explicitly says "specificity / different perspective is not a contradiction", so near-duplicates would not surface there even if it were wired. - Concrete alternative: add a relation table
knowledge_item_relations(item_a_id, item_b_id, relation_type, score, resolved)nearsrc/db.py:76. Addrepo.create_relation()nearsrc/repositories/knowledge.py:182. Indetector.py:171, run duplicate-candidate lookup (entity-overlap is fine for V1.5; embeddings later) and attachrelation_type='likely_duplicate'so the admin queue shows duplicate candidates explicitly. - Severity: medium. Must fix before broad historical backfill; acceptable for narrow V1 only if confirmations are limited and admins see explicit duplicate candidates.
- Confidence: 90%.
Plan (revised after Codex)
- V1: accept that duplicate hint is missing; flag this loudly in PR description so reviewers know what's deferred.
- V1.5: add
knowledge_item_relationstable + entity-overlap-based duplicate suggestion. Surface in admin UI alongside contradictions. ~1 day work. - V2: embedding similarity at create time, gated by domain.
Severity: medium (was: medium). Codex agrees on default but raises the bar for V1.5 — duplicate candidates should be surfaced explicitly, not discovered ad-hoc by admin.
2. Contradiction detection — synchronous + SQL pre-filter vs Anthropic Batch API
Current
services/corporate_memory/contradiction.py runs synchronously per new item:
- Pre-filter (DuckDB): find candidates with same
domain+ keyword match ontitle.split()words >3 chars. Limit 10. - LLM-as-judge (sync): Haiku prompt returns
{contradicts, explanation, severity, suggested_resolution}.
The pre-filter has predictable recall problems: synonyms, paraphrases, cross-domain conflicts (a finance metric definition can contradict a data engineering definition of the same metric).
Proposed migration to Batch API
Anthropic's Batch API offers ~50% cost reduction with async SLA (≤24h, typically <1h). This is attractive because contradiction detection does not need real-time response — admins review queues, not push notifications.
Our proposal
Layered evolution:
- V1: keep synchronous + SQL filter as-is. Ship.
- V1.5: add embedding-based pre-filter. Voyage embeddings at ~$0.02/1M tokens are essentially free at our volume. Replace keyword filter with
cosine(item, candidate) > 0.6per domain; LLM judge unchanged. Catches paraphrases the keyword filter misses. - V2: switch the LLM judge phase to Batch API. Run a nightly sweep over
pending × approvedper domain. With 50% cost reduction and higher rate limits, we can afford O(N²) within domain shards (no pre-filter needed in Batch mode — let the model judge all pairs).
Open questions
- Hidden Batch API costs beyond cost: dev experience (test cycles), observability (job tracking, retries), debug latency. Worth it?
- Hybrid mode: keep sync for high-priority sources (admin_mandate, user corrections) and batch for bulk (
session_transcript, low-confidence pending)? Or single mode for simplicity? - Embedding threshold: 0.6 is a guess. Calibrate against held-out labeled pairs from V1 data once we have them.
Codex second opinion
- Verdict: Partial. Batch API is likely a V2 win for bulk sweeps, but not as the only mode.
- Blind spot we missed (two of them, both severe):
- The SQL pre-filter is weaker than we described. It is not
domain AND keyword; it isdomain OR keyword—src/repositories/knowledge.py:287joins all conditions withOR. Combined withORDER BY updated_at DESC LIMIT 10, recent same-domain noise crowds out the actual conflict. - Detector-created items never invoke contradiction detection at all.
detect_and_record()exists inservices/corporate_memory/contradiction.pybut is not called fromservices/verification_detector/detector.py. So V1 contradiction governance is effectively a stub.
- The SQL pre-filter is weaker than we described. It is not
- Concrete alternative: first fix V1 before optimizing.
- In
src/repositories/knowledge.py:272, builddomain = ?as a top-level conjunct (AND) and the title/content keyword expansion as an innerORgroup. - In
services/verification_detector/detector.py:178, afterrepo.create(), callcontradiction.detect_and_record(extractor, item_dict, repo). Or enqueue it for nightly batch — but it must run somewhere. - For V2, use one shared job model with two executors: sync for admin/manual/high-priority items, batch for session backfills and nightly sweeps. Don't fork prompt/candidate logic.
- In
- On embeddings threshold:
cosine > 0.6is arbitrary. Calibrate using labeled pairs (duplicate / contradiction / related-but-compatible / unrelated). Optimize for recall before LLM judging — target>95% recallwith bounded candidate count. Use top-k plus threshold, not threshold alone. - Severity: high. If V1 claims contradiction governance, this needs fixing before merge — currently the feature is shipped but inert.
- Confidence: 92%.
Plan (revised after Codex)
- V1 must-fix (was missed):
- Fix
OR→AND domain + (OR keywords)infind_contradiction_candidates. - Wire
detect_and_record()fromverification_detector.detector.run(). Or, if we don't want sync LLM cost in detector, enqueue items into apending_contradiction_checktable for a nightly batch. - Either of those, or remove the V1 claim that contradictions are surfaced.
- Fix
- V1.5: embedding pre-filter (top-k + threshold, calibrated against labeled pairs). Single shared job model.
- V2: batch executor for nightly sweeps; sync executor for admin/manual; same prompt+candidate logic.
Severity: high (was: low for V1). Codex's nailed two bugs we missed entirely.
3. Confidence scoring — calibration, decay, feedback
Current
services/corporate_memory/confidence.py is a hand-rolled lookup:
_BASE_CONFIDENCE: hard-coded dict, e.g.user_verification.correction = 0.90,admin_mandate = 1.00,claude_local_md = 0.50._MODIFIER_EFFECTS: hard-coded modifiers (+0.05per additional verifier,+0.20for admin confirmation).apply_decay(): linear0.02per month → everything reaches 0 at ~50 months including admin policies.
Three problems:
- Tuning requires deploy. The numbers are guesses; in real use we will discover that, say,
user_verification.confirmation(currently 0.60) is overoptimistic. Each iteration = code change + deploy. - Linear decay is wrong for admin policies.
admin_mandate = 1.00decaying linearly to 0 in 50 months is sematically incorrect — admin policies are policies, not "aging facts". They are explicitly revoked, not slowly forgotten. - No feedback from admin actions. When an admin rejects an item, that signal is lost — we don't update the prior for
(source_type, detection_type)based on observed approval rates.
Our proposal
Three layers:
A. Externalize all numbers to instance.yaml (V1.5, mandatory):
corporate_memory:
confidence:
base:
user_verification.correction: 0.90
user_verification.confirmation: 0.60
user_verification.unprompted_definition: 0.90
admin_mandate: 1.00
claude_local_md: 0.50
session_transcript: 0.50
modifiers:
additional_verifiers_per_user: 0.05
admin_confirmed_bonus: 0.20
decay:
mode: exponential # or "linear" for back-compat
half_life_months: 12 # confidence halves every 12 months
floor:
admin_mandate: 0.50 # admin items never decay below 0.5
user_verification.correction: 0.10
claude_local_md: 0.0
session_transcript: 0.0
B. Switch to exponential decay with per-source-type floor (V1.5):
def apply_decay(confidence, created_at, source_type, half_life_months=12, floor_per_source=None):
age_months = ...
decayed = confidence * (0.5 ** (age_months / half_life_months))
floor = (floor_per_source or {}).get(source_type, 0.0)
return max(decayed, floor)
C. Bayesian prior calibration from admin actions (V2):
Nightly: compute P(approve | source_type, detection_type) over the last N items per category. Surface as a metric in admin UI. Initially: human-edited config update. Later: gated auto-update with selection-bias mitigation (random sampling holdout).
Open questions
- Power-law vs exponential decay. Ebbinghaus forgetting curve research suggests power-law may match human knowledge half-life better. Does that translate to organizational knowledge? Probably yes for "memorized facts", probably no for "policies that are explicitly maintained".
- Per-domain calibration. Finance metrics churn quarterly; engineering conventions persist for years. Should
decay.half_life_monthsbe per-domain, not just per-source-type? - Selection bias in feedback loop. If priors gate which items admins see (sorted by confidence), and we update priors from approval rate, low-confidence items rarely get reviewed → priors stay biased. Mitigation: reserve a small random sample (e.g. 10%) outside the priority queue for unbiased measurement.
- No "post-create confirmation" mechanism. When a new verification cites an existing approved item, only the new item gets confidence; the existing item doesn't receive a verification-count boost. This loses the "multiple analysts independently cited this" signal over time.
Codex second opinion
- Verdict: Partial. External config + floors are good. The decay framing is wrong, and there are bigger V1 bugs in the data flow.
- Blind spot we missed (significant):
- LLM-controlled confidence.
services/verification_detector/prompts.py:37asks the LLM to returnbase_confidenceas part of the JSON output, andservices/verification_detector/detector.py:187stores it directly into the DB. Confidence should not be LLM-controlled. The currentcompute_confidence()exists but is bypassed for verification-detector items. - Lost evidence. The detector extracts
user_quotefrom the LLM (the exact quote that constitutes the verification — the most valuable signal!) and discards it. It is not stored anywhere. Withoutuser_quoteanddetection_typepersisted in DB, future Bayesian re-calibration has no raw material. - Decay framing is wrong. Confidence is being treated as "truth decays with age" (Ebbinghaus). Organizational facts usually change by events, scope, or validity windows — not by smooth forgetting. The
valid_from/valid_untilcolumns already exist; the right model is "fact has a validity window", not "fact slowly fades".
- LLM-controlled confidence.
- Concrete alternative:
- Remove
base_confidencefrom the LLM-required output inservices/verification_detector/schemas.py:30, or ignore it on the read side. Compute confidence in code:compute_confidence("user_verification", v["detection_type"]). - Add an evidence table:
verification_evidence(id, item_id, source_user, source_ref, detection_type, user_quote, created_at). Persistuser_quoteanddetection_typeper-verification. Multiple evidence rows per item enables real "additional verifiers" boost computation (one user × one quote = one evidence row). - Split "evidence confidence" (signal strength from sources) from "freshness / review-due" (validity windows, explicit revocation). Don't conflate them in one number.
- Remove
- On power-law vs exponential: secondary issue. Domain-specific volatility matters more — use hierarchical priors: global
(source_type, detection_type)priors plus per-domain freshness policy. - On feedback loops: mitigate via random review sampling, holdout calibration sets, and never use learned priors as the sole gate for admin visibility. Codex agrees with our concern but emphasizes random sampling more strongly.
- Severity: medium. Bayesian/config overhaul can wait, but LLM-controlled confidence and missing evidence should be fixed before V1.
- Confidence: 88%.
Plan (revised after Codex)
- V1 must-fix (was missed):
- Stop trusting LLM-returned
base_confidence. Either drop it fromVERIFICATION_SCHEMAor ignore it on insert and callcompute_confidence(...)instead. - Add
verification_evidencetable to persistuser_quote+detection_type+ source reference. This is also what makes "multi-user verification boost" actually computable post-create.
- Stop trusting LLM-returned
- V1.5: externalize numbers to
instance.yaml(A) + exponential decay with per-source floor (B). - V2: Bayesian metric surface (read-only) + per-domain decay policy. Auto-update behind feature flag with random-sample holdout.
Severity: high for V1 must-fix items above; medium for the rest.
4. Entity resolution v1
Problem
services/corporate_memory/entities.py does case-insensitive substring match against a static registry. Two consequences:
- False positives on short tokens. A registry containing
"MD"matches every occurrence of"markdown","command","admin","medical". Severe noise. - False negatives on synonyms.
"churn"does not match"attrition","customer loss","logo churn". Severe miss. - No lemmatization, no co-occurrence, no confidence on the match itself.
Our proposal
Three layers:
1. V1.5 — word-boundary regex + canonical+aliases registry:
corporate_memory:
entities:
metrics:
- canonical: churn
aliases: [attrition, customer loss, logo churn]
- canonical: MRR
aliases: [monthly recurring revenue, monthly revenue]
import re
def resolve_entities(content, title, registry):
text = f"{title} {content}"
matched = set()
for category, entries in registry.items():
for entry in entries:
patterns = [entry["canonical"]] + entry.get("aliases", [])
for p in patterns:
if re.search(rf"\b{re.escape(p)}\b", text, re.IGNORECASE):
matched.add(entry["canonical"]) # always store canonical
break
return sorted(matched)
Cost: ~30 LOC, no new dependencies, deterministic, eliminates false positives on short tokens.
2. V2 — embedding-based fuzzy match. Pre-compute embeddings for all canonical entities (cache). Per item: embed text, cosine to each entity embedding, threshold 0.75. Catches paraphrases the alias list misses. Voyage cost is negligible.
3. V2.5 — LLM extraction for low-entity items, batch mode. Items with <2 resolved entities go into a nightly batch LLM job that suggests candidates for admin curation. This is also how the synonym registry grows over time without manual curation.
Open questions
- Skip V1.5, go straight to embeddings? Argument for: per-item cost is ~$0.0000002, negligible. Argument against: word-boundary fix is so cheap (~30 LOC) it's a no-brainer; embeddings pull in voyage_sdk + caching infrastructure that we don't need yet.
- Synonym registry maintenance. Who curates it? Does it become a Big Ball Of Mud? Mitigation: V2.5 LLM-suggest pipeline auto-grows it; admin reviews additions.
- Hierarchical entities. "EMEA Sales" is a sub-team of "Sales". "MRR Q3" is an instance of "MRR". Layered approach doesn't model this. Do we need it for V1? Probably not. For V2? Maybe — depends on how often analysts ask drill-down questions.
Codex second opinion
- Verdict: Partial. Do not skip V1.5. Embeddings are not a replacement for deterministic entity resolution.
- Blind spot we missed: embeddings are bad at exact short entities, acronyms, product codes, metric names, and aliases where precision matters (e.g.
MRR,NPS, internal product codes). Substring → embeddings would trade one class of false positives for a different class of silent errors that are harder to detect. Word-boundary regex with a maintained alias list is more correct than embeddings for these cases, not just cheaper. - Concrete alternative: rebuild
entities.py:14to use a typed registry with stable canonical IDs:
Replace substring matching atmetrics: - id: metric.churn canonical: churn aliases: [attrition, customer loss, logo churn] kind: metric parent_id: null blocked_terms: [] - id: org.team.sales.emea canonical: EMEA Sales kind: team parent_id: org.team.salesentities.py:57with escaped word-boundary regex, with special rules for short aliases (e.g. require leading + trailing whitespace or punctuation). Store canonical IDs (metric.churn) inknowledge_items.entities, not display strings — display is a join. Use embeddings/LLM only to suggest aliases or candidates for admin approval, never to bypass deterministic resolution. - On hierarchies: worth modeling minimally now.
parent_idis sufficient.EMEA Salesrolls up toSales.MRR Q3is "metric + time period", not a separate flat entity — model it asentity_ref + valid_from/valid_until. - Severity: medium. Not a V1 blocker by itself, but definitely required before V1.5 if dedup or contradiction depend on entities (which our V1.5 plan does).
- Confidence: 86%.
Plan (revised after Codex)
- V1: ship as-is, accept noise.
- V1.5: typed registry with canonical IDs, word-boundary regex, store IDs not strings, support
parent_idfor hierarchies. Use this for duplicate-candidate detection (Q1 V1.5 plan). ~2 days now (more scope than originally planned). - V2+: embeddings + LLM-suggest pipeline for alias growth, not entity resolution. Admin approves auto-suggested aliases.
Severity: medium. Codex sharpened the design — store canonical IDs, not strings; hierarchy is V1.5, not V2.
5. is_personal flag — suspected leak
Code paths
app/api/memory.py:
POST /{item_id}/personal(line ~196) — only the contributor (item.source_user == user.email) can flag. Correct.GET /api/memory(line ~63) — accepts query parameterexclude_personal: bool = True. Default excludes personal items, but the endpoint acceptsexclude_personal=Falsefrom any authenticated user with no role check.GET /api/memory/my-contributions— returns contributor's own items including personal ones. Correct.
Concern
Any authenticated user can request:
GET /api/memory?exclude_personal=false
…and receive items flagged is_personal=true by other users. The flag is meant to keep an item private to the contributor, but the list endpoint exposes it whenever the caller asks.
Impact
is_personal was introduced for emergency-exit cases (the detector pulled out something private that shouldn't be team-wide). If any user can override the default, the flag provides false security. This is a confidentiality leak unless is_personal is purely an "admin convenience flag" — but that's not how the toggle UI presents it.
Proposed fix
Two options:
Option A — kill the override entirely. Remove the exclude_personal query parameter. Personal items are never visible via GET /api/memory. Contributors see them only via /my-contributions. Admins see them via a separate admin endpoint guarded by Role.KM_ADMIN.
Option B — gate the override by role. Keep the parameter, but require Role.KM_ADMIN to set it to False. Add a server-side check; non-admin requests with exclude_personal=False get 403 or are silently coerced to True.
Option A is simpler, safer, and matches the most likely product intent. Option B preserves a flexible audit endpoint for admins but is easier to misuse.
Codex second opinion
- Verdict: Y. This is a leak.
- Blind spot we missed (worse than the list path alone):
- The public list endpoint passes caller-controlled
exclude_personalintorepo.list_items()atapp/api/memory.py:87, andrepo.list_items()only filters when that flag is true atsrc/repositories/knowledge.py:113— caller controls it. - Search ignores
exclude_personalentirely viaapp/api/memory.py:78andsrc/repositories/knowledge.py:119. Any authenticated user canGET /api/memory?search=fooand see personal items unconditionally. - Direct item access (
/{id},/{id}/provenance, vote endpoints) has nois_personalcheck either. If a user knows or guesses an item ID, they retrieve it.
- The public list endpoint passes caller-controlled
- Concrete alternative:
- For the public
GET /api/memory, forceexclude_personal=Trueunlessuser.role in ("km_admin", "admin"). Or simpler: always force true and leave personal review to admin endpoints. - Add
exclude_personalsupport torepo.search()(currently it's only onlist_items()). - Add a shared
_can_view_item(user, item)check used by provenance, vote, and direct item actions. Personal items are visible only to the contributor + admins.
- For the public
- Severity: high. Must fix before merging V1.
- Confidence: 99%.
Plan (revised after Codex)
- V1 blocker. Fix all three vectors (list, search, direct access), not just the list parameter.
- Force
exclude_personal=Truefor non-admins onGET /api/memory. - Plumb
exclude_personalthroughrepo.search(). - Add
_can_view_item(user, item)helper, apply on/{id},/{id}/provenance, and any other direct-item endpoint.
- Force
- Add regression test that an authenticated non-contributor cannot retrieve another user's
is_personalitem via list, search, or direct GET.
Severity: high.
Top issues to address before merging V1
(Initial pd list was three items. Codex review added two more high-severity findings we missed entirely. Updated list below.)
is_personalleak in full breadth (Q5). Fix list, search, and direct-item access. Forceexclude_personal=Truefor non-admins onGET /api/memory; plumb the filter throughrepo.search(); add_can_view_item(user, item)for/{id},/{id}/provenance, vote endpoints. Add regression test. Severity: high. Confidence: 99%.- Contradiction detection is shipped but inert (Q2).
detect_and_record()is never called fromverification_detector.run(). Either wire it (sync afterrepo.create()or enqueue for batch) or remove the V1 claim that contradictions are surfaced. Also fix theOR → ANDbug in the SQL pre-filter atsrc/repositories/knowledge.py:272-287sodomainis a top-level conjunct and keywords are an innerORgroup. Severity: high. - LLM-controlled confidence + lost evidence (Q3). Detector trusts
base_confidencefrom the LLM and discardsuser_quote. Dropbase_confidencefromVERIFICATION_SCHEMA(or ignore on insert), callcompute_confidence(...)in code instead. Addverification_evidencetable to persistuser_quote,detection_type,source_user,source_refper evidence row. Without this, "additional verifiers boost" is computable in theory only. Severity: high.
The rest (Q1 explicit duplicate hint, Q2 batch sweep, Q3 YAML externalize + exponential decay, Q4 typed registry) can land in V1.5 as planned.
What V1.5 must own (consequential refinements, not blockers)
- Q1:
knowledge_item_relationstable for explicit duplicate candidate hints in admin queue. - Q2: embedding pre-filter (top-k + threshold, calibrated against labeled pairs).
- Q3: externalize confidence to
instance.yaml; exponential decay with per-source floor. - Q4: typed entity registry with canonical IDs, word-boundary regex, hierarchical
parent_id.
Process notes
- Worktree:
/Users/padak/github/agnes-pabu-local-dev, branchpabu/local-devtracksorigin/pabu/local-dev. - Walkthrough done file-by-file with paired reading; second-opinion run via Codex CLI (model
gpt-5.5,model_reasoning_effort = xhigh) over the same files. - Document is intended for round-trip discussion: pd commits the first pass, ps reads, replies inline or in PR comments.