feat(admin-ui): /admin/server-config known-fields registry + structured nested editor

Today /admin/server-config renders fields by iterating Object.keys(payload) on the YAML value — if a key isn't in instance.yaml, the operator can't see it. They have to know to type it via the JSON-patch textarea (which only renders for empty sections) or SSH and edit YAML.

Adds a known-fields registry (`_KNOWN_FIELDS` in app/api/admin.py) the UI consumes alongside the YAML payload. Renderer shows BOTH:
  - existing fields (from YAML) with current value
  - known-but-unset fields with dashed-border placeholder + hint, ready to fill in

Renderer (`renderField`, `renderSection`, `collectSection`):
  - kind="string"|"secret"|"bool"|"int"|"select"|"object"|"array"|"map" — picks input type
  - kind="object" with `fields` — recursive structured form, arbitrary depth (corporate_memory needs 3-4 levels)
  - kind="array" with `item_kind` — vertical stack of typed inputs + add/remove buttons
  - kind="map" with `key_kind` + `value_kind` — key:value rows + add/remove (used for confidence.base, domain_owners, entity_resolution.entities)
  - data-path encoded as JSON segment array so map keys with embedded dots (e.g. 'user_verification.correction') survive collect → patch round-trip
  - .cfg-field.is-unset CSS — dashed border, muted label, italic hint

Sections newly exposed (added to _EDITABLE_SECTIONS):
  - openmetadata: url, token (secret), cache_ttl_seconds, verify_ssl
  - desktop: jwt_issuer, jwt_secret (secret), url_scheme

Known fields populated for existing sections:
  - data_source.bigquery: billing_project (the cause of the 403 USER_PROJECT_DENIED footgun when SA can read but not bill the data project), legacy_wrap_views (bigquery_query() wrap for VIEWs — issue #101 default off, ON for view-heavy deployments), max_bytes_per_materialize (cost guardrail)
  - data_source.keboola: stack_url, project_id (hints; values already populated)
  - ai: base_url (required for openai_compat), structured_output (select)
  - corporate_memory: full schema from instance.yaml.example — distribution_mode, approval_mode, review_period_months, notify_on_new_items, sources.{claude_local_md,session_transcripts}, extraction.{model,sensitivity_check,contradiction_check}, confidence.{base,modifiers,decay.{mode,half_life_months,decay_rate_monthly,floor}}, contradiction_detection.{enabled,max_candidates}, entity_resolution.{enabled,entities}, domain_owners, domains
  - Known partial: confidence.modifiers is map<string, map<string, float>> — falls through to JSON-textarea with TODO; structured editor for that one shape needs more renderer work

Tests:
  - test_admin_server_config_known_fields — registry envelope shape, smoke fixture
  - test_admin_server_config_renderer_depth — 4-level nested objects, arrays of strings, maps of floats, dotted-key safety
  - test_admin_server_config_corp_memory — full corporate_memory schema, 12 fields incl. nested
  - test_admin_server_config — existing tests adjusted for new shape
This commit is contained in:
ZdenekSrotyr 2026-05-01 20:27:01 +02:00
parent c63f54d643
commit df7f5b1d9a
5 changed files with 1569 additions and 47 deletions

View file

@ -108,6 +108,33 @@
.cfg-loading { padding: 32px 16px; text-align: center; color: var(--text-secondary, #6b7280); font-size: 13px; }
/* Known-but-unset fields (sourced from the known_fields registry) — render
dashed and de-emphasised so the operator sees "this is a knob you can
turn" without confusing it with a populated value. */
.cfg-field.is-unset label { color: var(--text-secondary, #9ca3af); }
.cfg-field.is-unset input[type="text"],
.cfg-field.is-unset input[type="password"],
.cfg-field.is-unset input[type="number"],
.cfg-field.is-unset select,
.cfg-field.is-unset textarea {
border-style: dashed;
background: var(--background, #fafafa);
}
.cfg-field.is-unset .field-help { font-style: italic; }
.cfg-divider {
border: 0;
border-top: 1px dashed var(--border, #e5e7eb);
margin: 12px 0;
}
.cfg-divider-label {
display: block;
font-size: 11px;
color: var(--text-secondary, #9ca3af);
margin-bottom: 8px;
text-transform: uppercase;
letter-spacing: 0.05em;
}
/* Confirmation modal — danger-zone gate */
.modal-backdrop {
position: fixed; inset: 0; background: rgba(15, 23, 42, 0.55);
@ -177,7 +204,7 @@ function isSecretKey(key) {
// Section copy — kept short; the issue's Scope section explains the rest.
const SECTION_META = {
instance: { title: "Instance", help: "Branding shown in the header and emails." },
data_source: { title: "Data source", help: "Switch source type or update connection details." },
data_source: { title: "Data source", help: "Switch source type or update connection details. Optional BQ + Keboola knobs render below as structured fields with hints; expand each to edit." },
email: { title: "Email (SMTP)", help: "SMTP relay for magic-link login. Leave blank to disable." },
telegram: { title: "Telegram", help: "Bot credentials for notifications." },
jira: { title: "Jira", help: "Jira webhook + REST credentials." },
@ -185,6 +212,12 @@ const SECTION_META = {
server: { title: "Server", help: "Hostname and host. Changing these can break OAuth callbacks." },
auth: { title: "Authentication", help: "Allowed sign-in domain and Google OAuth keys. Misconfiguration can lock everyone out." },
ai: { title: "AI / LLM", help: "Provider + API key for the corporate-memory extractor. provider=anthropic|openai_compat; api_key uses ${ENV_VAR} so the secret stays in .env." },
openmetadata: { title: "OpenMetadata", help: "Optional REST catalog enrichment. Without it, the app runs without catalog cross-links." },
desktop: { title: "Desktop app", help: "JWT auth for the desktop client (rarely changed)." },
corporate_memory: {
title: "Corporate Memory",
help: "Optional governance for AI-extracted knowledge. When the section is unset, the system runs in legacy democratic-wiki mode with no admin review.",
},
};
const DANGER_SECTIONS = new Set(["auth", "server"]);
@ -224,50 +257,407 @@ function escHtml(s) {
return d.innerHTML.replace(/"/g, "&quot;").replace(/'/g, "&#39;");
}
function renderField(section, key, value) {
const fieldId = `f_${section}_${key.replace(/\W/g, "_")}`;
const isSecret = isSecretKey(key);
// Determine input type — secrets render as password, ports as number,
// long strings (>120 chars) as textarea, the rest as text.
let input;
if (isSecret) {
input = `<input id="${fieldId}" type="password" class="is-secret" data-section="${section}" data-key="${escHtml(key)}" placeholder="${value === '<empty>' ? 'unset — type to set' : '*** — type to overwrite'}" autocomplete="off">`;
} else if (typeof value === "number") {
input = `<input id="${fieldId}" type="number" data-section="${section}" data-key="${escHtml(key)}" value="${escHtml(value)}">`;
} else if (typeof value === "boolean") {
input = `<select id="${fieldId}" data-section="${section}" data-key="${escHtml(key)}" data-cast="bool">
<option value="true" ${value ? "selected" : ""}>true</option>
<option value="false" ${!value ? "selected" : ""}>false</option>
</select>`;
} else if (value && typeof value === "object") {
// Nested object (e.g. data_source.keboola.{...}) — render the JSON
// for now and let the operator edit it as a blob. Keeps the UI simple
// while still allowing every field to be reachable.
input = `<textarea id="${fieldId}" data-section="${section}" data-key="${escHtml(key)}" data-cast="json">${escHtml(JSON.stringify(value, null, 2))}</textarea>`;
} else {
const v = value == null ? "" : value;
input = `<input id="${fieldId}" type="text" data-section="${section}" data-key="${escHtml(key)}" value="${escHtml(v)}">`;
// Encode a segment array as a JSON-string suitable for an HTML attribute.
// We store the path as JSON rather than dot-joined so that map keys (which
// are user-supplied data and can themselves contain '.', e.g.
// "user_verification.correction" in confidence.base) round-trip intact —
// splitting `data-key` on '.' would shred them into bogus extra segments.
function encodePath(segments) {
return escHtml(JSON.stringify(segments || []));
}
// Build a basic <input>/<select>/<textarea> for a leaf field. Returns the
// HTML for the input element only — the wrapping <div class="cfg-field">
// + label + hint is added by the caller.
//
// `pathSegments` is the array of registry path segments down to this leaf
// (e.g. ["bigquery", "billing_project"]). It's emitted as a JSON-encoded
// `data-path` attribute that the collector reads to rebuild the nested
// patch shape — bypassing the old dotted-string-splitting which would
// mis-parse map keys with embedded dots.
//
// `dottedKey` is kept for backward-compat / debugging; collectSection
// prefers data-path when present.
function renderLeafInput(fieldId, section, pathSegments, kind, value, opts, isUnset) {
const dottedKey = (pathSegments || []).join(".");
const dataPath = encodePath(pathSegments);
const leafKey = pathSegments && pathSegments.length ? pathSegments[pathSegments.length - 1] : "";
const isSecret = isSecretKey(String(leafKey)) || kind === "secret";
if (kind === "secret") {
const ph = isUnset
? "unset — type to set"
: (value === "<empty>" ? "unset — type to set" : "*** — type to overwrite");
return `<input id="${fieldId}" type="password" class="is-secret" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}" placeholder="${escHtml(ph)}" autocomplete="off">`;
}
const secretPill = isSecret ? '<span class="secret-pill">secret</span>' : "";
if (kind === "int") {
const v = (value == null || value === "") ? "" : value;
return `<input id="${fieldId}" type="number" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}" value="${escHtml(v)}">`;
}
if (kind === "float") {
const v = (value == null || value === "") ? "" : value;
return `<input id="${fieldId}" type="number" step="any" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}" data-cast="float" value="${escHtml(v)}">`;
}
if (kind === "bool") {
const v = !!value;
return `<select id="${fieldId}" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}" data-cast="bool">
<option value="true" ${v ? "selected" : ""}>true</option>
<option value="false" ${!v ? "selected" : ""}>false</option>
</select>`;
}
if (kind === "select" && Array.isArray(opts && opts.spec && opts.spec.options)) {
const sel = value == null ? "" : String(value);
const options = opts.spec.options.map(o => {
const ov = String(o);
return `<option value="${escHtml(ov)}" ${sel === ov ? "selected" : ""}>${escHtml(ov)}</option>`;
}).join("");
return `<select id="${fieldId}" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}">${options}</select>`;
}
// Default: text. Use the registry's default when unset, else the value.
const v = isUnset
? (opts && opts.spec && opts.spec.default != null ? String(opts.spec.default) : "")
: (value == null ? "" : value);
return `<input id="${fieldId}" type="text" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}" value="${escHtml(v)}">`;
}
// Cast a string raw value to the JS type implied by an item_kind / value_kind.
// Used by the array-of-scalars + map-of-scalars renderers when reading user
// input back out into a structured patch.
function castScalar(raw, kind) {
if (raw === "" || raw == null) return null;
if (kind === "int") {
const n = Number(raw);
return Number.isFinite(n) ? Math.trunc(n) : null;
}
if (kind === "float") {
const n = Number(raw);
return Number.isFinite(n) ? n : null;
}
if (kind === "bool") {
return raw === "true" || raw === true;
}
return String(raw);
}
// Render an array of scalars (e.g. detection_types: ["correction", ...]).
// Produces a vertical stack of text inputs, one per item, plus an add/remove
// affordance per row and a trailing "+ Add" button. The container's
// data-array-collect path collects each row's value at save time.
function renderArrayField(section, pathSegments, label, value, spec, depth) {
spec = spec || {};
const indent = (depth || 0) * 24;
const itemKind = spec.item_kind || "string";
const items = Array.isArray(value) ? value
: (value === undefined && Array.isArray(spec.default) ? spec.default : []);
const dataPath = encodePath(pathSegments);
const dottedKey = (pathSegments || []).join(".");
const arrow = depth > 0 ? "↳ " : "";
const hintBlock = spec.hint
? `<div class="field-help">${escHtml(spec.hint)}</div>`
: "";
// `data-array-collect="1"` marks the wrapper so collectSection can pick
// it up as a single unit (otherwise the per-row inputs would each emit
// their own patch leaf and clobber each other).
const rows = items.map((item, idx) => `
<div class="array-row" data-array-row="${idx}" style="display: flex; gap: 6px; margin-bottom: 4px;">
<input type="text" class="array-item-input" data-array-item="${idx}" value="${escHtml(item == null ? "" : String(item))}" style="flex: 1;">
<button type="button" class="cfg-btn" data-array-remove="${idx}" title="Remove this item">×</button>
</div>`).join("");
return `
<div class="cfg-field">
<label for="${fieldId}">${escHtml(key)}${secretPill}</label>
<div>${input}</div>
<div class="cfg-field nested-field" style="margin-left: ${indent}px;">
<label>${arrow}${escHtml(label)}</label>
<div>
<div class="array-container"
data-section="${section}"
data-key="${escHtml(dottedKey)}"
data-path="${dataPath}"
data-array-collect="1"
data-item-kind="${escHtml(itemKind)}">
<div class="array-rows">${rows}</div>
<button type="button" class="cfg-btn" data-array-add="1" data-item-kind="${escHtml(itemKind)}">+ Add item</button>
</div>
${hintBlock}
</div>
</div>`;
}
function renderSection(section, payload) {
// Render a map of string → scalar/array/object (e.g. confidence.base:
// {"user_verification.correction": 0.9, ...}). Produces a vertical stack
// of <key-input>: <value-input> rows plus a "+ Add row" button. Map keys
// are user-supplied data and may contain dots — we never reuse them as
// path segments at collect time; instead they become the *final* path
// segment of each row, JSON-encoded so collectors don't split them.
function renderMapField(section, pathSegments, label, value, spec, depth) {
spec = spec || {};
const indent = (depth || 0) * 24;
const valueKind = spec.value_kind || "string";
const valueItemKind = spec.value_item_kind || "string"; // for value_kind="array"
// Use registry default only when the value is genuinely missing (undefined).
// An explicit empty {} from YAML must not get backfilled with the example default.
const obj = (value && typeof value === "object" && !Array.isArray(value))
? value
: (value === undefined && spec.default && typeof spec.default === "object" ? spec.default : {});
const dataPath = encodePath(pathSegments);
const dottedKey = (pathSegments || []).join(".");
const arrow = depth > 0 ? "↳ " : "";
const hintBlock = spec.hint
? `<div class="field-help">${escHtml(spec.hint)}</div>`
: "";
const renderRow = (k, v, idx) => {
if (valueKind === "array") {
// Map<string, array<scalar>> — value column is itself a comma-separated
// text input. Operator can edit the list inline; collectSection splits
// on commas. (Full nested array UI inside a map row would require more
// wiring; comma-list is the pragmatic compromise.)
const items = Array.isArray(v) ? v.join(", ") : "";
return `
<div class="map-row" data-map-row="${idx}" style="display: grid; grid-template-columns: minmax(160px, 1fr) 2fr auto; gap: 6px; margin-bottom: 4px;">
<input type="text" class="map-key-input" data-map-key="${idx}" value="${escHtml(String(k))}" placeholder="key">
<input type="text" class="map-value-input" data-map-value="${idx}" value="${escHtml(items)}" placeholder="comma,separated,values">
<button type="button" class="cfg-btn" data-map-remove="${idx}" title="Remove this row">×</button>
</div>`;
}
// Scalar value column.
const inputType = (valueKind === "int" || valueKind === "float") ? "number" : "text";
const stepAttr = valueKind === "float" ? ' step="any"' : "";
return `
<div class="map-row" data-map-row="${idx}" style="display: grid; grid-template-columns: minmax(160px, 1fr) 1fr auto; gap: 6px; margin-bottom: 4px;">
<input type="text" class="map-key-input" data-map-key="${idx}" value="${escHtml(String(k))}" placeholder="key">
<input type="${inputType}"${stepAttr} class="map-value-input" data-map-value="${idx}" value="${escHtml(v == null ? "" : String(v))}" placeholder="value">
<button type="button" class="cfg-btn" data-map-remove="${idx}" title="Remove this row">×</button>
</div>`;
};
const rows = Object.entries(obj).map(([k, v], idx) => renderRow(k, v, idx)).join("");
return `
<div class="cfg-field nested-field" style="margin-left: ${indent}px;">
<label>${arrow}${escHtml(label)}</label>
<div>
<div class="map-container"
data-section="${section}"
data-key="${escHtml(dottedKey)}"
data-path="${dataPath}"
data-map-collect="1"
data-value-kind="${escHtml(valueKind)}"
data-value-item-kind="${escHtml(valueItemKind)}">
<div class="map-rows">${rows}</div>
<button type="button" class="cfg-btn" data-map-add="1" data-value-kind="${escHtml(valueKind)}">+ Add row</button>
</div>
${hintBlock}
</div>
</div>`;
}
// Render a single nested subfield row recursively. Each leaf input carries
// `data-path` (JSON-encoded segment array) so collectSection can rebuild
// the nested patch shape without splitting on '.' — important for map keys
// that themselves contain dots (e.g. confidence.base keys like
// "user_verification.correction").
//
// Recursion: arbitrary depth supported. When a child spec has kind="object"
// with its own `fields` map, we recurse with the indent bumped up. The depth
// bound is implicit (browser stack); registries with ridiculous depth would
// blow up, but the entries we ship max out around 4 levels (corporate_memory
// in subagent 3) which is comfortably within budget.
//
// `pathSegments` — array of registry path segments down to this field (e.g.
// ["bigquery", "billing_project"]). Used both for the rendered data-path
// attribute and to derive the legacy dotted key for back-compat.
function renderNestedField(section, pathSegments, label, value, spec, depth) {
spec = spec || {};
const segs = Array.isArray(pathSegments) ? pathSegments : [pathSegments];
const dottedKey = segs.join(".");
const indent = (depth || 0) * 24;
const kind = spec.kind || (
Array.isArray(value) ? "array"
: typeof value === "number" ? "int"
: typeof value === "boolean" ? "bool"
: (value && typeof value === "object") ? "object"
: "string"
);
const isSecret = isSecretKey(label) || kind === "secret";
const isUnset = (value === undefined);
const fieldId = `f_${section}_${dottedKey.replace(/\W/g, "_")}`;
const wrapperClass = "cfg-field nested-field" + (isUnset ? " is-unset" : "");
const arrow = depth > 0 ? "↳ " : "";
const secretPill = isSecret ? '<span class="secret-pill">secret</span>' : "";
const hintBlock = spec.hint
? `<div class="field-help">${escHtml(spec.hint)}</div>`
: "";
// Array-of-scalars: dedicated stack-of-inputs renderer.
if (kind === "array" && spec.item_kind && spec.item_kind !== "object") {
return renderArrayField(section, segs, label, value, spec, depth);
}
// Map<string, >: dedicated key/value-row renderer. Handles map of scalars,
// map of arrays, and (with a JSON-textarea fallback) map of complex objects.
if (kind === "map") {
if (spec.value_kind === "object" && spec.value_fields && Object.keys(spec.value_fields).length > 0) {
// TODO: structured editor for "map of objects with declared subfields"
// (e.g. confidence.modifiers — Map<string, Map<string, float>>).
// Falls through to the JSON-textarea fallback below for now.
} else {
return renderMapField(section, segs, label, value, spec, depth);
}
}
// Registry-declared object with explicit fields → recurse for each child
// as a structured form; emit a header row for the parent.
if (kind === "object" && spec.fields && typeof spec.fields === "object") {
const childValue = (value && typeof value === "object" && !Array.isArray(value)) ? value : {};
const knownChildKeys = Object.keys(spec.fields);
const knownSet = new Set(knownChildKeys);
const populatedChildKeys = Object.keys(childValue).filter(k => knownSet.has(k));
const unsetChildKeys = knownChildKeys.filter(k => !(k in childValue));
// YAML-only keys that aren't in the registry — preserve via a small JSON
// expander so admins who hand-edited an unusual key in the YAML don't
// lose it on round-trip. Keys are still editable as a single JSON blob
// (deliberately less prominent than registry-known leaves).
const fallbackKeys = Object.keys(childValue).filter(k => !knownSet.has(k));
const fallbackBlob = fallbackKeys.length
? Object.fromEntries(fallbackKeys.map(k => [k, childValue[k]]))
: null;
const renderChild = (k) => renderNestedField(
section,
segs.concat([k]),
k,
(k in childValue) ? childValue[k] : undefined,
spec.fields[k] || {},
(depth || 0) + 1,
);
const populatedHtml = populatedChildKeys.sort().map(renderChild).join("");
const unsetHtml = unsetChildKeys.sort().map(renderChild).join("");
const fallbackHtml = fallbackBlob
? (() => {
const fbId = `f_${section}_${dottedKey.replace(/\W/g, "_")}_fallback`;
const fbPath = encodePath(segs.concat(["__other__"]));
// The fallback uses the same path convention with a literal
// "__other__" leaf so the collector emits it under the parent
// in collectSection. Cast=json so the textarea content
// round-trips as an object.
const indentInner = ((depth || 0) + 1) * 24;
return `
<div class="cfg-field" style="margin-left: ${indentInner}px;">
<label for="${fbId}">↳ Other (YAML-only) keys</label>
<div>
<textarea id="${fbId}" data-section="${section}" data-key="${escHtml(dottedKey + ".__other__")}" data-path="${fbPath}" data-cast="json">${escHtml(JSON.stringify(fallbackBlob, null, 2))}</textarea>
<div class="field-help">Keys present in YAML but not in the registry. Edit as a JSON object — keys at this layer survive round-trip.</div>
</div>
</div>`;
})()
: "";
return `
<div class="cfg-field nested-field nested-parent" style="margin-left: ${indent}px;">
<label>${arrow}${escHtml(label)}</label>
<div>${hintBlock || `<div class="field-help">Nested structured fields below.</div>`}</div>
</div>
${populatedHtml}${unsetHtml}${fallbackHtml}`;
}
// Leaf field (string / int / float / bool / secret / select / array,
// OR an object without explicit `fields`, OR a map with complex values
// — the last two fall back to JSON).
let inp;
if (kind === "object" || kind === "map" || kind === "array") {
// No explicit structured renderer for this shape — JSON-textarea
// fallback so a YAML-populated subtree still round-trips even
// without finer-grained schema.
const blobValue = isUnset ? "" : JSON.stringify(value || (kind === "array" ? [] : {}), null, 2);
const dataPath = encodePath(segs);
inp = `<textarea id="${fieldId}" data-section="${section}" data-key="${escHtml(dottedKey)}" data-path="${dataPath}" data-cast="json" placeholder="${isUnset ? 'unset — paste JSON to populate' : ''}">${escHtml(blobValue)}</textarea>`;
} else {
inp = renderLeafInput(fieldId, section, segs, kind, value, { spec }, isUnset);
}
return `
<div class="${wrapperClass}" style="margin-left: ${indent}px;">
<label for="${fieldId}">${arrow}${escHtml(label)}${secretPill}</label>
<div>${inp}${hintBlock}</div>
</div>`;
}
function renderField(section, key, value, opts) {
// opts: { isUnset: bool, hint: string, kind: string, spec: {…} }
// - isUnset: render the field as a dashed placeholder (.is-unset) so the
// operator can tell at a glance that the value is sourced from the
// known_fields registry rather than the live YAML.
// - hint: one-line operator-facing help (rendered as .field-help).
// - kind: registry-declared input kind. Overrides the typeof-value
// heuristic for known-but-unset entries (we have no value to inspect).
// - spec: the raw registry entry — when kind="object" + spec.fields is
// declared, we render a fully-editable structured form (every leaf is
// a real input with a dotted-path data-key so collectSection rebuilds
// the nested patch). When spec.fields is absent / the object isn't in
// the registry, we fall back to the JSON-textarea path.
opts = opts || {};
const isUnset = !!opts.isUnset;
const valueForKind = isUnset ? undefined : value;
// Registry-declared structured object → delegate to the recursive
// nested-form renderer. Replaces the old read-only preview path.
if (opts.kind === "object" && opts.spec && opts.spec.fields && typeof opts.spec.fields === "object") {
return renderNestedField(section, [key], key, valueForKind, opts.spec, 0);
}
// Pass through ALL spec fields (item_kind, key_kind, value_kind, fields,
// value_fields, default, options, hint) so the top-level entry point can
// render arrays, maps, and primitive leaves correctly.
return renderNestedField(section, [key], key, valueForKind, opts.spec || {
kind: opts.kind,
hint: opts.hint,
}, 0);
}
function renderSection(section, payload, knownForSection) {
// knownForSection: registry slice for this section, e.g.
// { bigquery: { kind: "object", hint: "...", fields: { billing_project: {...} } } }
// Keys present in `payload` render as populated; keys present in
// `knownForSection` but absent from `payload` render as dashed
// placeholders (.is-unset).
const meta = SECTION_META[section] || { title: section, help: "" };
const isDanger = DANGER_SECTIONS.has(section);
const danger = isDanger ? '<span class="danger-pill">danger</span>' : "";
const keys = Object.keys(payload || {}).sort();
const fieldsHtml = keys.length
? keys.map(k => renderField(section, k, payload[k])).join("")
const populatedKeys = Object.keys(payload || {}).sort();
const known = knownForSection || {};
const populatedSet = new Set(populatedKeys);
const knownUnsetKeys = Object.keys(known).filter(k => !populatedSet.has(k)).sort();
const populatedHtml = populatedKeys.map(k => {
const spec = known[k] || {};
return renderField(section, k, payload[k], {
isUnset: false,
hint: spec.hint || "",
kind: spec.kind, // may be undefined; renderField falls back to typeof inference
spec,
});
}).join("");
const unsetHtml = knownUnsetKeys.map(k => {
const spec = known[k] || {};
return renderField(section, k, undefined, {
isUnset: true,
hint: spec.hint || "",
kind: spec.kind || "string",
spec,
});
}).join("");
// Visual divider between populated and known-but-unset rows so the
// operator sees at a glance which knobs they're already using vs which
// ones the registry exposes for them.
const divider = (populatedHtml && unsetHtml)
? `<hr class="cfg-divider"><span class="cfg-divider-label">Available but unset</span>`
: (unsetHtml ? `<span class="cfg-divider-label">Available but unset</span>` : "");
const fieldsHtml = (populatedHtml || unsetHtml)
? (populatedHtml + divider + unsetHtml)
: `<div class="section-help">No fields populated yet — type below to add common keys, or edit the YAML directly via the API.</div>`;
// For empty sections, give the operator a textarea so they can paste a
// YAML/JSON blob to bootstrap the section. We persist it via the JSON
// cast so non-trivial structures still merge correctly.
const bootstrap = keys.length === 0
// For empty sections (no populated *and* no known-but-unset), give the
// operator a textarea so they can paste a YAML/JSON blob to bootstrap
// the section. We persist it via the JSON cast so non-trivial structures
// still merge correctly.
const bootstrap = (populatedKeys.length === 0 && knownUnsetKeys.length === 0)
? `<div class="cfg-field">
<label for="bootstrap_${section}">JSON patch</label>
<div>
@ -297,12 +687,79 @@ function renderSection(section, payload) {
function renderAll(data) {
const wrap = document.getElementById("cfg-sections");
const sections = data.editable_sections || Object.keys(data.sections || {});
wrap.innerHTML = sections.map(s => renderSection(s, data.sections[s] || {})).join("");
const known = data.known_fields || {};
wrap.innerHTML = sections.map(s => renderSection(s, data.sections[s] || {}, known[s] || {})).join("");
document.getElementById("cfg-loading").style.display = "none";
wrap.hidden = false;
wrap.querySelectorAll('[data-action="save-section"]').forEach(btn =>
btn.addEventListener("click", () => onSaveSection(btn.dataset.section)));
// Wire array-of-scalars + map-of-scalars add/remove buttons via event
// delegation on the wrapper. Re-attaching after every renderAll() is
// fine because we replace innerHTML wholesale on each load.
wrap.addEventListener("click", (e) => {
const target = e.target;
if (!(target instanceof Element)) return;
// Add an array row.
if (target.dataset.arrayAdd) {
const container = target.closest('[data-array-collect="1"]');
if (!container) return;
const rows = container.querySelector('.array-rows');
const idx = rows.querySelectorAll('[data-array-row]').length;
const div = document.createElement("div");
div.className = "array-row";
div.dataset.arrayRow = String(idx);
div.style.display = "flex";
div.style.gap = "6px";
div.style.marginBottom = "4px";
div.innerHTML = `<input type="text" class="array-item-input" data-array-item="${idx}" value="" style="flex: 1;">
<button type="button" class="cfg-btn" data-array-remove="${idx}" title="Remove this item">×</button>`;
rows.appendChild(div);
const inp = div.querySelector('input');
if (inp) inp.focus();
return;
}
// Remove an array row.
if (target.dataset.arrayRemove != null) {
const row = target.closest('[data-array-row]');
if (row) row.remove();
return;
}
// Add a map row.
if (target.dataset.mapAdd) {
const container = target.closest('[data-map-collect="1"]');
if (!container) return;
const valueKind = container.dataset.valueKind || "string";
const rows = container.querySelector('.map-rows');
const idx = rows.querySelectorAll('[data-map-row]').length;
const div = document.createElement("div");
div.className = "map-row";
div.dataset.mapRow = String(idx);
div.style.display = "grid";
div.style.gridTemplateColumns = valueKind === "array"
? "minmax(160px, 1fr) 2fr auto"
: "minmax(160px, 1fr) 1fr auto";
div.style.gap = "6px";
div.style.marginBottom = "4px";
const valuePlaceholder = valueKind === "array" ? "comma,separated,values" : "value";
const inputType = (valueKind === "int" || valueKind === "float") ? "number" : "text";
const stepAttr = valueKind === "float" ? ' step="any"' : "";
div.innerHTML = `<input type="text" class="map-key-input" data-map-key="${idx}" value="" placeholder="key">
<input type="${inputType}"${stepAttr} class="map-value-input" data-map-value="${idx}" value="" placeholder="${valuePlaceholder}">
<button type="button" class="cfg-btn" data-map-remove="${idx}" title="Remove this row">×</button>`;
rows.appendChild(div);
const inp = div.querySelector('input');
if (inp) inp.focus();
return;
}
// Remove a map row.
if (target.dataset.mapRemove != null) {
const row = target.closest('[data-map-row]');
if (row) row.remove();
return;
}
});
}
// Recursively strip secret-keyed leaves whose value is the redaction sentinel
@ -326,21 +783,159 @@ function scrubRedactedSecrets(value) {
return value;
}
// Resolve the registry-path segments for a leaf input. We prefer the
// JSON-encoded `data-path` attribute (introduced for array/map renderers
// where data keys can themselves contain dots) and fall back to splitting
// the legacy `data-key` on '.' for older inputs.
//
// The "__other__" segment is the YAML-fallback expander — its parsed
// content is merged into the parent dict (not nested under the literal
// segment). See `setNested` for that special case.
function resolvePath(el) {
const raw = el.dataset && el.dataset.path;
if (raw) {
try {
const arr = JSON.parse(raw);
if (Array.isArray(arr)) return arr.map(s => String(s));
} catch (_) {
// fall through to dotted-key parsing
}
}
const dotKey = el.dataset && el.dataset.key;
if (!dotKey) return [];
return dotKey.split(".");
}
// Legacy alias kept for tests asserting on the helper name.
function splitDotted(dotKey) {
if (!dotKey) return [];
return dotKey.split(".");
}
// Set value at a nested path inside `out`, creating intermediate dicts as
// needed. The "__other__" segment is special-cased: its dict value gets
// merged into the parent rather than stored under the literal segment.
function setNested(out, segments, value) {
if (!segments.length) return;
let node = out;
for (let i = 0; i < segments.length - 1; i++) {
const seg = segments[i];
if (typeof node[seg] !== "object" || node[seg] === null || Array.isArray(node[seg])) {
node[seg] = {};
}
node = node[seg];
}
const last = segments[segments.length - 1];
if (last === "__other__") {
// Fallback expander: merge the JSON object into the parent. Skip if the
// user cleared the textarea or the value isn't an object.
if (value && typeof value === "object" && !Array.isArray(value)) {
Object.assign(node, value);
}
return;
}
node[last] = value;
}
// Collect the value of an array-of-scalars container (data-array-collect="1")
// — concatenates each non-empty row's input cast to the declared item_kind.
function collectArrayContainer(container) {
const itemKind = container.dataset.itemKind || "string";
const inputs = container.querySelectorAll('input[data-array-item]');
const out = [];
for (const inp of inputs) {
const raw = inp.value;
if (raw === "" || raw == null) continue; // drop blank rows
const cast = castScalar(raw, itemKind);
if (cast === null) continue;
out.push(cast);
}
return out;
}
// Collect the value of a map-of-scalars container (data-map-collect="1")
// — pairs each row's key-input + value-input, casting the value to the
// declared value_kind. Map keys keep their literal string form (we never
// split them on '.' — that's the whole point of the data-path/JSON encoding).
function collectMapContainer(container) {
const valueKind = container.dataset.valueKind || "string";
const valueItemKind = container.dataset.valueItemKind || "string";
const rows = container.querySelectorAll('[data-map-row]');
const out = {};
for (const row of rows) {
const keyInput = row.querySelector('[data-map-key]');
const valInput = row.querySelector('[data-map-value]');
if (!keyInput) continue;
const key = keyInput.value;
if (!key) continue; // skip incomplete rows
let value;
if (valueKind === "array") {
// Comma-separated list → array of scalars cast to value_item_kind.
const raw = valInput ? valInput.value : "";
value = raw.split(",").map(s => s.trim()).filter(s => s.length > 0)
.map(s => castScalar(s, valueItemKind))
.filter(v => v !== null);
} else {
const raw = valInput ? valInput.value : "";
value = castScalar(raw, valueKind);
if (value === null && raw === "") continue; // drop empty values
}
out[key] = value;
}
return out;
}
// ── Collect form values for one section ───────────────────────────────
function collectSection(section) {
const inputs = document.querySelectorAll(`[data-section="${section}"]`);
const sectionRoot = document.querySelector(`section.cfg-section[data-section="${section}"]`)
|| document;
const patch = {};
// Track ancestor paths covered by an array/map container so we don't
// double-collect their inner inputs as individual leaves.
const handledRoots = new Set();
// 1) Array containers — collect each as a single leaf.
const arrayContainers = sectionRoot.querySelectorAll('[data-array-collect="1"]');
for (const c of arrayContainers) {
if (c.dataset.section && c.dataset.section !== section) continue;
const segments = resolvePath(c);
if (!segments.length) continue;
handledRoots.add(c);
const arr = collectArrayContainer(c);
setNested(patch, segments, arr);
}
// 2) Map containers — collect each as a single dict leaf.
const mapContainers = sectionRoot.querySelectorAll('[data-map-collect="1"]');
for (const c of mapContainers) {
if (c.dataset.section && c.dataset.section !== section) continue;
const segments = resolvePath(c);
if (!segments.length) continue;
handledRoots.add(c);
const obj = collectMapContainer(c);
setNested(patch, segments, obj);
}
// 3) Plain leaf inputs (everything outside an array/map container).
const inputs = document.querySelectorAll(`[data-section="${section}"]`);
for (const el of inputs) {
if (el.dataset.action) continue; // skip buttons
const key = el.dataset.key;
if (!key) continue;
// Skip inner inputs that belong to an array/map container we already
// collected as a single unit.
if (el.closest('[data-array-collect="1"]') || el.closest('[data-map-collect="1"]')) {
// …unless the element IS itself the container (the container also
// carries data-section). In that case it was already handled above.
continue;
}
const dotKey = el.dataset.key;
if (!dotKey && !el.dataset.path) continue;
let raw = el.value;
// Skip empty secret fields — operator left them blank to preserve the
// existing value. Sending "" would overwrite the secret with empty.
if (el.classList.contains("is-secret") && raw === "") continue;
let value;
if (key === "__bootstrap__") {
if (dotKey === "__bootstrap__") {
// Bootstrap textarea — parse the entire blob and merge it as the
// section patch. Skip empty input entirely. Scrub redacted sentinels
// out of the parsed object so a round-trip can't overwrite real
@ -352,12 +947,17 @@ function collectSection(section) {
}
if (el.dataset.cast === "bool") {
value = raw === "true";
} else if (el.dataset.cast === "float") {
value = raw === "" ? null : Number(raw);
} else if (el.dataset.cast === "json") {
if (!raw.trim()) { value = null; }
else {
try { value = scrubRedactedSecrets(JSON.parse(raw)); }
catch (e) { throw new Error(`Field ${section}.${key} is not valid JSON: ${e.message}`); }
if (!raw.trim()) {
// Empty JSON textarea: skip entirely so a blank fallback expander
// doesn't wipe its parent. The deep-merge on the server preserves
// whatever's already on disk for this slot.
continue;
}
try { value = scrubRedactedSecrets(JSON.parse(raw)); }
catch (e) { throw new Error(`Field ${section}.${dotKey} is not valid JSON: ${e.message}`); }
} else if (el.type === "number") {
value = raw === "" ? null : Number(raw);
} else {
@ -366,8 +966,10 @@ function collectSection(section) {
// If the operator left a secret-keyed scalar at the redaction sentinel
// — e.g. typed nothing into a `token_env` text input that already shows
// `"***"` — drop it rather than persisting the placeholder.
if (isSecretKey(key) && (value === "***" || value === "<empty>")) continue;
patch[key] = value;
const segments = resolvePath(el);
const leafKey = segments[segments.length - 1] || "";
if (isSecretKey(leafKey) && (value === "***" || value === "<empty>")) continue;
setNested(patch, segments, value);
}
return patch;
}

View file

@ -801,3 +801,151 @@ class TestRedactionHelpers:
patch = {"a": {"y": 99}}
out = _deep_merge(base, patch)
assert out == {"a": {"x": 1, "y": 99}, "b": {"z": 3}}
# --- Phase J: BQ fields exposure in /admin/server-config ---------------------
class TestServerConfigBigQueryFields:
"""Phase J — billing_project, legacy_wrap_views, and
max_bytes_per_materialize are surfaced in the UI/API so an operator can
set them without SSH'ing to the VM. The first two were previously only
addressable via direct YAML edits; max_bytes_per_materialize had no UI
hint at all."""
def test_get_surfaces_bq_fields_even_when_unset(self, seeded_app, tmp_path, monkeypatch):
"""GET response always includes the three BQ fields under
data_source.bigquery so the UI's JSON-textarea rendering shows them
as editable keys even when YAML omits them. Without this, the
operator has no UI hint that the knobs exist."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
# Plant a minimal instance.yaml that has data_source.bigquery but
# NONE of the three fields set.
(state / "instance.yaml").write_text(yaml.dump({
"data_source": {
"type": "bigquery",
"bigquery": {"project": "my-data-prj", "location": "US"},
},
}))
import app.instance_config as ic
ic._instance_config = None
c = seeded_app["client"]
token = seeded_app["admin_token"]
resp = c.get("/api/admin/server-config", headers=_auth(token))
assert resp.status_code == 200, resp.text
bq = resp.json()["sections"]["data_source"]["bigquery"]
assert "billing_project" in bq, f"billing_project missing from GET: {bq}"
assert "legacy_wrap_views" in bq, f"legacy_wrap_views missing from GET: {bq}"
assert "max_bytes_per_materialize" in bq, \
f"max_bytes_per_materialize missing from GET: {bq}"
def test_get_preserves_existing_bq_field_values(self, seeded_app, tmp_path, monkeypatch):
"""When the operator HAS set the fields, GET must surface their actual
values, not the unset defaults."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
(state / "instance.yaml").write_text(yaml.dump({
"data_source": {
"type": "bigquery",
"bigquery": {
"project": "my-data-prj",
"billing_project": "my-billing-prj",
"legacy_wrap_views": True,
"max_bytes_per_materialize": 5368709120, # 5 GiB
},
},
}))
import app.instance_config as ic
ic._instance_config = None
c = seeded_app["client"]
token = seeded_app["admin_token"]
resp = c.get("/api/admin/server-config", headers=_auth(token))
bq = resp.json()["sections"]["data_source"]["bigquery"]
assert bq["billing_project"] == "my-billing-prj"
assert bq["legacy_wrap_views"] is True
assert bq["max_bytes_per_materialize"] == 5368709120
def test_post_persists_billing_project(self, seeded_app, tmp_path, monkeypatch):
"""POST through the existing section-patch flow persists
data_source.bigquery.billing_project to the overlay."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
(tmp_path / "state").mkdir(parents=True, exist_ok=True)
c = seeded_app["client"]
token = seeded_app["admin_token"]
resp = c.post(
"/api/admin/server-config",
json={"sections": {"data_source": {"bigquery": {
"billing_project": "my-billing-prj",
}}}},
headers=_auth(token),
)
assert resp.status_code == 200, resp.text
# Round-trip: GET should now reflect it.
resp = c.get("/api/admin/server-config", headers=_auth(token))
bq = resp.json()["sections"]["data_source"]["bigquery"]
assert bq["billing_project"] == "my-billing-prj"
def test_post_persists_legacy_wrap_views(self, seeded_app, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
(tmp_path / "state").mkdir(parents=True, exist_ok=True)
c = seeded_app["client"]
token = seeded_app["admin_token"]
resp = c.post(
"/api/admin/server-config",
json={"sections": {"data_source": {"bigquery": {
"legacy_wrap_views": True,
}}}},
headers=_auth(token),
)
assert resp.status_code == 200, resp.text
resp = c.get("/api/admin/server-config", headers=_auth(token))
bq = resp.json()["sections"]["data_source"]["bigquery"]
assert bq["legacy_wrap_views"] is True
def test_post_persists_max_bytes_per_materialize(self, seeded_app, tmp_path, monkeypatch):
monkeypatch.setenv("DATA_DIR", str(tmp_path))
(tmp_path / "state").mkdir(parents=True, exist_ok=True)
c = seeded_app["client"]
token = seeded_app["admin_token"]
resp = c.post(
"/api/admin/server-config",
json={"sections": {"data_source": {"bigquery": {
"max_bytes_per_materialize": 21474836480, # 20 GiB
}}}},
headers=_auth(token),
)
assert resp.status_code == 200, resp.text
resp = c.get("/api/admin/server-config", headers=_auth(token))
bq = resp.json()["sections"]["data_source"]["bigquery"]
assert bq["max_bytes_per_materialize"] == 21474836480
def test_template_documents_three_new_fields(self, seeded_app):
"""The three BQ optional fields are now surfaced through the
known-fields registry (GET /api/admin/server-config carries them
in `known_fields.data_source.bigquery.fields`), not hardcoded into
the template text. The renderer reads the registry at runtime and
creates a structured form with hints for each leaf so the test
verifies operator-discoverability through the API channel rather
than via static HTML inspection.
"""
c = seeded_app["client"]
token = seeded_app["admin_token"]
resp = c.get("/api/admin/server-config", headers=_auth(token))
assert resp.status_code == 200, resp.text
bq_fields = resp.json()["known_fields"]["data_source"]["bigquery"]["fields"]
assert "billing_project" in bq_fields, \
"registry must expose billing_project as a known field"
assert "legacy_wrap_views" in bq_fields, \
"registry must expose legacy_wrap_views as a known field"
assert "max_bytes_per_materialize" in bq_fields, \
"registry must expose max_bytes_per_materialize as a known field"
# Each field must carry a hint so the renderer can show operator-
# facing help text — no anonymous knobs.
for k in ("billing_project", "legacy_wrap_views", "max_bytes_per_materialize"):
assert "hint" in bq_fields[k] and bq_fields[k]["hint"], \
f"{k} must carry a non-empty hint"

View file

@ -0,0 +1,260 @@
"""Tests for the corporate_memory governance section in /admin/server-config.
corporate_memory is the deepest-nested schema in instance.yaml the
canonical reference is `config/instance.yaml.example` lines 224-317.
The whole section is optional; when omitted the system runs in legacy
democratic-wiki mode with no admin review. The registry must still
expose the full schema so admins can opt in via the editor without
hand-editing YAML.
Coverage:
- editable section + registry exposure
- top-level scalar fields (distribution_mode, approval_mode, )
- 4-level nested object access (sources.session_transcripts.detection_types)
- map shapes with dotted-string data keys (confidence.base)
- POST flow merges nested edits into the on-disk YAML.
"""
def _auth(token):
return {"Authorization": f"Bearer {token}"}
def test_corporate_memory_in_editable_sections(seeded_app):
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
assert r.status_code == 200
body = r.json()
assert "corporate_memory" in body["editable_sections"]
def test_corp_memory_top_level_fields_present(seeded_app):
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
for k in ["distribution_mode", "approval_mode", "review_period_months", "notify_on_new_items"]:
assert k in fields, f"missing top-level field {k!r}"
assert fields["distribution_mode"]["kind"] == "select"
assert "hybrid" in fields["distribution_mode"]["options"]
assert fields["distribution_mode"]["default"] == "hybrid"
assert fields["approval_mode"]["kind"] == "select"
assert "review_queue" in fields["approval_mode"]["options"]
assert fields["review_period_months"]["kind"] == "int"
assert fields["review_period_months"]["default"] == 6
assert fields["notify_on_new_items"]["kind"] == "bool"
assert fields["notify_on_new_items"]["default"] is True
def test_corp_memory_nested_sources_session_transcripts_detection_types(seeded_app):
"""Deep schema: sources.session_transcripts.detection_types is an
array of strings. The registry must navigate object object array."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
sources = fields["sources"]
assert sources["kind"] == "object"
sess = sources["fields"]["session_transcripts"]
assert sess["kind"] == "object"
dt = sess["fields"]["detection_types"]
assert dt["kind"] == "array"
assert dt["item_kind"] == "string"
assert "correction" in dt["default"]
assert "confirmation" in dt["default"]
assert "unprompted_definition" in dt["default"]
def test_corp_memory_extraction_section_present(seeded_app):
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
extraction = r.json()["known_fields"]["corporate_memory"]["extraction"]
assert extraction["kind"] == "object"
assert "model" in extraction["fields"]
assert "sensitivity_check" in extraction["fields"]
assert extraction["fields"]["sensitivity_check"]["kind"] == "bool"
assert extraction["fields"]["sensitivity_check"]["default"] is True
def test_corp_memory_confidence_base_is_map_of_floats(seeded_app):
"""confidence.base is a map<string, float>. Keys preserve dotted
namespace (data, not path) e.g. user_verification.correction is
one map key, not two nested levels."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
conf_base = fields["confidence"]["fields"]["base"]
assert conf_base["kind"] == "map"
assert conf_base["key_kind"] == "string"
assert conf_base["value_kind"] == "float"
# Dotted keys preserved as data keys (not path):
assert "user_verification.correction" in conf_base["default"]
assert conf_base["default"]["user_verification.correction"] == 0.90
assert conf_base["default"]["admin_mandate"] == 1.00
def test_corp_memory_confidence_decay_is_4_level_nested(seeded_app):
"""confidence.decay.floor goes object → object → object → map.
Pins down the renderer's ability to drill 4 levels deep."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
decay = fields["confidence"]["fields"]["decay"]
assert decay["kind"] == "object"
assert decay["fields"]["mode"]["kind"] == "select"
assert "exponential" in decay["fields"]["mode"]["options"]
assert "linear" in decay["fields"]["mode"]["options"]
assert decay["fields"]["mode"]["default"] == "exponential"
floor = decay["fields"]["floor"]
assert floor["kind"] == "map"
assert floor["value_kind"] == "float"
assert floor["default"]["admin_mandate"] == 0.50
def test_corp_memory_entity_resolution_map_of_arrays(seeded_app):
"""entity_resolution.entities is a map<string, array<string>>."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
er = fields["entity_resolution"]
assert er["kind"] == "object"
entities = er["fields"]["entities"]
assert entities["kind"] == "map"
assert entities["value_kind"] == "array"
assert entities["value_item_kind"] == "string"
assert "MRR" in entities["default"]["metrics"]
def test_corp_memory_domain_owners_map_of_email_arrays(seeded_app):
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
do = fields["domain_owners"]
assert do["kind"] == "map"
assert do["value_kind"] == "array"
assert do["value_item_kind"] == "string"
def test_corp_memory_domains_array_of_strings(seeded_app):
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["corporate_memory"]
domains = fields["domains"]
assert domains["kind"] == "array"
assert domains["item_kind"] == "string"
assert "finance" in domains["default"]
assert "engineering" in domains["default"]
def test_corp_memory_section_renders_in_html(seeded_app, monkeypatch, tmp_path):
"""SECTION_META must include corporate_memory so the section header
has a friendly title + help instead of falling back to the raw key."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
# SECTION_META entry — title is operator-friendly.
assert "corporate_memory" in body, "section name not exposed in template JS"
assert "Corporate Memory" in body, "SECTION_META entry missing"
finally:
ic._instance_config = None
def test_post_corp_memory_section_persists(seeded_app, monkeypatch, tmp_path):
"""POST corporate_memory section: distribution_mode + nested model +
array of domains all merge into instance.yaml."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={
"sections": {
"corporate_memory": {
"distribution_mode": "admin_curated",
"review_period_months": 12,
"extraction": {"model": "claude-sonnet-4-6"},
"domains": ["finance", "engineering", "product"],
},
},
},
)
assert r.status_code in (200, 204), r.text
# Re-read from disk to verify persistence + merge.
import yaml as _yaml
loaded = _yaml.safe_load((state / "instance.yaml").read_text())
cm = loaded.get("corporate_memory", {})
assert cm.get("distribution_mode") == "admin_curated"
assert cm.get("review_period_months") == 12
assert cm.get("extraction", {}).get("model") == "claude-sonnet-4-6"
assert cm.get("domains") == ["finance", "engineering", "product"]
finally:
ic._instance_config = None
def test_post_corp_memory_with_dotted_map_keys_persists(seeded_app, monkeypatch, tmp_path):
"""The renderer's data-path encoding must let an admin save a
confidence.base entry whose KEY contains a literal dot (e.g.
user_verification.correction). Server-side this is just a dict; we
verify by POSTing the patch directly and reading it back."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={
"sections": {
"corporate_memory": {
"confidence": {
"base": {
"user_verification.correction": 0.95,
"admin_mandate": 1.0,
},
},
},
},
},
)
assert r.status_code in (200, 204), r.text
import yaml as _yaml
loaded = _yaml.safe_load((state / "instance.yaml").read_text())
base = loaded["corporate_memory"]["confidence"]["base"]
# Dotted key survives literally — not split into nested objects.
assert base["user_verification.correction"] == 0.95
assert base["admin_mandate"] == 1.0
finally:
ic._instance_config = None

View file

@ -0,0 +1,341 @@
"""Tests for the known-fields registry exposure in /admin/server-config.
The /admin/server-config UI used to render only fields that already existed
in instance.yaml operators couldn't discover optional knobs like
``data_source.bigquery.billing_project`` without reading the docs or hitting
runtime errors. The known-fields registry lets the backend declare "these
fields are valid for this section even when YAML omits them" so the UI can
render them as dashed placeholders alongside the populated values.
This test file proves the wiring at three layers:
1. GET response carries `known_fields`
2. The HTML shell ships the CSS class + JS hook the renderer needs
3. Registry entries surface when the YAML doesn't list the field
"""
def _auth(token):
return {"Authorization": f"Bearer {token}"}
def test_get_server_config_returns_known_fields(seeded_app):
"""J2: GET response includes known_fields registry."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
assert r.status_code == 200, r.text
body = r.json()
assert "known_fields" in body
assert isinstance(body["known_fields"], dict)
# Smoke fixture: data_source.bigquery.billing_project must be in the registry.
bq = body["known_fields"].get("data_source", {}).get("bigquery", {})
fields = bq.get("fields", {})
assert "billing_project" in fields, body["known_fields"]
assert "hint" in fields["billing_project"]
def test_known_field_billing_project_renders_in_ui(seeded_app, monkeypatch, tmp_path):
"""J3: renderer ships the CSS class + reads known_fields from the API.
We can't assert the dashed input directly (the page is shell-only — the
JS fills `#cfg-sections` from the GET response after the HTML loads).
Instead verify the static template ships the two markers the renderer
needs: the `is-unset` CSS class and a `known_fields` reference in the
JS. The two together prove the wiring exists.
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
assert "is-unset" in body, "cfg-field.is-unset CSS class missing"
assert "known_fields" in body, "renderer JS needs to consume known_fields"
finally:
ic._instance_config = None
def test_known_field_value_unset_when_yaml_missing(seeded_app, monkeypatch, tmp_path):
"""J3 indirectly: when the YAML has no billing_project, the GET still
omits it from sections (it's not there to surface), but the registry
entry tells the UI it's a valid optional field worth exposing."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "data-proj"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
assert r.status_code == 200, r.text
body = r.json()
bq_section = body.get("sections", {}).get("data_source", {}).get("bigquery", {})
# billing_project must be discoverable via known_fields even when
# absent from YAML — the registry is the single source for which
# optional knobs exist.
assert "billing_project" in body["known_fields"]["data_source"]["bigquery"]["fields"]
# The pre-existing _ensure_bq_optional_fields helper still seeds a
# default into the section payload, so billing_project shows up
# there too — that's fine, the registry exposes the *schema*
# (kind/hint) the UI needs to render the field nicely. What matters
# is that the registry is present so subagents 2-4 can populate
# fields that *don't* have a corresponding seed helper.
assert isinstance(bq_section, dict)
finally:
ic._instance_config = None
def test_known_fields_covers_all_editable_sections(seeded_app):
"""The registry has an entry (even if empty) for every editable section
so subagents 2-4 know where to add their entries without having to
decide whether the section needs a new top-level key.
"""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
body = r.json()
editable = set(body["editable_sections"])
known = set(body["known_fields"].keys())
# Every editable section must have a (possibly empty) known_fields entry.
missing = editable - known
assert not missing, f"sections without a known_fields slot: {sorted(missing)}"
# ── Part A: structured nested-field rendering ───────────────────────────────
def test_nested_field_renders_as_structured_form_not_json_blob(seeded_app, monkeypatch, tmp_path):
"""Renderer J3 upgrade: registry-declared nested fields get individual
inputs with dotted-path data-key, not a single JSON textarea for the
parent object. The JS renderer must contain the dotted-path collection
logic so subfields round-trip as a structured patch.
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
# Renderer must ship the structured nested-field path. The JS uses
# dotted-path data-key for child inputs (e.g. data-key="bigquery.billing_project")
# and the collector reconstructs nested patches.
assert "nested-field" in body or "renderNestedField" in body or "dotted" in body or 'data-nested' in body, \
"renderer JS must support structured nested-field rendering"
# The collector must understand dotted-path keys (parent.child) and
# rebuild a nested patch from them — replaces the old JSON-textarea path.
assert "splitDotted" in body or '.split(".")' in body or "dotKey" in body or "nestedKey" in body, \
"collector JS must rebuild nested patches from dotted-path keys"
# Display-only mode must be GONE — child rows are now first-class inputs.
assert "data-display-only" not in body, \
"display-only fallback path must be removed; child fields are now editable"
finally:
ic._instance_config = None
# ── Part B: registry population ─────────────────────────────────────────────
def test_bigquery_subfields_populated(seeded_app):
"""Every documented BigQuery optional knob is in the registry under
data_source.bigquery.fields with the right kind."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
assert r.status_code == 200
fields = r.json()["known_fields"]["data_source"]["bigquery"]["fields"]
assert "billing_project" in fields
assert "legacy_wrap_views" in fields
assert "max_bytes_per_materialize" in fields
assert fields["legacy_wrap_views"]["kind"] == "bool"
assert fields["legacy_wrap_views"]["default"] is False
assert fields["max_bytes_per_materialize"]["kind"] == "int"
assert fields["max_bytes_per_materialize"]["default"] == 10737418240
def test_keboola_registry_entries_present(seeded_app):
"""Keboola subfields exposed for hint discoverability."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["data_source"]["keboola"]["fields"]
assert "stack_url" in fields
assert "project_id" in fields
def test_ai_base_url_populated(seeded_app):
"""AI section exposes base_url + structured_output."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
fields = r.json()["known_fields"]["ai"]
assert "base_url" in fields
assert "structured_output" in fields
assert fields["structured_output"]["kind"] == "select"
assert fields["structured_output"]["default"] == "auto"
def test_openmetadata_is_editable_section_with_known_fields(seeded_app):
"""openmetadata is a new editable section with full registry."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
body = r.json()
assert "openmetadata" in body["editable_sections"]
fields = body["known_fields"].get("openmetadata", {})
assert "url" in fields
assert "token" in fields
assert fields["token"]["kind"] == "secret"
assert "verify_ssl" in fields
assert fields["verify_ssl"]["kind"] == "bool"
assert fields["verify_ssl"]["default"] is True
assert "cache_ttl_seconds" in fields
assert fields["cache_ttl_seconds"]["kind"] == "int"
def test_desktop_is_editable_section(seeded_app):
"""desktop is a new editable section with jwt_secret marked secret."""
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.get("/api/admin/server-config", headers=_auth(token))
body = r.json()
assert "desktop" in body["editable_sections"]
fields = body["known_fields"].get("desktop", {})
assert "jwt_issuer" in fields
assert "jwt_secret" in fields
assert fields["jwt_secret"]["kind"] == "secret"
assert "url_scheme" in fields
def test_post_openmetadata_section_persists(seeded_app, tmp_path, monkeypatch):
"""openmetadata is now in _EDITABLE_SECTIONS; POST flow accepts it."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={
"sections": {
"openmetadata": {
"url": "https://om.example.com",
"cache_ttl_seconds": 1800,
"verify_ssl": True,
},
},
},
)
assert r.status_code in (200, 204), r.text
# Verify it landed on disk.
import yaml as _yaml
loaded = _yaml.safe_load((state / "instance.yaml").read_text())
assert loaded["openmetadata"]["url"] == "https://om.example.com"
assert loaded["openmetadata"]["cache_ttl_seconds"] == 1800
finally:
ic._instance_config = None
def test_post_desktop_section_persists(seeded_app, tmp_path, monkeypatch):
"""desktop section accepts patches via the standard editor flow."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={"sections": {"desktop": {"jwt_issuer": "data-analyst"}}},
)
assert r.status_code in (200, 204), r.text
import yaml as _yaml
loaded = _yaml.safe_load((state / "instance.yaml").read_text())
assert loaded["desktop"]["jwt_issuer"] == "data-analyst"
finally:
ic._instance_config = None
def test_save_section_with_nested_field_merges_correctly(seeded_app, tmp_path, monkeypatch):
"""When the renderer ships a dotted-path patch (e.g. bigquery.billing_project=X),
the API merges it into the existing data_source.bigquery dict without wiping
the other keys (project, location, type)."""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {
"type": "bigquery",
"bigquery": {"project": "data-proj", "location": "us-central1"},
},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
# Patch only billing_project nested under bigquery — type/project/location
# must survive the merge.
r = c.post(
"/api/admin/server-config",
headers=_auth(token),
json={
"sections": {
"data_source": {
"bigquery": {
"billing_project": "billing-proj",
},
},
},
},
)
assert r.status_code in (200, 204), r.text
# Re-read from disk to verify the deep-merge preserved siblings.
loaded = _yaml.safe_load((state / "instance.yaml").read_text())
bq = loaded["data_source"]["bigquery"]
assert bq.get("project") == "data-proj", bq
assert bq.get("location") == "us-central1", bq
assert bq.get("billing_project") == "billing-proj", bq
assert loaded["data_source"]["type"] == "bigquery"
finally:
ic._instance_config = None

View file

@ -0,0 +1,171 @@
"""Renderer depth/array/map tests for /admin/server-config.
The base renderer in `admin_server_config.html` already supports arbitrary
depth for `kind="object"` with `fields` (recursion is bounded only by the
browser stack). This file pins down the harder shapes corporate_memory
needs:
- Arrays of scalars (e.g. domains, detection_types) rendered as a
per-element stack with add/remove buttons rather than a single JSON
textarea.
- Maps of scalars (e.g. confidence.base) rendered as key:value rows with
add/remove.
- Maps whose values are arrays of strings (e.g. domain_owners,
entity_resolution.entities) rendered as key + nested array rows.
- Dotted keys present in *data* (e.g. confidence.base keys like
``user_verification.correction``) survive round-trip without being
mistaken for nested-path separators.
We assert structurally on the static template (the page is a shell JS
fills the form from /api/admin/server-config). The markers we look for
are the JS function/identifier names that implement each shape.
"""
def _auth(token):
return {"Authorization": f"Bearer {token}"}
def test_renderer_supports_array_of_scalars(seeded_app, monkeypatch, tmp_path):
"""An array-of-strings registry leaf renders as a vertical stack of
text inputs, not a JSON textarea.
Marker: the JS contains a renderer entry point for arrays-of-scalars
that produces add/remove controls `renderArrayField` or equivalent
plus an "addArrayItem" / "removeArrayItem" interaction handler.
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
# The renderer ships a dedicated array-of-scalars path.
assert "renderArrayField" in body, \
"JS must implement renderArrayField for kind='array'+item_kind=scalar"
# Add/remove handlers for individual array items.
assert "data-array-add" in body, "missing add-row interaction marker"
assert "data-array-remove" in body, "missing remove-row interaction marker"
finally:
ic._instance_config = None
def test_renderer_supports_map_of_scalars(seeded_app, monkeypatch, tmp_path):
"""A map of string→float renders as key:value rows with add/remove,
not as a JSON textarea. Marker: `renderMapField` exists in the JS.
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
assert "renderMapField" in body, \
"JS must implement renderMapField for kind='map'"
assert "data-map-add" in body, "missing map add-row interaction marker"
assert "data-map-remove" in body, "missing map remove-row interaction marker"
finally:
ic._instance_config = None
def test_renderer_path_is_json_encoded_not_dotted_string(seeded_app, monkeypatch, tmp_path):
"""When data keys themselves contain dots (e.g.
``confidence.base.user_verification.correction`` where
``user_verification.correction`` is one map key), the renderer must
NOT split on '.' to reconstruct the patch shape that would break
the dotted data key into two path segments.
Implementation: leaf inputs carry a `data-path` attribute holding the
JSON-encoded array of segments. The collector reads that array
instead of splitting `data-key` on '.'. The dotted `data-key` stays
around for backward compatibility (existing nested object fields
use it), but maps emit JSON paths so their keys round-trip intact.
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
# The collector must understand JSON-encoded path arrays so map
# keys with embedded dots survive round-trip.
assert "data-path" in body, "JSON path attribute missing from renderer"
# The collector should prefer data-path over splitting data-key on '.'
# Look for the parsing entry point.
assert "JSON.parse" in body and "data-path" in body, \
"collector must parse JSON-encoded data-path arrays"
finally:
ic._instance_config = None
def test_renderer_handles_4_level_object_nesting(seeded_app, monkeypatch, tmp_path):
"""Smoke check: the recursive renderer doesn't bail out at depth 4.
The renderer is `renderNestedField(... depth)`; recursion is unbounded
on the JS side. We assert by ensuring the renderer's nested-form path
is wired with a depth-incrementing recursion call (literal markers in
the JS).
"""
monkeypatch.setenv("DATA_DIR", str(tmp_path))
state = tmp_path / "state"
state.mkdir(parents=True, exist_ok=True)
import yaml as _yaml
(state / "instance.yaml").write_text(_yaml.dump({
"data_source": {"type": "bigquery", "bigquery": {"project": "p"}},
}))
import app.instance_config as ic
ic._instance_config = None
try:
c = seeded_app["client"]
token = seeded_app["admin_token"]
c.cookies.set("access_token", token)
try:
r = c.get("/admin/server-config", headers={"Accept": "text/html"})
finally:
c.cookies.clear()
assert r.status_code == 200, r.text
body = r.text
# The recursion marker — depth bumps in the recursive call.
assert "renderNestedField(" in body
assert "(depth || 0) + 1" in body, \
"recursion must increment depth on each nested call"
finally:
ic._instance_config = None