Replace hardwired Anthropic API calls with a pluggable provider system. Each deployment configures its AI provider in instance.yaml; switching between Anthropic, LiteLLM, OpenRouter, or any OpenAI-compatible proxy is a config change, not a code change.

New connectors/llm/ module:
- StructuredExtractor Protocol with extract_json() interface
- AnthropicExtractor: direct Anthropic SDK with retry + backoff
- OpenAICompatExtractor: any OpenAI-compatible proxy with three-layer structured output fallback (json_schema -> json_object -> prompt)
- Configurable structured_output policy (strict/json/auto)
- Custom exception hierarchy (auth/rate_limit/timeout/format/refusal)
- Zero secrets in logs: no API keys, prompts, or responses logged

Reviewed by: Google Gemini, Claude Sonnet, OpenAI GPT-5.4. Security audit passed with all critical findings resolved.
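The three-layer fallback (json_schema -> json_object -> prompt) can be pictured roughly as follows. This is a minimal sketch, not the module's actual code: the `extract_json()` signature, the `FormatError` class, the `send` callable, and the mapping from `structured_output` policy to modes are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Protocol


class StructuredExtractor(Protocol):
    """Assumed shape of the interface; the real signature may differ."""

    def extract_json(self, prompt: str, schema: dict[str, Any]) -> dict[str, Any]:
        ...


class FormatError(Exception):
    """Hypothetical stand-in for the module's format exception."""


# Assumed meaning of the structured_output policy values.
MODES_BY_POLICY: dict[str, list[str]] = {
    "strict": ["json_schema"],
    "json": ["json_object"],
    "auto": ["json_schema", "json_object", "prompt"],
}

# Hypothetical transport: (mode, prompt, schema) -> raw response text.
SendFn = Callable[[str, str, dict[str, Any]], str]


class FallbackExtractor:
    """Tries each structured output mode in order until one yields a JSON object."""

    def __init__(self, send: SendFn, policy: str = "auto") -> None:
        self._send = send
        self._modes = MODES_BY_POLICY[policy]

    def extract_json(self, prompt: str, schema: dict[str, Any]) -> dict[str, Any]:
        last_error = None
        for mode in self._modes:
            try:
                parsed = json.loads(self._send(mode, prompt, schema))
            except (FormatError, json.JSONDecodeError) as exc:
                # This mode is unsupported or returned malformed JSON: try the next.
                last_error = exc
                continue
            if isinstance(parsed, dict):
                return parsed
            last_error = FormatError(f"{mode} returned non-object JSON")
        # Nothing from the prompt or response is logged or echoed here.
        raise FormatError("no structured output mode produced a JSON object") from last_error
```

Because `FallbackExtractor` has a matching `extract_json()` method, it satisfies the `StructuredExtractor` protocol structurally, which is what would let callers stay unaware of the provider behind it.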
Configuration Reference
instance.yaml
The main configuration file for your AI Data Analyst instance. Located at config/instance.yaml.
Instance Branding
instance:
  name: "AI Data Analyst" # UI title, email subjects
  subtitle: "Acme Corp" # Header subtitle
  copyright: "Acme Corp" # Footer copyright
Authentication
auth:
  allowed_domain: "acme.com" # Google OAuth domain restriction
Only emails from this domain can log in via Google OAuth. External users can be added via password auth (requires SendGrid).
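As an illustration of what the domain restriction amounts to, here is a minimal sketch of an exact-match check on the part after the last `@`. The function name `is_allowed_email` is hypothetical, and the application's real check may differ (for example in how it treats subdomains).

```python
def is_allowed_email(email: str, allowed_domain: str) -> bool:
    """Return True only if the address's domain equals allowed_domain exactly."""
    # rpartition splits on the LAST "@", so "a@b@acme.com" yields domain "acme.com".
    local, sep, domain = email.strip().rpartition("@")
    # An address without "@" or without a local part is rejected outright.
    return bool(local) and bool(sep) and domain.lower() == allowed_domain.lower()


print(is_allowed_email("john.doe@acme.com", "acme.com"))
print(is_allowed_email("john.doe@evil-acme.com", "acme.com"))
```

Comparing the whole domain, rather than testing whether the address merely ends with the configured string, keeps lookalike domains such as `evil-acme.com` out.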
email:
  from_address: "noreply@acme.com"
  from_name: "Acme Data Analyst"
Used for password auth setup and reset emails. Requires SENDGRID_API_KEY in .env.
Server
server:
  host: "10.0.0.1" # Server IP
  hostname: "data.acme.com" # Server DNS name
Desktop App
desktop:
  jwt_issuer: "acme-analyst"
  url_scheme: "acme-analyst"
Data Source
data_source:
  type: "keboola" # keboola, csv, bigquery
Users
users:
  john.doe:
    name: "John Doe"
    initials: "JD"
  jane.smith:
    name: "Jane Smith"
    initials: "JS"
username_mapping:
  john.doe: john # Only if webapp and server names differ
Datasets
datasets:
  jira:
    label: "Jira Tickets"
    description: "Support tickets"
    size_hint: "~50 MB"
    requires: null
  jira_attachments:
    label: "Jira Attachments"
    description: "File attachments"
    size_hint: "~500 MB+"
    requires: "jira"
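The `requires` key suggests that one dataset depends on another. A minimal sketch of how such dependencies could be resolved is shown below, assuming that `requires` names a single dataset that has to be present first; the function `resolve_datasets` is hypothetical and is not taken from the project.

```python
from typing import Any, Optional


def resolve_datasets(datasets: dict[str, dict[str, Any]], selected: list[str]) -> list[str]:
    """Return the selected datasets plus their prerequisites, prerequisites first."""
    ordered: list[str] = []

    def visit(name: str, trail: tuple[str, ...]) -> None:
        if name not in datasets:
            raise KeyError(f"unknown dataset: {name}")
        if name in trail:
            raise ValueError(f"circular requires chain at: {name}")
        if name in ordered:
            return
        parent: Optional[str] = datasets[name].get("requires")
        if parent is not None:
            # Resolve the prerequisite before the dataset that needs it.
            visit(parent, trail + (name,))
        ordered.append(name)

    for name in selected:
        visit(name, ())
    return ordered


# Mirrors the datasets example above (YAML null becomes Python None).
DATASETS = {
    "jira": {"label": "Jira Tickets", "requires": None},
    "jira_attachments": {"label": "Jira Attachments", "requires": "jira"},
}
print(resolve_datasets(DATASETS, ["jira_attachments"]))
```

Under that assumption, selecting only `jira_attachments` would also pull in `jira`.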
Catalog
catalog:
  categories:
    sales:
      label: "Sales"
      icon: "sales"
    hr:
      label: "HR"
      icon: "hr"
  order: ["sales", "hr"]
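To show how `order` could drive the display sequence of categories, here is a small sketch. It assumes that keys listed in `order` come first and that any category missing from `order` is appended alphabetically; that second rule is a guess, and `ordered_categories` is a hypothetical helper.

```python
from typing import Any


def ordered_categories(catalog: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (key, label) pairs in display order."""
    categories = catalog.get("categories", {})
    # Keys in `order` that do not exist as categories are ignored.
    listed = [key for key in catalog.get("order", []) if key in categories]
    # Assumption: unlisted categories follow, sorted by key.
    rest = sorted(key for key in categories if key not in listed)
    return [(key, categories[key]["label"]) for key in listed + rest]


CATALOG = {
    "categories": {
        "sales": {"label": "Sales", "icon": "sales"},
        "hr": {"label": "HR", "icon": "hr"},
    },
    "order": ["hr", "sales"],
}
print(ordered_categories(CATALOG))
```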
Environment Variables (.env)
Required
| Variable | Description |
|---|---|
| WEBAPP_SECRET_KEY | Flask session secret |
| GOOGLE_CLIENT_ID | Google OAuth client ID |
| GOOGLE_CLIENT_SECRET | Google OAuth client secret |
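A startup check for the required variables might look like the sketch below. The helper `missing_required` is hypothetical; it only inspects a mapping (defaulting to `os.environ`) and does not reflect how the application itself validates its environment.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("WEBAPP_SECRET_KEY", "GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET")


def missing_required(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_required()
    if missing:
        # Report variable names only, never their values.
        raise SystemExit("Missing required variables: " + ", ".join(missing))
```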
Data Source (Keboola)
| Variable | Description |
|---|---|
| KEBOOLA_STORAGE_TOKEN | Keboola Storage API token |
| KEBOOLA_STACK_URL | Keboola stack URL |
| KEBOOLA_PROJECT_ID | Keboola project ID |
| DATA_DIR | Data directory path |
Optional
| Variable | Description |
|---|---|
| SENDGRID_API_KEY | For password auth emails |
| TELEGRAM_BOT_TOKEN | For Telegram notifications |
| ANTHROPIC_API_KEY | For Corporate Memory AI |
| LLM_API_KEY | API key for LLM proxy (LiteLLM, OpenRouter, etc.) |
| JIRA_WEBHOOK_SECRET | For Jira integration |
| CONFIG_DIR | Override config directory path |
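The relationship between `CONFIG_DIR` and the default `config/instance.yaml` location can be sketched as follows. The helper name `instance_config_path` is hypothetical, and treating an empty `CONFIG_DIR` as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONFIG_DIR = "config"


def instance_config_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the path to instance.yaml, honoring a CONFIG_DIR override."""
    env = os.environ if env is None else env
    # Assumption: an empty CONFIG_DIR falls back to the default directory.
    config_dir = env.get("CONFIG_DIR") or DEFAULT_CONFIG_DIR
    return Path(config_dir) / "instance.yaml"


print(instance_config_path({}))
print(instance_config_path({"CONFIG_DIR": "/etc/analyst"}))
```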