docs,tests: anonymize customer references

Replace identifying customer names and infrastructure URLs in
documentation and test fixtures with generic placeholders.
Test semantics are preserved.
ZdenekSrotyr 2026-04-21 11:44:23 +02:00
parent c74a1fab53
commit bd6921c4d5
3 changed files with 14 additions and 14 deletions


@ -11,12 +11,12 @@ a shared knowledge base. Currently it's hardwired to call Anthropic's API direct
Different clients deploying this platform use different AI providers: Different clients deploying this platform use different AI providers:
| Client | AI Provider | Why | | Client profile | AI Provider | Why |
|--------|------------|-----| |----------------|------------|-----|
| Groupon | LiteLLM proxy | Corporate AI gateway, cost control, audit | | Enterprise with central AI gateway | LiteLLM proxy | Cost control, audit, policy enforcement |
| Keboola | Direct Anthropic | Simple setup, single provider | | Keboola | Direct Anthropic | Simple setup, single provider |
| Future client A | OpenRouter | Multi-model access, cost optimization | | Multi-model deployments | OpenRouter | Multi-model access, cost optimization |
| Future client B | Google Gemini | Existing Google Cloud relationship | | GCP-native stack | Google Gemini | Existing Google Cloud relationship |
**Problem**: The code only works with Anthropic. Adding a second client means duplicating **Problem**: The code only works with Anthropic. Adding a second client means duplicating
or rewriting the AI calling logic. or rewriting the AI calling logic.
@ -155,18 +155,18 @@ instance.yaml (ai: section)
Extractor routes to the right API: Extractor routes to the right API:
├─ Anthropic SDK → api.anthropic.com/v1/messages ├─ Anthropic SDK → api.anthropic.com/v1/messages
└─ OpenAI SDK → litellm.groupondev.com/v1/chat/completions └─ OpenAI SDK → litellm.example.com/v1/chat/completions
openrouter.ai/v1/chat/completions openrouter.ai/v1/chat/completions
any OpenAI-compatible endpoint any OpenAI-compatible endpoint
``` ```
### Config examples ### Config examples
**Groupon (LiteLLM proxy):** **OpenAI-compatible proxy (LiteLLM, OpenRouter, Azure OpenAI, ...):**
```yaml ```yaml
ai: ai:
provider: "openai_compat" provider: "openai_compat"
base_url: "https://litellm.groupondev.com" base_url: "https://litellm.example.com"
api_key: "${LLM_API_KEY}" api_key: "${LLM_API_KEY}"
model: "claude-haiku-4-5-20251001" model: "claude-haiku-4-5-20251001"
``` ```
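Note on the hunk above: the routing the diagram describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's connector. The function names `expand_env` and `resolve_endpoint`, the provider value `"anthropic"`, and the `${NAME}` expansion rule are hypothetical. Only `"openai_compat"`, `base_url`, `api_key`, and the two URL paths come from the documentation being changed.

```python
import os
import re

# URL taken from the routing diagram in the documentation.
ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"

# Matches ${NAME} references such as "${LLM_API_KEY}".
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace ${NAME} references with values from env (default: os.environ).

    Assumption: unknown variables expand to an empty string.
    """
    env = os.environ if env is None else env
    return _ENV_REF.sub(lambda m: env.get(m.group(1), ""), value)


def resolve_endpoint(ai):
    """Return the URL a request would go to for a given `ai:` section."""
    # Assumption: a missing provider falls back to direct Anthropic.
    provider = ai.get("provider", "anthropic")
    if provider == "anthropic":
        return ANTHROPIC_URL
    if provider == "openai_compat":
        # Any OpenAI-compatible gateway exposes the chat completions path.
        return ai["base_url"].rstrip("/") + "/v1/chat/completions"
    raise ValueError(f"unknown ai.provider: {provider!r}")
```

With the anonymized config from this commit, `resolve_endpoint` would yield `https://litellm.example.com/v1/chat/completions`, which matches the placeholder host used in the diagram.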
@ -241,7 +241,7 @@ Each client controls their own provider, model, and API gateway independently.
**In scope (v1):** **In scope (v1):**
- Anthropic direct provider (existing behavior, tested) - Anthropic direct provider (existing behavior, tested)
- OpenAI-compatible proxy provider (LiteLLM, verified against Groupon proxy) - OpenAI-compatible proxy provider (LiteLLM, verified against a production proxy deployment)
- Backward compatibility with existing `ai.anthropic_api_key` config - Backward compatibility with existing `ai.anthropic_api_key` config
- Three-layer structured output fallback - Three-layer structured output fallback
- Custom error hierarchy (auth / rate limit / timeout / format) - Custom error hierarchy (auth / rate limit / timeout / format)
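Note on the hunk above: the "auth / rate limit / timeout / format" hierarchy named in the scope list could look roughly like the sketch below. All class names, the `classify_status` helper, and the HTTP status mapping are assumptions made for illustration. The documentation in this diff names only the four categories.

```python
from typing import Type


class LLMError(Exception):
    """Hypothetical base class for all connector failures."""


class LLMAuthError(LLMError):
    """Credentials were rejected by the provider or proxy."""


class LLMRateLimitError(LLMError):
    """The provider asked the caller to slow down."""


class LLMTimeoutError(LLMError):
    """The request did not complete in time."""


class LLMFormatError(LLMError):
    """The response could not be parsed into the expected structure."""


def classify_status(status: int) -> Type[LLMError]:
    """Map an HTTP status code to an error class (assumed mapping)."""
    if status in (401, 403):
        return LLMAuthError
    if status == 429:
        return LLMRateLimitError
    if status in (408, 504):
        return LLMTimeoutError
    # Anything else falls back to the generic base class.
    return LLMError
```

A shared base class lets callers catch `LLMError` once while still handling a specific case, such as retrying only on `LLMRateLimitError`.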
@ -276,7 +276,7 @@ Each client controls their own provider, model, and API gateway independently.
- Graceful degradation when `ai:` config is missing - Graceful degradation when `ai:` config is missing
### Manual Verification (before production) ### Manual Verification (before production)
- Dry-run against actual Groupon LiteLLM proxy - Dry-run against the production LiteLLM proxy deployment
- Verify structured output works through proxy - Verify structured output works through proxy
- Verify sensitivity check works through proxy - Verify sensitivity check works through proxy
- Full collection produces valid knowledge.json - Full collection produces valid knowledge.json
@ -311,7 +311,7 @@ means existing `ai.anthropic_api_key` still works if we need to roll back.
| `docs/CONFIGURATION.md` | OSS | Add AI provider docs | | `docs/CONFIGURATION.md` | OSS | Add AI provider docs |
| `tests/test_llm_connector.py` | OSS | New: connector tests | | `tests/test_llm_connector.py` | OSS | New: connector tests |
| `tests/test_corporate_memory.py` | OSS | New/expanded: behavior tests | | `tests/test_corporate_memory.py` | OSS | New/expanded: behavior tests |
| `config/instance.yaml` | Instance | Add ai: section for Groupon | | `config/instance.yaml` | Instance | Add ai: section for the target provider |
| `.github/workflows/deploy.yml` | Instance | Add LLM_API_KEY to .env | | `.github/workflows/deploy.yml` | Instance | Add LLM_API_KEY to .env |
| `env.example` | Instance | Document LLM_API_KEY | | `env.example` | Instance | Document LLM_API_KEY |


@ -6,7 +6,7 @@
## 1. Problem Statement ## 1. Problem Statement
The platform was built iteratively as an internal tool and needs to become a product for external customers (Groupon, others). Key problems: The platform was built iteratively as an internal tool and needs to become a product for external customers. Key problems:
1. **Fragile filesystem state** — 10+ JSON files, permission conflicts between processes (www-data, deploy, root, user) cause outages 1. **Fragile filesystem state** — 10+ JSON files, permission conflicts between processes (www-data, deploy, root, user) cause outages
2. **No API** — all operations via SSH + bash scripts, no programmatic control 2. **No API** — all operations via SSH + bash scripts, no programmatic control
@ -487,7 +487,7 @@ tests/
1. **Greenfield demo** — build new system from scratch with sample Keboola data 1. **Greenfield demo** — build new system from scratch with sample Keboola data
2. **Validate** — end-to-end: setup → sync → query → scripts → notifications 2. **Validate** — end-to-end: setup → sync → query → scripts → notifications
3. **Migrate internal** — point new system at Keboola internal, migrate users 3. **Migrate internal** — point new system at Keboola internal, migrate users
4. **Migrate Groupon** — deploy new system for Groupon with their config 4. **Migrate first external customer** — deploy new system with their config
5. **Deprecate old** — remove old server infrastructure 5. **Deprecate old** — remove old server infrastructure
## 10. Reused Code ## 10. Reused Code


@ -433,7 +433,7 @@ class TestStripHtml:
'<p><strong>Business name: </strong>Live Deals</p>' '<p><strong>Business name: </strong>Live Deals</p>'
'<p><strong>Purpose:</strong></p>' '<p><strong>Purpose:</strong></p>'
'<p>The&nbsp;<em>Live deals</em>&nbsp;metric measures the&nbsp;breadth ' '<p>The&nbsp;<em>Live deals</em>&nbsp;metric measures the&nbsp;breadth '
'of active, purchasable supply on Groupon.</p>' 'of active, purchasable supply on the platform.</p>'
) )
result = strip_html(html_desc) result = strip_html(html_desc)
assert "<" not in result assert "<" not in result