docs,tests: anonymize customer references

Replace identifying customer names and infrastructure URLs in
documentation and test fixtures with generic placeholders.
Test semantics preserved.
ZdenekSrotyr 2026-04-21 11:44:23 +02:00
parent c74a1fab53
commit bd6921c4d5
3 changed files with 14 additions and 14 deletions

@@ -11,12 +11,12 @@ a shared knowledge base. Currently it's hardwired to call Anthropic's API direct
Different clients deploying this platform use different AI providers:
| Client | AI Provider | Why |
|--------|------------|-----|
| Groupon | LiteLLM proxy | Corporate AI gateway, cost control, audit |
| Client profile | AI Provider | Why |
|----------------|------------|-----|
| Enterprise with central AI gateway | LiteLLM proxy | Cost control, audit, policy enforcement |
| Keboola | Direct Anthropic | Simple setup, single provider |
| Future client A | OpenRouter | Multi-model access, cost optimization |
| Future client B | Google Gemini | Existing Google Cloud relationship |
| Multi-model deployments | OpenRouter | Multi-model access, cost optimization |
| GCP-native stack | Google Gemini | Existing Google Cloud relationship |
**Problem**: The code only works with Anthropic. Adding a second client means duplicating
or rewriting the AI calling logic.
@@ -155,18 +155,18 @@ instance.yaml (ai: section)
Extractor routes to the right API:
├─ Anthropic SDK → api.anthropic.com/v1/messages
└─ OpenAI SDK → litellm.groupondev.com/v1/chat/completions
└─ OpenAI SDK → litellm.example.com/v1/chat/completions
openrouter.ai/v1/chat/completions
any OpenAI-compatible endpoint
```
### Config examples
**Groupon (LiteLLM proxy):**
**OpenAI-compatible proxy (LiteLLM, OpenRouter, Azure OpenAI, ...):**
```yaml
ai:
provider: "openai_compat"
base_url: "https://litellm.groupondev.com"
base_url: "https://litellm.example.com"
api_key: "${LLM_API_KEY}"
model: "claude-haiku-4-5-20251001"
```
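The routing shown in the diagram and the config above can be sketched roughly as follows. This is a minimal illustration, not the project's connector: the `Route` dataclass, the `resolve_route` function, and the provider name `"anthropic"` are hypothetical names introduced here. Only `openai_compat`, `base_url`, `api_key`, `model`, and the legacy `ai.anthropic_api_key` key appear in the document.

```python
import os
from dataclasses import dataclass

# Endpoint taken from the routing diagram above.
ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"


@dataclass
class Route:
    """Hypothetical container for a resolved provider route."""
    sdk: str      # "anthropic" or "openai"
    url: str      # full endpoint URL
    api_key: str  # key with ${VAR} references expanded
    model: str


def resolve_route(ai: dict) -> Route:
    """Map an `ai:` config section to an SDK family and endpoint URL."""
    # Assumption: a config that only has the legacy `anthropic_api_key`
    # is treated as a direct Anthropic setup (backward compatibility).
    if "provider" not in ai and "anthropic_api_key" in ai:
        ai = {
            "provider": "anthropic",  # hypothetical provider name
            "api_key": ai["anthropic_api_key"],
            "model": ai.get("model", ""),
        }

    provider = ai["provider"]
    # Expand references such as "${LLM_API_KEY}" from the environment.
    api_key = os.path.expandvars(ai.get("api_key", ""))

    if provider == "anthropic":
        return Route("anthropic", ANTHROPIC_URL, api_key, ai.get("model", ""))
    if provider == "openai_compat":
        base = ai["base_url"].rstrip("/")
        return Route("openai", f"{base}/v1/chat/completions", api_key, ai["model"])
    raise ValueError(f"unknown ai.provider: {provider!r}")
```

Under these assumptions, any OpenAI-compatible gateway (LiteLLM, OpenRouter, and similar) differs only in `base_url`, so the branch for `openai_compat` does not need to know which gateway it is talking to.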
@@ -241,7 +241,7 @@ Each client controls their own provider, model, and API gateway independently.
**In scope (v1):**
- Anthropic direct provider (existing behavior, tested)
- OpenAI-compatible proxy provider (LiteLLM, verified against Groupon proxy)
- OpenAI-compatible proxy provider (LiteLLM, verified against a production proxy deployment)
- Backward compatibility with existing `ai.anthropic_api_key` config
- Three-layer structured output fallback
- Custom error hierarchy (auth / rate limit / timeout / format)
@@ -276,7 +276,7 @@ Each client controls their own provider, model, and API gateway independently.
- Graceful degradation when `ai:` config is missing
### Manual Verification (before production)
- Dry-run against actual Groupon LiteLLM proxy
- Dry-run against the production LiteLLM proxy deployment
- Verify structured output works through proxy
- Verify sensitivity check works through proxy
- Full collection produces valid knowledge.json
@@ -311,7 +311,7 @@ means existing `ai.anthropic_api_key` still works if we need to roll back.
| `docs/CONFIGURATION.md` | OSS | Add AI provider docs |
| `tests/test_llm_connector.py` | OSS | New: connector tests |
| `tests/test_corporate_memory.py` | OSS | New/expanded: behavior tests |
| `config/instance.yaml` | Instance | Add ai: section for Groupon |
| `config/instance.yaml` | Instance | Add ai: section for the target provider |
| `.github/workflows/deploy.yml` | Instance | Add LLM_API_KEY to .env |
| `env.example` | Instance | Document LLM_API_KEY |

@@ -6,7 +6,7 @@
## 1. Problem Statement
The platform was built iteratively as an internal tool and needs to become a product for external customers (Groupon, others). Key problems:
The platform was built iteratively as an internal tool and needs to become a product for external customers. Key problems:
1. **Fragile filesystem state** — 10+ JSON files, permission conflicts between processes (www-data, deploy, root, user) cause outages
2. **No API** — all operations via SSH + bash scripts, no programmatic control
@@ -487,7 +487,7 @@ tests/
1. **Greenfield demo** — build new system from scratch with sample Keboola data
2. **Validate** — end-to-end: setup → sync → query → scripts → notifications
3. **Migrate internal** — point new system at Keboola internal, migrate users
4. **Migrate Groupon** — deploy new system for Groupon with their config
4. **Migrate first external customer** — deploy new system with their config
5. **Deprecate old** — remove old server infrastructure
## 10. Reused Code

@@ -433,7 +433,7 @@ class TestStripHtml:
'<p><strong>Business name: </strong>Live Deals</p>'
'<p><strong>Purpose:</strong></p>'
'<p>The&nbsp;<em>Live deals</em>&nbsp;metric measures the&nbsp;breadth '
'of active, purchasable supply on Groupon.</p>'
'of active, purchasable supply on the platform.</p>'
)
result = strip_html(html_desc)
assert "<" not in result