agnes-the-ai-analyst/docs/archive/security-audit-2026-04.md
ZdenekSrotyr a48524509a
docs: consolidate and de-clutter the documentation tree (#306)
CLAUDE.md rewritten (708 -> ~320 lines): four overlapping release
sections collapsed to one, stale v1->v35 schema history dropped (it
lives in CHANGELOG), marketplace endpoint internals and verbose
process sections moved out or tightened.

New focused docs:
- docs/RELEASING.md - release process, deploy workflows, CI quirks
  (RELEASE_TEMPLATE.md folded in as an appendix)
- docs/marketplace.md - marketplace ingestion + re-serving internals
- docs/README.md - documentation index by audience, linked from
  README.md and CLAUDE.md

Archived under docs/archive/: docs/superpowers/ (52 historical
planning artifacts), HACKATHON.md, pd-ps-comments.md,
security-audit-2026-04.md, future/NOTIFICATIONS.md.

Removed the docs/auto-install.md stub. Fixed dangling links in
connectors/jira/README.md and dev_docs/README.md, repointed
code/doc references to archived paths.
2026-05-14 18:54:22 +00:00

16 KiB
Raw Permalink Blame History

Security audit — Agnes AI Data Analyst

Archived. This is a point-in-time audit snapshot. Findings have been triaged and addressed in subsequent releases — check CHANGELOG.md (search for the relevant file/endpoint) for the fixes that landed since 2026-04-22. Kept for historical reference; not a live tracker.

Date: 2026-04-22 Branch audited: main at commit cbb7733 Method: parallel review passes over four scope areas — (1) secrets/SQLi/authz/SSRF, (2) auth flows & route wiring, (3) templates & UI wiring/XSS, (4) data layer & config & dead code. Findings deduped across the passes, severities adjusted to real-world exploitability.

Known issues already in flight are marked with their tracking links so we do not re-open them.


Top 5 — fix first

1. [CRITICAL] Script-API sandbox escape → RCE for the analyst role

  • File: app/api/scripts.py:116180
  • Required role: Role.ANALYST (not admin)
  • Trigger: POST /api/scripts/run with body:
    ().__class__.__base__.__subclasses__()[N].__init__.__globals__["system"]("id")
    
  • Why existing guards miss it: the AST walker and the string allowlist block direct import os / exec, but neither stops attribute traversal through Python's class hierarchy (__class__ → __base__ → __subclasses__() → __globals__).
  • Impact: arbitrary OS commands under the FastAPI process uid. Gives access to DATA_DIR (DuckDB files, cached parquet), .jwt_secret, env vars, and any credentials mounted into the container.
  • Fix (minimum): add to the string-pattern blocklist: __subclasses__, __globals__, __mro__, __bases__, __class__, __dict__, __code__. In the AST walker, reject any ast.Attribute whose attr starts and ends with __.
  • Fix (correct): do not run untrusted Python in-process. Either drop server-side script execution entirely, or run the sandbox in nsjail/gVisor/Pyodide in isolation, or gate the endpoint behind Role.ADMIN if it must stay.
  • Confidence: broken (verified analytically).

2. [HIGH] /auth/password/reset endpoint missing — "Forgot Password?" returns 404

  • Template reference: app/web/templates/login_email.html:47<form method="POST" action="/auth/password/reset">
  • URL map: app/web/router.py:119"password_auth.reset_request": "/auth/password/reset"
  • Backend: app/auth/providers/password.py only registers /login, /login/web, /setup. No /reset handler is wired.
  • Related dead code: templates password_reset.html and password_setup.html exist but no route renders them — indicates an abandoned reset flow.
  • Tracking: tracked upstream in the auth-CLI repo
  • Confidence: broken.

3. [HIGH] No rate limiting on any auth endpoint

  • Files: app/auth/providers/password.py:36, app/auth/providers/email.py:53, app/auth/router.py:58, app/main.py (middleware).
  • Evidence: grep -rn "slowapi\|limiter\|throttle\|ratelimit" — zero hits in app/.
  • Impact:
    • Password brute-force against POST /auth/password/login and POST /auth/token.
    • Email bombing on POST /auth/email/send-link — attacker floods SMTP/SendGrid quota by looping with random recipients.
    • Enumeration of bootstrap state via POST /auth/bootstrap.
  • Fix: add slowapi with @limiter.limit("10/minute") on the four endpoints above, get_remote_address as key with proxy-aware client-IP extraction (project already has _client_ip helper in app/auth/dependencies.py).
  • Confidence: broken.

4. [HIGH] Open-redirect bypass via backslash in safe_next_path

  • Files: app/auth/_common.py:10-24, app/auth/providers/password.py:95-96, app/web/router.py:218-219.
  • Trigger: https://agnes/login?next=/\evil.com
  • Why the current check fails: Python sees /\ (not //) and the guard startswith("//") does not fire. Every major browser normalizes \ to / in URL paths, so Location: /\evil.com resolves as //evil.com — a cross-origin redirect.
  • Impact: phishing — attacker crafts a link on the victim's Agnes URL, lands them on a lookalike after login.
  • Fix:
    if len(candidate) > 1 and candidate[1] in ("/", "\\"):
        return default
    
    Existing tests (tests/test_web_ui.py:270-296) cover //evil.example/ but not /\evil.com — add the case.
  • Confidence: broken.

5. [HIGH] SSRF in /api/admin/configurekeboola_url accepted as-is

  • File: app/api/admin.py:163282; the URL is passed straight to KeboolaClient.test_connection() which issues a GET request.
  • Trigger: a compromised (or insider-threat) admin sends {"keboola_url": "http://169.254.169.254/latest/meta-data/"} or http://localhost:5432/.
  • Impact: server as SSRF proxy to the private network — GCP/AWS instance metadata service (IAM tokens), internal databases, LAN services.
  • Why a domain allowlist is wrong: keboola_url is the URL of the Keboola stack the Agnes instance connects to, not the Agnes host. Valid values include connection.keboola.com, connection.eu-central-1.keboola.com, connection.europe-west3.gcp.keboola.com, plus potentially self-hosted private Keboola stacks. A .keboola.com suffix check would break legitimate deployments.
  • Correct fix: enforce https:// scheme, then resolve the hostname and reject any result in a private / loopback / link-local / reserved block. Still allows any public HTTPS host.
    from urllib.parse import urlparse
    import ipaddress, socket
    
    def _validate_stack_url(url: str) -> str | None:
        try:
            p = urlparse(url)
        except Exception:
            return "not a valid URL"
        if p.scheme != "https":
            return "must use https"
        if not p.hostname:
            return "missing hostname"
        try:
            infos = socket.getaddrinfo(p.hostname, None)
        except socket.gaierror:
            return f"cannot resolve {p.hostname}"
        for fam, _, _, _, sa in infos:
            try:
                ip = ipaddress.ip_address(sa[0])
            except ValueError:
                continue
            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
                return f"{p.hostname} resolves to a private/loopback address ({ip})"
        return None
    
  • Residual risk: DNS rebinding (hostname resolves to a public IP at validation time, then to loopback at request time). Out of scope for this fix — would need an outbound egress proxy with an IP-level ACL.
  • Confidence: broken.

Second tier (HIGH)

# Category File:line Summary Confidence
6 race app/auth/providers/email.py:106-128 _consume_token read-validate-clear is not atomic. Two concurrent clicks on the same magic link can both issue JWTs. Fix: UPDATE users SET reset_token=NULL WHERE id=? AND reset_token=? and check rowcount == 1 before issuing the JWT. broken
7 bootstrap app/auth/router.py:103-158 Check is "any user with password_hash". If a seed admin exists without a password (e.g. created by LOCAL_DEV_MODE=1 then redeployed without it, or a SEED_ADMIN_EMAIL without SEED_ADMIN_PASSWORD), /auth/bootstrap stays open and any caller can register a new admin account. Fix: check "any user at all" or require an explicit admin token for bootstrap. broken
8 cookie app/main.py:61 SessionMiddleware(secret_key=...) without max_age → OAuth session cookie expires only when the browser closes. Also https_only is not forced in production. Fix: max_age=3600 and https_only=bool(os.environ.get("DOMAIN")). broken
9 sqli-adjacent connectors/keboola/extractor.py:104-106, 128 CREATE OR REPLACE VIEW "{table_name}" AS SELECT * FROM kbc."{bucket}"."{source_table}" — f-string DDL with identifier quotes. Inputs come from table_registry (admin-controlled), but relying on quote-escaping for DDL is fragile. Fix: validate each identifier against ^[a-zA-Z0-9_]{1,63}$ before interpolation. suspicious
10 datetime connectors/keboola/client.py:183 Same offset-naive vs. offset-aware bug as the one we just fixed in app/auth/providers/email.py. datetime.now() - cached_time crashes or treats every cache entry as stale, causing needless API calls. Fix: use datetime.now(timezone.utc) on both sides. broken

Third tier (MEDIUM)

# Category File:line Summary
11 timing email.py:113, password.py:117, dependencies.py:148 Token / token_hash compared with != instead of hmac.compare_digest. 32-byte entropy mitigates the brute side, but constant-time comparison is a zero-cost hardening.
12 xss-latent app/web/templates/_theme.html:12 {{ var }}: {{ val }}; inside a <style> block. Jinja auto-escape does not escape for CSS context. Today the values come from instance.yaml (FS-write), so risk is low — but it escalates to HIGH if a /api/admin/theme endpoint is ever added. Validate against ^[a-zA-Z0-9#()., %-]{1,50}$.
13 config-drift app/instance_config.py:21-23 _instance_config is cached in a module-level global with no invalidation — changes via /api/admin/configure or direct file edits do not take effect until the process restarts. Fix: file-mtime watch, or explicit reload=True kwarg on load_instance_config.
14 rbac-inconsistency app/auth/dependencies.py:203-210 require_admin does an exact-match role == "admin" check, while require_role(Role.ADMIN) respects the hierarchy. Risk: if a higher role is ever added, require_admin rejects it. Fix: replace all call sites with require_role(Role.ADMIN), delete require_admin.
15 sqli-defense src/repositories/knowledge.py:49-58 update(**fields) builds SET clause from fields.keys() without an allowlist (unlike users.py and metrics.py). Today the API layer filters to a safe set of keys, but one future caller with user-controlled **fields turns this into SQL column injection. Add an ALLOWED_KNOWLEDGE_FIELDS set in the repository layer.
16 silent-fail services/scheduler/__main__.py:36-48 If SCHEDULER_API_TOKEN is missing and auto-fetch fails, the scheduler proceeds with an empty token. API calls then 401 silently and no sync runs. Fix: logger.error() + non-zero exit.
17 dos-input src/repositories/audit.py:38-49 limit is a user-supplied int passed straight into SQL LIMIT. limit=1_000_000 → OOM on large audit tables. Fix: limit = min(max(limit, 1), 1000).
18 pat-audit-only app/auth/dependencies.py:151-173 New-IP use of a PAT logs to audit but neither blocks the request nor notifies the owner. A leaked PAT keeps working until someone reviews the audit log. Consider auto-revoke on new IP for tokens older than N days, or at least a notification hook.
19 cookie-inconsistency google.py:115 vs password.py:93, email.py:156 Three different rules for secure=: Google uses TESTING != 1, the others use DOMAIN != "". Unify on one helper (_is_secure_context() keyed on DOMAIN).

Fourth tier (LOW) — debt, not incidents

# Category File:line Summary
20 path-traversal app/api/upload.py:77 Target path is md_dir / f"{user_email}.md". user_email comes from the JWT (trusted after authn) but is not sanitized — theoretical exploit requires being able to register an account with ../ in the email, which depends on the OAuth provider / domain filter. upload_session and upload_artifact correctly use Path(raw).name. Fix: use user["id"] as the filename, or re.sub(r"[^a-zA-Z0-9@._-]", "_", email).
21 info-disclosure app/api/users.py:209 POST /api/users/{id}/reset-password returns {"reset_token": "..."} in plaintext. Admin-only, but if the response is logged (Nginx access log, load balancer, audit proxy) the token is captured. Return "email_sent": true and email the token, or truncate the token in the log path.
22 dead-code app/web/templates/password_reset.html, password_setup.html, login_magic_link_sent.html Templates exist, no Python route renders them. Delete or finish the flow they belong to (password reset, magic-link "check your email" confirmation).
23 empty-schedule src/scheduler.py:92-99 Empty schedule string silently returns "not due" — tables with misconfigured schedules never sync, no warning. Log at WARNING when a schedule doesn't match any known pattern.
24 silent-excepts src/db.py:254, 286, 309, 325, 368, 382, 489, 546, src/orchestrator.py:121 except Exception: pass. Some are legitimate (optional CHECKPOINT on read-only DBs), others mask migration failures. Triage each; log at least at DEBUG.

Verified non-issues (do not re-open)

Several patterns looked scary at first glance but are correctly defended:

  • UserRepository.update() SQL injection — one audit pass flagged this as CRITICAL, but the allowed = {...} allowlist at src/repositories/users.py:50-56 drops any column not in the set before the SET clause is built. Safe.
  • Orchestrator DETACH / ATTACH — identifiers are validated through _SAFE_IDENTIFIER = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,63}$") in src/orchestrator.py before interpolation.
  • CORS wildcard + credentials — default CORS_ORIGINS is localhost:3000,localhost:8000, not *. Starlette's CORSMiddleware also refuses to combine * with allow_credentials=True.
  • yaml.load unsafe variant — not present anywhere. Every YAML parser uses yaml.safe_load.
  • Hardcoded secrets in the repo — no real secrets in .py/.yaml/.md. config/deploy.yml references env vars (${KAMAL_REGISTRY_PASSWORD}). Git history has no .env leak.
  • LOCAL_DEV_MODE reachable via HTTP — the flag is read from os.environ only; no request header or query parameter toggles it.
  • IDOR on /auth/tokens/{token_id} — handler checks row["user_id"] != user["id"] before returning or modifying the token. Correct.
  • Jira webhook signatureconnectors/jira/webhook.py uses hmac.compare_digest. Correct.

Coverage gaps — what this audit did not check

Area Why skipped
connectors/jira/transform.py, incremental_transform.py Scope + complex transformation logic
services/corporate_memory/*, services/telegram_bot/*, services/ws_gateway/* Not read in depth
Frontend JS in app/web/static/*.js Audit #3 focused on HTML templates
async / await correctness (blocking calls inside async handlers) Not traced
Automated scanners (bandit, semgrep, trivy, pip-audit) Not installed in the audit environment
Dependency CVE audit 73 packages in uv.lock, not walked manually
pytest run Test suite not executed — some findings may already be covered by existing tests
Docker hardening (Dockerfile — user namespaces, capabilities) Out of scope
infra/modules/customer-instance/ (Terraform / startup scripts) Out of scope

Honest estimate: roughly 6570 % of the real attack surface is covered. To close the gap I would run bandit -r app/ src/ connectors/, pip-audit, execute the pytest suite, and review outbound traffic logs from staging.


Proposed action plan

This week:

  • #1 script sandbox escape — one issue, RCE severity, own ticket
  • #3 rate limiting on auth — slowapi introduction, one PR
  • #4 backslash open-redirect — two-line fix plus test

Next sprint:

  • #5 SSRF validator in /api/admin/configure (with the IP-range variant above, not a .keboola.com suffix)
  • #6 atomic magic-link token consumption
  • #7 bootstrap hardening
  • #8 SessionMiddleware max_age + https_only

Backlog (single tracking issue with checkboxes):

  • #1124 — timing comparisons, theme XSS gating, config reload, RBAC unification, schedule validation, silent-except triage, dead templates cleanup.

The missing /auth/password/reset endpoint (#2) is already tracked upstream in the auth-CLI repo.