ci: shard test suite + drop duplicate test run (#311)

The `test` job in ci.yml becomes a 4-way `test-shard` matrix (pytest-split, balanced by a committed .test_durations), aggregated into a single `test` status check so branch protection is unchanged. release.yml's duplicate full-suite `test` job is removed — it re-ran the same ~10 min suite a second time on every push to main/feature branches. release.yml is now image-build only; the advisory ruff/mypy steps move to a lean `lint` job in ci.yml. Net: ~10 min -> ~3 min wall-clock per push, and the suite runs once instead of twice.
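To illustrate why balancing shards by recorded durations shortens wall-clock time, here is a minimal Python sketch of duration-based sharding. It is not pytest-split's implementation: the greedy longest-first strategy, the function names `split_by_duration` and `shard_totals`, and the `EXAMPLE` timings are all assumptions made for illustration.

```python
import heapq


def split_by_duration(durations: dict[str, float], splits: int) -> list[list[str]]:
    """Assign each test to the currently lightest shard, longest tests first."""
    # Min-heap of (accumulated_seconds, shard_index): the lightest shard is on top.
    heap = [(0.0, index) for index in range(splits)]
    heapq.heapify(heap)
    shards: list[list[str]] = [[] for _ in range(splits)]
    # Sort by descending duration; break ties by test ID so the result is stable.
    for test_id, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        shards[index].append(test_id)
        heapq.heappush(heap, (total + seconds, index))
    return shards


def shard_totals(durations: dict[str, float], shards: list[list[str]]) -> list[float]:
    """Sum the recorded seconds of the tests in each shard."""
    return [sum(durations[test_id] for test_id in shard) for shard in shards]


# Hypothetical timings (seconds); real values would come from recorded durations.
EXAMPLE = {
    "tests/test_a.py::test_slow": 8.0,
    "tests/test_b.py::test_medium": 5.0,
    "tests/test_c.py::test_medium": 4.0,
    "tests/test_d.py::test_quick": 3.0,
    "tests/test_e.py::test_quick": 2.0,
    "tests/test_f.py::test_quick": 2.0,
}
```

With these hypothetical timings, `shard_totals(EXAMPLE, split_by_duration(EXAMPLE, 4))` gives 8, 5, 6 and 5 seconds, so the slowest shard takes 8 seconds instead of the 24-second serial total.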
2026-05-14 22:18:21 +02:00
commit a1c7849b3e
parent 6a4b3ba461
5 changed files with 4734 additions and 42 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -10,8 +10,12 @@ on:
  workflow_dispatch:
 jobs:
-  test:
+  test-shard:
    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        group: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v6
@@ -25,11 +29,56 @@ jobs:
      - name: Install dependencies
        run: uv pip install --system ".[dev,server]"
-      - name: Run tests (parallel)
-        run: pytest tests/ -v --tb=short -n auto
+      - name: Run tests (shard ${{ matrix.group }}/4)
+        # pytest-split shards the suite across 4 parallel jobs, balanced by
+        # the committed `.test_durations` file; `-n auto` parallelises
+        # within each shard across the runner's cores. Regenerate durations
+        # with `pytest tests/ --store-durations -n auto` when the suite
+        # drifts enough that shards become uneven.
+        run: pytest tests/ -v --tb=short -n auto --splits 4 --group ${{ matrix.group }}
        env:
          TESTING: "1"
+  # Single required status check. Branch protection requires `test`, but the
+  # matrix above publishes `test-shard (1..4)` — this job aggregates them
+  # into one `test` result so no branch-protection change is needed.
+  test:
+    needs: test-shard
+    if: always()
+    runs-on: ubuntu-latest
+    steps:
+      - name: Verify all test shards passed
+        run: |
+          if [ "${{ needs.test-shard.result }}" != "success" ]; then
+            echo "::error::test-shard result was '${{ needs.test-shard.result }}' — one or more shards failed"
+            exit 1
+          fi
+          echo "All 4 test shards passed."
+  lint:
+    # Advisory only (continue-on-error) — ruff + mypy surface issues but
+    # never gate. Split out of release.yml's old test job; runs without the
+    # full dependency install since neither tool needs it.
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+      - uses: actions/setup-python@v6
+        with:
+          python-version: "3.13"
+      - name: Lint with ruff
+        run: |
+          pip install ruff
+          ruff check . || true
+        continue-on-error: true
+      - name: Type check with mypy
+        run: |
+          pip install mypy
+          mypy src/ app/ cli/ connectors/ --ignore-missing-imports --no-error-summary || true
+        continue-on-error: true
  cli-wheel-clean-install:
    # Catches the "wheel METADATA conflicts with transitive deps under fresh
    # resolver" class — exactly what the workspace-only `[tool.uv]
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -43,46 +43,22 @@ concurrency:
  cancel-in-progress: true
 jobs:
-  test:
-    # Skip the `create` event for tags — those are owned by keboola-deploy.yml
-    # and shouldn't double-build here. Branch creates DO run.
-    if: github.event_name != 'create' || github.event.ref_type == 'branch'
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v6
-
-      - uses: actions/setup-python@v6
-        with:
-          python-version: "3.13"
-
-      - name: Install uv
-        uses: astral-sh/setup-uv@v7
-
-      - name: Install dependencies
-        run: uv pip install --system ".[dev,server]"
-      - name: Lint with ruff
-        run: |
-          pip install ruff
-          ruff check . || true
-        continue-on-error: true  # Don't block on pre-existing lint issues; can tighten later
-      - name: Type check with mypy
-        run: |
-          pip install mypy
-          mypy src/ app/ cli/ connectors/ --ignore-missing-imports --no-error-summary || true
-        continue-on-error: true  # Don't block on mypy initially, can tighten later
-      - name: Run tests
-        # `-n auto` parallelises across runner CPU cores; matches ci.yml.
-        # Single-threaded was taking 15-20 min and frequently tripping
-        # branch-protection waits on the parallel CI workflow.
-        run: pytest tests/ -v --tb=short -n auto
-        env:
-          TESTING: "1"
+  # Tests + lint live in `ci.yml` (the sharded `test-shard` matrix and the
+  # `lint` job). `release.yml` is the image-build pipeline only — it no
+  # longer re-runs the suite, which previously meant the full ~10 min test
+  # job ran twice on every push to main/feature branches.
+  #
+  # Tradeoff: `build-and-push` no longer has `needs: test`, so on a push to
+  # `main` the `:stable` image publishes *concurrently* with `ci.yml`'s
+  # tests on the merge commit — not gated behind them. What still protects
+  # `main`: (1) branch protection requires `ci.yml`'s `test` + `docker-build`
+  # to pass before a PR can merge, so merged code was tested at PR time;
+  # (2) the smoke-test + auto-rollback job below catches a critically broken
+  # `:stable`. A post-merge test failure on the merge commit itself (rare —
+  # flaky test or merge skew) would not block the image; that is the
+  # accepted cost of not running the suite twice. `build-and-push` is gated
+  # only by its own `if:` below.
  build-and-push:
-    needs: test
    # Publish on:
    #   - any push (main → :stable-* / non-main → :dev-* + :dev-<slug>);
    #   - branch creation (a fresh branch off main with no extra commits
--- a/.test_durations
+++ b/.test_durations
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,9 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
 ## [Unreleased]
 ### Internal
+- CI test suite sharded for speed. The `test` job in `.github/workflows/ci.yml` is now a `test-shard` matrix — 4 parallel jobs via `pytest-split`, balanced by a committed `.test_durations` file — aggregated into a single `test` status check so branch protection needs no change. The duplicate full-suite `test` job in `release.yml` is removed (it re-ran the same ~10 min suite a second time on every push to main/feature branches); `release.yml` is now image-build only, with the advisory ruff/mypy steps moved to a lean `lint` job in `ci.yml`. Net: ~10 min → ~3 min wall-clock per push, and the suite runs once instead of twice. Adds `pytest-split` to the `dev` extra.
 ## [0.54.16] — 2026-05-14
 ### Fixed
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -137,6 +137,10 @@ dev = [
    "pytest>=9.0.0",
    "pytest-timeout>=2.0.0",
    "pytest-xdist>=3.0.0",
+    # pytest-split shards the suite across parallel CI jobs (`--splits N
+    # --group K`); see the `test-shard` matrix in `.github/workflows/ci.yml`.
+    # Balanced by the committed `.test_durations` file.
+    "pytest-split>=0.9.0",
    "faker>=24.0.0",
    # jsonschema validates the corporate-memory extraction-tool golden fixtures
    # under tests/test_corporate_memory_v1.py (extraction.json, correction.json,
@@ -167,6 +171,7 @@ dev-dependencies = [
    "pytest>=9.0.0",
    "pytest-timeout>=2.0.0",
    "pytest-xdist>=3.0.0",
+    "pytest-split>=0.9.0",
    "faker>=24.0.0",
    "anthropic>=0.30.0",
    "openai>=1.30.0",