ci(aprender): diff-scoped blocking mutation gate (coverage ratchet deferred — root facade)#2239

Open
noahgift wants to merge 3 commits into main from feat/coverage-mutation-gates

@noahgift

Gap

PMAT build-system audit gap #1: the two strongest quality signals were advisory only, so a regression could merge silently despite the "95% coverage / 80% mutation / ZERO tolerance" rule:

  • coverage — measured by cargo llvm-cov but never gated (codecov upload is continue-on-error).
  • mutants — full-tree, push-to-main only, continue-on-error at both job and step level → a surviving mutant blocked nothing.

aprender is the pilot for closing both. This is NOT enabled fleet-wide.

Change

1. Coverage ratchet (opt-in input added in paiml/.github#37):

  • coverage_min: "90.0" — a deliberately conservative ratchet floor, well below the documented achieved 96.94% line coverage (.pmat-gates.toml, .pmat-metrics.toml). The CI coverage job is --lib-scoped, so its measured % differs from the certeza full-suite number; we floor conservatively and tighten via the committed baseline.
  • coverage_baseline_file: ".github/coverage-baseline.txt" — committed baseline, seeded to 90.0. (Not .pmat/... because aprender gitignores .pmat/.) Effective floor = max(coverage_min, baseline), so the first gated run cannot break green.
  • The coverage job result is already wired into ci / gate, so a drop now blocks merge.
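
The floor arithmetic described above can be sketched as follows. This is a minimal illustration, not the reusable workflow's actual step: the function names are hypothetical, and the fallback to `coverage_min` when the baseline file is missing or empty is an assumption.

```python
from typing import Optional


def effective_floor(coverage_min: float, baseline_text: Optional[str]) -> float:
    """Return max(coverage_min, baseline).

    Assumption: a missing or empty baseline file falls back to coverage_min.
    """
    if baseline_text is None or not baseline_text.strip():
        return coverage_min
    return max(coverage_min, float(baseline_text.strip()))


def passes_ratchet(measured: float, coverage_min: float,
                   baseline_text: Optional[str]) -> bool:
    """True when the measured line coverage is at or above the effective floor."""
    return measured >= effective_floor(coverage_min, baseline_text)


# Values from this PR: floor 90.0, baseline seeded to 90.0, achieved 96.94.
first_run_ok = passes_ratchet(96.94, 90.0, "90.0\n")
```

With the baseline seeded to the same value as `coverage_min`, the effective floor stays at 90.0; raising the committed baseline later tightens the gate without touching the workflow input.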

2. Diff-scoped mutation gate — the mutants job is rewritten:

| | before | after |
|---|---|---|
| scope | full tree (-- --lib) | PR diff (--in-diff pr.diff) |
| trigger | push to main only | pull_request |
| blocking | continue-on-error x2 | blocking; wired into gate |
| runtime | hours (choked the queue) | minutes (∝ diff size) |

A diff with no mutable code is a clean no-op pass. Threshold is MUTANTS_MAX_MISSED (default 0, tunable via repo var). On push-to-main the job is skipped (no PR diff); gate treats skipped as pass so main pushes are never blocked.
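
The decision logic in the paragraph above can be sketched roughly as below. The function names, the result strings, and the reading of the threshold as "fail when missed exceeds MUTANTS_MAX_MISSED" are assumptions made for illustration; the real job is a workflow definition, not Python.

```python
from typing import Mapping, Optional


def max_missed_from_env(env: Mapping[str, str]) -> int:
    """Read MUTANTS_MAX_MISSED, defaulting to 0 when unset or empty."""
    return int(env.get("MUTANTS_MAX_MISSED", "0") or "0")


def mutants_result(event_name: str, missed: Optional[int], max_missed: int) -> str:
    """Hypothetical outcome of the mutants job.

    missed is None when the diff contains no mutable code.
    """
    if event_name != "pull_request":
        return "skipped"  # push to main: there is no PR diff to mutate
    if missed is None:
        return "success"  # clean no-op pass
    return "success" if missed <= max_missed else "failure"


def gate_passes(job_results: Mapping[str, str],
                allow_skipped: tuple = ("mutants",)) -> bool:
    """Gate passes when every job succeeded, treating a skipped mutants job as a pass."""
    for job, result in job_results.items():
        if result == "success":
            continue
        if result == "skipped" and job in allow_skipped:
            continue
        return False
    return True
```

For example, `gate_passes({"ci": "success", "mutants": "skipped"})` is true, which is the push-to-main case, while a single missed mutant on a pull request with the default threshold yields `"failure"` and blocks the gate.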

Safety — does not break the currently-green build

  • Coverage: floor 90.0 ≪ achieved 96.94%, and the baseline equals the floor, so max() is 90.0 — the first gated run passes by a wide margin. The enforcement step parses the already-produced lcov.info (no extra compile/test cost).
  • Mutation: scoped to the PR diff only, so it never re-runs the hours-long full tree; an empty/non-Rust diff is a no-op pass.
  • No cycle: mutants now needs: [ci, workspace-test] and gate needs: [..., mutants] (previously mutants needs: [gate]).
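
As an illustration of the coverage bullet above, line coverage can be derived from an existing lcov.info without re-running tests by summing the per-file `LF:` (lines found) and `LH:` (lines hit) records of the lcov tracefile format. Whether the enforcement step computes it exactly this way is an assumption; the function name is hypothetical.

```python
from typing import Optional


def line_coverage_percent(lcov_text: str) -> Optional[float]:
    """Sum LF (lines found) and LH (lines hit) over all records.

    Returns None when the tracefile reports no instrumented lines,
    so the caller can distinguish "no data" from "0% covered".
    """
    found = 0
    hit = 0
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("LF:"):
            found += int(line[3:])
        elif line.startswith("LH:"):
            hit += int(line[3:])
    if found == 0:
        return None
    return 100.0 * hit / found


# Tiny made-up tracefile with two source files (not real project data).
sample = (
    "SF:src/a.rs\nDA:1,1\nLF:4\nLH:3\nend_of_record\n"
    "SF:src/b.rs\nLF:6\nLH:6\nend_of_record\n"
)
percent = line_coverage_percent(sample)  # 9 of 10 lines hit
```

Returning `None` for an empty tracefile keeps the "nothing was measured" case explicit rather than folding it into a numeric comparison against the floor.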

Ordering

Requires paiml/.github#37 (the coverage_min / coverage_baseline_file inputs) to merge first — this PR references @main of the reusable workflow.

🤖 Generated with Claude Code

noahgift and others added 2 commits June 25, 2026 18:10
…audit gap #1)

The two strongest quality signals were advisory, so a regression merged
silently despite the 95%-coverage / 80%-mutation / ZERO-tolerance rule:

  - coverage: measured but never gated (codecov upload is continue-on-error)
  - mutants:  full-tree, push-to-main only, continue-on-error at job AND
              step level — a surviving mutant blocked nothing

This closes both on aprender (the PILOT), without breaking the green build:

1. Coverage ratchet (opt-in input from sovereign-ci.yml):
   coverage_min: "90.0" — a deliberately conservative RATCHET floor, well
   below the documented achieved 96.94% line coverage. The CI coverage job
   is --lib-scoped (its % is not identical to the certeza full-suite number),
   so we floor conservatively and tighten via the committed baseline
   (.pmat/coverage-baseline.txt, seeded to 90.0 so the effective floor
   max(coverage_min, baseline) cannot break the first gated run). The
   coverage job is already wired into `ci / gate`, so a drop now blocks merge.

2. Diff-scoped mutation gate:
   The `mutants` job is rewritten from full-tree/push-only/continue-on-error
   to `cargo mutants --in-diff <pr.diff>` on pull_request events, BLOCKING
   (no continue-on-error; wired into the top-level `gate` via needs +
   result check). Diff-scoping gates only the lines a PR touches — fast
   (minutes, proportional to diff) and prevents NEW under-tested code from
   landing, instead of an hours-long full-tree run that choked the queue.
   A diff with no mutable code is a clean no-op pass. Threshold is
   MUTANTS_MAX_MISSED (default 0, tunable via repo var). On push-to-main the
   job is skipped (no PR diff); `gate` treats skipped as pass.

Requires paiml/.github PR #37 (the coverage_min input) to merge first.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

aprender gitignores .pmat/, so the ratchet baseline file cannot live at the
sovereign-ci default path (.pmat/coverage-baseline.txt). Move it to
.github/coverage-baseline.txt and wire it via the coverage_baseline_file
input. Seeded to 90.0 = coverage_min, so the effective floor
max(coverage_min, baseline) is unchanged and the first gated run is safe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
noahgift enabled auto-merge June 25, 2026 16:17
…diff-scoped mutation gate

The coverage ratchet pilot exposed that aprender's root crate is a facade — the
sovereign-ci coverage job runs --lib on the root and exercises 0 tests, so there
is no lcov data to gate on. Enabling coverage_min meaningfully needs
test_workspace: true + GPU-member test_args exclusions (PMAT-159 blind-spot),
tracked as a follow-up. The coverage ratchet MECHANISM stays live fleet-wide via
sovereign-ci #37. aprender keeps the diff-scoped blocking mutation gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
noahgift changed the title from "feat(ci): enforce coverage ratchet + diff-scoped mutation gate (PMAT audit gap #1, pilot)" to "ci(aprender): diff-scoped blocking mutation gate (coverage ratchet deferred — root facade)" on Jun 25, 2026