admission: observe transient elastic CPU waiters via sticky bit by dt · Pull Request #171300 · cockroachdb/cockroach

dt · 2026-06-01T15:17:07Z

The elastic CPU controller's scheduler-latency listener point-samples the
WorkQueue's hasWaitingRequests at ~1Hz. The granter's tryGrant loop
drains the queue to empty as soon as tokens refill, so the queue spends
most of its time empty even under sustained throttling. The listener's
poll frequently lands in those empty windows and takes the inactive-decay
branch, pulling the utilization limit down toward inactive_point (~12%)
even when scheduler latency is well under target and there is clearly
demand for more elastic CPU.
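
To make the sampling gap concrete, here is a toy, deterministic model of the interaction. It is not taken from the admission package: the `simulate` function and every timing in it (1ms steps, a 10ms refill window, an enqueue 9ms into each window, a poll every 1000ms) are illustrative assumptions chosen so the poll always lands just after a drain.

```go
package main

import "fmt"

// simulate steps a toy clock in 1ms increments. All timings are assumptions
// made for illustration, not measurements of the real controller:
//   - the granter drains the queue at the start of every 10ms window,
//   - one request is enqueued 9ms into every window,
//   - the listener point-samples the queue once every 1000ms.
func simulate(durationMs int) (enqueues, polls, pollsSeeingWaiters int) {
	queued := 0
	for t := 1; t <= durationMs; t++ {
		if t%10 == 0 { // tokens refill: the granter drains the queue to empty
			queued = 0
		}
		if t%10 == 9 { // a request arrives and has to wait for the next refill
			queued++
			enqueues++
		}
		if t%1000 == 0 { // the ~1Hz point sample
			polls++
			if queued > 0 {
				pollsSeeingWaiters++
			}
		}
	}
	return enqueues, polls, pollsSeeingWaiters
}

func main() {
	e, p, s := simulate(10000)
	// Prints: enqueues=1000 polls=10 pollsSeeingWaiters=0
	fmt.Printf("enqueues=%d polls=%d pollsSeeingWaiters=%d\n", e, p, s)
}
```

In this contrived schedule every request waits, yet no poll ever observes a waiter, so a controller driven only by the point sample would treat the queue as idle.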

This PR adds a sticky "had recent waiters" atomic bool on WorkQueue,
set on every Admit enqueue and cleared by an atomic Swap from the
listener each tick. A new elasticCPULimiter.hasOrHadRecentWaitingRequests
ORs this with the instantaneous hasWaitingRequests signal so any
enqueue between two ticks is durably visible to the controller, even if
the queue subsequently drained.
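
A minimal sketch of the sticky-bit pattern described above, not the code in this PR: the `workQueue` type, its fields, and the `enqueue`/`grant` helpers are hypothetical stand-ins, and the combined check is placed directly on the toy queue for brevity rather than on elasticCPULimiter.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// workQueue is a hypothetical stand-in for WorkQueue; it models only the
// state needed to illustrate the sticky bit.
type workQueue struct {
	waiting          atomic.Int64 // requests currently queued
	hadRecentWaiters atomic.Bool  // sticky: set on enqueue, cleared once per tick
}

// enqueue models the Admit path adding a waiter and setting the sticky bit.
func (q *workQueue) enqueue() {
	q.waiting.Add(1)
	q.hadRecentWaiters.Store(true)
}

// grant models the granter admitting one queued request.
func (q *workQueue) grant() { q.waiting.Add(-1) }

// hasWaitingRequests is the instantaneous signal.
func (q *workQueue) hasWaitingRequests() bool { return q.waiting.Load() > 0 }

// hasOrHadRecentWaitingRequests is meant to be called once per listener tick.
// The Swap runs unconditionally so the bit is always cleared for the next
// interval, even when the instantaneous check alone would already be true.
func (q *workQueue) hasOrHadRecentWaitingRequests() bool {
	had := q.hadRecentWaiters.Swap(false)
	return had || q.hasWaitingRequests()
}

func main() {
	q := &workQueue{}
	q.enqueue() // a request waits briefly...
	q.grant()   // ...and is drained before the next tick

	fmt.Println(q.hasWaitingRequests())            // false: the point sample misses it
	fmt.Println(q.hasOrHadRecentWaitingRequests()) // true: the sticky bit remembers it
	fmt.Println(q.hasOrHadRecentWaitingRequests()) // false: cleared, no new enqueue
}
```

Writing the check as `q.hasWaitingRequests() || q.hadRecentWaiters.Swap(false)` would short-circuit and leave the bit set whenever the queue is non-empty at the tick, which is why the sketch performs the Swap first.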

Fixes #170400

Epic: none

Release note (bug fix): Fix the elastic CPU admission controller holding
the elastic-work CPU utilization limit at its inactive floor (~12%) even
when there was sustained demand under the scheduling-latency target. The
controller's 1Hz poll could miss queued work that the granter drained
between ticks, causing it to incorrectly conclude there was no demand
and decay the limit.


trunk-io · 2026-06-01T15:17:21Z

Merging to master in this repository is managed by Trunk.

To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

cockroach-teamcity · 2026-06-01T15:17:23Z

This change is

dt requested a review from a team as a code owner June 1, 2026 15:17

dt requested a review from wenyihu6 June 1, 2026 15:17

dt mentioned this pull request Jun 1, 2026

admission: elastic CPU controller misses demand due to point-sample of hasWaitingRequests #170400

Open


dt wants to merge 1 commit into cockroachdb:master from dt:dt/elastic-cpu-sticky-waiters
