/claim #9878

Summary

Reduces the frequency of LockSupport.unpark calls in ZScheduler.maybeUnparkWorker, which is invoked on every task submission and is a hot-path bottleneck as identified in #9878.

Problem

maybeUnparkWorker was called unconditionally on every submit() and submitAndYield(). Each call performed idle.poll() (a ConcurrentLinkedQueue CAS operation) and LockSupport.unpark(). Both are expensive, and running them on every submission causes excessive park/unpark cycling under load.

Changes

maybeUnparkWorker — three fast-path early returns

private def maybeUnparkWorker(currentState: Int): Unit = {
  val currentSearching = currentState & 0xffff
  if (currentSearching > 0) return // someone is already searching
  val currentActive = (currentState & 0xffff0000) >> 16
  if (currentActive == poolSize) return // all workers busy
  if (idle.isEmpty) return // nothing to unpark (avoids CAS)
  val worker = idle.poll()
  if (worker ne null) {
    state.getAndAdd(0x10001)
    worker.active = true
    LockSupport.unpark(worker)
  }
}

Guard 1 (currentSearching > 0, most impactful): If at least one worker is already scanning for tasks, waking another is wasteful. The searching worker will find the newly submitted task and call maybeUnparkWorker itself after transitioning out of searching mode (lines 400-401 in Worker.run). This creates a notification cascade that wakes workers only when they are genuinely needed.

Guard 2 (currentActive == poolSize): All workers are active, so there are no idle workers to wake.

Guard 3 (idle.isEmpty): ConcurrentLinkedQueue.isEmpty is O(1) and avoids the CAS overhead of poll() when the idle queue is already empty. The subsequent poll() null-check handles any races safely.
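The first two guards read two counters out of a single Int. The sketch below illustrates that decoding and the combined guard decision as a pure function. The layout (active workers in the high 16 bits, searching workers in the low 16 bits) is inferred from the masks in the snippet above, and the names PackedState and shouldTryUnpark are hypothetical, not part of ZScheduler.

```scala
object PackedState {
  // Assumed layout, inferred from the masks used in maybeUnparkWorker:
  // high 16 bits = active workers, low 16 bits = searching workers.
  def searching(state: Int): Int = state & 0xffff

  // Unsigned shift; gives the same result as the PR's `>>` whenever bit 31 is clear.
  def active(state: Int): Int = (state & 0xffff0000) >>> 16

  // Hypothetical pure version of the three guards: true only when it is
  // worth attempting idle.poll() and LockSupport.unpark().
  def shouldTryUnpark(state: Int, poolSize: Int, idleIsEmpty: Boolean): Boolean =
    searching(state) == 0 && active(state) != poolSize && !idleIsEmpty

  def main(args: Array[String]): Unit = {
    val before = 3 << 16          // 3 active, 0 searching
    val after  = before + 0x10001 // adding 0x10001 bumps both counters by one
    println(s"active=${active(after)} searching=${searching(after)}")
    println(shouldTryUnpark(before, poolSize = 8, idleIsEmpty = false))
    println(shouldTryUnpark(after, poolSize = 8, idleIsEmpty = false))
  }
}
```

Adding 0x10001 in a single getAndAdd increments both counters atomically, which is why the woken worker is counted as active and searching at the same time.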

New test suite: ZSchedulerSpec

Added ZSchedulerSpec with concurrency stress tests to guard against regressions:

  • High-throughput fork/join (10k fibers) — validates worker wakeup under high concurrency
  • Chained forks (1000 depth) — stresses the cascade-notification path
  • Ping-pong via bounded queues — sensitive to park/unpark timing
  • Yield-heavy workloads — exercises the submitAndYield path
  • Execution metrics — validates metrics reporting

Tradeoffs

  • Fairness vs throughput: The currentSearching > 0 guard can delay waking idle threads by microseconds. In practice, the searching worker resolves quickly and cascade-notifies as needed.
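  • idle.isEmpty is non-linearizable: ConcurrentLinkedQueue.isEmpty can return stale results. If it reports non-empty for a queue that has since been drained, the subsequent poll() null-check handles it safely. If it reports empty for a queue that has just gained a worker, we skip one unpark attempt at worst, and the next submission will retry.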
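The check-then-poll pattern behind the second tradeoff can be sketched in isolation. This is a minimal illustration using only java.util.concurrent; IdleQueueSketch and pollIfNonEmpty are hypothetical names, and String elements stand in for worker threads.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

object IdleQueueSketch {
  // Hypothetical helper: a cheap emptiness check first, then poll().
  // Returns None when the queue looked empty, or when another thread
  // drained it between isEmpty and poll() (poll() then returns null).
  def pollIfNonEmpty[A <: AnyRef](queue: ConcurrentLinkedQueue[A]): Option[A] =
    if (queue.isEmpty) None
    else Option(queue.poll()) // Option(null) is None, so the race is absorbed here

  def main(args: Array[String]): Unit = {
    val idle = new ConcurrentLinkedQueue[String]()
    println(pollIfNonEmpty(idle)) // empty queue: skipped without polling
    idle.add("worker-1")
    println(pollIfNonEmpty(idle)) // one element: dequeued
    println(pollIfNonEmpty(idle)) // empty again
  }
}
```

Either outcome of a stale isEmpty is benign in this shape: a false "non-empty" falls through to the null check, and a false "empty" only postpones the wakeup to a later call.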

Testing

The new ZSchedulerSpec covers the critical scheduler paths. All tests apply timeouts and the nonFlaky test aspect to catch deadlocks or liveness regressions.

Fixes #9878

Claim

Total prize pool $1,350
Total paid $0
Status Pending
Submitted March 14, 2026
Last updated March 14, 2026

Contributors

Abrailab (@CelebrityPunks): 100%

Sponsors

ZIO (@ZIO): $850
Abrailab (@CelebrityPunks): $500