Having all CPUs poll the lock word for the owner CPU that should be
yielded to defeats most of the purpose of using MCS queueing for
scalability. Yet it may be desirable for queued waiters to yield to a
preempted owner.
With this change, queue waiters never sample the owner CPU directly from
the lock word. The queue head (which is spinning on the lock) propagates
the owner CPU back to the next waiter if it finds the owner has been
preempted. That waiter then propagates the owner CPU back to the next
waiter, and so on.
s390 addresses this problem differenty, by having queued waiters sample
the lock word to find the owner at a low frequency. That has the
advantage of being simpler, the advantage of propagation is that the
lock word never has to be accesed by queued waiters, and the transfer of
cache lines to transmit the owner data is only required when lock holder
vCPU preemption occurs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-11-npiggin@gmail.com