2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-16 09:13:55 +08:00
linux-next/arch/arm64
Waiman Long f5bfdc8e39 locking/osq: Use optimized spinning loop for arm64
Arm64 has a more optimized spinning loop (atomic_cond_read_acquire)
using wfe for spinlock that can boost performance of sibling threads
by putting the current cpu to a wait state that is broken only when
the monitored variable changes or an external event happens.

OSQ has a more complicated spinning loop. Besides the lock value, it
also checks for need_resched() and vcpu_is_preempted(). The check for
need_resched() is not a problem as it is only set by the tick interrupt
handler. That will be detected by the spinning cpu right after iret.

The vcpu_is_preempted() check, however, is a problem as changes to the
preempt state of of previous node will not affect the wait state. For
ARM64, vcpu_is_preempted is not currently defined and so is a no-op.
Will has indicated that he is planning to para-virtualize wfe instead
of defining vcpu_is_preempted for PV support. So just add a comment in
arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted()
should not be defined as suggested.

On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking
microbenchmark was run for 10s with and without the patch. The
performance numbers before patch were:

Running locktest with mutex [runtime = 10s, load = 1]
Threads = 224, Min/Mean/Max = 316/123,143/2,121,269
Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s

After patch, the numbers were:

Running locktest with mutex [runtime = 10s, load = 1]
Threads = 224, Min/Mean/Max = 334/147,836/1,304,787
Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s

So there was about 20% performance improvement.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com
2020-01-17 10:19:30 +01:00
..
boot arm64: dts: ls1028a: fix reboot node 2019-12-12 09:59:38 +08:00
configs arm64: defconfig: re-run savedefconfig 2019-12-05 13:20:17 -08:00
crypto crypto: arch - conditionalize crypto api in arch glue for lib code 2019-11-27 13:08:49 +08:00
include locking/osq: Use optimized spinning loop for arm64 2020-01-17 10:19:30 +01:00
kernel arm64: Implement copy_thread_tls 2020-01-07 13:31:01 +01:00
kvm PPC: 2019-12-22 10:26:59 -08:00
lib arm64: uaccess: Remove uaccess_*_not_uao asm macros 2019-11-20 18:51:54 +00:00
mm arm64: Revert support for execute-only user mappings 2020-01-06 10:10:07 -08:00
net arm64: bpf: optimize modulo operation 2019-09-03 15:44:40 +02:00
xen xen/efi: have a common runtime setup function 2019-10-02 10:31:07 -04:00
Kbuild arm64: add arch/arm64/Kbuild 2019-08-21 18:47:15 +01:00
Kconfig arm64: Implement copy_thread_tls 2020-01-07 13:31:01 +01:00
Kconfig.debug treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Kconfig.platforms i.MX SoC update for 5.5: 2019-11-06 07:46:42 -08:00
Makefile arm64: implement ftrace with regs 2019-11-06 14:17:35 +00:00