linux/arch/x86
Thomas Gleixner cc8bf19137 x86/apic: Make apic_pending_intr_clear() more robust
In course of developing shorthand based IPI support issues with the
function which tries to clear eventually pending ISR bits in the local APIC
were observed.

  1) O-day testing triggered the WARN_ON() in apic_pending_intr_clear().

     This warning is emitted when the function fails to clear pending ISR
     bits or observes pending IRR bits which are not delivered to the CPU
     after the stale ISR bit(s) are ACK'ed.

     Unfortunately the function only emits a WARN_ON() and fails to dump
     the IRR/ISR content. That's useless for debugging.

     Feng added spot on debug printk's which revealed that the stale IRR
     bit belonged to the APIC timer interrupt vector, but adding ad hoc
     debug code does not help with sporadic failures in the field.

     Rework the loop so the full IRR/ISR contents are saved and on failure
     dumped.

  2) The loop termination logic is interesting at best.

     If the machine has no TSC or cpu_khz is not known yet it tries 1
     million times to ack stale IRR/ISR bits. What?

     With TSC it uses the TSC to calculate the loop termination. It takes a
     timestamp at entry and terminates the loop when:

     	  (rdtsc() - start_timestamp) >= (cpu_hkz << 10)

     That's roughly one second.

     Both methods are problematic. The APIC has 256 vectors, which means
     that in theory max. 256 IRR/ISR bits can be set. In practice this is
     impossible and the chance that more than a few bits are set is close
     to zero.

     With the pure loop based approach the 1 million retries are complete
     overkill.

     With TSC this can terminate too early in a guest which is running on a
     heavily loaded host even with only a couple of IRR/ISR bits set. The
     reason is that after acknowledging the highest priority ISR bit,
     pending IRRs must get serviced first before the next round of
     acknowledge can take place as the APIC (real and virtualized) does not
     honour EOI without a preceeding interrupt on the CPU. And every APIC
     read/write takes a VMEXIT if the APIC is virtualized. While trying to
     reproduce the issue 0-day reported it was observed that the guest was
     scheduled out long enough under heavy load that it terminated after 8
     iterations.

     Make the loop terminate after 512 iterations. That's plenty enough
     in any case and does not take endless time to complete.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190722105219.158847694@linutronix.de
2019-07-25 16:11:56 +02:00
..
boot x86, boot: Remove multiple copy of static function sanitize_boot_params() 2019-07-18 21:41:57 +02:00
configs x86/defconfigs: Remove useless UEVENT_HELPER_PATH 2019-06-21 19:22:08 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-08 20:57:08 -07:00
entry Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-20 11:24:49 -07:00
events perf/x86/intel: Fix spurious NMI on fixed counter 2019-07-13 11:21:29 +02:00
hyperv x86/hyper-v: Zero out the VP ASSIST PAGE on allocation 2019-07-19 09:48:15 +02:00
ia32 clone: fix CLONE_PIDFD support 2019-07-14 20:36:12 +02:00
include x86/paravirt: Drop {read,write}_cr8() hooks 2019-07-22 10:12:33 +02:00
kernel x86/apic: Make apic_pending_intr_clear() more robust 2019-07-25 16:11:56 +02:00
kvm Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-20 10:45:15 -07:00
lib x86/uaccess: Remove redundant CLACs in getuser/putuser error paths 2019-07-18 21:01:06 +02:00
math-emu x86: math-emu: Hide clang warnings for 16-bit overflow 2019-07-17 00:42:26 +02:00
mm dma-mapping fixes for 5.3-rc1 2019-07-20 12:09:52 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-07-08 19:48:57 -07:00
oprofile
pci treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 387 2019-06-05 17:37:11 +02:00
platform platform-drivers-x86 for v5.3-1 2019-07-14 16:51:47 -07:00
power x86/paravirt: Drop {read,write}_cr8() hooks 2019-07-22 10:12:33 +02:00
purgatory treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230 2019-06-19 17:09:06 +02:00
ras RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there 2019-06-08 17:39:24 +02:00
realmode x86/realmode: Make set_real_mode_mem() static inline 2019-03-29 10:16:27 +01:00
tools Merge branch 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-07-08 17:34:44 -07:00
um Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2019-07-08 21:48:15 -07:00
video treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
xen x86/paravirt: Drop {read,write}_cr8() hooks 2019-07-22 10:12:33 +02:00
.gitignore
Kbuild treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
Kconfig dma-mapping fixes for 5.3-rc1 2019-07-20 12:09:52 -07:00
Kconfig.cpu x86/cpu: Create Zhaoxin processors architecture support file 2019-06-22 11:45:57 +02:00
Kconfig.debug It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
Makefile x86/build: Keep local relocations with ld.lld 2019-04-05 12:34:35 +02:00
Makefile_32.cpu
Makefile.um