Nicholas Piggin 9d08fce64d workqueue: Improve scalability of workqueue watchdog touch
[ Upstream commit 98f887f820 ]

On a ~2000 CPU powerpc system, hard lockups have been observed in the
workqueue code when stop_machine runs (in this case due to CPU hotplug).
This is due to lots of CPUs spinning in multi_cpu_stop, calling
touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
and that can find itself in the same cacheline as other important
workqueue data, which slows down operations to the point of lockups.

In the case of the following abridged trace, worker_pool_idr was in
the hot line, causing the lockups to always appear at idr_find.

  watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
  Call Trace:
  get_work_pool
  __queue_work
  call_timer_fn
  run_timer_softirq
  __do_softirq
  do_softirq_own_stack
  irq_exit
  timer_interrupt
  decrementer_common_virt
  * interrupt: 900 (timer) at multi_cpu_stop
  multi_cpu_stop
  cpu_stopper_thread
  smpboot_thread_fn
  kthread

Fix this by having wq_watchdog_touch() write to the line only if more
than 1/4 of the watchdog threshold has elapsed since the last recorded
touch.
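
The rate-limiting idea can be illustrated with a minimal user-space sketch.
This is not the kernel patch itself: the struct, the `sketch_` names, and the
tick values are hypothetical stand-ins chosen for illustration, and C11
atomics stand in for the kernel's own load/store primitives. The point it
shows is that a touch first reads the shared timestamp and stores to it only
when more than a quarter of the threshold has passed, so most callers never
dirty the shared cacheline.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared watchdog state. */
struct sketch_wd {
    _Atomic unsigned long touched;  /* last recorded touch, in ticks */
    unsigned long thresh_ticks;     /* watchdog threshold, in ticks */
};

/* Wrap-safe "a is later than b" comparison on tick counters. */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Record a touch at time `now`. Returns true only if the shared
 * timestamp was actually written; otherwise the call is read-only.
 */
static bool sketch_watchdog_touch(struct sketch_wd *wd, unsigned long now)
{
    unsigned long last =
        atomic_load_explicit(&wd->touched, memory_order_relaxed);

    /* Skip the store unless more than 1/4 of the threshold has elapsed. */
    if (!sketch_time_after(now, last + wd->thresh_ticks / 4))
        return false;

    atomic_store_explicit(&wd->touched, now, memory_order_relaxed);
    return true;
}

/* Small self-check with an assumed threshold of 7500 ticks (1/4 = 1875). */
static void sketch_selfcheck(void)
{
    struct sketch_wd wd = { .touched = 0, .thresh_ticks = 7500 };

    assert(!sketch_watchdog_touch(&wd, 100));   /* too soon: no store */
    assert(!sketch_watchdog_touch(&wd, 1875));  /* exactly 1/4: no store */
    assert(sketch_watchdog_touch(&wd, 1876));   /* past 1/4: store */
    assert(!sketch_watchdog_touch(&wd, 2000));  /* window restarts at 1876 */
}
```

With many CPUs calling the touch function in a tight loop, almost every call
in this sketch takes the read-only path, which is the property the fix
relies on.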

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12 11:07:52 +02:00
arch MIPS: cevt-r4k: Don't call get_c0_compare_int if timer irq is installed 2024-09-12 11:07:50 +02:00
block block: remove the blk_flush_integrity call in blk_integrity_unregister 2024-09-12 11:07:42 +02:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:22:01 +02:00
crypto crypto: aead,cipher - zeroize key buffer after use 2024-07-18 13:07:27 +02:00
Documentation hwspinlock: Introduce hwspin_lock_bust() 2024-09-12 11:07:41 +02:00
drivers clocksource/drivers/timer-of: Remove percpu irq related code 2024-09-12 11:07:51 +02:00
fs nilfs2: protect references to superblock parameters exposed in sysfs 2024-09-12 11:07:52 +02:00
include Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE" 2024-09-12 11:07:43 +02:00
init init/main.c: Fix potential static_command_line memory overflow 2024-04-27 17:05:28 +02:00
io_uring io_uring/io-wq: limit retrying worker initialisation 2024-08-19 05:45:22 +02:00
ipc ipc/sem: Fix dangling sem_array access in semtimedop race 2022-12-08 11:28:45 +01:00
kernel workqueue: Improve scalability of workqueue watchdog touch 2024-09-12 11:07:52 +02:00
lib lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc() 2024-09-12 11:07:50 +02:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm mm/numa: no task_numa_fault() call if PTE is changed 2024-09-04 13:23:37 +02:00
net net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN 2024-09-12 11:07:47 +02:00
samples Add gitignore file for samples/fanotify/ subdirectory 2024-07-27 10:46:16 +02:00
scripts kbuild: Fix '-S -c' in x86 stack protector scripts 2024-08-19 05:45:16 +02:00
security smack: unix sockets: fix accept()ed socket label 2024-09-12 11:07:45 +02:00
sound ASoC: topology: Properly initialize soc_enum values 2024-09-12 11:07:47 +02:00
tools kselftests: dmabuf-heaps: Ensure the driver name is null-terminated 2024-09-12 11:07:49 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:27:15 +01:00
virt KVM: Always flush async #PF workqueue when vCPU is being destroyed 2024-04-10 16:18:34 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Daniel Drake to credits 2021-09-21 08:34:58 +03:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS trace: Relocate event helper files 2024-04-10 16:19:24 +02:00
Makefile Linux 5.15.166 2024-09-04 13:23:42 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.