linux/Documentation
Feng Tang fcd4f3a9d9 clocksource: Scale the watchdog read retries automatically
[ Upstream commit 2ed08e4bc5 ]

On a 8-socket server the TSC is wrongly marked as 'unstable' and disabled
during boot time on about one out of 120 boot attempts:

    clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns,
    wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable
    tsc: Marking TSC unstable due to clocksource watchdog
    TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
    sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152)
    clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896.
    clocksource: Switched to clocksource hpet

The reason is that for platform with a large number of CPUs, there are
sporadic big or huge read latencies while reading the watchog/clocksource
during boot or when system is under stress work load, and the frequency and
maximum value of the latency goes up with the number of online CPUs.

The cCurrent code already has logic to detect and filter such high latency
case by reading the watchdog twice and checking the two deltas. Due to the
randomness of the latency, there is a low probabilty that the first delta
(latency) is big, but the second delta is small and looks valid. The
watchdog code retries the readouts by default twice, which is not
necessarily sufficient for systems with a large number of CPUs.

There is a command line parameter 'max_cswd_read_retries' which allows to
increase the number of retries, but that's not user friendly as it needs to
be tweaked per system. As the number of required retries is proportional to
the number of online CPUs, this parameter can be calculated at runtime.

Scale and enlarge the number of retries according to the number of online
CPUs and remove the command line parameter completely.

[ tglx: Massaged change log and comments ]

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jin Wang <jin1.wang@intel.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240221060859.1027450-1-feng.tang@intel.com
Stable-dep-of: f2655ac2c0 ("clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19 05:45:45 +02:00
..
ABI x86/rfds: Mitigate Register File Data Sampling (RFDS) 2024-04-10 16:18:48 +02:00
accounting sched/psi: report zeroes for CPU full at the system level 2022-06-09 10:22:48 +02:00
admin-guide clocksource: Scale the watchdog read retries automatically 2024-08-19 05:45:45 +02:00
arm Documentation: arm: marvell: Add 88F6825 model into list 2021-08-24 13:26:32 -06:00
arm64 arm64: errata: Expand speculative SSBS workaround (again) 2024-08-19 05:45:41 +02:00
block Documentation: block: blk-mq: Fix small typo in multi-queue docs 2021-08-24 13:30:00 -06:00
bpf libbpf: Rename libbpf documentation index file 2021-08-18 08:45:25 -07:00
cdrom docs: cdrom-standard.rst: get rid of uneeded UTF-8 chars 2021-05-11 11:00:17 -06:00
core-api dma-mapping: add dma_opt_mapping_size() 2024-04-10 16:18:47 +02:00
cpu-freq cpufreq: Remove ready() callback 2021-09-02 18:04:17 +02:00
crypto
dev-tools docs/scripts/gdb: add necessary make scripts_gdb step 2023-03-10 09:39:53 +01:00
devicetree dt-bindings: thermal: correct thermal zone node name limit 2024-08-19 05:45:10 +02:00
doc-guide docs: doc-guide: avoid using ReST :doc:foo markup 2021-06-17 13:24:37 -06:00
driver-api fpga: region: add owner module and take its refcount 2024-06-16 13:39:38 +02:00
fault-injection debugfs: fix error when writing negative value to atomic_t debugfs file 2022-12-31 13:14:03 +01:00
fb Documentation: Add leading slash to some paths 2021-03-31 13:49:19 -06:00
features RISC-V Patches for the 5.15 Merge Window, Part 2 2021-09-11 14:29:42 -07:00
filesystems Revert "lockd: introduce safe async lock op" 2024-04-27 17:05:23 +02:00
firmware_class
firmware-guide Documentation: ACPI: EINJ: Fix obsolete example 2022-08-25 11:40:01 +02:00
fpga fpga: fix spelling mistakes 2021-07-21 19:54:21 -07:00
gpu drm/i915/display: Move DRRS code its own file 2022-03-08 19:12:40 +01:00
hid Documentation: Add leading slash to some paths 2021-03-31 13:49:19 -06:00
hwmon hwmon: (ftsteutates) Fix scaling of measurements 2023-03-10 09:39:21 +01:00
i2c Documentation: i2c: add i2c-sysfs into index 2021-08-10 22:58:32 +02:00
ia64
ide
iio iio: hrtimer: Allow sub Hz granularity 2021-03-25 19:13:49 +00:00
infiniband
input Input: iforce - add support for Boeder Force Feedback Wheel 2022-09-20 12:39:45 +02:00
isdn
kbuild Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
kernel-hacking docs: futex: Fix kernel-doc references after code split-up preparation 2023-04-26 13:51:53 +02:00
leds Documentation: leds: standartizing LED names 2021-08-20 10:26:24 +02:00
litmus-tests
livepatch docs: livepatch: Fix a typo and remove the unnecessary gaps in a sentence 2021-03-08 17:25:16 -07:00
locking Documentation/locking/locktypes: Update migrate_disable() bits. 2021-12-14 10:57:18 +01:00
m68k
maintainer media: add a subsystem profile documentation 2021-03-22 08:56:42 +01:00
mhi
mips
misc-devices dw-xdata-pcie: Update outdated info and improve text format 2021-04-14 19:47:28 +02:00
netlabel
networking net: ena: Add dynamic recycling mechanism for rx buffers 2024-06-16 13:39:52 +02:00
nios2
nvdimm
openrisc
parisc
PCI pci-v5.15-changes 2021-09-07 19:13:42 -07:00
pcmcia
power Documentation: power: include kernel-doc in Energy Model doc 2021-09-07 21:17:28 +02:00
powerpc powerpc/doc: Fix htmldocs errors 2021-08-27 00:56:34 +10:00
process docs/process/howto: Replace C89 with C11 2023-12-13 18:36:46 +01:00
RCU doc: Update stallwarn.rst with recent changes 2021-07-20 13:36:33 -07:00
riscv riscv: Move early dtb mapping into the fixmap region 2023-05-01 08:23:24 +09:00
s390 vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE 2021-06-21 15:29:25 -06:00
scheduler This was a reasonably active cycle for documentation; this pull includes: 2021-06-28 16:53:05 -07:00
scsi scsi: core: Fix the scsi_set_resid() documentation 2023-09-19 12:22:50 +02:00
security KEYS: trusted: allow use of kernel RNG for key material 2023-10-19 23:05:33 +02:00
sh
sound ASoC: doc: Fix undefined SND_SOC_DAPM_NOPM argument 2024-02-23 08:54:46 +01:00
sparc
sphinx docs: kernel_include.py: Cope with docutils 0.21 2024-05-25 16:20:19 +02:00
sphinx-static
spi spi: pxa2xx: Update documentation to point out that it's outdated 2021-05-18 14:05:36 +01:00
staging
target
timers Documentation: drop optional BOMs 2021-05-10 15:17:34 -06:00
trace tracing/probes: Add symstr type for dynamic events 2023-08-03 10:22:30 +02:00
translations docs/process/howto: Replace C89 with C11 2023-12-13 18:36:46 +01:00
tty/device_drivers serial: 8250: Add proper clock handling for OxSemi PCIe devices 2022-08-17 14:24:23 +02:00
usb docs: usb: fix malformed table 2021-08-05 12:31:51 +02:00
userspace-api Remove DECnet support from kernel 2023-06-21 15:59:15 +02:00
virt KVM: s390: disable migration mode when dirty tracking is disabled 2023-03-10 09:40:01 +01:00
vm Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
w1 w1: fix build warning in w1_ds2438.rst 2021-05-26 09:11:24 +02:00
watchdog docs: watchdog: fix obsolete include file reference in pcwd 2021-03-06 17:36:51 -07:00
x86 x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key 2024-04-10 16:18:48 +02:00
xtensa
.gitignore
arch.rst docs: Group arch-specific documentation under "CPU Architectures" 2021-03-15 13:35:35 -06:00
asm-annotations.rst
atomic_bitops.txt locking/atomic: Make test_and_*_bit() ordered on failure 2022-08-25 11:39:54 +02:00
atomic_t.txt Documentation/atomic_t: Document forward progress expectations 2021-08-04 15:16:47 +02:00
Changes
CodingStyle
conf.py docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0 2022-06-09 10:23:30 +02:00
COPYING-logo
docutils.conf
dontdiff kbuild: generate Module.symvers only when vmlinux exists 2021-04-25 05:17:02 +09:00
index.rst docs: Group arch-specific documentation under "CPU Architectures" 2021-03-15 13:35:35 -06:00
Kconfig
logo.gif
Makefile docs: Makefile: Use CONFIG_SHELL not SHELL 2021-06-18 11:26:08 -06:00
memory-barriers.txt
SubmittingPatches
watch_queue.rst