linux/drivers
Niklas Söderlund 026befb502 clocksource/drivers/sh_cmt: Address race condition for clock events
[ Upstream commit db19d3aa77 ]

There is a race condition in the CMT interrupt handler. In the interrupt
handler the driver sets a driver private flag, FLAG_IRQCONTEXT. This
flag indicates that any call to set_next_event() should not be directly
propagated to the device, but instead cached. This is done because the
interrupt handler itself reprograms the device when needed before it
completes, which avoids performing the operation twice.

It is unclear why this design was chosen. My suspicion is that it allows
the struct clock_event_device.event_handler callback, which is called
while FLAG_IRQCONTEXT is set, to update the next event without having
to write to the device twice.
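
The caching design described above can be illustrated with a small
userspace model. This is only a sketch and not the sh_cmt driver code:
every name in it (model_timer, model_set_next_event, the pending field
and so on) is hypothetical, and only the idea of an "in interrupt
context" flag is taken from the description above. It shows a handler
that re-arms the timer twice while the flag is set, which results in a
single simulated device write.

```c
#include <stdbool.h>

/*
 * Userspace model only. All names are hypothetical; only the idea of a
 * flag that defers device writes comes from the commit message.
 */
struct model_timer {
	bool irq_context;       /* stands in for FLAG_IRQCONTEXT */
	bool pending;           /* assumption: a cached value awaits writing */
	unsigned long cached;   /* value cached while irq_context is set */
	unsigned long hw_match; /* stands in for the device register */
	unsigned int hw_writes; /* counts simulated device writes */
};

typedef void (*model_handler_fn)(struct model_timer *t);

static void model_write_hw(struct model_timer *t, unsigned long value)
{
	t->hw_match = value;
	t->hw_writes++;
}

/* Cache the value while in interrupt context, otherwise write it out. */
static void model_set_next_event(struct model_timer *t, unsigned long value)
{
	if (t->irq_context) {
		t->cached = value;
		t->pending = true;
		return;
	}
	model_write_hw(t, value);
}

/* The handler writes the cached value once, just before it completes. */
static void model_interrupt(struct model_timer *t, model_handler_fn handler)
{
	t->irq_context = true;
	handler(t);
	if (t->pending) {
		t->pending = false;
		model_write_hw(t, t->cached);
	}
	t->irq_context = false;
}

/* Example event handler that updates the next event twice. */
static void model_rearm_twice(struct model_timer *t)
{
	model_set_next_event(t, 100);
	model_set_next_event(t, 200);
}

/* Returns the number of simulated device writes, or -1 on a mismatch. */
int model_demo_writes(void)
{
	struct model_timer t = {0};

	model_interrupt(&t, model_rearm_twice);
	return (t.hw_match == 200) ? (int)t.hw_writes : -1;
}
```

In this model the two updates made from the event handler collapse into
one write of the last value, which matches the suspected motivation for
the design.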

Unfortunately there is a race window between when the FLAG_IRQCONTEXT
flag is set and later cleared, in which the interrupt handler has
already started to write the next event to the device. If
set_next_event() is called in this window the value is only cached in
the driver but not written. This causes the board to misbehave or,
worse, lock up and produce a splat.

   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
   rcu:     0-...!: (0 ticks this GP) idle=f5e0/0/0x0 softirq=519/519 fqs=0 (false positive?)
   rcu:     (detected by 1, t=6502 jiffies, g=-595, q=77 ncpus=2)
   Sending NMI from CPU 1 to CPUs 0:
   NMI backtrace for cpu 0
   CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-rc5-arm64-renesas-00019-g74a6f86eaf1c-dirty #20
   Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
   pc : tick_check_broadcast_expired+0xc/0x40
   lr : cpu_idle_poll.isra.0+0x8c/0x168
   sp : ffff800081c63d70
   x29: ffff800081c63d70 x28: 00000000580000c8 x27: 00000000bfee5610
   x26: 0000000000000027 x25: 0000000000000000 x24: 0000000000000000
   x23: ffff00007fbb9100 x22: ffff8000818f1008 x21: ffff8000800ef07c
   x20: ffff800081c79ec0 x19: ffff800081c70c28 x18: 0000000000000000
   x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffc2c717d8
   x14: 0000000000000000 x13: ffff000009c18080 x12: ffff8000825f7fc0
   x11: 0000000000000000 x10: ffff8000818f3cd4 x9 : 0000000000000028
   x8 : ffff800081c79ec0 x7 : ffff800081c73000 x6 : 0000000000000000
   x5 : 0000000000000000 x4 : ffff7ffffe286000 x3 : 0000000000000000
   x2 : ffff7ffffe286000 x1 : ffff800082972900 x0 : ffff8000818f1008
   Call trace:
    tick_check_broadcast_expired+0xc/0x40
    do_idle+0x9c/0x280
    cpu_startup_entry+0x34/0x40
    kernel_init+0x0/0x11c
    do_one_initcall+0x0/0x260
    __primary_switched+0x80/0x88
   rcu: rcu_preempt kthread timer wakeup didn't happen for 6501 jiffies! g-595 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
   rcu:     Possible timer handling issue on cpu=0 timer-softirq=262
   rcu: rcu_preempt kthread starved for 6502 jiffies! g-595 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
   rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
   rcu: RCU grace-period kthread stack dump:
   task:rcu_preempt     state:I stack:0     pid:15    tgid:15    ppid:2      flags:0x00000008
   Call trace:
    __switch_to+0xbc/0x100
    __schedule+0x358/0xbe0
    schedule+0x48/0x148
    schedule_timeout+0xc4/0x138
    rcu_gp_fqs_loop+0x12c/0x764
    rcu_gp_kthread+0x208/0x298
    kthread+0x10c/0x110
    ret_from_fork+0x10/0x20

The design has been part of the driver since it was first merged in
early 2009. The issue becomes harder to trigger the older the kernel
version is. It only takes a few boots on v6.10-rc5, while hundreds of
boots are needed to trigger it on v5.10.

Close the race condition by using the CMT channel lock for the two
competing sections. The channel lock was added to the driver after its
initial design.
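
The effect of serializing the two competing sections can be sketched
with a deterministic, single-threaded userspace model. This is not the
driver code and does not use the kernel locking API. All names are
hypothetical, the lock is modelled as a plain boolean, and a caller that
would have to spin on the lock is modelled by retrying after the
"unlock". The model replays one interleaving: a set_next_event() call
from another CPU arrives after the handler has written the device but
before the flag is cleared.

```c
#include <stdbool.h>

/* Userspace model only; every name here is hypothetical. */
struct model_channel {
	bool irq_context;       /* stands in for FLAG_IRQCONTEXT */
	bool lock_held;         /* stands in for the channel lock */
	unsigned long cached;   /* next event cached by the driver */
	unsigned long hw_match; /* stands in for the device register */
};

/*
 * Returns false when the caller would have to wait for the lock,
 * true once the request has been cached or written.
 */
static bool model_set_next_event(struct model_channel *ch,
				 unsigned long value, bool use_lock)
{
	if (use_lock && ch->lock_held)
		return false;

	if (ch->irq_context)
		ch->cached = value;   /* only cached, not written */
	else
		ch->hw_match = value; /* written directly */
	return true;
}

/*
 * Tail of the modelled interrupt handler. 'racer' is the value another
 * CPU requests inside the window between the device write and the
 * clearing of the flag.
 */
static void model_irq_tail(struct model_channel *ch, unsigned long racer,
			   bool use_lock)
{
	bool done;

	if (use_lock)
		ch->lock_held = true;

	ch->hw_match = ch->cached;  /* handler writes the next event */
	done = model_set_next_event(ch, racer, use_lock); /* racing call */
	ch->irq_context = false;    /* flag cleared */

	if (use_lock)
		ch->lock_held = false;

	/* With the lock, the racing caller proceeds only after unlock. */
	if (!done)
		model_set_next_event(ch, racer, use_lock);
}

/* Returns the value that ends up in the simulated device register. */
unsigned long model_run(bool use_lock)
{
	struct model_channel ch = { .irq_context = true, .cached = 100 };

	model_irq_tail(&ch, 50, use_lock);
	return ch.hw_match;
}
```

In this model, model_run(false) leaves the stale value 100 in the
simulated register because the request for 50 was only cached, while
model_run(true) ends with 50 because the racing call runs after the flag
has been cleared and writes directly.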

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20240702190230.3825292-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19 05:32:11 +02:00
..
accessibility
acpi ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 2024-07-27 10:33:43 +02:00
amba amba: bus: fix refcount leak 2023-09-23 10:48:09 +02:00
android binder: fix hang of unregistered readers 2024-08-19 05:32:04 +02:00
ata ata: libata-core: Fix double free on error 2024-07-05 09:00:35 +02:00
atm atm: idt77252: fix a memleak in open_card_ubr0 2024-02-23 08:12:53 +01:00
auxdisplay
base devres: Fix memory leakage caused by driver API devm_free_percpu() 2024-08-19 05:32:08 +02:00
bcma
block null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION() 2024-06-16 13:23:35 +02:00
bluetooth Bluetooth: btintel: Fixe build regression 2024-04-13 12:50:17 +02:00
bus bus: tegra-aconnect: Update dependency to ARCH_TEGRA 2024-03-26 18:22:35 -04:00
cdrom
char hwrng: amd - Convert PCIBIOS_* return codes to errnos 2024-08-19 05:32:04 +02:00
clk clk: davinci: da8xx-cfgchip: Initialize clk_init_data before use 2024-08-19 05:32:04 +02:00
clocksource clocksource/drivers/sh_cmt: Address race condition for clock events 2024-08-19 05:32:11 +02:00
connector
cpufreq cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily 2023-12-08 08:43:26 +01:00
cpuidle sched,idle,rcu: Push rcu_idle deeper into the idle path 2023-10-25 11:16:26 +02:00
crypto crypto: qat - Fix ADF_DEV_RESET_SYNC memory leak 2024-06-16 13:23:44 +02:00
dax
dca
devfreq PM / devfreq: Fix leak in devfreq_dev_release() 2023-09-23 10:48:10 +02:00
dio
dma dmaengine: ioatdma: Fix missing kmem_cache_destroy() 2024-07-05 09:00:28 +02:00
dma-buf dma-buf/sw-sync: don't enable IRQ from sync_print_obj() 2024-06-16 13:23:37 +02:00
edac EDAC/thunderx: Fix possible out-of-bounds string access 2024-01-25 14:33:31 -08:00
eisa
extcon extcon: max8997: select IRQ_DOMAIN instead of depending on it 2024-06-16 13:23:33 +02:00
firewire firewire: nosy: ensure user_length is taken into account when fetching packet contents 2024-05-17 11:42:42 +02:00
firmware firmware: dmi: Stop decoding on broken entry 2024-07-18 11:39:32 +02:00
fmc
fpga fpga: bridge: fix kernel-doc parameter description 2023-05-17 11:13:15 +02:00
fsi fsi: master-ast-cf: Add MODULE_FIRMWARE macro 2023-09-23 10:47:57 +02:00
gnss
gpio gpio: davinci: Validate the obtained number of IRQs 2024-07-05 09:00:32 +02:00
gpu drm/vmwgfx: Fix overlay when using Screen Targets 2024-08-19 05:32:09 +02:00
hid HID: core: remove unnecessary WARN_ON() in implement() 2024-07-05 09:00:23 +02:00
hsi HSI: omap_ssi_core: Fix error handling in ssi_init() 2023-01-18 11:30:30 +01:00
hv hv_utils: drain the timesync packets on onchannelcallback 2024-07-05 09:00:26 +02:00
hwmon hwmon: (max6697) Fix swapped temp{1,8} critical alarms 2024-08-19 05:31:56 +02:00
hwspinlock
hwtracing intel_th: pci: Add Lunar Lake support 2024-07-05 09:00:26 +02:00
i2c i2c: rcar: bring hardware to known state when probing 2024-07-18 11:39:39 +02:00
ide treewide: Remove uninitialized_var() usage 2023-08-11 11:45:01 +02:00
idle
iio iio: chemical: bme680: Fix sensor data read operation 2024-07-05 09:00:33 +02:00
infiniband RDMA/iwcm: Fix a use-after-free related to destroying CM IDs 2024-08-19 05:32:06 +02:00
input Input: elan_i2c - do not leave interrupt disabled on suspend failure 2024-08-19 05:32:00 +02:00
iommu iommu/amd: Fix sysfs leak in iommu init 2024-07-05 09:00:23 +02:00
ipack
irqchip irqchip/mbigen: Fix mbigen node address layout 2024-08-19 05:32:10 +02:00
isdn mISDN: Fix a use after free in hfcmulti_tx() 2024-08-19 05:32:07 +02:00
leds leds: ss4200: Convert PCIBIOS_* return codes to errnos 2024-08-19 05:32:04 +02:00
lightnvm
macintosh macintosh/therm_windtunnel: fix module unload. 2024-08-19 05:32:01 +02:00
mailbox mailbox: ti-msgmgr: Fill non-message tx data fields with 0x0 2023-08-11 11:45:13 +02:00
mcb mcb: fix error handling for different scenarios when parsing 2023-11-28 16:46:35 +00:00
md md/raid5: avoid BUG_ON() while continue reshape after reassembling 2024-08-19 05:32:10 +02:00
media media: venus: fix use after free in vdec_close 2024-08-19 05:32:03 +02:00
memory
memstick memstick r592: make memstick_debug_get_tpc_name() static 2023-08-11 11:45:06 +02:00
message scsi: message: mptlan: Fix use after free bug in mptlan_remove() due to race condition 2023-05-30 12:42:09 +01:00
mfd mfd: omap-usb-tll: Use struct_size to allocate tll 2024-08-19 05:31:59 +02:00
misc mei: demote client disconnect warning on suspend to debug 2024-07-27 10:33:42 +02:00
mmc mmc: sdhci-pci: Convert PCIBIOS_* return codes to errnos 2024-07-05 09:00:33 +02:00
mtd ubi: eba: properly rollback inside self_check_eba 2024-08-19 05:32:04 +02:00
mux
net net: fec: Stop PPS on driver remove 2024-08-19 05:32:10 +02:00
nfc NFC: trf7970a: disable all regulators on removal 2024-05-02 16:17:11 +02:00
ntb ntb: Fix calculation ntb_transport_tx_free_entry() 2023-09-23 10:48:10 +02:00
nubus
nvdimm nd_btt: Make BTT lanes preemptible 2023-11-20 10:29:18 +01:00
nvme nvmet: fix ns enable/disable possible hang 2024-06-16 13:23:36 +02:00
nvmem nvmem: imx: correct nregs for i.MX6UL 2023-11-08 11:22:16 +01:00
of of: unittest: Fix of_count_phandle_with_args() expected value message 2024-01-25 14:33:36 -08:00
opp
oprofile
parisc parisc: iosapic.c: Fix sparse warnings 2023-10-10 21:44:58 +02:00
parport dev/parport: fix the array out-of-bounds risk 2024-08-19 05:32:08 +02:00
pci PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting ep_gpio 2024-08-19 05:32:07 +02:00
pcmcia pcmcia: ds: fix possible name leak in error path in pcmcia_device_add() 2023-11-20 10:29:20 +01:00
perf
phy phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP 2024-02-23 08:12:53 +01:00
pinctrl pinctrl: freescale: mxs: Fix refcount of child 2024-08-19 05:32:02 +02:00
platform platform: mips: cpu_hwmon: Disable driver on unsupported hardware 2024-08-19 05:32:05 +02:00
pnp PNP: ACPI: fix fortify warning 2024-02-23 08:12:44 +01:00
power power: supply: cros_usbpd: provide ID table for avoiding fallback match 2024-06-16 13:23:25 +02:00
powercap powercap: fix possible name leak in powercap_register_zone() 2023-03-11 16:31:36 +01:00
pps
ps3
ptp ptp: Fix error message on failed pin verification 2024-07-05 09:00:20 +02:00
pwm pwm: stm32: Always do lazy disabling 2024-08-19 05:31:56 +02:00
rapidio
ras
regulator regulator: core: Fix modpost error "regulator_get_regmap" undefined 2024-07-05 09:00:28 +02:00
remoteproc remoteproc: imx_rproc: Skip over memory region when node value is NULL 2024-08-19 05:32:09 +02:00
reset reset: hisilicon: hi6220: fix Wvoid-pointer-to-enum-cast warning 2024-01-25 14:33:30 -08:00
rpmsg rpmsg: virtio: Free driver_override when rpmsg_remove() 2024-02-23 08:12:40 +01:00
rtc rtc: cmos: Fix return value of nvmem callbacks 2024-08-19 05:32:05 +02:00
s390 s390/sclp: Fix sclp_init() cleanup on failure 2024-07-27 10:33:42 +02:00
sbus
scsi scsi: qla2xxx: validate nvme_local_port correctly 2024-08-19 05:32:05 +02:00
sfi
sh
siox
slimbus slimbus: core: Remove usage of the deprecated ida_simple_xx() API 2024-04-13 12:50:06 +02:00
sn
soc soc: ti: wkup_m3_ipc: Send NULL dummy message instead of pointer message 2024-07-05 09:00:32 +02:00
soundwire
spi spi: imx: Don't expect DMA for i.MX{25,35,50,51,53} cspi devices 2024-07-27 10:33:43 +02:00
spmi spmi: Add a check for remove callback when removing a SPMI driver 2023-05-17 11:13:17 +02:00
ssb treewide: Remove uninitialized_var() usage 2023-08-11 11:45:01 +02:00
staging greybus: arche-ctrl: move device table to its right location 2024-06-16 13:23:32 +02:00
target scsi: target: Fix SELinux error when systemd-modules loads the target module 2024-05-17 11:42:40 +02:00
tc
tee
thermal thermal: core: prevent potential string overflow 2023-11-20 10:29:17 +01:00
thunderbolt thunderbolt: Use const qualifier for ring_interrupt_index 2023-04-05 11:15:35 +02:00
tty tty: mcf: MCF54418 has 10 UARTS 2024-07-05 09:00:34 +02:00
uio uio: Fix use-after-free in uio_open 2024-01-25 14:33:30 -08:00
usb USB: core: Fix duplicate endpoint bug by clearing reserved bits in the descriptor 2024-07-18 11:39:38 +02:00
uwb
vfio vfio/platform: Disable virqfds on cleanup 2024-04-13 12:50:06 +02:00
vhost vhost: Add smp_rmb() in vhost_vq_avail_empty() 2024-05-02 16:17:08 +02:00
video fbdev: savage: Handle err return when savagefb_check_var failed 2024-06-16 13:23:39 +02:00
virt
virtio virtio: delete vq in vp_find_vqs_msix() when request_irq() fails 2024-06-16 13:23:36 +02:00
visorbus
vlynq
vme vme: Fix error not catched in fake_init() 2023-01-18 11:30:28 +01:00
w1 w1: fix loop in w1_fini() 2023-08-11 11:45:11 +02:00
watchdog watchdog: bcm2835_wdt: Fix WDIOC_SETTIMEOUT handling 2024-01-25 14:33:36 -08:00
xen xen/events: fix delayed eoi list handling 2023-11-28 16:46:33 +00:00
zorro
Kconfig
Makefile