linux/include
Johan Hovold ba81043753 scsi: ufs: core: Fix devfreq deadlocks
There is a lock inversion and rwsem read-lock recursion in the devfreq
target callback which can lead to deadlocks.

Specifically, ufshcd_devfreq_scale() already holds a clk_scaling_lock
read lock when toggling the write booster, which involves taking the
dev_cmd mutex before taking another clk_scaling_lock read lock.

This can lead to a deadlock if another thread:

  1) tries to acquire the dev_cmd and clk_scaling locks in the correct
     order, or

  2) takes a clk_scaling write lock before the attempt to take the
     clk_scaling read lock a second time.

Fix this by dropping the clk_scaling_lock before toggling the write booster
as was done before commit 0e9d4ca43b ("scsi: ufs: Protect some contexts
from unexpected clock scaling").

While the devfreq callbacks are already serialised, add a second
serialising mutex to handle the unlikely case where a callback triggered
through the devfreq sysfs interface is racing with a request to disable
clock scaling through the UFS controller 'clkscale_enable' sysfs
attribute. This could otherwise lead to the write booster being left
disabled after having disabled clock scaling.

Also take the new mutex in ufshcd_clk_scaling_allow() to make sure that any
pending write booster update has completed on return.

Note that this currently only affects Qualcomm platforms since commit
87bd05016a ("scsi: ufs: core: Allow host driver to disable wb toggling
during clock scaling").

The lock inversion (i.e. 1 above) was reported by lockdep as:

 ======================================================
 WARNING: possible circular locking dependency detected
 6.1.0-next-20221216 #211 Not tainted
 ------------------------------------------------------
 kworker/u16:2/71 is trying to acquire lock:
 ffff076280ba98a0 (&hba->dev_cmd.lock){+.+.}-{3:3}, at: ufshcd_query_flag+0x50/0x1c0

 but task is already holding lock:
 ffff076280ba9cf0 (&hba->clk_scaling_lock){++++}-{3:3}, at: ufshcd_devfreq_scale+0x2b8/0x380

 which lock already depends on the new lock.
[  +0.011606]
 the existing dependency chain (in reverse order) is:

 -> #1 (&hba->clk_scaling_lock){++++}-{3:3}:
        lock_acquire+0x68/0x90
        down_read+0x58/0x80
        ufshcd_exec_dev_cmd+0x70/0x2c0
        ufshcd_verify_dev_init+0x68/0x170
        ufshcd_probe_hba+0x398/0x1180
        ufshcd_async_scan+0x30/0x320
        async_run_entry_fn+0x34/0x150
        process_one_work+0x288/0x6c0
        worker_thread+0x74/0x450
        kthread+0x118/0x120
        ret_from_fork+0x10/0x20

 -> #0 (&hba->dev_cmd.lock){+.+.}-{3:3}:
        __lock_acquire+0x12a0/0x2240
        lock_acquire.part.0+0xcc/0x220
        lock_acquire+0x68/0x90
        __mutex_lock+0x98/0x430
        mutex_lock_nested+0x2c/0x40
        ufshcd_query_flag+0x50/0x1c0
        ufshcd_query_flag_retry+0x64/0x100
        ufshcd_wb_toggle+0x5c/0x120
        ufshcd_devfreq_scale+0x2c4/0x380
        ufshcd_devfreq_target+0xf4/0x230
        devfreq_set_target+0x84/0x2f0
        devfreq_update_target+0xc4/0xf0
        devfreq_monitor+0x38/0x1f0
        process_one_work+0x288/0x6c0
        worker_thread+0x74/0x450
        kthread+0x118/0x120
        ret_from_fork+0x10/0x20

 other info that might help us debug this:
  Possible unsafe locking scenario:
        CPU0                    CPU1
        ----                    ----
   lock(&hba->clk_scaling_lock);
                                lock(&hba->dev_cmd.lock);
                                lock(&hba->clk_scaling_lock);
   lock(&hba->dev_cmd.lock);

  *** DEADLOCK ***

Fixes: 0e9d4ca43b ("scsi: ufs: Protect some contexts from unexpected clock scaling")
Cc: stable@vger.kernel.org      # 5.12
Cc: Can Guo <quic_cang@quicinc.com>
Tested-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230116161201.16923-1-johan+linaro@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-01-18 19:08:37 -05:00
..
acpi More ACPI updates for 6.2-rc1 2022-12-15 10:21:10 -08:00
asm-generic asm-generic bits for 6.2 2022-12-20 08:32:11 -06:00
clocksource Updates for timers, timekeeping and drivers: 2022-12-12 12:52:02 -08:00
crypto crypto: acomp - define max size for destination 2022-12-09 18:45:00 +08:00
drm Some deferred-io and damage worker reworks revert and make a fb function 2022-12-09 11:53:52 +10:00
dt-bindings remoteproc updates for v6.2 2022-12-21 09:37:14 -08:00
keys
kunit kunit: add macro to allow conditionally exposing static symbols to tests 2022-12-12 14:13:48 -07:00
kvm Merge branch kvm-arm64/pmu-unchained into kvmarm-master/next 2022-12-05 14:38:44 +00:00
linux block-6.2-2022-12-19 2022-12-21 16:35:26 -08:00
math-emu
media Merge tag 'br-v6.2i' of git://linuxtv.org/hverkuil/media_tree into media_stage 2022-12-07 17:58:47 +01:00
memory
misc cxl: fix typo in comment 2022-11-24 23:12:19 +11:00
net 9p-for-6.2-rc1 2022-12-23 11:39:18 -08:00
pcmcia
ras
rdma RDMA: Extend RDMA kernel verbs ABI to support flush 2022-12-09 19:36:01 -04:00
rv
scsi Merge branch '6.2/scsi-queue' into 6.2/scsi-fixes 2022-12-30 16:29:34 +00:00
soc Networking changes for 6.2. 2022-12-13 15:47:48 -08:00
sound ALSA: hda/hdmi: fix stream-id config keep-alive for rt suspend 2022-12-09 12:06:15 +01:00
target scsi: target: core: Send max transfer length in blocks 2022-11-24 02:16:19 +00:00
trace pwm: Changes for v6.2-rc1 2022-12-21 09:41:28 -08:00
uapi SCSI misc on 20221222 2022-12-22 11:22:31 -08:00
ufs scsi: ufs: core: Fix devfreq deadlocks 2023-01-18 19:08:37 -05:00
vdso
video fbdev: omapfb: connector-analog-tv: remove support for platform data 2022-12-14 20:01:49 +01:00
xen xen: fix xen.h build for CONFIG_XEN_PVH=y 2022-12-05 12:59:49 +01:00