linux/kernel/power
Tetsuo Handa 2f0e18e0db PM: hibernate: defer device probing when resuming from hibernation
[ Upstream commit 8386c414e2 ]

syzbot is reporting a hung task at misc_open() [1], because there is a race
window for an AB-BA deadlock which involves the probe_count variable. Currently
wait_for_device_probe() from snapshot_open() from misc_open() can sleep
forever with misc_mtx held if probe_count cannot become 0.

When a device is probed by the hub_event() work function, probe_count is
incremented before the probe function starts, and probe_count is
decremented after the probe function completes.
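
The counting rule above can be modelled in a few lines of userspace C. This is
only a sketch under stated assumptions: probe_device(), probes_idle() and
sample_probe() are hypothetical names used for illustration, and a plain int
stands in for the kernel's counter, which a real waiter would sleep on rather
than poll.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of the probe counter. */
static int probe_count;

/* Stand-in type for a driver probe callback. */
typedef int (*probe_fn)(void);

/* Count up, run the probe, count down: the ordering described above. */
int probe_device(probe_fn probe)
{
	int ret;

	probe_count++;		/* before the probe function starts */
	ret = probe();		/* if this never returns, the count stays > 0 */
	probe_count--;		/* after the probe function returns */
	return ret;
}

/* A waiter may proceed only when no probe is in flight. */
bool probes_idle(void)
{
	return probe_count == 0;
}

/* Example probe that records the counter value it sees while running. */
static int observed_during_probe = -1;

int sample_probe(void)
{
	observed_during_probe = probe_count;
	return 0;
}
```

Calling probe_device(sample_probe) shows that the counter is nonzero for the
whole duration of the probe, which is why a probe that never returns keeps a
waiter blocked indefinitely.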

There are three cases that can prevent probe_count from dropping to 0.

  (a) A device being probed stopped responding (i.e. broken/malicious
      hardware).

  (b) A process emulating a USB device using the /dev/raw-gadget interface
      stopped responding for some reason.

  (c) New device probe requests keep coming in before existing device
      probe requests complete.

The phenomenon syzbot is reporting is (b). A process which is holding
system_transition_mutex and misc_mtx is waiting for probe_count to become
0 inside wait_for_device_probe(), but the probe function which is called
from the hub_event() work function is waiting for the processes which are
blocked at mutex_lock(&misc_mtx) to respond via the /dev/raw-gadget interface.

This patch mitigates (b) by deferring wait_for_device_probe() from
snapshot_open() to snapshot_write() and snapshot_ioctl(). Please note that
the possibility of (b) remains as long as any thread which is emulating a
USB device via /dev/raw-gadget interface can be blocked by uninterruptible
blocking operations (e.g. mutex_lock()).
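
The deferral described above can be sketched as a userspace model. This is not
the kernel patch itself: the need_wait flag, the model_snapshot_*() functions
and mock_wait_for_device_probe() are hypothetical stand-ins chosen for
illustration, and the mock only counts calls where the real
wait_for_device_probe() would sleep until probe_count reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

static bool need_wait;	/* set at open time, consumed at first real use */
static int wait_calls;	/* how many times the mock wait was invoked */

/* Mock: the real function blocks until all pending probes have finished. */
void mock_wait_for_device_probe(void)
{
	wait_calls++;
}

/* Perform the deferred wait at most once per open. */
void wait_if_needed(void)
{
	if (need_wait) {
		mock_wait_for_device_probe();
		need_wait = false;
	}
}

/* Open for resume: only record that a wait is required, so the open
 * path (which runs with misc_mtx held) no longer blocks on probing. */
void model_snapshot_open(void)
{
	need_wait = true;
}

/* First write or ioctl pays for the wait, outside the open path. */
void model_snapshot_write(void)
{
	wait_if_needed();
	/* ... the image data would be consumed here ... */
}

void model_snapshot_ioctl(void)
{
	wait_if_needed();
	/* ... the requested command would be handled here ... */
}
```

In this model the open call returns immediately, and the wait happens exactly
once, on whichever of write or ioctl is called first.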

Please also note that (a) and (c) are not addressed. Regarding (c), we
should change the code to wait for only the one device which contains the
image for resuming from hibernation. I don't know how to address (a), for
using a timeout for wait_for_device_probe() might result in loss of user
data in the image. Maybe we should require userland to wait for the
image device before opening the /dev/snapshot interface.

Link: https://syzkaller.appspot.com/bug?extid=358c9ab4c93da7b7238c [1]
Reported-by: syzbot <syzbot+358c9ab4c93da7b7238c@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+358c9ab4c93da7b7238c@syzkaller.appspotmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17 14:23:04 +02:00
autosleep.c PM: sleep: fix typos in comments 2021-04-08 19:37:21 +02:00
console.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
energy_model.c PM: EM: Fix inefficient states detection 2021-11-18 19:16:30 +01:00
hibernate.c PM: hibernate: fix __setup handler error handling 2022-04-08 14:23:07 +02:00
Kconfig PM: sleep: remove trailing spaces and tabs 2021-06-11 18:49:09 +02:00
main.c PM: s2idle: ACPI: Fix wakeup interrupts handling 2022-02-16 12:56:19 +01:00
Makefile PM: hibernate: Split off snapshot dev option 2020-05-19 17:48:08 +02:00
power.h kernel/power: allow hibernation with page_poison sanity checking 2020-12-15 12:13:46 -08:00
poweroff.c kernel/power: constify sysrq_key_op 2020-05-15 14:53:20 +02:00
process.c PM: s2idle: ACPI: Fix wakeup interrupts handling 2022-02-16 12:56:19 +01:00
qos.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
snapshot.c PM: hibernate: Remove register_nosave_region_late() 2022-02-16 12:56:15 +01:00
suspend_test.c PM: suspend: fix return value of __setup handler 2022-04-08 14:23:07 +02:00
suspend.c PM: s2idle: ACPI: Fix wakeup interrupts handling 2022-02-16 12:56:19 +01:00
swap.c PM: hibernate: fix sparse warnings 2021-11-18 19:16:38 +01:00
user.c PM: hibernate: defer device probing when resuming from hibernation 2022-08-17 14:23:04 +02:00
wakelock.c PM: wakeup: simplify the output logic of pm_show_wakelocks() 2022-02-01 17:27:00 +01:00