xen: don't hang when resuming PCI device

If a xen domain with at least two VCPUs has a PCI device attached which
enters the D3hot state during suspend, the kernel may hang while
resuming, depending on the core on which an async resume task gets
scheduled.

The bug occurs because xen's do_suspend calls dpm_resume_start while
only the timer of the boot CPU has been resumed (when xen_suspend called
syscore_resume), before calling xen_arch_resume to resume the timers of
the other CPUs. This breaks pci_dev_d3_sleep.

Thus this patch moves the call to xen_arch_resume before the call to
dpm_resume_start, eliminating the hangs and restoring the stack-like
structure of the suspend/restore procedure.
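
The ordering problem can be illustrated with a small user-space model.
This is only a sketch, not kernel code: every model_* name is hypothetical,
and it assumes the breakage is that the delay in pci_dev_d3_sleep never
expires on a CPU whose timer has not been resumed yet.

```c
#include <stdbool.h>

#define NR_CPUS 2

/* Toy model: which CPUs have a working timer again after resume. */
static bool timer_resumed[NR_CPUS];

static void model_reset(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		timer_resumed[cpu] = false;
}

/* Stands in for syscore_resume(): only the boot CPU's timer comes back. */
static void model_syscore_resume(void)
{
	timer_resumed[0] = true;
}

/* Stands in for xen_arch_resume(): timers of the other CPUs come back. */
static void model_xen_arch_resume(void)
{
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		timer_resumed[cpu] = true;
}

/*
 * Stands in for dpm_resume_start() running an async resume task on `cpu`.
 * Assumption: the task needs a timed sleep, so it only completes if the
 * timer of the CPU it runs on has already been resumed.
 */
static bool model_dpm_resume_start(int cpu)
{
	return timer_resumed[cpu];
}

/* Returns true if the modelled resume completes, false if it would hang. */
static bool model_resume(bool arch_resume_first, int async_cpu)
{
	bool ok;

	model_reset();
	model_syscore_resume();
	if (arch_resume_first)
		model_xen_arch_resume();	/* order after this patch */
	ok = model_dpm_resume_start(async_cpu);
	if (!arch_resume_first)
		model_xen_arch_resume();	/* order before this patch */
	return ok;
}
```

In this model, model_resume(false, 1) reports a hang because the async
task lands on a CPU without a timer, while model_resume(true, cpu)
completes for either CPU.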

Signed-off-by: Jakub Kądziołka <niedzejkob@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220323012103.2537-1-niedzejkob@invisiblethingslab.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Jakub Kądziołka 2022-03-23 02:21:03 +01:00 committed by Boris Ostrovsky
parent 309b517276
commit ff32baa1f3

@@ -141,6 +141,8 @@ static void do_suspend(void)
 
 	raw_notifier_call_chain(&xen_resume_notifier, 0, NULL);
 
+	xen_arch_resume();
+
 	dpm_resume_start(si.cancelled ? PMSG_THAW : PMSG_RESTORE);
 
 	if (err) {
@@ -148,8 +150,6 @@ static void do_suspend(void)
 		si.cancelled = 1;
 	}
 
-	xen_arch_resume();
-
 out_resume:
 	if (!si.cancelled)
 		xs_resume();