watchdog/hpwdt: Disable NMI in Crash Kernel

NMIs received during the crash path are problematic because
hpwdt_pretimeout's handling of the NMI would cause reentry into kdump.

The situation is complicated in that I/O errors can be signaled as NMIs,
circumventing hpwdt_pretimeout's attempt to avoid claiming NMIs that are
not associated with either the WDT or the iLO NMI switch.  Such an NMI
can additionally cause a secondary NMI, which causes the system to hang.

By disabling pretimeout and kdumptimeout in the crash path we both
reduce the risk of receiving an NMI and simultaneously leave the WDT
running (if it was already in use), so that a WDT reset can break the
system out of a hang.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1606097320-56762-2-git-send-email-jerry.hoemann@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Jerry Hoemann 2020-11-22 19:08:39 -07:00 committed by Wim Van Sebroeck
parent 42e967f3c6
commit acc195bd2c


@@ -21,6 +21,7 @@
 #include <linux/types.h>
 #include <linux/watchdog.h>
 #include <asm/nmi.h>
+#include <linux/crash_dump.h>
 
 #define HPWDT_VERSION "2.0.3"
 #define SECS_TO_TICKS(secs) ((secs) * 1000 / 128)
@@ -334,6 +335,11 @@ static int hpwdt_init_one(struct pci_dev *dev,
 	watchdog_set_nowayout(&hpwdt_dev, nowayout);
 	watchdog_init_timeout(&hpwdt_dev, soft_margin, NULL);
 
+	if (is_kdump_kernel()) {
+		pretimeout = 0;
+		kdumptimeout = 0;
+	}
+
 	if (pretimeout && hpwdt_dev.timeout <= PRETIMEOUT_SEC) {
 		dev_warn(&dev->dev, "timeout <= pretimeout. Setting pretimeout to zero\n");
 		pretimeout = 0;
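
The init-time policy described in the commit message can be modeled in
userspace as a rough sketch. This is not driver code: struct wdt_cfg and
apply_crash_policy() are hypothetical names made up for illustration,
the in_kdump_kernel flag stands in for is_kdump_kernel(), and the value
9 for PRETIMEOUT_SEC and the -1 default for kdumptimeout are assumptions
about the driver rather than something shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; the diff only shows that the constant exists. */
#define PRETIMEOUT_SEC 9

/* Hypothetical userspace stand-in for the driver's module parameters. */
struct wdt_cfg {
	unsigned int timeout;	/* seconds until the WDT resets the system */
	bool pretimeout;	/* deliver an NMI shortly before the reset? */
	int kdumptimeout;	/* timeout reprogrammed on the panic path */
};

/*
 * Mirror the order of checks in the patched hpwdt_init_one(): the crash
 * kernel override comes first, then the existing timeout sanity check.
 * The timeout itself is never touched, so a WDT that was already running
 * keeps counting and can still reset a hung crash kernel.
 */
struct wdt_cfg apply_crash_policy(struct wdt_cfg cfg, bool in_kdump_kernel)
{
	if (in_kdump_kernel) {
		cfg.pretimeout = false;
		cfg.kdumptimeout = 0;
	}

	if (cfg.pretimeout && cfg.timeout <= PRETIMEOUT_SEC)
		cfg.pretimeout = false;

	/* Postcondition: no pretimeout NMI is ever armed in a crash kernel. */
	assert(!(in_kdump_kernel && cfg.pretimeout));
	return cfg;
}
```

Calling apply_crash_policy() with { 30, true, -1 } and in_kdump_kernel
set returns a configuration with pretimeout off and kdumptimeout 0,
while the 30 second timeout is unchanged; with the flag clear the same
input comes back unmodified.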