2021-09-30 17:44:51 +08:00
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/
|
|
|
|
Contact: Andi Kleen <ak@linux.intel.com>
|
|
|
|
Date: Feb, 2007
|
|
|
|
Description:
|
|
|
|
(X = CPU number)
|
|
|
|
|
|
|
|
Machine checks report internal hardware error conditions
|
|
|
|
detected by the CPU. Uncorrected errors typically cause a
|
|
|
|
machine check (often with panic), corrected ones cause a
|
|
|
|
machine check log entry.
|
|
|
|
|
|
|
|
For more details about the x86 machine check architecture
|
|
|
|
see the Intel and AMD architecture manuals from their
|
|
|
|
developer websites.
|
|
|
|
|
|
|
|
For more details about the architecture
|
|
|
|
see http://one.firstfloor.org/~andi/mce.pdf
|
|
|
|
|
|
|
|
Each CPU has its own directory.
|
|
|
|
|
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/bank<Y>
|
|
|
|
Contact: Andi Kleen <ak@linux.intel.com>
|
|
|
|
Date: Feb, 2007
|
|
|
|
Description:
|
|
|
|
(Y bank number)
|
|
|
|
|
|
|
|
64bit Hex bitmask enabling/disabling specific subevents for
|
|
|
|
bank Y.
|
|
|
|
|
|
|
|
When a bit in the bitmask is zero then the respective
|
|
|
|
subevent will not be reported.
|
|
|
|
|
|
|
|
By default all events are enabled.
|
|
|
|
|
|
|
|
Note that BIOS maintain another mask to disable specific events
|
|
|
|
per bank. This is not visible here
|
|
|
|
|
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/check_interval
|
|
|
|
Contact: Andi Kleen <ak@linux.intel.com>
|
|
|
|
Date: Feb, 2007
|
|
|
|
Description:
|
|
|
|
The entries appear for each CPU, but they are truly shared
|
|
|
|
between all CPUs.
|
|
|
|
|
|
|
|
How often to poll for corrected machine check errors, in
|
|
|
|
seconds (Note output is hexadecimal). Default 5 minutes.
|
|
|
|
When the poller finds MCEs it triggers an exponential speedup
|
|
|
|
(poll more often) on the polling interval. When the poller
|
|
|
|
stops finding MCEs, it triggers an exponential backoff
|
|
|
|
(poll less often) on the polling interval. The check_interval
|
|
|
|
variable is both the initial and maximum polling interval.
|
|
|
|
0 means no polling for corrected machine check errors
|
|
|
|
(but some corrected errors might be still reported
|
|
|
|
in other ways)
|
|
|
|
|
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/trigger
|
|
|
|
Contact: Andi Kleen <ak@linux.intel.com>
|
|
|
|
Date: Feb, 2007
|
|
|
|
Description:
|
|
|
|
The entries appear for each CPU, but they are truly shared
|
|
|
|
between all CPUs.
|
|
|
|
|
|
|
|
Program to run when a machine check event is detected.
|
|
|
|
This is an alternative to running mcelog regularly from cron
|
|
|
|
and allows to detect events faster.
|
|
|
|
|
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/monarch_timeout
|
|
|
|
Contact: Andi Kleen <ak@linux.intel.com>
|
|
|
|
Date: Feb, 2007
|
|
|
|
Description:
|
|
|
|
How long to wait for the other CPUs to machine check too on a
|
|
|
|
exception. 0 to disable waiting for other CPUs.
|
|
|
|
|
|
|
|
Unit: us
|
|
|
|
|
2021-09-30 17:44:52 +08:00
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/ignore_ce
|
|
|
|
Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
|
|
|
|
Date: Jun 2009
|
|
|
|
Description:
|
|
|
|
Disables polling and CMCI for corrected errors.
|
|
|
|
All corrected events are not cleared and kept in bank MSRs.
|
|
|
|
|
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/dont_log_ce
|
|
|
|
Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
|
|
|
|
Date: Jun 2009
|
|
|
|
Description:
|
|
|
|
Disables logging for corrected errors.
|
|
|
|
All reported corrected errors will be cleared silently.
|
|
|
|
|
|
|
|
This option will be useful if you never care about corrected
|
|
|
|
errors.
|
|
|
|
|
|
|
|
What: /sys/devices/system/machinecheck/machinecheckX/cmci_disabled
|
|
|
|
Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
|
|
|
|
Date: Jun 2009
|
|
|
|
Description:
|
|
|
|
Disables the CMCI feature.
|