mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-11-26 21:54:11 +08:00
15cc3ae001
Yi Zhang reported the following failure on a 2-socket Haswell (E5-2603v3)
server (DELL PowerEdge 730xd):
EDAC sbridge: Some needed devices are missing
EDAC MC: Removed device 0 for sb_edac.c Haswell SrcID#0_Ha#0: DEV 0000:7f:12.0
EDAC MC: Removed device 1 for sb_edac.c Haswell SrcID#1_Ha#0: DEV 0000:ff:12.0
EDAC sbridge: Couldn't find mci handler
EDAC sbridge: Couldn't find mci handler
EDAC sbridge: Failed to register device with error -19.
The refactored sb_edac driver creates the IMC1 (the 2nd memory
controller) if any IMC1 device is present. In this case only
HA1_TA of IMC1 was present, but the driver expected to find
HA1/HA1_TM/HA1_TAD[0-3] devices too, leading to the above failure.
The document [1] says the 'E5-2603 v3' CPU has 4 memory channels max. Yi
Zhang inserted one DIMM per channel for each CPU, and did random error
address injection test with this patch:
4024 addresses fell in TOLM hole area
12715 addresses fell in CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
12774 addresses fell in CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
12798 addresses fell in CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
12913 addresses fell in CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
12674 addresses fell in CPU_SrcID#1_Ha#0_Chan#0_DIMM#0
12686 addresses fell in CPU_SrcID#1_Ha#0_Chan#1_DIMM#0
12882 addresses fell in CPU_SrcID#1_Ha#0_Chan#2_DIMM#0
12934 addresses fell in CPU_SrcID#1_Ha#0_Chan#3_DIMM#0
106400 addresses were injected totally.
The test result shows that all the 4 channels belong to IMC0 per CPU, so
the server really only has one IMC per CPU.
In the 1st page of chapter 2 in datasheet [2], it also says 'E5-2600 v3'
implements either one or two IMCs. For CPUs with one IMC, IMC1 is not
used and should be ignored.
Thus, do not create a second memory controller if the key HA1 is absent.
[1] http://ark.intel.com/products/83349/Intel-Xeon-Processor-E5-2603-v3-15M-Cache-1_60-GHz
[2] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf
Reported-and-tested-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes:
|
||
---|---|---|
.. | ||
altera_edac.c | ||
altera_edac.h | ||
amd64_edac_dbg.c | ||
amd64_edac_inj.c | ||
amd64_edac.c | ||
amd64_edac.h | ||
amd76x_edac.c | ||
amd8111_edac.c | ||
amd8111_edac.h | ||
amd8131_edac.c | ||
amd8131_edac.h | ||
cell_edac.c | ||
cpc925_edac.c | ||
debugfs.c | ||
e7xxx_edac.c | ||
e752x_edac.c | ||
edac_device_sysfs.c | ||
edac_device.c | ||
edac_device.h | ||
edac_mc_sysfs.c | ||
edac_mc.c | ||
edac_mc.h | ||
edac_module.c | ||
edac_module.h | ||
edac_pci_sysfs.c | ||
edac_pci.c | ||
edac_pci.h | ||
fsl_ddr_edac.c | ||
fsl_ddr_edac.h | ||
ghes_edac.c | ||
highbank_l2_edac.c | ||
highbank_mc_edac.c | ||
i7core_edac.c | ||
i3000_edac.c | ||
i3200_edac.c | ||
i5000_edac.c | ||
i5100_edac.c | ||
i5400_edac.c | ||
i7300_edac.c | ||
i82443bxgx_edac.c | ||
i82860_edac.c | ||
i82875p_edac.c | ||
i82975x_edac.c | ||
ie31200_edac.c | ||
Kconfig | ||
layerscape_edac.c | ||
Makefile | ||
mce_amd.c | ||
mce_amd.h | ||
mpc85xx_edac.c | ||
mpc85xx_edac.h | ||
mv64x60_edac.c | ||
mv64x60_edac.h | ||
octeon_edac-l2c.c | ||
octeon_edac-lmc.c | ||
octeon_edac-pc.c | ||
octeon_edac-pci.c | ||
pasemi_edac.c | ||
pnd2_edac.c | ||
pnd2_edac.h | ||
ppc4xx_edac.c | ||
ppc4xx_edac.h | ||
r82600_edac.c | ||
sb_edac.c | ||
skx_edac.c | ||
synopsys_edac.c | ||
thunderx_edac.c | ||
tile_edac.c | ||
wq.c | ||
x38_edac.c | ||
xgene_edac.c |