linux/drivers/edac
Robert Richter 23f61b9fc5 EDAC/ghes: Fix locking and memory barrier issues
The ghes registration and refcount are broken in several ways:

 * ghes_edac_register() returns with success for a 2nd instance
   even if a first instance's registration is still running. This is
   not correct as the first instance may fail later. A subsequent
   registration may not finish before the first. Parallel registrations
   must be avoided.

 * The refcount was increased even if a registration failed. This
   leads to stale counters preventing the device from being released.

 * The ghes refcount may not be decremented properly on unregistration.
   Always decrement the refcount once ghes_edac_unregister() is called to
   keep the refcount sane.

 * The ghes_pvt pointer is handed to the irq handler before registration
   finished.

 * The mci structure could be freed while the irq handler is running.

Fix this by adding a mutex to ghes_edac_register(). This mutex
serializes instances to register and unregister. The refcount is only
increased if the registration succeeded. This makes sure the refcount is
in a consistent state after registering or unregistering a device.
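
The serialization described above can be sketched in userspace as follows.
This is a minimal illustration, not the driver code: a pthread mutex stands
in for the kernel mutex, a plain counter stands in for the refcount API, and
every name (reg_mutex, reg_refcount, do_setup, sketch_register,
sketch_unregister, sketch_refcount, sketch_demo) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
static pthread_mutex_t reg_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int reg_refcount;	/* protected by reg_mutex */

/* Simulated setup step that may fail (the real one may also sleep). */
static int do_setup(bool should_fail)
{
	return should_fail ? -1 : 0;
}

/* Register one instance; parallel callers are serialized by reg_mutex. */
int sketch_register(bool should_fail)
{
	int rc = 0;

	pthread_mutex_lock(&reg_mutex);

	if (reg_refcount > 0) {
		/* Already registered: only take another reference. */
		reg_refcount++;
		goto unlock;
	}

	rc = do_setup(should_fail);
	if (rc)
		goto unlock;	/* failure: refcount stays untouched */

	reg_refcount = 1;	/* counted only after setup succeeded */

unlock:
	pthread_mutex_unlock(&reg_mutex);
	return rc;
}

/* Drop one reference; returns true if this was the last one. */
bool sketch_unregister(void)
{
	bool last = false;

	pthread_mutex_lock(&reg_mutex);
	if (reg_refcount > 0 && --reg_refcount == 0)
		last = true;	/* teardown would happen here */
	pthread_mutex_unlock(&reg_mutex);

	return last;
}

/* Read the current count under the lock. */
unsigned int sketch_refcount(void)
{
	unsigned int n;

	pthread_mutex_lock(&reg_mutex);
	n = reg_refcount;
	pthread_mutex_unlock(&reg_mutex);

	return n;
}

/* Usage: a failed registration must not leave a stale reference. */
void sketch_demo(void)
{
	assert(sketch_register(true) != 0);
	assert(sketch_refcount() == 0);
	assert(sketch_register(false) == 0);
	assert(sketch_unregister());
}
```

Because the whole check-setup-count sequence runs under one lock, a second
caller cannot observe a half-finished registration.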

Note: A spinlock cannot be used here as the code section may sleep.

The ghes_pvt pointer is now protected by ghes_lock. This ensures the
pointer is not updated before registration has finished or while the irq
handler is running. It is unset before the device is unregistered,
including the necessary (implicit) memory barriers that make the change
visible to other CPUs. Thus, the device can no longer be used by an
interrupt.
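
The pointer handling described above can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the driver
code: a pthread mutex stands in for whatever lock type ghes_lock actually
is, and all names (struct pvt_sketch, pvt_lock, cur_pvt, pvt_publish,
pvt_unpublish, pvt_handle_event) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical private data that the handler dereferences. */
struct pvt_sketch {
	unsigned int events;
};

static pthread_mutex_t pvt_lock = PTHREAD_MUTEX_INITIALIZER;
static struct pvt_sketch *cur_pvt;	/* protected by pvt_lock */

/* Publish the pointer only once setup has fully completed. */
void pvt_publish(struct pvt_sketch *p)
{
	pthread_mutex_lock(&pvt_lock);
	cur_pvt = p;
	pthread_mutex_unlock(&pvt_lock);
}

/*
 * Unset the pointer before teardown. Once this returns, no handler
 * is inside the critical section and none can see the old pointer,
 * so the caller may free the returned object.
 */
struct pvt_sketch *pvt_unpublish(void)
{
	struct pvt_sketch *old;

	pthread_mutex_lock(&pvt_lock);
	old = cur_pvt;
	cur_pvt = NULL;
	pthread_mutex_unlock(&pvt_lock);

	return old;
}

/* Stand-in for the handler: returns -1 if nothing is registered. */
int pvt_handle_event(void)
{
	int rc = -1;

	pthread_mutex_lock(&pvt_lock);
	if (cur_pvt) {
		cur_pvt->events++;	/* safe: pointer is pinned by the lock */
		rc = 0;
	}
	pthread_mutex_unlock(&pvt_lock);

	return rc;
}
```

The lock acquire and release also order the pointer update against the
handler's read, which is the role the implicit barriers play in the text
above.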

Also, rename ghes_init to ghes_refcount for better readability and
switch to the refcount API.

A refcount is needed because there can be multiple GHES structures being
defined (see ACPI 6.3 specification, 18.3.2.7 Generic Hardware Error
Source, "Some platforms may describe multiple Generic Hardware Error
Source structures with different notification types, ...").

Another approach, using the mci's device refcount (get_device()) with a
release function, does not work here. A release function is called only
from device_release(), on the last put_device() call. The device must be
deleted *before* that with device_del(). This is only possible by
maintaining a separate refcount.

 [ bp: touchups. ]

Fixes: 0fe5f281f7 ("EDAC, ghes: Model a single, logical memory controller")
Fixes: 1e72e673b9 ("EDAC/ghes: Fix Use after free in ghes_edac remove path")
Co-developed-by: James Morse <james.morse@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20191105200732.3053-1-rrichter@marvell.com
2019-11-08 16:28:28 +01:00
altera_edac.c EDAC/altera: Use the proper type for the IRQ status bits 2019-08-07 10:37:34 +02:00
altera_edac.h edac: altera: Move Stratix10 SDRAM ECC to peripheral 2019-07-25 14:28:42 -04:00
amd64_edac_dbg.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
amd64_edac_inj.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
amd64_edac.c EDAC/amd64: Check for memory before fully initializing an instance 2019-11-06 11:10:11 +01:00
amd64_edac.h EDAC/amd64: Save max number of controllers to family type 2019-11-06 11:07:01 +01:00
amd76x_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
amd8111_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
amd8111_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
amd8131_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
amd8131_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
armada_xp_edac.c ARM: 8891/1: EDAC: armada_xp: Add support for more SoCs 2019-08-29 07:58:01 +01:00
aspeed_edac.c EDAC/aspeed: Use devm_platform_ioremap_resource() in aspeed_probe() 2019-10-24 11:17:29 +02:00
bluefield_edac.c EDAC, mellanox: Add ECC support for BlueField DDR4 2019-08-08 12:57:01 -03:00
cell_edac.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
cpc925_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
debugfs.c ARM: 8892/1: EDAC: Add missing debugfs_create_x32 wrapper 2019-08-29 07:58:01 +01:00
e7xxx_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
e752x_edac.c EDAC: Fix indentation issues in several EDAC drivers 2018-11-10 16:56:16 +01:00
edac_device_sysfs.c edac: move EDAC device definitions to drivers/edac/edac_device.h 2016-12-15 08:54:51 -02:00
edac_device.c EDAC/device: Rework error logging API 2019-10-09 13:01:42 +02:00
edac_device.h EDAC/device: Rework error logging API 2019-10-09 13:01:42 +02:00
edac_mc_sysfs.c EDAC/mc_sysfs: Make debug messages consistent 2019-09-04 11:39:19 +02:00
edac_mc.c EDAC: Prefer 'unsigned int' to bare use of 'unsigned' 2019-09-03 19:21:19 +02:00
edac_mc.h EDAC: Prefer 'unsigned int' to bare use of 'unsigned' 2019-09-03 19:21:19 +02:00
edac_module.c treewide: Fix function prototypes for module_param_call() 2017-10-31 15:30:37 +01:00
edac_module.h ARM: 8892/1: EDAC: Add missing debugfs_create_x32 wrapper 2019-08-29 07:58:01 +01:00
edac_pci_sysfs.c edac: move documentation from edac_pci*.c to edac_pci.h 2016-12-15 08:54:51 -02:00
edac_pci.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
edac_pci.h edac: move documentation from edac_pci*.c to edac_pci.h 2016-12-15 08:54:51 -02:00
fsl_ddr_edac.c EDAC, fsl_ddr: Add LS1021A to the list of supported hardware 2018-12-19 11:57:45 +01:00
fsl_ddr_edac.h EDAC, fsl_ddr: Add LS1021A to the list of supported hardware 2018-12-19 11:57:45 +01:00
ghes_edac.c EDAC/ghes: Fix locking and memory barrier issues 2019-11-08 16:28:28 +01:00
highbank_l2_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
highbank_mc_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
i7core_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 172 2019-05-30 11:26:39 -07:00
i10nm_base.c x86/intel: Aggregate microserver naming 2019-08-28 11:29:32 +02:00
i3000_edac.c EDAC: Fix indentation issues in several EDAC drivers 2018-11-10 16:56:16 +01:00
i3200_edac.c EDAC: Correct DIMM capacity unit symbol 2018-09-22 18:18:57 +02:00
i5000_edac.c EDAC, i5000: Remove set but not used local variables 2018-12-11 14:53:49 +01:00
i5100_edac.c EDAC: i5100_edac: get rid of an unused var 2019-09-30 15:41:54 -03:00
i5400_edac.c EDAC: i5400_edac: get rid of some unused vars 2019-09-30 15:41:54 -03:00
i7300_edac.c EDAC: i7300_edac: fix a kernel-doc syntax 2019-09-30 15:41:54 -03:00
i82443bxgx_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
i82860_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
i82875p_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
i82975x_edac.c EDAC, i82975x: Fix spelling mistake "reserverd" -> "reserved" 2018-11-20 17:46:01 +01:00
ie31200_edac.c EDAC/ie31200: Reformat PCI device table 2019-06-20 11:44:36 -07:00
Kconfig ARM updates for 5.4-rc1: 2019-09-22 09:39:09 -07:00
layerscape_edac.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
Makefile ARM updates for 5.4-rc1: 2019-09-22 09:39:09 -07:00
mce_amd.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mce_amd.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mpc85xx_edac.c EDAC, mpc85xx: Add T2080 l2-cache support 2017-02-03 10:36:35 +01:00
mpc85xx_edac.h EDAC, fsl-ddr: Separate FSL DDR driver from MPC85xx 2016-09-01 10:28:00 +02:00
mv64x60_edac.c EDAC, mv64x60: Fix an error handling path 2018-01-09 20:14:23 +01:00
mv64x60_edac.h edac: Drop __DATE__ usage 2011-04-19 00:23:22 +02:00
octeon_edac-l2c.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
octeon_edac-lmc.c EDAC, octeon: Fix an uninitialized variable warning 2017-11-27 11:57:26 +01:00
octeon_edac-pc.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
octeon_edac-pci.c edac: rename edac_core.h to edac_mc.h 2016-12-15 08:54:51 -02:00
pasemi_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333 2019-06-05 17:37:06 +02:00
pnd2_edac.c Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-09-16 18:47:53 -07:00
pnd2_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
ppc4xx_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
ppc4xx_edac.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441 2019-06-05 17:37:17 +02:00
qcom_edac.c EDAC, qcom_edac: Remove irq_handled local variable 2018-11-06 12:03:16 +01:00
r82600_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
sb_edac.c EDAC: sb_edac: get rid of unused vars 2019-09-30 15:41:54 -03:00
sifive_edac.c EDAC/sifive: Add EDAC platform driver for SiFive SoCs 2019-06-20 11:44:36 -07:00
skx_base.c EDAC, skx: Retrieve and print retry_rd_err_log registers 2019-10-18 15:27:58 -07:00
skx_common.c EDAC, skx: Retrieve and print retry_rd_err_log registers 2019-10-18 15:27:58 -07:00
skx_common.h EDAC, skx: Retrieve and print retry_rd_err_log registers 2019-10-18 15:27:58 -07:00
synopsys_edac.c EDAC, synopsys: Add Error Injection support for ZynqMP DDR controller 2018-11-06 10:38:27 +01:00
thunderx_edac.c EDAC, thunderx: Fix memory leak in thunderx_l2c_threaded_isr() 2018-10-13 13:58:06 +02:00
ti_edac.c EDAC, ti: Add support for TI keystone and DRA7xx EDAC 2017-11-27 13:51:19 +01:00
wq.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
x38_edac.c EDAC: Get rid of mci->mod_ver 2017-07-17 13:42:48 +02:00
xgene_edac.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00