linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-14 15:54:15 +08:00

Author	SHA1	Message	Date
Max Gurtovoy	b9abef43a0	vfio/pci: remove CONFIG_VFIO_PCI_ZDEV from Kconfig In case we're running on s390 system always expose the capabilities for configuration of zPCI devices. In case we're running on different platform, continue as usual. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-19 10:29:56 -07:00
Max Gurtovoy	7e31d6dc2c	vfio-pci/zdev: fix possible segmentation fault issue In case allocation fails, we must behave correctly and exit with error. Fixes: `e6b817d4b8` ("vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO") Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-02 09:06:02 -07:00
Max Gurtovoy	46c4746660	vfio-pci/zdev: remove unused vdev argument Zdev static functions do not use vdev argument. Remove it. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:43:06 -07:00
Heiner Kallweit	37a682ffbe	vfio/pci: Fix handling of pci use accessor return codes The pci user accessors return negative errno's on error. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: drop Fixes tag, pcibios_err_to_errno() behaves correctly for -errno] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:40:52 -07:00
Alexey Kardashevskiy	d22f9a6c92	vfio/pci/nvlink2: Do not attempt NPU2 setup on POWER8NVL NPU We execute certain NPU2 setup code (such as mapping an LPID to a device in NPU2) unconditionally if an Nvlink bridge is detected. However this cannot succeed on POWER8NVL machines as the init helpers return an error other than ENODEV which means the device is there is and setup failed so vfio_pci_enable() fails and pass through is not possible. This changes the two NPU2 related init helpers to return -ENODEV if there is no "memory-region" device tree property as this is the distinction between NPU and NPU2. Tested on - POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm, NVIDIA GV100 10de:1db1 driver 418.39 - POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm, NVIDIA P100 10de:15f9 driver 396.47 Fixes: `7f92891778` ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver") Cc: stable@vger.kernel.org # 5.0 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-12-02 13:04:22 -07:00
Jason Gunthorpe	7b06a56d46	vfio-pci: Use io_remap_pfn_range() for PCI IO memory commit `f8f6ae5d07` ("mm: always have io_remap_pfn_range() set pgprot_decrypted()") allows drivers using mmap to put PCI memory mapped BAR space into userspace to work correctly on AMD SME systems that default to all memory encrypted. Since vfio_pci_mmap_fault() is working with PCI memory mapped BAR space it should be calling io_remap_pfn_range() otherwise it will not work on SME systems. Fixes: `11c4cd07ba` ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Peter Xu <peterx@redhat.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-12-02 13:03:12 -07:00
Eric Auger	16b8fe4caf	vfio/pci: Move dummy_resources_list init in vfio_pci_probe() In case an error occurs in vfio_pci_enable() before the call to vfio_pci_probe_mmaps(), vfio_pci_disable() will try to iterate on an uninitialized list and cause a kernel panic. Lets move to the initialization to vfio_pci_probe() to fix the issue. Signed-off-by: Eric Auger <eric.auger@redhat.com> Fixes: `05f0c03fba` ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive") CC: Stable <stable@vger.kernel.org> # v4.7+ Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-12-02 13:03:12 -07:00
Fred Gao	e4eccb8536	vfio/pci: Bypass IGD init in case of -ENODEV Bypass the IGD initialization when -ENODEV returns, that should be the case if opregion is not available for IGD or within discrete graphics device's option ROM, or host/lpc bridge is not found. Then use of -ENODEV here means no special device resources found which needs special care for VFIO, but we still allow other normal device resource access. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Xiong Zhang <xiong.y.zhang@intel.com> Cc: Hang Yuan <hang.yuan@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Fred Gao <fred.gao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-03 11:07:40 -07:00
Alex Williamson	38565c93c8	vfio/pci: Implement ioeventfd thread handler for contended memory lock The ioeventfd is called under spinlock with interrupts disabled, therefore if the memory lock is contended defer code that might sleep to a thread context. Fixes: `bc93b9ae01` ("vfio-pci: Avoid recursive read-lock usage") Link: https://bugzilla.kernel.org/show_bug.cgi?id=209253#c1 Reported-by: Ian Pilcher <arequipeno@gmail.com> Tested-by: Ian Pilcher <arequipeno@gmail.com> Tested-by: Justin Gatzen <justin.gatzen@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-03 11:07:40 -07:00
Linus Torvalds	fc996db970	VFIO updates for v5.10-rc1 - New fsl-mc vfio bus driver supporting userspace drivers of objects within NXP's DPAA2 architecture (Diana Craciun) - Support for exposing zPCI information on s390 (Matthew Rosato) - Fixes for "detached" VFs on s390 (Matthew Rosato) - Fixes for pin-pages and dma-rw accesses (Yan Zhao) - Cleanups and optimize vconfig regen (Zenghui Yu) - Fix duplicate irq-bypass token registration (Alex Williamson) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJfkcCjAAoJECObm247sIsi2XIP/j7NL4glPrWU37mesz9dd5nx SmZhcmxnOqZSQkOCnu+hNFZ9e+tdQjuX+jATOZaYz5l55bLAFmBlBj1Dv8HWaCVI mTbJ6xXUwdOvNSxbFH6BIUkJg8otR0iEkefVyJLNlF84FsaDknH4yZxx0vdeczjF wTkkk3+4VmH+4klvPIa9v0eL7yeKeFmgls9nQViVE5kDWUF4us/z/oHlVm9wR+mL 2r3DEjHyz4L2hwVEkhZk7ytR6szdhuhF2l7NoMmaSEXRXjBzJoO6I3P9Y2W4i+su MFgTfiQ+OpIfVuiR8GzGev+/SrjWGX0Hvb2sYriKOELjhyedkE2kmxacbqMZ/UE+ SRAhFf64C1rzJ4g1IW//Gg+9ObIPqlkqU52VDbOZdCED0AquwSyVmdwIUAK6qF+I HLOyZXhMI8EZ+w063cS+aKLJIvQTBbfIdMmPZkopVZhwWB3N3BjdvBKA+rPpPoTx 0DpeUo891+zyeEE4aunUmCB8HFnBPgUa+XZqg2juq9MxjScsqgTzA0WEZg7jV4oj tORQrqoAKJgSk9oVL3EvAnr+IJix3ScRTqYymESORkz/lRCk2hFX48qdeW+qiSP8 W1DHOnivFb1+JzhuZyaRKFWy1mK0EQQWTsE2b2ymPMKJbFhi+pVxaksmeG5x+4Q9 SAp+Qma8Aj3UtBKcj/S+ =LDPo -----END PGP SIGNATURE----- Merge tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfio Pull VFIO updates from Alex Williamson: - New fsl-mc vfio bus driver supporting userspace drivers of objects within NXP's DPAA2 architecture (Diana Craciun) - Support for exposing zPCI information on s390 (Matthew Rosato) - Fixes for "detached" VFs on s390 (Matthew Rosato) - Fixes for pin-pages and dma-rw accesses (Yan Zhao) - Cleanups and optimize vconfig regen (Zenghui Yu) - Fix duplicate irq-bypass token registration (Alex Williamson) * tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits) vfio iommu type1: Fix memory leak in vfio_iommu_type1_pin_pages vfio/pci: Clear token on bypass registration failure vfio/fsl-mc: fix the return of the uninitialized variable ret vfio/fsl-mc: Fix the dead code in vfio_fsl_mc_set_irq_trigger vfio/fsl-mc: Fixed vfio-fsl-mc driver compilation on 32 bit MAINTAINERS: Add entry for s390 vfio-pci vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO vfio/fsl-mc: Add support for device reset vfio/fsl-mc: Add read/write support for fsl-mc devices vfio/fsl-mc: trigger an interrupt via eventfd vfio/fsl-mc: Add irq infrastructure for fsl-mc devices vfio/fsl-mc: Added lock support in preparation for interrupt handling vfio/fsl-mc: Allow userspace to MMAP fsl-mc device MMIO regions vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call vfio/fsl-mc: Implement VFIO_DEVICE_GET_INFO ioctl vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind vfio: Introduce capability definitions for VFIO_DEVICE_GET_INFO s390/pci: track whether util_str is valid in the zpci_dev s390/pci: stash version in the zpci_dev vfio/fsl-mc: Add VFIO framework skeleton for fsl-mc devices ...	2020-10-22 13:00:44 -07:00
Alex Williamson	852b1beecb	vfio/pci: Clear token on bypass registration failure The eventfd context is used as our irqbypass token, therefore if an eventfd is re-used, our token is the same. The irqbypass code will return an -EBUSY in this case, but we'll still attempt to unregister the producer, where if that duplicate token still exists, results in removing the wrong object. Clear the token of failed producers so that they harmlessly fall out when unregistered. Fixes: `6d7425f109` ("vfio: Register/unregister irq_bypass_producer") Reported-by: guomin chen <guomin_chen@sina.com> Tested-by: guomin chen <guomin_chen@sina.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-10-19 07:13:55 -06:00
Jann Horn	4d45e75a99	mm: remove the now-unnecessary mmget_still_valid() hack The preceding patches have ensured that core dumping properly takes the mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all its users. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-16 11:11:22 -07:00
Alex Williamson	2099363255	Merge branches 'v5.10/vfio/fsl-mc-v6' and 'v5.10/vfio/zpci-info-v3' into v5.10/vfio/next	2020-10-12 11:41:02 -06:00
Matthew Rosato	e6b817d4b8	vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO Define a new configuration entry VFIO_PCI_ZDEV for VFIO/PCI. When this s390-only feature is configured we add capabilities to the VFIO_DEVICE_GET_INFO ioctl that describe features of the associated zPCI device and its underlying hardware. This patch is based on work previously done by Pierre Morel. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-10-12 11:37:59 -06:00
Alex Williamson	3de066f8f8	Merge branches 'v5.10/vfio/bardirty', 'v5.10/vfio/dma_avail', 'v5.10/vfio/misc', 'v5.10/vfio/no-cmd-mem' and 'v5.10/vfio/yan_zhao_fixes' into v5.10/vfio/next	2020-09-22 10:56:51 -06:00
Matthew Rosato	515ecd5368	vfio/pci: Decouple PCI_COMMAND_MEMORY bit checks from is_virtfn While it is true that devices with is_virtfn=1 will have a Memory Space Enable bit that is hard-wired to 0, this is not the only case where we see this behavior -- For example some bare-metal hypervisors lack Memory Space Enable bit emulation for devices not setting is_virtfn (s390). Fix this by instead checking for the newly-added no_command_memory bit which directly denotes the need for PCI_COMMAND_MEMORY emulation in vfio. Fixes: `abafbc551f` ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-09-22 10:52:24 -06:00
Zenghui Yu	1c0f68252a	vfio/pci: Don't regenerate vconfig for all BARs if !bardirty Now we regenerate vconfig for all the BARs via vfio_bar_fixup(), every time any offset of any of them are read. Though BARs aren't re-read regularly, the regeneration can be avoided if no BARs had been written since they were last read, in which case vdev->bardirty is false. Let's return immediately in vfio_bar_fixup() if bardirty is false. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-09-21 14:08:12 -06:00
Zenghui Yu	eac7cc21c4	vfio/pci: Remove redundant declaration of vfio_pci_driver It was added by commit `137e553135` ("vfio/pci: Add sriov_configure support") but duplicates a forward declaration earlier in the file. Remove it. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-09-21 14:07:05 -06:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Alex Williamson	bc93b9ae01	vfio-pci: Avoid recursive read-lock usage A down_read on memory_lock is held when performing read/write accesses to MMIO BAR space, including across the copy_to/from_user() callouts which may fault. If the user buffer for these copies resides in an mmap of device MMIO space, the mmap fault handler will acquire a recursive read-lock on memory_lock. Avoid this by reducing the lock granularity. Sequential accesses requiring multiple ioread/iowrite cycles are expected to be rare, therefore typical accesses should not see additional overhead. VGA MMIO accesses are expected to be non-fatal regardless of the PCI memory enable bit to allow legacy probing, this behavior remains with a comment added. ioeventfds are now included in memory access testing, with writes dropped while memory space is disabled. Fixes: `abafbc551f` ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Reported-by: Zhiyi Guo <zhguo@redhat.com> Tested-by: Zhiyi Guo <zhguo@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-08-17 11:08:18 -06:00
Giovanni Cabiddu	50173329c8	vfio/pci: Add QAT devices to denylist The current generation of Intel® QuickAssist Technology devices are not designed to run in an untrusted environment because of the following issues reported in the document "Intel® QuickAssist Technology (Intel® QAT) Software for Linux" (document number 336211-014): QATE-39220 - GEN - Intel® QAT API submissions with bad addresses that trigger DMA to invalid or unmapped addresses can cause a platform hang QATE-7495 - GEN - An incorrectly formatted request to Intel® QAT can hang the entire Intel® QAT Endpoint The document is downloadable from https://01.org/intel-quickassist-technology at the following link: https://01.org/sites/default/files/downloads/336211-014-qatforlinux-releasenotes-hwv1.7_0.pdf This patch adds the following QAT devices to the denylist: DH895XCC, C3XXX and C62X. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-07-27 13:43:40 -06:00
Giovanni Cabiddu	1f97970e6c	vfio/pci: Add device denylist Add denylist of devices that by default are not probed by vfio-pci. Devices in this list may be susceptible to untrusted application, even if the IOMMU is enabled. To be accessed via vfio-pci, the user has to explicitly disable the denylist. The denylist can be disabled via the module parameter disable_denylist. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-07-27 13:43:40 -06:00
Alex Williamson	924b51abf9	vfio/pci: Hold igate across releasing eventfd contexts No need to release and immediately re-acquire igate while clearing out the eventfd ctxs. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-07-27 13:43:38 -06:00
Alex Williamson	bf3551e150	vfio/pci: Add Intel X550 to hidden INTx devices Intel document 333717-008, "Intel® Ethernet Controller X550 Specification Update", version 2.7, dated June 2020, includes errata #22, added in version 2.1, May 2016, indicating X550 NICs suffer from the same implementation deficiency as the 700-series NICs: "The Interrupt Status bit in the Status register of the PCIe configuration space is not implemented and is not set as described in the PCIe specification." Without the interrupt status bit, vfio-pci cannot determine when these devices signal INTx. They are therefore added to the nointx quirk. Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-07-27 13:43:37 -06:00
Zeng Tao	b872d06408	vfio/pci: fix racy on error and request eventfd ctx The vfio_pci_release call will free and clear the error and request eventfd ctx while these ctx could be in use at the same time in the function like vfio_pci_request, and it's expected to protect them under the vdev->igate mutex, which is missing in vfio_pci_release. This issue is introduced since commit `1518ac272e` ("vfio/pci: fix memory leaks of eventfd ctx"),and since commit `5c5866c593` ("vfio/pci: Clear error and request eventfd ctx after releasing"), it's very easily to trigger the kernel panic like this: [ 9513.904346] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 [ 9513.913091] Mem abort info: [ 9513.915871] ESR = 0x96000006 [ 9513.918912] EC = 0x25: DABT (current EL), IL = 32 bits [ 9513.924198] SET = 0, FnV = 0 [ 9513.927238] EA = 0, S1PTW = 0 [ 9513.930364] Data abort info: [ 9513.933231] ISV = 0, ISS = 0x00000006 [ 9513.937048] CM = 0, WnR = 0 [ 9513.940003] user pgtable: 4k pages, 48-bit VAs, pgdp=0000007ec7d12000 [ 9513.946414] [0000000000000008] pgd=0000007ec7d13003, p4d=0000007ec7d13003, pud=0000007ec728c003, pmd=0000000000000000 [ 9513.956975] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 9513.962521] Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio hclge hns3 hnae3 [last unloaded: vfio_pci] [ 9513.972998] CPU: 4 PID: 1327 Comm: bash Tainted: G W 5.8.0-rc4+ #3 [ 9513.980443] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B270.01 05/08/2020 [ 9513.989274] pstate: 80400089 (Nzcv daIf +PAN -UAO BTYPE=--) [ 9513.994827] pc : _raw_spin_lock_irqsave+0x48/0x88 [ 9513.999515] lr : eventfd_signal+0x6c/0x1b0 [ 9514.003591] sp : ffff800038a0b960 [ 9514.006889] x29: ffff800038a0b960 x28: ffff007ef7f4da10 [ 9514.012175] x27: ffff207eefbbfc80 x26: ffffbb7903457000 [ 9514.017462] x25: ffffbb7912191000 x24: ffff007ef7f4d400 [ 9514.022747] x23: ffff20be6e0e4c00 x22: 0000000000000008 [ 9514.028033] x21: 0000000000000000 x20: 0000000000000000 [ 9514.033321] x19: 0000000000000008 x18: 0000000000000000 [ 9514.038606] x17: 0000000000000000 x16: ffffbb7910029328 [ 9514.043893] x15: 0000000000000000 x14: 0000000000000001 [ 9514.049179] x13: 0000000000000000 x12: 0000000000000002 [ 9514.054466] x11: 0000000000000000 x10: 0000000000000a00 [ 9514.059752] x9 : ffff800038a0b840 x8 : ffff007ef7f4de60 [ 9514.065038] x7 : ffff007fffc96690 x6 : fffffe01faffb748 [ 9514.070324] x5 : 0000000000000000 x4 : 0000000000000000 [ 9514.075609] x3 : 0000000000000000 x2 : 0000000000000001 [ 9514.080895] x1 : ffff007ef7f4d400 x0 : 0000000000000000 [ 9514.086181] Call trace: [ 9514.088618] _raw_spin_lock_irqsave+0x48/0x88 [ 9514.092954] eventfd_signal+0x6c/0x1b0 [ 9514.096691] vfio_pci_request+0x84/0xd0 [vfio_pci] [ 9514.101464] vfio_del_group_dev+0x150/0x290 [vfio] [ 9514.106234] vfio_pci_remove+0x30/0x128 [vfio_pci] [ 9514.111007] pci_device_remove+0x48/0x108 [ 9514.115001] device_release_driver_internal+0x100/0x1b8 [ 9514.120200] device_release_driver+0x28/0x38 [ 9514.124452] pci_stop_bus_device+0x68/0xa8 [ 9514.128528] pci_stop_and_remove_bus_device+0x20/0x38 [ 9514.133557] pci_iov_remove_virtfn+0xb4/0x128 [ 9514.137893] sriov_disable+0x3c/0x108 [ 9514.141538] pci_disable_sriov+0x28/0x38 [ 9514.145445] hns3_pci_sriov_configure+0x48/0xb8 [hns3] [ 9514.150558] sriov_numvfs_store+0x110/0x198 [ 9514.154724] dev_attr_store+0x44/0x60 [ 9514.158373] sysfs_kf_write+0x5c/0x78 [ 9514.162018] kernfs_fop_write+0x104/0x210 [ 9514.166010] __vfs_write+0x48/0x90 [ 9514.169395] vfs_write+0xbc/0x1c0 [ 9514.172694] ksys_write+0x74/0x100 [ 9514.176079] __arm64_sys_write+0x24/0x30 [ 9514.179987] el0_svc_common.constprop.4+0x110/0x200 [ 9514.184842] do_el0_svc+0x34/0x98 [ 9514.188144] el0_svc+0x14/0x40 [ 9514.191185] el0_sync_handler+0xb0/0x2d0 [ 9514.195088] el0_sync+0x140/0x180 [ 9514.198389] Code: b9001020 d2800000 52800022 f9800271 (885ffe61) [ 9514.204455] ---[ end trace 648de00c8406465f ]--- [ 9514.212308] note: bash[1327] exited with preempt_count 1 Cc: Qian Cai <cai@lca.pw> Cc: Alex Williamson <alex.williamson@redhat.com> Fixes: `1518ac272e` ("vfio/pci: fix memory leaks of eventfd ctx") Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-07-17 08:28:40 -06:00
Alex Williamson	ebfa440ce3	vfio/pci: Fix SR-IOV VF handling with MMIO blocking SR-IOV VFs do not implement the memory enable bit of the command register, therefore this bit is not set in config space after pci_enable_device(). This leads to an unintended difference between PF and VF in hand-off state to the user. We can correct this by setting the initial value of the memory enable bit in our virtualized config space. There's really no need however to ever fault a user on a VF though as this would only indicate an error in the user's management of the enable bit, versus a PF where the same access could trigger hardware faults. Fixes: `abafbc551f` ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-06-25 11:04:23 -06:00
Alex Williamson	5c5866c593	vfio/pci: Clear error and request eventfd ctx after releasing The next use of the device will generate an underflow from the stale reference. Cc: Qian Cai <cai@lca.pw> Fixes: `1518ac272e` ("vfio/pci: fix memory leaks of eventfd ctx") Reported-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-06-17 15:18:42 -06:00
Michel Lespinasse	c1e8d7c6a7	mmap locking API: convert mmap_sem comments Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-09 09:39:14 -07:00
Michel Lespinasse	89154dd531	mmap locking API: convert mmap_sem call sites missed by coccinelle Convert the last few remaining mmap_sem rwsem calls to use the new mmap locking API. These were missed by coccinelle for some reason (I think coccinelle does not support some of the preprocessor constructs in these files ?) [akpm@linux-foundation.org: convert linux-next leftovers] [akpm@linux-foundation.org: more linux-next leftovers] [akpm@linux-foundation.org: more linux-next leftovers] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-6-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-09 09:39:14 -07:00
Linus Torvalds	5a36f0f3f5	VFIO updates for v5.8-rc1 - Block accesses to disabled MMIO space (Alex Williamson) - VFIO device migration API (Kirti Wankhede) - type1 IOMMU dirty bitmap API and implementation (Kirti Wankhede) - PCI NULL capability masking (Alex Williamson) - Memory leak fixes (Qian Cai) - Reference leak fix (Qiushi Wu) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJe19RZAAoJECObm247sIsiBTEP/1DsA/mL/eDR94Hkmz3y1qoX tkrIJuM5xo95uk6DsbXDikbjM2jpUmvDUiRilara25IceuVX/tKgTjgxNOWJd+h8 KHBoPM0bveILp+Hp6r6ZAEfhZLQ3P6A5XBTB6ql0ebveadyt13TP9Cr0jCf4Jxr2 2A7wZAexWN8P4YBXBqrW8X3ZF3V9rfWYOJ4eILRvEplJ/TkqsGJyKW8LKP45waUF wutN6yh7irnWQbbUO6hb/eTIO7usrNB2L4A5CJMSZIm8JZYYmFQOrn73Ry3bA3Pg U+v3xIRU5U1ed/1DeLs8lRDDxImmkSXbVga0N+e3qnKtDIBh7/i9yNT9hQjhG9Ba C3IQoOM0Fj03hK/UahNerN7FKYoj+JvtdSKeTSNFXwdp/JUSXjj1tLouxUBSFsgD P15CMPXF/U6rT3nQ7/3z3JFKft/iSf6ShjXeNJFqhwH+lKsj13kF2+py3+vgayl8 zLOqeLdRkQwXz73UStX+hrc7MmNYZVRxJ0uf3XXq3/CZ1gwiinIQqxDF3ry+pQ6w vXB8j5XJqsOguoAU7trTdvkKpfoJd4yY0D6aNN9F3JN8xkEBbfm6xZ7Hzur2JI6r iCxUHcS4XrXA/PRlTFupg8451CefuPJa2UkAWSdk182hO2R41l458Q6b7lIMqhOQ oP04S8VYvvDIw5JVnPWU =KSAE -----END PGP SIGNATURE----- Merge tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio Pull VFIO updates from Alex Williamson: - Block accesses to disabled MMIO space (Alex Williamson) - VFIO device migration API (Kirti Wankhede) - type1 IOMMU dirty bitmap API and implementation (Kirti Wankhede) - PCI NULL capability masking (Alex Williamson) - Memory leak fixes (Qian Cai) - Reference leak fix (Qiushi Wu) * tag 'vfio-v5.8-rc1' of git://github.com/awilliam/linux-vfio: vfio iommu: typecast corrections vfio iommu: Use shift operation for 64-bit integer division vfio/mdev: Fix reference count leak in add_mdev_supported_type vfio: Selective dirty page tracking if IOMMU backed device pins pages vfio iommu: Add migration capability to report supported features vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap vfio iommu: Implementation of ioctl for dirty pages tracking vfio iommu: Add ioctl definition for dirty pages tracking vfio iommu: Cache pgsize_bitmap in struct vfio_iommu vfio iommu: Remove atomicity of ref_count of pinned pages vfio: UAPI for migration interface for device state vfio/pci: fix memory leaks of eventfd ctx vfio/pci: fix memory leaks in alloc_perm_bits() vfio-pci: Mask cap zero vfio-pci: Invalidate mmaps and block MMIO access on disabled memory vfio-pci: Fault mmaps to enable vma tracking vfio/type1: Support faulting PFNMAP vmas	2020-06-05 13:51:49 -07:00
Alex Williamson	cd34b82e6e	Merge branches 'v5.8/vfio/alex-block-mmio-v3', 'v5.8/vfio/alex-zero-cap-v2' and 'v5.8/vfio/qian-leak-fixes' into v5.8/vfio/next	2020-05-26 10:27:15 -06:00
Qian Cai	1518ac272e	vfio/pci: fix memory leaks of eventfd ctx Finished a qemu-kvm (-device vfio-pci,host=0001:01:00.0) triggers a few memory leaks after a while because vfio_pci_set_ctx_trigger_single() calls eventfd_ctx_fdget() without the matching eventfd_ctx_put() later. Fix it by calling eventfd_ctx_put() for those memory in vfio_pci_release() before vfio_device_release(). unreferenced object 0xebff008981cc2b00 (size 128): comm "qemu-kvm", pid 4043, jiffies 4294994816 (age 9796.310s) hex dump (first 32 bytes): 01 00 00 00 6b 6b 6b 6b 00 00 00 00 ad 4e ad de ....kkkk.....N.. ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........ backtrace: [<00000000917e8f8d>] slab_post_alloc_hook+0x74/0x9c [<00000000df0f2aa2>] kmem_cache_alloc_trace+0x2b4/0x3d4 [<000000005fcec025>] do_eventfd+0x54/0x1ac [<0000000082791a69>] __arm64_sys_eventfd2+0x34/0x44 [<00000000b819758c>] do_el0_svc+0x128/0x1dc [<00000000b244e810>] el0_sync_handler+0xd0/0x268 [<00000000d495ef94>] el0_sync+0x164/0x180 unreferenced object 0x29ff008981cc4180 (size 128): comm "qemu-kvm", pid 4043, jiffies 4294994818 (age 9796.290s) hex dump (first 32 bytes): 01 00 00 00 6b 6b 6b 6b 00 00 00 00 ad 4e ad de ....kkkk.....N.. ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........ backtrace: [<00000000917e8f8d>] slab_post_alloc_hook+0x74/0x9c [<00000000df0f2aa2>] kmem_cache_alloc_trace+0x2b4/0x3d4 [<000000005fcec025>] do_eventfd+0x54/0x1ac [<0000000082791a69>] __arm64_sys_eventfd2+0x34/0x44 [<00000000b819758c>] do_el0_svc+0x128/0x1dc [<00000000b244e810>] el0_sync_handler+0xd0/0x268 [<00000000d495ef94>] el0_sync+0x164/0x180 Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-05-26 10:23:22 -06:00
Qian Cai	3e63b94b62	vfio/pci: fix memory leaks in alloc_perm_bits() vfio_pci_disable() calls vfio_config_free() but forgets to call free_perm_bits() resulting in memory leaks, unreferenced object 0xc000000c4db2dee0 (size 16): comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s) hex dump (first 16 bytes): 00 00 ff 00 ff ff ff ff ff ff ff ff ff ff 00 00 ................ backtrace: [<00000000a6a4552d>] alloc_perm_bits+0x58/0xe0 [vfio_pci] [<00000000ac990549>] vfio_config_init+0xdf0/0x11b0 [vfio_pci] init_pci_cap_msi_perm at drivers/vfio/pci/vfio_pci_config.c:1125 (inlined by) vfio_msi_cap_len at drivers/vfio/pci/vfio_pci_config.c:1180 (inlined by) vfio_cap_len at drivers/vfio/pci/vfio_pci_config.c:1241 (inlined by) vfio_cap_init at drivers/vfio/pci/vfio_pci_config.c:1468 (inlined by) vfio_config_init at drivers/vfio/pci/vfio_pci_config.c:1707 [<000000006db873a1>] vfio_pci_open+0x234/0x700 [vfio_pci] [<00000000630e1906>] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio] [<000000009e34c54f>] ksys_ioctl+0xd8/0x130 [<000000006577923d>] sys_ioctl+0x28/0x40 [<000000006d7b1cf2>] system_call_exception+0x114/0x1e0 [<0000000008ea7dd5>] system_call_common+0xf0/0x278 unreferenced object 0xc000000c4db2e330 (size 16): comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s) hex dump (first 16 bytes): 00 ff ff 00 ff ff ff ff ff ff ff ff ff ff 00 00 ................ backtrace: [<000000004c71914f>] alloc_perm_bits+0x44/0xe0 [vfio_pci] [<00000000ac990549>] vfio_config_init+0xdf0/0x11b0 [vfio_pci] [<000000006db873a1>] vfio_pci_open+0x234/0x700 [vfio_pci] [<00000000630e1906>] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio] [<000000009e34c54f>] ksys_ioctl+0xd8/0x130 [<000000006577923d>] sys_ioctl+0x28/0x40 [<000000006d7b1cf2>] system_call_exception+0x114/0x1e0 [<0000000008ea7dd5>] system_call_common+0xf0/0x278 Fixes: `89e1f7d4c6` ("vfio: Add PCI device driver") Signed-off-by: Qian Cai <cai@lca.pw> [aw: rolled in follow-up patch] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-05-18 09:58:28 -06:00
Alex Williamson	bc138db1b9	vfio-pci: Mask cap zero The PCI Code and ID Assignment Specification changed capability ID 0 from reserved to a NULL capability in the v1.1 revision. The NULL capability is defined to include only the 16-bit capability header, ie. only the ID and next pointer. Unfortunately vfio-pci creates a map of config space, where ID 0 is used to reserve the standard type 0 header. Finding an actual capability with this ID therefore results in a bogus range marked in that map and conflicts with subsequent capabilities. As this seems to be a dummy capability anyway and we already support dropping capabilities, let's hide this one rather than delving into the potentially subtle dependencies within our map. Seen on an NVIDIA Tesla T4. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-05-18 09:55:35 -06:00
Alex Williamson	abafbc551f	vfio-pci: Invalidate mmaps and block MMIO access on disabled memory Accessing the disabled memory space of a PCI device would typically result in a master abort response on conventional PCI, or an unsupported request on PCI express. The user would generally see these as a -1 response for the read return data and the write would be silently discarded, possibly with an uncorrected, non-fatal AER error triggered on the host. Some systems however take it upon themselves to bring down the entire system when they see something that might indicate a loss of data, such as this discarded write to a disabled memory space. To avoid this, we want to try to block the user from accessing memory spaces while they're disabled. We start with a semaphore around the memory enable bit, where writers modify the memory enable state and must be serialized, while readers make use of the memory region and can access in parallel. Writers include both direct manipulation via the command register, as well as any reset path where the internal mechanics of the reset may both explicitly and implicitly disable memory access, and manipulation of the MSI-X configuration, where the MSI-X vector table resides in MMIO space of the device. Readers include the read and write file ops to access the vfio device fd offsets as well as memory mapped access. In the latter case, we make use of our new vma list support to zap, or invalidate, those memory mappings in order to force them to be faulted back in on access. Our semaphore usage will stall user access to MMIO spaces across internal operations like reset, but the user might experience new behavior when trying to access the MMIO space while disabled via the PCI command register. Access via read or write while disabled will return -EIO and access via memory maps will result in a SIGBUS. This is expected to be compatible with known use cases and potentially provides better error handling capabilities than present in the hardware, while avoiding the more readily accessible and severe platform error responses that might otherwise occur. Fixes: CVE-2020-12888 Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-05-18 09:53:38 -06:00
Alex Williamson	11c4cd07ba	vfio-pci: Fault mmaps to enable vma tracking Rather than calling remap_pfn_range() when a region is mmap'd, setup a vm_ops handler to support dynamic faulting of the range on access. This allows us to manage a list of vmas actively mapping the area that we can later use to invalidate those mappings. The open callback invalidates the vma range so that all tracking is inserted in the fault handler and removed in the close handler. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-05-18 09:53:29 -06:00
Christophe Leroy	7bfc3c84cb	drivers/powerpc: Replace _ALIGN_UP() by ALIGN() _ALIGN_UP() is specific to powerpc ALIGN() is generic and does the same Replace _ALIGN_UP() by ALIGN() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/a5945463f86c984151962a475a3ee56a2893e85d.1587407777.git.christophe.leroy@c-s.fr	2020-05-11 23:15:15 +10:00
Sam Bobroff	00bc509554	vfio-pci/nvlink2: Allow fallback to ibm,mmio-atsd[0] Older versions of skiboot only provide a single value in the device tree property "ibm,mmio-atsd", even when multiple Address Translation Shoot Down (ATSD) registers are present. This prevents NVLink2 devices (other than the first) from being used with vfio-pci because vfio-pci expects to be able to assign a dedicated ATSD register to each NVLink2 device. However, ATSD registers can be shared among devices. This change allows vfio-pci to fall back to sharing the register at index 0 if necessary. Fixes: `7f92891778` ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver") Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-04-01 13:50:46 -06:00
Alex Williamson	b66574a3fb	vfio/pci: Cleanup .probe() exit paths The cleanup is getting a tad long. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-03-24 09:28:29 -06:00
Alex Williamson	959e1b75cc	vfio/pci: Remove dev_fmt definition It currently results in messages like: "vfio-pci 0000:03:00.0: vfio_pci: ..." Which is quite a bit redundant. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-03-24 09:28:29 -06:00
Alex Williamson	137e553135	vfio/pci: Add sriov_configure support With the VF Token interface we can now expect that a vfio userspace driver must be in collaboration with the PF driver, an unwitting userspace driver will not be able to get past the GET_DEVICE_FD step in accessing the device. We can now move on to actually allowing SR-IOV to be enabled by vfio-pci on the PF. Support for this is not enabled by default in this commit, but it does provide a module option for this to be enabled (enable_sriov=1). Enabling VFs is rather straightforward, except we don't want to risk that a VF might get autoprobed and bound to other drivers, so a bus notifier is used to "capture" VFs to vfio-pci using the driver_override support. We assume any later action to bind the device to other drivers is condoned by the system admin and allow it with a log warning. vfio-pci will disable SR-IOV on a PF before releasing the device, allowing a VF driver to be assured other drivers cannot take over the PF and that any other userspace driver must know the shared VF token. This support also does not provide a mechanism for the PF userspace driver itself to manipulate SR-IOV through the vfio API. With this patch SR-IOV can only be enabled via the host sysfs interface and the PF driver user cannot create or remove VFs. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-03-24 09:28:28 -06:00
Alex Williamson	43eeeecc8e	vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user The VFIO_DEVICE_FEATURE ioctl is meant to be a general purpose, device agnostic ioctl for setting, retrieving, and probing device features. This implementation provides a 16-bit field for specifying a feature index, where the data porition of the ioctl is determined by the semantics for the given feature. Additional flag bits indicate the direction and nature of the operation; SET indicates user data is provided into the device feature, GET indicates the device feature is written out into user data. The PROBE flag augments determining whether the given feature is supported, and if provided, whether the given operation on the feature is supported. The first user of this ioctl is for setting the vfio-pci VF token, where the user provides a shared secret key (UUID) on a SR-IOV PF device, which users must provide when opening associated VF devices. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-03-24 09:28:27 -06:00
Alex Williamson	cc20d79990	vfio/pci: Introduce VF token If we enable SR-IOV on a vfio-pci owned PF, the resulting VFs are not fully isolated from the PF. The PF can always cause a denial of service to the VF, even if by simply resetting itself. The degree to which a PF can access the data passed through a VF or interfere with its operation is dependent on a given SR-IOV implementation. Therefore we want to avoid a scenario where an existing vfio-pci based userspace driver might assume the PF driver is trusted, for example assigning a PF to one VM and VF to another with some expectation of isolation. IOMMU grouping could be a solution to this, but imposes an unnecessarily strong relationship between PF and VF drivers if they need to operate with the same IOMMU context. Instead we introduce a "VF token", which is essentially just a shared secret between PF and VF drivers, implemented as a UUID. The VF token can be set by a vfio-pci based PF driver and must be known by the vfio-pci based VF driver in order to gain access to the device. This allows the degree to which this VF token is considered secret to be determined by the applications and environment. For example a VM might generate a random UUID known only internally to the hypervisor while a userspace networking appliance might use a shared, or even well know, UUID among the application drivers. To incorporate this VF token, the VFIO_GROUP_GET_DEVICE_FD interface is extended to accept key=value pairs in addition to the device name. This allows us to most easily deny user access to the device without risk that existing userspace drivers assume region offsets, IRQs, and other device features, leading to more elaborate error paths. The format of these options are expected to take the form: "$DEVICE_NAME $OPTION1=$VALUE1 $OPTION2=$VALUE2" Where the device name is always provided first for compatibility and additional options are specified in a space separated list. The relation between and requirements for the additional options will be vfio bus driver dependent, however unknown or unused option within this schema should return error. This allow for future use of unknown options as well as a positive indication to the user that an option is used. An example VF token option would take this form: "0000:03:00.0 vf_token=2ab74924-c335-45f4-9b16-8569e5b08258" When accessing a VF where the PF is making use of vfio-pci, the user MUST provide the current vf_token. When accessing a PF, the user MUST provide the current vf_token IF there are active VF users or MAY provide a vf_token in order to set the current VF token when no VF users are active. The former requirement assures VF users that an unassociated driver cannot usurp the PF device. These semantics also imply that a VF token MUST be set by a PF driver before VF drivers can access their device, the default token is random and mechanisms to read the token are not provided in order to protect the VF token of previous users. Use of the vf_token option outside of these cases will return an error, as discussed above. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-03-24 09:28:27 -06:00
Alex Williamson	467c084f9a	vfio/pci: Implement match ops This currently serves the same purpose as the default implementation but will be expanded for additional functionality. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-03-24 09:28:26 -06:00
Linus Torvalds	a6d5f9dca4	VFIO updates for v5.6-rc1 - Fix nvlink error path (Alexey Kardashevskiy) - Update nvlink and spapr to use mmgrab() (Julia Lawall) - Update static declaration (Ben Dooks) - Annotate __iomem to fix sparse warnings (Ben Dooks) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJeOFZPAAoJECObm247sIsidxIQAIxDITCEjsquBuT9AfBHMcC6 HIqJ9/JYORGxc+nbtc0858rXsvAGJTJR7yu/WE9N9UqDFLHX0yoRHrWEQdK5OFtA pj3PpweMlzdXqMZl5kgQOaLoM4I+gYNrPIdcUyxWpfVJxCCQsgz0iFBrvNqoFgEM YK1UPZSCpHnD5EsG3Lz5BfSM7lyrSfH6R+eWempZST9eBHXC7kvU9pRYVmVDP/l7 fyBsWpoN+sg3rAf+8R7fQyBbPyggvpyhpaWTKVVdARZStISxiprWXEfQl3p2vOK/ aDbyMLhdMraQkkipcu+HAG0AJXX3gdxGBcX9BXLmB6Z8EohBj9hEu8RNUrNnFCm4 rOOT32IuwxSH3oZr+vmOUrG6AQhu10vPLHf5scOYdn3h/6R0gW13aTmsDPTzdOCN oZ2iU5CDJ6w9BMLELRL4WrKTC1lG9xYXS8RmS//GAq24Re7X8kUdHszyoSW8bUFu 5RSqgAYz8Wk/3qrQsMU03yYiEgMS1AZwKHJWRgTCJLdT/p3a8/NXgLWO9Y6EdMua LETW63Bg/YITFTqCr4jTOHYg8XUqrGHFQOWmT+boAkRAHDn56FGcrQ1Bl0BAC8cP 4C1iGwaxsa3DwQkVqsLMsHlBqpyLbz4a2SUQVmaIlwEs58LUv7EBwTymFir4w6aJ Esxh+sm0NYZE0nJ7aAqY =bamw -----END PGP SIGNATURE----- Merge tag 'vfio-v5.6-rc1' of git://github.com/awilliam/linux-vfio Pull VFIO updates from Alex Williamson: - Fix nvlink error path (Alexey Kardashevskiy) - Update nvlink and spapr to use mmgrab() (Julia Lawall) - Update static declaration (Ben Dooks) - Annotate __iomem to fix sparse warnings (Ben Dooks) * tag 'vfio-v5.6-rc1' of git://github.com/awilliam/linux-vfio: vfio: platform: fix __iomem in vfio_platform_amdxgbe.c vfio/mdev: make create attribute static vfio/spapr_tce: use mmgrab vfio: vfio_pci_nvlink2: use mmgrab vfio/spapr/nvlink2: Skip unpinning pages on error exit	2020-02-03 22:22:05 +00:00
Julia Lawall	bb3d3cf928	vfio: vfio_pci_nvlink2: use mmgrab Mmgrab was introduced in commit `f1f1007644` ("mm: add new mmgrab() helper") and most of the kernel was updated to use it. Update a remaining file. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) <smpl> @@ expression e; @@ - atomic_inc(&e->mm_count); + mmgrab(e); </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-01-07 13:02:43 -07:00
Alexey Kardashevskiy	338b4e10f9	vfio/spapr/nvlink2: Skip unpinning pages on error exit The nvlink2 subdriver for IBM Witherspoon machines preregisters GPU memory in the IOMMI API so KVM TCE code can map this memory for DMA as well. This is done by mm_iommu_newdev() called from vfio_pci_nvgpu_regops::mmap. In an unlikely event of failure the data->mem remains NULL and since mm_iommu_put() (which unregisters the region and unpins memory if that was regular memory) does not expect mem=NULL, it should not be called. This adds a check to only call mm_iommu_put() for a valid data->mem. Fixes: `7f92891778` ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-01-06 10:01:05 -07:00
Christoph Hellwig	4bdc0d676a	remove ioremap_nocache and devm_ioremap_nocache ioremap has provided non-cached semantics by default since the Linux 2.6 days, so remove the additional ioremap_nocache interface. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Arnd Bergmann <arnd@arndb.de>	2020-01-06 09:45:59 +01:00
Linus Torvalds	94e89b4023	VFIO updates for v5.5-rc1 - Remove hugepage checks for reserved pfns (Ben Luo) - Fix irq-bypass unregister ordering (Jiang Yi) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJd6oWBAAoJECObm247sIsi7Q8P/37Gs4jKx+ZUfd0K8MwyAJjV 1oiq0vsXVKQPRlBF0G9YdZQubX3X7l87PbeV0P4e6GKH5/A30FUA4OUgurbQaybK siajJVmdIuEeS/xNZxkuEyyjsOXXPBo+oBIhu3yDoLp0wOCYuAMOQea25H34dhG9 A141c/mZg/29jfz8whQMthZiACzIFXMcD/pO6VzspEwVX4aGaX4js1wgOp2firRf QqbQon7BfozwM5WAwwlyEyBYmewILQgciijsifGpKhTTfONl7j34GH8+/JM7WrD7 BNayX+DZdcr+NFrNyuvdU+/HnlCt/6lZADAtnyMRv53QTiRFFmdQkYU4bTpKLqHH XoN4OoPAdxYHOPRq/MOon7QsAvXodCTZ9GTIiXtC+hjIangm5f1jW7tHPjpRrbrT luXvRv3dC94lVOxJ8wfp36H/GPJbJkAASmOasTT54AP1PBmOihAYwYCVBMF9Qihp Fbk9DCg7mTb8vSbOQPxUi732sovOVhwQS4lkPq1BgoNibElzV6UbwYxQhn/0Ixpm ZcDE8lipk/sjns5HhFWYJYFJQAVtFVM3Ets4oJK/tB8ELzCch/URoc+zwNKnCVHQ jjCBcrHU/jdnNm85BrN12vFn1gjHqHZLvTGA7HCAABgqzeJ9Ln4IDuUELg9JsO/T xq/8RRxCREs5XL6aWr4n =veGz -----END PGP SIGNATURE----- Merge tag 'vfio-v5.5-rc1' of git://github.com/awilliam/linux-vfio Pull VFIO updates from Alex Williamson: - Remove hugepage checks for reserved pfns (Ben Luo) - Fix irq-bypass unregister ordering (Jiang Yi) * tag 'vfio-v5.5-rc1' of git://github.com/awilliam/linux-vfio: vfio/pci: call irq_bypass_unregister_producer() before freeing irq vfio/type1: remove hugepage checks in is_invalid_reserved_pfn()	2019-12-07 14:51:04 -08:00
Jiang Yi	d567fb8819	vfio/pci: call irq_bypass_unregister_producer() before freeing irq Since irq_bypass_register_producer() is called after request_irq(), we should do tear-down in reverse order: irq_bypass_unregister_producer() then free_irq(). Specifically free_irq() may release resources required by the irqbypass del_producer() callback. Notably an example provided by Marc Zyngier on arm64 with GICv4 that he indicates has the potential to wedge the hardware: free_irq(irq) __free_irq(irq) irq_domain_deactivate_irq(irq) its_irq_domain_deactivate() [unmap the VLPI from the ITS] kvm_arch_irq_bypass_del_producer(cons, prod) kvm_vgic_v4_unset_forwarding(kvm, irq, ...) its_unmap_vlpi(irq) [Unmap the VLPI from the ITS (again), remap the original LPI] Signed-off-by: Jiang Yi <giangyi@amazon.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: `6d7425f109` ("vfio: Register/unregister irq_bypass_producer") Link: https://lore.kernel.org/kvm/20191127164910.15888-1-giangyi@amazon.com Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> [aw: commit log] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-12-02 14:34:41 -07:00

1 2 3 4

200 Commits