Commit Graph

135 Commits

Author SHA1 Message Date
Antonios Motakis
7e992d6927 vfio: move eventfd support code for VFIO_PCI to a separate file
The virqfd functionality that is used by VFIO_PCI to implement interrupt
masking and unmasking via an eventfd, is generic enough and can be reused
by another driver. Move it to a separate file in order to allow the code
to be shared.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:54 -06:00
Antonios Motakis
09bbcb8810 vfio: pass an opaque pointer on virqfd initialization
VFIO_PCI passes the VFIO device structure *vdev via eventfd to the handler
that implements masking/unmasking of IRQs via an eventfd. We can replace
it in the virqfd infrastructure with an opaque type so we can make use
of the mechanism from other VFIO bus drivers.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:53 -06:00
Antonios Motakis
9269c393e7 vfio: add local lock for virqfd instead of depending on VFIO PCI
The Virqfd code needs to keep accesses to any struct *virqfd safe, but
this comes into play only when creating or destroying eventfds, so sharing
the same spinlock with the VFIO bus driver is not necessary.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:52 -06:00
Antonios Motakis
bb78e9eaab vfio: virqfd: rename vfio_pci_virqfd_init and vfio_pci_virqfd_exit
The functions vfio_pci_virqfd_init and vfio_pci_virqfd_exit are not really
PCI specific, since we plan to reuse the virqfd code with more VFIO drivers
in addition to VFIO_PCI.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
[Baptiste Reynal: Move rename vfio_pci_virqfd_init and vfio_pci_virqfd_exit
from "vfio: add a vfio_ prefix to virqfd_enable and virqfd_disable and export"]
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:52 -06:00
Antonios Motakis
bdc5e1021b vfio: add a vfio_ prefix to virqfd_enable and virqfd_disable and export
We want to reuse virqfd functionality in multiple VFIO drivers; before
moving these functions to core VFIO, add the vfio_ prefix to the
virqfd_enable and virqfd_disable functions, and export them so they can
be used from other modules.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:51 -06:00
Antonios Motakis
06211b40ce vfio/platform: support for level sensitive interrupts
Level sensitive interrupts are exposed as maskable and automasked
interrupts and are masked and disabled automatically when they fire.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
[Baptiste Reynal: Move masked interrupt initialization from "vfio/platform:
trigger an interrupt via eventfd"]
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:50 -06:00
Antonios Motakis
57f972e2b3 vfio/platform: trigger an interrupt via eventfd
This patch allows to set an eventfd for a platform device's interrupt,
and also to trigger the interrupt eventfd from userspace for testing.
Level sensitive interrupts are marked as maskable and are handled in
a later patch. Edge triggered interrupts are not advertised as maskable
and are implemented here using a simple and efficient IRQ handler.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
[Baptiste Reynal: fix masked interrupt initialization]
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:50 -06:00
Antonios Motakis
9a36321c8d vfio/platform: initial interrupts support code
This patch is a skeleton for the VFIO_DEVICE_SET_IRQS IOCTL, around which
most IRQ functionality is implemented in VFIO.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:49 -06:00
Antonios Motakis
682704c41e vfio/platform: return IRQ info
Return information for the interrupts exposed by the device.
This patch extends VFIO_DEVICE_GET_INFO with the number of IRQs
and enables VFIO_DEVICE_GET_IRQ_INFO.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:48 -06:00
Antonios Motakis
fad4d5b1f0 vfio/platform: support MMAP of MMIO regions
Allow to memory map the MMIO regions of the device so userspace can
directly access them. PIO regions are not being handled at this point.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:48 -06:00
Antonios Motakis
6e3f264560 vfio/platform: read and write support for the device fd
VFIO returns a file descriptor which we can use to manipulate the memory
regions of the device. Usually, the user will mmap memory regions that are
addressable on page boundaries, however for memory regions where this is
not the case we cannot provide mmap functionality due to security concerns.
For this reason we also allow to use read and write functions to the file
descriptor pointing to the memory regions.

We implement this functionality only for MMIO regions of platform devices;
PIO regions are not being handled at this point.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:47 -06:00
Antonios Motakis
e8909e67ca vfio/platform: return info for device memory mapped IO regions
This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call,
which allows the user to learn about the available MMIO resources of
a device.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:46 -06:00
Antonios Motakis
2e8567bbb5 vfio/platform: return info for bound device
A VFIO userspace driver will start by opening the VFIO device
that corresponds to an IOMMU group, and will use the ioctl interface
to get the basic device info, such as number of memory regions and
interrupts, and their properties. This patch enables the
VFIO_DEVICE_GET_INFO ioctl call.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
[Baptiste Reynal: added include in vfio_platform_common.c]
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:46 -06:00
Antonios Motakis
b13329adc2 vfio: amba: add the VFIO for AMBA devices module to Kconfig
Enable building the VFIO AMBA driver. VFIO_AMBA depends on VFIO_PLATFORM,
since it is sharing a portion of the code, and it is essentially implemented
as a platform device whose resources are discovered via AMBA specific APIs
in the kernel.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:45 -06:00
Antonios Motakis
36fe431f28 vfio: amba: VFIO support for AMBA devices
Add support for discovering AMBA devices with VFIO and handle them
similarly to Linux platform devices.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:44 -06:00
Antonios Motakis
5316153239 vfio: platform: add the VFIO PLATFORM module to Kconfig
Enable building the VFIO PLATFORM driver that allows to use Linux platform
devices with VFIO.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:44 -06:00
Antonios Motakis
9df85aaa43 vfio: platform: probe to devices on the platform bus
Driver to bind to Linux platform devices, and callbacks to discover their
resources to be used by the main VFIO PLATFORM code.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:43 -06:00
Antonios Motakis
de49fc0d99 vfio/platform: initial skeleton of VFIO support for platform devices
This patch forms the common skeleton code for platform devices support
with VFIO. This will include the core functionality of VFIO_PLATFORM,
however binding to the device and discovering the device resources will
be done with the help of a separate file where any Linux platform bus
specific code will reside.

This will allow us to implement support for also discovering AMBA devices
and their resources, but still reuse a large part of the VFIO_PLATFORM
implementation.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
[Baptiste Reynal: added includes in vfio_platform_private.h]
Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-16 14:08:42 -06:00
Alexey Kardashevskiy
ec76f40070 vfio-pci: Add missing break to enable VFIO_PCI_ERR_IRQ_INDEX
This adds a missing break statement to VFIO_DEVICE_SET_IRQS handler
without which vfio_pci_set_err_trigger() would never be called.

While we are here, add another "break" to VFIO_PCI_REQ_IRQ_INDEX case
so if we add more indexes later, we won't miss it.

Fixes: 6140a8f562 ("vfio-pci: Add device request interface")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-03-12 09:51:38 -06:00
Alex Williamson
6140a8f562 vfio-pci: Add device request interface
Userspace can opt to receive a device request notification,
indicating that the device should be released.  This is setup
the same way as the error IRQ and also supports eventfd signaling.
Future support may forcefully remove the device from the user if
the request is ignored.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 12:38:14 -07:00
Alex Williamson
cac80d6e38 vfio-pci: Generalize setup of simple eventfds
We want another single vector IRQ index to support signaling of
the device request to userspace.  Generalize the error reporting
IRQ index to avoid code duplication.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 12:37:57 -07:00
Alex Williamson
13060b64b8 vfio: Add and use device request op for vfio bus drivers
When a request is made to unbind a device from a vfio bus driver,
we need to wait for the device to become unused, ie. for userspace
to release the device.  However, we have a long standing TODO in
the code to do something proactive to make that happen.  To enable
this, we add a request callback on the vfio bus driver struct,
which is intended to signal the user through the vfio device
interface to release the device.  Instead of passively waiting for
the device to become unused, we can now pester the user to give
it up.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 12:37:47 -07:00
Alex Williamson
4a68810dbb vfio: Tie IOMMU group reference to vfio group
Move the iommu_group reference from the device to the vfio_group.
This ensures that the iommu_group persists as long as the vfio_group
remains.  This can be important if all of the device from an
iommu_group are removed, but we still have an outstanding vfio_group
reference; we can still walk the empty list of devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 15:05:06 -07:00
Alex Williamson
60720a0fc6 vfio: Add device tracking during unbind
There's a small window between the vfio bus driver calling
vfio_del_group_dev() and the device being completely unbound where
the vfio group appears to be non-viable.  This creates a race for
users like QEMU/KVM where the kvm-vfio module tries to get an
external reference to the group in order to match and release an
existing reference, while the device is potentially being removed
from the vfio bus driver.  If the group is momentarily non-viable,
kvm-vfio may not be able to release the group reference until VM
shutdown, making the group unusable until that point.

Bridge the gap between device removal from the group and completion
of the driver unbind by tracking it in a list.  The device is added
to the list before the bus driver reference is released and removed
using the existing unbind notifier.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 15:05:06 -07:00
Alex Williamson
c5e6688752 vfio/type1: Add conditional rescheduling
IOMMU operations can be expensive and it's not very difficult for a
user to give us a lot of work to do for a map or unmap operation.
Killing a large VM will vfio assigned devices can result in soft
lockups and IOMMU tracing shows that we can easily spend 80% of our
time with need-resched set.  A sprinkling of conf_resched() calls
after map and unmap calls has a very tiny affect on performance
while resulting in traces with <1% of calls overflowing into needs-
resched.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 14:19:12 -07:00
Alex Williamson
babbf17609 vfio/type1: Chunk contiguous reserved/invalid page mappings
We currently map invalid and reserved pages, such as often occur from
mapping MMIO regions of a VM through the IOMMU, using single pages.
There's really no reason we can't instead follow the methodology we
use for normal pages and find the largest possible physically
contiguous chunk for mapping.  The only difference is that we don't
do locked memory accounting for these since they're not back by RAM.

In most applications this will be a very minor improvement, but when
graphics and GPGPU devices are in play, MMIO BARs become non-trivial.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 10:59:16 -07:00
Alex Williamson
6fe1010d6d vfio/type1: DMA unmap chunking
When unmapping DMA entries we try to rely on the IOMMU API behavior
that allows the IOMMU to unmap a larger area than requested, up to
the size of the original mapping.  This works great when the IOMMU
supports superpages *and* they're in use.  Otherwise, each PAGE_SIZE
increment is unmapped separately, resulting in poor performance.

Instead we can use the IOVA-to-physical-address translation provided
by the IOMMU API and unmap using the largest contiguous physical
memory chunk available, which is also how vfio/type1 would have
mapped the region.  For a synthetic 1TB guest VM mapping and shutdown
test on Intel VT-d (2M IOMMU pagesize support), this achieves about
a 30% overall improvement mapping standard 4K pages, regardless of
IOMMU superpage enabling, and about a 40% improvement mapping 2M
hugetlbfs pages when IOMMU superpages are not available.  Hugetlbfs
with IOMMU superpages enabled is effectively unchanged.

Unfortunately the same algorithm does not work well on IOMMUs with
fine-grained superpages, like AMD-Vi, costing about 25% extra since
the IOMMU will automatically unmap any power-of-two contiguous
mapping we've provided it.  We add a routine and a domain flag to
detect this feature, leaving AMD-Vi unaffected by this unmap
optimization.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-06 10:58:56 -07:00
Wei Yang
7c2e211f3c vfio-pci: Fix the check on pci device type in vfio_pci_probe()
Current vfio-pci just supports normal pci device, so vfio_pci_probe() will
return if the pci device is not a normal device. While current code makes a
mistake. PCI_HEADER_TYPE is the offset in configuration space of the device
type, but we use this value to mask the type value.

This patch fixs this by do the check directly on the pci_dev->hdr_type.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org # v3.6+
2015-01-07 10:29:11 -07:00
Linus Torvalds
cc669743a3 VFIO updates for v3.19-rc1
- s390 support (Frank Blaschka)
  - Enable iommu-type1 for ARM SMMU (Will Deacon)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUka+sAAoJECObm247sIsiiy0P/iqrQpv94Z7rUKRlV3K8YtJj
 8Oi5fLLnT2by9v5mS+KMElnQ5gLU/C5B/QGLMNrF2uQl8lguWSXJw37r7MkbIkpN
 RDLx1NhetLqbJ4CYLjyv/Jx3vl+Wr/2nNWRVIS5ajBmMjEgKVLvjYs4SaXELc3a8
 a3YzcGW10BrVFlCJgUYqYIFGS1BmKjf7fbD5YBocj8tPv6NAlCiNNYYr+0pzW8Lf
 GTi39JlZ2t06hDq33eiUkrySWNjrIBn4g4PfAl7HBAscsZKS1w18MD1qVw4UXXa2
 15+CBbsHU7ZLVo6G7vuZeJNCX9tdQ0WIZWQzHstQa914l86WYImTJ2tyH7Rn0ZcQ
 3Mu9fzef9JgjkI56ol2zDwuOs+qttOYaLWjhhHiW4jkIxdnljnesIFlvmM3XeDGz
 3Zowg09HzE3K+dt8265jVKkcNJbPLzspLvF27nPMudZNHBozcoPE0jKxU7QC3eMT
 Ij36+puQq+jccUic3Np6rxk5tzTHEat1a7w3IUwXCCUP5P5QW+kuuIjbd4hqQkHn
 VDRjnT6MWC3GguUCXR5VyO0zezpI20pTbWwE8u2qwnE349m0Eq/vxytj2lCLYLPR
 Jjtdduf1/Ppam7tATd3PwTu6KljY3dJiUUikyOc1J0KmkgkSMw+BtR6G7qytyW4q
 /fhClcsaNtxSheVh0N+b
 =1L2V
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v3.19-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:
 - s390 support (Frank Blaschka)
 - Enable iommu-type1 for ARM SMMU (Will Deacon)

* tag 'vfio-v3.19-rc1' of git://github.com/awilliam/linux-vfio:
  drivers/vfio: allow type-1 IOMMU instantiation on top of an ARM SMMU
  vfio: make vfio run on s390
2014-12-17 10:44:22 -08:00
Jiang Liu
83a18912b0 PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg()
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI
specific.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-23 13:01:45 +01:00
Will Deacon
5e9f36c59a drivers/vfio: allow type-1 IOMMU instantiation on top of an ARM SMMU
The ARM SMMU driver is compatible with the notion of a type-1 IOMMU in
VFIO.

This patch allows VFIO_IOMMU_TYPE1 to be selected if ARM_SMMU=y.

Signed-off-by: Will Deacon <will.deacon@arm.com>
[aw: update for existing S390 patch]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-11-14 09:10:59 -07:00
Frank Blaschka
1d53a3a7d3 vfio: make vfio run on s390
add Kconfig switch to hide INTx
add Kconfig switch to let vfio announce PCI BARs are not mapable

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-11-07 09:52:22 -07:00
Linus Torvalds
23971bdfff IOMMU Updates for Linux v3.18
This pull-request includes:
 
 	* Change in the IOMMU-API to convert the former iommu_domain_capable
 	  function to just iommu_capable
 
 	* Various fixes in handling RMRR ranges for the VT-d driver (one fix
 	  requires a device driver core change which was acked
 	  by Greg KH)
 
 	* The AMD IOMMU driver now assigns and deassigns complete alias groups
 	  to fix issues with devices using the wrong PCI request-id
 
 	* MMU-401 support for the ARM SMMU driver
 
 	* Multi-master IOMMU group support for the ARM SMMU driver
 
 	* Various other small fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJUPNxYAAoJECvwRC2XARrjMwMP/RLSr+oA31rGVjLXcmcCHl7Q
 Uj7xpcnG19qB0aqNR1JeJuZNkK/tw44pE353MQPbz4N9UVUiogklGIVD1iJvFV53
 0qm84bvpDJIof4aP35B3H3Umft2USTn/lmsQg/RklQcNTW8DzNj63b8BTNR7k/GL
 G7bLg7F1BUCl0shZCCsFspOIulQPAJYN2OvHlfYBav/bfDvfouQ3lrV+loGrK44r
 F2Hmp+imXlIhUCjfbiWz6wKFxvPrxZx482vm2pXBCSnXEdW4/fz6nf9VHUK/Cfsq
 JAimY1CfiDo1aqH9/yVHUOw5SD/NYOXq6E5bFPg/WENbipbbae5cK2u6PX5MMBAn
 CG4BM8l9xicfGPqgn5YFSRY/6qC6K7NlxMnt9U8l18QIkDVDqEtUgJQISJuce7wx
 FWx6eSWaxpIe5yhq19/h2ELalUUyR/fPq+UXXjYDL1kLV/vcvC/lC3mbNAQU93zU
 WK0bG2tDg88JHavc25Ewa2aOn4BVM2BpwuLbYlgQReaEmsQRnEPgtmRNyLJHqbFE
 wwpCj8pBWdufsJWRyvpnXQ+CfA7oSz4e7hz1G+0/5uiDmagfvg16Ql5JtPmmuLUm
 Kc3dVIiG0s1ewohZIIJETGCqprQbCSqs8CCQqB6p2zDBWFKpNT7F38lm/KlehkCz
 JpAiI7Y2K9Jejp0VIPrt
 =OMOt
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "This pull-request includes:

   - change in the IOMMU-API to convert the former iommu_domain_capable
     function to just iommu_capable

   - various fixes in handling RMRR ranges for the VT-d driver (one fix
     requires a device driver core change which was acked by Greg KH)

   - the AMD IOMMU driver now assigns and deassigns complete alias
     groups to fix issues with devices using the wrong PCI request-id

   - MMU-401 support for the ARM SMMU driver

   - multi-master IOMMU group support for the ARM SMMU driver

   - various other small fixes all over the place"

* tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
  iommu/vt-d: Work around broken RMRR firmware entries
  iommu/vt-d: Store bus information in RMRR PCI device path
  iommu/vt-d: Only remove domain when device is removed
  driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
  iommu/amd: Fix devid mapping for ivrs_ioapic override
  iommu/irq_remapping: Fix the regression of hpet irq remapping
  iommu: Fix bus notifier breakage
  iommu/amd: Split init_iommu_group() from iommu_init_device()
  iommu: Rework iommu_group_get_for_pci_dev()
  iommu: Make of_device_id array const
  amd_iommu: do not dereference a NULL pointer address.
  iommu/omap: Remove omap_iommu unused owner field
  iommu: Remove iommu_domain_has_cap() API function
  IB/usnic: Convert to use new iommu_capable() API function
  vfio: Convert to use new iommu_capable() API function
  kvm: iommu: Convert to use new iommu_capable() API function
  iommu/tegra: Convert to iommu_capable() API function
  iommu/msm: Convert to iommu_capable() API function
  iommu/vt-d: Convert to iommu_capable() API function
  iommu/fsl: Convert to iommu_capable() API function
  ...
2014-10-15 07:23:49 +02:00
Linus Torvalds
27a9716bc8 VFIO updates for v3.18-rc1
- Nested IOMMU extension to type1 (Will Deacon)
  - Restore MSIx message before enabling (Gavin Shan)
  - Fix remove path locking (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUOOETAAoJECObm247sIsihDQP/jADEe9KFu4ymWu7rqi24w1L
 81hGNLXlfx2PPomluN3jENpyueo7vWdP5yZ8q/bi6oF6UbShL8Po01UKHOJzJJwW
 8GW86YcNsmPz/jl8Jcdbkex3dKvT1OzrDjFjCiKTJBHxE9nEdtWlRV8mO1pwd00t
 YFiXF8xFbkpHExMiQNU36rq/fzZCTOu4ZpCK9kDT7Sy+lsKAnGoXuM1IZK+7DGJo
 jcsMF32DVDmji6riy3uHHPc0qprP24QNVy6FfOmLEUvuOEIUOxMAYM9je9mmsHeS
 CeR/NHexr4RgYQE33jL1w8A1saT0rbu7DSKSa7OQebnY2Zte+oncLtqFZR2/Wylh
 jBU5r7P3PdxM6ykqEeC/3ytx7iFX6c7jc0SU4I5m8bFexmUQXqOko28gGIt0OL3n
 R8CmNF/MDs3gqYprhW6MvSJI1diY1+pX7pX0e7k7lDAoZ1QOjPNSGv+YOfF3H1YB
 AggIVxIKXW0T0bQ/hKcQiDKkxQ88vi1hld2LknbiBW9nMNLjNkxl2RZSGunFvWWN
 LzOYkBgR6rrTbhTvsWApsfYguYtGkgAGGJZSR1oev0BJnx4UHOfL1bykJRyUHdUd
 KDSBEni5TY65087IKD93nkyRhassszOa9XHmRDwQLxQeJCKRZi6bQRSzFZVheXIO
 O3XINOo2wNF1bIrfD/vR
 =s2+/
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v3.18-rc1' of git://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:
 - Nested IOMMU extension to type1 (Will Deacon)
 - Restore MSIx message before enabling (Gavin Shan)
 - Fix remove path locking (Alex Williamson)

* tag 'vfio-v3.18-rc1' of git://github.com/awilliam/linux-vfio:
  vfio-pci: Fix remove path locking
  drivers/vfio: Export vfio_spapr_iommu_eeh_ioctl() with GPL
  vfio/pci: Restore MSIx message prior to enabling
  PCI: Export MSI message relevant functions
  vfio/iommu_type1: add new VFIO_TYPE1_NESTING_IOMMU IOMMU type
  iommu: introduce domain attribute for nesting IOMMUs
2014-10-11 06:49:24 -04:00
Alex Williamson
93899a679f vfio-pci: Fix remove path locking
Locking both the remove() and release() path results in a deadlock
that should have been obvious.  To fix this we can get and hold the
vfio_device reference as we evaluate whether to do a bus/slot reset.
This will automatically block any remove() calls, allowing us to
remove the explict lock.  Fixes 61d792562b.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org	[3.17]
2014-09-29 17:18:39 -06:00
Gavin Shan
0f905ce2b5 drivers/vfio: Export vfio_spapr_iommu_eeh_ioctl() with GPL
The function should have been exported with EXPORT_SYMBOL_GPL()
as part of commit 92d18a6851 ("drivers/vfio: Fix EEH build error").

Suggested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-29 10:31:51 -06:00
Gavin Shan
b8f02af096 vfio/pci: Restore MSIx message prior to enabling
The MSIx vector table lives in device memory, which may be cleared as
part of a backdoor device reset. This is the case on the IBM IPR HBA
when the BIST is run on the device. When assigned to a QEMU guest,
the guest driver does a pci_save_state(), issues a BIST, then does a
pci_restore_state(). The BIST clears the MSIx vector table, but due
to the way interrupts are configured the pci_restore_state() does not
restore the vector table as expected. Eventually this results in an
EEH error on Power platforms when the device attempts to signal an
interrupt with the zero'd table entry.

Fix the problem by restoring the host cached MSI message prior to
enabling each vector.

Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-29 10:16:24 -06:00
Will Deacon
f5c9ecebaf vfio/iommu_type1: add new VFIO_TYPE1_NESTING_IOMMU IOMMU type
VFIO allows devices to be safely handed off to userspace by putting
them behind an IOMMU configured to ensure DMA and interrupt isolation.
This enables userspace KVM clients, such as kvmtool and qemu, to further
map the device into a virtual machine.

With IOMMUs such as the ARM SMMU, it is then possible to provide SMMU
translation services to the guest operating system, which are nested
with the existing translation installed by VFIO. However, enabling this
feature means that the IOMMU driver must be informed that the VFIO domain
is being created for the purposes of nested translation.

This patch adds a new IOMMU type (VFIO_TYPE1_NESTING_IOMMU) to the VFIO
type-1 driver. The new IOMMU type acts identically to the
VFIO_TYPE1v2_IOMMU type, but additionally sets the DOMAIN_ATTR_NESTING
attribute on its IOMMU domains.

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-29 10:06:19 -06:00
Chen, Gong
846fc70986 PCI/AER: Rename PCI_ERR_UNC_TRAIN to PCI_ERR_UNC_UND
In PCIe r1.0, sec 5.10.2, bit 0 of the Uncorrectable Error Status, Mask,
and Severity Registers was for "Training Error." In PCIe r1.1, sec 7.10.2,
bit 0 was redefined to be "Undefined."

Rename PCI_ERR_UNC_TRAIN to PCI_ERR_UNC_UND to reflect this change.

No functional change.

[bhelgaas: changelog]
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-25 09:42:40 -06:00
Joerg Roedel
eb165f0584 vfio: Convert to use new iommu_capable() API function
Cc: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-09-25 15:47:45 +02:00
Alexey Kardashevskiy
9b936c960f drivers/vfio: Enable VFIO if EEH is not supported
The existing vfio_pci_open() fails upon error returned from
vfio_spapr_pci_eeh_open(), which breaks POWER7's P5IOC2 PHB
support which this patch brings back.

The patch fixes the issue by dropping the return value of
vfio_spapr_pci_eeh_open().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-08 10:39:16 -06:00
Alexey Kardashevskiy
89a2edd62f drivers/vfio: Allow EEH to be built as module
This adds necessary declarations to the SPAPR VFIO EEH module,
otherwise multiple dynamic linker errors reported:

vfio_spapr_eeh: Unknown symbol eeh_pe_set_option (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_configure (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_reset (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_get_state (err 0)
vfio_spapr_eeh: Unknown symbol eeh_iommu_group_to_pe (err 0)
vfio_spapr_eeh: Unknown symbol eeh_dev_open (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_set_option (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_configure (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_reset (err 0)
vfio_spapr_eeh: Unknown symbol eeh_pe_get_state (err 0)
vfio_spapr_eeh: Unknown symbol eeh_iommu_group_to_pe (err 0)
vfio_spapr_eeh: Unknown symbol eeh_dev_open (err 0)

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-08 10:37:51 -06:00
Gavin Shan
92d18a6851 drivers/vfio: Fix EEH build error
The VFIO related components could be built as dynamic modules.
Unfortunately, CONFIG_EEH can't be configured to "m". The patch
fixes the build errors when configuring VFIO related components
as dynamic modules as follows:

  CC [M]  drivers/vfio/vfio_iommu_spapr_tce.o
In file included from drivers/vfio/vfio.c:33:0:
include/linux/vfio.h:101:43: warning: ‘struct pci_dev’ declared \
inside parameter list [enabled by default]
   :
  WRAP    arch/powerpc/boot/zImage.pseries
  WRAP    arch/powerpc/boot/zImage.maple
  WRAP    arch/powerpc/boot/zImage.pmac
  WRAP    arch/powerpc/boot/zImage.epapr
  MODPOST 1818 modules
ERROR: ".vfio_spapr_iommu_eeh_ioctl" [drivers/vfio/vfio_iommu_spapr_tce.ko]\
undefined!
ERROR: ".vfio_spapr_pci_eeh_open" [drivers/vfio/pci/vfio-pci.ko] undefined!
ERROR: ".vfio_spapr_pci_eeh_release" [drivers/vfio/pci/vfio-pci.ko] undefined!

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-08 10:36:20 -06:00
Alex Williamson
bc4fba7712 vfio-pci: Attempt bus/slot reset on release
Each time a device is released, mark whether a local reset was
successful or whether a bus/slot reset is needed.  If a reset is
needed and all of the affected devices are bound to vfio-pci and
unused, allow the reset.  This is most useful when the userspace
driver is killed and releases all the devices in an unclean state,
such as when a QEMU VM quits.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-07 11:12:07 -06:00
Alex Williamson
61d792562b vfio-pci: Use mutex around open, release, and remove
Serializing open/release allows us to fix a refcnt error if we fail
to enable the device and lets us prevent devices from being unbound
or opened, giving us an opportunity to do bus resets on release.  No
restriction added to serialize binding devices to vfio-pci while the
mutex is held though.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-07 11:12:04 -06:00
Alex Williamson
9c22e660ce vfio-pci: Release devices with BusMaster disabled
Our current open/release path looks like this:

vfio_pci_open
  vfio_pci_enable
    pci_enable_device
    pci_save_state
    pci_store_saved_state

vfio_pci_release
  vfio_pci_disable
    pci_disable_device
    pci_restore_state

pci_enable_device() doesn't modify PCI_COMMAND_MASTER, so if a device
comes to us with it enabled, it persists through the open and gets
stored as part of the device saved state.  We then restore that saved
state when released, which can allow the device to attempt to continue
to do DMA.  When the group is disconnected from the domain, this will
get caught by the IOMMU, but if there are other devices in the group,
the device may continue running and interfere with the user.  Even in
the former case, IOMMUs don't necessarily behave well and a stream of
blocked DMA can result in unpleasant behavior on the host.

Explicitly disable Bus Master as we're enabling the device and
slightly re-work release to make sure that pci_disable_device() is
the last thing that touches the device.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-08-07 11:12:02 -06:00
Gavin Shan
1b69be5e8a drivers/vfio: EEH support for VFIO PCI device
The patch adds new IOCTL commands for sPAPR VFIO container device
to support EEH functionality for PCI devices, which have been passed
through from host to somebody else via VFIO.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:28:48 +10:00
Linus Torvalds
a68a7509d3 A handful of VFIO bug fixes for v3.16
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTkiFBAAoJECObm247sIsiymsQALDs03j1lm1ZUx3peI8JuEGp
 +4WrflS4eKBX2r+XOSqHQNlg5ceKpj1zsEMKXkLFBCwTEZXRz9tK7bu6hsRuE8XR
 j67JkBm4alsJQrJIQl0+dYJMFPVGWBNyoBc4kV2YbernBnFPc/5oOmP9wAJD80OI
 HHNQxOdfce+YZnkgltc/GkOWdG/YvuQbkyt2YmGtTePapzboj6Tf0yvkqRDaL0Bf
 07+vYi6ZOSzx/Niw/ak09hll4OhYfgiOQ/t6LETmLi7PEGR88TYO/kqdzGmlnx6V
 XsDZ+WORWjtORGnyJI4p7iVFiP7mz4mmwUptfJcA0AX98csEdYHt6H4mlVkN7Gmd
 rSvarKQ7On8MRbymQIVw5w3V78X/HiWUsUi3Fef5xytQyuD4VxylS0HDtPM/AyeN
 3zEJKOI+CIkFLZ6fReL8LKZuaRdQTbJvjxMVvmTr2XrBxZTlk91CsqT3JFXjJ6CV
 2QEv2eVmsCkPtScWNZzaeQsgH75Hc8vznZxA+/BFDoeR2iq31BPE9iuYDpaVZS58
 mnAEll4tPMIqcqQvxwoB8B8kyzX636trwW+585jr3gKRS8V6qHaWkXz06N5kzz7s
 dhRUy3rKe1SeutVo0Vs067Ym60yUlvuTeWTppuOnfRpgIpbCLeMwuiA22Wq3hBIF
 loXH9eAZOuwzBg1hqVWJ
 =aR4I
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v3.16-rc1' of git://github.com/awilliam/linux-vfio into next

Pull VFIO updates from Alex Williamson:
 "A handful of VFIO bug fixes for v3.16"

* tag 'vfio-v3.16-rc1' of git://github.com/awilliam/linux-vfio:
  drivers/vfio/pci: Fix wrong MSI interrupt count
  drivers/vfio: Rework offsetofend()
  vfio/iommu_type1: Avoid overflow
  vfio/pci: Fix unchecked return value
  vfio/pci: Fix sizing of DPA and THP express capabilities
2014-06-07 20:12:15 -07:00
Gavin Shan
fd49c81f08 drivers/vfio/pci: Fix wrong MSI interrupt count
According PCI local bus specification, the register of Message
Control for MSI (offset: 2, length: 2) has bit#0 to enable or
disable MSI logic and it shouldn't be part contributing to the
calculation of MSI interrupt count. The patch fixes the issue.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-05-30 11:35:54 -06:00
Alex Williamson
c8dbca165b vfio/iommu_type1: Avoid overflow
Coverity reports use of a tained scalar used as a loop boundary.
For the most part, any values passed from userspace for a DMA mapping
size, IOVA, or virtual address are valid, with some alignment
constraints.  The size is ultimately bound by how many pages the user
is able to lock, IOVA is tested by the IOMMU driver when doing a map,
and the virtual address needs to pass get_user_pages.  The only
problem I can find is that we do expect the __u64 user values to fit
within our variables, which might not happen on 32bit platforms.  Add
a test for this and return error on overflow.  Also propagate use of
the type-correct local variables throughout the function.

The above also points to the 'end' variable, which can be zero if
we're operating at the very top of the address space.  We try to
account for this, but our loop botches it.  Rework the loop to use
the remaining size as our loop condition rather than the IOVA vs end.

Detected by Coverity: CID 714659

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-05-30 11:35:54 -06:00