linux/drivers/vfio/mdev/mdev_core.c

276 lines
6.6 KiB
C
Raw Permalink Normal View History

// SPDX-License-Identifier: GPL-2.0-only
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
/*
* Mediated device Core Driver
*
* Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
* Author: Neo Jia <cjia@nvidia.com>
* Kirti Wankhede <kwankhede@nvidia.com>
*/
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/sysfs.h>
#include <linux/mdev.h>
#include "mdev_private.h"
#define DRIVER_VERSION "0.1"
#define DRIVER_AUTHOR "NVIDIA Corporation"
#define DRIVER_DESC "Mediated device Core Driver"
static struct class_compat *mdev_bus_compat_class;
static LIST_HEAD(mdev_list);
static DEFINE_MUTEX(mdev_list_lock);
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
/* Caller must hold parent unreg_sem read or write lock */
static void mdev_device_remove_common(struct mdev_device *mdev)
{
struct mdev_parent *parent = mdev->type->parent;
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
mdev_remove_sysfs_files(mdev);
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
device_del(&mdev->dev);
lockdep_assert_held(&parent->unreg_sem);
/* Balances with device_initialize() */
put_device(&mdev->dev);
}
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
static int mdev_device_remove_cb(struct device *dev, void *data)
{
if (dev->bus == &mdev_bus_type)
mdev_device_remove_common(to_mdev_device(dev));
return 0;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
/*
* mdev_register_parent: Register a device as parent for mdevs
* @parent: parent structure registered
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
* @dev: device structure representing parent device.
* @mdev_driver: Device driver to bind to the newly created mdev
* @types: Array of supported mdev types
* @nr_types: Number of entries in @types
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
*
* Registers the @parent stucture as a parent for mdev types and thus mdev
* devices. The caller needs to hold a reference on @dev that must not be
* released until after the call to mdev_unregister_parent().
*
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
* Returns a negative value on error, otherwise 0.
*/
int mdev_register_parent(struct mdev_parent *parent, struct device *dev,
struct mdev_driver *mdev_driver, struct mdev_type **types,
unsigned int nr_types)
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
{
char *env_string = "MDEV_STATE=registered";
char *envp[] = { env_string, NULL };
int ret;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
memset(parent, 0, sizeof(*parent));
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
init_rwsem(&parent->unreg_sem);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
parent->dev = dev;
parent->mdev_driver = mdev_driver;
parent->types = types;
parent->nr_types = nr_types;
atomic_set(&parent->available_instances, mdev_driver->max_instances);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
ret = parent_create_sysfs_files(parent);
if (ret)
return ret;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
if (ret)
dev_warn(dev, "Failed to create compatibility class link\n");
dev_info(dev, "MDEV: Registered\n");
kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, envp);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
return 0;
}
EXPORT_SYMBOL(mdev_register_parent);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
/*
* mdev_unregister_parent : Unregister a parent device
* @parent: parent structure to unregister
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
*/
void mdev_unregister_parent(struct mdev_parent *parent)
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
{
char *env_string = "MDEV_STATE=unregistered";
char *envp[] = { env_string, NULL };
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
dev_info(parent->dev, "MDEV: Unregistering\n");
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
down_write(&parent->unreg_sem);
class_compat_remove_link(mdev_bus_compat_class, parent->dev, NULL);
device_for_each_child(parent->dev, NULL, mdev_device_remove_cb);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
parent_remove_sysfs_files(parent);
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
up_write(&parent->unreg_sem);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
kobject_uevent_env(&parent->dev->kobj, KOBJ_CHANGE, envp);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
EXPORT_SYMBOL(mdev_unregister_parent);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
static void mdev_device_release(struct device *dev)
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
{
struct mdev_device *mdev = to_mdev_device(dev);
struct mdev_parent *parent = mdev->type->parent;
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mutex_lock(&mdev_list_lock);
list_del(&mdev->next);
if (!parent->mdev_driver->get_available)
atomic_inc(&parent->available_instances);
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mutex_unlock(&mdev_list_lock);
/* Pairs with the get in mdev_device_create() */
kobject_put(&mdev->type->kobj);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
dev_dbg(&mdev->dev, "MDEV: destroying\n");
kfree(mdev);
}
int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
{
int ret;
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
struct mdev_device *mdev, *tmp;
struct mdev_parent *parent = type->parent;
struct mdev_driver *drv = parent->mdev_driver;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mutex_lock(&mdev_list_lock);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
/* Check for duplicate */
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
list_for_each_entry(tmp, &mdev_list, next) {
if (guid_equal(&tmp->uuid, uuid)) {
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mutex_unlock(&mdev_list_lock);
return -EEXIST;
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
}
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
if (!drv->get_available) {
/*
* Note: that non-atomic read and dec is fine here because
* all modifications are under mdev_list_lock.
*/
if (!atomic_read(&parent->available_instances)) {
mutex_unlock(&mdev_list_lock);
return -EUSERS;
}
atomic_dec(&parent->available_instances);
}
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
if (!mdev) {
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mutex_unlock(&mdev_list_lock);
return -ENOMEM;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
device_initialize(&mdev->dev);
mdev->dev.parent = parent->dev;
mdev->dev.bus = &mdev_bus_type;
mdev->dev.release = mdev_device_release;
mdev->dev.groups = mdev_device_groups;
mdev->type = type;
/* Pairs with the put in mdev_device_release() */
kobject_get(&type->kobj);
guid_copy(&mdev->uuid, uuid);
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
list_add(&mdev->next, &mdev_list);
mutex_unlock(&mdev_list_lock);
ret = dev_set_name(&mdev->dev, "%pUl", uuid);
if (ret)
goto out_put_device;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
/* Check if parent unregistration has started */
if (!down_read_trylock(&parent->unreg_sem)) {
ret = -ENODEV;
goto out_put_device;
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
}
vfio/mdev: Improve the create/remove sequence This patch addresses below two issues and prepares the code to address 3rd issue listed below. 1. mdev device is placed on the mdev bus before it is created in the vendor driver. Once a device is placed on the mdev bus without creating its supporting underlying vendor device, mdev driver's probe() gets triggered. However there isn't a stable mdev available to work on. create_store() mdev_create_device() device_register() ... vfio_mdev_probe() [...] parent->ops->create() vfio_ap_mdev_create() mdev_set_drvdata(mdev, matrix_mdev); /* Valid pointer set above */ Due to this way of initialization, mdev driver who wants to use the mdev, doesn't have a valid mdev to work on. 2. Current creation sequence is, parent->ops_create() groups_register() Remove sequence is, parent->ops->remove() groups_unregister() However, remove sequence should be exact mirror of creation sequence. Once this is achieved, all users of the mdev will be terminated first before removing underlying vendor device. (Follow standard linux driver model). At that point vendor's remove() ops shouldn't fail because taking the device off the bus should terminate any usage. 3. When remove operation fails, mdev sysfs removal attempts to add the file back on already removed device. Following call trace [1] is observed. [1] call trace: kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90 kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90 kernel: Call Trace: kernel: remove_store+0xdc/0x100 [mdev] kernel: kernfs_fop_write+0x113/0x1a0 kernel: vfs_write+0xad/0x1b0 kernel: ksys_write+0x5a/0xe0 kernel: do_syscall_64+0x5a/0x210 kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved in following ways. 1. Split the device registration/deregistration sequence so that some things can be done between initialization of the device and hooking it up to the bus respectively after deregistering it from the bus but before giving up our final reference. In particular, this means invoking the ->create() and ->remove() callbacks in those new windows. This gives the vendor driver an initialized mdev device to work with during creation. At the same time, a bus driver who wish to bind to mdev driver also gets initialized mdev device. This follows standard Linux kernel bus and device model. 2. During remove flow, first remove the device from the bus. This ensures that any bus specific devices are removed. Once device is taken off the mdev bus, invoke remove() of mdev from the vendor driver. 3. The driver core device model provides way to register and auto unregister the device sysfs attribute groups at dev->groups. Make use of dev->groups to let core create the groups and eliminate code to avoid explicit groups creation and removal. To ensure, that new sequence is solid, a below stack dump of a process is taken who attempts to remove the device while device is in use by vfio driver and user application. This stack dump validates that vfio driver guards against such device removal when device is in use. cat /proc/21962/stack [<0>] vfio_del_group_dev+0x216/0x3c0 [vfio] [<0>] mdev_remove+0x21/0x40 [mdev] [<0>] device_release_driver_internal+0xe8/0x1b0 [<0>] bus_remove_device+0xf9/0x170 [<0>] device_del+0x168/0x350 [<0>] mdev_device_remove_common+0x1d/0x50 [mdev] [<0>] mdev_device_remove+0x8c/0xd0 [mdev] [<0>] remove_store+0x71/0x90 [mdev] [<0>] kernfs_fop_write+0x113/0x1a0 [<0>] vfs_write+0xad/0x1b0 [<0>] ksys_write+0x5a/0xe0 [<0>] do_syscall_64+0x5a/0x210 [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<0>] 0xffffffffffffffff This prepares the code to eliminate calling device_create_file() in subsequent patch. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:32 +08:00
ret = device_add(&mdev->dev);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
if (ret)
goto out_unlock;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
ret = device_driver_attach(&drv->driver, &mdev->dev);
if (ret)
goto out_del;
ret = mdev_create_sysfs_files(mdev);
vfio/mdev: Improve the create/remove sequence This patch addresses below two issues and prepares the code to address 3rd issue listed below. 1. mdev device is placed on the mdev bus before it is created in the vendor driver. Once a device is placed on the mdev bus without creating its supporting underlying vendor device, mdev driver's probe() gets triggered. However there isn't a stable mdev available to work on. create_store() mdev_create_device() device_register() ... vfio_mdev_probe() [...] parent->ops->create() vfio_ap_mdev_create() mdev_set_drvdata(mdev, matrix_mdev); /* Valid pointer set above */ Due to this way of initialization, mdev driver who wants to use the mdev, doesn't have a valid mdev to work on. 2. Current creation sequence is, parent->ops_create() groups_register() Remove sequence is, parent->ops->remove() groups_unregister() However, remove sequence should be exact mirror of creation sequence. Once this is achieved, all users of the mdev will be terminated first before removing underlying vendor device. (Follow standard linux driver model). At that point vendor's remove() ops shouldn't fail because taking the device off the bus should terminate any usage. 3. When remove operation fails, mdev sysfs removal attempts to add the file back on already removed device. Following call trace [1] is observed. [1] call trace: kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90 kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90 kernel: Call Trace: kernel: remove_store+0xdc/0x100 [mdev] kernel: kernfs_fop_write+0x113/0x1a0 kernel: vfs_write+0xad/0x1b0 kernel: ksys_write+0x5a/0xe0 kernel: do_syscall_64+0x5a/0x210 kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved in following ways. 1. Split the device registration/deregistration sequence so that some things can be done between initialization of the device and hooking it up to the bus respectively after deregistering it from the bus but before giving up our final reference. In particular, this means invoking the ->create() and ->remove() callbacks in those new windows. This gives the vendor driver an initialized mdev device to work with during creation. At the same time, a bus driver who wish to bind to mdev driver also gets initialized mdev device. This follows standard Linux kernel bus and device model. 2. During remove flow, first remove the device from the bus. This ensures that any bus specific devices are removed. Once device is taken off the mdev bus, invoke remove() of mdev from the vendor driver. 3. The driver core device model provides way to register and auto unregister the device sysfs attribute groups at dev->groups. Make use of dev->groups to let core create the groups and eliminate code to avoid explicit groups creation and removal. To ensure, that new sequence is solid, a below stack dump of a process is taken who attempts to remove the device while device is in use by vfio driver and user application. This stack dump validates that vfio driver guards against such device removal when device is in use. cat /proc/21962/stack [<0>] vfio_del_group_dev+0x216/0x3c0 [vfio] [<0>] mdev_remove+0x21/0x40 [mdev] [<0>] device_release_driver_internal+0xe8/0x1b0 [<0>] bus_remove_device+0xf9/0x170 [<0>] device_del+0x168/0x350 [<0>] mdev_device_remove_common+0x1d/0x50 [mdev] [<0>] mdev_device_remove+0x8c/0xd0 [mdev] [<0>] remove_store+0x71/0x90 [mdev] [<0>] kernfs_fop_write+0x113/0x1a0 [<0>] vfs_write+0xad/0x1b0 [<0>] ksys_write+0x5a/0xe0 [<0>] do_syscall_64+0x5a/0x210 [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<0>] 0xffffffffffffffff This prepares the code to eliminate calling device_create_file() in subsequent patch. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:32 +08:00
if (ret)
goto out_del;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mdev->active = true;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
dev_dbg(&mdev->dev, "MDEV: created\n");
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
up_read(&parent->unreg_sem);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
return 0;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
out_del:
vfio/mdev: Improve the create/remove sequence This patch addresses below two issues and prepares the code to address 3rd issue listed below. 1. mdev device is placed on the mdev bus before it is created in the vendor driver. Once a device is placed on the mdev bus without creating its supporting underlying vendor device, mdev driver's probe() gets triggered. However there isn't a stable mdev available to work on. create_store() mdev_create_device() device_register() ... vfio_mdev_probe() [...] parent->ops->create() vfio_ap_mdev_create() mdev_set_drvdata(mdev, matrix_mdev); /* Valid pointer set above */ Due to this way of initialization, mdev driver who wants to use the mdev, doesn't have a valid mdev to work on. 2. Current creation sequence is, parent->ops_create() groups_register() Remove sequence is, parent->ops->remove() groups_unregister() However, remove sequence should be exact mirror of creation sequence. Once this is achieved, all users of the mdev will be terminated first before removing underlying vendor device. (Follow standard linux driver model). At that point vendor's remove() ops shouldn't fail because taking the device off the bus should terminate any usage. 3. When remove operation fails, mdev sysfs removal attempts to add the file back on already removed device. Following call trace [1] is observed. [1] call trace: kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90 kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90 kernel: Call Trace: kernel: remove_store+0xdc/0x100 [mdev] kernel: kernfs_fop_write+0x113/0x1a0 kernel: vfs_write+0xad/0x1b0 kernel: ksys_write+0x5a/0xe0 kernel: do_syscall_64+0x5a/0x210 kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved in following ways. 1. Split the device registration/deregistration sequence so that some things can be done between initialization of the device and hooking it up to the bus respectively after deregistering it from the bus but before giving up our final reference. In particular, this means invoking the ->create() and ->remove() callbacks in those new windows. This gives the vendor driver an initialized mdev device to work with during creation. At the same time, a bus driver who wish to bind to mdev driver also gets initialized mdev device. This follows standard Linux kernel bus and device model. 2. During remove flow, first remove the device from the bus. This ensures that any bus specific devices are removed. Once device is taken off the mdev bus, invoke remove() of mdev from the vendor driver. 3. The driver core device model provides way to register and auto unregister the device sysfs attribute groups at dev->groups. Make use of dev->groups to let core create the groups and eliminate code to avoid explicit groups creation and removal. To ensure, that new sequence is solid, a below stack dump of a process is taken who attempts to remove the device while device is in use by vfio driver and user application. This stack dump validates that vfio driver guards against such device removal when device is in use. cat /proc/21962/stack [<0>] vfio_del_group_dev+0x216/0x3c0 [vfio] [<0>] mdev_remove+0x21/0x40 [mdev] [<0>] device_release_driver_internal+0xe8/0x1b0 [<0>] bus_remove_device+0xf9/0x170 [<0>] device_del+0x168/0x350 [<0>] mdev_device_remove_common+0x1d/0x50 [mdev] [<0>] mdev_device_remove+0x8c/0xd0 [mdev] [<0>] remove_store+0x71/0x90 [mdev] [<0>] kernfs_fop_write+0x113/0x1a0 [<0>] vfs_write+0xad/0x1b0 [<0>] ksys_write+0x5a/0xe0 [<0>] do_syscall_64+0x5a/0x210 [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<0>] 0xffffffffffffffff This prepares the code to eliminate calling device_create_file() in subsequent patch. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:32 +08:00
device_del(&mdev->dev);
out_unlock:
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
up_read(&parent->unreg_sem);
out_put_device:
vfio/mdev: Improve the create/remove sequence This patch addresses below two issues and prepares the code to address 3rd issue listed below. 1. mdev device is placed on the mdev bus before it is created in the vendor driver. Once a device is placed on the mdev bus without creating its supporting underlying vendor device, mdev driver's probe() gets triggered. However there isn't a stable mdev available to work on. create_store() mdev_create_device() device_register() ... vfio_mdev_probe() [...] parent->ops->create() vfio_ap_mdev_create() mdev_set_drvdata(mdev, matrix_mdev); /* Valid pointer set above */ Due to this way of initialization, mdev driver who wants to use the mdev, doesn't have a valid mdev to work on. 2. Current creation sequence is, parent->ops_create() groups_register() Remove sequence is, parent->ops->remove() groups_unregister() However, remove sequence should be exact mirror of creation sequence. Once this is achieved, all users of the mdev will be terminated first before removing underlying vendor device. (Follow standard linux driver model). At that point vendor's remove() ops shouldn't fail because taking the device off the bus should terminate any usage. 3. When remove operation fails, mdev sysfs removal attempts to add the file back on already removed device. Following call trace [1] is observed. [1] call trace: kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90 kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90 kernel: Call Trace: kernel: remove_store+0xdc/0x100 [mdev] kernel: kernfs_fop_write+0x113/0x1a0 kernel: vfs_write+0xad/0x1b0 kernel: ksys_write+0x5a/0xe0 kernel: do_syscall_64+0x5a/0x210 kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved in following ways. 1. Split the device registration/deregistration sequence so that some things can be done between initialization of the device and hooking it up to the bus respectively after deregistering it from the bus but before giving up our final reference. In particular, this means invoking the ->create() and ->remove() callbacks in those new windows. This gives the vendor driver an initialized mdev device to work with during creation. At the same time, a bus driver who wish to bind to mdev driver also gets initialized mdev device. This follows standard Linux kernel bus and device model. 2. During remove flow, first remove the device from the bus. This ensures that any bus specific devices are removed. Once device is taken off the mdev bus, invoke remove() of mdev from the vendor driver. 3. The driver core device model provides way to register and auto unregister the device sysfs attribute groups at dev->groups. Make use of dev->groups to let core create the groups and eliminate code to avoid explicit groups creation and removal. To ensure, that new sequence is solid, a below stack dump of a process is taken who attempts to remove the device while device is in use by vfio driver and user application. This stack dump validates that vfio driver guards against such device removal when device is in use. cat /proc/21962/stack [<0>] vfio_del_group_dev+0x216/0x3c0 [vfio] [<0>] mdev_remove+0x21/0x40 [mdev] [<0>] device_release_driver_internal+0xe8/0x1b0 [<0>] bus_remove_device+0xf9/0x170 [<0>] device_del+0x168/0x350 [<0>] mdev_device_remove_common+0x1d/0x50 [mdev] [<0>] mdev_device_remove+0x8c/0xd0 [mdev] [<0>] remove_store+0x71/0x90 [mdev] [<0>] kernfs_fop_write+0x113/0x1a0 [<0>] vfs_write+0xad/0x1b0 [<0>] ksys_write+0x5a/0xe0 [<0>] do_syscall_64+0x5a/0x210 [<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<0>] 0xffffffffffffffff This prepares the code to eliminate calling device_create_file() in subsequent patch. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:32 +08:00
put_device(&mdev->dev);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
return ret;
}
int mdev_device_remove(struct mdev_device *mdev)
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
{
struct mdev_device *tmp;
struct mdev_parent *parent = mdev->type->parent;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
mutex_lock(&mdev_list_lock);
list_for_each_entry(tmp, &mdev_list, next) {
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
if (tmp == mdev)
break;
}
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
if (tmp != mdev) {
mutex_unlock(&mdev_list_lock);
return -ENODEV;
}
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
if (!mdev->active) {
mutex_unlock(&mdev_list_lock);
return -EAGAIN;
}
vfio/mdev: Check globally for duplicate devices When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. Two behavioral notes; first, mdev_parent.lock had the side-effect of serializing mdev create and remove ops per parent device. This was an implementation detail, not an intentional guarantee provided to the mdev vendor drivers. Vendor drivers can trivially provide this serialization internally if necessary. Second, review comments note the new -EAGAIN behavior when the device, and in particular the remove attribute, becomes visible in sysfs. If a remove is triggered prior to completion of mdev_device_create() the user will see a -EAGAIN error. While the errno is different, receiving an error during this period is not, the previous implementation returned -ENODEV for the same condition. Furthermore, the consistency to the user is improved in the case where mdev_device_remove_ops() returns error. Previously concurrent calls to mdev_device_remove() could see the device disappear with -ENODEV and return in the case of error. Now a user would see -EAGAIN while the device is in this transitory state. Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-05-16 03:53:55 +08:00
mdev->active = false;
mutex_unlock(&mdev_list_lock);
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
/* Check if parent unregistration has started */
if (!down_read_trylock(&parent->unreg_sem))
return -ENODEV;
vfio/mdev: Synchronize device create/remove with parent removal In following sequences, child devices created while removing mdev parent device can be left out, or it may lead to race of removing half initialized child mdev devices. issue-1: -------- cpu-0 cpu-1 ----- ----- mdev_unregister_device() device_for_each_child() mdev_device_remove_cb() mdev_device_remove() create_store() mdev_device_create() [...] device_add() parent_remove_sysfs_files() /* BUG: device added by cpu-0 * whose parent is getting removed * and it won't process this mdev. */ issue-2: -------- Below crash is observed when user initiated remove is in progress and mdev_unregister_driver() completes parent unregistration. cpu-0 cpu-1 ----- ----- remove_store() mdev_device_remove() active = false; mdev_unregister_device() parent device removed. [...] parents->ops->remove() /* * BUG: Accessing invalid parent. */ This is similar race like create() racing with mdev_unregister_device(). BUG: unable to handle kernel paging request at ffffffffc0585668 PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: remove_store+0x71/0x90 [mdev] kernfs_fop_write+0x113/0x1a0 vfs_write+0xad/0x1b0 ksys_write+0x5a/0xe0 do_syscall_64+0x5a/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Therefore, mdev core is improved as below to overcome above issues. Wait for any ongoing mdev create() and remove() to finish before unregistering parent device. This continues to allow multiple create and remove to progress in parallel for different mdev devices as most common case. At the same time guard parent removal while parent is being accessed by create() and remove() callbacks. create()/remove() and unregister_device() are synchronized by the rwsem. Refactor device removal code to mdev_device_remove_common() to avoid acquiring unreg_sem of the parent. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2019-06-07 00:52:33 +08:00
mdev_device_remove_common(mdev);
up_read(&parent->unreg_sem);
return 0;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
static int __init mdev_init(void)
{
vfio/mdev: Move the compat_class initialization to module init The pointer to mdev_bus_compat_class is statically defined at the top of mdev_core, and was originally (commit 7b96953bc640 ("vfio: Mediated device Core driver") serialized by the parent_list_lock. The blamed commit removed this mutex, leaving the pointer initialization unserialized. As a result, the creation of multiple MDEVs in parallel (such as during boot) can encounter errors during the creation of the sysfs entries, such as: [ 8.337509] sysfs: cannot create duplicate filename '/class/mdev_bus' [ 8.337514] vfio_ccw 0.0.01d8: MDEV: Registered [ 8.337516] CPU: 13 PID: 946 Comm: driverctl Not tainted 6.4.0-rc7 #20 [ 8.337522] Hardware name: IBM 3906 M05 780 (LPAR) [ 8.337525] Call Trace: [ 8.337528] [<0000000162b0145a>] dump_stack_lvl+0x62/0x80 [ 8.337540] [<00000001622aeb30>] sysfs_warn_dup+0x78/0x88 [ 8.337549] [<00000001622aeca6>] sysfs_create_dir_ns+0xe6/0xf8 [ 8.337552] [<0000000162b04504>] kobject_add_internal+0xf4/0x340 [ 8.337557] [<0000000162b04d48>] kobject_add+0x78/0xd0 [ 8.337561] [<0000000162b04e0a>] kobject_create_and_add+0x6a/0xb8 [ 8.337565] [<00000001627a110e>] class_compat_register+0x5e/0x90 [ 8.337572] [<000003ff7fd815da>] mdev_register_parent+0x102/0x130 [mdev] [ 8.337581] [<000003ff7fdc7f2c>] vfio_ccw_sch_probe+0xe4/0x178 [vfio_ccw] [ 8.337588] [<0000000162a7833c>] css_probe+0x44/0x80 [ 8.337599] [<000000016279f4da>] really_probe+0xd2/0x460 [ 8.337603] [<000000016279fa08>] driver_probe_device+0x40/0xf0 [ 8.337606] [<000000016279fb78>] __device_attach_driver+0xc0/0x140 [ 8.337610] [<000000016279cbe0>] bus_for_each_drv+0x90/0xd8 [ 8.337618] [<00000001627a00b0>] __device_attach+0x110/0x190 [ 8.337621] [<000000016279c7c8>] bus_rescan_devices_helper+0x60/0xb0 [ 8.337626] [<000000016279cd48>] drivers_probe_store+0x48/0x80 [ 8.337632] [<00000001622ac9b0>] kernfs_fop_write_iter+0x138/0x1f0 [ 8.337635] [<00000001621e5e14>] vfs_write+0x1ac/0x2f8 [ 8.337645] [<00000001621e61d8>] ksys_write+0x70/0x100 [ 8.337650] [<0000000162b2bdc4>] __do_syscall+0x1d4/0x200 [ 8.337656] [<0000000162b3c828>] system_call+0x70/0x98 [ 8.337664] kobject: kobject_add_internal failed for mdev_bus with -EEXIST, don't try to register things with the same name in the same directory. [ 8.337668] kobject: kobject_create_and_add: kobject_add error: -17 [ 8.337674] vfio_ccw: probe of 0.0.01d9 failed with error -12 [ 8.342941] vfio_ccw_mdev aeb9ca91-10c6-42bc-a168-320023570aea: Adding to iommu group 2 Move the initialization of the mdev_bus_compat_class pointer to the init path, to match the cleanup in module exit. This way the code in mdev_register_parent() can simply link the new parent to it, rather than determining whether initialization is required first. Fixes: 89345d5177aa ("vfio/mdev: embedd struct mdev_parent in the parent data structure") Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230626133642.2939168-1-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-06-26 21:36:42 +08:00
int ret;
ret = bus_register(&mdev_bus_type);
if (ret)
return ret;
mdev_bus_compat_class = class_compat_register("mdev_bus");
if (!mdev_bus_compat_class) {
bus_unregister(&mdev_bus_type);
return -ENOMEM;
}
return 0;
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
static void __exit mdev_exit(void)
{
vfio/mdev: Move the compat_class initialization to module init The pointer to mdev_bus_compat_class is statically defined at the top of mdev_core, and was originally (commit 7b96953bc640 ("vfio: Mediated device Core driver") serialized by the parent_list_lock. The blamed commit removed this mutex, leaving the pointer initialization unserialized. As a result, the creation of multiple MDEVs in parallel (such as during boot) can encounter errors during the creation of the sysfs entries, such as: [ 8.337509] sysfs: cannot create duplicate filename '/class/mdev_bus' [ 8.337514] vfio_ccw 0.0.01d8: MDEV: Registered [ 8.337516] CPU: 13 PID: 946 Comm: driverctl Not tainted 6.4.0-rc7 #20 [ 8.337522] Hardware name: IBM 3906 M05 780 (LPAR) [ 8.337525] Call Trace: [ 8.337528] [<0000000162b0145a>] dump_stack_lvl+0x62/0x80 [ 8.337540] [<00000001622aeb30>] sysfs_warn_dup+0x78/0x88 [ 8.337549] [<00000001622aeca6>] sysfs_create_dir_ns+0xe6/0xf8 [ 8.337552] [<0000000162b04504>] kobject_add_internal+0xf4/0x340 [ 8.337557] [<0000000162b04d48>] kobject_add+0x78/0xd0 [ 8.337561] [<0000000162b04e0a>] kobject_create_and_add+0x6a/0xb8 [ 8.337565] [<00000001627a110e>] class_compat_register+0x5e/0x90 [ 8.337572] [<000003ff7fd815da>] mdev_register_parent+0x102/0x130 [mdev] [ 8.337581] [<000003ff7fdc7f2c>] vfio_ccw_sch_probe+0xe4/0x178 [vfio_ccw] [ 8.337588] [<0000000162a7833c>] css_probe+0x44/0x80 [ 8.337599] [<000000016279f4da>] really_probe+0xd2/0x460 [ 8.337603] [<000000016279fa08>] driver_probe_device+0x40/0xf0 [ 8.337606] [<000000016279fb78>] __device_attach_driver+0xc0/0x140 [ 8.337610] [<000000016279cbe0>] bus_for_each_drv+0x90/0xd8 [ 8.337618] [<00000001627a00b0>] __device_attach+0x110/0x190 [ 8.337621] [<000000016279c7c8>] bus_rescan_devices_helper+0x60/0xb0 [ 8.337626] [<000000016279cd48>] drivers_probe_store+0x48/0x80 [ 8.337632] [<00000001622ac9b0>] kernfs_fop_write_iter+0x138/0x1f0 [ 8.337635] [<00000001621e5e14>] vfs_write+0x1ac/0x2f8 [ 8.337645] [<00000001621e61d8>] ksys_write+0x70/0x100 [ 8.337650] [<0000000162b2bdc4>] __do_syscall+0x1d4/0x200 [ 8.337656] [<0000000162b3c828>] system_call+0x70/0x98 [ 8.337664] kobject: kobject_add_internal failed for mdev_bus with -EEXIST, don't try to register things with the same name in the same directory. [ 8.337668] kobject: kobject_create_and_add: kobject_add error: -17 [ 8.337674] vfio_ccw: probe of 0.0.01d9 failed with error -12 [ 8.342941] vfio_ccw_mdev aeb9ca91-10c6-42bc-a168-320023570aea: Adding to iommu group 2 Move the initialization of the mdev_bus_compat_class pointer to the init path, to match the cleanup in module exit. This way the code in mdev_register_parent() can simply link the new parent to it, rather than determining whether initialization is required first. Fixes: 89345d5177aa ("vfio/mdev: embedd struct mdev_parent in the parent data structure") Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230626133642.2939168-1-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-06-26 21:36:42 +08:00
class_compat_unregister(mdev_bus_compat_class);
bus_unregister(&mdev_bus_type);
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
}
subsys_initcall(mdev_init)
vfio: Mediated device Core driver Design for Mediated Device Driver: Main purpose of this driver is to provide a common interface for mediated device management that can be used by different drivers of different devices. This module provides a generic interface to create the device, add it to mediated bus, add device to IOMMU group and then add it to vfio group. Below is the high Level block diagram, with Nvidia, Intel and IBM devices as example, since these are the devices which are going to actively use this module as of now. +---------------+ | | | +-----------+ | mdev_register_driver() +--------------+ | | | +<------------------------+ __init() | | | mdev | | | | | | bus | +------------------------>+ |<-> VFIO user | | driver | | probe()/remove() | vfio_mdev.ko | APIs | | | | | | | +-----------+ | +--------------+ | | | MDEV CORE | | MODULE | | mdev.ko | | +-----------+ | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | nvidia.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | Physical | | | | device | | mdev_register_device() +--------------+ | | interface | |<------------------------+ | | | | | | i915.ko |<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | | | | | | | | mdev_register_device() +--------------+ | | | +<------------------------+ | | | | | | ccw_device.ko|<-> physical | | | +------------------------>+ | device | | | | callback +--------------+ | +-----------+ | +---------------+ Core driver provides two types of registration interfaces: 1. Registration interface for mediated bus driver: /** * struct mdev_driver - Mediated device's driver * @name: driver name * @probe: called when new device created * @remove:called when device removed * @driver:device driver structure * **/ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driver driver; }; Mediated bus driver for mdev device should use this interface to register and unregister with core driver respectively: int mdev_register_driver(struct mdev_driver *drv, struct module *owner); void mdev_unregister_driver(struct mdev_driver *drv); Mediated bus driver is responsible to add/delete mediated devices to/from VFIO group when devices are bound and unbound to the driver. 2. Physical device driver interface This interface provides vendor driver the set APIs to manage physical device related work in its driver. APIs are : * dev_attr_groups: attributes of the parent device. * mdev_attr_groups: attributes of the mediated device. * supported_type_groups: attributes to define supported type. This is mandatory field. * create: to allocate basic resources in vendor driver for a mediated device. This is mandatory to be provided by vendor driver. * remove: to free resources in vendor driver when mediated device is destroyed. This is mandatory to be provided by vendor driver. * open: open callback of mediated device * release: release callback of mediated device * read : read emulation callback. * write: write emulation callback. * ioctl: ioctl callback. * mmap: mmap emulation callback. Drivers should use these interfaces to register and unregister device to mdev core driver respectively: extern int mdev_register_device(struct device *dev, const struct parent_ops *ops); extern void mdev_unregister_device(struct device *dev); There are no locks to serialize above callbacks in mdev driver and vfio_mdev driver. If required, vendor driver can have locks to serialize above APIs in their driver. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Jike Song <jike.song@intel.com> Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-11-17 04:46:13 +08:00
module_exit(mdev_exit)
MODULE_VERSION(DRIVER_VERSION);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);