korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-17 09:14:19 +08:00

Author	SHA1	Message	Date
Shaokun Zhang	7a43ce37cd	vhost: Remove the repeated declaration Function 'vhost_vring_ioctl' is declared twice, remove the repeated declaration. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/1621857884-19964-1-git-send-email-zhangshaokun@hisilicon.com Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-07-03 04:50:53 -04:00
Stefano Garzarella	8693059284	vhost-iotlb: fix vhost_iotlb_del_range() documentation Trivial change for the vhost_iotlb_del_range() documentation, fixing the function name in the comment block. Discovered with `make C=2 M=drivers/vhost`: ../drivers/vhost/iotlb.c:92: warning: expecting prototype for vring_iotlb_del_range(). Prototype was for vhost_iotlb_del_range() instead Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210504135444.158716-1-sgarzare@redhat.com Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-07-03 04:50:50 -04:00
Alexander Aring	e3ae2365ef	net: sock: introduce sk_error_report This patch introduces a function wrapper to call the sk_error_report callback. That will prepare to add additional handling whenever sk_error_report is called, for example to trace socket errors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-29 11:28:21 -07:00
Arseny Krasnov	ced7b71371	vhost/vsock: support SEQPACKET for transport When a received packet is copied to the guest's rx queue, the data buffers of the rx queue may be smaller than the data buffer of the input packet, so the input packet's data is split across several rx buffers, and each rx buffer becomes a packet with a dynamically created header. The fields of that header are initialized from the header of the input packet (except the length field, whose value depends on the number of bytes copied to the rx buffer). In the SEQPACKET case we also need to take care of the record delimiter bit: if the input packet has this bit set, we don't copy it to the header of the packet in an rx buffer unless that rx buffer holds the last part of the input packet. Otherwise we would get a sequence of packets with the delimiter bit set, breaking record boundaries. Also stop ignoring non-stream packet types and handle the SEQPACKET feature bit. Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-06-11 13:32:47 -07:00
Matteo Croce	224bf7db55	vhost_net: use XDP helpers Make use of the xdp_{init,prepare}_buff() helpers instead of an open-coded version. Also, the field xdp->rxq was never set, so pass NULL to xdp_init_buff() to clear it. Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-05-14 15:20:10 -07:00
Linus Torvalds	16bb86b556	virtio,vhost,vdpa: features, fixes A bunch of new drivers including vdpa support for block and virtio-vdpa. Beginning of vq kick (aka doorbell) mapping support. Misc fixes. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "A bunch of new drivers including vdpa support for block and virtio-vdpa. Beginning of vq kick (aka doorbell) mapping support. Misc fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits) virtio_pci_modern: correct sparse tags for notify virtio_pci_modern: __force cast the notify mapping vDPA/ifcvf: get_config_size should return dev specific config size vDPA/ifcvf: enable Intel C5000X-PL virtio-block for vDPA vDPA/ifcvf: deduce VIRTIO device ID when probe vdpa_sim_blk: add support for vdpa management tool vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID vdpa_sim_blk: implement ramdisk behaviour vdpa: add vdpa simulator for block device vhost/vdpa: Remove the restriction that only supports virtio-net devices vhost/vdpa: use get_config_size callback in vhost_vdpa_config_validate() vdpa: add get_config_size callback in vdpa_config_ops vdpa_sim: cleanup kiovs in vdpasim_free() vringh: add vringh_kiov_length() helper vringh: implement vringh_kiov_advance() vringh: explain more about cleaning riov and wiov vringh: reset kiov 'consumed' field in __vringh_iov() vringh: add 'iotlb_lock' to synchronize iotlb accesses vdpa_sim: use iova module to allocate IOVA addresses vDPA/ifcvf: deduce VIRTIO device ID from pdev ids ...	2021-05-05 13:31:39 -07:00
Xie Yongji	9d6d97bff7	vhost/vdpa: Remove the restriction that only supports virtio-net devices Since the config checks are done by the vDPA drivers, we can remove the virtio-net restriction and we should be able to support all kinds of virtio devices. <linux/virtio_net.h> is not needed anymore, but we need to include <linux/slab.h> to avoid compilation failures. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-11-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-05-03 04:55:53 -04:00
Stefano Garzarella	d6d8bb92fd	vhost/vdpa: use get_config_size callback in vhost_vdpa_config_validate() Let's use the new 'get_config_size()' callback available instead of using the 'virtio_id' to get the size of the device config space. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-10-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-05-03 04:55:53 -04:00
Stefano Garzarella	b8c06ad4d6	vringh: implement vringh_kiov_advance() In some cases, it may be useful to provide a way to skip a number of bytes in a vringh_kiov. Let's implement vringh_kiov_advance() for this purpose, reusing the code from vringh_iov_xfer(). We replace that code calling the new vringh_kiov_advance(). Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-6-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-05-03 04:55:53 -04:00
Stefano Garzarella	69c13c58bd	vringh: explain more about cleaning riov and wiov riov and wiov can be reused with subsequent calls of vringh_getdesc_*(). Let's add a paragraph in the documentation of these functions to better explain when riov and wiov need to be cleaned up. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-5-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-05-03 04:55:53 -04:00
Stefano Garzarella	bbc2c372a8	vringh: reset kiov 'consumed' field in __vringh_iov() __vringh_iov() overwrites the contents of riov and wiov, in fact it resets the 'i' and 'used' fields, but also the 'consumed' field should be reset to avoid an inconsistent state. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-4-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-05-03 04:55:53 -04:00
Stefano Garzarella	f53d9910d0	vringh: add 'iotlb_lock' to synchronize iotlb accesses Usually iotlb accesses are synchronized with a spinlock. Let's request it as a new parameter in vringh_set_iotlb() and hold it when we navigate the iotlb in iotlb_translate() to avoid race conditions with any new additions/deletions of ranges from the iotlb. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210315163450.254396-3-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-05-03 04:55:52 -04:00
Jason Wang	3a3e0fad16	vhost-vdpa: fix vm_flags for virtqueue doorbell mapping The virtqueue doorbell is usually implemented via registers, but we don't provide the necessary vma->flags such as VM_PFNMAP. This may cause several issues, e.g. when userspace tries to map the doorbell via the vhost IOTLB, the kernel may panic because the page is not backed by a page structure. This patch fixes this by setting the necessary vm_flags. With this patch, trying to map the doorbell via the IOTLB fails with a bad address error. Cc: stable@vger.kernel.org Fixes: `ddd89d0a05` ("vhost_vdpa: support doorbell mapping via mmap") Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210413091557.29008-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-05-03 04:55:52 -04:00
Linus Torvalds	4f9701057a	IOMMU Updates for Linux v5.13 Including: - Big cleanup of almost unused parts of the IOMMU API by Christoph Hellwig. This mostly affects the Freescale PAMU driver. - New IOMMU driver for Unisoc SOCs - ARM SMMU Updates from Will: - SMMUv3: Drop vestigial PREFETCH_ADDR support - SMMUv3: Elide TLB sync logic for empty gather - SMMUv3: Fix "Service Failure Mode" handling - SMMUv2: New Qualcomm compatible string - Removal of the AMD IOMMU performance counter writeable check on AMD. It caused long boot delays on some machines and is only needed to work around an erratum on some older (possibly pre-production) chips. If someone is still hit by this hardware issue anyway the performance counters will just return 0. - Support for targeted invalidations in the AMD IOMMU driver. Before that the driver only invalidated a single 4k page or the whole IO/TLB for an address space. This has been extended now and is mostly useful for emulated AMD IOMMUs. - Several fixes for the Shared Virtual Memory support in the Intel VT-d driver - Mediatek drivers can now be built as modules - Re-introduction of the forcedac boot option which got lost when converting the Intel VT-d driver to the common dma-iommu implementation. - Extension of the IOMMU device registration interface and support iommu_ops to be const again when drivers are built as modules. Merge tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - Big cleanup of almost unused parts of the IOMMU API by Christoph Hellwig. This mostly affects the Freescale PAMU driver. - New IOMMU driver for Unisoc SOCs - ARM SMMU Updates from Will: - Drop vestigial PREFETCH_ADDR support (SMMUv3) - Elide TLB sync logic for empty gather (SMMUv3) - Fix "Service Failure Mode" handling (SMMUv3) - New Qualcomm compatible string (SMMUv2) - Removal of the AMD IOMMU performance counter writeable check on AMD. It caused long boot delays on some machines and is only needed to work around an erratum on some older (possibly pre-production) chips. If someone is still hit by this hardware issue anyway the performance counters will just return 0. - Support for targeted invalidations in the AMD IOMMU driver. Before that the driver only invalidated a single 4k page or the whole IO/TLB for an address space. This has been extended now and is mostly useful for emulated AMD IOMMUs. - Several fixes for the Shared Virtual Memory support in the Intel VT-d driver - Mediatek drivers can now be built as modules - Re-introduction of the forcedac boot option which got lost when converting the Intel VT-d driver to the common dma-iommu implementation. - Extension of the IOMMU device registration interface and support iommu_ops to be const again when drivers are built as modules. * tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (84 commits) iommu: Streamline registration interface iommu: Statically set module owner iommu/mediatek-v1: Add error handle for mtk_iommu_probe iommu/mediatek-v1: Avoid build fail when build as module iommu/mediatek: Always enable the clk on resume iommu/fsl-pamu: Fix uninitialized variable warning iommu/vt-d: Force to flush iotlb before creating superpage iommu/amd: Put newline after closing bracket in warning iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()' iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86 iommu/amd: Remove performance counter pre-initialization test Revert "iommu/amd: Fix performance counter initialization" iommu/amd: Remove duplicate check of devid iommu/exynos: Remove unneeded local variable initialization iommu/amd: Page-specific invalidations for more than one page iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown iommu/vt-d: Invalidate PASID cache when root/context entry changed iommu/vt-d: Remove WO permissions on second-level paging entries iommu/vt-d: Report the right page fault address ...	2021-05-01 09:33:00 -07:00
Linus Torvalds	d72cd4ad41	SCSI misc on 20210428 This series consists of the usual driver updates (ufs, target, tcmu, smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core change is using a sbitmap instead of an atomic for queue tracking. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, target, tcmu, smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core change is using a sbitmap instead of an atomic for queue tracking" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits) scsi: target: tcm_fc: Fix a kernel-doc header scsi: target: Shorten ALUA error messages scsi: target: Fix two format specifiers scsi: target: Compare explicitly with SAM_STAT_GOOD scsi: sd: Introduce a new local variable in sd_check_events() scsi: dc395x: Open-code status_byte(u8) calls scsi: 53c700: Open-code status_byte(u8) calls scsi: smartpqi: Remove unused functions scsi: qla4xxx: Remove an unused function scsi: myrs: Remove unused functions scsi: myrb: Remove unused functions scsi: mpt3sas: Fix two kernel-doc headers scsi: fcoe: Suppress a compiler warning scsi: libfc: Fix a format specifier scsi: aacraid: Remove an unused function scsi: core: Introduce enum scsi_disposition scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case scsi: core: Rename scsi_softirq_done() into scsi_complete() scsi: core: Remove an incorrect comment scsi: core: Make the scsi_alloc_sgtables() documentation more accurate ...	2021-04-28 17:22:10 -07:00
Xie Yongji	a9d064524f	vhost-vdpa: protect concurrent access to vhost device iotlb Protect vhost device iotlb by vhost_dev->mutex. Otherwise, it might cause corruption of the list and interval tree in struct vhost_iotlb if userspace sends the VHOST_IOTLB_MSG_V2 message concurrently. Fixes: 4c8cf318("vhost: introduce vDPA-based backend") Cc: stable@vger.kernel.org Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210412095512.178-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-04-22 18:15:31 -04:00
Joerg Roedel	49d11527e5	Merge branches 'iommu/fixes', 'arm/mediatek', 'arm/smmu', 'arm/exynos', 'unisoc', 'x86/vt-d', 'x86/amd' and 'core' into next	2021-04-16 17:16:03 +02:00
Christoph Hellwig	bc9a05eef1	iommu: remove DOMAIN_ATTR_GEOMETRY The geometry information can be trivially queried from the iommu_domain structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Li Yang <leoyang.li@nxp.com> Link: https://lore.kernel.org/r/20210401155256.298656-16-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:56:53 +02:00
Linus Torvalds	bf152b0b41	virtio: fixes, cleanups Some fixes and cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fixes from Michael Tsirkin: "Some fixes and cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-vdpa: set v->config_ctx to NULL if eventfd_ctx_fdget() fails vhost-vdpa: fix use-after-free of v->config_ctx vhost: Fix vhost_vq_reset() vhost_vdpa: fix the missing irq_bypass_unregister_producer() invocation vdpa_sim: Skip typecasting from void* virtio: remove export for virtio_config_{enable, disable} virtio-mmio: Use to_virtio_mmio_device() to simply code vdpa: set the virtqueue num during register	2021-03-18 11:20:35 -07:00
Stefano Garzarella	0bde59c172	vhost-vdpa: set v->config_ctx to NULL if eventfd_ctx_fdget() fails In vhost_vdpa_set_config_call(), if eventfd_ctx_fdget() fails, 'v->config_ctx' contains an error instead of a valid pointer. Since we consider 'v->config_ctx' valid if it is not NULL, we should set it to NULL in this case to avoid using an invalid pointer in other functions such as vhost_vdpa_config_put(). Fixes: `776f395004` ("vhost_vdpa: Support config interrupt in vdpa") Cc: lingshan.zhu@intel.com Cc: stable@vger.kernel.org Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210311135257.109460-3-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-03-14 18:10:07 -04:00
Stefano Garzarella	f6bbf0010b	vhost-vdpa: fix use-after-free of v->config_ctx When the 'v->config_ctx' eventfd_ctx reference is released we didn't set it to NULL. So if the same character device (e.g. /dev/vhost-vdpa-0) is re-opened, the 'v->config_ctx' is invalid and calling again vhost_vdpa_config_put() causes use-after-free issues like the following refcount_t underflow: refcount_t: underflow; use-after-free. WARNING: CPU: 2 PID: 872 at lib/refcount.c:28 refcount_warn_saturate+0xae/0xf0 RIP: 0010:refcount_warn_saturate+0xae/0xf0 Call Trace: eventfd_ctx_put+0x5b/0x70 vhost_vdpa_release+0xcd/0x150 [vhost_vdpa] __fput+0x8e/0x240 ____fput+0xe/0x10 task_work_run+0x66/0xa0 exit_to_user_mode_prepare+0x118/0x120 syscall_exit_to_user_mode+0x21/0x50 ? __x64_sys_close+0x12/0x40 do_syscall_64+0x45/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `776f395004` ("vhost_vdpa: Support config interrupt in vdpa") Cc: lingshan.zhu@intel.com Cc: stable@vger.kernel.org Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210311135257.109460-2-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Zhu Lingshan <lingshan.zhu@intel.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-03-14 18:10:07 -04:00
Laurent Vivier	beb691e69f	vhost: Fix vhost_vq_reset() vhost_reset_is_le() is vhost_init_is_le(), and in the case of cross-endian legacy, vhost_init_is_le() depends on vq->user_be. vq->user_be is set by vhost_disable_cross_endian(). But in vhost_vq_reset(), we have: vhost_reset_is_le(vq); vhost_disable_cross_endian(vq); And so user_be is used before being set. To fix that, reverse the lines order as there is no other dependency between them. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20210312140913.788592-1-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-03-14 18:06:33 -04:00
Gautam Dawar	4c050286bb	vhost_vdpa: fix the missing irq_bypass_unregister_producer() invocation When qemu with vhost-vdpa netdevice is run for the first time, it works well. But after the VM is powered off, the next qemu run causes kernel panic due to a NULL pointer dereference in irq_bypass_register_producer(). When the VM is powered off, vhost_vdpa_clean_irq() misses on calling irq_bypass_unregister_producer() for irq 0 because of the existing check. This leaves stale producer nodes, which are reset in vhost_vring_call_reset() when vhost_dev_init() is invoked during the second qemu run. As the node member of struct irq_bypass_producer is also initialized to zero, traversal on the producers list causes crash due to NULL pointer dereference. Fixes: `2cf1ba9a4d` ("vhost_vdpa: implement IRQ offloading in vhost_vdpa") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211711 Signed-off-by: Gautam Dawar <gdawar.xilinx@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210224114845.104173-1-gdawar.xilinx@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-03-14 04:37:36 -04:00
Mike Christie	6ec29cb8ad	scsi: target: vhost-scsi: Use LIO wq cmd submission helper Convert vhost-scsi to use the LIO wq cmd submission helper. Link: https://lore.kernel.org/r/20210227170006.5077-18-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 17:37:02 -05:00
Mike Christie	0869419947	scsi: target: core: Add gfp_t arg to target_cmd_init_cdb() tcm_loop could be used like a normal block device, so we can't use GFP_KERNEL and should use GFP_NOIO. This adds a gfp_t arg to target_cmd_init_cdb() and converts the users. For every driver but loop GFP_KERNEL is kept. This will also be useful in subsequent patches where loop needs to do target_submit_prep() from interrupt context to get a ref to the se_device, and so it will need to use GFP_ATOMIC. Link: https://lore.kernel.org/r/20210227170006.5077-16-michael.christie@oracle.com Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 17:37:02 -05:00
Mike Christie	eb929804db	scsi: target: vhost-scsi: Convert to new submission API target_submit_cmd_map_sgls() is being removed, so convert vhost-scsi to the new submission API. This has it use target_init_cmd(), target_submit_prep(), target_submit() because we need to have LIO core map sgls which is now done in target_submit_prep(), and in the next patches we will do the target_submit step from the LIO workqueue. Note: vhost-scsi never calls target_stop_session() so target_submit_cmd_map_sgls() never failed (in the new API target_init_cmd() handles target_stop_session() being called when cmds are being submitted). If it were to have used target_stop_session() and got an error, we would have hit a refcount bug like xen and usb, because it does: if (rc < 0) { transport_send_check_condition_and_sense(se_cmd, TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE, 0); transport_generic_free_cmd(se_cmd, 0); } transport_send_check_condition_and_sense() calls queue_status which does transport_generic_free_cmd(), and then we do an extra transport_generic_free_cmd() call above which would have dropped the refcount to -1 and the refcount code would spit out errors. Link: https://lore.kernel.org/r/20210227170006.5077-12-michael.christie@oracle.com Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 17:37:01 -05:00
Ming Lei	c548e62bcf	scsi: sbitmap: Move allocation hint into sbitmap Allocation hint should have belonged to sbitmap. Also, when sbitmap's depth is high and there is no need to use multiple wakeup queues, user can benefit from percpu allocation hint too. Move allocation hint into sbitmap, then SCSI device queue can benefit from allocation hint when converting to plain sbitmap. Convert vhost/scsi.c to use sbitmap allocation with percpu alloc hint. This is more efficient than the previous approach. Link: https://lore.kernel.org/r/20210122023317.687987-5-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: virtualization@lists.linux-foundation.org Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 17:36:59 -05:00
Ming Lei	efe1f3a1d5	scsi: sbitmap: Maintain allocation round_robin in sbitmap Currently the allocation round_robin info is maintained by sbitmap_queue. However, bit allocation really belongs to sbitmap. Move it there. Link: https://lore.kernel.org/r/20210122023317.687987-3-ming.lei@redhat.com Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: virtualization@lists.linux-foundation.org Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 17:36:59 -05:00
Linus Torvalds	ffc1759676	virtio: features, fixes new vdpa features to allow creation and deletion of new devices virtio-blk support per-device queue depth fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - new vdpa features to allow creation and deletion of new devices - virtio-blk support per-device queue depth - fixes, cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (31 commits) virtio-input: add multi-touch support virtio_mmio: fix one typo vdpa/mlx5: fix param validation in mlx5_vdpa_get_config() virtio_net: Fix fall-through warnings for Clang virtio_input: Prevent EV_MSC/MSC_TIMESTAMP loop storm for MT. virtio-blk: support per-device queue depth virtio_vdpa: don't warn when fail to disable vq virtio-pci: introduce modern device module virito-pci-modern: rename map_capability() to vp_modern_map_capability() virtio-pci-modern: introduce helper to get notification offset virtio-pci-modern: introduce helper for getting queue nums virtio-pci-modern: introduce helper for setting/geting queue size virtio-pci-modern: introduce helper to set/get queue_enable virtio-pci-modern: introduce vp_modern_queue_address() virtio-pci-modern: introduce vp_modern_set_queue_vector() virtio-pci-modern: introduce vp_modern_generation() virtio-pci-modern: introduce helpers for setting and getting features virtio-pci-modern: introduce helpers for setting and getting status virtio-pci-modern: introduce helper to set config vector virtio-pci-modern: introduce vp_modern_remove() ...	2021-02-25 12:21:08 -08:00
Dongli Zhang	489084dd3f	vhost scsi: alloc vhost_scsi with kvzalloc() to avoid delay The size of 'struct vhost_scsi' is order-10 (~2.3MB). When high-order pages are scarce, kzalloc() may take a long time because it retries memory compaction multiple times. As a result, there is latency when creating a VM (with vhost-scsi) or hot-adding vhost-scsi-based storage. The prior commit `595cb75498` ("vhost/scsi: use vmalloc for order-10 allocation") prefers to fall back only when really needed, while this patch allocates with kvzalloc(), which implicitly sets __GFP_NORETRY and so avoids retrying memory compaction multiple times. __GFP_NORETRY is implicitly set if the size to allocate is more than PAGE_SIZE and __GFP_RETRY_MAYFAIL is not explicitly set. Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Link: https://lore.kernel.org/r/20210123080853.4214-1-dongli.zhang@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-02-23 07:52:56 -05:00
Yunjian Wang	dc9c9e72ff	vhost_net: avoid tx queue stuck when sendmsg fails Currently the driver doesn't drop a packet that can't be sent by tun (e.g. a bad packet). In that case the driver keeps processing the same packet, which leaves the tx queue stuck. To fix this issue: 1. in the case of persistent failure (e.g. a bad packet), the driver skips the descriptor by ignoring the error. 2. in the case of transient failure (e.g. -ENOBUFS, -EAGAIN and -ENOMEM), the driver schedules the worker to try again. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1610685980-38608-1-git-send-email-wangyunjian@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-19 11:13:30 -08:00
Jakub Kicinski	833d22f2f9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Trivial conflict in CAN on file rename. Conflicts: drivers/net/can/m_can/tcan4x5x-core.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-08 13:28:00 -08:00
Jonathan Lemon	9ee5e5ade0	tap/tun: add skb_zcopy_init() helper for initialization. Replace direct assignments with skb_zcopy_init() for zerocopy cases where a new skb is initialized, without changing the reference counts. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 16:08:37 -08:00
Jonathan Lemon	36177832f4	skbuff: Add skb parameter to the ubuf zerocopy callback Add an optional skb parameter to the zerocopy callback parameter, which is passed down from skb_zcopy_clear(). This gives access to the original skb, which is needed for upcoming RX zero-copy error handling. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 16:06:37 -08:00
Linus Torvalds	9f1abbe97c	vhost: bugfix This fixes configs with vhost vsock behind a viommu. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull vhost bugfix from Michael Tsirkin: "This fixes configs with vhost vsock behind a viommu" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost/vsock: add IOTLB API support	2021-01-05 13:30:28 -08:00
Linus Torvalds	aa35e45cd4	Networking fixes for 5.11-rc3, including fixes from netfilter, wireless and bpf trees. Current release - regressions: - mt76: - usb: fix NULL pointer dereference in mt76u_status_worker - sdio: fix NULL pointer dereference in mt76s_process_tx_queue - net: ipa: fix interconnect enable bug Current release - always broken: - netfilter: ipset: fixes possible oops in mtype_resize - ath11k: fix number of coding issues found by static analysis tools and spurious error messages Previous releases - regressions: - e1000e: re-enable s0ix power saving flows for systems with the Intel i219-LM Ethernet controllers to fix power use regression - virtio_net: fix recursive call to cpus_read_lock() to avoid a deadlock - ipv4: ignore ECN bits for fib lookups in fib_compute_spec_dst() - net-sysfs: take the rtnl lock around XPS configuration - xsk: - fix memory leak for failed bind - rollback reservation at NETDEV_TX_BUSY - r8169: work around power-saving bug on some chip versions Previous releases - always broken: - dcb: validate netlink message in DCB handler - tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS to prevent unnecessary retries - vhost_net: fix ubuf refcount when sendmsg fails - bpf: save correct stopping point in file seq iteration - ncsi: use real net-device for response handler - neighbor: fix div by zero caused by a data race (TOCTOU) - bareudp: - fix use of incorrect min_headroom size - fix false positive lockdep splat from the TX lock - net: mvpp2: - clear force link UP during port init procedure in case bootloader had set it - add TCAM entry to drop flow control pause frames - fix PPPoE with ipv6 packet parsing - fix GoP Networking Complex Control config of port 3 - fix pkt coalescing IRQ-threshold configuration - xsk: fix race in SKB mode transmit with shared cq - ionic: account for vlan tag len in rx buffer len - net: stmmac: ignore the second clock input, current clock framework does not handle exclusive clock use well, 
other drivers may reconfigure the second clock Misc: - ppp: change PPPIOCUNBRIDGECHAN ioctl request number to follow existing scheme Signed-off-by: Jakub Kicinski <kuba@kernel.org> Merge tag 'net-5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from netfilter, wireless and bpf trees. 
Current release - regressions: - mt76: fix NULL pointer dereference in mt76u_status_worker and mt76s_process_tx_queue - net: ipa: fix interconnect enable bug Current release - always broken: - netfilter: fixes possible oops in mtype_resize in ipset - ath11k: fix number of coding issues found by static analysis tools and spurious error messages Previous releases - regressions: - e1000e: re-enable s0ix power saving flows for systems with the Intel i219-LM Ethernet controllers to fix power use regression - virtio_net: fix recursive call to cpus_read_lock() to avoid a deadlock - ipv4: ignore ECN bits for fib lookups in fib_compute_spec_dst() - sysfs: take the rtnl lock around XPS configuration - xsk: fix memory leak for failed bind and rollback reservation at NETDEV_TX_BUSY - r8169: work around power-saving bug on some chip versions Previous releases - always broken: - dcb: validate netlink message in DCB handler - tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS to prevent unnecessary retries - vhost_net: fix ubuf refcount when sendmsg fails - bpf: save correct stopping point in file seq iteration - ncsi: use real net-device for response handler - neighbor: fix div by zero caused by a data race (TOCTOU) - bareudp: fix use of incorrect min_headroom size and a false positive lockdep splat from the TX lock - mvpp2: - clear force link UP during port init procedure in case bootloader had set it - add TCAM entry to drop flow control pause frames - fix PPPoE with ipv6 packet parsing - fix GoP Networking Complex Control config of port 3 - fix pkt coalescing IRQ-threshold configuration - xsk: fix race in SKB mode transmit with shared cq - ionic: account for vlan tag len in rx buffer len - stmmac: ignore the second clock input, current clock framework does not handle exclusive clock use well, other drivers may reconfigure the second clock Misc: - ppp: change PPPIOCUNBRIDGECHAN ioctl request number to follow existing scheme" * tag 'net-5.11-rc3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits) net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs net: lapb: Decrease the refcount of "struct lapb_cb" in lapb_device_event r8169: work around power-saving bug on some chip versions net: usb: qmi_wwan: add Quectel EM160R-GL selftests: mlxsw: Set headroom size of correct port net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag ibmvnic: fix: NULL pointer dereference. docs: networking: packet_mmap: fix old config reference docs: networking: packet_mmap: fix formatting for C macros vhost_net: fix ubuf refcount incorrectly when sendmsg fails bareudp: Fix use of incorrect min_headroom size bareudp: set NETIF_F_LLTX flag net: hdlc_ppp: Fix issues when mod_timer is called while timer is running atlantic: remove architecture depends erspan: fix version 1 check in gre_parse_header() net: hns: fix return value check in __lb_other_process() net: sched: prevent invalid Scell_log shift count net: neighbor: fix a crash caused by mod zero ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst() ...	2021-01-05 12:38:56 -08:00
Yunjian Wang	01e31bea7e	vhost_net: fix ubuf refcount incorrectly when sendmsg fails Currently the vhost_zerocopy_callback() may be called to decrease the refcount when sendmsg fails in tun. The error handling in vhost handle_tx_zerocopy() will try to decrease the same refcount again. This is wrong. To fix this issue, we only call vhost_net_ubuf_put() when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS. Fixes: `bab632d69e` ("vhost: vhost TX zero-copy support") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1609207308-20544-1-git-send-email-wangyunjian@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-04 13:18:52 -08:00
Stefano Garzarella	e13a6915a0	vhost/vsock: add IOTLB API support This patch enables the IOTLB API support for vhost-vsock devices, allowing the userspace to emulate an IOMMU for the guest. These changes were made following vhost-net. In detail, this patch: - exposes VIRTIO_F_ACCESS_PLATFORM feature and inits the iotlb device if the feature is acked - implements VHOST_GET_BACKEND_FEATURES and VHOST_SET_BACKEND_FEATURES ioctls - calls vq_meta_prefetch() before vq processing to prefetch vq metadata address in IOTLB - provides .read_iter, .write_iter, and .poll callbacks for the chardev; they are used by the userspace to exchange IOTLB messages This patch was tested specifying "intel_iommu=strict" in the guest kernel command line. I used QEMU with a patch applied [1] to fix a simple issue (that patch was merged in QEMU v5.2.0): $ qemu -M q35,accel=kvm,kernel-irqchip=split \ -drive file=fedora.qcow2,format=qcow2,if=virtio \ -device intel-iommu,intremap=on,device-iotlb=on \ -device vhost-vsock-pci,guest-cid=3,iommu_platform=on,ats=on [1] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg09077.html Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201223143638.123417-1-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-12-27 05:49:01 -05:00
Linus Torvalds	64145482d3	virtio,vdpa: features, cleanups, fixes vdpa sim refactoring virtio mem Big Block Mode support misc cleanups, fixes Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - vdpa sim refactoring - virtio mem: Big Block Mode support - misc cleanups, fixes * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (61 commits) vdpa: Use simpler version of ida allocation vdpa: Add missing comment for virtqueue count uapi: virtio_ids: add missing device type IDs from OASIS spec uapi: virtio_ids.h: consistent indentions vhost scsi: fix error return code in vhost_scsi_set_endpoint() virtio_ring: Fix two use after free bugs virtio_net: Fix error code in probe() virtio_ring: Cut and paste bugs in vring_create_virtqueue_packed() tools/virtio: add barrier for aarch64 tools/virtio: add krealloc_array tools/virtio: include asm/bug.h vdpa/mlx5: Use write memory barrier after updating CQ index vdpa: split vdpasim to core and net modules vdpa_sim: split vdpasim_virtqueue's iov field in out_iov and in_iov vdpa_sim: make vdpasim->buffer size configurable vdpa_sim: use kvmalloc to allocate vdpasim->buffer vdpa_sim: set vringh notify callback vdpa_sim: add set_config callback in vdpasim_dev_attr vdpa_sim: add get_config callback in vdpasim_dev_attr vdpa_sim: make 'config' generic and usable for any device type ...	2020-12-24 12:06:46 -08:00
Zhang Changzhong	2e1139d613	vhost scsi: fix error return code in vhost_scsi_set_endpoint() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: `25b98b64e2` ("vhost scsi: alloc cmds per vq instead of session") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1607071411-33484-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-12-18 16:14:31 -05:00
Tian Tao	0ab4b8901a	vhost_vdpa: switch to vmemdup_user() Replace opencoded alloc and copy with vmemdup_user() Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1605057288-60400-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2020-12-18 16:14:28 -05:00
Bartosz Golaszewski	3a99974872	vhost: vringh: use krealloc_array() Use the helper that checks for overflows internally instead of manually calculating the size of the new array. Link: https://lkml.kernel.org/r/20201109110654.12547-5-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Christian König <christian.koenig@amd.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: David Rientjes <rientjes@google.com> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: James Morse <james.morse@arm.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Maxime Ripard <mripard@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Robert Richter <rric@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Takashi Iwai <tiwai@suse.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-15 12:13:37 -08:00
Dan Carpenter	2c602741b5	vhost_vdpa: return -EFAULT if copy_to_user() fails The copy_to_user() function returns the number of bytes remaining to be copied but this should return -EFAULT to the user. Fixes: `1b48dc03e5` ("vhost: vdpa: report iova range") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8c32z5EtDsMyyIL@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2020-12-02 04:36:40 -05:00
Si-Wei Liu	ad89653f79	vhost-vdpa: fix page pinning leakage in error path (rework) Pinned pages are not properly accounted particularly when mapping error occurs on IOTLB update. Clean up dangling pinned pages for the error path. The memory usage for bookkeeping pinned pages is reverted to what it was before: only one single free page is needed. This helps reduce the host memory demand for VM with a large amount of memory, or in the situation where host is running short of free memory. Fixes: `4c8cf31885` ("vhost: introduce vDPA-based backend") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Link: https://lore.kernel.org/r/1604618793-4681-1-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2020-11-25 04:29:07 -05:00
Stefano Garzarella	8009b0f4ab	vringh: fix vringh_iov_push_() documentation vringh_iov_push_() functions don't have 'dst' parameter, but have the 'src' parameter. Replace 'dst' description with 'src' description. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20201116161653.102904-1-sgarzare@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-25 04:22:48 -05:00
Mike Christie	b4fffc177f	vhost scsi: fix lun reset completion handling vhost scsi owns the scsi se_cmd but lio frees the se_cmd->se_tmr before calling release_cmd, so while with normal cmd completion we can access the se_cmd from the vhost work, we can't do the same with se_cmd->se_tmr. This has us copy the tmf response in vhost_scsi_queue_tm_rsp to our internal vhost-scsi tmf struct for when it gets sent to the guest from our worker thread. Fixes: `efd838fec1` ("vhost scsi: Add support for LUN resets.") Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/1605887459-3864-1-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2020-11-25 04:22:13 -05:00
Mike Christie	efd838fec1	vhost scsi: Add support for LUN resets. In newer versions of virtio-scsi we just reset the timer when a command times out, so TMFs are never sent for the cmd time out case. However, in older kernels and for the TMF inject cases, we can still get resets and we end up just failing immediately so the guest might see the device get offlined and IO errors. For the older kernel cases, we want the same end result as the modern virtio-scsi driver where we let the lower levels fire their error handling and handle the problem. And at the upper levels we want to wait. This patch ties the LUN reset handling into the LIO TMF code which will just wait for outstanding commands to complete like we are doing in the modern virtio-scsi case. Note: I did not handle the ABORT case to keep this simple. For ABORTs LIO just waits on the cmd like how it does for the RESET case. If an ABORT fails, the guest OS ends up escalating to LUN RESET, so in the end we get the same behavior where we wait on the outstanding cmds. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-6-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	18f1becb69	vhost scsi: add lun parser helper Move code to parse lun from req's lun_buf to helper, so tmf code can use it in the next patch. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1604986403-4931-5-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	47a3565e8b	vhost scsi: fix cmd completion race We might not do the final se_cmd put from vhost_scsi_complete_cmd_work. When the last put happens a little later then we could race where vhost_scsi_complete_cmd_work does vhost_signal, the guest runs and sends more IO, and vhost_scsi_handle_vq runs but does not find any free cmds. This patch has us delay completing the cmd until the last lio core ref is dropped. We then know that once we signal to the guest that the cmd is completed that if it queues a new command it will find a free cmd. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/1604986403-4931-4-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00
Mike Christie	25b98b64e2	vhost scsi: alloc cmds per vq instead of session We currently are limited to 256 cmds per session. This leads to problems where if the user has increased virtqueue_size to more than 2 or cmd_per_lun to more than 256 vhost_scsi_get_tag can fail and the guest will get IO errors. This patch moves the cmd allocation to per vq so we can easily match whatever the user has specified for num_queues and virtqueue_size/cmd_per_lun. It also makes it easier to control how much memory we preallocate. For cases, where perf is not as important and we can use the current defaults (1 vq and 128 cmds per vq) memory use from preallocate cmds is cut in half. For cases, where we are willing to use more memory for higher perf, cmd mem use will now increase as the num queues and queue depth increases. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-3-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-11-15 17:30:55 -05:00

750 Commits