linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-05 01:54:09 +08:00

Author	SHA1	Message	Date
Lee Jones	721a6fe5f9	i2c: busses: i2c-st: Fix copy/paste function misnaming issues Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-st.c:531: warning: expecting prototype for st_i2c_handle_write(). Prototype was for st_i2c_handle_read() instead drivers/i2c/busses/i2c-st.c:566: warning: expecting prototype for st_i2c_isr(). Prototype was for st_i2c_isr_thread() instead Fix the "enmpty" typo while here. Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Alain Volmat <alain.volmat@foss.st.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:39:35 +02:00
Lee Jones	3e0f8672f1	i2c: busses: i2c-pnx: Provide descriptions for 'alg_data' data structure Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-pnx.c:147: warning: Function parameter or member 'alg_data' not described in 'i2c_pnx_start' drivers/i2c/busses/i2c-pnx.c:147: warning: Excess function parameter 'adap' description in 'i2c_pnx_start' drivers/i2c/busses/i2c-pnx.c:202: warning: Function parameter or member 'alg_data' not described in 'i2c_pnx_stop' drivers/i2c/busses/i2c-pnx.c:202: warning: Excess function parameter 'adap' description in 'i2c_pnx_stop' drivers/i2c/busses/i2c-pnx.c:231: warning: Function parameter or member 'alg_data' not described in 'i2c_pnx_master_xmit' drivers/i2c/busses/i2c-pnx.c:231: warning: Excess function parameter 'adap' description in 'i2c_pnx_master_xmit' drivers/i2c/busses/i2c-pnx.c:301: warning: Function parameter or member 'alg_data' not described in 'i2c_pnx_master_rcv' drivers/i2c/busses/i2c-pnx.c:301: warning: Excess function parameter 'adap' description in 'i2c_pnx_master_rcv' Signed-off-by: Lee Jones <lee.jones@linaro.org> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:34:08 +02:00
Lee Jones	d4c73d41be	i2c: busses: i2c-ocores: Place the expected function names into the documentation headers Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-ocores.c:253: warning: This comment starts with '/', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/i2c/busses/i2c-ocores.c:267: warning: This comment starts with '/', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/i2c/busses/i2c-ocores.c:299: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst drivers/i2c/busses/i2c-ocores.c:347: warning: expecting prototype for It handles an IRQ(). Prototype was for ocores_process_polling() instead Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:33:41 +02:00
Lee Jones	f9f193fc22	i2c: busses: i2c-eg20t: Fix 'bad line' issue and provide description for 'msgs' param Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-eg20t.c:151: warning: bad line: PCH i2c controller drivers/i2c/busses/i2c-eg20t.c:369: warning: Function parameter or member 'msgs' not described in 'pch_i2c_writebytes' Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:33:10 +02:00
Lee Jones	b4c760de3c	i2c: busses: i2c-designware-master: Fix misnaming of 'i2c_dw_init_master()' Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-designware-master.c:176: warning: expecting prototype for i2c_dw_init(). Prototype was for i2c_dw_init_master() instead Signed-off-by: Lee Jones <lee.jones@linaro.org> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:32:12 +02:00
Lee Jones	6eb8a47369	i2c: busses: i2c-cadence: Fix incorrectly documented 'enum cdns_i2c_slave_mode' Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-cadence.c:157: warning: expecting prototype for enum cdns_i2c_slave_mode. Prototype was for enum cdns_i2c_slave_state instead Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:31:59 +02:00
Lee Jones	f09aa114c4	i2c: busses: i2c-ali1563: File headers are not good candidates for kernel-doc Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-ali1563.c:24: warning: expecting prototype for i2c(). Prototype was for ALI1563_MAX_TIMEOUT() instead Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:29:26 +02:00
Lee Jones	45ce82f5ea	i2c: muxes: i2c-arb-gpio-challenge: Demote non-conformant kernel-doc headers Fixes the following W=1 kernel build warning(s): drivers/i2c/muxes/i2c-arb-gpio-challenge.c:43: warning: Function parameter or member 'muxc' not described in 'i2c_arbitrator_select' drivers/i2c/muxes/i2c-arb-gpio-challenge.c:43: warning: Function parameter or member 'chan' not described in 'i2c_arbitrator_select' drivers/i2c/muxes/i2c-arb-gpio-challenge.c:86: warning: Function parameter or member 'muxc' not described in 'i2c_arbitrator_deselect' drivers/i2c/muxes/i2c-arb-gpio-challenge.c:86: warning: Function parameter or member 'chan' not described in 'i2c_arbitrator_deselect' Signed-off-by: Lee Jones <lee.jones@linaro.org> Acked-by: Douglas Anderson <dianders@chromium.org> Acked-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:29:03 +02:00
Lee Jones	72ab7b6bb1	i2c: busses: i2c-nomadik: Fix formatting issue pertaining to 'timeout' Fixes the following W=1 kernel build warning(s): drivers/i2c/busses/i2c-nomadik.c:184: warning: Function parameter or member 'timeout' not described in 'nmk_i2c_dev' Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-05-27 21:27:48 +02:00
Shyam Prasad N	eb06881805	cifs: fix string declarations and assignments in tracepoints We missed using the variable length string macros in several tracepoints. Fixed them in this change. There's probably more useful macros that we can use to print others like flags etc. But I'll submit sepawrate patches for those at a future date. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Cc: <stable@vger.kernel.org> # v5.12 Signed-off-by: Steve French <stfrench@microsoft.com>	2021-05-27 14:04:32 -05:00
Aurelien Aptel	6d2fcfe6b5	cifs: set server->cipher_type to AES-128-CCM for SMB3.0 SMB3.0 doesn't have encryption negotiate context but simply uses the SMB2_GLOBAL_CAP_ENCRYPTION flag. When that flag is present in the neg response cifs.ko uses AES-128-CCM which is the only cipher available in this context. cipher_type was set to the server cipher only when parsing encryption negotiate context (SMB3.1.1). For SMB3.0 it was set to 0. This means cipher_type value can be 0 or 1 for AES-128-CCM. Fix this by checking for SMB3.0 and encryption capability and setting cipher_type appropriately. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2021-05-27 14:03:47 -05:00
Linus Torvalds	3224374f7e	ACPI fix for 5.13-rc4. Fix a recent ACPI power management regression causing boot issues to occur on some systems due to attempts to turn off ACPI power resources that are already off (which should work according to the ACPI specification). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmCv5H0SHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRx11wP/1j7DA8TOOeJjdxYOw3Ky3XtntjpVphe xeI2M9w8zDhiZl0JeyWNjLbiPL50j6xi9+YuglG/pIjnhUoMtewr/7+4+X6/iz7U VMZ+C+Hd/0/vdn1vWKxMmCIHASi5Vz+RMon58GweBLAcMiedJkVX1qtv+ZAUmhBY KtCjQVpSDuhmzrw6LPADjTjobHIsQPUnu+SE/faplFoEGHv9CACrypaHuNO+ACDm v0eT09Zjt+qGEhkCy33xmd47+JcktOmSuHabg4yn/93+z7HHwNDhIbInFUTdvPoW CnBV6wUGerj57nN6kzjCEb4nowhYwbmfJf8ufROfay+eR/arnueo9rNX5G/iGxRS VpIAG+Fu3mm9RfOwwWrl/v8ofBPjRGhCShAYnBPtFr5ZQX4pnRstZcWmxxQK5/x5 2RN9cQlBmm0wzKTwJ9XYpscZqf+yNVQSzYz1vM5jLNh8DggQ4ChqUG7QJcUYKrwL Gs6eoiMRgF/CqaNgx8cd96YRv7KiUZtYN8fQRSWmCmmTlkKJk1FoHbQXQ78MUZ9Y w7KI9yXCA5TgcUAwgrBuJIrcEMIGYlukrdSDFFwJfmUKsd2X/mRoIdr8Z8yQicQg ZMzyKKjMCqKDDZLYx8ZiKygap7Hg/W+bii75PyFOD+UWg6LXn6ARdmaWiggLw1bE 10Z6UnsV7zZ2 =z6Ba -----END PGP SIGNATURE----- Merge tag 'acpi-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix a recent ACPI power management regression causing boot issues to occur on some systems due to attempts to turn off ACPI power resources that are already off (which should work according to the ACPI specification)" * tag 'acpi-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: power: Refine turning off unused power resources	2021-05-27 08:39:05 -10:00
Linus Torvalds	96c132f837	IOMMU Fixes for Linux v5.13-rc3 Including: - Important fix for the AMD IOMMU driver in the recently added page-specific invalidation code to fix a calculation. - Fix a NULL-ptr dereference in the AMD IOMMU driver when a device switches domain types. - Fixes for the Intel VT-d driver to check for allocation failure and do correct cleanup. - Another fix for Intel VT-d to not allow supervisor page requests from devices when using second level page translation. - Add a MODULE_DEVICE_TABLE to the VIRTIO IOMMU driver -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmCvq8MACgkQK/BELZcB GuPqvxAAhcMbhlyUayOW2eRjjNNSpI1uRSNPDRlIOpxqgJS2UHe3srXDV/ZduqaT +PJBmRxE120LahcKhI60nzVEG4uJNqWMTytZPOAq/g+sJiMuplcmpUNxcA3TgnjL DMdsA4iShVq+yKl0kdrdU3N+wJahKUYvUr4IvAwtTs2QDCfSBwUjhH8eLJefFbaI kDhno4FCwLuUlYkrtk//i9Y2InpA1tYXQ7yih4hhqhmkt6ODoAltf3rKu39WgF+G STESV5aocD2+2dl9Pwap/qOTT/9S3Oz+3TMkcCZ+S3WBic/MmMxMqHlh9iYo41fb PYUwtk5vqKvxtQ1Rm+oP6OeHiQ0NhCrb2prPSUl5e9U4fX1FEBjKgJcOc3Yyc2d0 P3CkeeMk0qcBV8F6qYVuumkzX1m79jyspPAswqKjAsVjHHCALLM+xh1lyDqhxQYM Ak6uzCewYiufvGmnjgRXLa4QqcLb7N0pWx3W1nbRMyY2ntqie0mevzAVJ2o27BCZ Ti2bz/Ls6Po0FX1QzwpfobSxgWptuBIE++24uzMvnSyqugHBWyywgeAlEQBQ+tE9 iDfP8g9P2jUttSgt+3Cnf/idmrngRTrKLX4LrGHb0HlW4zVomhduy8DoD5cAixPg +fRQsakwVInejt24ChbcT+Lr15fDhIgIp5bNt3Thb8uUK+gld7U= =49ie -----END PGP SIGNATURE----- Merge tag 'iommu-fixes-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Important fix for the AMD IOMMU driver in the recently added page-specific invalidation code to fix a calculation. - Fix a NULL-ptr dereference in the AMD IOMMU driver when a device switches domain types. - Fixes for the Intel VT-d driver to check for allocation failure and do correct cleanup. - Another fix for Intel VT-d to not allow supervisor page requests from devices when using second level page translation. - Add a MODULE_DEVICE_TABLE to the VIRTIO IOMMU driver * tag 'iommu-fixes-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Fix sysfs leak in alloc_iommu() iommu/vt-d: Use user privilege for RID2PASID translation iommu/vt-d: Check for allocation failure in aux_detach_device() iommu/virtio: Add missing MODULE_DEVICE_TABLE iommu/amd: Fix wrong parentheses on page-specific invalidations iommu/amd: Clear DMA ops when switching domain	2021-05-27 08:06:36 -10:00
Ian Rogers	c59870e211	perf debug: Move debug initialization earlier This avoids segfaults during option handlers that use pr_err. For example, "perf --debug nopager list" segfaults before this change. Fixes: `8abceacff8` (perf debug: Add debug_set_file function) Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210519164447.2672030-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-05-27 13:24:22 -03:00
David Howells	f610a5a29c	afs: Fix the nlink handling of dir-over-dir rename Fix rename of one directory over another such that the nlink on the deleted directory is cleared to 0 rather than being decremented to 1. This was causing the generic/035 xfstest to fail. Fixes: `e49c7b2f6d` ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/162194384460.3999479.7605572278074191079.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-05-27 06:23:58 -10:00
Dave Chinner	0fe0bbe00a	xfs: bunmapi has unnecessary AG lock ordering issues large directory block size operations are assert failing because xfs_bunmapi() is not completely removing fragmented directory blocks like so: XFS: Assertion failed: done, file: fs/xfs/libxfs/xfs_dir2.c, line: 677 .... Call Trace: xfs_dir2_shrink_inode+0x1a8/0x210 xfs_dir2_block_to_sf+0x2ae/0x410 xfs_dir2_block_removename+0x21a/0x280 xfs_dir_removename+0x195/0x1d0 xfs_rename+0xb79/0xc50 ? avc_has_perm+0x8d/0x1a0 ? avc_has_perm_noaudit+0x9a/0x120 xfs_vn_rename+0xdb/0x150 vfs_rename+0x719/0xb50 ? __lookup_hash+0x6a/0xa0 do_renameat2+0x413/0x5e0 __x64_sys_rename+0x45/0x50 do_syscall_64+0x3a/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae We are aborting the bunmapi() pass because of this specific chunk of code: /* * Make sure we don't touch multiple AGF headers out of order * in a single transaction, as that could cause AB-BA deadlocks. */ if (!wasdel && !isrt) { agno = XFS_FSB_TO_AGNO(mp, del.br_startblock); if (prev_agno != NULLAGNUMBER && prev_agno > agno) break; prev_agno = agno; } This is designed to prevent deadlocks in AGF locking when freeing multiple extents by ensuring that we only ever lock in increasing AG number order. Unfortunately, this also violates the "bunmapi will always succeed" semantic that some high level callers depend on, such as xfs_dir2_shrink_inode(), xfs_da_shrink_inode() and xfs_inactive_symlink_rmt(). This AG lock ordering was introduced back in 2017 to fix deadlocks triggered by generic/299 as reported here: https://lore.kernel.org/linux-xfs/800468eb-3ded-9166-20a4-047de8018582@gmail.com/ This codebase is old enough that it was before we were defering all AG based extent freeing from within xfs_bunmapi(). THat is, we never actually lock AGs in xfs_bunmapi() any more - every non-rt based extent free is added to the defer ops list, as is all BMBT block freeing. And RT extents are not RT based, so there's no lock ordering issues associated with them. Hence this AGF lock ordering code is both broken and dead. Let's just remove it so that the large directory block code works reliably again. Tested against xfs/538 and generic/299 which is the original test that exposed the deadlocks that this code fixed. Fixes: `5b094d6dac` ("xfs: fix multi-AG deadlock in xfs_bunmapi") Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-05-27 08:11:24 -07:00
Dave Chinner	991c2c5980	xfs: btree format inode forks can have zero extents xfs/538 is assert failing with this trace when testing with directory block sizes of 64kB: XFS: Assertion failed: !xfs_need_iread_extents(ifp), file: fs/xfs/libxfs/xfs_bmap.c, line: 608 .... Call Trace: xfs_bmap_btree_to_extents+0x2a9/0x470 ? kmem_cache_alloc+0xe7/0x220 __xfs_bunmapi+0x4ca/0xdf0 xfs_bunmapi+0x1a/0x30 xfs_dir2_shrink_inode+0x71/0x210 xfs_dir2_block_to_sf+0x2ae/0x410 xfs_dir2_block_removename+0x21a/0x280 xfs_dir_removename+0x195/0x1d0 xfs_remove+0x244/0x460 xfs_vn_unlink+0x53/0xa0 ? selinux_inode_unlink+0x13/0x20 vfs_unlink+0x117/0x220 do_unlinkat+0x1a2/0x2d0 __x64_sys_unlink+0x42/0x60 do_syscall_64+0x3a/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae This is a check to ensure that the extents have been read into memory before we are doing a ifork btree manipulation. This assert is bogus in the above case. We have a fragmented directory block that has more extents in it than can fit in extent format, so the inode data fork is in btree format. xfs_dir2_shrink_inode() asks to remove all remaining 16 filesystem blocks from the inode so it can convert to short form, and __xfs_bunmapi() removes all the extents. We now have a data fork in btree format but have zero extents in the fork. This incorrectly trips the xfs_need_iread_extents() assert because it assumes that an empty extent btree means the extent tree has not been read into memory yet. This is clearly not the case with xfs_bunmapi(), as it has an explicit call to xfs_iread_extents() in it to pull the extents into memory before it starts unmapping. Also, the assert directly after this bogus one is: ASSERT(ifp->if_format == XFS_DINODE_FMT_BTREE); Which covers the context in which it is legal to call xfs_bmap_btree_to_extents just fine. Hence we should just remove the bogus assert as it is clearly wrong and causes a regression. The returns the test behaviour to the pre-existing assert failure in xfs_dir2_shrink_inode() that indicates xfs_bunmapi() has failed to remove all the extents in the range it was asked to unmap. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2021-05-27 08:11:24 -07:00
Rolf Eike Beer	0ee74d5a48	iommu/vt-d: Fix sysfs leak in alloc_iommu() iommu_device_sysfs_add() is called before, so is has to be cleaned on subsequent errors. Fixes: `39ab9555c2` ("iommu: Add sysfs bindings for struct iommu_device") Cc: stable@vger.kernel.org # 4.11.x Signed-off-by: Rolf Eike Beer <eb@emlix.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/17411490.HIIP88n32C@mobilepool36.emlix.com Link: https://lore.kernel.org/r/20210525070802.361755-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-27 16:07:08 +02:00
Marco Elver	b16ef427ad	io_uring: fix data race to avoid potential NULL-deref Commit `ba5ef6dc8a` ("io_uring: fortify tctx/io_wq cleanup") introduced setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to detect a data race between accesses to tctx->io_wq: write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1: io_uring_clean_tctx fs/io_uring.c:9042 [inline] __io_uring_cancel fs/io_uring.c:9136 io_uring_files_cancel include/linux/io_uring.h:16 [inline] do_exit kernel/exit.c:781 do_group_exit kernel/exit.c:923 get_signal kernel/signal.c:2835 arch_do_signal_or_restart arch/x86/kernel/signal.c:789 handle_signal_work kernel/entry/common.c:147 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] ... read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0: io_uring_try_cancel_iowq fs/io_uring.c:8911 [inline] io_uring_try_cancel_requests fs/io_uring.c:8933 io_ring_exit_work fs/io_uring.c:8736 process_one_work kernel/workqueue.c:2276 ... With the config used, KCSAN only reports data races with value changes: this implies that in the case here we also know that tctx->io_wq was non-NULL. Therefore, depending on interleaving, we may end up with: [CPU 0] \| [CPU 1] io_uring_try_cancel_iowq() \| io_uring_clean_tctx() if (!tctx->io_wq) // false \| ... ... \| tctx->io_wq = NULL io_wq_cancel_cb(tctx->io_wq, ...) \| ... -> NULL-deref \| Note: It is likely that thus far we've gotten lucky and the compiler optimizes the double-read into a single read into a register -- but this is never guaranteed, and can easily change with a different config! Fix the data race by restoring the previous behaviour, where both setting io_wq to NULL and put of the wq are _serialized_ after concurrent io_uring_try_cancel_iowq() via acquisition of the uring_lock and removal of the node in io_uring_del_task_file(). Fixes: `ba5ef6dc8a` ("io_uring: fortify tctx/io_wq cleanup") Suggested-by: Pavel Begunkov <asml.silence@gmail.com> Reported-by: syzbot+bf2b3d0435b9b728946c@syzkaller.appspotmail.com Signed-off-by: Marco Elver <elver@google.com> Cc: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20210527092547.2656514-1-elver@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-27 07:44:49 -06:00
Jens Axboe	a4b58f1721	nvme fixes for Linux 5.13 - fix a memory leak in nvme_cdev_add (Guoqing Jiang) - fix inline data size comparison in nvmet_tcp_queue_response (Hou Pu) - fix false keep-alive timeout when a controller is torn down (Sagi Grimberg) - fix a nvme-tcp Kconfig dependency (Sagi Grimberg) - short-circuit reconnect retries for FC (Hannes Reinecke) - decode host pathing error for connect (Hannes Reinecke) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmCviQcLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOl2BAAsFnUbt+2oYKzNldYHGewAv+v33uwtN88zCChbEXu 6Saj0M5B4s9LbpyXdz2l8cb8CtmvSnFRjbHpdKny0E0qXGKp1YHbcTlefBiMeDPp YME543oAZ1rACafiLFwr/7Ys50Frtel9hNWty2Fgv8Q2eiyQRJnQvJHfwQ/Fk9g2 swtVLbO83RwAef+kcLZaJojLeBlvW55vl/Srj5qj6tMPNl5DUgjUmioPTnFdo0ty 6iEyAurXpcdbMIjYdgulbAypyQIvKyzcXgi0BVuxAbCV9DBKdrwP0ob/HDyYsNBX 8sRUQnIfLketBKsIvuwhVcLr27HYHyDAHiGc70iG1ndrvdfkEKDP5BEwZxkNOA7d TbjvFanQIlvxkz419KIZo3SIdrG88DcYz5Ai0MxA8RuiKPJUE+Iksp8aCds/3dLR urfXHyHvj/O5US+xej0j5K63oVW7KPri3mbnFgfSHhmpg1KXBTmVk9a2Gz7ZyKIO XVHaq1kQHhvE0Kp1juZNqJ/ojd17wQP7bFIBGEHo6IPJtDILa0IQOw1S2aPX+/+q 45dJqFxndco8oCyl81ynax9n+T3nA4DJs20a6j7fYbQnJpwKBiSKQsUbVWPybKzM rq+/7PlsGbjSsQwf1dTvhF04j/5HqTT5A+LUsLR6Jm7sPPYkws36PC7Cqr9sUxPX RYo= =JT0/ -----END PGP SIGNATURE----- Merge tag 'nvme-5.13-2021-05-27' of git://git.infradead.org/nvme into block-5.13 Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.13 - fix a memory leak in nvme_cdev_add (Guoqing Jiang) - fix inline data size comparison in nvmet_tcp_queue_response (Hou Pu) - fix false keep-alive timeout when a controller is torn down (Sagi Grimberg) - fix a nvme-tcp Kconfig dependency (Sagi Grimberg) - short-circuit reconnect retries for FC (Hannes Reinecke) - decode host pathing error for connect (Hannes Reinecke)" * tag 'nvme-5.13-2021-05-27' of git://git.infradead.org/nvme: nvmet: fix false keep-alive timeout when a controller is torn down nvmet-tcp: fix inline data size comparison in nvmet_tcp_queue_response nvme-tcp: remove incorrect Kconfig dep in BLK_DEV_NVME nvme-fabrics: decode host pathing error for connect nvme-fc: short-circuit reconnect retries nvme: fix potential memory leaks in nvme_cdev_add	2021-05-27 07:38:12 -06:00
Christian Gmeiner	9808f9be31	serial: 8250_pci: handle FL_NOIRQ board flag In commit `8428413b1d` ("serial: 8250_pci: Implement MSI(-X) support") the way the irq gets allocated was changed. With that change the handling FL_NOIRQ got lost. Restore the old behaviour. Fixes: `8428413b1d` ("serial: 8250_pci: Implement MSI(-X) support") Cc: <stable@vger.kernel.org> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Link: https://lore.kernel.org/r/20210527095529.26281-1-christian.gmeiner@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-27 15:22:36 +02:00
Huilong Deng	a799b68a7c	nfs: Remove trailing semicolon in macros Macros should not use a trailing semicolon. Signed-off-by: Huilong Deng <denghuilong@cdjrlc.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-05-27 09:19:33 -04:00
Alexander Usyskin	bbf0a94744	mei: request autosuspend after sending rx flow control A rx flow control waiting in the control queue may block autosuspend. Re-request autosuspend after flow control been sent to unblock the transition to the low power state. Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20210526193334.445759-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-27 15:17:19 +02:00
Greg Kroah-Hartman	022b93cf2d	interconnect fixes for v5.13 This contains two tiny driver fixes: - bcm-voter: Add missing MODULE_DEVICE_TABLE - bcm-voter: Add a missing of_node_put() Signed-off-by: Georgi Djakov <djakov@kernel.org> -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJgr45VAAoJEIDQzArG2BZjtpsP/0DCs8y80J1CiXyOKAZqGK6X HrMf1Yee8z8wrZqtWr9aKWC9cLskN/Pdlk/IQOvZ+LpMajneByLPdmyFbHhjLko3 d2rGNWV9EMnRvZtbiRudp2e15cw6k4OwITzlMu1HBJsLQiVB81WCgU+pcZce/9e8 99tyGQ/L/eMCOYAJ9fLhqf+02bDuH2D+wnb1oVLkfYqvDqRFlWFHmkgg7TzGwHOt Ws91FPaMEzEK3fsHcfpJTDhfyscHNZIQ21G24t5artqHkOZBmQEbdwlhIthRhZhc bqgtIeSpLGBkwFXZmXA4rlaDBB9qEODagpToUVeej4oSQECpQY9SAuW8sT9jJrT+ eEUcJWqc6D4roNNnvuYVQ3qDWBGzd/SCVUjA6IQUxsiTXDQr/hI5OdxSLdKDND7b m8nDp/6LWZ/pXL6iXbaL4OVlZl76Q7Ga1DfcBfC8JLDt/6PzPH66ErihZkNDtbAS 32ug4qVxJ8SWrFuDM/RtKX50K5drqbng4IFoTtCIpQ4+tdRn0o1WFOCTJzBvAmah da9O4rHkoGD12AhZnIvbCXrFspvDd/8/BoMmfqg/TGDL3gSKq1f5YKFGu2QFyCy3 kVmrrsx1UvjDWC4h5xIPTMgzQqUosTbJmGcGMrIRauJp7JcDzSpX+mV7ijy/J7mB ZqhjAk8KruvFL77j1Au+ =Kwhg -----END PGP SIGNATURE----- Merge tag 'icc-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus Grorgi writes: interconnect fixes for v5.13 This contains two tiny driver fixes: - bcm-voter: Add missing MODULE_DEVICE_TABLE - bcm-voter: Add a missing of_node_put() Signed-off-by: Georgi Djakov <djakov@kernel.org> * tag 'icc-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: qcom: Add missing MODULE_DEVICE_TABLE interconnect: qcom: bcm-voter: add a missing of_node_put()	2021-05-27 15:06:33 +02:00
David Matlack	bedd9195df	KVM: x86/mmu: Fix comment mentioning skip_4k This comment was left over from a previous version of the patch that introduced wrprot_gfn_range, when skip_4k was passed in instead of min_level. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210526163227.3113557-1-dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 08:51:25 -04:00
Chuck Lever	ae605ee983	xprtrdma: Revert `586a0787ce` Commit `9ed5af268e` ("SUNRPC: Clean up the handling of page padding in rpc_prepare_reply_pages()") [Dec 2020] affects RPC Replies that have a data payload (i.e., Write chunks). rpcrdma_prepare_readch(), as its name suggests, sets up Read chunks which are data payloads within RPC Calls. Those payloads are constructed by xdr_write_pages(), which continues to stuff the call buffer's tail kvec with the payload's XDR roundup. Thus removing the tail buffer logic in rpcrdma_prepare_readch() was the wrong thing to do. Fixes: `586a0787ce` ("xprtrdma: Clean up rpcrdma_prepare_readch()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-05-27 08:46:19 -04:00
Zhang Xiaoxu	e67afa7ee4	NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config Since commit `bdcc2cd14e` ("NFSv4.2: handle NFS-specific llseek errors"), nfs42_proc_llseek would return -EOPNOTSUPP rather than -ENOTSUPP when SEEK_DATA on NFSv4.0/v4.1. This will lead xfstests generic/285 not run on NFSv4.0/v4.1 when set the CONFIG_NFS_V4_2, rather than run failed. Fixes: `bdcc2cd14e` ("NFSv4.2: handle NFS-specific llseek errors") Cc: <stable.vger.kernel.org> # 4.2 Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-05-27 08:46:19 -04:00
Marcelo Tosatti	a2486020a8	KVM: VMX: update vcpu posted-interrupt descriptor when assigning device For VMX, when a vcpu enters HLT emulation, pi_post_block will: 1) Add vcpu to per-cpu list of blocked vcpus. 2) Program the posted-interrupt descriptor "notification vector" to POSTED_INTR_WAKEUP_VECTOR With interrupt remapping, an interrupt will set the PIR bit for the vector programmed for the device on the CPU, test-and-set the ON bit on the posted interrupt descriptor, and if the ON bit is clear generate an interrupt for the notification vector. This way, the target CPU wakes upon a device interrupt and wakes up the target vcpu. Problem is that pi_post_block only programs the notification vector if kvm_arch_has_assigned_device() is true. Its possible for the following to happen: 1) vcpu V HLTs on pcpu P, kvm_arch_has_assigned_device is false, notification vector is not programmed 2) device is assigned to VM 3) device interrupts vcpu V, sets ON bit (notification vector not programmed, so pcpu P remains in idle) 4) vcpu 0 IPIs vcpu V (in guest), but since pi descriptor ON bit is set, kvm_vcpu_kick is skipped 5) vcpu 0 busy spins on vcpu V's response for several seconds, until RCU watchdog NMIs all vCPUs. To fix this, use the start_assignment kvm_x86_ops callback to kick vcpus out of the halt loop, so the notification vector is properly reprogrammed to the wakeup vector. Reported-by: Pei Zhang <pezhang@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20210526172014.GA29007@fuller.cnet> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:58:23 -04:00
Marcelo Tosatti	084071d5e9	KVM: rename KVM_REQ_PENDING_TIMER to KVM_REQ_UNBLOCK KVM_REQ_UNBLOCK will be used to exit a vcpu from its inner vcpu halt emulation loop. Rename KVM_REQ_PENDING_TIMER to KVM_REQ_UNBLOCK, switch PowerPC to arch specific request bit. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20210525134321.303768132@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:57:38 -04:00
Marcelo Tosatti	57ab87947a	KVM: x86: add start_assignment hook to kvm_x86_ops Add a start_assignment hook to kvm_x86_ops, which is called when kvm_arch_start_assignment is done. The hook is required to update the wakeup vector of a sleeping vCPU when a device is assigned to the guest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Message-Id: <20210525134321.254128742@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:50:13 -04:00
Wanpeng Li	9805cf03fd	KVM: LAPIC: Narrow the timer latency between wait_lapic_expire and world switch Let's treat lapic_timer_advance_ns automatic tuning logic as hypervisor overhead, move it before wait_lapic_expire instead of between wait_lapic_expire and the world switch, the wait duration should be calculated by the up-to-date guest_tsc after the overhead of automatic tuning logic. This patch reduces ~30+ cycles for kvm-unit-tests/tscdeadline-latency when testing busy waits. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1621339235-11131-5-git-send-email-wanpengli@tencent.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:58 -04:00
Paolo Bonzini	fb0f94794b	selftests: kvm: do only 1 memslot_perf_test run by default The test takes a long time with the current implementation of memslots, so cut the run time a bit. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:57 -04:00
Joe Richey	fb1070d18e	KVM: X86: Use _BITUL() macro in UAPI headers Replace BIT() in KVM's UPAI header with _BITUL(). BIT() is not defined in the UAPI headers and its usage may cause userspace build errors. Fixes: `fb04a1eddb` ("KVM: X86: Implement ring-based dirty memory tracking") Signed-off-by: Joe Richey <joerichey@google.com> Message-Id: <20210521085849.37676-3-joerichey94@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:57 -04:00
Axel Rasmussen	33090a884d	KVM: selftests: add shared hugetlbfs backing source type This lets us run the demand paging test on top of a shared hugetlbfs-backed area. The "shared" is key, as this allows us to exercise userfaultfd minor faults on hugetlbfs. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-11-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:57 -04:00
Axel Rasmussen	a4b9722a59	KVM: selftests: allow using UFFD minor faults for demand paging UFFD handling of MINOR faults is a new feature whose use case is to speed up demand paging (compared to MISSING faults). So, it's interesting to let this selftest exercise this new mode. Modify the demand paging test to have the option of using UFFD minor faults, as opposed to missing faults. Now, when turning on userfaultfd with '-u', the desired mode has to be specified ("MISSING" or "MINOR"). If we're in minor mode, before registering, prefault via the alias. This way, the guest will trigger minor faults, instead of missing faults, and we can UFFDIO_CONTINUE to resolve them. Modify the page fault handler function to use the right ioctl depending on the mode we're running in. In MINOR mode, use UFFDIO_CONTINUE. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-10-axelrasmussen@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:57 -04:00
Axel Rasmussen	94f3f2b31a	KVM: selftests: create alias mappings when using shared memory When a memory region is added with a src_type specifying that it should use some kind of shared memory, also create an alias mapping to the same underlying physical pages. And, add an API so tests can get access to these alias addresses. Basically, for a guest physical address, let us look up the analogous host alias address. In a future commit, we'll modify the demand paging test to take advantage of this to exercise UFFD minor faults. The idea is, we pre-fault the underlying pages via the alias. When the guest faults, it gets a "minor" fault (PTEs don't exist yet, but a page is already in the page cache). Then, the userfaultfd theads can handle the fault: they could potentially modify the underlying memory via the alias if they wanted to, and then they install the PTEs and let the guest carry on via a UFFDIO_CONTINUE ioctl. Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-9-axelrasmussen@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:56 -04:00
Axel Rasmussen	c9befd5958	KVM: selftests: add shmem backing source type This lets us run the demand paging test on top of a shmem-backed area. In follow-up commits, we'll 1) leverage this new capability to create an alias mapping, and then 2) use the alias mapping to exercise UFFD minor faults. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-8-axelrasmussen@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:56 -04:00
Axel Rasmussen	b3784bc28c	KVM: selftests: refactor vm_mem_backing_src_type flags Each struct vm_mem_backing_src_alias has a flags field, which denotes the flags used to mmap() an area of that type. Previously, this field never included MAP_PRIVATE \| MAP_ANONYMOUS, because vm_userspace_mem_region_add assumed that all types would always use those flags, and so it hardcoded them. In a follow-up commit, we'll add a new type: shmem. Areas of this type must not have MAP_PRIVATE \| MAP_ANONYMOUS, and instead they must have MAP_SHARED. So, refactor things. Make it so that the flags field of struct vm_mem_backing_src_alias really is a complete set of flags, and don't add in any extras in vm_userspace_mem_region_add. This will let us easily tack on shmem. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-7-axelrasmussen@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:56 -04:00
Axel Rasmussen	0368c2c1b4	KVM: selftests: allow different backing source types Add an argument which lets us specify a different backing memory type for the test. The default is just to use anonymous, matching existing behavior. This is in preparation for testing UFFD minor faults. For that, we'll need to use a new backing memory type which is setup with MAP_SHARED. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-6-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:56 -04:00
Axel Rasmussen	32ffa4f71e	KVM: selftests: compute correct demand paging size This is a preparatory commit needed before we can use different kinds of backing pages for guest memory. Previously, we used perf_test_args.host_page_size, which is the host's native page size (commonly 4K). For VM_MEM_SRC_ANONYMOUS this turns out to be okay, but in a follow-up commit we want to allow using different kinds of backing memory. Take VM_MEM_SRC_ANONYMOUS_HUGETLB for example. Without this change, if we used that backing page type, when we issued a UFFDIO_COPY ioctl we'd only do so with 4K, rather than the full 2M of a backing hugepage. In this case, UFFDIO_COPY returns -EINVAL (__mcopy_atomic_hugetlb checks the size). Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-5-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:55 -04:00
Axel Rasmussen	25408e5a02	KVM: selftests: simplify setup_demand_paging error handling A small cleanup. Our caller writes: r = setup_demand_paging(...); if (r < 0) exit(-r); Since we're just going to exit anyway, instead of returning an error we can just re-use TEST_ASSERT. This makes the caller simpler, as well as the function itself - no need to write our branches, etc. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-3-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:55 -04:00
David Matlack	2aab4b355c	KVM: selftests: Print a message if /dev/kvm is missing If a KVM selftest is run on a machine without /dev/kvm, it will exit silently. Make it easy to tell what's happening by printing an error message. Opportunistically consolidate all codepaths that open /dev/kvm into a single function so they all print the same message. This slightly changes the semantics of vm_is_unrestricted_guest() by changing a TEST_ASSERT() to exit(KSFT_SKIP). However vm_is_unrestricted_guest() is only called in one place (x86_64/mmio_warning_test.c) and that is to determine if the test should be skipped or not. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210511202120.1371800-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:55 -04:00
Axel Rasmussen	c887d6a126	KVM: selftests: trivial comment/logging fixes Some trivial fixes I found while touching related code in this series, factored out into a separate commit for easier reviewing: - s/gor/got/ and add a newline in demand_paging_test.c - s/backing_src/src_type/ in a comment to be consistent with the real function signature in kvm_util.c Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-2-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:55 -04:00
David Matlack	a10453c038	KVM: selftests: Fix hang in hardware_disable_test If /dev/kvm is not available then hardware_disable_test will hang indefinitely because the child process exits before posting to the semaphore for which the parent is waiting. Fix this by making the parent periodically check if the child has exited. We have to be careful to forward the child's exit status to preserve a KSFT_SKIP status. I considered just checking for /dev/kvm before creating the child process, but there are so many other reasons why the child could exit early that it seemed better to handle that as general case. Tested: $ ./hardware_disable_test /dev/kvm not available, skipping test $ echo $? 4 $ modprobe kvm_intel $ ./hardware_disable_test $ echo $? 0 Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210514230521.2608768-1-dmatlack@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:55 -04:00
David Matlack	50bc913d52	KVM: selftests: Ignore CPUID.0DH.1H in get_cpuid_test Similar to CPUID.0DH.0H this entry depends on the vCPU's XCR0 register and IA32_XSS MSR. Since this test does not control for either before assigning the vCPU's CPUID, these entries will not necessarily match the supported CPUID exposed by KVM. This fixes get_cpuid_test on Cascade Lake CPUs. Suggested-by: Jim Mattson <jmattson@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210519211345.3944063-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:54 -04:00
David Matlack	ef4c9f4f65	KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn() vm_get_max_gfn() casts vm->max_gfn from a uint64_t to an unsigned int, which causes the upper 32-bits of the max_gfn to get truncated. Nobody noticed until now likely because vm_get_max_gfn() is only used as a mechanism to create a memslot in an unused region of the guest physical address space (the top), and the top of the 32-bit physical address space was always good enough. This fix reveals a bug in memslot_modification_stress_test which was trying to create a dummy memslot past the end of guest physical memory. Fix that by moving the dummy memslot lower. Fixes: `52200d0d94` ("KVM: selftests: Remove duplicate guest mode handling") Reviewed-by: Venkatesh Srinivas <venkateshs@chromium.org> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210521173828.1180619-1-dmatlack@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:54 -04:00
Maciej S. Szmigiero	cad347fab1	KVM: selftests: add a memslot-related performance benchmark This benchmark contains the following tests: * Map test, where the host unmaps guest memory while the guest writes to it (maps it). The test is designed in a way to make the unmap operation on the host take a negligible amount of time in comparison with the mapping operation in the guest. The test area is actually split in two: the first half is being mapped by the guest while the second half in being unmapped by the host. Then a guest <-> host sync happens and the areas are reversed. * Unmap test which is broadly similar to the above map test, but it is designed in an opposite way: to make the mapping operation in the guest take a negligible amount of time in comparison with the unmap operation on the host. This test is available in two variants: with per-page unmap operation or a chunked one (using 2 MiB chunk size). * Move active area test which involves moving the last (highest gfn) memslot a bit back and forth on the host while the guest is concurrently writing around the area being moved (including over the moved memslot). * Move inactive area test which is similar to the previous move active area test, but now guest writes all happen outside of the area being moved. * Read / write test in which the guest writes to the beginning of each page of the test area while the host writes to the middle of each such page. Then each side checks the values the other side has written. This particular test is not expected to give different results depending on particular memslots implementation, it is meant as a rough sanity check and to provide insight on the spread of test results expected. Each test performs its operation in a loop until a test period ends (this is 5 seconds by default, but it is configurable). Then the total count of loops done is divided by the actual elapsed time to give the test result. The tests have a configurable memslot cap with the "-s" test option, by default the system maximum is used. Each test is repeated a particular number of times (by default 20 times), the best result achieved is printed. The test memory area is divided equally between memslots, the reminder is added to the last memslot. The test area size does not depend on the number of memslots in use. The tests also measure the time that it took to add all these memslots. The best result from the tests that use the whole test area is printed after all the requested tests are done. In general, these tests are designed to use as much memory as possible (within reason) while still doing 100+ loops even on high memslot counts with the default test length. Increasing the test runtime makes it increasingly more likely that some event will happen on the system during the test run, which might lower the test result. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <8d31bb3d92bc8fa33a9756fa802ee14266ab994e.1618253574.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:54 -04:00
Maciej S. Szmigiero	22721a5610	KVM: selftests: Keep track of memslots more efficiently The KVM selftest framework was using a simple list for keeping track of the memslots currently in use. This resulted in lookups and adding a single memslot being O(n), the later due to linear scanning of the existing memslot set to check for the presence of any conflicting entries. Before this change, benchmarking high count of memslots was more or less impossible as pretty much all the benchmark time was spent in the selftest framework code. We can simply use a rbtree for keeping track of both of gfn and hva. We don't need an interval tree for hva here as we can't have overlapping memslots because we allocate a completely new memory chunk for each new memslot. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <b12749d47ee860468240cf027412c91b76dbe3db.1618253574.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:54 -04:00
Paolo Bonzini	a13534d667	selftests: kvm: fix potential issue with ELF loading vm_vaddr_alloc() sets up GVA to GPA mapping page by page; therefore, GPAs may not be continuous if same memslot is used for data and page table allocation. kvm_vm_elf_load() however expects a continuous range of HVAs (and thus GPAs) because it does not try to read file data page by page. Fix this mismatch by allocating memory in one step. Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:53 -04:00
Zhenzhong Duan	39fe2fc966	selftests: kvm: make allocation of extra memory take effect The extra memory pages is missed to be allocated during VM creating. perf_test_util and kvm_page_table_test use it to alloc extra memory currently. Fix it by adding extra_mem_pages to the total memory calculation before allocate. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20210512043107.30076-1-zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-05-27 07:45:53 -04:00

1 2 3 4 5 ...

1014581 Commits