Commit Graph

1199791 Commits

Author SHA1 Message Date
Kashyap Desai
64b632654b RDMA/bnxt_re: Initialize dpi_tbl_lock mutex
Fix the missing dpi_tbl_lock mutex initialization.

Fixes: 0ac20faf5d ("RDMA/bnxt_re: Reorg the bar mapping")
Link: https://lore.kernel.org/r/1691642677-21369-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-08-10 16:35:54 -03:00
Kalesh AP
5ac8480ae4 RDMA/bnxt_re: Fix error handling in probe failure path
During bnxt_re_dev_init(), when bnxt_re_setup_chip_ctx() fails unregister
with L2 first before bailing out probe.

Fixes: ae8637e131 ("RDMA/bnxt_re: Add chip context to identify 57500 series")
Link: https://lore.kernel.org/r/1691642677-21369-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-08-10 16:35:54 -03:00
Selvin Xavier
5363fc488d RDMA/bnxt_re: Properly order ib_device_unalloc() to avoid UAF
ib_dealloc_device() should be called only after device cleanup.  Fix the
dealloc sequence.

Fixes: 6d758147c7 ("RDMA/bnxt_re: Use auxiliary driver interface")
Link: https://lore.kernel.org/r/1691642677-21369-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-08-10 16:35:54 -03:00
Junxian Huang
8e7b295da1 MAINTAINERS: Remove maintainer of HiSilicon RoCE
Haoyue no longer maintains the Hisilicon RoCE driver. So remove him
from MAINTAINERS.

Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://lore.kernel.org/r/20230807064228.4032536-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-08-08 21:36:31 +03:00
Douglas Miller
4fdfaef71f IB/hfi1: Fix possible panic during hotplug remove
During hotplug remove it is possible that the update counters work
might be pending, and may run after memory has been freed.
Cancel the update counters work before freeing memory.

Fixes: 7724105686 ("IB/hfi1: add driver files")
Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/169099756100.3927190.15284930454106475280.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-08-03 21:13:57 +03:00
Michael Guralnik
186b169cf1 RDMA/umem: Set iova in ODP flow
Fixing the ODP registration flow to set the iova correctly.
The calculation in ib_umem_num_dma_blocks() function assumes the iova of
the umem is set correctly.

When iova is not set, the calculation in ib_umem_num_dma_blocks() is
equivalent to length/page_size, which is true only when memory is aligned.
For unaligned memory, iova must be set for the ALIGN() in the
ib_umem_num_dma_blocks() to take effect and return a correct value.

mlx5_ib uses ib_umem_num_dma_blocks() to decide the mkey size to use for
the MR. Without this fix, when registering unaligned ODP MR, a wrong
size mkey might be chosen and this might cause the UMR to fail.

UMR would fail over insufficient size to update the mkey translation:
infiniband mlx5_0: dump_cqe:273:(pid 0): dump error cqe
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 0f 00 78 06 25 00 00 58 00 da ac d2
infiniband mlx5_0: mlx5_ib_post_send_wait:806:(pid 20311): reg umr
failed (6)
infiniband mlx5_0: pagefault_real_mr:661:(pid 20311): Failed to update
mkey page tables

Fixes: f0093fb1a7 ("RDMA/mlx5: Move mlx5_ib_cont_pages() to the creation of the mlx5_ib_mr")
Fixes: a665aca89a ("RDMA/umem: Split ib_umem_num_pages() into ib_umem_num_dma_blocks()")
Signed-off-by: Artemy Kovalyov <artemyko@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/3d4be7ca2155bf239dd8c00a2d25974a92c26ab8.1689757344.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-31 11:38:38 +03:00
Sindhu Devale
ae463563b7 RDMA/irdma: Report correct WC error
Report the correct WC error if a MW bind is performed
on an already valid/bound window.

Fixes: 44d9e52977 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155439.1057-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26 14:58:42 +03:00
Sindhu Devale
3bfb25fa2b RDMA/irdma: Fix op_type reporting in CQEs
The op_type field CQ poll info structure is incorrectly
filled in with the queue type as opposed to the op_type
received in the CQEs. The wrong opcode could be decoded
and returned to the ULP.

Copy the op_type field received in the CQE in the CQ poll
info structure.

Fixes: 24419777e9 ("RDMA/irdma: Fix RQ completion opcode")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155439.1057-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26 14:58:42 +03:00
Christophe JAILLET
5c719d7aef RDMA/rxe: Fix an error handling path in rxe_bind_mw()
All errors go to the error handling path, except this one. Be consistent
and also branch to it.

Fixes: 02ed253770 ("RDMA/rxe: Introduce rxe access supported flags")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/43698d8a3ed4e720899eadac887427f73d7ec2eb.1689623735.git.christophe.jaillet@wanadoo.fr
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-18 15:22:43 +03:00
Selvin Xavier
29900bf351 RDMA/bnxt_re: Fix hang during driver unload
Driver unload hits a hang during stress testing of load/unload.

stack trace snippet -

tasklet_kill at ffffffff9aabb8b2
bnxt_qplib_nq_stop_irq at ffffffffc0a805fb [bnxt_re]
bnxt_qplib_disable_nq at ffffffffc0a80c5b [bnxt_re]
bnxt_re_dev_uninit at ffffffffc0a67d15 [bnxt_re]
bnxt_re_remove_device at ffffffffc0a6af1d [bnxt_re]

tasklet_kill can hang if the tasklet is scheduled after it is disabled.

Modified the sequences to disable the interrupt first and synchronize
irq before disabling the tasklet.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1689322969-25402-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:02:13 +03:00
Kashyap Desai
b5bbc65512 RDMA/bnxt_re: Prevent handling any completions after qp destroy
HW may generate completions that indicates QP is destroyed.
Driver should not be scheduling any more completion handlers
for this QP, after the QP is destroyed. Since CQs are active
during the QP destroy, driver may still schedule completion
handlers. This can cause a race where the destroy_cq and poll_cq
running simultaneously.

Snippet of kernel panic while doing bnxt_re driver load unload in loop.
This indicates a poll after the CQ is freed. 

[77786.481636] Call Trace:
[77786.481640]  <TASK>
[77786.481644]  bnxt_re_poll_cq+0x14a/0x620 [bnxt_re]
[77786.481658]  ? kvm_clock_read+0x14/0x30
[77786.481693]  __ib_process_cq+0x57/0x190 [ib_core]
[77786.481728]  ib_cq_poll_work+0x26/0x80 [ib_core]
[77786.481761]  process_one_work+0x1e5/0x3f0
[77786.481768]  worker_thread+0x50/0x3a0
[77786.481785]  ? __pfx_worker_thread+0x10/0x10
[77786.481790]  kthread+0xe2/0x110
[77786.481794]  ? __pfx_kthread+0x10/0x10
[77786.481797]  ret_from_fork+0x2c/0x50

To avoid this, complete all completion handlers before returning the
destroy QP. If free_cq is called soon after destroy_qp,  IB stack
will cancel the CQ work before invoking the destroy_cq verb and
this will prevent any race mentioned.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1689322969-25402-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:02:13 +03:00
Thomas Bogendoerfer
dc52aadbc1 RDMA/mthca: Fix crash when polling CQ for shared QPs
Commit 21c2fe94ab ("RDMA/mthca: Combine special QP struct with mthca QP")
introduced a new struct mthca_sqp which doesn't contain struct mthca_qp
any longer. Placing a pointer of this new struct into qptable leads
to crashes, because mthca_poll_one() expects a qp pointer. Fix this
by putting the correct pointer into qptable.

Fixes: 21c2fe94ab ("RDMA/mthca: Combine special QP struct with mthca QP")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:02:13 +03:00
Shiraz Saleem
0e15863015 RDMA/core: Update CMA destination address on rdma_resolve_addr
8d037973d4 ("RDMA/core: Refactor rdma_bind_addr") intoduces as regression
on irdma devices on certain tests which uses rdma CM, such as cmtime.

No connections can be established with the MAD QP experiences a fatal
error on the active side.

The cma destination address is not updated with the dst_addr when ULP
on active side calls rdma_bind_addr followed by rdma_resolve_addr.
The id_priv state is 'bound' in resolve_prepare_src and update is skipped.

This leaves the dgid passed into irdma driver to create an Address Handle
(AH) for the MAD QP at 0. The create AH descriptor as well as the ARP cache
entry is invalid and HW throws an asynchronous events as result.

[ 1207.656888] resolve_prepare_src caller: ucma_resolve_addr+0xff/0x170 [rdma_ucm] daddr=200.0.4.28 id_priv->state=7
[....]
[ 1207.680362] ice 0000:07:00.1 rocep7s0f1: caller: irdma_create_ah+0x3e/0x70 [irdma] ah_id=0 arp_idx=0 dest_ip=0.0.0.0
destMAC=00:00:64:ca:b7:52 ipvalid=1 raw=0000:0000:0000:0000:0000:ffff:0000:0000
[ 1207.682077] ice 0000:07:00.1 rocep7s0f1: abnormal ae_id = 0x401 bool qp=1 qp_id = 1, ae_src=5
[ 1207.691657] infiniband rocep7s0f1: Fatal error (1) on MAD QP (1)

Fix this by updating the CMA destination address when the ULP calls
a resolve address with the CM state already bound.

Fixes: 8d037973d4 ("RDMA/core: Refactor rdma_bind_addr")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230712234133.1343-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:02:13 +03:00
Shiraz Saleem
f0842bb3d3 RDMA/irdma: Fix data race on CQP request done
KCSAN detects a data race on cqp_request->request_done memory location
which is accessed locklessly in irdma_handle_cqp_op while being
updated in irdma_cqp_ce_handler.

Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid any
compiler optimizations like load fusing and/or KCSAN warning.

[222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma]

[222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5:
[222808.417610]  irdma_cqp_ce_handler+0x21e/0x270 [irdma]
[222808.417725]  cqp_compl_worker+0x1b/0x20 [irdma]
[222808.417827]  process_one_work+0x4d1/0xa40
[222808.417835]  worker_thread+0x319/0x700
[222808.417842]  kthread+0x180/0x1b0
[222808.417852]  ret_from_fork+0x22/0x30

[222808.417918] read to 0xffff8e44107019dc of 1 bytes by task 29688 on cpu 1:
[222808.417995]  irdma_wait_event+0x1e2/0x2c0 [irdma]
[222808.418099]  irdma_handle_cqp_op+0xae/0x170 [irdma]
[222808.418202]  irdma_cqp_cq_destroy_cmd+0x70/0x90 [irdma]
[222808.418308]  irdma_puda_dele_rsrc+0x46d/0x4d0 [irdma]
[222808.418411]  irdma_rt_deinit_hw+0x179/0x1d0 [irdma]
[222808.418514]  irdma_ib_dealloc_device+0x11/0x40 [irdma]
[222808.418618]  ib_dealloc_device+0x2a/0x120 [ib_core]
[222808.418823]  __ib_unregister_device+0xde/0x100 [ib_core]
[222808.418981]  ib_unregister_device+0x22/0x40 [ib_core]
[222808.419142]  irdma_ib_unregister_device+0x70/0x90 [irdma]
[222808.419248]  i40iw_close+0x6f/0xc0 [irdma]
[222808.419352]  i40e_client_device_unregister+0x14a/0x180 [i40e]
[222808.419450]  i40iw_remove+0x21/0x30 [irdma]
[222808.419554]  auxiliary_bus_remove+0x31/0x50
[222808.419563]  device_remove+0x69/0xb0
[222808.419572]  device_release_driver_internal+0x293/0x360
[222808.419582]  driver_detach+0x7c/0xf0
[222808.419592]  bus_remove_driver+0x8c/0x150
[222808.419600]  driver_unregister+0x45/0x70
[222808.419610]  auxiliary_driver_unregister+0x16/0x30
[222808.419618]  irdma_exit_module+0x18/0x1e [irdma]
[222808.419733]  __do_sys_delete_module.constprop.0+0x1e2/0x310
[222808.419745]  __x64_sys_delete_module+0x1b/0x30
[222808.419755]  do_syscall_64+0x39/0x90
[222808.419763]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

[222808.419829] value changed: 0x01 -> 0x03

Fixes: 915cc7ac0f ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:02:00 +03:00
Shiraz Saleem
f2c3037811 RDMA/irdma: Fix data race on CQP completion stats
CQP completion statistics is read lockesly in irdma_wait_event and
irdma_check_cqp_progress while it can be updated in the completion
thread irdma_sc_ccq_get_cqe_info on another CPU as KCSAN reports.

Make completion statistics an atomic variable to reflect coherent updates
to it. This will also avoid load/store tearing logic bug potentially
possible by compiler optimizations.

[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]

[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4:
[77346.171483]  irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma]
[77346.171658]  irdma_cqp_ce_handler+0x164/0x270 [irdma]
[77346.171835]  cqp_compl_worker+0x1b/0x20 [irdma]
[77346.172009]  process_one_work+0x4d1/0xa40
[77346.172024]  worker_thread+0x319/0x700
[77346.172037]  kthread+0x180/0x1b0
[77346.172054]  ret_from_fork+0x22/0x30

[77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2:
[77346.172234]  irdma_handle_cqp_op+0xf4/0x4b0 [irdma]
[77346.172413]  irdma_cqp_aeq_cmd+0x75/0xa0 [irdma]
[77346.172592]  irdma_create_aeq+0x390/0x45a [irdma]
[77346.172769]  irdma_rt_init_hw.cold+0x212/0x85d [irdma]
[77346.172944]  irdma_probe+0x54f/0x620 [irdma]
[77346.173122]  auxiliary_bus_probe+0x66/0xa0
[77346.173137]  really_probe+0x140/0x540
[77346.173154]  __driver_probe_device+0xc7/0x220
[77346.173173]  driver_probe_device+0x5f/0x140
[77346.173190]  __driver_attach+0xf0/0x2c0
[77346.173208]  bus_for_each_dev+0xa8/0xf0
[77346.173225]  driver_attach+0x29/0x30
[77346.173240]  bus_add_driver+0x29c/0x2f0
[77346.173255]  driver_register+0x10f/0x1a0
[77346.173272]  __auxiliary_driver_register+0xbc/0x140
[77346.173287]  irdma_init_module+0x55/0x1000 [irdma]
[77346.173460]  do_one_initcall+0x7d/0x410
[77346.173475]  do_init_module+0x81/0x2c0
[77346.173491]  load_module+0x1232/0x12c0
[77346.173506]  __do_sys_finit_module+0x101/0x180
[77346.173522]  __x64_sys_finit_module+0x3c/0x50
[77346.173538]  do_syscall_64+0x39/0x90
[77346.173553]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

[77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095

Fixes: 915cc7ac0f ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:01:46 +03:00
Shiraz Saleem
4984eb5145 RDMA/irdma: Add missing read barriers
On code inspection, there are many instances in the driver where
CEQE and AEQE fields written to by HW are read without guaranteeing
that the polarity bit has been read and checked first.

Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields
prior to checking the polarity bit.

Fixes: 3f49d68425 ("RDMA/irdma: Implement HW Admin Queue OPs")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:01:22 +03:00
Dan Carpenter
d64b1ee12a RDMA/mlx4: Make check for invalid flags stricter
This code is trying to ensure that only the flags specified in the list
are allowed.  The problem is that ucmd->rx_hash_fields_mask is a u64 and
the flags are an enum which is treated as a u32 in this context.  That
means the test doesn't check whether the highest 32 bits are zero.

Fixes: 4d02ebd9bb ("IB/mlx4: Fix RSS hash fields restrictions")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/233ed975-982d-422a-b498-410f71d8a101@moroto.mountain
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-12 15:41:27 +03:00
Linus Torvalds
06c2afb862 Linux 6.5-rc1 2023-07-09 13:53:13 -07:00
Linus Torvalds
c192ac7357 MAINTAINERS 2: Electric Boogaloo
We just sorted the entries and fields last release, so just out of a
perverse sense of curiosity, I decided to see if we can keep things
ordered for even just one release.

The answer is "No. No we cannot".

I suggest that all kernel developers will need weekly training sessions,
involving a lot of Big Bird and Sesame Street.  And at the yearly
maintainer summit, we will all sing the alphabet song together.

I doubt I will keep doing this.  At some point "perverse sense of
curiosity" turns into just a cold dark place filled with sadness and
despair.

Repeats: 80e62bc848 ("MAINTAINERS: re-sort all entries and fields")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-09 10:29:53 -07:00
Linus Torvalds
f71f64210d dma-mapping fixes for Linux 6.5
- swiotlb area sizing fixes (Petr Tesarik)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmSq57sLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOh0xAAkwklIxxzXvNlgjvy2hdgPWImPS8tGPDSIsqA9TDD
 WDZq89nz/ndnchdPObDvyJXmfBgqa0qCHqopBVPqMKv5a1pKZhrRXYlbajGQQwji
 MIqefTLZ/VGw7bDpEivOt+yadwQ3KxuxWWs7/JPKLLReSJ22H8P+jkrK7P7kBL5X
 YaMtZG9d86fvFHnQHKdAOlF1iCvnoZDHPcvaZbI6m5mMSZ+HIYqK5pP1MTUEAbIU
 MX4ZSI7/mL0q67+kZuM/NCSeq1pH0Cd0D2DGm+8k/y87G81GS6E5Wgg+xO7vYiXf
 5ChuwlAO9K9KhH7NIRkKhkad/Ii89ENXSyv5gmPRoKYK5FXajnHSlJTUrZojV6XC
 Pbsd9ATVzV0rY61EPyh6G1a+Ttp/pwMp+W0I2fi032GVAePQ/vhB9x9O+2O/3QiC
 v80nUSatkciZncWqkguhp3NONsRmLKep3CCQnEAA/gLs27B0avdQeslnqbOOUQKd
 Si+djIoel8ONjQ+mW8eFCsVYMH1xFSo0aGsgGe0y2cyBE3DN1AW9eRnOXWT4C1PR
 UyDlx8ACs87ojec+YRQFYk2/PbsU7CQiH1pteXvBHcbhiVUAvrtXtg6ANQ+7066P
 IIduAZmlHcHb1BhyrSQbAtRllVLIp/l9IAkCSY9SvL0tjh/B5CaRBD5m2Taow5I/
 KUI=
 =4Lfc
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:

 - swiotlb area sizing fixes (Petr Tesarik)

* tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: reduce the number of areas to match actual memory pool size
  swiotlb: always set the number of areas before allocating the pool
2023-07-09 10:24:22 -07:00
Linus Torvalds
a9943ad3dd - Optimize IRQ domain's name assignment
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSqa70ACgkQEsHwGGHe
 VUrTlhAAtfvA+mCeWVwn42827YlJeMvFV6RedERSk8x5CO/IJgHx2YFB0UXlznuM
 xQSIC7pkY2He62ggb+s6m+Pb1M3bxxJDaRSRnQGHOqXuSNuzokqjwvVMaSBE9s5W
 fgvdbZ4CDEynW7awiDq1PaMcJFzbc6RMbBDspNDke2ikZgGvqqWl3Gx2DHhGyNfQ
 UFJHETejxhfheU7Y9c06P2ToTyzc0RcaPFQfFmYIse+c6iTcjnqr9NRHCIklSygN
 F3VjSEXC1MkAZIAuV9pjkl3z01xfvDe1WqxH3jID/vM97AY08DqmBQC31VZOBmmV
 shndEUpc2yi7ued0WU1XLf1sMyEHm5rgF/19gRhFR9sxu4Zm9+JxOofo0URwqtsV
 Z3d0CxpDrF0ATJs2zTstUmOG0AD6SBfC+BQZi3M6rohXgR0z04dBnyTgCnxAs60u
 d5byFza5nZ/sX7KZXgVlKfH5ej0AcZuyFM/qMKqiTqzz5C4dv0Hq7y718aaecxmz
 W9A3olDGhC48pr9250TthEMhBzRdXvKgKQBUhC2TY/0hUBHyvmHeU//ql54XYhgh
 XVx3qI7yieZC3d27hZN0WTCG6qSCJoq8LROmGDIbx16Uj/cm1AHzJTJ+5CbCOafZ
 41gPBoNCNYNXlgYNBXpVcPRtfcK98fGl3jpPOk0llaSa6IW+Lcc=
 =ATsc
 -----END PGP SIGNATURE-----

Merge tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq update from Borislav Petkov:

 - Optimize IRQ domain's name assignment

* tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Use return value of strreplace()
2023-07-09 10:16:04 -07:00
Linus Torvalds
51e3d7c274 - Do FPU AP initialization on Xen PV too which got missed by the recent
boot reordering work
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSqal4ACgkQEsHwGGHe
 VUr+pg/+JyqfzyymWYAaPUfwaFH7V8425p8thrZL+OSnDDoAZt5UnPpLB4lYZKWW
 u2SlphNSLhuclZ7Wly1zkkPO1J8O88FRCFFBxONtnrQ4WqH2P7f2E6cHgzD4dQRF
 RX/pNuLQ1TNYiOHvNvJ3xJvVdAGrcXFBbqupfSig+dQMBKIyuzGu/Jn7Cm0Q+HJK
 j9WJWiGNJ+f8WOEbHiTdI89OFcPmUMe2nhtK/I/QIUoCBIiyp3jQ2RilZwY2V7Wu
 U5kSQChqp7N+e275TLlOCFGvNW2htCZ5GPc2/nCOkfmnTDTwjVGX8jQr+EqC1pj1
 WcueoTjBMw2Drs4/V9ItkGXYqmUE4CK03nGp6uZ2hA5Qo8mSAdzr59A3+I7BbHur
 ulbm1i6ZZ0ip9Co080E0JS0F1CIL7ROIQ6HDQz4BUGQ1BbmIhNBmdj7yBJ20nTrr
 L7EmwgDsOF2NhKpg5USGrPxJWBvc9ma72CAlHAiPVUgzFIR6Z5DN9TM8aWgZZPDt
 RULC1/L/SI2FQmrMnCYhjO7Om0qJFk422cWCVjOA3D/lRo3toFEJ/XopxxXz9FZs
 guAIJuFLjDun13hxS9PCGvRCkg2cdVsCykkg1ydAbg2ux99rPDAmmnwYPG7pvxiP
 2W0gq43dbQAZlYjRx3gV5sHpUtPCsF+1Lz5jXkldRZJNXD1v1Fk=
 =RZFV
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu fix from Borislav Petkov:

 - Do FPU AP initialization on Xen PV too which got missed by the recent
   boot reordering work

* tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/xen: Fix secondary processors' FPU initialization
2023-07-09 10:13:32 -07:00
Linus Torvalds
e3da8db055 A single fix for the mechanism to park CPUs with an INIT IPI.
On shutdown or kexec, the kernel tries to park the non-boot CPUs with an
 INIT IPI. But the same code path is also used by the crash utility. If the
 CPU which panics is not the boot CPU then it sends an INIT IPI to the boot
 CPU which resets the machine. Prevent this by validating that the CPU which
 runs the stop mechanism is the boot CPU. If not, leave the other CPUs in
 HLT.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmSqoEcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYNSEACwo5zgibek27qeMvJGfNztm0qRa4mw
 wN0qV31yaNcEfhqL8bMU8n3wvEA+pZBqhaU5fyalY+yxc29jI/j9eda5zR+Fi9e5
 kVyFT2M0rVSDLFraoQeD+T/tSSK2MJtswF12ytY5mHzHMCb6Uy9fNCpUiQlB+i81
 AcnlKQk9ifZXFdMJPj5E+E6l776T8NZPoYEdFgJloxaYOGTdFJDWDlryx4LD7Urz
 Fx/ec8Ug/FYSPl2XzXHugvHjNefxKoomcZ3v3CSZonBcav7Gz6F06HAR5vVRWSHx
 4Dlh6zdy+60YKBmkvpb+RJIBMo8aXclwT+tntaoJvGHZ+PNASO6JVz9PvmoNgfWK
 Oy2n1K687qIOY6d+yxUZgbZpwXX5bG6kc0xbicUNigGagrYTfd83G5RAfwxNkqsY
 23Qw4Ue8uxve4M8iM/FfxKIShuDBiLCIDDIrWDjEkvIAnr1pd+NPUv7kqOTI7Kz/
 srNgcOwalypzuS93lgaN1yjRv1mmaPXhdhjy0DwGbC54bKgNzfq+7z75Ibn0dSFF
 JUPFVjztB+ymnM6PJ1dR77SvPi+xOi60nw7L+Qu9US4yKkW0NeGiIWVsggNorbU6
 UPFSE5gxwFD0w1EZ9W+IDeOZUNhjJUINZsn8txm+tb+oEqTIGRPHPOo0C1dBmLW9
 AmDIeHljj0iWIw==
 =DOCF
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Thomas Gleixner:
 "A single fix for the mechanism to park CPUs with an INIT IPI.

  On shutdown or kexec, the kernel tries to park the non-boot CPUs with
  an INIT IPI. But the same code path is also used by the crash utility.
  If the CPU which panics is not the boot CPU then it sends an INIT IPI
  to the boot CPU which resets the machine.

  Prevent this by validating that the CPU which runs the stop mechanism
  is the boot CPU. If not, leave the other CPUs in HLT"

* tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Don't send INIT to boot CPU
2023-07-09 10:08:38 -07:00
Linus Torvalds
74099e2034 - fixes for KVM
- fix for loongson build and cpu probing
 - DT fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmSqY18aHHRzYm9nZW5k
 QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHBYrw//VbewcC/3lYDaS2SDGIs1
 2f6DGshl7WeY/Y3IBA5xJXZHoK7AgwmC4OtPw+Ge4ZatD4mdW2B2aulIzRcm9yJV
 8EXNBDs7LYiKb7vG7ImYS38eykm/9dQBeK0gDsKA0eVWIKXgCCuDS78INnYFUwUk
 ZOJ8y2UkDxCOjLzhND3cDVWZEL2ZNq3XzPhFRaCrdnJTV3GgQs6s1f1A5t3iszNx
 4JQ4OM3z+wFNZj1CcWxs/lvMfftaXX5Osyyt6Pgy0gSou7ZrdWGlgIu1QDu9agq5
 o1+hXGYDUlnWsmQoUa8OA4+jhQs/XQYwMLs/sbSH9/1iGVPYDiWuw/oDPlesx4/u
 LtIk85SIL2todeD94b4w2UJvQjuaUgl/WzTF1CfAziX2yjs1zaHQaWXblnNzINLS
 eUpqFDbfPD0yzwjyfC52lNGbSKqOtBLr83GGzPrpp0OZDEI3rjwigARYcLXfh8SR
 KAgWRfYsrTi1efREdyAUlgsyys0yyCVTW5IqpKb6paSrIUIMpF3QE1+fl0Y3XuNK
 YByUyxJVYwWhljBXrNlaVcqdxtz5eZqeVQf0kus7AeHOOzNx0mBRGd99oKIXMMU8
 6fg/rhxSjtEnUdEWQC7hhrnnbgW/CT6uP1EndshYZ/NntyEygJVNNzHnKTbJYQWk
 LDnAtQTlmMB7snCc1+ECjO8=
 =5EjT
 -----END PGP SIGNATURE-----

Merge tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - fixes for KVM

 - fix for loongson build and cpu probing

 - DT fixes

* tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled
  MIPS: dts: add missing space before {
  MIPS: Loongson: Fix build error when make modules_install
  MIPS: KVM: Fix NULL pointer dereference
  MIPS: Loongson: Fix cpu_probe_loongson() again
2023-07-09 10:02:49 -07:00
Linus Torvalds
76487845fd Minor cleanups for 6.5:
* Fix an uninitialized variable warning.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZKjUjwAKCRBKO3ySh0YR
 pn92AQC4gY9GOyKcc/aiAd/t1u8gGxnFtcN06xh4TdVArMM4/AD/UtEKx9LYuaSF
 pyhw5SfzxI555HfXkA8ci/D+BxguVQs=
 =/vX1
 -----END PGP SIGNATURE-----

Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Nothing exciting here, just getting rid of a gcc warning that I got
  tired of seeing when I turn on gcov"

* tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix uninit warning in xfs_growfs_data
2023-07-09 09:50:42 -07:00
Linus Torvalds
4770353b66 3 smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmSqNkIACgkQiiy9cAdy
 T1GXsAwAhYUyjlXZLDsmO+9PjKhM9WRM1IO5myy3P396R0Tzq741f8LM7Lx08qc+
 D1701gsnhIrvprem1HjtW6DZzCVnLdpBIYUEnwUr8eDqMpk1VFKug3xSVhIRMih3
 Y30dHTgQ0aCLrrh5XHOWhBHJbpq7Wdlh3q0oi8I36Of8e6tGFNo2wI4ud7no4aIj
 N222dWOs56FXtVAmgEAuc7U2A40ztMOp7FXrbzhK4FwD5kO+pFkqJcLjG6Bk10ph
 Tyg3Wh2TnX+MviOY0xUaN0X50dSoSJPkSUYGkccrIcfVPwEoH7l6j0LNgAVyhG7K
 f5EUbM7Td51a1Znj9wX6U9N0UfO/IOZRDFZ7ACckLBBBEzfKYCgYY5dWJ6aVxZHb
 bB336f1ObvDiocEabS1SMa//sXUjpOy3Tg8etLCYJpqjWYE8nO7lERoBWGWXkUqy
 xO86pGQjYLzkw16R11tzbplv+1HxoGwIuQnOubivv2prn++NZ4Zr2ohBeDlyJc1/
 WwF42UfM
 =F8D0
 -----END PGP SIGNATURE-----

Merge tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull more smb client updates from Steve French:

 - fix potential use after free in unmount

 - minor cleanup

 - add worker to cleanup stale directory leases

* tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Add a laundromat thread for cached directories
  smb: client: remove redundant pointer 'server'
  cifs: fix session state transition to avoid use-after-free issue
2023-07-09 09:45:32 -07:00
Linus Torvalds
cff0687396 Fixes for pci_clean_master, error handling in driver inits, and various
other issues/bugs.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEoE9b9c3U2JxX98mqbmZLrHqL0iMFAmSpuR8ACgkQbmZLrHqL
 0iOnrg/9FT/mEyLZXfI/sAZWyCKvydvbFL78Id9JVKxqQ00tlLlMtRFlLG7gtagn
 GrOf0HSM6RJxeWo8gqGjb1tM0L1qO/Bt9MlqZf7jIJTyL8DBUMz5sFdzqQ5TZo6j
 krKRjzjiNIyPlSUWGBBt2MIVUbJwfFu6f1dnpAVTWjyc2CNEYa5r3FvAqOTVYTMS
 IiNJyCaGwdPOXQVxOWwJnT4aPcWZzOMf4GYWX2hi8HczcEh1bHcBtHL6UGk5CiG9
 igj1e4Q3865lDp9eFIrQEtW2BSf1igzILpT3WmSxMXvufzILLAl6pOq/s/ziTEM2
 Hrg/JnU1W1Iozpyzd5A1lVIy5+NFN3AEHM3iBI0aZ5Qj1yBhNv4CJ/WLqPABBs+D
 cmuea+gOJjKKVGqdLNZ9+rDGtIV+SRQA+0GGoBhBXvVqH5C7qUxx5ZJMD26mC9lS
 L0dQTbgVyJc3Zv8/FHNGuKvp+xq+vgeZmt8zDViNPlmB9bjz3XikCE7D0lr74coD
 Vvmn4dUo65EqvWfKqLR8fXDZ9pdaTqF19xddNPiZS4p3/CDtKWqXIuBdTKiqq1s4
 Lx78R8wU/uifWSf/9+cPyJ7UCCyPpUdj/SahzZLE4ACfmdfYWIHNf0REt+kGn7Vv
 uCoe7rXBccrz03Fv5mHHuNGS24dfdMim2LXYb3GN2A5atIT0Nko=
 =xPW1
 -----END PGP SIGNATURE-----

Merge tag 'ntb-6.5' of https://github.com/jonmason/ntb

Pull NTB updates from Jon Mason:
 "Fixes for pci_clean_master, error handling in driver inits, and
  various other issues/bugs"

* tag 'ntb-6.5' of https://github.com/jonmason/ntb:
  ntb: hw: amd: Fix debugfs_create_dir error checking
  ntb.rst: Fix copy and paste error
  ntb_netdev: Fix module_init problem
  ntb: intel: Remove redundant pci_clear_master
  ntb: epf: Remove redundant pci_clear_master
  ntb_hw_amd: Remove redundant pci_clear_master
  ntb: idt: drop redundant pci_enable_pcie_error_reporting()
  MAINTAINERS: git://github -> https://github.com for jonmason
  NTB: EPF: fix possible memory leak in pci_vntb_probe()
  NTB: ntb_tool: Add check for devm_kcalloc
  NTB: ntb_transport: fix possible memory leak while device_register() fails
  ntb: intel: Fix error handling in intel_ntb_pci_driver_init()
  NTB: amd: Fix error handling in amd_ntb_pci_driver_init()
  ntb: idt: Fix error handling in idt_pci_driver_init()
2023-07-09 09:35:51 -07:00
Hugh Dickins
1c7873e336 mm: lock newly mapped VMA with corrected ordering
Lockdep is certainly right to complain about

  (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f
                 but task is already holding lock:
  (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db

Invert those to the usual ordering.

Fixes: 33313a747e ("mm: lock newly mapped VMA which can be modified after it becomes visible")
Cc: stable@vger.kernel.org
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08 16:44:11 -07:00
Linus Torvalds
946c6b59c5 16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZKmgXAAKCRDdBJ7gKXxA
 joqDAP0V520Jy0cyJrRMvaQRFMqtVeDOdTpAue7ZOQHSi/LZnAD9EEAxDpYF/V4x
 PO27ixXQ4Glm2iYgH7bDX7J73WiA3wg=
 =JsYW
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "16 hotfixes. Six are cc:stable and the remainder address post-6.4
  issues"

The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since
it was all hopefully fixed in mainline.

* tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  lib: dhry: fix sleeping allocations inside non-preemptable section
  kasan, slub: fix HW_TAGS zeroing with slub_debug
  kasan: fix type cast in memory_is_poisoned_n
  mailmap: add entries for Heiko Stuebner
  mailmap: update manpage link
  bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
  MAINTAINERS: add linux-next info
  mailmap: add Markus Schneider-Pargmann
  writeback: account the number of pages written back
  mm: call arch_swap_restore() from do_swap_page()
  squashfs: fix cache race with migration
  mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
  docs: update ocfs2-devel mailing list address
  MAINTAINERS: update ocfs2-devel mailing list address
  mm: disable CONFIG_PER_VMA_LOCK until its fixed
  fork: lock VMAs of the parent process when forking
2023-07-08 14:30:25 -07:00
Suren Baghdasaryan
fb49c45532 fork: lock VMAs of the parent process when forking
When forking a child process, the parent write-protects anonymous pages
and COW-shares them with the child being forked using copy_present_pte().

We must not take any concurrent page faults on the source vma's as they
are being processed, as we expect both the vma and the pte's behind it
to be stable.  For example, the anon_vma_fork() expects the parents
vma->anon_vma to not change during the vma copy.

A concurrent page fault on a page newly marked read-only by the page
copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the
source vma, defeating the anon_vma_clone() that wasn't done because the
parent vma originally didn't have an anon_vma, but we now might end up
copying a pte entry for a page that has one.

Before the per-vma lock based changes, the mmap_lock guaranteed
exclusion with concurrent page faults.  But now we need to do a
vma_start_write() to make sure no concurrent faults happen on this vma
while it is being processed.

This fix can potentially regress some fork-heavy workloads.  Kernel
build time did not show noticeable regression on a 56-core machine while
a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression.  If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance.  Further
optimizations are possible if this regression proves to be problematic.

Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young <jacobly.alt@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea0 ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08 14:08:02 -07:00
Suren Baghdasaryan
33313a747e mm: lock newly mapped VMA which can be modified after it becomes visible
mmap_region adds a newly created VMA into VMA tree and might modify it
afterwards before dropping the mmap_lock.  This poses a problem for page
faults handled under per-VMA locks because they don't take the mmap_lock
and can stumble on this VMA while it's still being modified.  Currently
this does not pose a problem since post-addition modifications are done
only for file-backed VMAs, which are not handled under per-VMA lock.
However, once support for handling file-backed page faults with per-VMA
locks is added, this will become a race.

Fix this by write-locking the VMA before inserting it into the VMA tree.
Other places where a new VMA is added into VMA tree do not modify it
after the insertion, so do not need the same locking.

Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08 14:08:02 -07:00
Suren Baghdasaryan
c137381f71 mm: lock a vma before stack expansion
With recent changes necessitating mmap_lock to be held for write while
expanding a stack, per-VMA locks should follow the same rules and be
write-locked to prevent page faults into the VMA being expanded. Add
the necessary locking.

Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-08 14:08:02 -07:00
Linus Torvalds
7fcd473a64 SCSI misc on 20230708
A few late arriving patches that missed the initial pull request.
 It's mostly bug fixes (the dt-bindings is a fix for the initial pull).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZKmvwSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishResAQCPDbBh
 omMRBE+W+Vx2TgOJGjo/F+T1D2JjBhLIGpNVggEApJtgrQutAToiCU/qIP9GOTl7
 evetzh5boMMuyD2s7ak=
 =pi4v
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "A few late arriving patches that missed the initial pull request. It's
  mostly bug fixes (the dt-bindings is a fix for the initial pull)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Remove unused function declaration
  scsi: target: docs: Remove tcm_mod_builder.py
  scsi: target: iblock: Quiet bool conversion warning with pr_preempt use
  scsi: dt-bindings: ufs: qcom: Fix ICE phandle
  scsi: core: Simplify scsi_cdl_check_cmd()
  scsi: isci: Fix comment typo
  scsi: smartpqi: Replace one-element arrays with flexible-array members
  scsi: target: tcmu: Replace strlcpy() with strscpy()
  scsi: ncr53c8xx: Replace strlcpy() with strscpy()
  scsi: lpfc: Fix lpfc_name struct packing
2023-07-08 12:35:18 -07:00
Linus Torvalds
84dc5aa3f0 Part 2 of I2C patches for 6.5
* xiic patch should have been in part 1 but slipped through
 * mpc patch fixes a build regression from part 1
 * nomadik is a fix which needed a rebase after part 1
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmSoblYPHHdzYUBrZXJu
 ZWwub3JnAAoJEBQN5MwUoCm2iG0P/RG3yGeV0PzQ2XbKPwIMAQURQ7NiqVVrAnUP
 kkduOkPJ9QU+9Qjb/4SRCTxWFVFbqruyOZWB+5hmwHtv4MS3H1ulKaH9bat5bQsF
 F28o5pjMVQEhjl+Y1jAtnSOlhprbTds6B1NdgbMFopnQ7ExlB1/RAt0rvZdgsfkt
 WeKRQMlne0mpQ7mybp6qsN8ldYBSDu8AmZ0S/RSnCwRwl0AxYYxk5roQAfg0tbaU
 N1zWUerNO7lAVoJnFsdeo0CMxK7t3QGHPSZGayargEF0KJHmxfzU14oZ6l28j6wq
 mSazjyK5wTeqqa4mU835RO9Ko6zE42u5nM+ok8Ui6a4xWjvafNrjnZkIClMg2eC9
 7ESZoz+Rtd3kCu3LiWXqPhs9cplHrRtA4M8E4gbTbsdMB8kFRaaiocjHfPksO8SG
 QtFXHGV5AgtoBkCC3womZWvY17XOZf40S6NUFafZ8kZFg5WGCWP3rwUZhz/+frSM
 QiIdxM+cER6WC6ADnZHdQyFl1fjLc4EdbwoN54E7DRBqB4s0wy+1vU+/qf9JdIuR
 zIN0I2OFxnwFEiTfAlTX1RmYy7L/3dtP2Gk8og+W9CxHSwJsURhqIfz/MjNuoh7U
 c11jS9QfETBj2GDjRoRxE0zj4Q2rhjdTwFhQfuYp0eI6UsDfYHHB7ql2FwO1+EW1
 VkmkX38T
 =xwp6
 -----END PGP SIGNATURE-----

Merge tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull more i2c updates from Wolfram Sang:

 - xiic patch should have been in the original pull but slipped through

 - mpc patch fixes a build regression

 - nomadik cleanup

* tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mpc: Drop unused variable
  i2c: nomadik: Remove a useless call in the remove function
  i2c: xiic: Don't try to handle more interrupt events after error
2023-07-08 12:28:00 -07:00
Linus Torvalds
8fc3b8f082 hardening fixes for v6.5-rc1
- Check for NULL bdev in LoadPin (Matthias Kaehlcke)
 
 - Revert unwanted KUnit FORTIFY build default
 
 - Fix 1-element array causing boot warnings with xhci-hub
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmSoVSsWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjyuD/9Sgr+T3VJyROJdKouYO8tLUqaO
 g0A6+WE0L7XyO4ZYk4FOadeihsVEPuhB0fpDTwriKCKdPB35+Nhq8YfWPPQcGdjQ
 0IAT5AjsjYDDFGABRtsNRcL+KXyR+QRVUnSllEsZuwb3lyq6HRbdTF2QBjToAbyO
 QOgEnFJNqPp2w9y2KSzpMuYL4I9o1WbyM+huVSfoKe/3d2WnVKiARMpV+0EJgUAy
 BvORp55+c1w77IRbQduACWszdCLXfkQyI+p5ii3M7cZmePDe4q8LHN01WtIMEnHy
 cln7AnwU4daxzfdeAWIQMLFjOXTLHlkRhC18KSobeBc5Zkudtcg5LxtFGiDsDgOU
 mUWB/Ow8rgr6KlYkMFmFrW/GAVX12KbPXDATECa/4Yhl55Ydl/1bChJWWnX2pppU
 mRRnwIcY7MfhRLeB284Gst81wOHy408arJsm/vck5kdya0Ys1y38rgNQm7iKfXVu
 FYMrDU9qqGmeIVk2namjQYoWH5ei670PXndtrcvSffeZOhpzk2FnFphtraPe0mrl
 l1lcUonZwEoTJ4wDiOR9cjSphoDVom9LgwygQVb4KGHBjuCfRABDV2DGy9duBMtv
 Akcet48VkCX6wF91+30fFmTs5haRiF/5kkx5fGuxhFlQO8QHYVjIOH55VqhAt3mw
 d0OWiZaNRvbNfjPSkQ==
 =R3uK
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - Check for NULL bdev in LoadPin (Matthias Kaehlcke)

 - Revert unwanted KUnit FORTIFY build default

 - Fix 1-element array causing boot warnings with xhci-hub

* tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array
  Revert "fortify: Allow KUnit test to build without FORTIFY"
  dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
2023-07-08 12:08:39 -07:00
Anup Sharma
bff6efc54b ntb: hw: amd: Fix debugfs_create_dir error checking
The debugfs_create_dir function returns ERR_PTR in case of error, and the
only correct way to check if an error occurred is 'IS_ERR' inline function.
This patch will replace the null-comparison with IS_ERR.

Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
Suggested-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2023-07-08 13:55:44 -04:00
Linus Torvalds
c206353dfd perf tools changes and fixes for v6.5: 2nd batch
Build:
 
  - Allow to generate vmlinux.h from BTF using `make GEN_VMLINUX_H=1`
    and skip if the vmlinux has no BTF.
 
  - Replace deprecated clang -target xxx option by --target=xxx.
 
 perf record:
 
  - Print event attributes with well known type and config symbols in the
    debug output like below:
 
     # perf record -e cycles,cpu-clock -C0 -vv true
     <SNIP>
     ------------------------------------------------------------
     perf_event_attr:
       type                             0 (PERF_TYPE_HARDWARE)
       size                             136
       config                           0 (PERF_COUNT_HW_CPU_CYCLES)
       { sample_period, sample_freq }   4000
       sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
       read_format                      ID
       disabled                         1
       inherit                          1
       freq                             1
       sample_id_all                    1
       exclude_guest                    1
     ------------------------------------------------------------
     sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
     ------------------------------------------------------------
     perf_event_attr:
       type                             1 (PERF_TYPE_SOFTWARE)
       size                             136
       config                           0 (PERF_COUNT_SW_CPU_CLOCK)
       { sample_period, sample_freq }   4000
       sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
       read_format                      ID
       disabled                         1
       inherit                          1
       freq                             1
       sample_id_all                    1
       exclude_guest                    1
 
  - Update AMD IBS event error message since it now support per-process
    profiling but no priviledge filters.
 
     $ sudo perf record -e ibs_op//k -C 0
     Error:
     AMD IBS doesn't support privilege filtering. Try again without
     the privilege modifiers (like 'k') at the end.
 
 perf lock contention:
 
  - Support CSV style output using -x option
 
     $ sudo perf lock con -ab -x, sleep 1
     # output: contended, total wait, max wait, avg wait, type, caller
     19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0
     15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e
     4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d
     1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135
     8, 67608, 27404, 8451, spinlock, __queue_work+0x174
     3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff
     3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248
     2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad
     1, 24619, 24619, 24619, spinlock, rcu_core+0xd4
 
  - Add --output option to save the data to a file not to be interfered
    by other debug messages.
 
 Test:
 
  - Fix event parsing test on ARM where there's no raw PMU nor supports
    PERF_PMU_CAP_EXTENDED_HW_TYPE.
 
  - Update the lock contention test case for CSV output.
 
  - Fix a segfault in the daemon command test.
 
 Vendor events (JSON):
 
  - Add has_event() to check if the given event is available on system
    at runtime.  On Intel machines, some transaction events may not be
    present when TSC extensions are disabled.
 
  - Update Intel event metrics.
 
 Misc:
 
  - Sort symbols by name using an external array of pointers instead of
    a rbtree node in the symbol.  This will save 16-bytes or 24-bytes
    per symbol whether the sorting is actually requested or not.
 
  - Fix unwinding DWARF callstacks using libdw when --symfs option is
    used.
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZKb4mwAKCRCMstVUGiXM
 g1QqAPwKZow/DhAzyN7KvzdNd+SojRGpUMl6RkVphY/9ntDqPAD+L3V5aXLTiC1L
 8kUzdpRX5VMjqdR9U7TycUOi4QU40QA=
 =dEF1
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next

Pull more perf tools updates from Namhyung Kim:
 "These are remaining changes and fixes for this cycle.

  Build:

   - Allow generating vmlinux.h from BTF using `make GEN_VMLINUX_H=1`
     and skip if the vmlinux has no BTF.

   - Replace deprecated clang -target xxx option by --target=xxx.

  perf record:

   - Print event attributes with well known type and config symbols in
     the debug output like below:

       # perf record -e cycles,cpu-clock -C0 -vv true
       <SNIP>
       ------------------------------------------------------------
       perf_event_attr:
         type                             0 (PERF_TYPE_HARDWARE)
         size                             136
         config                           0 (PERF_COUNT_HW_CPU_CYCLES)
         { sample_period, sample_freq }   4000
         sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
         read_format                      ID
         disabled                         1
         inherit                          1
         freq                             1
         sample_id_all                    1
         exclude_guest                    1
       ------------------------------------------------------------
       sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
       ------------------------------------------------------------
       perf_event_attr:
         type                             1 (PERF_TYPE_SOFTWARE)
         size                             136
         config                           0 (PERF_COUNT_SW_CPU_CLOCK)
         { sample_period, sample_freq }   4000
         sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
         read_format                      ID
         disabled                         1
         inherit                          1
         freq                             1
         sample_id_all                    1
         exclude_guest                    1

   - Update AMD IBS event error message since it now support per-process
     profiling but no priviledge filters.

       $ sudo perf record -e ibs_op//k -C 0
       Error:
       AMD IBS doesn't support privilege filtering. Try again without
       the privilege modifiers (like 'k') at the end.

  perf lock contention:

   - Support CSV style output using -x option

       $ sudo perf lock con -ab -x, sleep 1
       # output: contended, total wait, max wait, avg wait, type, caller
       19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0
       15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e
       4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d
       1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135
       8, 67608, 27404, 8451, spinlock, __queue_work+0x174
       3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff
       3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248
       2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad
       1, 24619, 24619, 24619, spinlock, rcu_core+0xd4

   - Add --output option to save the data to a file not to be interfered
     by other debug messages.

  Test:

   - Fix event parsing test on ARM where there's no raw PMU nor supports
     PERF_PMU_CAP_EXTENDED_HW_TYPE.

   - Update the lock contention test case for CSV output.

   - Fix a segfault in the daemon command test.

  Vendor events (JSON):

   - Add has_event() to check if the given event is available on system
     at runtime. On Intel machines, some transaction events may not be
     present when TSC extensions are disabled.

   - Update Intel event metrics.

  Misc:

   - Sort symbols by name using an external array of pointers instead of
     a rbtree node in the symbol. This will save 16-bytes or 24-bytes
     per symbol whether the sorting is actually requested or not.

   - Fix unwinding DWARF callstacks using libdw when --symfs option is
     used"

* tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (38 commits)
  perf test: Fix event parsing test when PERF_PMU_CAP_EXTENDED_HW_TYPE isn't supported.
  perf test: Fix event parsing test on Arm
  perf evsel amd: Fix IBS error message
  perf: unwind: Fix symfs with libdw
  perf symbol: Fix uninitialized return value in symbols__find_by_name()
  perf test: Test perf lock contention CSV output
  perf lock contention: Add --output option
  perf lock contention: Add -x option for CSV style output
  perf lock: Remove stale comments
  perf vendor events intel: Update tigerlake to 1.13
  perf vendor events intel: Update skylakex to 1.31
  perf vendor events intel: Update skylake to 57
  perf vendor events intel: Update sapphirerapids to 1.14
  perf vendor events intel: Update icelakex to 1.21
  perf vendor events intel: Update icelake to 1.19
  perf vendor events intel: Update cascadelakex to 1.19
  perf vendor events intel: Update meteorlake to 1.03
  perf vendor events intel: Add rocketlake events/metrics
  perf vendor metrics intel: Make transaction metrics conditional
  perf jevents: Support for has_event function
  ...
2023-07-08 10:21:51 -07:00
Linus Torvalds
ad8258e877 bitmap patches for v6.5
Fixes for different bitmap pieces:
 
  - lib/test_bitmap: increment failure counter properly
 
    The tests that don't use expect_eq() macro to determine that a test is
    failured must increment failed_tests explicitly.
 
  - lib/bitmap: drop optimization of bitmap_{from,to}_arr64
 
    bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE
    architectures when it's wired to bitmap_copy_clear_tail().
 
  - nodemask: Drop duplicate check in for_each_node_mask()
 
    As the return value type of first_node() became unsigned, the node >= 0
    became unnecessary.
 
  - cpumask: fix function description kernel-doc notation
 
  - MAINTAINERS: Add bits.h to the BITMAP API record
  - MAINTAINERS: Add bitfield.h to the BITMAP API record
 
    Add linux/bits.h and linux/bitfield.h for visibility
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmShrd0ACgkQsUSA/Tof
 vsg5RAv/YyedOccCkLtd4D+l+jF/a1qd64u6HfONkYzdoSeKIBiAlKukpHOjJjky
 ii5vSwEMzW34MA6wBylmY4078OXEwEu9SUsF/+5jGvqJ1ABCZkwFoMuAavRHkHhc
 xE/es6XvpSbZ4U4GHnyaUbQCuEkpW21U644QdSWqjVPDHmo32EgtNG7hTaZfCRNh
 VNa8H+tbgi5rQUMBeyxppJjyynGKwut6pX4OcOi2lq1onz8zzHx8H4AJC2v8Yg0r
 2ZRS1FfAuk2Pvp7JFQOExcxf0x/Ph3zp8dqIgfo2pSMAi/0ZjnTkAPxCYIGsWG5u
 TbfO3rHviHtKAN0q5/6xp939G69flW5xSica+IvtuXSIS/JbllwDN4mrexYxppWH
 Euj/ixJe79Dn+jtOgG17p4cp70yAcM1IQfyWH25Q8Jej+C/gN8WB0UXD4K7jTXrn
 9rGID28WbuclAWzJDmFQTpZutG1GHjM7rytit+if08gz065vUhTllhJC+vCHQojY
 clS4E2DW
 =nZF4
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.5-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:
 "Fixes for different bitmap pieces:

   - lib/test_bitmap: increment failure counter properly

     The tests that don't use expect_eq() macro to determine that a test
     is failured must increment failed_tests explicitly.

   - lib/bitmap: drop optimization of bitmap_{from,to}_arr64

     bitmap_{from,to}_arr64() optimization is overly optimistic
     on 32-bit LE architectures when it's wired to
     bitmap_copy_clear_tail().

   - nodemask: Drop duplicate check in for_each_node_mask()

     As the return value type of first_node() became unsigned, the node
     >= 0 became unnecessary.

   - cpumask: fix function description kernel-doc notation

   - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record

     Add linux/bits.h and linux/bitfield.h for visibility"

* tag 'bitmap-6.5-rc1' of https://github.com/norov/linux:
  MAINTAINERS: Add bitfield.h to the BITMAP API record
  MAINTAINERS: Add bits.h to the BITMAP API record
  cpumask: fix function description kernel-doc notation
  nodemask: Drop duplicate check in for_each_node_mask()
  lib/bitmap: drop optimization of bitmap_{from,to}_arr64
  lib/test_bitmap: increment failure counter properly
2023-07-08 10:02:24 -07:00
Geert Uytterhoeven
8ba388c06b lib: dhry: fix sleeping allocations inside non-preemptable section
The Smatch static checker reports the following warnings:

    lib/dhry_run.c:38 dhry_benchmark() warn: sleeping in atomic context
    lib/dhry_run.c:43 dhry_benchmark() warn: sleeping in atomic context

Indeed, dhry() does sleeping allocations inside the non-preemptable
section delimited by get_cpu()/put_cpu().

Fix this by using atomic allocations instead.
Add error handling, as atomic these allocations may fail.

Link: https://lkml.kernel.org/r/bac6d517818a7cd8efe217c1ad649fffab9cc371.1688568764.git.geert+renesas@glider.be
Fixes: 13684e966d ("lib: dhry: fix unstable smp_processor_id(_) usage")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/0469eb3a-02eb-4b41-b189-de20b931fa56@moroto.mountain
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:32 -07:00
Andrey Konovalov
fdb54d9660 kasan, slub: fix HW_TAGS zeroing with slub_debug
Commit 946fa0dbf2 ("mm/slub: extend redzone check to extra allocated
kmalloc space than requested") added precise kmalloc redzone poisoning to
the slub_debug functionality.

However, this commit didn't account for HW_TAGS KASAN fully initializing
the object via its built-in memory initialization feature.  Even though
HW_TAGS KASAN memory initialization contains special memory initialization
handling for when slub_debug is enabled, it does not account for in-object
slub_debug redzones.  As a result, HW_TAGS KASAN can overwrite these
redzones and cause false-positive slub_debug reports.

To fix the issue, avoid HW_TAGS KASAN memory initialization when
slub_debug is enabled altogether.  Implement this by moving the
__slub_debug_enabled check to slab_post_alloc_hook.  Common slab code
seems like a more appropriate place for a slub_debug check anyway.

Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.1688561016.git.andreyknvl@google.com
Fixes: 946fa0dbf2 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Will Deacon <will@kernel.org>
Acked-by: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: kasan-dev@googlegroups.com
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:32 -07:00
Andrey Konovalov
05c56e7b43 kasan: fix type cast in memory_is_poisoned_n
Commit bb6e04a173 ("kasan: use internal prototypes matching gcc-13
builtins") introduced a bug into the memory_is_poisoned_n implementation:
it effectively removed the cast to a signed integer type after applying
KASAN_GRANULE_MASK.

As a result, KASAN started failing to properly check memset, memcpy, and
other similar functions.

Fix the bug by adding the cast back (through an additional signed integer
variable to make the code more readable).

Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.1688431866.git.andreyknvl@google.com
Fixes: bb6e04a173 ("kasan: use internal prototypes matching gcc-13 builtins")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:32 -07:00
Heiko Stuebner
d3a808ec78 mailmap: add entries for Heiko Stuebner
I am going to lose my vrull.eu address at the end of july, and while
adding it to mailmap I also realised that there are more old addresses
from me dangling, so update .mailmap for all of them.

Link: https://lkml.kernel.org/r/20230704163919.1136784-3-heiko@sntech.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:31 -07:00
Heiko Stuebner
ddcd91f4cb mailmap: update manpage link
Patch series "Update .mailmap for my work address and fix manpage".

While updating mailmap for the going-away address, I also found that on
current systems the manpage linked from the header comment changed.

And in fact it looks like the git mailmap feature got its own manpage.


This patch (of 2):

On recent systems the git-shortlog manpage only tells people to
    See gitmailmap(5)

So instead of sending people on a scavenger hunt, put that info into the
header directly.  Though keep the old reference around for older systems.

Link: https://lkml.kernel.org/r/20230704163919.1136784-1-heiko@sntech.de
Link: https://lkml.kernel.org/r/20230704163919.1136784-2-heiko@sntech.de
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:31 -07:00
Liu Shixin
028725e733 bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
commit dd0ff4d12d ("bootmem: remove the vmemmap pages from kmemleak in
put_page_bootmem") fix an overlaps existing problem of kmemleak.  But the
problem still existed when HAVE_BOOTMEM_INFO_NODE is disabled, because in
this case, free_bootmem_page() will call free_reserved_page() directly.

Fix the problem by adding kmemleak_free_part() in free_bootmem_page() when
HAVE_BOOTMEM_INFO_NODE is disabled.

Link: https://lkml.kernel.org/r/20230704101942.2819426-1-liushixin2@huawei.com
Fixes: f41f2ed43c ("mm: hugetlb: free the vmemmap pages associated with each HugeTLB page")
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:31 -07:00
Randy Dunlap
0d707cdefb MAINTAINERS: add linux-next info
Add linux-next info to MAINTAINERS for ease of finding this data.

Link: https://lkml.kernel.org/r/20230704054410.12527-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:31 -07:00
Markus Schneider-Pargmann
6dedd768f3 mailmap: add Markus Schneider-Pargmann
Add my old mail address and update my name.

Link: https://lkml.kernel.org/r/20230628081341.3470229-1-msp@baylibre.com
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:30 -07:00
Matthew Wilcox (Oracle)
8344a3d44b writeback: account the number of pages written back
nr_to_write is a count of pages, so we need to decrease it by the number
of pages in the folio we just wrote, not by 1.  Most callers specify
either LONG_MAX or 1, so are unaffected, but writeback_sb_inodes() might
end up writing 512x as many pages as it asked for.

Dave added:

: XFS is the only filesystem this would affect, right?  AFAIA, nothing
: else enables large folios and uses writeback through
: write_cache_pages() at this point...
: 
: In which case, I'd be surprised if much difference, if any, gets
: noticed by anyone.

Link: https://lkml.kernel.org/r/20230628185548.981888-1-willy@infradead.org
Fixes: 793917d997 ("mm/readahead: Add large folio readahead")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:30 -07:00
Peter Collingbourne
6dca4ac6fc mm: call arch_swap_restore() from do_swap_page()
Commit c145e0b47c ("mm: streamline COW logic in do_swap_page()") moved
the call to swap_free() before the call to set_pte_at(), which meant that
the MTE tags could end up being freed before set_pte_at() had a chance to
restore them.  Fix it by adding a call to the arch_swap_restore() hook
before the call to swap_free().

Link: https://lkml.kernel.org/r/20230523004312.1807357-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I6470efa669e8bd2f841049b8c61020c510678965
Fixes: c145e0b47c ("mm: streamline COW logic in do_swap_page()")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reported-by: Qun-wei Lin <Qun-wei.Lin@mediatek.com>
Closes: https://lore.kernel.org/all/5050805753ac469e8d727c797c2218a9d780d434.camel@mediatek.com/
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>	[6.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:30 -07:00
Vincent Whitchurch
08bab74ae6 squashfs: fix cache race with migration
Migration replaces the page in the mapping before copying the contents and
the flags over from the old page, so check that the page in the page cache
is really up to date before using it.  Without this, stressing squashfs
reads with parallel compaction sometimes results in squashfs reporting
data corruption.

Link: https://lkml.kernel.org/r/20230629-squashfs-cache-migration-v1-1-d50ebe55099d@axis.com
Fixes: e994f5b677 ("squashfs: cache partial compressed blocks")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:30 -07:00
John Hubbard
191fcdb6c9 mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
The following crash happens for me when running the -mm selftests (below).
Specifically, it happens while running the uffd-stress subtests:

kernel BUG at mm/hugetlb.c:7249!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 3238 Comm: uffd-stress Not tainted 6.4.0-hubbard-github+ #109
Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1503 08/03/2018
RIP: 0010:huge_pte_alloc+0x12c/0x1a0
...
Call Trace:
 <TASK>
 ? __die_body+0x63/0xb0
 ? die+0x9f/0xc0
 ? do_trap+0xab/0x180
 ? huge_pte_alloc+0x12c/0x1a0
 ? do_error_trap+0xc6/0x110
 ? huge_pte_alloc+0x12c/0x1a0
 ? handle_invalid_op+0x2c/0x40
 ? huge_pte_alloc+0x12c/0x1a0
 ? exc_invalid_op+0x33/0x50
 ? asm_exc_invalid_op+0x16/0x20
 ? __pfx_put_prev_task_idle+0x10/0x10
 ? huge_pte_alloc+0x12c/0x1a0
 hugetlb_fault+0x1a3/0x1120
 ? finish_task_switch+0xb3/0x2a0
 ? lock_is_held_type+0xdb/0x150
 handle_mm_fault+0xb8a/0xd40
 ? find_vma+0x5d/0xa0
 do_user_addr_fault+0x257/0x5d0
 exc_page_fault+0x7b/0x1f0
 asm_exc_page_fault+0x22/0x30

That happens because a BUG() statement in huge_pte_alloc() attempts to
check that a pte, if present, is a hugetlb pte, but it does so in a
non-lockless-safe manner that leads to a false BUG() report.

We got here due to a couple of bugs, each of which by itself was not quite
enough to cause a problem:

First of all, before commit c33c794828f2("mm: ptep_get() conversion"), the
BUG() statement in huge_pte_alloc() was itself fragile: it relied upon
compiler behavior to only read the pte once, despite using it twice in the
same conditional.

Next, commit c33c794828 ("mm: ptep_get() conversion") broke that
delicate situation, by causing all direct pte reads to be done via
READ_ONCE().  And so READ_ONCE() got called twice within the same BUG()
conditional, leading to comparing (potentially, occasionally) different
versions of the pte, and thus to false BUG() reports.

Fix this by taking a single snapshot of the pte before using it in the
BUG conditional.

Now, that commit is only partially to blame here but, people doing
bisections will invariably land there, so this will help them find a fix
for a real crash.  And also, the previous behavior was unlikely to ever
expose this bug--it was fragile, yet not actually broken.

So that's why I chose this commit for the Fixes tag, rather than the
commit that created the original BUG() statement.

Link: https://lkml.kernel.org/r/20230701010442.2041858-1-jhubbard@nvidia.com
Fixes: c33c794828 ("mm: ptep_get() conversion")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: James Houghton <jthoughton@google.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:29 -07:00