- Prevent the Power Allocator thermal governor from dereferencing a NULL
pointer if it is bound to a tripless thermal zone (Nícolas Prado).
- Prevent thermal zones enabled too early from staying effectively
dormant forever because their temperature cannot be determined
initially (Rafael Wysocki).
- Fix list sorting during thermal zone temperature updates to ensure
the proper ordering of trip crossing notifications (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmaOZdQSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxV0gQALKvi36z4lCF2NpZtncnS0TCwiqknb3h
I5W7O28EeED/qKxtUhT0rKaFJA5py9Civ2J1xfccnsO2KLlLtZxzL2yNrHHnewGZ
SCdVDommTh0zIw01d7h4dzFPE73cYWoX5kwsto9ty0/xi6IBt019LLCIJgB6OmqA
pu4RoESkhxoVFNrV8dtB7Fj+IT9rIGHtC0c4DZXqIgz6MJkiNAXzwyhL2N7icxlx
zPhDSBWv2CLTqVxFAFxSc0Hq5FUieU/vMjkrpT4liR4KuTnbqmxOw2pFdgsCf/AJ
CKhef9aqXeoQYIMTbCreOdgAYMtNekjnUuta8OMwCxop9HCVhh1O1asMURiIX5VT
8SRal1nDgmTXG5NR0V6TVJ2VYQ6amfqSux0B2lyxcMxr4VsY4kekpsnXXPO68rHB
ZVCSIza/fH13dyYOrd0GC7Qz2bGRKYstiXXDZc6s69ij7ulDNpG61M49M3W1V7wk
v2p0SZwjFax3H5DPyd7b9pvBEeAsKGCco2wm/BLauYtnciSsIQBw3Q330DGzsXBm
EN7vGq8q/w6D6Y3S0syiRyGcaDpDK3FmZerXdASaBRNkWvnXn3fhsOngBY62+3iX
LqVbvXar2Of//Q9NkvF3S1ko4tJF78vplqz+ScjHUnIE6kfpoby+CVupIcbVXceB
bcsyCAFYcYlL
=o+Gb
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"These fix a possible NULL pointer dereference in a thermal governor,
fix up the handling of thermal zones enabled before their temperature
can be determined and fix list sorting during thermal zone temperature
updates.
Specifics:
- Prevent the Power Allocator thermal governor from dereferencing a
NULL pointer if it is bound to a tripless thermal zone (Nícolas
Prado)
- Prevent thermal zones enabled too early from staying effectively
dormant forever because their temperature cannot be determined
initially (Rafael Wysocki)
- Fix list sorting during thermal zone temperature updates to ensure
the proper ordering of trip crossing notifications (Rafael
Wysocki)"
* tag 'thermal-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: core: Fix list sorting in __thermal_zone_device_update()
thermal: core: Call monitor_thermal_zone() if zone temperature is invalid
thermal: gov_power_allocator: Return early in manage if trip_max is NULL
The count variable is used without initialization, it results in mistakes
in the device counting and crashes the userspace if the get hot reset info
path is triggered.
Fixes: f6944d4a0b ("vfio/pci: Collect hot-reset devices to local buffer")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219010
Reported-by: Žilvinas Žaltiena <zaltys@natrix.lt>
Cc: Beld Zhang <beldzhang@gmail.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240710004150.319105-1-yi.l.liu@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
In order to use toshiba_dmi_quirks[] together with the standard DMI
matching functions, it must be terminated by a empty entry.
Since this entry is missing, an array out-of-bounds access occurs
every time the quirk list is processed.
Fix this by adding the terminating empty entry.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202407091536.8b116b3d-lkp@intel.com
Fixes: 3cb1f40dfd ("drivers/platform: toshiba_acpi: Call HCI_PANEL_POWER_ON on resume on some models")
Cc: stable@vger.kernel.org
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20240709143851.10097-1-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
After commit 230e9fc286 ("slab: add SLAB_ACCOUNT flag"), we need to mark
the inode cache as SLAB_ACCOUNT, similar to commit 5d097056c9 ("kmemcg:
account for certain kmem allocations to memcg")
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
originally, stack closures were only used synchronously, and with the
original implementation of closure_sync() the ref never hit 0; thus,
closure_put_after_sub() assumes that if the ref hits 0 it's on the debug
list, in debug mode.
that's no longer true with the current implementation of closure_sync,
so we need a new magic so closure_debug_destroy() doesn't pop an assert.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Replace a comma between expression statements by a semicolon.
Fixes: efc1bb8a6f ("davinci: add power management support")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The way the delayed work is handled in ceph_monc_stop() is prone to
races with mon_fault() and possibly also finish_hunting(). Both of
these can requeue the delayed work which wouldn't be canceled by any of
the following code in case that happens after cancel_delayed_work_sync()
runs -- __close_session() doesn't mess with the delayed work in order
to avoid interfering with the hunting interval logic. This part was
missed in commit b5d91704f5 ("libceph: behave in mon_fault() if
cur_mon < 0") and use-after-free can still ensue on monc and objects
that hang off of it, with monc->auth and monc->monmap being
particularly susceptible to quickly being reused.
To fix this:
- clear monc->cur_mon and monc->hunting as part of closing the session
in ceph_monc_stop()
- bail from delayed_work() if monc->cur_mon is cleared, similar to how
it's done in mon_fault() and finish_hunting() (based on monc->hunting)
- call cancel_delayed_work_sync() after the session is closed
Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/66857
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
minixfs now uses kmap_local_page(), so we can't call kunmap() to
undo it. This one call was missed as part of the commit this fixes.
Fixes: 6628f69ee6 (minixfs: Use dir_put_page() in minix_unlink() and minix_rename())
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20240709195841.1986374-1-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
The commit 6533e558c6 ("i40e: Fix reset path while removing
the driver") introduced a new PF state "__I40E_IN_REMOVE" to block
modifying the XDP program while the driver is being removed.
Unfortunately, such a change is useful only if the ".ndo_bpf()"
callback was called out of the rmmod context because unloading the
existing XDP program is also a part of driver removing procedure.
In other words, from the rmmod context the driver is expected to
unload the XDP program without reporting any errors. Otherwise,
the kernel warning with callstack is printed out to dmesg.
Example failing scenario:
1. Load the i40e driver.
2. Load the XDP program.
3. Unload the i40e driver (using "rmmod" command).
The example kernel warning log:
[ +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregister_netdevice_many_notify+0x7a9/0x870
[...]
[ +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
[...]
[ +0.002726] Call Trace:
[ +0.002457] <TASK>
[ +0.002119] ? __warn+0x80/0x120
[ +0.003245] ? unregister_netdevice_many_notify+0x7a9/0x870
[ +0.005586] ? report_bug+0x164/0x190
[ +0.003678] ? handle_bug+0x3c/0x80
[ +0.003503] ? exc_invalid_op+0x17/0x70
[ +0.003846] ? asm_exc_invalid_op+0x1a/0x20
[ +0.004200] ? unregister_netdevice_many_notify+0x7a9/0x870
[ +0.005579] ? unregister_netdevice_many_notify+0x3cc/0x870
[ +0.005586] unregister_netdevice_queue+0xf7/0x140
[ +0.004806] unregister_netdev+0x1c/0x30
[ +0.003933] i40e_vsi_release+0x87/0x2f0 [i40e]
[ +0.004604] i40e_remove+0x1a1/0x420 [i40e]
[ +0.004220] pci_device_remove+0x3f/0xb0
[ +0.003943] device_release_driver_internal+0x19f/0x200
[ +0.005243] driver_detach+0x48/0x90
[ +0.003586] bus_remove_driver+0x6d/0xf0
[ +0.003939] pci_unregister_driver+0x2e/0xb0
[ +0.004278] i40e_exit_module+0x10/0x5f0 [i40e]
[ +0.004570] __do_sys_delete_module.isra.0+0x197/0x310
[ +0.005153] do_syscall_64+0x85/0x170
[ +0.003684] ? syscall_exit_to_user_mode+0x69/0x220
[ +0.004886] ? do_syscall_64+0x95/0x170
[ +0.003851] ? exc_page_fault+0x7e/0x180
[ +0.003932] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ +0.005064] RIP: 0033:0x7f59dc9347cb
[ +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48
[ +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb
[ +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748
[ +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000
[ +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000
[ +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000
[ +0.007151] </TASK>
[ +0.002204] ---[ end trace 0000000000000000 ]---
Fix this by checking if the XDP program is being loaded or unloaded.
Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
callback to avoid unnecessary operations and checks.
Fixes: 6533e558c6 ("i40e: Fix reset path while removing the driver")
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20240708230750.625986-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a potential race between __update_and_free_hugetlb_folio() and
try_memory_failure_hugetlb():
CPU1 CPU2
__update_and_free_hugetlb_folio try_memory_failure_hugetlb
folio_test_hugetlb
-- It's still hugetlb folio.
folio_clear_hugetlb_hwpoison
spin_lock_irq(&hugetlb_lock);
__get_huge_page_for_hwpoison
folio_set_hugetlb_hwpoison
spin_unlock_irq(&hugetlb_lock);
spin_lock_irq(&hugetlb_lock);
__folio_clear_hugetlb(folio);
-- Hugetlb flag is cleared but too late.
spin_unlock_irq(&hugetlb_lock);
When the above race occurs, raw error page info will be leaked. Even
worse, raw error pages won't have hwpoisoned flag set and hit
pcplists/buddy. Fix this issue by deferring
folio_clear_hugetlb_hwpoison() until __folio_clear_hugetlb() is done. So
all raw error pages will have hwpoisoned flag set.
Link: https://lkml.kernel.org/r/20240708025127.107713-1-linmiaohe@huawei.com
Fixes: 32c877191e ("hugetlb: do not clear hugetlb dtor until allocating vmemmap")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The vmf->ptl in filemap_fault_recheck_pte_none() is still set from
handle_pte_fault(). But at the same time, we did a pte_unmap(vmf->pte).
After a pte_unmap(vmf->pte) unmap and rcu_read_unlock(), the page table
may be racily changed and vmf->ptl maybe fails to protect the actual page
table. Fix this by replacing pte_offset_map() with
pte_offset_map_nolock().
As David said, the PTL pointer might be stale so if we continue to use
it infilemap_fault_recheck_pte_none(), it might trigger UAF. Also, if
the PTL fails, the issue fixed by commit 58f327f2ce ("filemap: avoid
unnecessary major faults in filemap_fault()") might reappear.
Link: https://lkml.kernel.org/r/20240313012913.2395414-1-zhangpeng362@huawei.com
Fixes: 58f327f2ce ("filemap: avoid unnecessary major faults in filemap_fault()")
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Mark get_current() and current_thread_info() functions as always_inline to
fix the following modpost warning:
WARNING: modpost: vmlinux: section mismatch in reference: get_current+0xc (section: .text.unlikely) -> initcall_level_names (section: .init.data)
The warning happens when these functions are called from an __init
function and they don't get inlined (remain in the .text section) while
the value they return points into .init.data section. Assuming
get_current() always returns a valid address, this situation can happen
only during init stage and accessing .init.data from .text section during
that stage should pose no issues.
Link: https://lkml.kernel.org/r/20240704132506.1011978-2-surenb@google.com
Fixes: 22d407b164 ("lib: add allocation tagging support for memory allocation profiling")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Mark alloc_tag_{save|restore} as always_inline to fix the following
modpost warnings:
WARNING: modpost: vmlinux: section mismatch in reference: alloc_tag_save+0x1c (section: .text.unlikely) -> initcall_level_names (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: alloc_tag_restore+0x3c (section: .text.unlikely) -> initcall_level_names (section: .init.data)
The warnings happen when these functions are called from an __init
function and they don't get inlined (remain in the .text section) while
the value returned by get_current() points into .init.data section.
Assuming get_current() always returns a valid address, this situation can
happen only during init stage and accessing .init.data from .text section
during that stage should pose no issues.
Link: https://lkml.kernel.org/r/20240704132506.1011978-1-surenb@google.com
Fixes: 22d407b164 ("lib: add allocation tagging support for memory allocation profiling")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407032306.gi9nZsBi-lkp@intel.com/
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
X would not start in my old 32-bit partition (and the "n"-handling looks
just as wrong on 64-bit, but for whatever reason did not show up there):
"n" must be accumulated over all pages before it's added to "offset" and
compared with "copy", immediately after the skb_frag_foreach_page() loop.
Fixes: d2d30a376d ("net: allow skb_datagram_iter to be called from any context")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://patch.msgid.link/fef352e8-b89a-da51-f8ce-04bc39ee6481@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In cdv_intel_lvds_get_modes(), the return value of drm_mode_duplicate()
is assigned to mode, which will lead to a NULL pointer dereference on
failure of drm_mode_duplicate(). Add a check to avoid npd.
Cc: stable@vger.kernel.org
Fixes: 6a227d5fd6 ("gma500: Add support for Cedarview")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240709113311.37168-1-make24@iscas.ac.cn
In psb_intel_lvds_get_modes(), the return value of drm_mode_duplicate() is
assigned to mode, which will lead to a possible NULL pointer dereference
on failure of drm_mode_duplicate(). Add a check to avoid npd.
Cc: stable@vger.kernel.org
Fixes: 89c78134cc ("gma500: Add Poulsbo support")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240709092011.3204970-1-make24@iscas.ac.cn
Like other SPI controller flags, bits_per_word_mask may be used by a
peripheral driver, so it needs to reflect the capabilities of the
underlying controller.
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20240708-spi-mux-fix-v1-3-6c8845193128@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Adding spi_optimize_message() broke the spi-mux driver because it
calls spi_async() from it's transfer_one_message() callback. This
resulted in passing an incorrectly optimized message to the controller.
For example, if the underlying controller has an optimize_message()
callback, this would have not been called and can cause a crash when
the underlying controller driver tries to transfer the message.
Also, since the spi-mux driver swaps out the controller pointer by
replacing msg->spi, __spi_unoptimize_message() was being called with a
different controller than the one used in __spi_optimize_message(). This
could cause a crash when attempting to free the message resources when
__spi_unoptimize_message() is called in spi_finalize_current_message()
since it is being called with a controller that did not allocate the
resources.
This is fixed by adding a defer_optimize_message flag for controllers.
This flag causes all of the spi_[maybe_][un]optimize_message() calls to
be a no-op (other than attaching a pointer to the spi device to the
message).
This allows the spi-mux driver to pass an unmodified message to
spi_async() in spi_mux_transfer_one_message() after the spi device has
been swapped out. This causes __spi_optimize_message() and
__spi_unoptimize_message() to be called only once per message and with
the correct/same controller in each case.
Reported-by: Oleksij Rempel <o.rempel@pengutronix.de>
Closes: https://lore.kernel.org/linux-spi/Zn6HMrYG2b7epUxT@pengutronix.de/
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Closes: https://lore.kernel.org/linux-spi/20240628-awesome-discerning-bear-1621f9-mkl@pengutronix.de/
Fixes: 7b1d87af14 ("spi: add spi_optimize_message() APIs")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20240708-spi-mux-fix-v1-2-6c8845193128@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Calling spi_maybe_unoptimize_message() in spi_async() is wrong because
the message is likely to be in the queue and not transferred yet. This
can corrupt the message while it is being used by the controller driver.
spi_maybe_unoptimize_message() is already called in the correct place
in spi_finalize_current_message() to balance the call to
spi_maybe_optimize_message() in spi_async().
Fixes: 7b1d87af14 ("spi: add spi_optimize_message() APIs")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20240708-spi-mux-fix-v1-1-6c8845193128@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Since commit c49cfa9170 ("USB: serial: use generic method if no
alternative is provided in usb serial layer"), USB serial core calls the
generic resume implementation when the driver has not provided one.
This can trigger a crash on resume with mos7840 since support for
multiple read URBs was added back in 2011. Specifically, both port read
URBs are now submitted on resume for open ports, but the context pointer
of the second URB is left set to the core rather than mos7840 port
structure.
Fix this by implementing dedicated suspend and resume functions for
mos7840.
Tested with Delock 87414 USB 2.0 to 4x serial adapter.
Signed-off-by: Dmitry Smirnov <d.smirnov@inbox.lv>
[ johan: analyse crash and rewrite commit message; set busy flag on
resume; drop bulk-in check; drop unnecessary usb_kill_urb() ]
Fixes: d83b405383 ("USB: serial: add support for multiple read urbs")
Cc: stable@vger.kernel.org # 3.3
Signed-off-by: Johan Hovold <johan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmaJwEYACgkQiiy9cAdy
T1EPaQwAufRbLgmhf0mXUhRukYFIWwAyPOvMEov9vr6uWAmIaqxb2ggmgxwolulS
oEheMyoE+nDRzUFnPv+QY/ihV66Eqq2A83oSW/JVc+WAhiyLG7hWKWdHr2IxEG87
IJA9oJVWoYBQVpINozibwN0qONr8AU6B0jIGZ7+MzU3e09ARLf6OltfXWjLZT68K
xK5fqcZErF+wawnk26u/FRmd81vD3zhRAIqGFIt7E62ngedTsWvqqn7Dx5MDI28a
KkgO8hudyhULGZk8qI/pN/8+vBFJlMdTWaWN9410ucpoQ+5G4M0quOsqzn5DxbWw
0lnBAgDvR1jwyU4cUj4Dgb0TnG/ABiuVQebz82LeIoisItSPenNyKc5FRfry/OFN
PJFvWoUvYGFXUtSkdmLwLeWppTVvpL8vxyk+OPx3URwheqCiaQHN/l3xSBqLIldw
4uPL+grt9zeKOvMvsBFfN+2eiUeC3foZkg4RKucs5aSPJtHra4w6zhvfsuJosNsW
XgIRM19F
=eUKV
-----END PGP SIGNATURE-----
Merge tag '6.10-rc6-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- fix access flags to address fuse incompatibility
- fix device type returned by get filesystem info
* tag '6.10-rc6-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: discard write access to the directory open
ksmbd: return FILE_DEVICE_DISK instead of super magic
This kselftest fixes update for Linux 6.10 consists of fixes to clang
build failures to timerns, vDSO tests and fixes to vDSO makefile.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaMaSgACgkQCwJExA0N
Qxz+xw/8DG3OciD30wrXyPbH7Lw35AJXH3IsIjirQET6hoE+saxYOWbVN9POqqVy
VTh21ZwJpuSrh8gIYHUEZPLezTwIYWN7aZm2Zps7VttlfXjNvbWiLBB4ptAL/XWu
SogHFeE1u1KHk8JZY3v3j//hQxL0FqnUbqRjv5nnOUS1krgL4shP6JsdUU65Bs9x
TLSCJrJSCSpG/u7KAXSHlYy0kn9fnL+F2LUqTFf+kzOOdLZ+XaxHS/02GsZYgcVI
SUvL6x4NEqVMyxcnvL4QBs91SD1/q80vf7g0+gKHkcuHKluto/Zmnwhw40oN92lr
T6muSS2jW+OemZzglJdD4aIbCEisVtwPsPkdtux9JZV9VAH2lyYz0+G0J2fX7r11
LOcd4Y7HhoYA5UL6s6puE8xQEZOUrBNMY4exfeOkW/UaJhscewtyTMQsNRs8qW+4
lEoHFJSsVQtfuZSxUaiXm49loVxu8JueynG6dafRue8tf9mCWpOzl01fVpkoLL/1
5lMOau3DZallsiHKU0COg6eJhAi6QQjC2nYNMJHwO3DFCKpwneMYbbU9xqS5MZ5Y
wVijpgyFdIMk5qxHDdVEmevFNyYG3xGYKq/sReDuwb4qJkdx7rDS5mMTkVxyHdCe
ezHxw6tuiLohHXDHVCR/KxQwjiHkXZF2uudzTFDt6Lxeu68PAFk=
=c2zt
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan
"Fixes to clang build failures to timerns, vDSO tests and fixes to vDSO
makefile"
* tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/vDSO: remove duplicate compiler invocations from Makefile
selftests/vDSO: remove partially duplicated "all:" target in Makefile
selftests/vDSO: fix clang build errors and warnings
selftest/timerns: fix clang build failures for abs() calls
crst_table_free() used to work with NULL pointers before the conversion
to ptdescs. Since crst_table_free() can be called with a NULL pointer
(error handling in crst_table_upgrade() add an explicit check.
Also add the same check to base_crst_free() for consistency reasons.
In real life this should not happen, since order two GFP_KERNEL
allocations will not fail, unless FAIL_PAGE_ALLOC is enabled and used.
Reported-by: Yunseong Kim <yskelg@gmail.com>
Fixes: 6326c26c15 ("s390: convert various pgalloc functions to use ptdescs")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This implements the runtime constant infrastructure for arm64, allowing
the dcache d_hash() function to be generated using as a constant for
hash table address followed by shift by a constant of the hash index.
[ Fixed up to deal with the big-endian case as per Mark Rutland ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoz2cwAKCRDbK58LschI
g0njAQCgFSexbgpphOFplFea6i1Dom8yifcZXSdjUlDtqW25QQD/WXNZkHhloCqk
387RkULudSkk3xZfc7paiXOTNGuaww8=
=FUjt
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-07-09
The following pull-request contains BPF updates for your *net* tree.
We've added 3 non-merge commits during the last 1 day(s) which contain
a total of 5 files changed, 81 insertions(+), 11 deletions(-).
The main changes are:
1) Fix a use-after-free in a corner case where tcx_entry got released too
early. Also add BPF test coverage along with the fix, from Daniel Borkmann.
2) Fix a kernel panic on Loongarch in sk_msg_recvmsg() which got triggered
by running BPF sockmap selftests, from Geliang Tang.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
skmsg: Skip zero length skb in sk_msg_recvmsg
selftests/bpf: Extend tcx tests to cover late tcx_entry release
bpf: Fix too early release of tcx_entry
====================
Link: https://patch.msgid.link/20240709091452.27840-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When SMP is enabled and spinlocks are actually functional then there is
a deadlock with the 'statelock' spinlock between ks8851_start_xmit_spi
and ks8851_irq:
watchdog: BUG: soft lockup - CPU#0 stuck for 27s!
call trace:
queued_spin_lock_slowpath+0x100/0x284
do_raw_spin_lock+0x34/0x44
ks8851_start_xmit_spi+0x30/0xb8
ks8851_start_xmit+0x14/0x20
netdev_start_xmit+0x40/0x6c
dev_hard_start_xmit+0x6c/0xbc
sch_direct_xmit+0xa4/0x22c
__qdisc_run+0x138/0x3fc
qdisc_run+0x24/0x3c
net_tx_action+0xf8/0x130
handle_softirqs+0x1ac/0x1f0
__do_softirq+0x14/0x20
____do_softirq+0x10/0x1c
call_on_irq_stack+0x3c/0x58
do_softirq_own_stack+0x1c/0x28
__irq_exit_rcu+0x54/0x9c
irq_exit_rcu+0x10/0x1c
el1_interrupt+0x38/0x50
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x64/0x68
__netif_schedule+0x6c/0x80
netif_tx_wake_queue+0x38/0x48
ks8851_irq+0xb8/0x2c8
irq_thread_fn+0x2c/0x74
irq_thread+0x10c/0x1b0
kthread+0xc8/0xd8
ret_from_fork+0x10/0x20
This issue has not been identified earlier because tests were done on
a device with SMP disabled and so spinlocks were actually NOPs.
Now use spin_(un)lock_bh for TX queue related locking to avoid execution
of softirq work synchronously that would lead to a deadlock.
Fixes: 3dc5d44545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240706101337.854474-1-rwahl@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In rvu_check_rsrc_availability() in case of invalid SSOW req, an incorrect
data is printed to error log. 'req->sso' value is printed instead of
'req->ssow'. Looks like "copy-paste" mistake.
Fix this mistake by replacing 'req->sso' with 'req->ssow'.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 746ea74241 ("octeontx2-af: Add RVU block LF provisioning support")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240705095317.12640-1-amishin@t-argos.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In the cases where the power domain connected to logics is allowed to
transition from a level(L)-->power collapse(0)-->retention(1) or
vice versa retention(1)-->power collapse(0)-->level(L) will cause the
logic to lose the configurations. The ARC does not support retention
to collapse transition on MxC rails.
The targets from SM8450 onwards the PLL logics of clock controllers are
connected to MxC rails and the recommended configurations are carried
out during the clock controller probes. The MxC transition as mentioned
above should be skipped to ensure the PLL settings are intact across
clock controller power on & off.
On older targets that do not split MX into MxA and MxC does not collapse
the logic and it is parked always at RETENTION, thus this issue is never
observed on those targets.
Cc: stable@vger.kernel.org # v5.17
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Taniya Das <quic_tdas@quicinc.com>
Link: https://lore.kernel.org/r/20240625-avoid_mxc_retention-v2-1-af9c2f549a5f@quicinc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
HP 250 G7 has a mute LED that can be made to work using quirk
ALC269_FIXUP_HP_LINE1_MIC1_LED. Enable already existing quirk.
Signed-off-by: Nazar Bilinskyi <nbilinskyi@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20240709080546.18344-1-nbilinskyi@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
bnxt doesn't check if a ring is used by RSS contexts when reducing
ring count. Core performs a similar check for the drivers for
the main context, but core doesn't know about additional contexts,
so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
id to index bp->rx_ring[], which without the check may end up
being out of bounds.
BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
Call Trace:
__bnxt_hwrm_vnic_set_rss+0xb79/0xe40
bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
__bnxt_setup_vnic_p5+0x12e/0x270
__bnxt_open_nic+0x2262/0x2f30
bnxt_open_nic+0x5d/0xf0
ethnl_set_channels+0x5d4/0xb30
ethnl_default_set_doit+0x2f1/0x620
Core does track the additional contexts in net-next, so we can
move this validation out of the driver as a follow up there.
Fixes: b3d0083caf ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240705020005.681746-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
platform, the following kernel panic occurs:
[...]
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
... ...
ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 0000000c (PPLV0 +PIE +PWE)
EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
BADV: 0000000000000040
PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
Stack : ...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160
Code: ...
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
.text @ 0x9000000003710000
.data @ 0x9000000004d70000
.bss @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
[...]
This crash happens every time when running sockmap_skb_verdict_shutdown
subtest in sockmap_basic.
This crash is because a NULL pointer is passed to page_address() in the
sk_msg_recvmsg(). Due to the different implementations depending on the
architecture, page_address(NULL) will trigger a panic on Loongarch
platform but not on x86 platform. So this bug was hidden on x86 platform
for a while, but now it is exposed on Loongarch platform. The root cause
is that a zero length skb (skb->len == 0) was put on the queue.
This zero length skb is a TCP FIN packet, which was sent by shutdown(),
invoked in test_sockmap_skb_verdict_shutdown():
shutdown(p1, SHUT_WR);
In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
page is put to this sge (see sg_set_page in sg_set_page), but this empty
sge is queued into ingress_msg list.
And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
to kmap_local_page() and to page_address(), then kernel panics.
To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
if copy is zero, that means it's a zero length skb, skip invoking
copy_page_to_iter(). We are using the EFAULT return triggered by
copy_page_to_iter to check for is_fin in tcp_bpf.c.
Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn
Reinit PHY after cable test, otherwise link can't be established on
tested port. This issue is reproducible on LAN9372 switches with
integrated 100BaseT1 PHYs.
Fixes: 788050256c ("net: phy: microchip_t1: add cable test support for lan87xx phy")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240705084954.83048-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When probing, the hardware is not brought into a known state. This may
be a problem when a hypervisor restarts Linux without resetting the
hardware, leaving an old state running. Make sure the hardware gets
initialized, especially interrupts should be cleared and disabled.
Reported-by: Dirk Behme <dirk.behme@de.bosch.com>
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Closes: https://lore.kernel.org/r/20240702045535.2000393-1-dirk.behme@de.bosch.com
Fixes: 6ccbe60713 ("i2c: add Renesas R-Car I2C driver")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Once again, we've broken PASEMI Nemo boards with its incomplete
"interrupt-map" translations. Commit 935df1bd40 ("of/irq: Factor out
parsing of interrupt-map parent phandle+args from of_irq_parse_raw()")
changed the behavior resulting in the existing work-around not taking
effect. Rework the work-around to just skip parsing "interrupt-map" up
front by using the of_irq_imap_abusers list.
Fixes: 935df1bd40 ("of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/86ed8ba2sp.wl-maz@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
These addresses the performance issues reported by Matt, Namhyung and
Linus. Recently it changed processing comm string and DSO with sorted
arrays but it required to sort the array whenever it adds a new entry.
This caused a performance issue and fix is to enhance the sorting by
finding the insertion point in the sorted array and to shift righthand
side using memmove().
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZov/wgAKCRCMstVUGiXM
g3olAQCFzp/BnopE7VgUSK5j0EOnMjSsvkQGkocqCVN1Km3y8AEAlV3EKb1rUN8s
SQ+QcEx7F4V38s+Aoo2SqU1yAwYsXAc=
=Ao/v
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"Fix performance issue for v6.10
These address the performance issues reported by Matt, Namhyung and
Linus. Recently perf changed the processing of the comm string and DSO
using sorted arrays but this caused it to sort the array whenever
adding a new entry.
This caused a performance issue and the fix is to enhance the sorting
by finding the insertion point in the sorted array and to shift
righthand side using memmove()"
* tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf dsos: When adding a dso into sorted dsos maintain the sort order
perf comm str: Avoid sort during insert
Pedro Pinto and later independently also Hyunwoo Kim and Wongi Lee reported
an issue that the tcx_entry can be released too early leading to a use
after free (UAF) when an active old-style ingress or clsact qdisc with a
shared tc block is later replaced by another ingress or clsact instance.
Essentially, the sequence to trigger the UAF (one example) can be as follows:
1. A network namespace is created
2. An ingress qdisc is created. This allocates a tcx_entry, and
&tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the
same time, a tcf block with index 1 is created.
3. chain0 is attached to the tcf block. chain0 must be connected to
the block linked to the ingress qdisc to later reach the function
tcf_chain0_head_change_cb_del() which triggers the UAF.
4. Create and graft a clsact qdisc. This causes the ingress qdisc
created in step 1 to be removed, thus freeing the previously linked
tcx_entry:
rtnetlink_rcv_msg()
=> tc_modify_qdisc()
=> qdisc_create()
=> clsact_init() [a]
=> qdisc_graft()
=> qdisc_destroy()
=> __qdisc_destroy()
=> ingress_destroy() [b]
=> tcx_entry_free()
=> kfree_rcu() // tcx_entry freed
5. Finally, the network namespace is closed. This registers the
cleanup_net worker, and during the process of releasing the
remaining clsact qdisc, it accesses the tcx_entry that was
already freed in step 4, causing the UAF to occur:
cleanup_net()
=> ops_exit_list()
=> default_device_exit_batch()
=> unregister_netdevice_many()
=> unregister_netdevice_many_notify()
=> dev_shutdown()
=> qdisc_put()
=> clsact_destroy() [c]
=> tcf_block_put_ext()
=> tcf_chain0_head_change_cb_del()
=> tcf_chain_head_change_item()
=> clsact_chain_head_change()
=> mini_qdisc_pair_swap() // UAF
There are also other variants, the gist is to add an ingress (or clsact)
qdisc with a specific shared block, then to replace that qdisc, waiting
for the tcx_entry kfree_rcu() to be executed and subsequently accessing
the current active qdisc's miniq one way or another.
The correct fix is to turn the miniq_active boolean into a counter. What
can be observed, at step 2 above, the counter transitions from 0->1, at
step [a] from 1->2 (in order for the miniq object to remain active during
the replacement), then in [b] from 2->1 and finally [c] 1->0 with the
eventual release. The reference counter in general ranges from [0,2] and
it does not need to be atomic since all access to the counter is protected
by the rtnl mutex. With this in place, there is no longer a UAF happening
and the tcx_entry is freed at the correct time.
Fixes: e420bed025 ("bpf: Add fd-based tcx multi-prog infra with link support")
Reported-by: Pedro Pinto <xten@osec.io>
Co-developed-by: Pedro Pinto <xten@osec.io>
Signed-off-by: Pedro Pinto <xten@osec.io>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hyunwoo Kim <v4bel@theori.io>
Cc: Wongi Lee <qwerty@theori.io>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240708133130.11609-1-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>