The cache needs to be flushed to ensure that the hardware stops offloading
the flow immediately.
Fixes: 33fc42de33 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230330120840.52079-3-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Check for skb metadata in order to detect the case where the DSA header
is not present.
Fixes: 2d7605a729 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230330120840.52079-2-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since we call flow_block_cb_decref on FLOW_BLOCK_UNBIND, we also need to
call flow_block_cb_incref for a newly allocated cb.
Also fix the accidentally inverted refcount check on unbind.
Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230330120840.52079-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reported on the Turris forum, mvneta provokes kernel warnings in the
architecture DMA mapping code when mvneta_setup_txqs() fails to
allocate memory. This happens because when mvneta_cleanup_txqs() is
called in the mvneta_stop() path, we leave pointers in the structure
that have been freed.
Then on mvneta_open(), we call mvneta_setup_txqs(), which starts
allocating memory. On memory allocation failure, mvneta_cleanup_txqs()
will walk all the queues freeing any non-NULL pointers - which includes
pointers that were previously freed in mvneta_stop().
Fix this by setting these pointers to NULL to prevent double-freeing
of the same memory.
Fixes: 2adb719d74 ("net: mvneta: Implement software TSO")
Link: https://forum.turris.cz/t/random-kernel-exceptions-on-hbl-tos-7-0/18865/8
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1phUe5-00EieL-7q@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If certain conditions are met, DSA can install all necessary MAC
addresses on the CPU ports as FDB entries and disable flooding towards
the CPU (we call this RX filtering).
There is one corner case where this does not work.
ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
ip link set swp0 master br0 && ip link set swp0 up
ip link add link swp0 name swp0.100 type vlan id 100
ip link set swp0.100 up && ip addr add 192.168.100.1/24 dev swp0.100
Traffic through swp0.100 is broken, because the bridge turns on VLAN
filtering in the swp0 port (causing RX packets to be classified to the
FDB database corresponding to the VID from their 802.1Q header), and
although the 8021q module does call dev_uc_add() towards the real
device, that API is VLAN-unaware, so it only contains the MAC address,
not the VID; and DSA's current implementation of ndo_set_rx_mode() is
only for VID 0 (corresponding to FDB entries which are installed in an
FDB database which is only hit when the port is VLAN-unaware).
It's interesting to understand why the bridge does not turn on
IFF_PROMISC for its swp0 bridge port, and it may appear at first glance
that this is a regression caused by the logic in commit 2796d0c648
("bridge: Automatically manage port promiscuous mode."). After all,
a bridge port needs to have IFF_PROMISC by its very nature - it needs to
receive and forward frames with a MAC DA different from the bridge
ports' MAC addresses.
While that may be true, when the bridge is VLAN-aware *and* it has a
single port, there is no real reason to enable promiscuity even if that
is an automatic port, with flooding and learning (there is nowhere for
packets to go except to the BR_FDB_LOCAL entries), and this is how the
corner case appears. Adding a second automatic interface to the bridge
would make swp0 promisc as well, and would mask the corner case.
Given the dev_uc_add() / ndo_set_rx_mode() API is what it is (it doesn't
pass a VLAN ID), the only way to address that problem is to install host
FDB entries for the cartesian product of RX filtering MAC addresses and
VLAN RX filters.
Fixes: 7569459a52 ("net: dsa: manage flooding on the CPU ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230329151821.745752-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Do not set the MV88E6XXX_PORT_CTL0_IGMP_MLD_SNOOP bit on CPU or DSA ports.
This allows the host CPU port to be a regular IGMP listener by sending out
IGMP Membership Reports, which would otherwise not be forwarded by the
mv88exxx chip, but directly looped back to the CPU port itself.
Fixes: 54d792f257 ("net: dsa: Centralise global and port setup code into mv88e6xxx.")
Signed-off-by: Steffen Bätz <steffen@innosonix.de>
Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230329150140.701559-1-festevam@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When cpumask is specified as a module parameter the value is
overwritten by the module init routine. This can easily be fixed
by checking to see if the mask has already been allocated in the
init routine.
When max_idle is specified as a module parameter a panic will occur.
The problem is that the idle_injection_cpu_mask is not allocated until
the module init routine executes. This can easily be fixed by allocating
the cpumask if it's not already allocated.
Fixes: ebf5197102 ("thermal: intel: powerclamp: Add two module parameters")
Signed-off-by: David Arcari <darcari@redhat.com>
Reviewed-by: Srinivas Pandruvada<srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* various ivpu fixes
* fix nouveau backlight registration
* fix buddy allocator in 32-bit systems
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmQll8UACgkQaA3BHVML
eiOeCAgAuZ4njGFuTDPbSj8xvMSTzBhCK/q452MosuIL8XBa3wMJy8SsjpGP5yA5
uUBCDO9ZdIw1ZVvUfmSR9X+Q0L3HSQEGax9jgk67Hj0bGJlVhEqVvifBrhCum+Jd
GUHRYPuNCSrlGrtlQEtCd2puZ+pS5IJoClunJ4wTdO/hwXmvaRF8QMWkn7Y9GWKs
JcNY5kd2ZeBw05XcSO9A89vrQMmqAbAjHfMEdwqZu22C9gZ8dlnPUxpqLHfg5tkz
XQj7PWQK4vQzvjBA70vS6sNcjpiS6i7FekWzAL/3fb5BQHRZx1RTOPcruQcwE+O5
bKKsB09JSGgg5M4noJQ4I/bXIWXnjA==
=mQp3
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2023-03-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
* various ivpu fixes
* fix nouveau backlight registration
* fix buddy allocator in 32-bit systems
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230330141006.GA22908@linux-uq9g
drm/i915 fixes for v6.3-rc5:
- Fix PMU support by reusing functions with sysfs
- Fix a number of issues related to color, PSR and arm/noarm
- Fix state check related to ICL PHY ownership check in TC-cold state
- Flush lmem contents after construction
- Fix hibernate oops related to DPT BO
- Fix perf stream error path wakeref balance
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87355m4gtm.fsf@intel.com
A collection of small fixes:
- A potential deadlock fix for USB-audio, involving some change
in PCM core side
- A regression fix for probes of USB-audio devices with the
vendor-specific PCM format bits
- Two regression fixes for the old YMFPCI driver
- A few HD-audio quirks as usual
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmQkN0cOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8xmhAAwKf45KZWeuWs/HtlnJe7tyaS88zyAjdw2JY0
+WiZsjIFruOC2NwHtrIBmpqC8baaCzUTEveorVo/5mHt1KsJmwl5Necz/2CkGmCQ
VfAaaDoMMxh75VIod69i6vAY6L5tYjyNOjKrsYixdR90mtTlgCxhZumoAwoB5/2x
TJZUL9cmRnHm3V/9eCzb0UOOPYVr/WRleUMovr31uuHHp/owNAKzPh960l+hax9/
Tf1nHPVtq17x/JKagSTKIyNgK53mwT8t+86S5gGgGiVdqGtj5mfxoRdgyRMR17mr
WoL1bNX4V6Y/CbRTBxmHqWYb/RHLWvq5fcJRP1BGI/nMds0hCSW2/4ywKZYqgllr
sHs1h3HJVl94X/sA5bCHFTB4EQC6n3Ci+TZmIbeAsMaTPq9E1M6jUB+6m3qKfnGk
7YW09vx460jy8X2rK3jdq7EAaCTOaiJzcGXLYlk6G2lyynRHVOh7MGp3opKHZB9k
9LlIdWjchpOZfeizhcpmvUegVKfS2JyIQEIyhGc9StNhFlPGj8+aRHX/U1D0W9op
Ad1cnieSiN5b8+MOFJO4LeqOQ+yHYV1WJRNjCRztna2x5j/Hif8aV35JRLoGesnm
UYcPcjTiWZVreFYbuKcCOkYRWN8FR9xUkLR1Hxj+BuenvKq5SEZINEokiuQddb2b
TUS33XQ=
=T5LZ
-----END PGP SIGNATURE-----
Merge tag 'sound-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes:
- A potential deadlock fix for USB-audio, involving some change in
PCM core side
- A regression fix for probes of USB-audio devices with the
vendor-specific PCM format bits
- Two regression fixes for the old YMFPCI driver
- A few HD-audio quirks as usual"
* tag 'sound-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Add quirk for Lenovo ZhaoYang CF4620Z
ALSA: ymfpci: Fix BUG_ON in probe function
ALSA: ymfpci: Create card with device-managed snd_devm_card_new()
ALSA: usb-audio: Fix regression on detection of Roland VS-100
ALSA: hda/realtek: Fix support for Dell Precision 3260
ALSA: usb-audio: Fix recursive locking at XRUN during syncing
ALSA: hda/conexant: Partial revert of a quirk for Lenovo
ALSA: hda/realtek: Add quirks for some Clevo laptops
* Make sure to always invalidate the last page of an inode straddling
inode->i_size to avoid data inconsistencies with appended data when
the device zone write granularity does not match the page size.
* Do not propagate iomap -ENOBLK error to userspace and use -EBUSY
instead.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZCV+vgAKCRDdoc3SxdoY
dhtrAQCYLbuQPHM5WwO9oJPnKYGgDjOBQbcDC0hjTG3rLjquHgD+O16QHO02raDk
++C5sl6r22jK1m8GqOckfJ/wdESgCQM=
=Lmth
-----END PGP SIGNATURE-----
Merge tag 'zonefs-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fixes from Damien Le Moal:
- Make sure to always invalidate the last page of an inode straddling
inode->i_size to avoid data inconsistencies with appended data when
the device zone write granularity does not match the page size.
- Do not propagate iomap -ENOBLK error to userspace and use -EBUSY
instead.
* tag 'zonefs-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space
zonefs: Always invalidate last cached page on append write
This reverts commit df622729dd as it introduces a use-after-free,
which isn't easy to fix without going back to the design drawing board.
Reported-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
This reverts commit 97804a133c, as it builds on top of df622729dd
("drm/scheduler: track GPU active time per entity") which needs to be
reverted, as it introduces a use-after-free.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
drm_gem_prime_mmap() takes a reference on the GEM object, but before that
drm_gem_mmap_obj() already takes a reference, which will be leaked as only
one reference is dropped when the mapping is closed. Drop the extra
reference when dma_buf_mmap() succeeds.
Cc: stable@vger.kernel.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Issue the same error message in case an illegal page boundary crossing
has been detected in both cases where this is tested.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20230329080259.14823-1-jgross@suse.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
We increase cache->nr_cached when we free into the cache but don't
decrease when we take from it, so in some time we'll get an empty
cache with cache->nr_cached larger than IO_ALLOC_CACHE_MAX, that fails
io_alloc_cache_put() and effectively disables caching.
Fixes: 9b797a37c4 ("io_uring: add abstraction around apoll cache")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The call to invalidate_inode_pages2_range() in __iomap_dio_rw() may
fail, in which case -ENOTBLK is returned and this error code is
propagated back to user space trhough iomap_dio_rw() ->
zonefs_file_dio_write() return chain. This error code is fairly obscure
and may confuse the user. Avoid this and be consistent with the behavior
of zonefs_file_dio_append() for similar invalidate_inode_pages2_range()
errors by returning -EBUSY to user space when iomap_dio_rw() returns
-ENOTBLK.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Fixes: 8dcc1a9d90 ("fs: New zonefs file system")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Hans Holmberg <hans.holmberg@wdc.com>
When a direct append write is executed, the append offset may correspond
to the last page of a sequential file inode which might have been cached
already by buffered reads, page faults with mmap-read or non-direct
readahead. To ensure that the on-disk and cached data is consistant for
such last cached page, make sure to always invalidate it in
zonefs_file_dio_append(). If the invalidation fails, return -EBUSY to
userspace to differentiate from IO errors.
This invalidation will always be a no-op when the FS block size (device
zone write granularity) is equal to the page size (e.g. 4K).
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Fixes: 02ef12a663 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Hans Holmberg <hans.holmberg@wdc.com>
Avoid potential data corruption issues caused by uninitialized driver
private data structures.
Reported-by: Brian Coverstone <brian@mainsequence.net>
Fixes: 6a9d1b91f3 ("mac80211: add pre-RCU-sync sta removal driver operation")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230324120924.38412-3-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Adjust the network header to point at the correct payload offset
Fixes: 986e43b19a ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230324120924.38412-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When forwarding is set to 0, frames are typically sent with ttl=1.
Move the ttl decrement check below the check for local receive in order to
fix packet drops.
Reported-by: Thomas Hühn <thomas.huehn@hs-nordhausen.de>
Reported-by: Nick Hainke <vincent@systemli.org>
Fixes: 986e43b19a ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230326151709.17743-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
rx->sta->amsdu_mesh_control is being passed to ieee80211_amsdu_to_8023s
without checking rx->sta. Since it doesn't make sense to accept A-MSDU
packets without a sta, simply add a check earlier.
Fixes: 6e4c0d0460 ("wifi: mac80211: add a workaround for receiving non-standard mesh A-MSDU")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230330090001.60750-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arseniy Krasnov says:
====================
fix header length on skb merging
this patchset fixes appending newly arrived skbuff to the last skbuff of
the socket's queue during rx path. Problem fires when we are trying to
append data to skbuff which was already processed in dequeue callback
at least once. Dequeue callback calls function 'skb_pull()' which changes
'skb->len'. In current implementation 'skb->len' is used to update length
in header of last skbuff after new data was copied to it. This is bug,
because value in header is used to calculate 'rx_bytes'/'fwd_cnt' and
thus must be constant during skbuff lifetime. Here is example, we have
two skbuffs: skb0 with length 10 and skb1 with length 4.
1) skb0 arrives, hdr->len == skb->len == 10, rx_bytes == 10
2) Read 3 bytes from skb0, skb->len == 7, hdr->len == 10, rx_bytes == 10
3) skb1 arrives, hdr->len == skb->len == 4, rx_bytes == 14
4) Append skb1 to skb0, skb0 now has skb->len == 11, hdr->len == 11.
But value of 11 in header is invalid.
5) Read whole skb0, update rx_bytes by 11 from skb0's header.
6) At this moment rx_bytes == 3, but socket's queue is empty.
This bug starts to fire since:
commit
0777061657 ("virtio/vsock: don't use skbuff state to account credit")
In fact, it presents before, but didn't triggered due to a little bit
buggy implementation of credit calculation logic. So i'll use Fixes tag
for it.
I really forgot about this branch in rx path when implemented patch
0777061657.
This patchset contains 3 patches:
1) Fix itself.
2) Patch with WARN_ONCE() to catch such problems in future.
3) Patch with test which triggers skb appending logic. It looks like
simple test with several 'send()' and 'recv()', but it checks, that
skbuff appending works ok.
====================
Link: https://lore.kernel.org/r/0683cc6e-5130-484c-1105-ef2eb792d355@sberdevices.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This adds test which checks case when data of newly received skbuff is
appended to the last skbuff in the socket's queue. It looks like simple
test with 'send()' and 'recv()', but internally it triggers logic which
appends one received skbuff to another. Test checks that this feature
works correctly.
This test is actual only for virtio transport.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This adds WARN_ONCE() and return from stream dequeue callback when
socket's queue is empty, but 'rx_bytes' still non-zero. This allows
the detection of potential bugs due to packet merging (see previous
patch).
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This fixes appending newly arrived skbuff to the last skbuff of the
socket's queue. Problem fires when we are trying to append data to skbuff
which was already processed in dequeue callback at least once. Dequeue
callback calls function 'skb_pull()' which changes 'skb->len'. In current
implementation 'skb->len' is used to update length in header of the last
skbuff after new data was copied to it. This is bug, because value in
header is used to calculate 'rx_bytes'/'fwd_cnt' and thus must be not
be changed during skbuff's lifetime.
Bug starts to fire since:
commit 0777061657
("virtio/vsock: don't use skbuff state to account credit")
It presents before, but didn't triggered due to a little bit buggy
implementation of credit calculation logic. So use Fixes tag for it.
Fixes: 0777061657 ("virtio/vsock: don't use skbuff state to account credit")
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Michael Chan says:
====================
bnxt_en: 3 Bug fixes
This series contains 3 small bug fixes covering ethtool self test, PCI
ID string typos, and some missing 200G link speed ethtool reporting logic.
====================
Link: https://lore.kernel.org/r/20230329013021.5205-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bnxt_fw_to_ethtool_speed() is missing the case statement for 200G
link speed reported by firmware. As a result, ethtool will report
unknown speed when the firmware reports 200G link speed.
Fixes: 532262ba3b ("bnxt_en: ethtool: support PAM4 link speeds up to 200G")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix 57502 and 57508 NPAR description string entries. The typos
caused these devices to not match up with lspci output.
Fixes: 49c98421e6 ("bnxt_en: Add PCI IDs for 57500 series NPAR devices.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the selftest command fails, driver is not reporting the failure
by updating the "test->flags" when bnxt_close_nic() fails.
Fixes: eb51365846 ("bnxt_en: Add basic ethtool -t selftest support.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix invalid registers dump from ethtool -d ethX after adapter self test
by ethtool -t ethY. It causes invalid data display.
The problem was caused by overwriting i40e_reg_list[].elements
which is common for ethtool self test and dump.
Fixes: 22dd9ae8af ("i40e: Rework register diagnostic")
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230328172659.3906413-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-03-28 (ice)
This series contains updates to ice driver only.
Jesse fixes mismatched header documentation reported when building with
W=1.
Brett restricts setting of VSI context to only applicable fields for the
given ICE_AQ_VSI_PROP_Q_OPT_VALID bit.
Junfeng adds check when adding Flow Director filters that conflict with
existing filter rules.
Jakob Koschel adds interim variable for iterating to prevent possible
misuse after looping.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg()
ice: add profile conflict check for AVF FDIR
ice: Fix ice_cfg_rdma_fltr() to only update relevant fields
ice: fix W=1 headers mismatch
====================
Link: https://lore.kernel.org/r/20230328172035.3904953-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefan Schmidt says:
====================
ieee802154 for net 2023-03-29
Two small fixes this time.
Dongliang Mu removed an unnecessary null pointer check.
Harshit Mogalapalli fixed an int comparison unsigned against signed from a
recent other fix in the ca8210 driver.
* tag 'ieee802154-for-net-2023-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan:
net: ieee802154: remove an unnecessary null pointer check
ca8210: Fix unsigned mac_len comparison with zero in ca8210_skb_tx()
====================
Link: https://lore.kernel.org/r/20230329064541.2147400-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
build_skb() no longer accepts slab buffers. Since slab use is fairly
uncommon we prefer the drivers to call a separate slab_build_skb()
function appropriately.
bnx2x uses the old semantics where size of 0 meant buffer from slab.
It sets the fp->rx_frag_size to 0 for MTUs which don't fit in a page.
It needs to call slab_build_skb().
This fixes the WARN_ONCE() of incorrect API use seen with bnx2x.
Reported-by: Thomas Voegtle <tv@lio96.de>
Link: https://lore.kernel.org/all/b8f295e4-ba57-8bfb-7d9c-9d62a498a727@lio96.de/
Fixes: ce098da149 ("skbuff: Introduce slab_build_skb()")
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230329000013.2734957-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In gsi_trans_pool_init_dma(), the total size of a pool of memory
used for DMA transactions is calculated. However the calculation is
done incorrectly.
For 4KB pages, this total size is currently always more than one
page, and as a result, the calculation produces a positive (though
incorrect) total size. The code still works in this case; we just
end up with fewer DMA pool entries than we intended.
Bjorn Andersson tested booting a kernel with 16KB pages, and hit a
null pointer derereference in sg_alloc_append_table_from_pages(),
descending from gsi_trans_pool_init_dma(). The cause of this was
that a 16KB total size was going to be allocated, and with 16KB
pages the order of that allocation is 0. The total_size calculation
yielded 0, which eventually led to the crash.
Correcting the total_size calculation fixes the problem.
Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Tested-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Fixes: 9dd441e4ed ("soc: qcom: ipa: GSI transactions")
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230328162751.2861791-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When we allocate a nvme-tcp queue, we set the data_ready callback before
we actually need to use it. This creates the potential that if a stray
controller sends us data on the socket before we connect, we can trigger
the io_work and start consuming the socket.
In this case reported: we failed to allocate one of the io queues, and
as we start releasing the queues that we already allocated, we get
a UAF [1] from the io_work which is running before it should really.
Fix this by setting the socket ops callbacks only before we start the
queue, so that we can't accidentally schedule the io_work in the
initialization phase before the queue started. While we are at it,
rename nvme_tcp_restore_sock_calls to pair with nvme_tcp_setup_sock_ops.
[1]:
[16802.107284] nvme nvme4: starting error recovery
[16802.109166] nvme nvme4: Reconnecting in 10 seconds...
[16812.173535] nvme nvme4: failed to connect socket: -111
[16812.173745] nvme nvme4: Failed reconnect attempt 1
[16812.173747] nvme nvme4: Reconnecting in 10 seconds...
[16822.413555] nvme nvme4: failed to connect socket: -111
[16822.413762] nvme nvme4: Failed reconnect attempt 2
[16822.413765] nvme nvme4: Reconnecting in 10 seconds...
[16832.661274] nvme nvme4: creating 32 I/O queues.
[16833.919887] BUG: kernel NULL pointer dereference, address: 0000000000000088
[16833.920068] nvme nvme4: Failed reconnect attempt 3
[16833.920094] #PF: supervisor write access in kernel mode
[16833.920261] nvme nvme4: Reconnecting in 10 seconds...
[16833.920368] #PF: error_code(0x0002) - not-present page
[16833.921086] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[16833.921191] RIP: 0010:_raw_spin_lock_bh+0x17/0x30
...
[16833.923138] Call Trace:
[16833.923271] <TASK>
[16833.923402] lock_sock_nested+0x1e/0x50
[16833.923545] nvme_tcp_try_recv+0x40/0xa0 [nvme_tcp]
[16833.923685] nvme_tcp_io_work+0x68/0xa0 [nvme_tcp]
[16833.923824] process_one_work+0x1e8/0x390
[16833.923969] worker_thread+0x53/0x3d0
[16833.924104] ? process_one_work+0x390/0x390
[16833.924240] kthread+0x124/0x150
[16833.924376] ? set_kthread_struct+0x50/0x50
[16833.924518] ret_from_fork+0x1f/0x30
[16833.924655] </TASK>
Reported-by: Yanjun Zhang <zhangyanjun@cestc.cn>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Yanjun Zhang <zhangyanjun@cestc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
8b/10b encoding needs to add 3% fec overhead into the pbn.
In the Synapcis Cascaded MST hub, the first stage MST branch device
needs the information to determine the timeslot count for the
second stage MST branch device. Missing this overhead will leads to
insufficient timeslot allocation.
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Traditional synaptics hub has one MST branch device without virtual dpcd.
Synaptics cascaded hub has two chained MST branch devices. DSC decoding
is performed via root MST branch device, instead of the second MST branch
device.
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Conor Dooley <conor.dooley@microchip.com> says:
Here's my attempt at fixing both the use of an FPU on XIP kernels and
the issue that Jason ran into where CONFIG_FPU, which needs the
alternatives frame work for has_fpu() checks, could be enabled without
the alternatives actually being present.
For the former, a "slow" fallback that does not use alternatives is
added to riscv_has_extension_[un]likely() that can be used with XIP.
Obviously, we want to make use of Jisheng's alternatives based approach
where possible, so any users of riscv_has_extension_[un]likely() will
want to make sure that they select RISCV_ALTERNATIVE.
If they don't however, they'll hit the fallback path which (should,
sparing a silly mistake from me!) behave in the same way, thus
succeeding silently. Sounds like a
To prevent "depends on !XIP_KERNEL; select RISCV_ALTERNATIVE" spreading
like the plague through the various places that want to check for the
presence of extensions, and sidestep the potential silent "success"
mentioned above, all users RISCV_ALTERNATIVE are converted from selects
to dependencies, with the option being selected for all !XIP_KERNEL
builds.
I know that the VDSO was a key place that Jisheng wanted to use the new
helper rather than static branches, and I think the fallback path
should not cause issues there.
See the thread at [1] for the prior discussion.
1 - https://lore.kernel.org/linux-riscv/20230128172856.3814-1-jszhang@kernel.org/T/#m21390d570997145d31dd8bb95002fd61f99c6573
[Palmer: merging in the fixes as a branch as there's some features that
depend on it.]
* b4-shazam-merge:
RISC-V: always select RISCV_ALTERNATIVE for non-xip kernels
RISC-V: add non-alternative fallback for riscv_has_extension_[un]likely()
Link: https://lore.kernel.org/r/20230324100538.3514663-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When moving switch_to's has_fpu() over to using
riscv_has_extension_likely() rather than static branches, the FPU code
gained a dependency on the alternatives framework.
That dependency has now been removed, as riscv_has_extension_ikely() now
contains a fallback path, using __riscv_isa_extension_available(), but
if CONFIG_RISCV_ALTERNATIVE isn't selected when CONFIG_FPU is, has_fpu()
checks will not benefit from the "fast path" that the alternatives
framework provides.
We want to ensure that alternatives are available whenever
riscv_has_extension_[un]likely() is used, rather than silently falling
back to the slow path, but rather than rely on selecting
RISCV_ALTERNATIVE in the myriad of locations that may use
riscv_has_extension_[un]likely(), select it (almost) always instead by
adding it to the main RISCV config entry.
xip kernels cannot make use of the alternatives framework, so it is not
enabled for those configurations, although this is the status quo.
All current sites that select RISCV_ALTERNATIVE are converted to
dependencies on the option instead. The explicit dependencies on
!XIP_KERNEL can be dropped, as RISCV_ALTERNATIVE is not user selectable.
Fixes: 702e64550b ("riscv: fpu: switch has_fpu() to riscv_has_extension_likely()")
Link: https://lore.kernel.org/all/ZBruFRwt3rUVngPu@zx2c4.com/
Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20230324100538.3514663-3-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The has_fpu() check, which in turn calls riscv_has_extension_likely(),
relies on alternatives to figure out whether the system has an FPU.
As a result, it will malfunction on XIP kernels, as they do not support
the alternatives mechanism.
When alternatives support is not present, fall back to using
__riscv_isa_extension_available() in riscv_has_extension_[un]likely()
instead stead, which handily takes the same argument, so that kernels
that do not support alternatives can accurately report the presence of
FPU support.
Fixes: 702e64550b ("riscv: fpu: switch has_fpu() to riscv_has_extension_likely()")
Link: https://lore.kernel.org/all/ad445951-3d13-4644-94d9-e0989cda39c3@spud/
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20230324100538.3514663-2-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Commit 52f04f10b9 ("thermal: intel: int340x: processor_thermal: Fix
deadlock") addressed deadlock issue during user space trip update. But it
missed a case when thermal zone device is disabled when user writes 0.
Call to thermal_zone_device_disable() also causes deadlock as it also
tries to lock tz->lock, which is already claimed by trip_point_temp_store()
in the thermal core code.
Remove call to thermal_zone_device_disable() in the function
sys_set_trip_temp(), which is called from trip_point_temp_store().
Fixes: 52f04f10b9 ("thermal: intel: int340x: processor_thermal: Fix deadlock")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 6.2+ <stable@vger.kernel.org> # 6.2+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 3e45352259 ("md: Free resources in __md_stop") tried to fix
null-ptr-deference for 'active_io' by moving percpu_ref_exit() to
__md_stop(), however, the commit also moving 'writes_pending' to
__md_stop(), and this will cause mdadm tests broken:
BUG: kernel NULL pointer dereference, address: 0000000000000038
Oops: 0000 [#1] PREEMPT SMP
CPU: 15 PID: 17830 Comm: mdadm Not tainted 6.3.0-rc3-next-20230324-00009-g520d37
RIP: 0010:free_percpu+0x465/0x670
Call Trace:
<TASK>
__percpu_ref_exit+0x48/0x70
percpu_ref_exit+0x1a/0x90
__md_stop+0xe9/0x170
do_md_stop+0x1e1/0x7b0
md_ioctl+0x90c/0x1aa0
blkdev_ioctl+0x19b/0x400
vfs_ioctl+0x20/0x50
__x64_sys_ioctl+0xba/0xe0
do_syscall_64+0x6c/0xe0
entry_SYSCALL_64_after_hwframe+0x63/0xcd
And the problem can be reporduced 100% by following test:
mdadm -CR /dev/md0 -l1 -n1 /dev/sda --force
echo inactive > /sys/block/md0/md/array_state
echo read-auto > /sys/block/md0/md/array_state
echo inactive > /sys/block/md0/md/array_state
Root cause:
// start raid
raid1_run
mddev_init_writes_pending
percpu_ref_init
// inactive raid
array_state_store
do_md_stop
__md_stop
percpu_ref_exit
// start raid again
array_state_store
do_md_run
raid1_run
mddev_init_writes_pending
if (mddev->writes_pending.percpu_count_ptr)
// won't reinit
// inactive raid again
...
percpu_ref_exit
-> null-ptr-deference
Before the commit, 'writes_pending' is exited when mddev is freed, and
it's safe to restart raid because mddev_init_writes_pending() already make
sure that 'writes_pending' will only be initialized once.
Fix the prblem by moving 'writes_pending' back, it's a litter hard to find
the relationship between alloc memory and free memory, however, code
changes is much less and we lived with this for a long time already.
Fixes: 3e45352259 ("md: Free resources in __md_stop")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230328094400.1448955-1-yukuai1@huaweicloud.com